Chapter VII

CONSTRAINED LEAST SQUARES, PENALTY FUNCTIONS, AND BLUE's

(7.1) Penalty Functions

In many applications, we have seen that it is necessary to compute a weighted least squares estimator subject to linear equality constraints. That is, it is necessary to find a value of $x$ which minimizes

(7.1.1)   $(z - Hx)^T V^{-2} (z - Hx)$

subject to the constraints

(7.1.2)   $Gx = u$

where $V$ is a known positive-definite matrix, $z$ and $u$ are given vectors, and $H$ and $G$ are given rectangular matrices. A very general result in the theory of minimization, associated with the "penalty function method," asserts that the value of $x$ which minimizes

(7.1.3)   $h(x) + \lambda^{-2} g^2(x)$

[call it $x(\lambda)$] converges (as $\lambda \to 0$) to

(7.1.4)   $x_0 = \lim_{\lambda \to 0} x(\lambda)$

if certain mild continuity restrictions are met, and that $x_0$ minimizes $h(x)$ subject to the constraint $g(x) = 0$ (Butler and Martin [1]). Furthermore,

(7.1.5)   $\lim_{\lambda \to 0} \left[ h(x(\lambda)) + \lambda^{-2} g^2(x(\lambda)) \right] = h(x_0)$

so that the minimal value of (7.1.3) converges to the minimal value of $h$ on the constraint set. The term $\lambda^{-2} g^2(x)$ is called a "penalty function" because the minimization of $h(x) + \lambda^{-2} g^2(x)$ suffers when $x$ lies outside the constraint set [$g(x) = 0$] if $\lambda$ is small.
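As a quick illustration (not from the original text), take the scalar case $h(x) = (x-1)^2$, $g(x) = x$: the penalized minimizer is $x(\lambda) = \lambda^2/(1+\lambda^2)$, which tends to the constrained minimizer $x_0 = 0$, and the penalized value tends to $h(0) = 1$. A minimal Python sketch of this behavior:

```python
# Minimal sketch (illustrative, not from the text): penalty method for
# h(x) = (x - 1)**2 subject to g(x) = x = 0.  The penalized objective
#     (x - 1)**2 + lam**-2 * x**2
# has minimizer x(lam) = lam**2 / (1 + lam**2) -> x0 = 0 as lam -> 0.

def x_of_lam(lam: float) -> float:
    # Closed-form minimizer of (x - 1)**2 + lam**-2 * x**2.
    return lam**2 / (1.0 + lam**2)

for lam in [1.0, 0.1, 0.01]:
    x = x_of_lam(lam)
    value = (x - 1.0) ** 2 + lam ** -2 * x ** 2
    print(f"lam={lam:5.2f}  x(lam)={x:.6f}  penalized value={value:.6f}")
# x(lam) -> 0 and the penalized value -> h(0) = 1, illustrating (7.1.4)
# and (7.1.5).
```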

In the case at hand, if we let

(7.1.6)   $h(x) = (z - Hx)^T V^{-2} (z - Hx)$

and

(7.1.7)   $g^2(x) = (u - Gx)^T (u - Gx)$

and if $x(\lambda)$ is a value of $x$ which minimizes

(7.1.8)   $(z - Hx)^T V^{-2} (z - Hx) + \lambda^{-2} (u - Gx)^T (u - Gx)$

then it is reasonable to expect that

(7.1.9)   $x_0 = \lim_{\lambda \to 0} x(\lambda)$

exists and minimizes (7.1.6) subject to $g^2(x) = 0$ (i.e., $Gx = u$). Instead of invoking the general theorem, we will produce a self-contained proof based upon the material developed so far:

(7.1.10) Theorem: Let $H$ be an $n \times m$ matrix, $G$ be a $k \times m$ matrix, and $V$ be an $n \times n$ positive-definite matrix. Let

$\hat{H} = \begin{pmatrix} H \\ G \end{pmatrix}$  [order $(n+k) \times m$],   $\hat{V}(\lambda) = \begin{pmatrix} V & 0 \\ 0 & \lambda I \end{pmatrix}$  [order $(n+k) \times (n+k)$]

and

$\hat{z} = \begin{pmatrix} z \\ u \end{pmatrix}$

where $z$ is a given $n$-vector and $u$ is a given $k$-vector. Let

$\hat{x}(\lambda) = [\hat{V}^{-1}(\lambda)\hat{H}]^+ \hat{V}^{-1}(\lambda)\hat{z}.$

Then

(a) $\hat{x}(\lambda)$ is the vector of minimum norm among those which minimize

$(\hat{z} - \hat{H}x)^T \hat{V}^{-2}(\lambda)(\hat{z} - \hat{H}x) = (z - Hx)^T V^{-2}(z - Hx) + \lambda^{-2}(u - Gx)^T(u - Gx).$


(b) $\lim_{\lambda \to 0} \hat{x}(\lambda) = x_0$ always exists.

(c) Among all vectors which minimize $\|u - Gx\|^2$, $x_0$ is the one of minimum norm among those which minimize $(z - Hx)^T V^{-2}(z - Hx)$.

(d) If $u \in \mathscr{R}(G)$, then the set of $x$'s which minimize $\|u - Gx\|^2$ is identical with the set of $x$'s which satisfy the constraint $Gx = u$. In this case, $x_0$ minimizes $(z - Hx)^T V^{-2}(z - Hx)$ subject to the constraint $Gx = u$. Furthermore,

$\lim_{\lambda \to 0} [\hat{z} - \hat{H}\hat{x}(\lambda)]^T \hat{V}^{-2}(\lambda)[\hat{z} - \hat{H}\hat{x}(\lambda)] = (z - Hx_0)^T V^{-2}(z - Hx_0).$

Proof of theorem: (a) $(\hat{z} - \hat{H}x)^T \hat{V}^{-2}(\lambda)(\hat{z} - \hat{H}x) = \|\hat{V}^{-1}\hat{z} - \hat{V}^{-1}\hat{H}x\|^2$ and part (a) follows from (3.4).

(b) and (c): Let

(7.1.10.1)   $F = V^{-1}H$

(7.1.10.2)   $w = V^{-1}z.$

Then by (3.8.1),

$\hat{x}(\lambda) = [\hat{V}^{-1}(\lambda)\hat{H}]^+ \hat{V}^{-1}(\lambda)\hat{z} = [F^T F + \lambda^{-2} G^T G]^+ [F^T w + \lambda^{-2} G^T u].$

By (4.9),

$[F^T F + \lambda^{-2} G^T G]^+ = (\tilde{F}^T \tilde{F})^+ + \lambda^2 (I - \tilde{F}^+\tilde{F})[(G^T G)^+ + J(\lambda)](I - \tilde{F}^+\tilde{F})^T,$

where

$\tilde{F} = F(I - G^+ G)$ and $J(\lambda) = O(\lambda^2)$ as $\lambda \to 0.$

Thus

(7.1.10.3) Z ( I ) = (FTF)+FTw+ I-'(PTF)+GTu

+ (I-F+F)[(G'G)+ + O ( A Z ) ] ( I - F + F ) T C T u as

A

--*

0.

But

(7.1.10.4)   $\tilde{F}G^T = F(I - G^+G)G^T = F[G(I - G^+G)]^T = 0$

so

$\mathscr{R}[G^T] \subset N(\tilde{F}) = N(\tilde{F}^{+T}) = N(\tilde{F}^+\tilde{F}) = N[(\tilde{F}^T\tilde{F})^+]$   (3.11.5)

and so

(7.1.10.5)   $(\tilde{F}^T\tilde{F})^+ G^T = 0$ and $\tilde{F}^{+T} G^T = 0.$

Therefore

(7.1.10.6)   $(I - \tilde{F}^+\tilde{F})(G^TG)^+(I - \tilde{F}^+\tilde{F})^T G^T = (I - \tilde{F}^+\tilde{F})(G^TG)^+ G^T = (I - \tilde{F}^+\tilde{F})G^+$   (3.8.1)


and by (3.13.10),

(7.1.10.7)   $(\tilde{F}^T\tilde{F})^+ F^T = \tilde{F}^+.$

Combining (7.1.10.3)–(7.1.10.7),

(7.1.10.8)   $\hat{x}(\lambda) = \tilde{F}^+ w + (I - \tilde{F}^+\tilde{F})[G^+ u + O(\lambda^2)]$ as $\lambda \to 0$

so

(7.1.10.9)   $\hat{x}(\lambda) = x_0 + O(\lambda^2)$ as $\lambda \to 0$

where

(7.1.10.10)   $x_0 = \tilde{F}^+ w + (I - \tilde{F}^+\tilde{F})G^+ u = \tilde{F}^+(w - FG^+u) + G^+u.$

By (3.12.7), the latter is exactly the vector of minimum norm which minimizes $\|w - Fx\|^2$ among those vectors which also minimize $\|u - Gx\|^2$. This proves (b) and (c), since

$\|w - Fx\|^2 = (z - Hx)^T V^{-2}(z - Hx).$

(d)

$\lim_{\lambda \to 0} [\hat{z} - \hat{H}\hat{x}(\lambda)]^T \hat{V}^{-2}(\lambda)[\hat{z} - \hat{H}\hat{x}(\lambda)] = \lim_{\lambda \to 0} \left\{ \|w - Fx_0 + O(\lambda^2)\|^2 + \lambda^{-2}\|u - Gx_0 + O(\lambda^2)\|^2 \right\}.$

If $u \in \mathscr{R}(G)$, then $x_0$ must satisfy the equation $Gx = u$ if it minimizes $\|Gx - u\|^2$, so that the last term tends to zero as $\lambda \to 0$ while the first tends to $\|w - Fx_0\|^2$.
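Before turning to the BLUE interpretation, a brief numerical sketch (not part of the original text; the matrix sizes and random data below are arbitrary) can confirm the theorem: the minimum-norm minimizer of (7.1.8), computed with the pseudoinverse of the stacked system, approaches the closed-form limit $x_0$ of (7.1.10.10) at the rate $O(\lambda^2)$ promised by (7.1.10.9).

```python
# Sketch (assumed setup, arbitrary random data): verify Theorem (7.1.10)
# numerically.  x(lam) minimizes (7.1.8); x0 is the closed form (7.1.10.10).
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 6, 4, 2
H = rng.standard_normal((n, m))
G = rng.standard_normal((k, m))
z = rng.standard_normal(n)
u = G @ rng.standard_normal(m)            # put u in the range of G
V = np.diag(rng.uniform(0.5, 2.0, n))     # known positive-definite V

F = np.linalg.inv(V) @ H                  # F = V^{-1} H        (7.1.10.1)
w = np.linalg.inv(V) @ z                  # w = V^{-1} z        (7.1.10.2)
Gp = np.linalg.pinv(G)
Ft = F @ (np.eye(m) - Gp @ G)             # F-tilde = F (I - G+ G)
x0 = Gp @ u + np.linalg.pinv(Ft) @ (w - F @ Gp @ u)   # (7.1.10.10)

for lam in [1e-1, 1e-2, 1e-3]:
    # Minimum-norm minimizer of ||w - F x||^2 + lam^-2 ||u - G x||^2,
    # via the stacked system [F; G/lam] x ~ [w; u/lam], cf. part (a).
    A = np.vstack([F, G / lam])
    b = np.concatenate([w, u / lam])
    x_lam = np.linalg.pinv(A) @ b
    print(f"lam={lam:.0e}  ||x(lam) - x0|| = {np.linalg.norm(x_lam - x0):.2e}")

print("||G x0 - u|| =", np.linalg.norm(G @ x0 - u))   # constraint holds
```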

(7.2) Constrained Least Squares Estimators as Limiting Cases of BLUE's

By (6.1.12c), $\hat{x}(\lambda)$, defined in (7.1.10), coincides with the BLUE for $x$ when observations of the form

(7.2.1)   $\hat{z} = \hat{H}x + \hat{\varepsilon}$

are used to estimate $x$, where $\hat{\varepsilon}$ is a vector random variable with mean zero and covariance $\hat{V}^2(\lambda)$, where $\hat{V}(\lambda)$ is defined in (7.1.10). Thus, (7.1.10) shows that any constrained weighted least squares estimator can be viewed as the limiting case of a BLUE, some of whose observations are extremely reliable (i.e., have extremely low residual variances). To put it another way, constrained, weighted least squares estimators can be approximated arbitrarily well by treating the constraints as fictitious "observations" which are extremely accurate, and computing the BLUE for $x$ using both the "real" observations (pretending they have covariance $V^2$) and the "fictitious" ones (pretending they have covariance $\lambda^2 I$, with $\lambda^2$ small).
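As a hedged illustration of this remark (a sketch with arbitrary data, not from the original): build the augmented model (7.2.1), weight the real observations by $V^{-2}$ and the fictitious ones by $\lambda^{-2}I$ with $\lambda$ small, and the resulting estimator nearly satisfies the constraint.

```python
# Sketch (arbitrary data): constraints treated as fictitious, highly
# reliable "observations".  Real rows get covariance V^2, fictitious rows
# covariance lam^2 I; the weighted estimator then nearly obeys G x = u.
import numpy as np

rng = np.random.default_rng(1)
n, m, k = 5, 3, 1
H, G = rng.standard_normal((n, m)), rng.standard_normal((k, m))
z, u = rng.standard_normal(n), rng.standard_normal(k)
V = np.diag(rng.uniform(0.5, 2.0, n))

lam = 1e-4
H_hat = np.vstack([H, G])                 # augmented design of (7.2.1)
z_hat = np.concatenate([z, u])
W = np.block([[np.linalg.inv(V @ V), np.zeros((n, k))],
              [np.zeros((k, n)), np.eye(k) / lam**2]])    # V-hat^{-2}(lam)
# Weighted estimator via normal equations, x = (H^T W H)^+ H^T W z:
x_blue = np.linalg.pinv(H_hat.T @ W @ H_hat) @ (H_hat.T @ W @ z_hat)
print("G @ x_blue - u =", G @ x_blue - u)  # nearly zero for small lam
```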

(7.2.2) Exercise: If $\hat{\varepsilon}$ has covariance

$\hat{V}^2(0) = \begin{pmatrix} V^2 & 0 \\ 0 & 0 \end{pmatrix},$

the BLUE for $x$ is given by

(7.2.2.1)   $\hat{x}(0) = \hat{H}^+ [I - \hat{V}(0)(\tilde{Q}\hat{V}(0))^+]\hat{z}$

where $\tilde{Q} = I - \hat{H}\hat{H}^+$ (6.1.12). Is it true that $\hat{x}(0)$ coincides with $x_0 = \lim_{\lambda \to 0} \hat{x}(\lambda)$? (This would mean that constrained least squares estimates are obtainable as BLUE's by treating the constraints as "perfectly noiseless" observations. See Zyskind and Martin [1]; Goldman and Zelen [1].)
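A numerical experiment on the exercise (a sketch that assumes the reconstructed formula (7.2.2.1) above; the data are arbitrary): compute $\hat{x}(0)$ with the "perfectly noiseless" constraint rows and compare it with $x_0$ from (7.1.10.10); if the answer to the exercise is affirmative, the printed difference should vanish up to roundoff.

```python
# Sketch for Exercise (7.2.2) (arbitrary data; assumes the reconstruction
# of (7.2.2.1) above): compare the noiseless-constraint BLUE with
# x0 = lim x(lam) from (7.1.10.10).
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 6, 4, 2
H, G = rng.standard_normal((n, m)), rng.standard_normal((k, m))
z = rng.standard_normal(n)
u = G @ rng.standard_normal(m)            # u in the range of G
V = np.diag(rng.uniform(0.5, 2.0, n))

H_hat = np.vstack([H, G])
z_hat = np.concatenate([z, u])
V_hat0 = np.block([[V, np.zeros((n, k))],
                   [np.zeros((k, n)), np.zeros((k, k))]])   # V-hat(0)
Q = np.eye(n + k) - H_hat @ np.linalg.pinv(H_hat)           # Q~ = I - H^ H^+
x_hat0 = np.linalg.pinv(H_hat) @ (
    z_hat - V_hat0 @ np.linalg.pinv(Q @ V_hat0) @ z_hat)    # (7.2.2.1)

F, w = np.linalg.inv(V) @ H, np.linalg.inv(V) @ z
Gp = np.linalg.pinv(G)
Ft = F @ (np.eye(m) - Gp @ G)
x0 = Gp @ u + np.linalg.pinv(Ft) @ (w - F @ Gp @ u)         # (7.1.10.10)
print("||x_hat(0) - x0|| =", np.linalg.norm(x_hat0 - x0))
```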