Journal of Statistical Planning and Inference 45 (1995) 413-427
Breakdown points for designed experiments

Christine H. Müller

Department of Mathematics, Free University of Berlin, Arnimallee 2-6, D-14195 Berlin, Germany

Received 18 August 1993
Abstract

For designed experiments we define the breakdown point of an estimator without allowing contaminated experimental conditions (explanatory, independent variables) because they are given by a fixed design. This yields a definition of the breakdown point different from that used in the former literature. For a wide class of estimators, which we call h-trimmed weighted Lp estimators and which includes high breakdown point estimators such as the LMS estimator and the LTS estimators among others, we derive the breakdown point for situations which often appear in designed experiments. In particular, we derive the breakdown point for replicated experimental conditions and show that a design which maximizes the breakdown point should minimize the maximal number of experimental conditions which lie in a subspace of the parameter space. This provides a new optimality criterion for designs, which leads to designs that are very different from the classical optimal designs. Two examples demonstrate the different behaviour.

AMS Subject Classification: Primary 62G35; secondary 62J05, 62K05.
Key words: Linear model; Breakdown point; Least median of squares; Least trimmed squares; Lp estimator; Optimal design
1. Introduction

A general linear model

y = Xβ + z,

with y := (y_1, …, y_N)ᵀ ∈ ℝ^N, X := (x_1, …, x_N)ᵀ := (x(t_1), …, x(t_N))ᵀ ∈ ℝ^{N×r} and z = (z_1, …, z_N)ᵀ ∈ ℝ^N is considered, where y_n are observations, t_n ∈ T are experimental conditions, x: T → ℝ^r is a known 'regression' function, β ∈ ℝ^r is an unknown parameter vector and z_n are errors. For simplicity, we also call x_n the experimental conditions or the design points. Similarly, we also call X the design of the experiment.
We assume that the errors z_n may contain some gross errors which provide outlying observations y_n. In this situation the unknown parameter β should be estimated robustly, which means that the estimate β̂ should not be biased too much by a few outlying observations. In the past many robust estimators were proposed and several criteria to measure their robustness properties were introduced; see, for example, the books of Huber (1981) and Hampel et al. (1986). One of those robustness criteria is the breakdown point of an estimator, which is the smallest fraction of outliers that can carry the estimate β̂ over all bounds. This robustness criterion is treated at full length in the book of Rousseeuw and Leroy (1987). But there, and also in other publications concerning the breakdown point in linear models (regression models), it is assumed that the experimental conditions (explanatory, independent variables) are random and in particular may also contain some gross errors. Therefore a common assumption, in particular in the book of Rousseeuw and Leroy (1987), is that the experimental conditions are 'in general position', which means that any r vectors of the N vectors x_1, …, x_N are linearly independent. But for planned experiments it is not a realistic assumption that the experimental conditions have some gross errors. Moreover, the assumption that the experimental conditions should be in general position excludes, in particular, replications of experimental conditions. But it is often the aim of a planned experiment to reduce the number of different experimental conditions so that the experimental conditions are replicated several times. For example, for estimating the constant term and the slope of a linear regression function the optimal design (in many senses) is the design which puts half of the observations at t = −1 and the other half of the observations at t = 1 if the experimental region is restricted to [−1, 1].
Only Coakley (1991), Coakley and Mili (1992) and Mili and Coakley (1993) derive the breakdown point of several estimators for situations with replicated experimental conditions. But in these papers they also assume that the experimental conditions are subject to errors, so that in calculating the breakdown point the experimental conditions are also contaminated. If the experimental conditions are without errors we have a really different situation. This is demonstrated, for instance, by the figure on p. 11 in Rousseeuw and Leroy (1987), which shows the different behaviour of the least absolute value estimator (L1 estimator) for an outlier in the observations (y-direction) and in the experimental conditions (x-direction). Thus, the aim of the paper is to derive the breakdown point for a broad class of estimators in the situation that the experimental conditions are given by a design, so that they are without errors and with possible replications. In particular, we show how the breakdown point depends on the design and what a design should look like if the breakdown point should be high. In Section 2 we define the breakdown point for error-free experimental conditions and derive some general properties of it. In Section 3 we introduce the class of estimators we will use and in Section 4 we investigate the breakdown points for them. We call the used estimators h-trimmed weighted Lp estimators. This class of
estimators not only includes high breakdown point estimators such as the least median of squares (LMS) estimator and least trimmed squares (LTS) estimators but also the L1 estimator, the least-squares (LS) estimator and many other estimators. The estimators used by Coakley (1991), Coakley and Mili (1992) and Mili and Coakley (1993) are still more general, but all their relevant estimators are already included in the class of h-trimmed weighted Lp estimators. Moreover, the proofs for the h-trimmed weighted Lp estimators are very simple (see in particular Remark 8.1). In Section 5 we discuss the consequences of the results for designing experiments. In particular, we define a new optimality criterion for designs by maximizing the breakdown point and show that the breakdown point maximizing designs are easy to obtain if there are no restrictions on the number of different experimental conditions. Namely, these breakdown point maximizing designs are those for which the experimental conditions are in general position, which the former literature intuitively assumed to be the designs optimal for the breakdown point. If the number of different experimental conditions is restricted then a more complicated optimization problem arises. This is demonstrated in Section 6 by two examples (linear regression and quadratic regression). These examples also show that the classical optimal designs such as A- and D-optimal designs (see, for example, Fedorov, 1972 or Silvey, 1980) provide very low breakdown points and may behave differently with respect to the breakdown point. In Section 7 we discuss how to solve the conflict between highly efficient designs and high breakdown point designs. In Section 8 all proofs are given. These proofs use some ideas of the proofs given in Rousseeuw and Leroy (1987) for the LMS estimator and the LTS estimator, but in a simpler and more general way (see in particular Lemma 8.2 and Remark 8.1).
2. The breakdown point in designed experiments

The breakdown point of an estimator is the smallest amount of contamination that can carry the estimated value arbitrarily far away from the value obtained without contamination. In its original definition, proposed by Donoho and Huber (1983) and used in many publications, it is assumed that the experimental conditions (explanatory, independent variables) are random and in particular may also contain some outliers. But for designed experiments it is not a realistic assumption that the experimental conditions have gross errors. Therefore, here we introduce a new definition of the breakdown point, the breakdown point for contamination-free experimental conditions, while we call the original breakdown point the breakdown point for contaminated experimental conditions.

The breakdown point for contaminated experimental conditions is based on the set of samples where at most M of the observations and experimental conditions are
contaminated, i.e. on

𝒴_M(y, X) := {(ȳ, X̄) ∈ ℝ^N × ℝ^{N×r}; there exist m(1) < … < m(N − M) with (ȳ_{m(i)}, x̄_{m(i)}) = (y_{m(i)}, x_{m(i)}) for all i ∈ {1, …, N − M}}.

Definition 2.1 (Breakdown point for contaminated experimental conditions). For contaminated experimental conditions the breakdown point of an estimator β̂: ℝ^N × ℝ^{N×r} → ℝ^r at the sample (y, X) ∈ ℝ^N × ℝ^{N×r} is defined as

ε*(β̂, y, X) := (1/N) min{M; M ≤ N and sup_{(ȳ, X̄) ∈ 𝒴_M(y, X)} |β̂(ȳ, X̄) − β̂(y, X)| = ∞},

and the breakdown point of the estimator is

ε*(β̂) := min{ε*(β̂, y, X); (y, X) ∈ ℝ^N × ℝ^{N×r}}

(see also Rousseeuw and Leroy, 1987; Coakley, 1991; Mili and Coakley, 1993).

In contrast to the breakdown point for contaminated experimental conditions, the breakdown point for contamination-free experimental conditions is based on the set of samples where at most M of the observations are contaminated while all experimental conditions are unchanged, i.e. it is based on

𝒴_M(y) := {ȳ ∈ ℝ^N; there exist m(1) < … < m(N − M) with ȳ_{m(i)} = y_{m(i)} for all i ∈ {1, …, N − M}}.

Definition 2.2 (Breakdown point for contamination-free experimental conditions). For contamination-free experimental conditions the breakdown point of an estimator β̂: ℝ^N × ℝ^{N×r} → ℝ^r at the sample (y, X) ∈ ℝ^N × ℝ^{N×r} is defined as

ε(β̂, y, X) := (1/N) min{M; M ≤ N and sup_{ȳ ∈ 𝒴_M(y)} |β̂(ȳ, X) − β̂(y, X)| = ∞},

and the breakdown point of the estimator at the design X is

ε(β̂, X) := min{ε(β̂, y, X); y ∈ ℝ^N}.

There is an obvious relation between the two types of breakdown points which is given by the following lemma.

Lemma 2.1. For any estimator β̂ we have

ε(β̂, y, X) ≥ ε*(β̂, y, X) for all (y, X) ∈ ℝ^N × ℝ^{N×r},

and in particular ε(β̂, X) ≥ ε*(β̂) for all X ∈ ℝ^{N×r}.
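To make the definition concrete, the following Python sketch (our own illustration, not part of the paper; all names and data are ours) shows that the least-squares estimator has contamination-free breakdown point 1/N at any design: replacing a single observation already carries the estimate over all bounds.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 10)
X = np.column_stack([np.ones_like(t), t])      # x(t) = (1, t)^T, r = 2
beta = np.array([1.0, 2.0])
y = X @ beta + 0.1 * rng.standard_normal(10)   # clean sample

def ls(y, X):
    """Least-squares estimate beta-hat(y, X)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

clean = ls(y, X)
# Contaminate exactly one observation (M = 1) and let it grow:
for outlier in (1e3, 1e6, 1e9):
    y_bad = y.copy()
    y_bad[0] = outlier
    print(np.linalg.norm(ls(y_bad, X) - clean))   # grows without bound
```

Since one contaminated observation suffices, ε of the LS estimator equals 1/N at every design, far below the upper bound derived below for regression equivariant estimators.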
In the following it is shown that the breakdown point for contamination-free experimental conditions has similar properties as the breakdown point for contaminated experimental conditions. In particular, for regression equivariant estimators the same upper bound for the breakdown point can be derived. Thereby an estimator β̂ is called regression equivariant if it satisfies

β̂(y + Xθ, X) = β̂(y, X) + θ for all (y, X) ∈ ℝ^N × ℝ^{N×r} and θ ∈ ℝ^r

(see Rousseeuw and Leroy, 1987, p. 116). The upper bound for the breakdown point of regression equivariant estimators at the sample (y, X) depends only on the maximum number of experimental conditions which lie in an (r − 1)-dimensional subspace of ℝ^r, i.e. on

N(X) := max{N_V; V is an (r − 1)-dimensional subspace of ℝ^r},

where N_V := Σ_{n=1}^N 1_V(x_n). If N(X) = r − 1 then we say that the experimental conditions are in general position (cf. Rousseeuw and Leroy, 1987, p. 117). Moreover, set ⌊z⌋ := max{n ∈ ℕ₀; n ≤ z} for any z ∈ ℝ₊.
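For small designs the quantity N(X) can be computed by brute force. The following Python sketch (our own illustration; the function name is ours) assumes that a maximizing (r − 1)-dimensional subspace is spanned by r − 1 of the design points, which holds for the designs considered in Section 6.

```python
import numpy as np
from itertools import combinations

def n_of_x(X, tol=1e-9):
    """N(X): maximal number of rows of X lying in an (r-1)-dimensional
    subspace of R^r, searching over subspaces spanned by r-1 rows."""
    N, r = X.shape
    best = 0
    for idx in combinations(range(N), r - 1):
        V = X[list(idx), :]
        if np.linalg.matrix_rank(V, tol=tol) < r - 1:
            continue  # these rows do not span an (r-1)-dim subspace
        # count all rows contained in span(V)
        count = sum(
            np.linalg.matrix_rank(np.vstack([V, x]), tol=tol) == r - 1
            for x in X
        )
        best = max(best, count)
    return best

# Two-point linear regression design (r = 2): N(X) = N/2 for even N.
X_lin = np.array([[1.0, -1.0]] * 5 + [[1.0, 1.0]] * 5)
print(n_of_x(X_lin))   # 5
```

For designs with all mass on very few points the search over row subsets is cheap; the combinatorial cost grows quickly with N and r, so this is only meant as a check for small examples.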
Theorem 2.1. If the estimator β̂ is regression equivariant then for all (y, X) ∈ ℝ^N × ℝ^{N×r} we have

ε(β̂, y, X) ≤ (1/N) ⌊(N − N(X) + 1)/2⌋.

This is the same upper bound which Mili and Coakley (1993) (Theorem 3.1) derived for regression equivariant estimators when the experimental conditions are also contaminated. If the experimental conditions are in general position, i.e. N(X) = r − 1, then the upper bound results in

ε(β̂, y, X) ≤ (1/N) ⌊(N − r + 2)/2⌋,

which had already been derived by Rousseeuw and Leroy (1987) for contaminated experimental conditions.
3. Trimmed weighted Lp estimators

In this section we define a class of regression equivariant estimators for which the breakdown point for contamination-free experimental conditions can be calculated with more precision. This class of estimators is the class of h-trimmed weighted Lp estimators. To define these estimators we need the following notation: if |y_1 − x_1ᵀβ|, …, |y_N − x_Nᵀβ| are the absolute values of the residuals, then |y_{π(1)} − x_{π(1)}ᵀβ|, …, |y_{π(N)} − x_{π(N)}ᵀβ| denote the ordered residuals, i.e. we have

|y_{π(1)} − x_{π(1)}ᵀβ| ≤ … ≤ |y_{π(N)} − x_{π(N)}ᵀβ|.
Definition 3.1 (Trimmed weighted Lp estimator). An estimator β̂ is called h-trimmed weighted Lp estimator if it satisfies

β̂(y, X) = arg min_{β ∈ ℝ^r} Σ_{i=l}^{h} a_i |y_{π(i)} − x_{π(i)}ᵀβ|^p   (3.1)

for all (y, X) ∈ ℝ^N × ℝ^{N×r}, where l, h ∈ {1, …, N} with l ≤ h and p, a_i > 0.
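The criterion (3.1) is easy to state in code. The following Python sketch (our own; names are ours) evaluates the h-trimmed weighted Lp objective; with l = 1, h = N, p = 2 and a_i = 1 it reduces to the least-squares criterion.

```python
import numpy as np

def trimmed_lp_objective(beta, y, X, a, l, h, p):
    """Sum of a_i * (i-th ordered absolute residual)^p for i = l, ..., h
    (1-based indices as in Definition 3.1)."""
    res = np.sort(np.abs(y - X @ beta))    # ordered absolute residuals
    return float(np.sum(np.asarray(a) * res[l - 1:h] ** p))

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(6), np.linspace(-1.0, 1.0, 6)])
y = X @ np.array([0.5, -1.0]) + rng.standard_normal(6)
beta = np.array([0.4, -0.9])

# LS criterion: N-trimmed weighted L2 estimator with l = 1, a_i = 1
ls_value = trimmed_lp_objective(beta, y, X, np.ones(6), 1, 6, 2)
```

An h-trimmed weighted Lp estimate is then any β minimizing this objective; for the LMS estimator with odd N one would take l = h = (N + 1)/2 and p = 2.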
The class of trimmed weighted Lp estimators includes many relevant estimators such as the least-squares (LS) estimator, the least absolute value estimator (L1 estimator), the least median of squares (LMS) estimator and least trimmed squares (LTS) estimators. For the definitions of the L1 estimator, the LMS estimator and the LTS estimator see, for example, Rousseeuw and Leroy (1987). Thereby the LS estimator is an N-trimmed weighted L2 estimator with l = 1 and a_i = 1, and the L1 estimator is an N-trimmed weighted L1 estimator with l = 1 and a_i = 1. The LMS estimator is a (⌊N/2⌋ + 1)-trimmed weighted L2 estimator with l = ½(N + 1) and a_i = 1 for odd N and l = ½N and a_i = ½ for even N, and an LTS estimator is an h-trimmed weighted L2 estimator with l = 1 and a_i = 1. In Mili and Coakley (1993) more general estimators, the so-called D-estimators, are regarded. But the examples of D-estimators which they give are the h-trimmed weighted L1 estimators and the h-trimmed weighted L2 estimators, so that it is an open question whether the class of D-estimators contains a relevant estimator which is not an h-trimmed weighted Lp estimator. A special class of trimmed weighted L1 estimators are the rank-based estimators which Hössjer (1994) regarded and which in particular are asymptotically normal with convergence rate √N.

By generalizing Theorem 1 in Rousseeuw and Leroy (1987, p. 113), which concerns only the LMS estimator, we can show that a solution of (3.1) always exists, so that the h-trimmed weighted Lp estimator always exists. Moreover, we can give a bound for the h-trimmed weighted Lp estimator which depends on the sample (y, X). For that set

R := max_n {|y_n|},  K := max_n {|x_n|},  S := (2 max{a_i}(h − l + 1)/min{a_i})^{1/p}

and

ρ(X) := ½ inf{τ > 0; there exists an (r − 1)-dimensional subspace V of ℝ^r so that V^τ covers at least N(X) + 1 of the x_n}.

Thereby for any set V ⊂ ℝ^r we define V^τ := {v ∈ ℝ^r; |v − v₀| ≤ τ for some v₀ ∈ V}. Because of the definition of N(X) we have ρ(X) > 0.

Theorem 3.1. If N(X) ≤ h − 1 then for all y ∈ ℝ^N the h-trimmed weighted Lp estimator β̂_h exists and satisfies

|β̂_h(y, X)| ≤ KR(S + 1)/ρ(X)².
If h ≤ N(X) then there are observations, for example y = Xβ, such that the set of solutions of (3.1) is unbounded (see Mili and Coakley, 1993, Theorem 6.4). Therefore, in the following we always assume h > N(X).
4. Breakdown points of trimmed weighted Lp estimators

The first theorem provides a lower bound for the breakdown point of an h-trimmed weighted Lp estimator.
Theorem 4.1. Any h-trimmed weighted Lp estimator β̂_h with h > N(X) satisfies

ε(β̂_h, y, X) ≥ (1/N) min{N − h + 1, h − N(X)}

for all (y, X) ∈ ℝ^N × ℝ^{N×r}, and in particular

ε(β̂_h, X) ≥ (1/N) min{N − h + 1, h − N(X)}.

For h − N(X) ≤ N − h + 1, i.e. h ≤ ½(N + N(X) + 1), the lower bound is attained by observations satisfying an exact fit, i.e. for observations y satisfying y = Xβ.
Theorem 4.2. If N(X) < h ≤ ½(N + N(X) + 1) then any h-trimmed weighted Lp estimator β̂_h satisfies

ε(β̂_h, Xβ, X) = (1/N)(h − N(X))

for all β ∈ ℝ^r, and in particular

ε(β̂_h, X) = (1/N)(h − N(X)).

In particular, for the LMS estimator, i.e. for h = ⌊N/2⌋ + 1, we get

ε(β̂_h, X) = (1/N)(⌊N/2⌋ + 1 − N(X)).

If the experimental conditions are in general position, i.e. N(X) = r − 1, then we have

ε(β̂_h, X) = (1/N)(⌊N/2⌋ − r + 2),

which coincides with the result given by Theorem 2 in Rousseeuw and Leroy (1987, p. 118) for contaminated experimental conditions.
Because the h-trimmed weighted Lp estimators are regression equivariant (see Mili and Coakley, 1993, Theorem 4.1) we also get as a corollary from Theorem 2.1 an upper bound for the breakdown point.
Corollary 4.1. Any h-trimmed weighted Lp estimator β̂_h satisfies

ε(β̂_h, y, X) ≤ (1/N) ⌊(N − N(X) + 1)/2⌋

for all (y, X) ∈ ℝ^N × ℝ^{N×r}.

A special case appears when the lower and the upper bounds coincide. This is the case if and only if h satisfies

⌊(N + N(X) + 1)/2⌋ ≤ h ≤ ⌊(N + N(X) + 2)/2⌋.

Hence, for those values of h we get the h-trimmed weighted Lp estimators with the highest breakdown point. This provides the following theorem, which coincides with Theorem 6.1 given in Mili and Coakley (1993) for contaminated experimental conditions.

Theorem 4.3. Any h-trimmed weighted Lp estimator β̂_h with

⌊(N + N(X) + 1)/2⌋ ≤ h ≤ ⌊(N + N(X) + 2)/2⌋

satisfies

ε(β̂_h, X) = (1/N) ⌊(N − N(X) + 1)/2⌋.

If N(X) = r − 1, i.e. the experimental conditions are in general position, then any h-trimmed weighted Lp estimator β̂_h with

⌊(N + r)/2⌋ ≤ h ≤ ⌊(N + r + 1)/2⌋

satisfies

ε(β̂_h, X) = (1/N) ⌊(N − r + 2)/2⌋.

Because

⌊(N + r)/2⌋ ≤ ⌊N/2⌋ + ⌊(r + 1)/2⌋ ≤ ⌊(N + r + 1)/2⌋,
this result coincides with Theorem 6 in Rousseeuw and Leroy (1987, p. 132), which is given for the LTS estimator and contaminated experimental conditions.

The breakdown point for h > ⌊(N + N(X) + 2)/2⌋ is much more difficult to calculate. In particular, it depends not only on N(X) but also on the position of the experimental conditions x_1, …, x_N. For example, the L1 estimator in a simple linear regression model can have a breakdown point greater than ¼ but also equal to 1/N for designs with N different experimental conditions, i.e. for N(X) = 1. Thereby a breakdown point greater than ¼ can be achieved by a design with equispaced experimental conditions, while the breakdown point at a very asymmetric design can be 1/N. This behaviour is similar to the behaviour of the asymptotic breakdown point for contaminated experimental conditions as discussed in Hampel et al. (1986, p. 328).
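The bounds of Theorems 2.1, 4.1 and 4.3 involve only N, N(X) and h, so they can be tabulated directly. The following Python sketch (our own; function names are ours) computes the upper bound, the lower bound of Theorem 4.1 and the h-range of Theorem 4.3.

```python
def eps_max(N, NX):
    """Upper bound (1/N) * floor((N - N(X) + 1)/2) of Theorem 2.1."""
    return ((N - NX + 1) // 2) / N

def eps_lower(N, NX, h):
    """Lower bound (1/N) * min{N - h + 1, h - N(X)} of Theorem 4.1."""
    return min(N - h + 1, h - NX) / N

def optimal_h_range(N, NX):
    """Values of h for which both bounds coincide (Theorem 4.3)."""
    return (N + NX + 1) // 2, (N + NX + 2) // 2

# N = 10 observations in general position for r = 2, i.e. N(X) = 1:
lo, hi = optimal_h_range(10, 1)
print(lo, hi, eps_max(10, 1))   # 6 6 0.5
```

For the two-point linear design with N = 10 and N(X) = 5 the same functions give the optimal h = 8 and breakdown point 3/10, illustrating how replications lower the attainable breakdown point.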
5. Designs maximizing the breakdown point

Theorem 2.1 shows that the upper bound

ε_max(X) := (1/N) ⌊(N − N(X) + 1)/2⌋

for the breakdown point of regression equivariant estimators is improved if N(X) is made smaller. Also according to Theorems 4.2 and 4.3 the breakdown point of any h-trimmed weighted Lp estimator with h ≤ ⌊(N + N(X) + 2)/2⌋ is improved if N(X) is made smaller. Moreover, the breakdown point ε_max(X) of the h-trimmed weighted Lp estimator with the highest breakdown point, namely the h-trimmed weighted Lp estimator with

⌊(N + N(X) + 1)/2⌋ ≤ h ≤ ⌊(N + N(X) + 2)/2⌋,

increases if N(X) decreases. Hence, for maximizing the breakdown point within a given set Δ of designs X the quantity N(X) should be minimized within Δ. This leads to the following optimality criterion for designs.
Definition 5.1 (Breakdown point maximizing designs). A design X* ∈ Δ is called breakdown point maximizing in Δ if it minimizes N(X) within all X ∈ Δ, i.e.

X* = arg min{N(X); X ∈ Δ}.

The following obvious lemma shows that designs with N(X) = r − 1, i.e. with experimental conditions in general position, are breakdown point maximizing. Hence, the often used assumption that the experimental conditions are in general position (see Rousseeuw and Leroy, 1987) ensures that the breakdown points are as high as possible.
Lemma 5.1. If Δ contains a design X with N(X) = r − 1 then X is a breakdown point maximizing design in Δ.

The condition N(X) = r − 1 in particular means that all N experimental conditions are different. But in designed experiments the number of different experimental conditions is often restricted by some bound L, say, which is less than N. Hence, the set of possible designs is restricted to

Δ_L := {X ∈ Δ; card({x_1, …, x_N}) ≤ L},

so that a breakdown point maximizing design should be found in this set Δ_L. Moreover, the classical optimal designs, which usually minimize some function of the covariance matrix of the LS estimator (see, for example, Fedorov, 1972 or Silvey, 1980), are also often based on a few different experimental conditions. Hence, we have a conflict between efficiency and high breakdown point.

But the optimality criterion based on the breakdown point conflicts not only with classical optimality criteria for designs but also with optimality criteria which were proposed for robust estimation by using other robustness criteria. For example, we can define a robustness criterion based on the asymptotic bias which is caused by contaminated observations where the amount of contamination decreases with N^{−1/2}. Then the classical A-optimal designs, which minimize the trace of the covariance matrix of the LS estimator, also have some optimality properties for robust estimation. For instance, at the A-optimal designs the trace of the asymptotic covariance matrix of robust estimators is minimized under the side condition that the asymptotic bias is bounded by some bound b (see Müller, 1992). Moreover, in Lemma 1 of Müller (1992) it was also shown that at these A-optimal designs the maximum asymptotic bias is minimized. Hence, the main conflict in choosing designs arises between high breakdown point and high efficiency, while high efficiency is also connected with other robustness properties.
This conflict, and also the problems which arise from restricting the number of different experimental conditions, are demonstrated in the following examples.
6. Examples

Example 6.1 (Linear regression). In the linear regression model y_n = β₀ + β₁t_n + z_n, where x(t) = (1, t)ᵀ and β = (β₀, β₁)ᵀ, a classical optimal design for the experimental region 𝒳 = {x(t); t ∈ [−1, 1]} is X = (x_1, …, x_N)ᵀ with x_n = x(−1) for n ≤ ½N and x_n = x(1) for n > ½N. In particular, this design is A-optimal in Δ = 𝒳^N (at least for large N) and hence also optimal for robust estimation in the sense of Müller (1992). For this
design we get N(X) = ⌊(N + 1)/2⌋ and hence

ε_max(X) = (1/N) ⌊(N − ⌊(N + 1)/2⌋ + 1)/2⌋
         = (1/N) ⌊(N + 2)/4⌋  if N is even,
         = (1/N) ⌊(N + 1)/4⌋  if N is odd.
For a design X* with N different design points t_1, …, t_N ∈ [−1, 1] we get N(X*) = 1, so that this design is breakdown point maximizing in Δ = Δ_N with ε_max(X*) = (1/N)⌊N/2⌋. If only at most L different experimental conditions are possible, any design X_L with at most ⌊(N + L − 1)/L⌋ replications at every one of L different design points z_1, …, z_L is breakdown point maximizing in Δ_L with

ε_max(X_L) = (1/N) ⌊(N − ⌊(N + L − 1)/L⌋ + 1)/2⌋.

Example 6.2 (Quadratic regression). Consider the quadratic regression model
y_n = β₀ + β₁t_n + β₂t_n² + z_n, where x(t) = (1, t, t²)ᵀ, β = (β₀, β₁, β₂)ᵀ and 𝒳 = {x(t); t ∈ [−1, 1]} is the experimental region. A classical A-optimal design in Δ = 𝒳^N (at least for large N) is X_A = (x_1, …, x_N)ᵀ with x_n = x(−1) for n ≤ ¼N, x_n = x(0) for ¼N < n ≤ ¾N and x_n = x(1) for n > ¾N. For this design we get N(X_A) = ⌊(3N + 3)/4⌋ and thus

ε_max(X_A) = (1/N) ⌊(N − ⌊(3N + 3)/4⌋ + 1)/2⌋.

Instead of the A-optimal design we also can use a D-optimal design, which minimizes the determinant of the covariance matrix of the LS estimator, i.e. a design X_D = (x_1, …, x_N)ᵀ with x_n = x(−1) for n ≤ ⅓N, x_n = x(0) for ⅓N < n ≤ ⅔N and x_n = x(1) for n > ⅔N. For this design we get N(X_D) = ⌊(2N + 2)/3⌋ and thus

ε_max(X_D) = (1/N) ⌊(N − ⌊(2N + 2)/3⌋ + 1)/2⌋.

This shows that the D-optimal design provides a higher breakdown point than the A-optimal design. Again, as for linear regression, every design X* with N different design points t_1, …, t_N provides a breakdown point maximizing design in Δ = Δ_N with ε_max(X*) = (1/N)⌊(N − 1)/2⌋. The breakdown point maximizing designs in Δ_L with 3 ≤ L < N are designs X_L with replications as equal as possible at L different design points, with

ε_max(X_L) = (1/N) ⌊(N − ⌊(N + L − 1)/L⌋ − ⌊(N + L − 2)/L⌋ + 1)/2⌋.
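The formulas of Example 6.2 can be compared numerically. The following Python sketch (our own; function names are ours) evaluates ε_max for the A-optimal, the D-optimal and the balanced L-point quadratic designs.

```python
def eps_from_nx(N, NX):
    """epsilon_max(X) = (1/N) * floor((N - N(X) + 1)/2)."""
    return ((N - NX + 1) // 2) / N

def eps_quadratic_A(N):
    return eps_from_nx(N, (3 * N + 3) // 4)    # N(X_A) = floor((3N+3)/4)

def eps_quadratic_D(N):
    return eps_from_nx(N, (2 * N + 2) // 3)    # N(X_D) = floor((2N+2)/3)

def eps_quadratic_L(N, L):
    # balanced design on L points: the two largest groups fix N(X_L)
    NX = (N + L - 1) // L + (N + L - 2) // L
    return eps_from_nx(N, NX)

N = 24
print(eps_quadratic_A(N), eps_quadratic_D(N), eps_quadratic_L(N, 6))
# 0.125 0.16666666666666666 0.3333333333333333
```

For N = 24 this reproduces the ordering discussed above: the D-optimal design beats the A-optimal one, and spreading the replications over more points raises the breakdown point further.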
7. Conclusion
The conflict between efficiency and high breakdown point also appears in choosing an appropriate estimator because high breakdown point estimators such as the LMS estimator or the LTS estimator are often not very efficient. The conflict for estimators can be solved by using, for example, one-step GM estimators with a high breakdown point initial estimator and a score function (influence function) providing high asymptotic efficiency (see Simpson et al., 1992). But for designs a solution is more difficult. One possibility to solve this conflict for designs may be constrained problems. For example, it could be reasonable to maximize the efficiency subject to a lower bound for the breakdown point, or to maximize the breakdown point subject to a lower bound for the efficiency and an upper bound for the number L of different experimental conditions. Then a solution could be to use L different design points near to the design points of the classical optimal design. For example, in linear regression (see Example 6.1) half of the L different design points should be chosen near to t = −1 and the other half near to t = 1. But as a referee of this paper pointed out, a high breakdown point does not mean that the bias caused by a contamination amount less than the breakdown point is small. This bias is bounded but still could be very large, as was demonstrated in Martin et al. (1989) for the asymptotic breakdown point for contaminated experimental conditions. Therefore, additional considerations are necessary for finding reasonable designs with respect to efficiency and outlier robustness.
8. Proofs
Although the proof of Theorem 2.1 is almost the same as that of Theorem 3.1 in Mili and Coakley (1993), which is given for contaminated experimental conditions, we present this proof for completeness.

Proof of Theorem 2.1. Without loss of generality, we can assume that x_{N−N(X)+1}, …, x_N lie in an (r − 1)-dimensional subspace of ℝ^r. Now assume ε(β̂, y, X) > (1/N)M with M = ⌊½(N − N(X) + 1)⌋. Then there exists a k ∈ ℝ with |β̂(y, X) − β̂(ȳ, X)| ≤ k for all ȳ ∈ 𝒴_M(y). In particular, this holds for

ȳ₁ = (y_1 + x_1ᵀv, …, y_M + x_Mᵀv, y_{M+1}, …, y_N)ᵀ

and

ȳ₂ = (y_1, …, y_M, y_{M+1} − x_{M+1}ᵀv, …, y_N − x_Nᵀv)ᵀ,

where v is any vector in ℝ^r which is orthogonal to {x_{N−N(X)+1}, …, x_N}, i.e. vᵀx_n = 0 for n > N − N(X). Thereby ȳ₂ lies in 𝒴_M(y) because N − N(X) − M = ⌊½(N − N(X))⌋ ≤ M. Because of ȳ₁ = ȳ₂ + Xv and the regression equivariance of β̂ we have β̂(ȳ₁, X) = β̂(ȳ₂, X) + v. But this provides the contradiction

|v| = |β̂(ȳ₁, X) − β̂(ȳ₂, X)| ≤ |β̂(ȳ₂, X) − β̂(y, X)| + |β̂(y, X) − β̂(ȳ₁, X)| ≤ 2k

for all v which are orthogonal to {x_{N−N(X)+1}, …, x_N}. □
The proof of Theorems 3.1 and 4.1 is based on the following lemma. For that let V_β := {x ∈ ℝ^r; xᵀβ = 0} and ρ > 0.

Lemma 8.1. Any x ∉ V_β^ρ with |x| ≤ k satisfies |xᵀβ| > (ρ²/k)|β|.

Proof. If x_V denotes the projection of x on V_β we have |x|² = |x − x_V|² + |x_V|², i.e. |x_V|² = |x|² − |x − x_V|² < k² − ρ². This implies

|xᵀβ|/(|x| |β|) = cos ∠(β, x) = 1/√(1 + (tan ∠(β, x))²) = 1/√(1 + |x_V|²/|x − x_V|²) > 1/√(1 + (k² − ρ²)/ρ²) = ρ/k.

Hence, because |x| ≥ |x − x_V| > ρ, we get |xᵀβ| > ρ(ρ/k)|β| = (ρ²/k)|β|. □
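As a numeric sanity check of Lemma 8.1 (our own sketch, not part of the paper), one can sample points x with |x| ≤ k, keep those outside the neighbourhood V_β^ρ, and verify the inequality:

```python
import numpy as np

rng = np.random.default_rng(2)
beta = rng.standard_normal(3)                  # r = 3
u = beta / np.linalg.norm(beta)                # unit normal of V_beta
k, rho = 2.0, 0.5
ok = True
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, 3)              # |x| <= sqrt(3) <= k
    dist = abs(x @ u)                          # distance of x from V_beta
    if dist > rho:                             # i.e. x lies outside V_beta^rho
        ok = ok and abs(x @ beta) > (rho**2 / k) * np.linalg.norm(beta)
print(ok)   # True
```

Here the distance from the hyperplane V_β is simply |xᵀu| for the unit normal u, so the check mirrors the projection argument of the proof.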
Lemma 8.2. Let M = min{N − h + 1, h − N(X)} ≥ 1 and y ∈ ℝ^N. Then for all ȳ ∈ 𝒴_M(y) every h-trimmed weighted Lp estimator β̂_h exists and satisfies

|β̂_h(ȳ, X)| ≤ KR(S + 1)/ρ(X)².

Proof. Let ȳ = (ȳ_1, …, ȳ_N)ᵀ be any corrupted sample obtained by retaining at least N − M + 1 = N − min{N − h + 1, h − N(X)} + 1 ≥ h observations. Because at least h observations are retained we have

Σ_{i=l}^{h} a_i |ȳ_{π(i)}|^p ≤ max{a_i}(h − l + 1)R^p.   (8.1)

Now regard any β ∈ ℝ^r with |β| > KR(S + 1)/ρ(X)². Then V_β^{ρ(X)} covers at most N(X) experimental conditions x_n. Thus, at least N − min{N − h + 1, h − N(X)} + 1 − N(X) ≥ N − h + 1 experimental conditions at which the observations are retained do not lie in V_β^{ρ(X)}. According to Lemma 8.1 they satisfy

|x_nᵀβ| > (ρ(X)²/K)|β| > (S + 1)R,

and therefore for any a_i they satisfy

a_i|ȳ_n − x_nᵀβ|^p = a_i|y_n − x_nᵀβ|^p ≥ a_i(|x_nᵀβ| − |y_n|)^p > a_i((S + 1)R − R)^p = a_i S^p R^p ≥ min{a_i} S^p R^p = 2 max{a_i}(h − l + 1)R^p.
In particular, at least one of a_l|ȳ_{π(l)} − x_{π(l)}ᵀβ|^p, …, a_h|ȳ_{π(h)} − x_{π(h)}ᵀβ|^p satisfies

a_i|ȳ_{π(i)} − x_{π(i)}ᵀβ|^p > 2 max{a_i}(h − l + 1)R^p,

so that with (8.1) we have

Σ_{i=l}^{h} a_i|ȳ_{π(i)} − x_{π(i)}ᵀβ|^p > 2 Σ_{i=l}^{h} a_i|ȳ_{π(i)}|^p

for all β ∈ ℝ^r with |β| > KR(S + 1)/ρ(X)². Hence, only a sequence in the compact set given by all |β| ≤ KR(S + 1)/ρ(X)² can minimize the objective function in (3.1), so that the minimum exists and every solution satisfies the asserted bound. □
Remark 8.1. The proof of Lemma 8.2 is a simplification and generalization of the proofs of Theorem 1 on p. 113, Theorem 2 on p. 118 and Theorem 6 on p. 132 in Rousseeuw and Leroy (1987). These proofs concern only the LMS and the LTS estimators and additionally use geometrical arguments. Thereby the proof of Lemma 8.2 does not use the fact that the experimental conditions are not contaminated. Hence, it can also be used for contaminated experimental conditions. In particular, it provides for h-trimmed weighted Lp estimators simple and more formal proofs of Theorems 6.1, 6.2 and 6.5 of Mili and Coakley (1993).

Proof of Theorem 4.2. Without loss of generality, we can assume that x_1, …, x_{N(X)} lie in an (r − 1)-dimensional subspace of ℝ^r. Let v be any vector which is orthogonal to {x_1, …, x_{N(X)}}, i.e. x_nᵀv = 0 for n ≤ N(X). Define ȳ by ȳ_n = x_nᵀ(β + v) for n ≤ h and ȳ_n = x_nᵀβ for n > h, which ensures ȳ ∈ 𝒴_{h−N(X)}(Xβ). Because

Σ_{i=l}^{h} a_i|ȳ_{π(i)} − x_{π(i)}ᵀ(β + v)|^p = 0,

we have β̂_h(ȳ, X) = β + v as well as β̂_h(Xβ, X) = β. Hence, we get

sup{|β̂_h(Xβ, X) − β̂_h(ȳ, X)|; ȳ ∈ 𝒴_{h−N(X)}(Xβ)} ≥ sup{|v|; v ∈ ℝ^r with x_nᵀv = 0 for n ≤ N(X)} = ∞.

Compare also the proof of Theorem 6.3 in Mili and Coakley (1993). □
Acknowledgements
The author thanks Clint W. Coakley for providing his latest papers. She is also very grateful to the referees for their helpful comments and suggestions.
References

Coakley, C.W. (1991). Breakdown points under simple regression with replication. Technical Report No. 91-21, Department of Statistics, Virginia Polytechnic Institute and State University.

Coakley, C.W. and L. Mili (1993). Exact fit points under simple regression with replication. Statist. Probab. Lett. 17, 265-271.

Donoho, D.L. and P.J. Huber (1983). The notion of breakdown point. In: P.J. Bickel, K.A. Doksum and J.L. Hodges, Jr., Eds., A Festschrift for Erich L. Lehmann. Wadsworth, Belmont, CA, 157-184.

Fedorov, V.V. (1972). Theory of Optimal Experiments. Academic Press, New York.

Hampel, F.R., E.M. Ronchetti, P.J. Rousseeuw and W.A. Stahel (1986). Robust Statistics: The Approach Based on Influence Functions. Wiley, New York.

Hössjer, O. (1994). Rank-based estimates in the linear model with high breakdown point. J. Amer. Statist. Assoc. 89, 149-158.

Huber, P.J. (1981). Robust Statistics. Wiley, New York.

Martin, R.D., V.J. Yohai and R.H. Zamar (1989). Min-max bias robust regression. Ann. Statist. 17, 1608-1630.

Mili, L. and C.W. Coakley (1993). Robust estimation in structured linear regression. Technical Report No. 93-13, Department of Statistics, Virginia Polytechnic Institute and State University.

Müller, Ch.H. (1994). Optimal designs for robust estimation in conditionally contaminated linear models. J. Statist. Plann. Inference 38, 125-140.

Rousseeuw, P.J. and A.M. Leroy (1987). Robust Regression and Outlier Detection. Wiley, New York.

Silvey, S.D. (1980). Optimal Design. Chapman and Hall, London.

Simpson, D.G., D. Ruppert and R.J. Carroll (1992). On one-step GM estimates and stability of inferences in linear regression. J. Amer. Statist. Assoc. 87, 439-450.