On the irregular behavior of LS estimators for asymptotically singular designs


Statistics & Probability Letters 76 (2006) 1089–1096 www.elsevier.com/locate/stapro

Andrej Pázman (a), Luc Pronzato (b,*)

(a) Department of Applied Mathematics and Statistics, Mlynská Dolina, 84248 Bratislava, Slovakia
(b) Laboratoire I3S, CNRS/UNSA, Bât. Euclide, Les Algorithmes, 2000 route des Lucioles, BP 121, 06903 Sophia-Antipolis Cedex, France

Received 25 February 2005; received in revised form 31 October 2005
Available online 27 December 2005

Abstract

Optimum design theory sometimes yields singular designs. An example with a linear regression model often mentioned in the literature is used to illustrate the difficulties induced by such designs. The estimation of the model parameters $\theta$, or of a function of interest $h(\theta)$, may be impossible with the singular design $\xi^*$. Depending on how $\xi^*$ is approached by the empirical measure $\xi_n$ of the design points, with $n$ the number of observations, consistency is achieved but the speed of convergence may depend on $\xi_n$ and on the value of $\theta$. Even in situations where convergence is in $1/\sqrt{n}$ and the asymptotic distribution of the estimator of $\theta$ or $h(\theta)$ is normal, the asymptotic variance may still differ from that obtained from $\xi^*$.

© 2005 Elsevier B.V. All rights reserved.

PACS: 62K05; 62E20

Keywords: Singular design; Optimum design; Asymptotic normality; Consistency; LS estimation

1. Introduction

We consider the following linear regression model

$$\eta(x, \theta) = \theta_1 x + \theta_2 x^2 = f^\top(x)\theta \tag{1}$$

with $f(x) = (x, x^2)^\top$, $x \in \mathcal{X} = [-1, 1]$, and observations $y_k = \eta(x_k, \bar\theta) + \varepsilon_k$, where $\bar\theta$ is the (unknown) true value of the model parameters and the errors $\varepsilon_k$ are i.i.d. with zero mean and variance $\sigma^2$. We shall take $\sigma = 1$ throughout the paper. We shall denote by $\hat\theta^n$ the LS estimator of $\theta$ obtained from the observations $y_1, y_2, \ldots, y_n$; $\xi_n$ will denote the empirical measure of the associated design points $x_1, x_2, \ldots, x_n$.

Footnote: The research of the first author has been supported by the VEGA grant no. 1/0264/03. The research of the second author has been supported in part by the IST Programme of the European Community under the PASCAL Network of Excellence, IST-2002-506778. This publication only reflects the authors' views.
* Corresponding author.

0167-7152/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.spl.2005.12.010


We shall also denote by

$$M(\xi) = \int_{\mathcal{X}} f(x) f^\top(x)\, \xi(dx)$$

the information matrix for a design measure $\xi$. We assume that $\bar\theta_1 \ge 0$ and $\bar\theta_2 < 0$ and are interested in the estimation of

$$h(\theta) = -\frac{\theta_1}{2\theta_2}, \tag{2}$$

the value of $x$ where $\eta(x, \theta)$ is maximal. When the design space is $\mathcal{X} = [-1, 1]$, the optimum design measure $\xi^* = \xi^*_\theta$ for the estimation of $h(\theta)$ (c-optimality) has its support included in $\{-1, 1\}$, the weight of each point depending on the value of $h(\theta)$ (a standard situation in nonlinear problems). One can show, see, e.g., Silvey (1980, p. 57), that

$$\xi^*(1) = \begin{cases} \dfrac{1}{2} + \dfrac{1}{4h} & \text{if } h \ge \dfrac{1}{2}, \\[1ex] \dfrac{1}{2} + h & \text{if } 0 \le h \le \dfrac{1}{2}, \end{cases} \tag{3}$$

so that when $h(\theta) = \frac{1}{2}$ the optimum design is singular with $\xi^*(1) = 1$ (and $\xi^*(-1) = 0$). Therefore, if we know a priori that $h(\bar\theta)$ is close to $\frac{1}{2}$ we should put the design points, or the majority of them, close to 1. It is the purpose of this paper to show that, depending on how this is realized, the asymptotic behavior of $\hat\theta^n$, or of $h(\hat\theta^n)$, may have some unexpected features. Note that $h(\theta)$ is not estimable when $\xi^*$ is singular. Therefore, when $h(\hat\theta^n)$ obtained with the design $\xi_n$ converges to $h(\bar\theta)$, $n \to \infty$, it means that the sequence $x_1, x_2, \ldots$ itself, not the limiting design $\xi^*$, is responsible for consistency. In particular, it thus seems legitimate to question the adjective "optimum" for $\xi^*$.

The example considered is extremely simple but the conclusions are of general consequence: we show that it is only in very particular circumstances that approaching a singular "optimum" design conveys some optimal properties to the nonlinear LS estimation for which it was designed. The example has been chosen due to its frequent use in the optimum-design literature, see e.g. Silvey (1980), Ford and Silvey (1980), Ford et al. (1985) and Wu (1985). Here, the singularity of $\xi^*$ is obtained for a particular value of $h(\theta)$. This should not give the reader the impression that singular "optimum" designs are exceptional. For instance, the estimation of $h(\theta) = -\theta_1/(2\theta_2)$ in the full quadratic regression model $\eta(x, \theta) = \theta_0 + \theta_1 x + \theta_2 x^2$ yields singular "optimum" designs for a full range of values of $h$, see Chaloner (1989), Fedorov and Müller (1997).
See also Buonaccorsi and Iyer (1986) for the estimation of ratios of linear combinations of the parameters.

Sections 2 and 3 concern the situation where we know a priori that $h(\bar\theta)$ is close to $\frac{1}{2}$ and non-sequential designs approaching the singular measure $\xi^*$ are used: $\xi_n$ converges weakly to $\xi^*$ in Section 2 whereas strong convergence is considered in Section 3. The iterative construction of the design is briefly discussed in Section 4. Throughout the paper we denote

$$\mathbf{1} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad \mathbf{0} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

2. $\xi_n$ converges weakly to $\xi^*$

By weak convergence we mean convergence in distribution, which we denote $\xrightarrow{w}$. Let $\xi^*$ be the singular measure that puts weight 1 at $x = 1$. Throughout the section we use the design measure $\xi_n$ constructed from

$$x_i = \begin{cases} 1 & \text{if } i = 2k - 1, \\ 1 - (1/k)^{1/4} & \text{if } i = 2k, \end{cases}$$

for $k = 1, 2, \ldots$, so that $\xi_n \xrightarrow{w} \xi^*$, $n \to \infty$.


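2.1. Consistency

From Corollary 1 of Wu (1980), $u^\top \hat\theta^n \xrightarrow{a.s.} u^\top \bar\theta$ for any $u \in \mathbb{R}^2$ when $S_\infty(w) = \sum_{i=1}^{\infty} [w^\top f(x_i)]^2 = \infty$ for all $w = (w_1, w_2)^\top \ne \mathbf{0}$. Here we obtain

$$S_\infty(w) = \sum_{k=1}^{\infty} (w_1 + w_2)^2 + \sum_{k=1}^{\infty} \left\{ w_1 [1 - 1/k^{1/4}] + w_2 [1 - 1/k^{1/4}]^2 \right\}^2$$

so that $S_\infty(w) = \infty$ when $w_1 + w_2 \ne 0$. For $w_1 + w_2 = 0$ (and $w_1 \ne 0$ since $w \ne \mathbf{0}$) we have

$$S_\infty(w) = w_1^2 \sum_{k=1}^{\infty} [-1/k^{1/2} + 1/k^{1/4}]^2 > w_1^2 \sum_{k=1}^{\infty} 1/k = \infty$$

(the terms of the first series exceed $1/k$ for $k > 16$, which is enough for divergence). Therefore, $u^\top \hat\theta^n \xrightarrow{a.s.} u^\top \bar\theta$ for any $u \in \mathbb{R}^2$, so that $\hat\theta^n \xrightarrow{a.s.} \bar\theta$ and $h(\hat\theta^n) \xrightarrow{a.s.} h(\bar\theta)$, $n \to \infty$.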

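2.2. Asymptotic normality of $u^\top \hat\theta^n$

This paragraph is auxiliary to the investigation of the asymptotic distribution of $h(\hat\theta^n)$. Consider the case $u = \mathbf{1}$. When the design $\xi^*$ is used, all design points are $x_i = 1$, but $\mathbf{1}^\top \theta$ is estimable in spite of the singularity of $\xi^*$ since $\mathbf{1}$ is in the range of

$$M(\xi^*) = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$

The variance of $\mathbf{1}^\top \hat\theta^n$, which we denote $\mathrm{var}_{\xi^*}(\mathbf{1}^\top \hat\theta^n)$, then satisfies

$$n\, \mathrm{var}_{\xi^*}(\mathbf{1}^\top \hat\theta^n) = \mathbf{1}^\top M^-(\xi^*) \mathbf{1} = 1$$

with $M^-$ any g-inverse of $M$. On the other hand, the variance of $\mathbf{1}^\top \hat\theta^n$ for the design $\xi_n$ satisfies

$$\lim_{n \to \infty} n\, \mathrm{var}_{\xi_n}(\mathbf{1}^\top \hat\theta^n) = \frac{9}{5} \ne \mathbf{1}^\top M^-(\xi^*) \mathbf{1}, \tag{4}$$

where $n\, \mathrm{var}_{\xi_n}(\mathbf{1}^\top \hat\theta^n) = \mathbf{1}^\top M^{-1}(\xi_n) \mathbf{1}$. Indeed, take $n = 2m$, then

$$M(\xi_n) = \begin{pmatrix} \mu_2(n) & \mu_3(n) \\ \mu_3(n) & \mu_4(n) \end{pmatrix}$$

with $\mu_i(n) = (1/n)\left[m + \sum_{k=1}^{m} (1 - k^{-1/4})^i\right]$. We then obtain (4) by direct calculations. The difference between $\lim_{n \to \infty} n\, \mathrm{var}_{\xi^*}(\mathbf{1}^\top \hat\theta^n)$ and $\lim_{n \to \infty} n\, \mathrm{var}_{\xi_n}(\mathbf{1}^\top \hat\theta^n)$ is due to the discontinuity of the function $M(\xi) \mapsto n\, \mathrm{var}_{\xi}(\mathbf{1}^\top \hat\theta^n)$, see Pázman (1980).

Next, following the same lines as in Huber (1973), we can show that Lindeberg's condition is satisfied, and for any direction $u \ne \mathbf{0}$

$$\sqrt{n}\, \frac{u^\top (\hat\theta^n - \bar\theta)}{\sqrt{u^\top M^{-1}(\xi_n) u}} \xrightarrow{d} \zeta_u \sim N(0, 1).$$

For $u = \mathbf{1}$ it gives

$$\sqrt{n}\, \mathbf{1}^\top (\hat\theta^n - \bar\theta) \xrightarrow{d} \zeta_1' \sim N(0, 9/5),$$

but for a direction $u$ such that $(u^\top \mathbf{1})^2 \ne 2\|u\|^2$ (i.e., not parallel to $\mathbf{1}$) the convergence of $u^\top (\hat\theta^n - \bar\theta)$ is in $n^{-1/4}$ since $u^\top M^{-1}(\xi_n) u$ grows as $\sqrt{n}$ (note that $u^\top \theta$ is not estimable from the limiting design $\xi^*$). In particular, one can check that

$$n^{1/4} u^\top (\hat\theta^n - \bar\theta) \xrightarrow{d} \zeta_1 \sim N(0, 9\sqrt{2}/10)$$

for $u = (0, 1)^\top$ or $(1, 0)^\top$.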
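The limit 9/5 in (4) and the constant $9\sqrt{2}/10$ can be checked numerically. The sketch below is only an illustration: it assumes the Section 2 design with $x_{2k} = 1 - k^{-1/4}$ and $n = 2m$, and the function names (`design_points`, `info_matrix`, `scaled_variance`) are hypothetical, not taken from the paper. Convergence is slow, so only rough agreement should be expected for moderate $n$.

```python
import numpy as np

def design_points(m):
    """Return the n = 2m points of the Section 2 design (assumed form):
    m points at 1 and the points 1 - k^(-1/4), k = 1..m."""
    k = np.arange(1, m + 1, dtype=float)
    return np.concatenate([np.ones(m), 1.0 - k ** (-0.25)])

def info_matrix(x):
    """Empirical information matrix M(xi_n) for f(x) = (x, x^2)^T."""
    F = np.column_stack([x, x ** 2])
    return F.T @ F / len(x)

def scaled_variance(u, x):
    """u^T M^{-1}(xi_n) u, i.e. n times the variance of u^T theta_hat (sigma = 1)."""
    u = np.asarray(u, dtype=float)
    return float(u @ np.linalg.solve(info_matrix(x), u))

if __name__ == "__main__":
    for m in (10**4, 10**6):
        x = design_points(m)
        n = len(x)
        v_one = scaled_variance([1.0, 1.0], x)              # compare with 9/5
        v_e1 = scaled_variance([1.0, 0.0], x) / np.sqrt(n)  # compare with 9*sqrt(2)/10
        print(n, v_one, v_e1)
```

The first printed quantity stays well away from the value 1 obtained under $\xi^*$ and drifts toward 9/5 as $n$ grows; the second is normalized by $\sqrt{n}$ to reflect the $n^{-1/4}$ rate.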


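Hence, when $u^\top \theta$ is estimable under the limiting design $\xi^*$, $u^\top \hat\theta^n$ converges as $1/\sqrt{n}$ but the limiting variance differs from $u^\top M^-(\xi^*) u$; when $u^\top \theta$ is not estimable under $\xi^*$ ($u$ is not in the range of $M(\xi^*)$), then $u^\top \hat\theta^n$ converges as $n^{-1/4}$.

2.3. Asymptotic normality of $h(\hat\theta^n)$

Consider now the estimation of $h(\theta)$ given by (2). When $\bar\theta_1 + \bar\theta_2 \ne 0$, we have

$$h(\hat\theta^n) = h(\bar\theta) + (\hat\theta^n - \bar\theta)^\top \left[ \frac{\partial h(\theta)}{\partial \theta}\Big|_{\bar\theta} + o_p(1) \right],$$

where $\partial h(\theta)/\partial \theta = -1/(2\theta_2)\, [1, 2h(\theta)]^\top$, so that $\partial h(\theta)/\partial \theta|_{\bar\theta}$ is not parallel to $\mathbf{1}$. Therefore, $n^{1/4} [h(\hat\theta^n) - h(\bar\theta)]$ has the same limiting distribution as

$$-\frac{1}{2\bar\theta_2}\, n^{1/4}\, [1, 2h(\bar\theta)] (\hat\theta^n - \bar\theta) \xrightarrow{d} \zeta_2 \sim N(0, v_{\bar\theta}) \tag{5}$$

with

$$v_{\bar\theta} = \frac{1}{4\bar\theta_2^2} \lim_{n \to \infty} \frac{1}{\sqrt{n}}\, [1, 2h(\bar\theta)]\, M^{-1}(\xi_n)\, [1, 2h(\bar\theta)]^\top = \frac{9\sqrt{2}}{10}\, \frac{[2h(\bar\theta) - 1]^2}{4\bar\theta_2^2}.$$

$h(\hat\theta^n)$ is thus asymptotically normal, but converges as $n^{-1/4}$. When $\bar\theta_1 + \bar\theta_2 = 0$, we write

$$h(\hat\theta^n) = h(\bar\theta) + (\hat\theta^n - \bar\theta)^\top \frac{\partial h(\theta)}{\partial \theta}\Big|_{\bar\theta} + \frac{1}{2} (\hat\theta^n - \bar\theta)^\top \left[ \frac{\partial^2 h(\theta)}{\partial \theta\, \partial \theta^\top}\Big|_{\bar\theta} + o_p(1) \right] (\hat\theta^n - \bar\theta)$$

with

$$\frac{\partial h(\theta)}{\partial \theta}\Big|_{\bar\theta} = -\frac{1}{2\bar\theta_2}\, \mathbf{1} \quad \text{and} \quad \frac{\partial^2 h(\theta)}{\partial \theta\, \partial \theta^\top}\Big|_{\bar\theta} = \frac{1}{2\bar\theta_2^2} \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}.$$

Direct calculations, based on the eigenvector decomposition of the matrix $\partial^2 h(\theta)/(\partial \theta\, \partial \theta^\top)|_{\bar\theta}$, show that $h(\hat\theta^n)$ converges as $1/\sqrt{n}$ but is not asymptotically normal.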
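The gradient and Hessian of $h$ used in this section can be verified by finite differences. The following is a minimal sketch, assuming $h(\theta) = -\theta_1/(2\theta_2)$ as in (2); the helper names and the test point $\bar\theta = (1, -1)^\top$ (which satisfies $\bar\theta_1 + \bar\theta_2 = 0$) are illustrative choices, not part of the paper.

```python
import numpy as np

def h(theta):
    """h(theta) = -theta_1 / (2 theta_2), the maximizer of the quadratic response."""
    return -theta[0] / (2.0 * theta[1])

def grad_h(theta):
    """Analytic gradient: -1/(2 theta_2) * [1, 2 h(theta)]^T."""
    return -np.array([1.0, 2.0 * h(theta)]) / (2.0 * theta[1])

def hess_h(theta):
    """Analytic Hessian of h."""
    t1, t2 = theta
    off = 1.0 / (2.0 * t2 ** 2)
    return np.array([[0.0, off], [off, -t1 / t2 ** 3]])

def central_diff(fun, theta, step=1e-6):
    """Central finite differences of fun along each coordinate (stacked as rows)."""
    theta = np.asarray(theta, dtype=float)
    rows = []
    for j in range(theta.size):
        e = np.zeros(theta.size)
        e[j] = step
        rows.append((fun(theta + e) - fun(theta - e)) / (2.0 * step))
    return np.array(rows)

if __name__ == "__main__":
    theta_bar = np.array([1.0, -1.0])      # assumed test point, theta_1 + theta_2 = 0
    print(grad_h(theta_bar))               # parallel to the vector 1
    print(hess_h(theta_bar))               # (1 / (2 theta_2^2)) * [[0, 1], [1, 2]]
    print(central_diff(h, theta_bar))      # numerical gradient
    print(central_diff(grad_h, theta_bar)) # numerical Hessian
```

At this test point the gradient is parallel to $\mathbf{1}$, which is why the first-order term alone converges at rate $1/\sqrt{n}$ and the second-order term has to be kept.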

3. $\xi_n$ converges strongly to $\xi^*$

By strong convergence we mean that $\lim_{n \to \infty} \xi_n(x) = \xi^*(x)$ for all $x \in \mathcal{X}$, $\xi^*$ being the limiting discrete design. In this section we consider different simple examples of strongly converging $\xi_n$ and study the asymptotic properties of estimators. The first example corresponds to a design generated by an optimization algorithm.

3.1. Steepest descent algorithm

Consider the steepest descent algorithm (Wynn, 1972) for the construction of an optimum design for the estimation of $\mathbf{1}^\top \theta$ in model (1). The optimum design $\xi^*$ on $\mathcal{X} = [-1, 1]$ is singular with $\xi^*(1) = 1$ (and $\mathbf{1}^\top \theta$ is estimable for $\xi^*$). It is well known that the algorithm generates a sequence of points such that $\xi_n$ converges to the optimum, in the sense that $\lim_{n \to \infty} \mathbf{1}^\top M^{-1}(\xi_n) \mathbf{1} = \mathbf{1}^\top M^-(\xi^*) \mathbf{1}$. We show by elementary calculus that $\xi_n$ converges strongly to $\xi^*$, in contrast with the situation considered in Section 2. Take $x_1, x_2$ such that $M(\xi_2)$ is non-singular. By construction, $M(\xi_k)$ is then non-singular for all $k$ and the design sequence is such that

$$x_{k+1} = \arg\max_{x \in [-1, 1]} \left[ \mathbf{1}^\top M^{-1}(\xi_k) \begin{pmatrix} x \\ x^2 \end{pmatrix} \right]^2, \tag{6}$$


see Eq. 4.1 in Wynn (1972). Straightforward calculation shows that $x_{k+1}$ maximizes

$$\left[ x^2 \sum_{i=1}^{k} (x_i^2 - x_i^3) + x \sum_{i=1}^{k} (x_i^4 - x_i^3) \right]^2 = x^2 (x S_k' + S_k)^2$$

with $S_k = \sum_{i=1}^{k} x_i^3 (x_i - 1)$ and $S_k' = \sum_{i=1}^{k} x_i^2 (1 - x_i)$. Note that $S_k' > 0$. This function reaches its maximum in $[-1, 1]$ at $x = \pm 1$, and

$$x_{k+1} = \begin{cases} 1 & \text{if } S_k > 0, \\ -1 & \text{otherwise.} \end{cases}$$
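Iteration (6) is easy to simulate. The sketch below is an illustration under stated assumptions: the maximization over $[-1, 1]$ is done by a plain grid search (not the closed-form rule above), the starting points 0.5 and 0.8 are arbitrary, and all function names are hypothetical.

```python
import numpy as np

def regressor(x):
    """f(x) = (x, x^2)^T for model (1)."""
    return np.array([x, x * x])

def next_point(points, grid):
    """One step of iteration (6): maximize [1^T M^{-1}(xi_k) f(x)]^2 over the grid."""
    # Unnormalized information matrix k * M(xi_k); the factor k does not change the argmax.
    A = sum(np.outer(regressor(x), regressor(x)) for x in points)
    w = np.linalg.solve(A, np.ones(2))  # proportional to M^{-1}(xi_k) 1 (A is symmetric)
    crit = (w[0] * grid + w[1] * grid ** 2) ** 2
    return float(grid[np.argmax(crit)])

def run_design(initial_points, n_steps=30, grid_size=2001):
    """Run n_steps iterations of (6) starting from initial_points."""
    grid = np.linspace(-1.0, 1.0, grid_size)  # grid search is an assumption of this sketch
    points = list(initial_points)
    for _ in range(n_steps):
        points.append(next_point(points, grid))
    return points

if __name__ == "__main__":
    design = run_design([0.5, 0.8])
    print(design[:6])
    print(sum(1 for x in design if x != 1.0))  # number of points not at x = 1
```

With these starting points $S_2 < 0$, so the first generated point is $-1$; this makes $S_3 > 0$ and every later point is placed at 1, so only three design points differ from 1.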

When $x_{k+1} = -1$, $S_{k+1} = 2 + S_k$ so that $S_k$ ultimately becomes positive and $x_{j+1}$ equals 1 for some $j$. When this happens, $S_{j+1} = S_j$ and $x_i = 1$ for all subsequent $i$, $i = j + 1, j + 2, \ldots$. The number of observations at $x \ne 1$ is thus finite. The design measure $\xi_n$ converges strongly to $\xi^*$ and $\lim_{n \to \infty} n\, \mathrm{var}(\mathbf{1}^\top \hat\theta^n) = \mathbf{1}^\top M^-(\xi^*) \mathbf{1} = 1$. Notice the difference with (4).

The method of steepest descent for designing an optimal experiment for the estimation of $h(\theta)$ in model (1) minimizes $\frac{\partial h(\theta)}{\partial \theta^\top}\big|_{\bar\theta}\, M^-(\xi)\, \frac{\partial h(\theta)}{\partial \theta}\big|_{\bar\theta}$ and is based on the iterations

$$x_{k+1} = \arg\max_{x \in [-1, 1]} \left[ \frac{\partial h(\theta)}{\partial \theta^\top}\Big|_{\bar\theta}\, M^{-1}(\xi_k) \begin{pmatrix} x \\ x^2 \end{pmatrix} \right]^2. \tag{7}$$

For $h(\theta)$ given by (2), when $\bar\theta_1 + \bar\theta_2 \ne 0$ the limiting optimum design is non-singular and there are no difficulties. When $\bar\theta_1 + \bar\theta_2 = 0$, the iterations are given by (6) and $\xi_n$ converges strongly to $\xi^*$, which is singular. Moreover, from the results above the number of observations at $x \ne 1$ is finite. It is this type of situation that we investigate below in more detail.

In the rest of the section we consider the estimation of $h(\theta)$ for different cases of measures that converge strongly to $\xi^*$. Suppose that $m$ observations are performed at $x = z$ for some $z \in [-1, 1]$, $z \ne 1$, $z \ne 0$, and $n - m$ at $x = 1$. The LS estimator of $\theta$ is then given by

$$\hat\theta^n = \bar\theta + \frac{1}{z - z^2} \left[ \frac{\delta_m}{\sqrt{m}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} + \frac{\gamma_{n-m}}{\sqrt{n - m}} \begin{pmatrix} -z^2 \\ z \end{pmatrix} \right], \tag{8}$$

where $\delta_m = \sum_{x_i = z} \varepsilon_i / \sqrt{m}$ and $\gamma_{n-m} = \sum_{x_i = 1} \varepsilon_i / \sqrt{n - m}$. They are independent and both tend to be distributed $N(0, 1)$ when $m \to \infty$ and $n - m \to \infty$.

3.2. Consistency

We have $\hat\theta^n \xrightarrow{a.s.} \bar\theta$ (and $h(\hat\theta^n) \xrightarrow{a.s.} h(\bar\theta)$) as soon as $m \to \infty$ and $n \to \infty$. However, when $n \to \infty$ with $m$ fixed, then

$$\hat\theta^n \xrightarrow{a.s.} \hat\theta^\# = \bar\theta + \frac{1}{z - z^2}\, \frac{\delta_m}{\sqrt{m}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \tag{9}$$

and $\hat\theta^n$ is not consistent. $h(\hat\theta^n)$ is then not consistent, except when $\bar\theta_1 + \bar\theta_2 = 0$. Indeed, in that case we obtain $\hat\theta^\#_1 + \hat\theta^\#_2 = 0$ so that $h(\hat\theta^\#) = h(\bar\theta) = \frac{1}{2}$. Only this situation is investigated further when $m$ is fixed.

3.3. Asymptotic distribution of $h(\hat\theta^n)$

Case (a): $m$ is fixed and $\bar\theta_1 + \bar\theta_2 = 0$. We can write

$$\sqrt{n}\, [h(\hat\theta^n) - h(\bar\theta)] = \sqrt{n}\, [h(\hat\theta^n) - h(\hat\theta^\#)] = \sqrt{n}\, (\hat\theta^n - \hat\theta^\#)^\top \left[ \frac{\partial h(\theta)}{\partial \theta}\Big|_{\hat\theta^\#} + o_p(1) \right],$$


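with $\partial h(\theta)/\partial \theta|_{\hat\theta^\#} = -\mathbf{1}/(2\hat\theta^\#_2)$. From (8) and (9),

$$\sqrt{n}\, (\hat\theta^n - \hat\theta^\#) = \frac{\sqrt{n}}{\sqrt{n - m}}\, \frac{\gamma_{n-m}}{z - z^2} \begin{pmatrix} -z^2 \\ z \end{pmatrix} \xrightarrow{d} \zeta_3 \sim N\left( 0,\ \frac{1}{(z - z^2)^2} \begin{pmatrix} z^4 & -z^3 \\ -z^3 & z^2 \end{pmatrix} \right),$$

which gives

$$\sqrt{n}\, [h(\hat\theta^n) - h(\bar\theta)] \xrightarrow{d} -\frac{1}{2}\, \frac{\nu}{\zeta}, \tag{10}$$

where $\nu \sim N(0, 1)$ and $\zeta \sim N(\bar\theta_2, 1/[m (z - z^2)^2])$ are independent. $h(\hat\theta^n)$ thus converges as $1/\sqrt{n}$ but its limiting distribution is not normal and depends on the choice of $z$.

Case (b): $m \to \infty$ and $m/n \to 0$, $n \to \infty$. Suppose first that $\bar\theta_1 + \bar\theta_2 \ne 0$. We can write

$$\sqrt{m}\, [h(\hat\theta^n) - h(\bar\theta)] = \sqrt{m}\, (\hat\theta^n - \bar\theta)^\top \left[ \frac{\partial h(\theta)}{\partial \theta}\Big|_{\bar\theta} + o_p(1) \right],$$

with $\partial h(\theta)/\partial \theta|_{\bar\theta} = -1/(2\bar\theta_2)\, [1, 2h(\bar\theta)]^\top$. From (8), since $m/n \to 0$,

$$\sqrt{m}\, (\hat\theta^n - \bar\theta) \xrightarrow{d} \zeta_4 \sim N\left( 0,\ \frac{1}{(z - z^2)^2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \right),$$

which gives

$$\sqrt{m}\, [h(\hat\theta^n) - h(\bar\theta)] \xrightarrow{d} \zeta_5 \sim N\left( 0,\ \frac{(\bar\theta_1 + \bar\theta_2)^2}{4\bar\theta_2^4 (z - z^2)^2} \right). \tag{11}$$

$h(\hat\theta^n)$ is thus asymptotically normal and converges as $1/\sqrt{m}$. The limiting variance depends on $z$. Suppose now that $\bar\theta_1 + \bar\theta_2 = 0$. We obtain from (8),

$$\sqrt{n}\, [h(\hat\theta^n) - h(\bar\theta)] = \sqrt{n} \left( -\frac{\hat\theta^n_1}{2\hat\theta^n_2} - \frac{1}{2} \right) = -\frac{1}{2} \sqrt{\frac{n}{n - m}}\ \gamma_{n-m} \left[ \bar\theta_2 - \frac{\delta_m}{\sqrt{m}}\, \frac{1}{z - z^2} + \frac{\gamma_{n-m}}{\sqrt{n - m}}\, \frac{z}{z - z^2} \right]^{-1}$$

so that

$$\sqrt{n}\, [h(\hat\theta^n) - h(\bar\theta)] \xrightarrow{d} \zeta_6 \sim N\left( 0,\ \frac{1}{4\bar\theta_2^2} \right). \tag{12}$$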
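The closed form (8) and the limit (9) can be cross-checked against a generic least-squares fit. The sketch below is illustrative only: it assumes standard normal errors, the arbitrary values $\bar\theta = (1, -1)^\top$ and $z = 0.5$, and a hypothetical function name `two_point_ls`.

```python
import numpy as np

def h(theta):
    """h(theta) = -theta_1 / (2 theta_2)."""
    return -theta[0] / (2.0 * theta[1])

def two_point_ls(theta_bar, z, m, n, rng):
    """Simulate m observations at x = z and n - m at x = 1, then return
    (generic LS fit, closed form (8), limit theta_sharp from (9))."""
    x = np.concatenate([np.full(m, z), np.ones(n - m)])
    F = np.column_stack([x, x ** 2])
    eps = rng.standard_normal(n)            # assumed N(0, 1) errors, sigma = 1
    y = F @ theta_bar + eps
    theta_ls, *_ = np.linalg.lstsq(F, y, rcond=None)

    delta = eps[:m].sum() / np.sqrt(m)      # delta_m in (8)
    gamma = eps[m:].sum() / np.sqrt(n - m)  # gamma_{n-m} in (8)
    part_z = delta / np.sqrt(m) * np.array([1.0, -1.0])
    part_1 = gamma / np.sqrt(n - m) * np.array([-z ** 2, z])
    theta_8 = theta_bar + (part_z + part_1) / (z - z ** 2)
    theta_sharp = theta_bar + part_z / (z - z ** 2)
    return theta_ls, theta_8, theta_sharp

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta_bar = np.array([1.0, -1.0])       # theta_1 + theta_2 = 0, so h = 1/2
    ls, closed, sharp = two_point_ls(theta_bar, z=0.5, m=5, n=2000, rng=rng)
    print(ls, closed)                       # the two estimates agree
    print(sharp, h(sharp))                  # theta_sharp differs from theta_bar, h stays 1/2
```

With $m$ fixed, $\hat\theta^\#$ keeps a random offset from $\bar\theta$ along the direction $(1, -1)^\top$, yet $h(\hat\theta^\#)$ equals $\frac{1}{2}$ because that direction leaves $\theta_1 + \theta_2$ unchanged.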

In contrast with (5), (10) and (11), result (12) is the only case that leads to the expression used by Silvey (1980), with a speed of convergence that coincides with that obtained for non-singular designs.

4. Discussion

Suppose that one knows a priori that $\bar\theta_1 + \bar\theta_2$ is close to 0, and designs an experiment that tries to approach $\xi^*$, which puts weight 1 at $x = 1$, for estimating $h(\theta)$ given by (2) in (1). The justification for this choice lies in the asymptotic result (12): $h(\hat\theta^n)$ converges in $1/\sqrt{n}$, and the asymptotic variance $1/(4\bar\theta_2^2)$ is the minimum over all possible designs when $\bar\theta_1 + \bar\theta_2 = 0$, that is, when $h(\bar\theta) = \frac{1}{2}$. However, the results of Sections 2 and 3 give evidence of the risk of using a design approaching $\xi^*$.

 

• When $h(\bar\theta) = \frac{1}{2}$ but $\xi_n$ converges weakly to $\xi^*$, the limiting variance of $h(\hat\theta^n)$ is larger than $1/(4\bar\theta_2^2)$, see (4).
• When $h(\bar\theta) = \frac{1}{2}$ but the number of observations at $x \ne 1$ is finite, as is the case for a design generated by the steepest descent algorithm, the limiting distribution of $h(\hat\theta^n)$ is not normal, see (10).
• When $h(\bar\theta) \ne \frac{1}{2}$, although close to $\frac{1}{2}$ (and one cannot be sure that $h(\bar\theta) = \frac{1}{2}$, otherwise no experiment would be needed), the speed of convergence of $h(\hat\theta^n)$ is slower than $\sqrt{n}$, see (5), (11), and $h(\hat\theta^n)$ may even be not consistent, see (9).

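A first possibility to avoid these difficulties is to use a non-singular design, at the cost of a possible loss of efficiency. For instance, a design $\xi_\alpha$ that puts weight $\alpha$ at $x = 1$ and $1 - \alpha$ at $-1$, $0 < \alpha < 1$, ensures $\sqrt{n}$-convergence of $h(\hat\theta^n)$, and

$$\sqrt{n}\, [h(\hat\theta^n) - h(\bar\theta)] \xrightarrow{d} \zeta_7 \sim N\left( 0,\ \frac{1}{4\bar\theta_2^2}\, [1, 2h(\bar\theta)]\, M^{-1}(\xi_\alpha)\, [1, 2h(\bar\theta)]^\top \right)$$

as $n \to \infty$. When one knows that $\bar\theta_1 + \bar\theta_2$ is close to 0, one may then use $\xi_\alpha$ with $\alpha$ close to 1. Its efficiency is given by

$$\mathrm{eff}(\alpha, h) = \frac{[1, 2h]\, M^{-1}(\xi_{\alpha^*})\, [1, 2h]^\top}{[1, 2h]\, M^{-1}(\xi_\alpha)\, [1, 2h]^\top}$$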
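The efficiency $\mathrm{eff}(\alpha, h)$ is simple to evaluate numerically. The sketch below assumes the two-point designs on $\{-1, 1\}$ of this section, takes the optimal weight at $x = 1$ from (3), and uses a pseudo-inverse as one particular g-inverse so that the singular case $h = \frac{1}{2}$ (weight 1 at $x = 1$) is covered; the function names are hypothetical.

```python
import numpy as np

def info_matrix(alpha):
    """M(xi_alpha) for weight alpha at x = 1 and 1 - alpha at x = -1, f(x) = (x, x^2)^T."""
    r = 2.0 * alpha - 1.0
    return np.array([[1.0, r], [r, 1.0]])

def optimal_weight(h):
    """Weight at x = 1 of the c-optimal design, following (3), for h >= 0."""
    return 0.5 + h if h <= 0.5 else 0.5 + 1.0 / (4.0 * h)

def c_variance(alpha, h):
    """[1, 2h] M^-(xi_alpha) [1, 2h]^T, with the pseudo-inverse used as g-inverse."""
    c = np.array([1.0, 2.0 * h])
    return float(c @ np.linalg.pinv(info_matrix(alpha)) @ c)

def efficiency(alpha, h):
    """eff(alpha, h): optimal variance divided by the variance under xi_alpha."""
    return c_variance(optimal_weight(h), h) / c_variance(alpha, h)

if __name__ == "__main__":
    for h in (0.0, 0.25, 0.5, 0.75, 2.0):
        print(h, efficiency(0.75, h))
```

For $\alpha = \frac{3}{4}$ the printed efficiencies do not fall below 0.75, with the value 0.75 attained at $h = 0$ and $h = \frac{1}{2}$ and the value 1 at $h = \frac{1}{4}$, where $\xi_{3/4}$ is itself optimal according to (3).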

In the definition of $\mathrm{eff}(\alpha, h)$, $\alpha^* = \alpha^*(h)$ corresponds to $\xi^*(1)$ in (3). The function $\mathrm{eff}(\alpha, h)$ is plotted in Fig. 1 for $\alpha \in [0.5, 1)$, $h \in [0.25, 0.75]$. Although $\mathrm{eff}(\alpha, \frac{1}{2})$ quickly decreases when $\alpha$ moves away from 1, the loss of efficiency remains reasonable for small departures. In particular, $\xi_{3/4}$ is maximin-efficient, see Silvey (1980, p. 59): it guarantees $\mathrm{eff}(\frac{3}{4}, h) \ge 0.75$ for any $h \ge 0$, the minimum efficiency being obtained for $h = 0$ and $h = \frac{1}{2}$. (Note the difference with Schwabe (1997), where $\theta_1$ is not restricted to be positive. The maximin-efficient design is then $\xi_{1/2}$, it is D-optimal and its minimum efficiency is 0.5.)

Fig. 1. Efficiency $\mathrm{eff}(\alpha, h)$.

Another option consists in designing $\xi_n$ sequentially, that is, using algorithm (7) with $\partial h(\theta)/\partial \theta|_{\hat\theta^k}$ substituted for $\partial h(\theta)/\partial \theta|_{\bar\theta}$ in the determination of $x_{k+1}$. Strong consistency of $\hat\theta^n$ is proved in Ford and Silvey (1980), $\xi_n$ converges to the optimum design $\xi^*_{\bar\theta}$ for $\bar\theta$, and the asymptotic normality

$$\sqrt{n}\, [h(\hat\theta^n) - h(\bar\theta)] \xrightarrow{d} \zeta_8 \sim N\left( 0,\ \frac{1}{4\bar\theta_2^2}\, [1, 2h(\bar\theta)]\, M^-(\xi^*_{\bar\theta})\, [1, 2h(\bar\theta)]^\top \right), \tag{13}$$

$n \to \infty$, is proved in Wu (1985). The asymptotic efficiency thus equals one. In particular, (13) remains valid when $\bar\theta_1 + \bar\theta_2 = 0$, and then coincides with (12). When feasible, sequential design thus appears as the natural remedy to the issues raised in Sections 2 and 3.

However, some difficulties should not be underestimated. The proof in Wu (1985) of the asymptotic result (13) under a sequential design is very much problem specific. Strong consistency of the LS estimator in the linear model under a sequential design requires stronger conditions than $M^{-1}(\xi_n)/n \to 0$, see Lai and Wei (1982). Bayesian imbedding makes it possible to weaken those conditions (Sternby, 1977) (at the expense of obtaining strong consistency of the estimator for almost all values of $\bar\theta$ with respect to some prior distribution), but its application to the sequential design of experiments (Hu, 1998) prohibits singular designs.

We hope we have convinced the reader of the richness of possible asymptotic behaviors of estimators under asymptotically singular designs. Combining this with a sequential construction of the design raises many challenging issues.

References

Buonaccorsi, J., Iyer, H., 1986. Optimal designs for ratios of linear combinations in the general linear model. J. Statist. Plann. Inference 13, 345–356.
Chaloner, K., 1989. Bayesian design for estimating the turning point of a quadratic regression. Comm. Statist. Theory Methods 18 (4), 1385–1400.
Fedorov, V., Müller, W., 1997. Another view on optimal design for estimating the point of extremum in quadratic regression. Metrika 46, 147–157.
Ford, I., Silvey, S., 1980. A sequentially constructed design for estimating a nonlinear parametric function. Biometrika 67 (2), 381–388.
Ford, I., Titterington, D., Wu, C., 1985. Inference and sequential design. Biometrika 72 (3), 545–551.
Hu, I., 1998. On sequential designs in nonlinear problems. Biometrika 85 (2), 496–503.
Huber, P., 1973. Robust regression: asymptotics, conjectures and Monte Carlo. Ann. Statist. 1 (5), 799–821.
Lai, T., Wei, C., 1982. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. Ann. Statist. 10 (1), 154–166.
Pázman, A., 1980. Singular experimental designs. Math. Operationsforsch. Statist., Ser. Statist. 16, 137–149.
Schwabe, R., 1997. Maximin efficient designs. Another view at D-optimality. Statist. Probab. Lett. 35, 109–114.
Silvey, S., 1980. Optimal Design. Chapman & Hall, London.
Sternby, J., 1977. On consistency for the method of least squares using martingale theory. IEEE Trans. Automat. Control 22 (3), 346–352.
Wu, C.-F., 1980. Characterizing the consistent directions of least squares estimates. Ann. Statist. 8 (4), 789–801.
Wu, C., 1985. Asymptotic inference from sequential design in a nonlinear situation. Biometrika 72 (3), 553–558.
Wynn, H., 1972. Results in the theory and construction of D-optimum experimental designs. J. Roy. Statist. Soc. B 34, 133–147.