Screening designs for model discrimination


Journal of Statistical Planning and Inference 140 (2010) 766–780

journal homepage: www.elsevier.com/locate/jspi

Vincent Agboto (a), William Li (b,*), Christopher Nachtsheim (b)

(a) Department of Family and Community Medicine, Meharry Medical College, Nashville, TN 37208, USA
(b) Carlson School of Management, University of Minnesota, Minneapolis, MN 55455, USA

Article info

Article history: Received 2 March 2009; received in revised form 31 August 2009; accepted 7 September 2009; available online 12 September 2009.

Keywords: Bayesian designs; Coordinate exchange algorithm; Design projections; Model discrimination; Model-robust design; Non-regular designs

Abstract

We introduce new criteria for model discrimination and use these and existing criteria to evaluate standard orthogonal designs. We show that the capability of orthogonal designs for model discrimination is surprisingly varied. In fact, for specified sample sizes, numbers of factors, and model spaces, many orthogonal designs are not model-discriminating by the definition given in this paper, while others in the same class of orthogonal designs are. We also use these criteria to construct optimal two-level model-discriminating designs for screening experiments. The efficacy of these designs is studied, both in terms of estimation efficiency and discrimination success. Simulation studies indicate that the constructed designs result in substantively higher likelihoods of identifying the correct model. © 2009 Elsevier B.V. All rights reserved.

* Corresponding author.
0378-3758/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2009.09.005

1. Introduction

Screening experiments are widely used in industrial settings to identify the critical set of active factors from a much larger set of candidate factors. They are often employed during the early stages of a research project, when little information is available concerning the impact of various factors. Popular designs for screening experiments include regular orthogonal designs, in the form of resolution III fractional factorial designs, and non-regular orthogonal designs, often in the form of Plackett–Burman (PB) designs. When using resolution III fractional factorial designs or PB designs, the experimenter is asked to suspend temporarily his or her belief in interactions, with the promise that they will be dealt with in later experiments. Because follow-up experiments may or may not happen, and because the presence of one or more active two-factor interactions can seriously bias the analysis of the screening experiment, increased attention has been given recently to experiments that can screen main effects and key two-factor interactions simultaneously.

One fruitful approach has been to better exploit the capabilities of non-regular designs, such as PB designs, for identifying active interactions in addition to main effects by considering their projective properties. For example, it has been demonstrated (Lin and Draper, 1992; Box and Bisgaard, 1993) that some PB designs contain a $2^3$ full factorial and a $2^{3-1}$ fractional factorial design when projected onto three factors. Cheng (1995) showed that for any non-regular design with run size not equal to eight, any projection onto four factors permits estimation of all main effects and two-factor interactions.

Another avenue of attack has been the use of model-robust designs. Loosely stated, a design is model-robust if it maximizes a design criterion over a model space $\mathcal{F}$ of candidate models, rather than for just a single known, or assumed, model. Studies of model robustness date back to Läuter (1974), who proposed, among other criteria, the maximization of the average log-determinant of the information matrices, where the average is taken over the set of information matrices generated by the model space. Cook and Nachtsheim (1982) developed linear-optimal model-robust designs. Sun (1993) extended these ideas with his development of the estimation capacity (EC) criterion. The estimation capacity of a design is simply the fraction of models comprised of all main effects and $g$ two-factor interactions that are estimable. Li and Nachtsheim (2000) used estimation capacity and information capacity (IC) to construct model-robust factorial designs. Bingham and Li (2002) obtained optimal orthogonal designs using a revised (EC, IC)-criterion in the context of robust parameter designs. Tsai et al. (2000, 2007) studied three-level main-effects designs that are robust to model uncertainty. Other recent contributions to this literature are discussed in Section 2.

As noted by Jones et al. (2007) and others, one limitation of the model-robust approach is that, while all models in the model space may be estimable for a particular model-robust design, aliasing among estimable models may exist. Model-discriminating designs are intended to address this shortcoming.

In this paper, we introduce new criteria for model discrimination and apply these and existing criteria to evaluate standard, two-level orthogonal designs for screening experiments. We consider two model spaces, referred to herein as $\mathrm{MEPI}_g$ and $\mathrm{PMS}_q$. The $\mathrm{MEPI}_g$ (Main Effects Plus $g$ Interactions) model space is given by

$$\mathrm{MEPI}_g = \{\text{models with all } m \text{ main effects and any } g \text{ two-factor interactions}\}. \tag{1}$$

This model space was motivated by the work of Srivastava (1975), who classified factorial effects into three categories: (i) effects certain to be negligible, (ii) required effects, and (iii) remaining effects which may or may not be active. The model space for Srivastava's search designs consisted of all models involving all effects of type (ii) plus $g$ effects of type (iii). $\mathrm{MEPI}_g$ is clearly a special case. $\mathrm{MEPI}_g$ has been employed recently by Sun (1993), Li and Nachtsheim (2000), Jones et al. (2007), and Li (2006) in the construction of model-robust and model-discriminating designs. The number of models in $\mathrm{MEPI}_g$ is $n_m = \binom{\binom{m}{2}}{g}$.

Another useful model space, $\mathrm{PMS}_q$ (projective model space of dimension $q$), was recently considered by Loeppky et al. (2007). This space is defined as

$$\mathrm{PMS}_q = \{\text{models with } q < m \text{ main effects and all corresponding two-factor interactions}\}. \tag{2}$$
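The two model spaces in (1) and (2) can be enumerated directly for small $m$. The sketch below is illustrative only: the function names and the representation of a model as a list of index tuples are assumptions made here, not notation from the paper. It checks the enumerated sizes against $\binom{\binom{m}{2}}{g}$ for $\mathrm{MEPI}_g$ and $\binom{m}{q}$ for $\mathrm{PMS}_q$.

```python
from itertools import combinations
from math import comb

def mepi_models(m, g):
    """MEPI_g: every main effect plus any g two-factor interactions.
    A model is a list of terms; a term is a tuple of factor indices."""
    mains = [(k,) for k in range(m)]
    two_fi = list(combinations(range(m), 2))
    return [mains + list(chosen) for chosen in combinations(two_fi, g)]

def pms_models(m, q):
    """PMS_q: q of the m main effects plus all two-factor interactions among them."""
    return [[(k,) for k in subset] + list(combinations(subset, 2))
            for subset in combinations(range(m), q)]

m = 7
print(len(mepi_models(m, 1)), comb(comb(m, 2), 1))  # 21 21
print(len(mepi_models(m, 2)), comb(comb(m, 2), 2))  # 210 210
print(len(pms_models(m, 3)), comb(m, 3))            # 35 35
```

The rapid growth of $\mathrm{MEPI}_g$ with $g$ is what makes pairwise criteria over all $n_m(n_m - 1)$ ordered model pairs expensive for larger designs.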

Unlike $\mathrm{MEPI}_g$, this model space does not assume that all first-order effects are important; however, if two main effects are present in a model, then the corresponding two-factor interaction must also be present. Loeppky et al. (2007) evaluated the model robustness of 16-run and 20-run orthogonal designs for the $\mathrm{PMS}_q$ model space of (2). The number of models in $\mathrm{PMS}_q$ is $n_m = \binom{m}{q}$.

In Section 2, we review prior work in model discrimination and introduce new criteria. In Section 3, we employ key discrimination criteria to evaluate standard orthogonal designs for 12, 16, and 20 runs for the $\mathrm{MEPI}_g$ and $\mathrm{PMS}_q$ model spaces. In Section 4 we consider the construction of optimal model-discriminating designs. Then we discuss the efficacy of these designs in Section 5. Concluding remarks are given in Section 6.

In what follows, we employ the following notation. A two-level design $d$ having $n$ runs and $m$ factors is represented by the $n \times m$ design matrix $X_d = [x_1, \ldots, x_n]'$, where each row of $X_d$ is an $m$-vector whose elements are $+1$ or $-1$. The $n \times p$ model matrix $X$ is given by

$$X = [f(x_1), \ldots, f(x_n)]',$$

where the functional $f$ indicates which effects are present in the model. For example, for the model consisting of all $m$ main effects and all $\binom{m}{2}$ interactions, we have

$$f'(x_i) = (1, x_{i1}, \ldots, x_{im}, x_{i1}x_{i2}, x_{i1}x_{i3}, \ldots, x_{i,m-1}x_{im}).$$

Throughout, we assume that the standard linear model assumptions are valid for at least one model in $\mathcal{F}$:

$$y = X\beta + \varepsilon,$$

where $\beta$ is a $p \times 1$ vector of unknown parameters and the error vector $\varepsilon$ has variance–covariance matrix $\sigma^2 I$.
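As a concrete reading of this notation, the following sketch builds the model matrix $X$ from a design matrix and a list of model terms. The helper `model_matrix` and the tuple encoding of terms are assumptions introduced for illustration; the $2^3$ full factorial is used only because its model matrix is easy to check.

```python
import numpy as np
from itertools import combinations, product

def model_matrix(design, terms):
    """n x p model matrix: an intercept column, then one column per term.
    A term is a tuple of factor indices; its column is the product of those factors."""
    design = np.asarray(design, dtype=float)
    cols = [np.ones(len(design))]
    cols += [np.prod(design[:, list(t)], axis=1) for t in terms]
    return np.column_stack(cols)

# 2^3 full factorial, used here purely as an example design
design = np.array(list(product([-1, 1], repeat=3)))
m = design.shape[1]

# All m main effects and all m-choose-2 two-factor interactions
full_terms = [(k,) for k in range(m)] + list(combinations(range(m), 2))
X = model_matrix(design, full_terms)

print(X.shape)                              # (8, 7)
print(np.allclose(X.T @ X, 8 * np.eye(7)))  # True: all columns are orthogonal
```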

2. Model discrimination criteria

As already noted, for a given design $d$, two candidate models $f_i$ and $f_j$ may both be estimable, but the experimenter may have little ability (or power) for determining which is the correct model. Let $X_i$ and $X_j$ denote the $n \times p_i$ and $n \times p_j$ model matrices for design $d$ corresponding to $f_i$ and $f_j$. Criteria for model discrimination generally consider the degree to which model pairs can be distinguished. We consider a design $d$ to be model-discriminating for model space $\mathcal{F}$ and criterion $C$ if the following two conditions are met:

1. Design $d$ has estimation capacity equal to 100% for model space $\mathcal{F}$.
2. All model pairs have the potential to be discriminated by criterion $C$.


Overall discrimination criteria considered here will generally be written either as a weighted average of discrimination measures, taken over all $n_m(n_m - 1)$ model pairs:

$$C = \sum_i \sum_{j \ne i} w_{ij} C_{ij},$$

where $w_{ij}$ are model-pair weights and $C_{ij}$ is a measure of model discrimination for model pairs $f_i$ and $f_j$, or as a minimum discrimination measure:

$$C_{\min} = \min_{i \ne j} C_{ij}.$$

In this section, we review existing criteria and introduce three new criteria for model discrimination.

2.1. Existing criteria

Like model robustness, work in model discrimination is not new. Sequential designs for discriminating between two models were derived by Fedorov and Malyutov (1972), Fedorov and Uspensky (1975), and Atkinson and Fedorov (1975a). The sequential design of experiments for discriminating among three or more regression models was taken up by Atkinson and Fedorov (1975b) using a Bayesian approach. They summarized the properties of the designs in a generalized equivalence theorem. More recently, Ponce De Leon and Atkinson (1991) used numerical methods to obtain non-sequential optimum designs that satisfy an equivalence theorem, which can be used both for the construction of designs and for checking the optimality of proposed designs.

Atkinson and Fedorov (1975b) generally focused on sequential designs for nonlinear regression models but gave some useful guidelines for how to proceed in the case of linear models. They noted that their Bayesian T-optimality criterion suggests various alternatives for model discrimination in the case of non-sequential designs for linear models. For two models $f_i$ and $f_j$ in $\mathcal{F}$, let $f_i^{(j)}$ denote the vector of $p_i^{(j)}$ model terms that are in $f_i$ but not in $f_j$, and let $X_i^{(j)}$ and $\beta_i^{(j)}$, respectively, denote the model matrix and coefficient vector corresponding to $f_i^{(j)}$. For design $d$, the non-centrality parameter for the $j$th model, when the $i$th model is true, is

$$\Delta_i^{(j)}(d) = \beta_i^{(j)\prime} M_i^{(j)} \beta_i^{(j)}, \tag{3}$$

where

$$M_i^{(j)} = X_i^{(j)\prime} (I - H_j) X_i^{(j)}, \tag{4}$$

and $H_j = X_j (X_j' X_j)^{-1} X_j'$. Atkinson and Fedorov (1975b) proposed the following pairwise measure for model discrimination:

$$AF_{ij} = \frac{1}{p_i^{(j)}} \log |M_i^{(j)}|. \tag{5}$$
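Equations (4) and (5) can be evaluated numerically for one model pair. The sketch below is a hedged illustration, not the authors' code: the helpers `pb12`, `term_columns`, and `M_matrix` are names invented here, and the 12-run design is built from the standard cyclic Plackett-Burman generator, on the assumption that its first four columns are isomorphic to the 12 × 4 design evaluated in Table 1.

```python
import numpy as np

def pb12(m):
    """First m columns of a 12-run Plackett-Burman design (cyclic generator plus a row of -1)."""
    gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
    rows = [np.roll(gen, k) for k in range(11)] + [-np.ones(11, dtype=int)]
    return np.vstack(rows)[:, :m]

def term_columns(design, terms):
    """One column per term; a term is a tuple of factor indices whose columns are multiplied."""
    return np.column_stack([np.prod(design[:, list(t)], axis=1) for t in terms])

def M_matrix(design, f_i, f_j):
    """M_i^(j) of (4): the terms of f_i not in f_j, adjusted for the columns of model f_j."""
    n = len(design)
    X_j = np.column_stack([np.ones(n), term_columns(design, f_j)])
    H_j = X_j @ np.linalg.pinv(X_j)  # equals X_j (X_j' X_j)^{-1} X_j' when X_j has full rank
    X_extra = term_columns(design, [t for t in f_i if t not in f_j])
    return X_extra.T @ (np.eye(n) - H_j) @ X_extra

design = pb12(4)
mains = [(k,) for k in range(4)]
f_i = mains + [(0, 1)]   # all main effects plus the x1*x2 interaction
f_j = mains + [(2, 3)]   # all main effects plus the x3*x4 interaction

M = M_matrix(design, f_i, f_j)
p = M.shape[0]
AF_ij = np.linalg.slogdet(M)[1] / p   # (1/p) log|M|, as in (5)
print(M.round(3), round(float(AF_ij), 3))
```

For this pair $M_i^{(j)}$ is the scalar $160/21 \approx 7.619$ and $AF_{ij} \approx 2.031$, which agree with the ENCPmin and AFmin entries reported for $m = 4$, $g = 1$ in Table 1.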

For overall model discrimination criteria, Atkinson and Fedorov (1975b) suggested both a weighted average of the $AF_{ij}$ and the maximization of the minimum (over $i$ and $j$) determinant, $|M_i^{(j)}|^{1/p_i^{(j)}}$. They also proposed consideration of

$$\sum_{j=1}^{v} \frac{1}{s_j} \log |M_j^{(0)}|,$$

where $M_j^{(0)}$ is the $s_j \times s_j$ dispersion matrix of the complement of the $j$th model in the overall model formed by combining all linearly independent terms in the $v$ models. For $\mathrm{MEPI}_g$ and $\mathrm{PMS}_q$, this criterion is similar to $AF_{ij}$, except that it is assumed that the true model would be comprised of all main effects and two-factor interactions. We do not pursue this criterion here. For brevity we refer to (5) as the AF model discrimination criterion. Note that $AF_{ij}$ is not necessarily symmetric in $i$ and $j$.

Other authors have developed model discrimination criteria based on distance measures between predictive distributions for model pairs. Meyer et al. (1996) used the Kullback–Leibler information, while Bingham and Chipman (2007) employed the Hellinger distance. Jones et al. (2007) introduced three pairwise model discrimination measures: the subspace angle, the expected prediction difference, and the maximum prediction difference. In this article we consider the Expected Prediction Difference (EPD) criterion, which, for models $f_i$ and $f_j$, is defined as

$$\mathrm{EPD}_{ij} = E\left(\|\hat{y}_i - \hat{y}_j\|^2 \,\middle|\, \|y\| = 1\right) = \int_{\|y\| = 1} y' D_{ij}\, y \, dy,$$

where $D_{ij} = (H_i - H_j)(H_i - H_j)$. Assuming a prior uniform distribution on the unit sphere for the centered response vector, it is shown that

$$\mathrm{EPD}_{ij} = \frac{1}{n} \mathrm{Trace}(D_{ij}). \tag{6}$$

Note that $\mathrm{EPD}_{ij}$ is symmetric in $i$ and $j$.
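The closed form (6) is straightforward to compute from the two hat matrices. The following sketch assumes the same illustrative 12-run Plackett-Burman construction and tuple encoding of model terms as a convenience; the helper names are hypothetical and not taken from Jones et al. (2007) or from the paper.

```python
import numpy as np

def pb12(m):
    """First m columns of a 12-run Plackett-Burman design (cyclic generator plus a row of -1)."""
    gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
    rows = [np.roll(gen, k) for k in range(11)] + [-np.ones(11, dtype=int)]
    return np.vstack(rows)[:, :m]

def hat_matrix(design, terms):
    """Projection onto the column space of the model matrix (intercept plus the given terms)."""
    cols = [np.ones(len(design))]
    cols += [np.prod(design[:, list(t)], axis=1) for t in terms]
    X = np.column_stack(cols)
    return X @ np.linalg.pinv(X)

def epd(design, f_i, f_j):
    """EPD_ij = Trace[(H_i - H_j)(H_i - H_j)] / n, following (6)."""
    diff = hat_matrix(design, f_i) - hat_matrix(design, f_j)
    return np.trace(diff @ diff) / len(design)

design = pb12(4)
mains = [(k,) for k in range(4)]
disjoint = epd(design, mains + [(0, 1)], mains + [(2, 3)])   # interactions share no factor
overlap = epd(design, mains + [(0, 1)], mains + [(0, 2)])    # interactions share one factor
print(round(float(disjoint), 3), round(float(overlap), 3))   # 0.136 0.163
```

The smaller value, for the pair whose interactions share no factor, matches the EPDmin entry for $m = 4$, $g = 1$ in Table 1. Swapping the two models leaves each value unchanged, reflecting the symmetry of $\mathrm{EPD}_{ij}$.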


2.2. New criteria

In this section we introduce three new model discrimination criteria, namely, the $A_s$ criterion, the Expected Non-Centrality Parameter (ENCP) criterion, and the Bayesian Expected Prediction Difference (BEPD) criterion. The first two criteria can be thought of as extensions of the AF criterion, while the third will be shown to be related to both the AF and the EPD criteria.

We note that the Atkinson–Fedorov discrimination measure $AF_{ij}$ is equivalent to $D_s$ optimality for the single model comprised of terms in $f_j \cup f_i^{(j)}$ (i.e., terms in $f_j$ plus terms in $f_i$ that are not in $f_j$), where interest centers on the subset of parameters $\beta_i^{(j)}$. A natural variant of the $D_s$ criterion is provided by $A_s$ optimality. As a measure of pairwise model discrimination, we have

$$As_{ij} = \frac{1}{p_i^{(j)}} \mathrm{Trace}\left[(M_i^{(j)})^{-1}\right]. \tag{7}$$

We refer to (7) as the $A_s$ model discrimination criterion. Note that the matrix $M_i^{(j)}$ defined in (4) is invertible only if $|M_i^{(j)}| \ne 0$. When $|M_i^{(j)}| = 0$, neither $AF_{ij}$ nor $As_{ij}$ can be computed.

Both the AF and $A_s$ criteria are conservative, in the sense that $|M_i^{(j)}|$ will be zero as long as there exists just one direction, $\beta_i^{(j)}$, for which the non-centrality parameter $\beta_i^{(j)\prime} M_i^{(j)} \beta_i^{(j)} = 0$. A less conservative, Bayesian measure of discrimination between two models can be obtained if we assume, a priori, that $\beta \sim N(0, \sigma_\beta^2 I)$, and we consider the expected non-centrality parameter. In this case we have the following result, the justification for which is standard and is provided in the appendix.

Theorem 1. Under the conditions described above, the expected non-centrality parameter is given by

$$E_\beta\left\{\beta_i^{(j)\prime} M_i^{(j)} \beta_i^{(j)}\right\} = \sigma_\beta^2\, \mathrm{Trace}[M_i^{(j)}].$$

To standardize this measure, we take $\sigma_\beta^2 = 1$ without loss of generality. Also, because the number of parameters varies with $i$ and $j$, we employ the "equal interest" criterion option (Atkinson and Donev, 1992) and normalize by dividing by $p_i^{(j)}$. The resulting expected non-centrality parameter criterion (ENCP) is

$$\mathrm{ENCP}_{ij} = \frac{1}{p_i^{(j)}} \mathrm{Trace}[M_i^{(j)}]. \tag{8}$$
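Theorem 1 and the two trace-based measures (7) and (8) can be illustrated with a model pair that differs in two interactions, so that $M_i^{(j)}$ is $2 \times 2$. This is a minimal sketch under stated assumptions: the design is the illustrative cyclic 12-run Plackett-Burman construction, the helper names are hypothetical, and the Monte Carlo step (with $\sigma_\beta^2 = 1$) only approximates the expectation in Theorem 1.

```python
import numpy as np

def pb12(m):
    """First m columns of a 12-run Plackett-Burman design (cyclic generator plus a row of -1)."""
    gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
    rows = [np.roll(gen, k) for k in range(11)] + [-np.ones(11, dtype=int)]
    return np.vstack(rows)[:, :m]

def term_columns(design, terms):
    return np.column_stack([np.prod(design[:, list(t)], axis=1) for t in terms])

def M_matrix(design, f_i, f_j):
    """M_i^(j) = X_i^(j)' (I - H_j) X_i^(j) for the terms of f_i that are not in f_j."""
    n = len(design)
    X_j = np.column_stack([np.ones(n), term_columns(design, f_j)])
    H_j = X_j @ np.linalg.pinv(X_j)
    X_extra = term_columns(design, [t for t in f_i if t not in f_j])
    return X_extra.T @ (np.eye(n) - H_j) @ X_extra

design = pb12(4)
mains = [(k,) for k in range(4)]
f_i = mains + [(0, 1), (2, 3)]
f_j = mains + [(0, 2), (1, 3)]

M = M_matrix(design, f_i, f_j)
p = M.shape[0]
As_ij = np.trace(np.linalg.inv(M)) / p   # (7): requires M to be invertible
ENCP_ij = np.trace(M) / p                # (8): defined even when M is singular

# Monte Carlo check of Theorem 1 with sigma_beta^2 = 1 (an approximation, not a proof)
rng = np.random.default_rng(0)
beta = rng.standard_normal((200_000, p))
mc_mean = np.mean(np.einsum("ij,jk,ik->i", beta, M, beta))
print(round(float(mc_mean), 2), round(float(np.trace(M)), 2))
```

The simulated mean of $\beta' M \beta$ should land close to $\mathrm{Trace}[M]$. Because the arithmetic mean of the eigenvalues of $M$ is at least their harmonic mean, $As_{ij} \ge 1/\mathrm{ENCP}_{ij}$ whenever $M$ is positive definite.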

The ENCP criterion is closely related to the $A_s$ criterion; however, the $A_s$ criterion requires that $M_i^{(j)}$ be invertible in order to discriminate between two models.

The BEPD criterion is motivated by the EPD criterion of Jones et al. (2007). From (6), the squared Prediction Distance (PD) for any two models $f_i$ and $f_j$ and response vector $y$ is given by $\mathrm{PD} = y' D_{ij}\, y$. Assume that the true model is $f_i$ and that, a priori, $\varepsilon \sim N(0, \sigma_\varepsilon^2 I)$ and $\beta \sim N(0, \sigma_\beta^2 I)$. Then we have the following result, the proof of which is provided in the appendix.

Theorem 2. Under the conditions described above, the expected prediction distance is given by

$$E_{\beta,\varepsilon}(\mathrm{PD}) = \sigma_\beta^2\, \mathrm{Trace}[M_i^{(j)}] + \sigma_\varepsilon^2\, \mathrm{Trace}[D_{ij}].$$

Note that $E_{\beta,\varepsilon}(\mathrm{PD})$ is a linear combination of two criteria already introduced, ENCP of (8) and EPD of (6). Clearly, when treatment effects are expected to be large relative to experimental error (i.e., for $\sigma_\beta^2/\sigma_\varepsilon^2 \to \infty$), the expected prediction difference is equivalent to the ENCP criterion. When noise is expected to dominate (i.e., for $\sigma_\beta^2/\sigma_\varepsilon^2 \to 0$), the expected prediction difference is equivalent to EPD. We define the Bayesian Expected Prediction Difference (BEPD) criterion as

$$\mathrm{BEPD}_{ij} = \alpha\, \mathrm{ENCP}_{ij} + (1 - \alpha)\, \mathrm{EPD}_{ij}, \tag{9}$$

where $0 \le \alpha \le 1$ is a mixing parameter.

2.3. Summary of criteria considered

In the next two sections, we use the following model discrimination criteria to evaluate orthogonal designs and to construct optimal model-discriminating designs:

$$AF = \sum_{i \ne j} w_{ij} \frac{1}{p_i^{(j)}} \log |M_i^{(j)}|, \tag{10}$$

$$AF_{\min} = \min_{i \ne j} \frac{1}{p_i^{(j)}} \log |M_i^{(j)}|, \tag{11}$$

$$A_s = \sum_{i \ne j} w_{ij} \frac{1}{p_i^{(j)}} \mathrm{Trace}\left[(M_i^{(j)})^{-1}\right], \tag{12}$$

$$A_{s,\max} = \max_{i \ne j} \frac{1}{p_i^{(j)}} \mathrm{Trace}\left[(M_i^{(j)})^{-1}\right], \tag{13}$$

$$\mathrm{ENCP} = \sum_{i \ne j} w_{ij} \frac{1}{p_i^{(j)}} \mathrm{Trace}[M_i^{(j)}], \tag{14}$$

$$\mathrm{ENCP}_{\min} = \min_{i \ne j} \frac{1}{p_i^{(j)}} \mathrm{Trace}[M_i^{(j)}], \tag{15}$$

$$\mathrm{EPD} = \sum_{i \ne j} w_{ij} \frac{1}{n} \mathrm{Trace}[D_{ij}], \tag{16}$$

$$\mathrm{EPD}_{\min} = \min_{i \ne j} \frac{1}{n} \mathrm{Trace}[D_{ij}]. \tag{17}$$

Throughout, we employ uniform weights $w_{ij} = 1/(n_m(n_m - 1))$ and take the summations over $i \ne j$. This is because the summands are generally symmetric in $i$ and $j$ only for the EPD criterion (16). The AF, $A_s$, and ENCP criteria, which are based on $\Delta_i^{(j)}(d)$, are not necessarily symmetric in $i$ and $j$.

For a design $d$ to be model-discriminating over a given criterion, we require (in addition to 100% estimation capacity):

$$\min_{i \ne j} |M_i^{(j)}| > 0 \quad \text{for the AF and } A_s \text{ criteria}, \tag{18}$$

$$\min_{i \ne j} \mathrm{Trace}[M_i^{(j)}] > 0 \quad \text{for the ENCP criteria}, \tag{19}$$

$$\min_{i \ne j} \mathrm{Trace}[D_{ij}] > 0 \quad \text{for the EPD criteria}. \tag{20}$$
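The eight summary criteria (10)–(17) and the conditions (18)–(20) can be put together in a short routine. The sketch below is an illustration under assumptions: the function names, the numerical tolerance `tol`, and the cyclic 12-run Plackett-Burman construction are choices made here. It uses uniform weights over ordered pairs and is applied to the $\mathrm{MEPI}_1$ space with $m = 4$.

```python
import numpy as np
from itertools import combinations, permutations

def pb12(m):
    """First m columns of a 12-run Plackett-Burman design (cyclic generator plus a row of -1)."""
    gen = np.array([1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1])
    rows = [np.roll(gen, k) for k in range(11)] + [-np.ones(11, dtype=int)]
    return np.vstack(rows)[:, :m]

def term_columns(design, terms):
    return np.column_stack([np.prod(design[:, list(t)], axis=1) for t in terms])

def mepi_models(m, g):
    mains = [(k,) for k in range(m)]
    return [mains + list(c) for c in combinations(list(combinations(range(m), 2)), g)]

def criteria(design, models, tol=1e-9):
    n = len(design)
    X = [np.column_stack([np.ones(n), term_columns(design, f)]) for f in models]
    ec_100 = all(np.linalg.matrix_rank(Xi) == Xi.shape[1] for Xi in X)  # estimation capacity
    H = [Xi @ np.linalg.pinv(Xi) for Xi in X]
    dets, af, a_s, encp, epd = [], [], [], [], []
    for i, j in permutations(range(len(models)), 2):      # ordered pairs with i != j
        extra = [t for t in models[i] if t not in models[j]]
        Xe = term_columns(design, extra)
        M = Xe.T @ (np.eye(n) - H[j]) @ Xe
        p = len(extra)
        dets.append(np.linalg.det(M))
        encp.append(np.trace(M) / p)
        epd.append(np.trace((H[i] - H[j]) @ (H[i] - H[j])) / n)
        if dets[-1] > tol:                                # AF and As need |M| > 0
            af.append(np.log(dets[-1]) / p)
            a_s.append(np.trace(np.linalg.inv(M)) / p)
    out = {"EC100": ec_100,
           "ENCP": np.mean(encp), "ENCPmin": min(encp),
           "EPD": np.mean(epd), "EPDmin": min(epd),
           "MD_AF_As": ec_100 and min(dets) > tol,        # condition (18)
           "MD_ENCP": ec_100 and min(encp) > tol,         # condition (19)
           "MD_EPD": ec_100 and min(epd) > tol}           # condition (20)
    if out["MD_AF_As"]:
        out.update(AF=np.mean(af), AFmin=min(af), As=np.mean(a_s), Asmax=max(a_s))
    return out

row = criteria(pb12(4), mepi_models(4, 1))
for key in ["Asmax", "As", "AFmin", "AF", "ENCPmin", "ENCP", "EPDmin", "EPD"]:
    print(key, round(float(row[key]), 3))
```

For this design and model space the printed values agree, to three decimals, with the $m = 4$, $g = 1$ row of Table 1 (0.131, 0.114, 2.031, 2.177, 7.619, 8.838, 0.136, 0.158).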

Note that for a design $d$ to be model-discriminating by the BEPD criterion, it is necessary that the design be model-discriminating by at least one of the ENCP and EPD criteria. We do not report criterion values for BEPD in what follows, since the BEPD criterion can be obtained from the ENCP and EPD values. We noticed that ENCP and EPD generally agree with each other on the ranking of orthogonal designs, but we also observed several occasions on which they do not. For instance, for $16 \times 6$ designs over $\mathrm{MEPI}_1$, ENCP ranks design #8 over design #5, whereas EPD does the opposite.

The following result indicates that if a model-robust design is characterized as non-discriminating by any one of the EPD, ENCP, or BEPD criteria, then it will also be similarly characterized by the other criteria. (In this article we define a design to be model robust if all the models in the model space are estimable with this design.) Thus these three criteria are equivalent in terms of their propensity to identify a design as non-discriminating. The result also indicates that the AF and $A_s$ criteria are more conservative model-discrimination criteria, as expected. The justification for the result is given in the appendix.

Theorem 3. Assume that the design $d$ is model robust. Then for design $d$ and model pair $(i, j)$,

$$\mathrm{EPD}_{ij} = 0 \iff \mathrm{ENCP}_{ij} = 0 \iff \mathrm{BEPD}_{ij} = 0 \implies |M_i^{(j)}| = 0.$$

From Theorem 3, if a design $d$ is model-discriminating over AF or $A_s$, then it is also model-discriminating over ENCP and EPD. On the other hand, if $d$ is not model-discriminating over AF (or $A_s$), then it may or may not be model-discriminating over ENCP and EPD. We now turn to the use of these criteria to evaluate the model discrimination properties of standard orthogonal designs.
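A small numerical illustration of Theorems 2 and 3 uses a regular $2^{4-1}$ fraction, in which one pair of interactions is fully aliased. The design choice, the helper names, and the value $\alpha = 0.5$ are assumptions made for this sketch. The identity checked is $\mathrm{Trace}[X_i' D_{ij} X_i] = \mathrm{Trace}[M_i^{(j)}]$, which is the step that produces the first term of Theorem 2.

```python
import numpy as np
from itertools import product

def term_columns(design, terms):
    return np.column_stack([np.prod(design[:, list(t)], axis=1) for t in terms])

def pair_measures(design, f_i, f_j):
    n = len(design)
    X_i = np.column_stack([np.ones(n), term_columns(design, f_i)])
    X_j = np.column_stack([np.ones(n), term_columns(design, f_j)])
    H_i, H_j = X_i @ np.linalg.pinv(X_i), X_j @ np.linalg.pinv(X_j)
    D = (H_i - H_j) @ (H_i - H_j)
    extra = [t for t in f_i if t not in f_j]
    Xe = term_columns(design, extra)
    M = Xe.T @ (np.eye(n) - H_j) @ Xe
    return {"ENCP": np.trace(M) / len(extra),
            "EPD": np.trace(D) / n,
            "detM": np.linalg.det(M),
            "trace_M": np.trace(M),
            "trace_XDX": np.trace(X_i.T @ D @ X_i)}  # coefficient of sigma_beta^2 in E(PD)

def bepd(encp, epd, alpha):
    return alpha * encp + (1 - alpha) * epd          # equation (9)

# Regular 2^(4-1) fraction with x4 = x1*x2*x3, so x1*x2 and x3*x4 are fully aliased
base = np.array(list(product([-1, 1], repeat=3)))
design = np.column_stack([base, base.prod(axis=1)])
mains = [(k,) for k in range(4)]

aliased = pair_measures(design, mains + [(0, 1)], mains + [(2, 3)])
clear = pair_measures(design, mains + [(0, 1)], mains + [(0, 2)])

print({k: round(float(v), 6) for k, v in aliased.items()})  # all measures are zero
print({k: round(float(v), 6) for k, v in clear.items()})    # ENCP = 8, EPD = 0.25
print(bepd(clear["ENCP"], clear["EPD"], alpha=0.5))
```

Each single model here is estimable, yet the aliased pair has $\mathrm{EPD}_{ij} = \mathrm{ENCP}_{ij} = |M_i^{(j)}| = 0$, so the fraction is model robust for $\mathrm{MEPI}_1$ but not model-discriminating, in line with Theorem 3.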

3. Evaluating standard orthogonal designs

Two-level orthogonal designs with small run sizes (e.g., $n \le 20$) are widely used in practice. Complete catalogs of all non-isomorphic orthogonal designs for $n = 12$, 16, and 20 were obtained by Sun et al. (2008). For given $n$ and $m$, there is usually more than one $n \times m$ orthogonal design, and the number of such designs can be quite large for $n = 16$ and 20. For instance, for $m = 7$, there are 55 orthogonal designs with 16 runs and 474 orthogonal designs with 20 runs. Those designs may perform quite differently from the perspectives of model robustness and model discrimination. Choosing the "best" orthogonal design for given $n$ and $m$ can be of interest to both researchers and practitioners. In this section, we rank and tabulate efficient 12-, 16-, and 20-run designs for practical use, using the model discrimination criteria discussed in the last section.


3.1. Evaluating 12-run designs

We first study 12-run designs in connection with $\mathrm{MEPI}_g$. For a given $m$, $3 \le m \le 11$, there is only one $12 \times m$ non-isomorphic orthogonal design, except for $m = 5$ and 6, where there are, respectively, two $12 \times 5$ and two $12 \times 6$ designs. In Table 1, we summarize the performance of these 12-run designs in terms of the eight criteria listed in (10)–(17). In the table, "N.D., EC = 100%" is listed whenever a design is model robust but is not model-discriminating (over AF). If the design is model-discriminating over ENCP and EPD, the ENCP and EPD values are reported. Conditions required for a design to be model-discriminating were given in (18)–(20). We list "EC < 100%" if a design is not model robust (and therefore not model-discriminating).

We first note from Table 1 that not all 12-run orthogonal designs are appropriate for model robustness and model discrimination. When $g = 1$, all six designs considered in the table have EC = 100%. However, only the first four designs are model-discriminating for the criteria considered. When $g = 2$, four designs are model-robust, but only the first one (for $m = 4$) is model-discriminating for all eight criteria. It is interesting to note that when $g = 1$ and $m = 5$ or 6, where there are two non-isomorphic designs, the two designs perform quite differently. When $m = 5$, the two designs have similar average values in terms of all eight criteria. However, for the second design listed, all minimum (or, for $A_s$, maximum) values are equal to the corresponding average values, implying that all model pairs have the same model discrimination criterion values. Thus, this design may be preferable. When $m = 6$, the first design clearly outperforms the second for all criterion values considered.

We now study 12-run designs for the $\mathrm{PMS}_q$ model space of (2). Table 2 compares 12-run designs for $m = 4, \ldots, 9$ for the $\mathrm{PMS}_q$ model space ($q = 2, 3$). All designs have EC = 100%, as predicted in Cheng (1995). In addition, all are model-discriminating designs when $q = 2$. Generally, designs have better model discrimination capability for the $\mathrm{PMS}_q$ model space than for the $\mathrm{MEPI}_g$ model space. This is not surprising because models in the former space contain only $q$ main effects plus all corresponding two-factor interactions, whereas $\mathrm{MEPI}_g$ requires estimation of all $m$ main effects plus a few two-factor interactions. Note that the two $12 \times m$ ($m = 5, 6$) designs have identical criteria values for $q = 2$. When $q = 3$, the second $12 \times 5$ design performs slightly better (EPD = 0.398) than the first (EPD = 0.389), and the first $12 \times 6$ design is just slightly better (EPD = 0.415) than the second (EPD = 0.409).

Table 1
Summary of discrimination criteria values for 12-run orthogonal designs for the MEPI_g model space (g = 1, 2).

g   m   As_max   As      AF_min   AF      ENCP_min   ENCP    EPD_min   EPD
1   4   0.131    0.114   2.031    2.177   7.619      8.838   0.136     0.158
1   5   0.225    0.143   1.492    1.962   4.444      7.210   0.093     0.150
1   5   0.141    0.141   1.962    1.962   7.111      7.111   0.148     0.148
1   6   0.234    0.182   1.451    1.721   4.267      5.689   0.107     0.142
1   6   N.D.     N.D.    N.D.     N.D.    –          –       –         –
1   7   N.D.     N.D.    N.D.     N.D.    –          –       –         –
2   4   0.134    0.122   2.021    2.114   7.556      8.391   0.138     0.220
2   5   N.D.     N.D.    N.D.     N.D.    3.556      6.365   0.083     0.220
2   5   N.D.     N.D.    N.D.     N.D.    5.333      6.000   0.125     0.205
2   6   N.D.     N.D.    N.D.     N.D.    2.667      4.571   0.093     0.200
2   6   EC < 100%
2   7   EC < 100%

Note: For m = 5, 6, there are two non-isomorphic designs. N.D. denotes "N.D., EC = 100%"; a dash marks a criterion for which no value is reported.

Table 2
Summary of model discrimination criteria values for 12-run orthogonal designs for the PMS_q model space (q = 2, 3).

q   m   As_max   As      AF_min   AF      ENCP_min   ENCP     EPD_min   EPD
2   4   0.109    0.097   2.250    2.344   9.778      10.489   0.296     0.319
2   5   0.109    0.099   2.250    2.328   9.778      10.370   0.296     0.333
2   5   0.109    0.099   2.250    2.328   9.778      10.370   0.296     0.333
2   6   0.109    0.100   2.250    2.317   9.778      10.286   0.296     0.344
2   6   0.109    0.100   2.250    2.317   9.778      10.286   0.296     0.344
2   7   0.109    0.101   2.250    2.308   9.778      10.222   0.296     0.352
2   8   0.109    0.102   2.250    2.302   9.778      10.173   0.296     0.358
2   9   0.109    0.103   2.250    2.297   9.778      10.133   0.296     0.363
3   4   0.135    0.135   2.010    2.010   7.500      7.500    0.352     0.352
3   5   N.D.     N.D.    N.D.     N.D.    6.800      7.267    0.352     0.389
3   5   N.D.     N.D.    N.D.     N.D.    7.200      7.400    0.352     0.398
3   6   N.D.     N.D.    N.D.     N.D.    5.750      7.117    0.352     0.415
3   6   N.D.     N.D.    N.D.     N.D.    5.750      7.076    0.352     0.409
3   7   N.D.     N.D.    N.D.     N.D.    5.750      6.971    0.352     0.429
3   8   N.D.     N.D.    N.D.     N.D.    5.750      6.857    0.352     0.440
3   9   N.D.     N.D.    N.D.     N.D.    5.750      6.764    0.352     0.447

Note: For m = 5, 6, there are two non-isomorphic designs. N.D. denotes "N.D., EC = 100%".

3.2. Evaluating 16-run orthogonal designs

As expected, for a given $m$, the $16 \times m$ orthogonal designs also perform very differently for model discrimination purposes. Consider, for instance, the 55 non-isomorphic orthogonal designs with $m = 7$ factors. Tables 3 and 4 summarize the model discrimination criterion values of those designs for the $\mathrm{MEPI}_g$ ($g = 1, 2$) model spaces, respectively. It can be seen that only a small fraction of designs are appropriate for model discrimination. For $g = 1$, Table 3 shows that about half (27 out of 55) are model-robust designs. Among them, only 10 are model-discriminating designs. For $g = 2$, Table 4 shows that there are 10 model-robust designs. Among these designs, none are model-discriminating by the AF criterion; however, one design, #52, is model-discriminating by the ENCP and EPD (and therefore by the BEPD) criteria. For comparison, we include results on the Hellinger distance (HD) in Tables 3 and 4 and note that the HD criterion generally ranks the designs consistently with the other model discrimination criteria (e.g., ENCP).

Table 3
Evaluation of 16 × 7 designs in terms of model-discriminating criteria values for the MEPI_1 model space.

Design                                 AF_min            AF      ENCP_min   ENCP     EPD_min   EPD     HD_min   HD
1–5, 7–11, 13–19, 22, 26–27, 29–30     EC < 100%
6, 12, 20–21, 23–25, 28, 31            N.D., EC = 100%
32                                     1.386             2.276   4.000      10.451   0.063     0.114   1.228    1.382
33–34                                  EC < 100%
35–39                                  N.D., EC = 100%
40–42                                  EC < 100%
43                                     0.693             2.136   2.000      9.365    0.042     0.113   1.010    1.342
44                                     N.D., EC = 100%
45                                     1.386             2.057   4.000      8.229    0.063     0.113   1.228    1.321
46                                     EC < 100%
47–48                                  N.D., EC = 100%
49                                     1.386             2.265   4.000      10.314   0.063     0.112   1.228    1.379
50                                     0.693             2.027   2.000      8.298    0.042     0.113   0.990    1.311
51                                     1.792             1.936   6.000      7.000    0.094     0.109   1.239    1.286
52                                     1.792             1.936   6.000      7.000    0.094     0.109   1.239    1.286
53                                     0.693             2.137   2.000      9.400    0.042     0.114   1.013    1.343
54                                     0.693             2.091   2.000      8.854    0.042     0.114   1.010    1.330
55                                     0.981             2.197   2.667      9.886    0.063     0.114   1.101    1.360

Table 4
Evaluation of 16 × 7 designs in terms of model-discriminating criteria values for the MEPI_2 model space.

Design                        AF_min            AF   ENCP_min   ENCP    EPD_min   EPD     HD_min   HD
1–31, 33–42, 44, 46–48        EC < 100%
32, 43, 45, 49–51, 53–55      N.D., EC = 100%
52                            N.D., EC = 100%   –    4.000      6.035   0.083     0.174   1.154    1.606

In Table 5 we summarize results of 16-run model-robust and model-discriminating designs for $m = 4, 5, \ldots, 8$ over both the $\mathrm{MEPI}_g$ ($g = 1, 2, 3$) and the $\mathrm{PMS}_q$ ($q = 2, 3, 4$) model spaces. For $m \ge 9$ there are no model-discriminating designs over either $\mathrm{MEPI}_g$ or $\mathrm{PMS}_q$. In the table, results are reported in four rows for each $m$. The first two rows display the numbers of model-robust designs and model-discriminating designs. The third row provides the design index of the best design in terms of the AF criterion. For comparison, the fourth row shows the design index of the minimum aberration design for a given $m$, which was obtained in Li et al. (2003). All corresponding designs can be found on the web site of the corresponding author.

Table 5
Numbers of 16-run model-robust (denoted by MR) and model-discriminating (denoted by MD) designs, and best designs for AF over different model spaces.

                                             MEPI_g model space          PMS_q model space
m   Total number of designs   MR/MD                g = 1     g = 2   g = 3   q = 2     q = 3   q = 4
4   5                         # of MR designs      3         3       3       5         4       N.A.(a)
                              # of MD designs      3         3       3       3         3       N.A.(a)
                              Best design for AF   #3        #3      #3      #3        #3      –
                              Minimum aberration   Design #3
5   11                        MR                   8         6       5       11        8       6
                              MD                   6         3       3       6         4       3
                              Best design for AF   #4        #4      #4      #4        #4      #4
                              Minimum aberration   Design #4
6   27                        MR                   16        8       2       27        17      9
                              MD                   7         0       0       9         0       0
                              Best design for AF   #13       –       –       #13       –       –
                              Minimum aberration   Design #6
7   55                        MR                   27        10      1       55        30      11
                              MD                   10        0       0       11        0       0
                              Best design for AF   #32       –       –       #32       –       –
                              Minimum aberration   Design #6
8   80                        MR                   16        3       0       80        32      0
                              MD                   3         0       0       3         0       0
                              Best design for AF   #67,68    –       –       #67,68    –       –
                              Minimum aberration   Design #6

(a) Not applicable for m = 4 and q = 4, since there is only one model here.

We note that, in most cases, only a small percentage of the designs are model-robust, and an even smaller percentage are model-discriminating. For instance, for $m = 6$, there are 27 orthogonal designs. Among them, 16 designs satisfy EC = 100% for the $\mathrm{MEPI}_g$ model space for $g = 1$; the number reduces to 8 and 2 for $g = 2$ and 3, respectively. The numbers of model-discriminating designs are much smaller: there are 7, 0, and 0 such designs for $g = 1$, 2, and 3. Again, note that for any given $m$, the numbers of model-robust and model-discriminating designs for the $\mathrm{PMS}_q$ model space are larger than those for the $\mathrm{MEPI}_g$ model space. Based on the results of Table 5 we recommend use of 16-run designs only for small numbers of factors. Specifically, for $g = 2, 3$ for the $\mathrm{MEPI}_g$ model space and $q = 3, 4$ for the $\mathrm{PMS}_q$ model space, 16-run designs can be recommended for $m \le 5$; for $g = 1$ and $q = 1, 2$ for the two model spaces, respectively, 16-run designs can be recommended for $m \le 8$.

Table 5 shows that, for each $m$, the best orthogonal designs in terms of AF are the same over the six different model spaces $\mathrm{MEPI}_g$ ($g = 1, 2, 3$) and $\mathrm{PMS}_q$ ($q = 2, 3, 4$). For $m = 4$ and $m = 5$ the best designs for AF are minimum aberration designs, but this is not true for larger $m$. In those cases, the minimum aberration designs are regular designs, whereas the best orthogonal designs for AF are non-regular designs. Thus, the commonly used minimum aberration designs may not perform optimally with respect to model discrimination criteria. For instance, we found that the minimum aberration design for $m = 8$ (Design #6) has EC < 100% for $g \ge 2$ for the $\mathrm{MEPI}_g$ model space and for $q \ge 4$ for the $\mathrm{PMS}_q$ model space. It satisfies EC = 100% for $g = 1$ and for $q = 2$ and 3 for the two model spaces, respectively. However, it is not model-discriminating in either case. In contrast, Designs 67 and 68 have much better properties in terms of model robustness and model discrimination. The criterion values for all 16-run designs were obtained and are available from the corresponding author.

3.3. Evaluating 20-run designs

There has been increasing interest in 20-run orthogonal designs since the complete catalog of non-isomorphic 20-run designs was obtained by Sun et al. (2008). Almost all 20-run designs are non-regular designs, the only exception being the design comprised of five replicates of a $4 \times 3$ regular orthogonal design. As noted by several authors (e.g., Loeppky et al., 2007; Cheng, 1995), non-regular designs often enjoy some attractive projection properties. In particular, 20-run designs become good candidates from the perspective of both model robustness and model discrimination. Loeppky et al. (2007) obtained efficient model-robust 20-run designs in terms of the (EC, IC)-criterion for the $\mathrm{PMS}_q$ model space. Li (2006) studied 20-run designs for the $\mathrm{MEPI}_g$ model space, using both the model robustness and model discrimination criteria proposed in Jones et al. (2007). We investigate 20-run designs in terms of the newly proposed model discrimination criteria for both model spaces.

In Table 6 we report percentages of model-discriminating designs for a given number of factors $m$, for model spaces $\mathrm{MEPI}_g$ ($g = 1, 2, 3$) and $\mathrm{PMS}_q$ ($q = 2, 3, 4$). While the percentages of model-robust designs are generally high for 20-run designs (see Li, 2006, Table 3), the percentages of model-discriminating designs are much smaller. It was shown in Li (2006) that all $20 \times m$ designs are model-robust for $m \le 18$ when $g = 1$, and for $m \le 7$ when $g = 2$ and 3; however, many such designs are not model-discriminating. For instance, Table 6 shows that when $g = 3$, only 34.7% of all $20 \times 6$ designs are model-discriminating and none of the $20 \times 7$ designs is model-discriminating. As a general guide, we recommend use of 20-run designs for purposes of model discrimination for either model space only when the number of factors $m \le 7$.

4. Constructing model-discriminating designs

Orthogonal designs have been widely used by both practitioners and researchers. However, it can be seen from the last section that model discrimination is such a strong requirement that model-discriminating orthogonal designs may not exist for a desired combination of number of factors, m, and sample size, n. In these situations, model-discriminating non-orthogonal designs may still exist. One approach to determining whether non-orthogonal model-discriminating designs exist is to construct optimal designs using the model discrimination criteria proposed in this article. To construct efficient model-discriminating designs, we considered two algorithms: the coordinate-exchange algorithm of Meyer and Nachtsheim (1995) and the CP-exchange algorithm of Li and Wu (1997). Both algorithms start from a randomly chosen design and then iteratively improve it through element-level exchanges. The CP algorithm of Li and Wu (1997) performs exchanges of elements within a column and thus can retain design balance. (A design is called balanced if each column has the same number of +'s and −'s.) In comparison, the coordinate-exchange algorithm of Meyer and Nachtsheim (1995) searches over a broader space of designs that may not necessarily be balanced. Consequently, the coordinate-exchange algorithm generally finds designs whose criterion values are equal to or better than those found by the CP algorithm. We constructed a class of optimal designs with small run sizes, using the criteria proposed in this article. As an example, we used the coordinate-exchange algorithm to construct 12-run AF-optimal and ENCP-optimal designs for the MEPIg model space (g = 1, 2). The optimal designs are provided in Appendix B (Tables B1 and B2). Table 7 summarizes the model discrimination criteria values of the AF-optimal designs. Compared to 12-run orthogonal designs (see Table 1), improvements have clearly been made.
For instance, when m = 5 and g = 1, the AF-optimal design has a better AF value (2.283) than those of the two 12 × 5 orthogonal designs (1.962). Table 7 also confirms that 12-run designs are not appropriate for m ≥ 5 and g = 2, and m = 7. Similar conclusions can also be made for the ENCP-optimal designs, whose results are summarized in Table 8. It can be seen that there are more ENCP-optimal designs than AF-optimal designs in the cases considered here. Interestingly, some ENCP-optimal designs are not model-discriminating designs. For instance, when (m, g) = (6, 1), the ENCP-optimal design has ENCP = 9.791, which is slightly larger than the ENCP value of 9.508 of the AF-optimal design. However, the latter design is model-discriminating, but the former design is not. In general, when both AF-optimal and ENCP-optimal designs exist, we recommend the use of the AF-optimal design. However, the ENCP-optimal designs can be considered in situations where the corresponding AF-optimal designs do not exist, such as 12-run designs for which (m, g) = (5, 2).

Table 6
Percentages of 20-run model-discriminating designs over two model spaces.

     Total number    MEPIg model space          PMSq model space
m    of designs      g = 1   g = 2   g = 3      q = 2   q = 3   q = 4
4    3               100.0   100.0   100.0      100.0   100.0   100.0
5    11              100.0   72.7    63.6       100.0   63.6    36.4
6    75              100.0   68.0    34.7       100.0   48.0    6.7
7    474             100.0   49.4    0          100.0   15.8    0
8    1603            100.0   20.3    0          100.0   0.1     0
9    2477            97.1    5.1     0          100.0   0       0

Table 7
Summary of the model-discriminating criteria values for 12-run AF-optimal designs.

m   g   AFmin   AF      ENCPmin   ENCP     EPDmin   EPD
4   1   2.079   2.379   8.000     10.844   0.125    0.157
4   2   1.988   2.241   8.000     9.708    0.125    0.211
5   1   1.674   2.283   5.333     10.044   0.083    0.146
5   2   –       –       –         –        –        –
6   1   1.674   2.226   5.333     9.508    0.083    0.143
6   2   –       –       –         –        –        –

Note: There are no MD designs for (m, g) = (5, 2), (6, 2) with respect to the AF criterion. Model space = main effects + 1 or 2 two-factor interactions.
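The coordinate-exchange idea described in Section 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `coordinate_exchange` and `log_det_main_effects` are hypothetical, and a simple log-determinant criterion for the intercept-plus-main-effects model stands in for the AF criterion, whose full definition appears earlier in the paper. Any function that scores a ±1 design matrix could be passed in its place.

```python
import numpy as np

def log_det_main_effects(design):
    """Stand-in criterion (assumption): log|X'X| for intercept + main effects."""
    n = design.shape[0]
    x = np.column_stack([np.ones(n), design])
    sign, logdet = np.linalg.slogdet(x.T @ x)
    return logdet if sign > 0 else -np.inf

def coordinate_exchange(n, m, criterion, seed=0, max_passes=50):
    """Start from a random two-level design and flip one coordinate at a time,
    keeping a flip only when it improves the criterion."""
    rng = np.random.default_rng(seed)
    design = rng.choice([-1.0, 1.0], size=(n, m))
    best = criterion(design)
    for _ in range(max_passes):
        improved = False
        for i in range(n):
            for j in range(m):
                design[i, j] *= -1.0        # tentative exchange
                value = criterion(design)
                if value > best + 1e-12:
                    best = value            # keep the improvement
                    improved = True
                else:
                    design[i, j] *= -1.0    # revert the exchange
        if not improved:                    # a full pass with no gain: stop
            break
    return design, best

design, best = coordinate_exchange(12, 4, log_det_main_effects, seed=1)
print(design.astype(int))
print("criterion value:", best)
```

Because every coordinate is free to change, the search is not restricted to balanced designs, which is the distinction from the CP algorithm drawn in the text. In practice one would rerun the search from many random starts and keep the best result.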


5. Efficacy of AF-optimal designs

In this section, we provide some insight into the relative performance of the AF-optimal designs. We do this in two ways. First, we compare criterion values for the AF-optimal designs with those of comparable orthogonal designs as well as with a set of baseline, randomly chosen balanced designs. Second, we conduct two simulation studies to assess the abilities of these designs to correctly discriminate using a standard model selection approach. In classical optimal design, when a single model is under consideration, comparisons of alternatives are often based on relative efficiency calculations. For model discrimination using the AF-criterion, the average efficiency of design d_1 relative to design d_2 is obtained using

\[
R_e(d_1, d_2) = \frac{1}{n_m (n_m - 1)} \sum_{i \ne j} \left[ \frac{|M_i^{(j)}(d_1)|}{|M_i^{(j)}(d_2)|} \right]^{1/p_i^{(j)}} .
\]
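As a small illustration of the averaging in the relative efficiency formula, the sketch below assumes that the determinants |M_i^(j)(d)| and the dimensions p_i^(j) have already been computed for every ordered pair of models (i, j) with i ≠ j. The function name `relative_efficiency` and its argument layout (one array entry per ordered pair) are assumptions made for this example.

```python
import numpy as np

def relative_efficiency(dets_d1, dets_d2, dims):
    """Mean over ordered model pairs of (|M(d1)| / |M(d2)|) ** (1 / p).

    Each argument holds one entry per ordered pair (i, j), i != j, so the
    mean equals the sum divided by n_m * (n_m - 1).
    """
    dets_d1 = np.asarray(dets_d1, dtype=float)
    dets_d2 = np.asarray(dets_d2, dtype=float)
    dims = np.asarray(dims, dtype=float)
    if np.any(dets_d2 <= 0):
        raise ValueError("reference design must have positive determinants")
    return float(np.mean((dets_d1 / dets_d2) ** (1.0 / dims)))

# Two ordered pairs, each with a 2 x 2 information matrix (illustrative numbers only).
print(relative_efficiency([1.0, 1.0], [4.0, 4.0], [2, 2]))
```

Taking the 1/p root puts pairs with different numbers of distinguishing terms on a common per-parameter scale before averaging.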

To get some idea as to the improvement we can attribute to the AF-optimal design relative to other designs, we provide in Table 9 the AF criterion values for the optimal design, for all of the model-discriminating orthogonal designs for seven factors in 16 runs, and for the minimum aberration design. We also provide the efficiencies of these designs, relative to the optimal design. AF criterion values for these designs range from 1.936 to 2.276. The efficiencies of these designs, relative to the optimal design, range from 53.6% to 80.2%. To provide a baseline, we also generated 5000 balanced designs at random for this problem and have provided the AF criterion value, 0.744, for these designs. Another measure of design efficacy is provided by the probability of identifying the correct model. This probability, of course, will depend on the design, the true model, the true model parameters, the distribution of the error term, and the model selection method employed by the analyst. In an attempt to gauge the performances of the designs on this dimension, we conducted two simulation studies. In the first study, we assess the performance of the 12 (non-baseline) designs in Table 9. As before, we employ the MEPI1 model space. For purposes of our simulation, each model in the MEPI1 space takes the form y = Xβ + ε, where all non-zero elements of β are taken to be 10 for main effects and 1 for the two-factor interaction, and ε ~ N(0, 1). The simulation consisted of 50,000 replicates. In each replicate, one of the 35 potential two-factor interactions is selected at random and its regression coefficient set to 1. Thus, 50,000 error vectors and β-vectors were generated, leading to as many y-vectors. For each response vector, we used the all-possible regressions procedure with minimum MSE criterion to select the "best"

Table 8
Summary of the model-discriminating criteria values for 12-run ENCP-optimal designs.

m   g   AFmin   AF      ENCPmin   ENCP     EPDmin   EPD
4   1   2.079   2.379   8.000     10.844   0.125    0.157
4   2   0.297   0.455   8.000     13.869   0.125    0.211
5   1   1.674   2.283   5.333     10.044   0.083    0.146
5   2   N.D.            2.623     8.098    0.067    0.204
6   1   N.D.            0         9.791    0        0.142
6   2   N.D.            0         6.392    0        0.193

Model space = main effects + 1 or 2 two-factor interactions.

Table 9
Evaluation of 16 × 7 MD designs in terms of AF values and probabilities of identifying the correct models (CMR) in a simulation study for the MEPI1 model space.

Design             AF      Relative efficiency (%)   CMR (%)
Baseline           0.744   –                         –
MA (design # 6)    N.D.    –                         32.4
32                 2.276   80.2                      69.8
43                 2.136   71.7                      62.8
45                 2.057   62.9                      61.8
49                 2.265   77.9                      68.3
50                 2.027   63.2                      60.3
51                 1.936   53.6                      57.3
52                 1.936   53.6                      58.2
53                 2.137   71.4                      63.7
54                 2.091   68.0                      62.5
55                 2.197   75.4                      67.2
Optimal            2.597   100.0                     81.3
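The association between the AF criterion and the simulated correct model rate in Table 9 can be checked directly from the tabulated values. The short sketch below copies the AF and CMR columns for the ten model-discriminating orthogonal designs and the optimal design, then computes an ordinary Pearson correlation. The choice of Pearson correlation is an assumption of this example, not the paper's analysis.

```python
import numpy as np

# Designs 32, 43, 45, 49, 50, 51, 52, 53, 54, 55 and the optimal design (Table 9).
af = np.array([2.276, 2.136, 2.057, 2.265, 2.027, 1.936,
               1.936, 2.137, 2.091, 2.197, 2.597])
cmr = np.array([69.8, 62.8, 61.8, 68.3, 60.3, 57.3,
                58.2, 63.7, 62.5, 67.2, 81.3])

r = np.corrcoef(af, cmr)[0, 1]          # Pearson correlation of AF with CMR
same_winner = int(np.argmax(af)) == int(np.argmax(cmr))

print(f"correlation between AF and CMR: {r:.3f}")
print("design with the largest AF also has the largest CMR:", same_winner)
```

The baseline and minimum aberration rows are left out because they lack either an AF value or a CMR value.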


[Figure 1: scatter plot of Correct Model Rate (CMR), vertical axis, against AFbar criterion value, horizontal axis.]

Fig. 1. Probability of identifying the correct model (CMR) vs. AF values for 16 × 7 model-discriminating designs over MEPI1.

Table 10
Simulation results for three 16 × 7 designs for models where any number of main effects with or without one 2-factor interaction may be active (β_i = 2, N = 5000).

Design                AF      CMR (%)
OA (MA design # 6)    N.D.    5.90
OA (design # 32)      2.276   19.60
AF-optimal            2.597   23.50

model over the specified model space. (For this study, this is equivalent to choosing the one-interaction model having the smallest p-value for the t-test of the interaction term; this is not the case in the second simulation study described below.) The rate at which the correct model is identified (CMR) is provided in the last column of Table 9. The 10 model-discriminating orthogonal designs identify the correct model with CMRs ranging from 57.3% to 69.8%. Note that the optimal model-discriminating design performs substantially better than any of the orthogonal designs, obtaining a CMR of 81.3%. Fig. 1 shows that the ranks of the simulated CMR values are quite consistent with those of the design relative efficiencies. In both the MEPIg and PMSq model spaces, we assume that the largest g or q value can be specified with a reasonable level of confidence. However, this may not be the case in many applications. Moreover, in the MEPIg model space, there is no guarantee that all main effects are present. In a second simulation study, we again considered 16 × 7 designs, but this time we enlarged the search space of possible models at the analysis stage. In this model space, any or all main effects may be present with or without one interaction term. We determine the CMRs for three of the designs employed in the first simulation study: (i) OA (minimum aberration, design #6 in Table 3). (ii) OA (design #32 in Table 3). (iii) AF-optimal design (see Table B3 in Appendix B). The simulation results are provided in Table 10 and are reassuring. Although the new model space contains many more models than the MEPI1 space does, the AF-optimal design resulting from the MEPI1 model space continues to outperform competing orthogonal designs. Its CMR is 23.5%, compared with 5.9% for the minimum aberration design and 19.6% for the best orthogonal design.

6. Discussion

In this article we propose new criteria for model discrimination, and we use these criteria to evaluate classes of two-level orthogonal designs with 12, 16, and 20 runs. The results demonstrate that, for given numbers of runs and factors, not all n × m designs are created equal; generally, only a few are suitable for model discrimination. We also explore the construction of optimal model-discriminating designs for small run sizes. Throughout the article, we focused on two model spaces: the MEPIg model space (Sun, 1993; Li and Nachtsheim, 2000) and the PMSq model space (Loeppky et al., 2007). However, the criteria are applicable to any other model space. We conclude the article by briefly discussing three issues:


Table 11
A comparison of AF-optimal designs for MEPI1 and MEPI2 model spaces.

Design           AF for MEPI1   AF for MEPI2
MEPI1-optimal    2.379          N.D.
MEPI2-optimal    2.346          2.241

[Figure 2: scatterplot matrix of design ranks (axes run from 0 to 80) with diagonal panels MEPI.AF, PMS.AF, MEPI.EC.IC, PMS.EC.IC, and Gen.Aberration; each off-diagonal panel is annotated with a rank correlation r.]

Fig. 2. Matrix plot of ranks of the 75 20 × 6 orthogonal designs, for the AF discrimination criteria and EC/IC model robustness criteria, and generalized aberration.

choice of model space, choice of criteria, and the relation between minimum aberration and model discrimination criteria. First, with respect to model space, in this article we constructed designs over the MEPIg or PMSq model space for fixed values of g and q. The question naturally arises: why not also simultaneously consider the chosen model space for smaller values of g and q? For example, if one uses MEPI2, why not also consider MEPI1? This is easily handled by creating a new model space as a union of the model spaces for the range of g or q considered. One could similarly combine the MEPIg and PMSq model spaces. There is, of course, an increased computational burden as the cardinality of the model spaces rises. Fortunately, for the orthogonal designs considered in this article, we find that the orthogonal designs that are identified as best for MEPIg or PMSq are generally also best for smaller values of g and q, respectively. For instance, Table 5 shows that the best 16 × m orthogonal designs for AF over MEPIg and PMSq are identical, where m < 9, g = 1, 2, and q = 2, 3, 4. For optimal designs, the best design for a fixed g or q is frequently highly efficient for model spaces based on smaller g or q. For the 16 × 7 example, we constructed a 2 × 2 table of criterion values over the MEPI1 and MEPI2 model spaces for the AF-optimal designs for both cases. As shown in Table 11, the optimal design for the MEPI2 model space is nearly optimal for the MEPI1 model space. To be specific, AF for the optimal design for MEPI1 is 2.379, whereas AF for the optimal design for MEPI2 is 2.346. Of course, the reverse is not generally true. In this case, the optimal design for the MEPI1 model space fails to be model-discriminating for the MEPI2 model space. These results are consistent with prior findings in the literature on the robustness of optimal


designs for models that are nested within the assumed model (see, e.g., Cook and Nachtsheim, 1982). Thus, for computational savings, we have focused here on single, largest-possible values of g or q. Second, with regard to model discrimination criteria, we generally favor use of AF as a primary criterion, for two reasons: (1) along with the As criterion, it is the most conservative of the criteria considered and (2) as noted by Atkinson and Fedorov (1975b), the non-centrality parameter is a key consideration when the data analyst will be conducting tests for the presence or absence of the various factors. Other criteria, such as BEPD, EPD, or Hellinger distance, are less conservative. However, we recommend their use for ranking the discrimination capabilities of designs when they are not model-discriminating by the AF criterion. We conclude this article by briefly considering the relation between model estimation criteria (e.g., EC and IC), model discrimination criteria (e.g., AF), and the generalized aberration criterion. To do so, we rank the 75 20 × 6 orthogonal designs according to different criteria and then study the correlations among those ranks. For model discrimination, the AF criterion was used in connection with the MEPI1 and PMS2 model spaces. For model estimation, both EC and IC are considered. Because all designs have EC = 100% for g = 1, 2, 3 and q = 2, 3, 4, the designs are ranked by sequentially maximizing EC4 and IC4 over the MEPIg model space, and EC5 and IC5 over the PMSq model space. It can be seen from Fig. 2 that the correlations among the ranks with respect to different criteria are generally high. What seems striking from the bottom row of the matrix are the high correlations between the ranks of the designs for the generalized aberration criterion and the AF discrimination criteria (r = 0.981 for "Gen.Aberration" vs "MEPI.AF" and r = 0.991 for "Gen.Aberration" vs "PMS.AF").
This suggests a close relationship between the AF discrimination criterion and generalized aberration. This may not be surprising, as the aberration criterion tends to minimize the confounding between main effects and lower-order interactions, and both the MEPIg and PMSq model spaces focus on the same types of effects (i.e., main effects and two-factor interactions). The discrimination criteria, however, possess some key advantages relative to the generalized aberration criterion: (1) they can be applied to any design, not just orthogonal designs, (2) they can be used as an optimization criterion for algorithmic design construction, and (3) they are statistical criteria, whereas aberration is a combinatoric index. The results also indicate that model robustness criteria in connection with the MEPI model space may be more closely related to generalized aberration ("Gen.Aberration" vs "MEPI.EC.IC", r = 0.971) than model robustness for the PMS model space ("Gen.Aberration" vs "PMS.EC.IC", r = 0.881). This may not be surprising in light of the hierarchical nature of both the generalized aberration criterion and model robustness for the MEPI model space. Both criteria tend to value the estimation of all main effects before the estimation of interactions.

Appendix A. Proofs of Theorems 1–3

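Proof of Theorem 1.

\[
E_\beta\{\beta_i^{(j)\prime} M_i^{(j)} \beta_i^{(j)}\}
= E_\beta\{\mathrm{Trace}[M_i^{(j)} \beta_i^{(j)} \beta_i^{(j)\prime}]\}
= \mathrm{Trace}[M_i^{(j)} E_\beta\{\beta_i^{(j)} \beta_i^{(j)\prime}\}]
= \mathrm{Trace}[M_i^{(j)} \sigma_\beta^2 I]
= \sigma_\beta^2 \, \mathrm{Trace}[M_i^{(j)}]. \qquad \square
\]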
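The identity in the proof of Theorem 1 can be checked numerically. The sketch below is only a sanity check under stated assumptions: an arbitrary symmetric 3 × 3 matrix stands in for M_i^(j), β is drawn from a normal distribution with covariance σ_β² I (the proof needs only that second-moment condition), and the sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_beta = 1.5

a = rng.standard_normal((5, 3))
m = a.T @ a                       # arbitrary symmetric matrix standing in for M_i^(j)

# Each row is one draw of beta with E{beta beta'} = sigma_beta^2 * I.
beta = sigma_beta * rng.standard_normal((200_000, 3))
quad_forms = np.einsum("np,pq,nq->n", beta, m, beta)   # beta' M beta per draw

monte_carlo = quad_forms.mean()
closed_form = sigma_beta**2 * np.trace(m)

print("Monte Carlo mean of beta' M beta:", monte_carlo)
print("sigma_beta^2 * Trace(M):         ", closed_form)
```

The two printed numbers should agree up to simulation error, which shrinks as the number of draws grows.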

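Proof of Theorem 2.

\[
\begin{aligned}
E_{\beta,\varepsilon}(PD_{ij}) &= E_\beta E_\varepsilon\{PD_{ij} \mid \beta\} \\
&= E_\beta[E_\varepsilon\{(H_i y - H_j y)'(H_i y - H_j y) \mid \beta\}] \\
&= E_\beta[E_\varepsilon\{y'(H_i - H_j)(H_i - H_j)y \mid \beta\}] \\
&= E_\beta[E_\varepsilon\{(E\{y\} + \varepsilon)'(H_i - H_j)(H_i - H_j)(E\{y\} + \varepsilon) \mid \beta\}] \\
&= E_\beta[E_\varepsilon\{E\{y\}' D_{ij} E\{y\} + 2\varepsilon' D_{ij} E\{y\} + \varepsilon' D_{ij} \varepsilon \mid \beta\}] \\
&= E_\beta[E\{y\}' D_{ij} E\{y\} + E_\varepsilon\{\varepsilon' D_{ij} \varepsilon\} \mid \beta].
\end{aligned}
\]

But

\[
E_\varepsilon\{\varepsilon' D_{ij} \varepsilon\} = E_\varepsilon \mathrm{Trace}\{\varepsilon' D_{ij} \varepsilon\} = E_\varepsilon \mathrm{Trace}\{D_{ij} \varepsilon\varepsilon'\} = \mathrm{Trace}\{D_{ij} E[\varepsilon\varepsilon']\} = \sigma_\varepsilon^2 \, \mathrm{Trace}(D_{ij}).
\]

Thus:

\[
\begin{aligned}
BEPD_{ij} &= E_\beta[\beta' X_i' (X_i (X_i' X_i)^{-1} X_i' - X_j (X_j' X_j)^{-1} X_j')(X_i (X_i' X_i)^{-1} X_i' - X_j (X_j' X_j)^{-1} X_j') X_i \beta] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= E_\beta[\beta' (X_i' H_i - X_i' H_j)(H_i X_i - H_j X_i) \beta] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= E_\beta[\beta' (X_i' - X_i' H_j)(X_i - H_j X_i) \beta] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= E_\beta[\beta' X_i' (I - H_j)(I - H_j) X_i \beta] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= E_\beta[\beta' X_i' (I - H_j) X_i \beta] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= E_\beta[\mathrm{Trace}\{X_i' (I - H_j) X_i \beta\beta'\}] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= \mathrm{Trace}[X_i' (I - H_j) X_i E\{\beta\beta'\}] + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\} \\
&= \sigma_\beta^2 \, \mathrm{Trace}\{X_i' (I - H_j) X_i\} + \sigma_\varepsilon^2 \, \mathrm{Trace}\{D_{ij}\}.
\end{aligned}
\]

The last equality holds since E_β{ββ'} = σ_β² I. Suppose now that column k in X_i is also a column of X_j (i.e., suppose that f_i and f_j have a term in common). Then the kth diagonal element of X_i'(I − H_j)X_i is zero, and the theorem follows. □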


Proof of Theorem 3.

\[
EPD_{ij} = 0 \Rightarrow [(H_i - H_j)^2] = 0 \Rightarrow H_i = H_j \Rightarrow ENCP_{ij} = \mathrm{Trace}[X_i'(I - H_j)X_i] = \mathrm{Trace}[X_i'(I - H_i)X_i] = 0.
\]

To show that ENCP_ij = 0 ⇒ EPD_ij = 0, assume that EPD_ij > 0. Then:

\[
\mathrm{Trace}(D_{ij}) > 0 \Rightarrow \mathrm{Trace}[(H_i - H_j)^2] > 0 \Rightarrow H_i \ne H_j .
\]

Then by the uniqueness of projection matrices (Harville, 1997, p. 166), the column space of X_i is not equal to the column space of X_j, and it follows that

\[
(I - H_j)X_i \ne 0 \Rightarrow \mathrm{Trace}[X_i'(I - H_j)^2 X_i] = \mathrm{Trace}[X_i'(I - H_j)X_i] = ENCP_{ij} > 0.
\]

The assertions that BEPD_ij = 0 ⇔ EPD_ij = 0 and BEPD_ij = 0 ⇔ ENCP_ij = 0 follow immediately from the definition of BEPD_ij. Finally, from above,

\[
EPD_{ij} = 0 \Rightarrow H_i = H_j \Rightarrow X_i'(I - H_j)^2 X_i = X_i'(I - H_i)^2 X_i = 0_{p_i \times p_i} \Rightarrow M_i^{(j)} = 0_{p_i^{(j)} \times p_i^{(j)}} \Rightarrow |M_i^{(j)}| = 0.
\]

Counterexamples to the reverse assertion (|M_i^{(j)}| = 0 ⇒ EPD_ij = ENCP_ij = BEPD_ij = 0) are provided in Section 3. □
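The equivalence in Theorem 3 can be illustrated on a small constructed example. The sketch below uses only the two trace expressions that appear in the proofs, Trace[X_i'(I − H_j)X_i] and Trace(D_ij) with D_ij = (H_i − H_j)², and ignores any variance constants or scaling used elsewhere in the paper. The 8-run design, the added factor D = −AB, and the function names are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def hat(x):
    """Orthogonal projector onto the column space of x."""
    return x @ np.linalg.pinv(x)

def encp_trace(x_i, x_j):
    """Trace[X_i'(I - H_j)X_i], the ENCP expression used in the proof."""
    n = x_i.shape[0]
    return float(np.trace(x_i.T @ (np.eye(n) - hat(x_j)) @ x_i))

def d_trace(x_i, x_j):
    """Trace(D_ij) with D_ij = (H_i - H_j)^2."""
    diff = hat(x_i) - hat(x_j)
    return float(np.trace(diff @ diff))

# 2^3 full factorial in A, B, C plus a fourth factor fully aliased with AB.
runs = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
a, b, c = runs.T
d = -a * b
ones = np.ones(8)

x_ab = np.column_stack([ones, a, b, c, a * b])   # main effects of A, B, C + AB
x_d = np.column_stack([ones, a, b, c, d])        # same column space as x_ab
x_ac = np.column_stack([ones, a, b, c, a * c])   # a different column space

print("aliased pair:        ", encp_trace(x_ab, x_d), d_trace(x_ab, x_d))
print("distinguishable pair:", encp_trace(x_ab, x_ac), d_trace(x_ab, x_ac))
```

For the aliased pair the two model matrices differ but span the same space, so both traces vanish up to rounding. For the second pair the traces are 8 and 2, so the two models can be told apart.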

Appendix B. Some useful model-discriminating designs

Table B1
Some 12-run AF-optimal designs over MEPIg.

m = 4 (g = 1)

m = 4 (g = 2)

m = 5 (g = 1)

1 1 1 1 0 0 1 0 0 1 0 0

1 1 1 1 1 1 0 1 0 0 0 0

0 0 0 0 1 1 1 1 0 0 1 1

1 1 1 0 1 1 0 0 0 0 0 1

1 1 0 1 0 0 1 0 0 0 1 1

1 0 0 0 0 1 1 0 1 1 1 0

0 0 0 0 1 1 1 1 1 1 0 0

0 1 1 0 0 1 0 0 1 0 1 0

0 0 1 1 0 0 1 1 1 0 0 1

0 0 1 0 1 0 0 0 1 1 1 1

1 0 1 0 0 0 1 0 0 1 1 1

m = 6 (g = 1)

1 1 0 0 0 0 0 1 1 1 0 1

1 1 1 0 0 1 0 0 1 0 0 1

1 0 0 1 1 1 0 0 1 0 1 0

0 0 0 0 0 1 1 1 0 1 1 1

1 0 1 0 1 0 1 1 0 0 1 0

1 1 0 0 1 1 0 1 1 0 0 0

1 1 0 0 0 1 1 1 0 0 0 1

0 0 0 1 0 1 0 1 1 1 0 1

1 0 1 0 1 1 0 1 0 0 1 1

Table B2
Some 12-run ENCP-optimal designs over MEPIg.

m = 4 (g = 2)

m = 5 (g = 2)

1 0 1 0 1 0 0 0 1 1 1 0

1 0 1 1 0 0 0 0 0 0 1 1

0 0 1 0 1 1 0 0 1 1 0 1

1 1 1 0 1 1 1 0 0 0 0 0

1 0 1 0 0 1 1 1 1 0 0 0

1 1 0 0 0 0 1 1 0 0 1 1

0 1 0 1 0 1 0 1 1 1 0 1

1 0 1 0 0 1 1 0 0 1 0 1

m = 6 (g = 1)

0 1 1 0 0 0 1 0 1 1 1 1

1 0 0 0 1 0 1 1 0 0 1 1

0 1 1 1 0 0 0 1 1 0 1 0

0 0 1 0 0 0 1 1 1 1 0 1

1 1 0 0 0 1 0 0 1 0 1 1

m = 6 (g = 2)

0 1 1 0 1 0 0 1 0 0 1 1

1 1 0 1 1 0 0 1 0 1 0 0

1 0 1 0 0 1 1 1 1 1 0 0

0 0 0 1 0 1 1 1 1 0 1 0

0 0 1 0 1 0 0 1 0 1 0 1

0 1 1 0 0 0 0 0 1 0 1 1

0 0 0 1 1 1 1 0 0 1 1 0


Table B3
A 16 × 7 AF-optimal design over MEPI1 (rows are runs, columns are factors).

0 0 0 0 1 0 0
0 1 1 0 1 0 1
1 0 0 1 0 1 0
1 0 0 0 0 1 1
0 1 1 1 0 1 0
1 1 0 1 1 0 1
1 1 1 0 1 1 0
0 0 1 1 1 1 1
1 0 1 0 1 0 0
1 1 0 0 0 0 0
0 1 1 1 1 0 0
0 1 0 1 0 1 1
1 0 0 0 1 0 1
1 1 1 1 0 1 1
0 0 1 0 0 1 0
0 0 0 1 0 0 1
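The first simulation study of Section 5 can be sketched with the Table B3 design. The code below is an illustrative reconstruction under stated assumptions rather than the authors' program: the intercept coefficient is set to the same value as the main effects (the text does not specify it, and it does not affect which interaction is selected), the candidate set is every pair of the seven factors as enumerated by `itertools.combinations`, the number of replicates is reduced, and all function names are hypothetical. Since every candidate model has the same number of terms, choosing the smallest residual sum of squares matches the minimum-MSE rule.

```python
import itertools
import numpy as np

# One list per factor (16 run settings each), coded 0/1 as in Table B3.
B3_FACTORS = [
    [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1],
    [1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1],
]
design = 2.0 * np.array(B3_FACTORS).T - 1.0   # 16 x 7 matrix in -1/+1 coding

def mepi1_models(d):
    """Model matrices: intercept + all main effects + one two-factor interaction."""
    n, m = d.shape
    base = np.column_stack([np.ones(n), d])
    return [np.column_stack([base, d[:, a] * d[:, b]])
            for a, b in itertools.combinations(range(m), 2)]

def correct_model_rate(d, n_rep=2000, main=10.0, inter=1.0, seed=0):
    """Fraction of replicates in which the true one-interaction model is selected."""
    rng = np.random.default_rng(seed)
    models = mepi1_models(d)
    hits = 0
    for _ in range(n_rep):
        true_idx = int(rng.integers(len(models)))     # pick the active interaction
        x_true = models[true_idx]
        beta = np.full(x_true.shape[1], main)
        beta[-1] = inter                              # interaction coefficient
        y = x_true @ beta + rng.standard_normal(d.shape[0])
        sse = []
        for x in models:                              # fit every candidate model
            coef = np.linalg.lstsq(x, y, rcond=None)[0]
            resid = y - x @ coef
            sse.append(float(resid @ resid))
        if int(np.argmin(sse)) == true_idx:
            hits += 1
    return hits / n_rep

print("estimated CMR:", correct_model_rate(design))
```

The estimate will vary with the seed and the number of replicates, so it should not be expected to reproduce the tabulated CMR exactly.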

References

Atkinson, A.C., Donev, A.N., 1992. Optimum Experimental Designs. Clarendon Press, Oxford.
Atkinson, A.C., Fedorov, V.V., 1975a. The design of experiments for discriminating between two rival models. Biometrika 62, 57–70.
Atkinson, A.C., Fedorov, V.V., 1975b. Optimal design: experiments for discriminating between several models. Biometrika 62, 289–303.
Bingham, D.R., Li, W., 2002. A class of optimal robust parameter designs. Journal of Quality Technology 34, 244–259.
Bingham, D.R., Chipman, H.A., 2007. Incorporating prior information in optimal design for model selection. Technometrics 49, 155–163.
Box, G.E.P., Bisgaard, S., 1993. What can you find out from 12 experimental runs. Quality Engineering 5, 663–668.
Cheng, C.-S., 1995. Some projection properties of orthogonal designs. The Annals of Statistics 23, 1223–1233.
Cook, R.D., Nachtsheim, C.J., 1982. Model robust, linear-optimal designs. Technometrics 24, 49–54.
Fedorov, V.V., Malyutov, M.B., 1972. Optimal designs in regression problems. Mathematische Operationsforschung und Statistik 3, 281–308.
Fedorov, V.V., Uspensky, A.B., 1975. Numerical Aspects of the Method of Least Squares. Moscow State University, Laboratory of Statistical Methods (in Russian).
Harville, D.A., 1997. Matrix Algebra from a Statistician's Perspective. Springer, New York.
Jones, B., Li, W., Nachtsheim, C.J., Ye, K., 2007. Model discrimination—another perspective on model-robust designs. Journal of Statistical Planning and Inference 137, 1577–1583.
Läuter, E., 1974. Experimental design in a class of models. Mathematische Operationsforschung und Statistik 5, 379–396.
Li, W., 2006. Screening designs for model selection. In: Dean, A.M., Lewis, S.M. (Eds.), Screening: Methods for Experimentation in Industry, Drug Discovery and Genetics. Springer, Berlin (Chapter 10).
Li, W., Lin, D.K.J., Ye, K., 2003. Optimal foldover plans for non-regular designs. Technometrics 45, 347–351.
Li, W., Nachtsheim, C.J., 2000. Model-robust factorial designs. Technometrics 42, 379–396.
Li, W., Wu, C.F.J., 1997. Columnwise–pairwise algorithms with applications to the construction of supersaturated designs. Technometrics 39, 171–179.
Loeppky, J.L., Sitter, R.R., Tang, B., 2007. Non-regular designs with desirable projection properties. Technometrics 49, 454–467.
Lin, D.K.J., Draper, N.R., 1992. Projection properties of Plackett and Burman designs. Technometrics 34, 423–428.
Meyer, R.K., Nachtsheim, C.J., 1995. The coordinate-exchange algorithm for constructing exact optimal designs. Technometrics 37, 60–69.
Meyer, R.D., Steinberg, D., Box, G., 1996. Follow-up designs to resolve confounding in multifactor experiments. Technometrics 38, 303–313.
Ponce De Leon, A.C., Atkinson, A.C., 1991. Optimum experimental design for discriminating between two rival models in the presence of prior information. Biometrika 78, 601–608.
Srivastava, J.N., 1975. Designs for searching non-negligible effects. In: Srivastava, J.N. (Ed.), A Survey of Statistical Designs and Linear Models. North-Holland, Amsterdam, pp. 507–519.
Sun, D.X., 1993. Estimation capacity and related topics in experimental designs. Unpublished Ph.D. Dissertation, Department of Statistics and Actuarial Science, University of Waterloo.
Sun, D.X., Li, W., Ye, K.Q., 2008. An algorithm for sequentially constructing non-isomorphic orthogonal designs and its applications. Statistics and Applications (Special Issue in Honour of Professor Akole Dey) 6, 141–156.
Tsai, P.W., Gilmour, S.G., Mead, R., 2000. Projective three-level main-effects designs robust to model uncertainty. Biometrika 87, 467–475.
Tsai, P.W., Gilmour, S.G., Mead, R., 2007. Three-level main-effects designs exploiting prior information about model uncertainty. Journal of Statistical Planning and Inference 137, 619–627.