Journal of Statistical Planning and Inference 144 (2014) 63–68
Maximin distance optimal designs for computer experiments with time-varying inputs and outputs

Max D. Morris

Department of Statistics, Iowa State University, United States
Department of Industrial and Manufacturing Systems Engineering, Iowa State University, United States

Abstract
Available online 27 September 2012
Computer models of dynamic systems produce outputs that are functions of time; models that solve systems of differential equations often have this character. Time-indexed inputs, such as the functions that describe time-varying boundary conditions, are also common with such models. Morris (2012) described a generalization of the Gaussian process often used to produce "meta-models" when inputs are finite-dimensional vectors, that can be used in the functional input setting, and showed how the maximin distance design optimality criterion (Johnson et al., 1990) can also be extended to this case. This paper describes an upper bound on the maximin distance criterion for functional inputs. A class of designs that are optimal under certain conditions is also presented; while these designs are of limited practical value, they show that the derived bound cannot be improved in the general case. © 2012 Elsevier B.V. All rights reserved.
Keywords: Computer experiment; Dynamic model; Gaussian stochastic process; Maximin distance design; Meta-model; Surrogate
1. Introduction and notation

The statistical treatment of computer experiments, in which a computer model of a physical system is the object of experimental focus, has received substantial attention in the statistical literature over the last two decades. The computer model is often an implementation of a complicated deterministic function for which evaluation requires the specification of quantities that complete the definition of a problem to be solved. Because the function is complex, it is often difficult or impossible to understand analytically, and so it is implemented as a computer model $y = f(x)$, where x represents the variables for which values must be specified, or inputs; f is the computer model; and y represents the variables that constitute the problem solution, or outputs. f is generally also complex, and its evaluation may require substantial computational effort. A common final or intermediate goal of a computer experiment is the development of a surrogate or meta-model: an approximation of the computer model, based on a finite number of executions of f, that can be used to quickly predict what the outputs would be at new values of the inputs, along with useful quantification of the uncertainty associated with these predictions. A few of the broadly cited papers on this topic are Sacks et al. (1989), Currin et al. (1991), Kennedy and O'Hagan (2001), and Higdon et al. (2004). In each of these papers, and many others, a Gaussian stochastic process (GaSP) is used as the statistical basis for formulating a surrogate for predicting a scalar- or vector-valued y from a scalar- or vector-valued x.

In many applications, the computer models of greatest interest are written to mimic dynamic physical processes; the literature describing methodology based on GaSP meta-models includes examples involving DNA population dynamics
Correspondence address: Department of Statistics, Iowa State University, Ames, IA 50014, United States. Tel.: +1 515 291 2775; fax: +1 515 294 4040. E-mail address: [email protected]
0378-3758 © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jspi.2012.09.005
(e.g. Henderson et al., 2010), the design of helicopter rotor blades (Booker et al., 1999), and the global climate (e.g. Tebaldi et al., 2005). These computer models are often based on systems of ordinary or partial differential equations that require specification of time-varying quantities as inputs, e.g. boundary conditions or so-called "forcing functions", and their solutions are outputs that are also functions of time (and sometimes also of space). Smooth functional outputs can often be reduced to vectors of low dimension for practical analysis through, for example, principal components representation, e.g. Higdon et al. (2008).

Morris (2012) introduced a generalization of the GaSP meta-model for the case in which the inputs x are expressed as a function of time t, along with a corresponding generalization of the maximin distance design criterion (Johnson et al., 1990), along the following lines. An experimental design is a collection of n functional inputs, $D = \{x_1(t), x_2(t), \ldots, x_n(t)\}$, defined for $t \in [0,T]$. The GaSP treats the responses at a particular time s that result from different input functions $x_i$ and $x_j$ as correlated, with $\mathrm{Corr}(y_i(s), y_j(s)) = \exp(-\theta d_{ij}^s)$, where $d_{ij}^s$ is called the correlation distance between $x_i$ and $x_j$ over $[0,s]$, but treats responses at any two distinct times as independent, whether they are generated by the same input function or not. One way to look at this is to say that the correlation distance associated with any two responses at different times is infinite. This modeling assumption is admittedly non-intuitive, since functional output that is continuous in time suggests dependence across time, but incorporating such dependence is not straightforward in this case.
For models that have vector-valued inputs and functional outputs, temporal correlations can be incorporated in a simple way through the separable or product correlation structure, with one factor reflecting temporal correlation over the time interval $(s_1, s_2)$, and the remaining factors expressing correlation associated with the inputs used in modeling output at either $s_1$ or $s_2$. However, in our context $\mathrm{Corr}(y_i(s_1), y_j(s_1)) \ne \mathrm{Corr}(y_i(s_2), y_j(s_2))$, preventing the use of this approach. As a practical matter, the assumption of temporal independence may be less critical than it initially seems. For example, where input-time product correlations are appropriate and observed data include outputs at each of a collection of times S for selected inputs, predictive interpolation for any time in S depends only on data observed at that time, i.e. data at other times are not additionally informative. (See, for example, Drignei and Morris, 2006.)

Generalizing a commonly used product correlation model described by Sacks et al. (1989) for vector-valued inputs, the correlation distance between inputs $x_i$ and $x_j$ through time s is defined as
$$d_{ij}^s = \int_0^s w_s(s-u)\,|x_i(u) - x_j(u)|^{\alpha}\,du$$
for a specified $\alpha \in (0,2]$ and $w_s$ defined over $[0,s]$. A popular choice is $\alpha = 2$, the so-called "Gaussian" correlation form, which is particularly appropriate for modeling outputs that are smooth functions of inputs, a condition that is generally true for models that implement systems of partial differential equations. The positive-valued weight functions $w_s(\cdot)$, one defined for each time for which output is to be predicted, generalize the individual weights associated with each dimension of the usual vector-input model. Effective specification of $w_s$ requires at least some understanding of the computer model under study.
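As a concrete illustration (not part of the original development), the correlation distance integral can be approximated by simple quadrature. A minimal Python sketch, using the weight function of Example 1 below, $w_s(s-u) = 2u/s^2$; the two constant inputs are illustrative only:

```python
import numpy as np

def trap(y, x):
    """Trapezoid rule (written out to avoid NumPy version differences)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def correlation_distance(xi, xj, s, w, alpha=2.0, ngrid=2001):
    """Approximate d_ij^s = int_0^s w_s(s-u) |x_i(u) - x_j(u)|^alpha du."""
    u = np.linspace(0.0, s, ngrid)
    return trap(w(s, u) * np.abs(xi(u) - xj(u)) ** alpha, u)

# Weight function from Example 1 (Section 3.1): w_s(s-u) = 2u/s^2,
# which satisfies the normalization (2) below.
w = lambda s, u: 2.0 * u / s**2
x_lo = lambda u: np.zeros_like(u)  # illustrative input held at its minimum
x_hi = lambda u: np.ones_like(u)   # illustrative input held at its maximum
d = correlation_distance(x_lo, x_hi, s=1.0, w=w, alpha=2.0)
# |x_lo - x_hi| = 1 everywhere, so d equals the weight's integral, 1.
```

Because the integrand here is linear in u, the trapezoid rule is exact and `d` comes out to 1 up to rounding.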
For example, for systems in which outputs are more dependent on recent inputs than those in the more distant past, reasonable weight functions generally take larger values for small arguments (recent times) and smaller values for larger arguments; the example of Section 3.1 features a very simple weight function of this type. In contrast, for systems that are sensitive to early inputs but become less influenced by inputs as time progresses, larger values of ws may be associated with larger values of its argument (more distant times). Suppose we want to predict outputs over a set of times S. The maximin (Mm) distance criterion for D over output times in S, under a model of independence between outputs at any two distinct times, is
$$\phi = \min_{1 \le i < j \le n,\; s \in S} d_{ij}^{s} \qquad (1)$$
and a Mm distance design for a given value of n is any D for which $\phi$ is maximized. Morris (2012) constructed a Mm distance design numerically for an experiment involving a dynamic model of the reaction of stem cell "kinetics" to exposure to ionizing radiation. In Section 2 of this paper, we derive a general upper bound for $\phi$ for even values of n. In Section 3, we present a class of designs for $n \equiv 0 \pmod 4$ that achieve this bound. The designs in this class are of limited practical value, but serve to show that the bound of Section 2 cannot be improved in the general case.
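The criterion (1) is straightforward to evaluate numerically for any candidate design. A Python sketch, again using the Example 1 weight function; the three step inputs and all names are illustrative, not from the paper:

```python
import itertools
import numpy as np

def trap(y, x):
    """Trapezoid rule over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def phi(design, S, w, alpha=2.0, ngrid=2001):
    """Mm criterion (1): the minimum correlation distance over all
    pairs of inputs in the design and all prediction times s in S."""
    best = float("inf")
    for s in S:
        u = np.linspace(0.0, s, ngrid)
        wu = w(s, u)
        for xi, xj in itertools.combinations(design, 2):
            best = min(best, trap(wu * np.abs(xi(u) - xj(u)) ** alpha, u))
    return best

# Example 1 weight function; illustrative 3-function design.
w = lambda s, u: 2.0 * u / s**2
design = [lambda u: (u < 0.5).astype(float),
          lambda u: (u >= 0.5).astype(float),
          lambda u: np.full_like(u, 0.5)]
val = phi(design, S=[0.5, 1.0], w=w, alpha=1.0)
# The pairs involving the constant 0.5 input differ by 0.5 everywhere,
# so the criterion evaluates to 0.5.
```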
2. An upper bound for $\phi$

Suppose for simplicity that the problem is scaled so that T = 1, and that each input function is constrained so that $0 \le x \le 1$, the latter being required so that distance is bounded. Further, require that for each $s \in S$
$$\int_0^s w_s(s-u)\,du = 1 \qquad (2)$$
so that distances $d_{ij}^s$ are comparable across all $s \in S$.

Result. Restrict n to be an even integer. Then an upper bound on $\phi$ (as defined in Eq. (1)) is $\frac{1}{2}\,n/(n-1)$.

Proof. Define the total distance between the elements of a design at any time s to be
$$c^s = \sum_{i<j} d_{ij}^s.$$
At any particular time $u \in [0,s]$ and for even n, the largest value $\sum_{i<j} |x_i(u) - x_j(u)|^{\alpha}$ can take for any $\alpha \in (0,2]$ is $n^2/4$, which corresponds to designs for which
$$x_i(u) = 0 \text{ for } n/2 \text{ elements } i \in \{1,2,\ldots,n\}, \qquad x_i(u) = 1 \text{ for } n/2 \text{ elements } i \in \{1,2,\ldots,n\}.$$
As a result, because each $w_s$ is restricted by (2), $c = \max_{s \in S} c^s$ can be no larger than $n^2/4$; this is achieved by any design for which the above condition is met at every time $u \in [0,1]$ except on a set of integration measure zero. It follows that $\phi$ can be no larger than the average correlation distance between two input functions, $(n^2/4)/\binom{n}{2} = \frac{1}{2}\,n/(n-1)$, and that this can only be achieved for designs for which all $d_{ij}^s$ are equal (to $\frac{1}{2}\,n/(n-1)$) at each $s \in S$. □

3. Optimal designs

First, consider the case in which there is only one time at which model output is to be predicted, $S = \{s\}$, and restrict attention to the case of $n \equiv 0 \pmod 4$. Let Z be the $n \times (n-1)$ design matrix for any balanced, orthogonal, main-effects-saturated, 2-level design, with coding levels 0 and 1 in each column, e.g. for n = 4:
$$Z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}.$$
(A historic reference for experimental designs of this form is Plackett and Burman, 1946.) Let $0 < t_1 < t_2 < \cdots < t_{n-2} < s$ be the set of values that divides the integral of $w_s$ evenly over $[0,s]$; that is,
$$\int_0^{t_1} w_s(s-u)\,du = \int_{t_1}^{t_2} w_s(s-u)\,du = \cdots = \int_{t_{n-2}}^{s} w_s(s-u)\,du = \frac{1}{n-1}.$$
Finally, define an n-function design D to be the set of functions $x_i$ over $(0,s]$ for which
$$x_i(t) = Z_{i,1} \text{ for } t \in (0,t_1], \qquad x_i(t) = Z_{i,j} \text{ for } t \in (t_{j-1},t_j],\; j = 2,3,\ldots,n-2, \qquad x_i(t) = Z_{i,n-1} \text{ for } t \in (t_{n-2},s].$$
(The values taken by $x_i(t)$ for $t > s$ are immaterial.) Because each $d_{ij}^s = \frac{1}{2}\,n/(n-1)$ (the upper bound for $\phi$), this design is Mm-optimal.

Now let $S = \{s_1, s_2, \ldots, s_m\}$. For given n, determine an optimal design for prediction at $s_1$ as above (i.e. as determined by Z applied within subintervals delineated by appropriate values now denoted by $t_i^1$, $i = 1,2,\ldots,n-2$). Define $s_1 < t_1^2 < t_2^2 < \cdots < t_{n-2}^2 < s_2$ such that
$$\int_0^{t_1^1} w_{s_2}(s_2-u)\,du + \int_{s_1}^{t_1^2} w_{s_2}(s_2-u)\,du = \int_{t_1^1}^{t_2^1} w_{s_2}(s_2-u)\,du + \int_{t_1^2}^{t_2^2} w_{s_2}(s_2-u)\,du = \cdots = \int_{t_{n-2}^1}^{s_1} w_{s_2}(s_2-u)\,du + \int_{t_{n-2}^2}^{s_2} w_{s_2}(s_2-u)\,du = \frac{1}{n-1}. \qquad (3)$$
Continue the definition of input functions in D through $t \in (s_1, s_2]$ as
$$x_i(t) = Z_{i,1} \text{ for } t \in (s_1, t_1^2], \qquad x_i(t) = Z_{i,j} \text{ for } t \in (t_{j-1}^2, t_j^2],\; j = 2,3,\ldots,n-2, \qquad x_i(t) = Z_{i,n-1} \text{ for } t \in (t_{n-2}^2, s_2].$$
That is, over $(s_1,s_2]$, repeat the pattern with which the input functions evaluate to 0 or 1 in $(0,s_1]$, but over subintervals of different lengths so as to satisfy requirement (3). Again, the total correlation distance among all pairs of input functions over $(0,s_2]$ is $c^{s_2} = (n/2)^2$, the correlation distance between any pair of input functions over this interval is the same, $d_{ij}^{s_2} = \frac{1}{2}\,n/(n-1)$, and the design is Mm-optimal for prediction times $s_1$ and $s_2$ together.

Continue this pattern, defining input functions through the remaining intervals $(s_{i-1}, s_i]$, $i = 3,4,\ldots,m$. Within $(s_{i-1}, s_i]$ find values $t_1^i < t_2^i < \cdots < t_{n-2}^i$ such that
$$\sum_{j=1}^{i} \int_{s_{j-1}}^{t_1^j} w_{s_i}(s_i-u)\,du = \sum_{j=1}^{i} \int_{t_1^j}^{t_2^j} w_{s_i}(s_i-u)\,du = \cdots = \sum_{j=1}^{i} \int_{t_{n-2}^j}^{s_j} w_{s_i}(s_i-u)\,du = \frac{1}{n-1}, \qquad (4)$$
where $s_0 = 0$, for each $i = 3,4,\ldots,m$. By extension of the arguments above, any design constructed in this manner is Mm-optimal.
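The two-interval step of this construction can be verified numerically in the same way. A sketch for n = 4, again assuming (for simplicity, not from the text) uniform weights $w_s(v) = 1/s$, under which requirement (3) reduces to equal spacing within each interval:

```python
import itertools
import numpy as np

def trap(y, x):
    """Trapezoid rule over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

n, s1, s2 = 4, 0.5, 1.0
Z = np.array([[0, 0, 1],
              [0, 1, 0],
              [1, 0, 0],
              [1, 1, 1]], dtype=float)

# Uniform weights satisfy (2); each of [0, s1] and (s1, s2] is split
# into n-1 equal subintervals on which the columns of Z are replayed.
breaks1 = np.linspace(0.0, s1, n)
breaks2 = np.linspace(s1, s2, n)

def x(i, u):
    u = np.asarray(u, dtype=float)
    out = np.empty_like(u)
    first = u <= s1
    j1 = np.clip(np.searchsorted(breaks1, u[first], side="left") - 1, 0, n - 2)
    j2 = np.clip(np.searchsorted(breaks2, u[~first], side="left") - 1, 0, n - 2)
    out[first] = Z[i, j1]
    out[~first] = Z[i, j2]
    return out

u = np.linspace(0.0, s2, 8001)
wu = np.full_like(u, 1.0 / s2)  # w_{s2} over [0, s2]
dists = [trap(wu * np.abs(x(i, u) - x(j, u)), u)
         for i, j in itertools.combinations(range(n), 2)]
# Every pairwise distance through s2 should again be near (1/2) n/(n-1) = 2/3.
```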
Prediction sets S and weight functions $w_s$ can be constructed for which this scheme will not work because no $t_i^j$ exist satisfying the integral requirements (4). Here is one set of weight functions for which the construction works for any finite set S. Define a positive function W on $[0,1]$, and let $\Omega_i = \int_0^{s_i} W(1-u)\,du$. Define
$$w_{s_i}(s_i-u) = W(1-u)/\Omega_i \quad \text{for } u \in [0,s_i].$$
Over the interval $[0,s_i]$, these functions $w_{s_j}$, $j \ge i$, differ only by a constant factor. As a result, the integral requirements for the definition of $t_i^j$ can be satisfied within each interval individually. Specifically, so long as
$$\int_{s_{i-1}}^{t_1^i} w_{s_i}(s_i-u)\,du = \int_{t_1^i}^{t_2^i} w_{s_i}(s_i-u)\,du = \cdots = \int_{t_{n-2}^i}^{s_i} w_{s_i}(s_i-u)\,du = \frac{1}{n-1} \int_{s_{i-1}}^{s_i} w_{s_i}(s_i-u)\,du \qquad (5)$$
for all $i = 1,2,\ldots,m$, the construction described above will result in a Mm-optimal design with the correlation distance between any two points $d_{ij}^s = \frac{1}{2}\,n/(n-1)$ for each $s \in S$.

3.1. Example 1

Define
$$W(1-u) = u, \quad 0 < u \le 1.$$
Then $\Omega_i = \frac{1}{2} s_i^2$, and
$$w_{s_i}(s_i-u) = 2u/s_i^2, \quad u \in [0,s_i].$$
Select $t_1^i < t_2^i < \cdots < t_{n-2}^i$ such that (5) is satisfied, reducing in this case to
$$(t_1^i)^2 - s_{i-1}^2 = (t_2^i)^2 - (t_1^i)^2 = \cdots = s_i^2 - (t_{n-2}^i)^2 = \frac{1}{n-1}\,(s_i^2 - s_{i-1}^2)$$
for the change times in $[s_{i-1}, s_i]$. These are quadratic polynomials in the t's, and so can be easily solved. Fig. 1 is a display of change times for an optimal design with this weight function, n = 8, m = 5, and $S = \{0.4, 0.5, 0.6, 0.8, 1.0\}$. The specific set of input functions that are required to change values at each change time could be determined, for example, using any regular $2^{7-4}$ fractional factorial design of resolution III.

3.2. Example 2

Fig. 2A displays a design of eight input functions that may be more typical of the forcing functions used in dynamic models than those comprising the optimal designs described above. These eight functions are defined by
$$x_i(t) = (\cos(2\pi b t) + 1)/2, \quad t \in [0,1]$$
for $b \in \{0, 1, 2, \ldots, 7\}$. The designs shown in Fig. 2B and C are more "exaggerated" versions of these curves, transformed so that x(t) is "closer" to either 0 or 1, or equal to either 0 or 1. Specifically, for Fig. 2B, the curves are defined by
$$x_i(t) = \begin{cases} ([\cos(2\pi b t)]^{1/4} + 1)/2 & \text{if } \cos(2\pi b t) \ge 0, \\ (-[-\cos(2\pi b t)]^{1/4} + 1)/2 & \text{if } \cos(2\pi b t) < 0 \end{cases}$$
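These cosine designs can be evaluated against criterion (1) directly. A sketch for design 1A (waveform set A, α = 1), assuming frequency set 1 is read as $b \in \{0, 1, \ldots, 7\}$, with the Example 1 weights and the prediction-time set used for Table 1:

```python
import itertools
import numpy as np

def trap(y, x):
    """Trapezoid rule over grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Waveform set A with frequency set 1: x_i(t) = (cos(2*pi*b*t) + 1)/2.
design = [lambda u, b=b: (np.cos(2.0 * np.pi * b * u) + 1.0) / 2.0
          for b in range(8)]

# Example 1 weights; S = {0.5, 0.55, ..., 1.0} as used for Table 1.
S = [0.5 + 0.05 * k for k in range(11)]
w = lambda s, u: 2.0 * u / s**2

def phi(design, S, w, alpha, ngrid=4001):
    best = float("inf")
    for s in S:
        u = np.linspace(0.0, s, ngrid)
        wu = w(s, u)
        for xi, xj in itertools.combinations(design, 2):
            best = min(best, trap(wu * np.abs(xi(u) - xj(u)) ** alpha, u))
    return best

val = phi(design, S, w, alpha=1.0)
# val should land well below the bound (1/2)(8/7) = 0.5714; Table 1
# reports phi = 0.3339 for design 1A with alpha = 1.
```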
Fig. 1. Change times (filled circles) and prediction times (open circles) for a Mm distance optimal design as described in the text. (Horizontal axis: time, 0 to 1.)
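The change times plotted in Fig. 1 follow in closed form from the quadratic relations of Example 1; a short sketch reproducing them:

```python
import numpy as np

n = 8
S = [0.4, 0.5, 0.6, 0.8, 1.0]

def change_times(S, n):
    """Change times for the Example 1 weight w_{s_i}(s_i - u) = 2u/s_i^2:
    within (s_{i-1}, s_i], equal increments of t^2 of size
    (s_i^2 - s_{i-1}^2)/(n-1) give t_j in closed form."""
    times = []
    s_prev = 0.0
    for s in S:
        for j in range(1, n - 1):  # j = 1, ..., n-2
            times.append(np.sqrt(s_prev**2 + j * (s**2 - s_prev**2) / (n - 1)))
        s_prev = s
    return times

t = change_times(S, n)
# m = 5 intervals times (n-2) = 6 gives 30 change times, strictly
# increasing and interior to (0, 1).
```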
Fig. 2. Designs 1A–1C as described in the text. (Three panels, each plotting x against time over [0,1].)
Table 1
Values of $\phi$ for designs described in the text, with $\alpha$ = 1 and 2.

             Waveform set A        Waveform set B        Waveform set C
Freq. set    α = 1     α = 2       α = 1     α = 2       α = 1 or 2
1            0.3339    0.1810      0.3428    0.2580      0.3045
2            0.3339    0.1810      0.3590    0.2745      0.3473
3            0.3339    0.1810      0.3712    0.2873      0.3756
4            0.3339    0.1853      0.3774    0.2893      0.3990
and for Fig. 2C, the curves are defined by
$$x_i(t) = \begin{cases} 1 & \text{if } \cos(2\pi b t) \ge 0, \\ 0 & \text{if } \cos(2\pi b t) < 0 \end{cases}$$
for the same set of values of b, which we denote by "frequency set 1". We denote the three "waveform sets" by A, B, and C, and specify these three designs as 1A, 1B, and 1C, respectively. Designs 2A–2C were constructed in the same way using frequency set 2: $b \in \{0, 2, 3, \ldots, 8\}$; designs 3A–3C were constructed with frequency set 3: $b \in \{0, 3, 4, \ldots, 9\}$; and designs 4A–4C were constructed with frequency set 4: $b \in \{0, 4, 5, \ldots, 10\}$. For each design, the Mm criterion function $\phi$ (Eq. (1)) was computed for $S = \{0.5, 0.55, 0.6, \ldots, 1\}$ and weight functions as described in Example 1; results are displayed in Table 1. Note that the upper bound on $\phi$ for these designs is $\frac{1}{2} \cdot \frac{8}{7} = 0.5714$; functions in the optimal designs described earlier in this section would oscillate between the values of 0 and 1 approximately 44 times over the simulated time interval. The design in this collection that is best by the Mm criterion is 4C, based on the highest frequencies and most exaggerated waveforms, for which $\phi = 0.3990$. While this value is substantially less than the bound, it should be remembered that $\phi$ is not a statistical index that directly quantifies, say, precision, so $2((n-1)/n)\phi$ is not related to "efficiency" in the usual sense.

4. Practical limitations and conclusion

The construction outlined in Section 3 demonstrates that there are designs for which the maximin distance bound of Section 2 is reached, proving that no tighter general bound exists. However, the optimal designs constructed here would be of limited practical value in many cases because they require that each x(t):
- take only values that are the maximum or minimum allowed at each time (except over a range of zero integration measure), and
- oscillate between these two extreme values approximately nm/2 times over $t \in [0,1]$.

It may be possible to find optimal designs consisting of functions that oscillate between extreme values fewer times. However, any design for which one or more $0 < x_i(t) < 1$ over sets of t that have non-zero integration measure cannot attain the general Mm bound. Still, in many applications, the only input functions that are physically realistic are "smooth", or at least vary less radically, and the optimal input sets described here would have limited credibility. Near-optimal designs for more constrained cases may be constructed numerically, as in the example of Morris (2012). While more difficult than the derivation presented here, it would be of interest to construct alternative criterion bounds that are conditioned on smoothness or magnitude of the derivatives of x(t).
Acknowledgments

The author gratefully acknowledges the contributions of the Associate Editor and Referees; their comments have led to significant improvements in this paper.

References

Booker, A.J., Dennis, J.J., Frank, P., Serafini, D., Torczon, V., Trosset, M., 1999. A rigorous framework for optimization of expensive functions by surrogates. Structural Optimization 17, 1–13.
Currin, C., Mitchell, T., Morris, M., Ylvisaker, D., 1991. Bayesian prediction of deterministic functions with application to the design and analysis of computer experiments. Journal of the American Statistical Association 86, 953–963.
Drignei, D., Morris, M.D., 2006. Empirical Bayesian analysis for computer experiments involving finite-difference codes. Journal of the American Statistical Association 101, 1527–1536.
Henderson, D.A., Boys, R.J., Wilkinson, D.J., 2010. Bayesian calibration of a stochastic model using multiple data sources. Biometrics 66, 249–256.
Higdon, D., Gattiker, J., Williams, B., Rightley, M., 2008. Computer model calibration using high dimensional output. Journal of the American Statistical Association 103, 570–583.
Higdon, D., Kennedy, M., Cavendish, J., Cafeo, J., Ryne, R.D., 2004. Combining field observations and simulations for calibration and prediction. SIAM Journal on Scientific Computing 26, 448–466.
Johnson, M.E., Moore, L.M., Ylvisaker, D., 1990. Minimax and maximin distance designs. Journal of Statistical Planning and Inference 26, 131–148.
Kennedy, M.C., O'Hagan, A., 2001. Bayesian calibration of computer models. Journal of the Royal Statistical Society, Series B 63, 425–464.
Morris, M.D., 2012. Gaussian surrogates for computer models with time-varying inputs and outputs. Technometrics 54, 42–50.
Plackett, R.L., Burman, J.P., 1946. The design of optimum multifactorial experiments. Biometrika 33, 305–325.
Sacks, J., Welch, W., Mitchell, T., Wynn, H., 1989. Design and analysis of computer experiments. Statistical Science 4, 409–423.
Tebaldi, C., Smith, R.L., Nychka, D., Mearns, L.O., 2005. Quantifying uncertainty in projections of regional climate change: a Bayesian approach to the analysis of multimodel ensembles. Journal of Climate 18, 1524–1540.