Simulation Modelling Practice and Theory 54 (2015) 86–100
Interval metamodels for the analysis of simulation Input–Output relations

Bogumił Kamiński, Warsaw School of Economics, Al. Niepodległości 162, 02-554 Warszawa, Poland
Article history: Received 23 March 2014; Received in revised form 5 March 2015; Accepted 25 March 2015; Available online 7 April 2015.

Keywords: Stochastic simulation; Metamodeling; Model verification; Bayesian inference
Abstract. We analyze the use of simulation metamodels for understanding the relationship between the inputs and outputs of stochastic simulation models. Such metamodels are expected to have a simple structure, but can be conditionally biased. In order to overcome this limitation, we propose the use of interval metamodels that assume that the prediction of the metamodel is an interval rather than a single point. We develop a Bayesian procedure that allows a researcher to verify how well an interval metamodel covers a response surface of a simulation model. The proposed method is illustrated by application to Schelling's segregation model. The example shows how the procedure can help a researcher to identify non-obvious characteristics of a simulation model response surface. © 2015 Elsevier B.V. All rights reserved.
1. Introduction

Complex stochastic simulations are often viewed as black boxes in the way they transform their input parameters into output characteristics of modeled systems. Therefore, it is often proposed to use their simpler approximations, called simulation metamodels [1]. The relevant literature identifies various reasons why metamodels can be useful for a researcher [2,3]. They can be summarized in three major groups: (i) understanding the shape of the relationship between the inputs and outputs of a model, (ii) prediction and (iii) optimization. These different usage scenarios imply different approaches to the selection of a functional specification of a metamodel, simulation experiment design and parameter estimation. A review and comparison of metamodel types applied in practice is presented in [4].

In this work we concentrate on metamodels developed to improve the understanding of simulation models. In this case a researcher usually requires that the metamodel has a simple structure that is easy to understand – like linear regression or a low-order polynomial [2]. However, the simplicity comes at a cost, and predictions from such models can be biased in subdomains of the input parameter space (this situation is called conditional bias). The reason is that these types of metamodels usually have a very constrained specification (which is desirable from the interpretation point of view), but because of this they capture global input–output relationships and are unable to fit the potentially complex local shape of a response surface of a simulation model. This situation can be contrasted with metamodels used for prediction, where a very accurate approximation of a response surface is required even at the cost of metamodel interpretability. We encounter such a situation, for example, in cases when a metamodel is to be used as a surrogate building block of some larger scale simulation [3].
Let us assume that a researcher requires a simple metamodel and accepts the loss in its goodness of fit. This, however, does not mean that objective metamodel acceptance criteria should be abandoned. For stochastic simulation metamodels,
Kleijnen and Sargent [2] suggest the following general procedure. Once a metamodel is estimated, a new set of simulations should be performed to generate validation observations. Such a data set will contain information about simulation outputs for different input parameters. Additionally, because the simulation is stochastic, it should be replicated several times for the same set of input parameter values. The random nature of simulation implies that standard measures of model fit – like residual mean squared error or mean absolute error – are not fully adequate, as they mix simulation variance with metamodel bias.

Let us explain the above issue by considering a fixed input d and denoting by Y(d) a random variable representing the output of the simulation for input d. Now by μ_Y(d) = E(Y(d)) we define the expected value of the simulation output¹ and Ŷ(d) is its prediction given by the metamodel. A single replication of the simulation gives us one sample of the random variable Y(d). Using the above notation we have:

    D²(Ŷ(d) − Y(d)) = (Ŷ(d) − μ_Y(d))² + D²(Y(d)),    (1)

so the variance of the deviation of the metamodel from a single simulation output is equal to the sum of the squared bias of the metamodel and the simulation output variance.

In order to address the simulation output variability problem, Kleijnen and Sargent [2] recommend, as one of the possible procedures, testing whether the absolute deviation of a metamodel prediction from a mean of simulation outputs for fixed input values is below a given threshold (Eq. (3) in [2]) and suggest that the results of these tests for different input values can be combined into a simultaneous test using the Bonferroni inequality.

In this paper we propose to extend this approach using the Bayesian framework. The key idea is to accept that a simple metamodel can be conditionally biased.² However, we want to control the size of this bias. Assume that a researcher accepts that the expected value of simulation μ_Y(d) lies in the interval B̂(d) = [Ŷ(d) − l(d), Ŷ(d) + h(d)], so that the bias is not greater than l(d) from below and h(d) from above. In this setup we can think of a metamodel as returning an interval B̂(d), not a single value Ŷ(d). We will call B̂ an interval metamodel. Now we can pose two questions: (1) what is the probability that μ_Y(d) ∈ B̂(d) for fixed d and (2) how can we generalize
this probability by allowing d to vary over the whole input parameter space.

It should be stressed here that in this paper we are concerned with the deviation of the metamodel Ŷ from the response surface μ_Y (which is the expected value of the simulation output). Therefore, the probability that μ_Y(d) falls into the interval metamodel prediction B̂(d) should not be mixed with the probability that a single observation of the simulation output (a realization of the random variable Y(d)) falls into this interval, which is a different question and, in general, will be lower.³

In summary, we assume that we have a metamodel that has a simple functional form, but consider not a point but an interval prediction and call it the interval metamodel. This allows us to retain a clear interpretation of the metamodel and simultaneously control the level of its bias. In the paper we develop a Bayesian procedure that allows for validation of interval metamodels.

The structure of the paper is as follows. In Section 2 we define interval metamodels and propose a methodology for their validation. Next, in Section 3 we present an example application of the proposed methodology to the classical Schelling's segregation model. Following the standard requirements for the reproducibility of computational models [5] and the Simulation Modelling Practice and Theory journal guideline that theoretical results should be presented in the context of their applicability, http://bogumilkaminski.pl/pub/intmeta.zip provides the source codes of all the computations presented in the paper. For all implementations we used R version 3.0.0 [6] with additional packages: poibin [7] and gtools [8]. Subsequent listings assume that earlier ones were executed in the R console. Two data sets, train.txt and test.txt, are required to run listing8.r and can be downloaded from http://bogumilkaminski.pl/pub/schelling.zip.

2. Validation of interval metamodels

By a simulation model we understand a mapping Y : D → 𝒴, where D is a set of input parameters to the simulation model and 𝒴 is a space of random variables called simulation model results. Random variables in 𝒴 have support on the real numbers ℝ.⁴ A single run of a simulation for d ∈ D produces a simulation output, which is a realization of the random variable Y(d). The element d ∈ D will be called the input parameters of the simulation model. Again we will focus on the most common case and assume that D ⊆ ℝᵏ. In such a case the input parameters form a k-dimensional real vector.

¹ In the following text μ_X will denote the expected value of a random variable X.
² By conditional bias we understand the deviation Ŷ(d) − μ_Y(d) for a fixed input parameter d (bias can be positive or negative). However, for typical metamodels (e.g. linear regression estimated using the least squares method) the average of the conditional bias over the whole input parameter space D will be close to 0 because positive and negative conditional biases will roughly cancel out. We want to stress that it is possible for a simple metamodel to be unbiased on average, but it usually will be biased for fixed d. This situation is illustrated in Figs. 1 and 8, which are discussed in Sections 2 and 3 respectively.
³ A single simulation output has additional noise due to the stochastic nature of the simulation, as shown in Eq. (1).
⁴ It is possible to extend the presented analysis to more general sets than ℝ, but real-valued random variables are most common in applications so we focus only on them.
With the given simulation model Y we associate a metamodel Ŷ : D → ℝ. The metamodel is usually estimated using data gathered in a simulation experiment, but it can also be specified by a researcher based on her assumptions or domain knowledge. For input parameters d ∈ D the value of the metamodel Ŷ(d) aims to predict E(Y(d)).⁵ We will call μ_Y(d) = E(Y(d)) a response surface of model Y.

With the metamodel Ŷ we associate a confidence bandwidth⁶ given by two functions l : D → ℝ₊ and h : D → ℝ₊. We will say that in point d the metamodel approximates μ_Y(d) in the bandwidth B̂(d) = [Ŷ(d) − l(d), Ŷ(d) + h(d)]. Observe that we assume that l and h take strictly positive values. This implies that the interval B̂(d) always has a strictly positive length. The mapping B̂ will be called an interval metamodel. The following example presents the introduced concepts.

Example. Assume that D = [0, π] and Y₁(d) ∼ N(1 + d + sin(10d)/4, 1).⁷
The response surface of this model is μ_Y₁(d) = 1 + d + sin(10d)/4. Now assume that we use a simplified metamodel Ŷ₁(d) = 1 + d. This would be a natural choice of a metamodel if we did not know the formula for μ_Y₁(d).⁸ However, usually we are unsure what the precision of our metamodel is, so we define a confidence bandwidth by assuming l₁(d) = h₁(d) = 0.125 and thus B̂₁(d) = [Ŷ₁(d) − 0.125, Ŷ₁(d) + 0.125].⁹ □
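The example model is easy to reproduce in code. The paper's listings are in R; the sketch below is an illustrative Python version (the function names are ours):

```python
import math
import random

def mu_Y1(d):
    # response surface: expected simulation output at input d
    return 1 + d + math.sin(10 * d) / 4

def Y1(d, rng):
    # one stochastic run of the simulation model: N(mu_Y1(d), 1)
    return rng.gauss(mu_Y1(d), 1)

def Y1_hat(d):
    # the simplified linear metamodel
    return 1 + d

def in_band(d, l=0.125, h=0.125):
    # C(d): is the response surface inside the interval metamodel at d?
    return Y1_hat(d) - l <= mu_Y1(d) <= Y1_hat(d) + h

rng = random.Random(42)
sample_mean = sum(Y1(0.0, rng) for _ in range(10_000)) / 10_000
print(round(sample_mean, 2))   # close to mu_Y1(0) = 1
print(in_band(0.0))            # True, so C(0) = 1
print(in_band(math.pi / 40))   # False: the bias sin(pi/4)/4 exceeds 0.125
```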
It is pertinent to consider the question regarding the accuracy of the interval metamodel, that is: can we expect that the response surface lies within it? This concept is formalized below.

First let us analyze the case of a single d ∈ D. We want to answer the question if μ_Y(d) ∈ B̂(d). Let us define a function C : D → {0, 1}. In point d ∈ D it will be equal to 1 if μ_Y(d) ∈ B̂(d) and equal to 0 otherwise. We will call C(d) the interval metamodel confidence at point d.

Next, we will associate the space D with a probability measure ν implying a cumulative distribution function F_D : D → ℝ.¹⁰ Then it is natural to try to measure the probability that μ_Y(d) ∈ B̂(d) for a random element d drawn from D. It can be calculated as:

    C = ∫_{d∈D} C(d) dF_D(d).    (2)
C will be called the interval metamodel confidence. Note that we assume that the set {d ∈ D : C(d) = 1} is measurable with respect to ν. Let us remark here that if the researcher is interested in a specific point d then it is enough for her to calculate C(d). However, very often one wants to verify the metamodel over some region of parameters, so then the calculation of C can be applied. It is worth noting that although we assumed that the integral given in Eq. (2) is calculated over the whole input parameter space D, the calculation of C for some subset D′ ⊆ D representing a region of input space that is interesting for the researcher can be performed using the same formulas after a conditional update of the measure ν, provided that D′ is ν-measurable.

Example. For our example simulation model Y₁ and the associated metamodel Ŷ₁ with bounds l₁ and h₁ it is easy to calculate C(d) and C analytically, assuming that the probability measure ν₁ implies a uniform distribution of d on the interval [0, π]. Observe that μ_Y₁(d) − Ŷ₁(d) = sin(10d)/4. This value is in absolute value greater than 0.125 for

    d ∈ Z = ⋃_{k=0}^{9} [(1 + 6k)π/60, (5 + 6k)π/60].

Therefore C(d) = 0 for d ∈ Z and otherwise it is equal to 1. Under the measure ν₁ we get C = 1/3. The situation is depicted in Fig. 1. □

The purpose of this paper is to present methods that allow a researcher to approximate C(d) and C using simulation. We shall apply the Bayesian methodology and follow the classical approach, as defined by Diaconis and Freedman [9].¹¹

⁵ Again it is possible to extend the analysis to statistical parameters other than the expected value, but metamodels predicting the expected value of a simulation are most common in practice.
⁶ We use the term confidence because it expresses the researcher's assumption about the range of the expected value of Y(d) – similarly to confidence intervals in estimation.
⁷ By N(μ, σ) we denote a random variable having a normal distribution with mean μ and standard deviation σ.
⁸ Here we assume, for the clarity of the exposition, that the metamodel is given. In practice we would take that the model has a linear form a₀ + a₁d and estimate (a₀, a₁) using a sample of the simulations.
⁹ The bandwidth will usually be identified using some statistical procedure; see Section 3 for an example.
¹⁰ The measure ν might be implied by a known empirical distribution of the input parameters. If such a distribution is not known, the uniform distribution assumption is the simplest one.
¹¹ A similar approach is used in Chapter 4 of [10] (in Polish), which contains the author's initial results on Bayesian analysis of a simulation model's input–output relations.
[Fig. 1. Relationship between response surface μ_Y₁(d) (thick black line), metamodel Ŷ₁(d) (dashed gray line) and interval metamodel B̂₁(d) (gray lines).]
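The analytic value C = 1/3 can be checked numerically by integrating C(d) over [0, π] under the uniform measure. A minimal Python sketch (the paper's own code is in R):

```python
import math

def mu_Y1(d):
    # response surface of the example model
    return 1 + d + math.sin(10 * d) / 4

def C_point(d, l=0.125, h=0.125):
    # interval metamodel confidence C(d) for the metamodel 1 + d
    bias = mu_Y1(d) - (1 + d)   # equals sin(10 d) / 4
    return 1 if -l <= bias <= h else 0

# midpoint-rule approximation of C = integral of C(d) dF(d), F uniform on [0, pi]
N = 300_000
C = sum(C_point((k + 0.5) * math.pi / N) for k in range(N)) / N
print(C)  # approximately 1/3
```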
First we focus on the approximation of C(d). Assume that we have some a priori assumption about the distribution of μ_Y(d) given by a cumulative distribution function F_{d,0} : ℝ → [0, 1]. Now assume that we are able to simulate the model Y at point d independently n ∈ ℕ times. We assume that μ_Y(d) is a known function of some unknown parameters θ ∈ Θ that uniquely define the distribution Q_θ of Y(d).¹²

Next, closely following [9], consider a family of probabilities {Q_θ : θ ∈ Θ} on ℝ. By Q_θⁿ we denote the product measure on ℝⁿ. This means that the n samples are independent with the common distribution Q_θ. We assume that Θ is a Borel subset of a complete separable metric space (typically it will be a subset of ℝᵏ, where k is the number of parameters). Let ξ₀ be a prior probability on Θ. Given this notation, let P_{ξ₀} be the joint distribution of the parameters θ and an n-element sample: P_{ξ₀}(A × B) = ∫_A Q_θⁿ(B) ξ₀(dθ), where A and B are Borel sets. Given this probabilistic structure it is possible to calculate the posterior probability on Θ given the n-element sample, which we denote as ξ_n.

Under this formalism F_{d,0} can be derived from ξ₀ via a known link between θ and μ_Y(d). Following this, the information taken from the n-element sample allows us to compute ξ_n using the Bayes theorem to obtain a posteriori beliefs about the distribution of μ_Y(d), represented by a cumulative distribution function F_{d,n}. We assume that F_{d,n} is consistent in the sense of [9]. Given this information we can calculate the posterior probability that C(d) is equal to 1, conditional on the a priori distribution F_{d,0} and the n-element sample, as:

    c_n(d) = Pr(−l(d) ≤ μ_Y(d) − Ŷ(d) ≤ h(d)) = ∫_{Ŷ(d)−l(d)}^{Ŷ(d)+h(d)} dF_{d,n}(x),    (3)
where the subscript n indicates that we have taken a sample of size n. Note that c_n(d) is a real number that we can calculate after taking the sample. Of course c_n(d) will vary conditional on the sample. Therefore we define a random variable C_n(d) describing the distribution of c_n(d) prior to taking the sample.

Example. We give an example of a calculation of c_n(d) and an approximation of C_n(d) for the simulation model Y₁ and the interval metamodel B̂₁. We consider input parameter d = 0 and sample size equal to n = 200. Note that C(0) = 1. Here, and in the following examples, we assume that a researcher knows that Y₁(d) is normally distributed with standard deviation equal to 1 and only its mean is unknown.¹³ We assume uniform improper prior beliefs about μ_Y(d). In such a case, after taking an n-element sample of Y(d), the posterior distribution of μ_Y(d) under the uninformative prior is normal with mean equal to the sample mean and variance equal to 1/n (see for example [11]). Let Φ denote the cumulative distribution function of the standard normal distribution. Then for a given sample s₁(d) with mean m̄ we have:

    c_n(d) = Φ(√n (Ŷ₁(d) + h₁(d) − m̄)) − Φ(√n (Ŷ₁(d) − l₁(d) − m̄)).    (4)

After the evaluation of the above expression using the code from listing1.r we obtain c₂₀₀(0) ≈ 0.8855. Now we switch to the approximation of C₂₀₀(0). Numerically we perform it by a replicated calculation of c₂₀₀(0). We can do it by running the code from listing2.r. The result is depicted in Fig. 2. We calculate that the approximated mean of C₂₀₀(0) is equal to 0.7880, but it has a significant variance due to the small sample size n. □
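Eq. (4) is straightforward to implement. Below is a Python transcription of the computation that listing1.r performs in R; plugging in a sample mean equal to the true mean μ_Y₁(0) = 1 is our illustrative assumption, so the value differs from the paper's sample-dependent c₂₀₀(0) ≈ 0.8855:

```python
import math

def Phi(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def c_n(y_hat, l, h, sample_mean, n):
    # Eq. (4): posterior probability that mu_Y(d) lies in
    # [y_hat - l, y_hat + h] when the posterior is N(sample_mean, 1/n)
    s = math.sqrt(n)
    return Phi(s * (y_hat + h - sample_mean)) - Phi(s * (y_hat - l - sample_mean))

# d = 0: metamodel prediction 1, bandwidth l = h = 0.125, n = 200 replications;
# sample_mean = 1.0 is our choice (the true mean), not the paper's random draw
c = c_n(1.0, 0.125, 0.125, sample_mean=1.0, n=200)
print(round(c, 4))  # 0.9229
```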
¹² Very often μ_Y(d) is simply one of these parameters. For example, if we assumed that Y(d) is normally distributed then θ is a pair of two parameters representing the mean value and variance of the distribution, and μ_Y(d) is the first of these parameters.
¹³ In Section 3 we provide a more realistic study, but here we want to restrict our analysis to a simple case.
[Fig. 2. Histogram of C₂₀₀(0) for simulation model Y₁ and interval metamodel B̂₁.]
Next we analyze the relationship between μ_Y(d) and B̂(d). We have three possibilities:

(A1) μ_Y(d) ∈ (Ŷ(d) − l(d), Ŷ(d) + h(d));
(A2) μ_Y(d) ∉ [Ŷ(d) − l(d), Ŷ(d) + h(d)];
(A3) μ_Y(d) ∈ {Ŷ(d) − l(d), Ŷ(d) + h(d)}.
We have assumed that F_{d,n} is consistent, so for almost all sampling sequences:

    ∀ε > 0 : lim_{n→+∞} ∫_{μ_Y(d)−ε}^{μ_Y(d)+ε} dF_{d,n}(x) = 1.    (5)
Therefore in case (A1) for almost all sampling sequences lim_{n→+∞} c_n(d) = 1, as there exists an ε > 0 such that [μ_Y(d) − ε, μ_Y(d) + ε] ⊆ B̂(d). This means that C_n(d) almost surely converges to C(d) = 1. Analogously, in case (A2) we get that C_n(d) almost surely converges to C(d) = 0. The case (A3) is problematic, as in this situation even asymptotically we do not know the value of F_{d,n}(Ŷ(d) + h(d)) − F_{d,n}(Ŷ(d) − l(d)). Thus it is not guaranteed that C_n(d) almost surely converges to C(d) = 1.

The above analysis implies that we can test whether μ_Y(d) ∈ B̂(d) only if μ_Y(d) is not on the edge of B̂(d). This is exactly the reason why we require that l(d) and h(d) are strictly positive. If l(d) = h(d) = 0 and Ŷ(d) = μ_Y(d) (we have a perfect metamodel in point d) then we would not be able to conclude that C_n(d) almost surely converges to 1. The following example illustrates the above considerations.

Example. We consider three input parameters d ∈ {0, π/60, π/40}. Note that C(0) = C(π/60) = 1 and C(π/40) = 0. The difference between points 0 and π/60 is that in point π/60 we have that μ_Y₁(π/60) lies on the boundary of B̂₁(π/60). We take n ∈ {10, 100, 1000, 10000} and report the mean value and the standard deviation of the approximation of C_n(d). This can be calculated by executing the code from listing3.r. The results are presented in Table 1. We can observe that for d ∈ {0, π/40} an increase in sample size makes the distribution converge to the proper value of C(d). However, for d = π/60 the distribution does not converge because this is a boundary point. □

There is one issue related to the diagnostics for the evaluation of c_n(d). If it is low (not close to 1) we can have three possibilities: (a) strong evidence (high posterior probability) that μ_Y(d) < Ŷ(d) − l(d), (b) strong evidence that μ_Y(d) > Ŷ(d) + h(d), and (c) the sample size n is low and the posterior distribution has a large spread relative to the width of B̂(d) and encompasses this interval from above and below. In the third situation both Pr(μ_Y(d) < Ŷ(d) − l(d)) and Pr(μ_Y(d) > Ŷ(d) + h(d)) are high. In order to identify this third undesired possibility, we define the following diagnostic measure to detect it:
    v_n(d) = min{ ∫_{−∞}^{Ŷ(d)−l(d)} dF_{d,n}(x), ∫_{Ŷ(d)+h(d)}^{+∞} dF_{d,n}(x) }.    (6)

If v_n(d) is high (observe that this implies that c_n(d) cannot be high simultaneously) we may conclude that the sample size n is insufficient relative to the assumed width of B̂(d). The way that v_n(d) can be used in practice is presented in Section 3.
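Under the normal-posterior setting of the examples, v_n(d) is the smaller of the two tail masses outside the band. A hypothetical Python helper (not one of the paper's listings) makes the small-n pathology visible:

```python
import math

def Phi(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def v_n(y_hat, l, h, sample_mean, n):
    # Eq. (6): the smaller of the two posterior tail masses outside the band,
    # the posterior of mu_Y(d) being N(sample_mean, 1/n)
    s = math.sqrt(n)
    below = Phi(s * (y_hat - l - sample_mean))        # mass below y_hat - l
    above = 1 - Phi(s * (y_hat + h - sample_mean))    # mass above y_hat + h
    return min(below, above)

# small n: the posterior spread dwarfs the band, so both tails are heavy
print(round(v_n(1.0, 0.125, 0.125, sample_mean=1.0, n=4), 4))    # about 0.4
# large n: the posterior concentrates inside the band
print(round(v_n(1.0, 0.125, 0.125, sample_mean=1.0, n=400), 4))  # near 0
```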
Table 1. The mean value and the standard deviation of the approximation of C_n(d) for d ∈ {0, π/60, π/40} and n ∈ {10, 100, 1000, 10,000}.

    d       n        Mean      sd
    0       10       0.2203    0.0838
            100      0.6230    0.1777
            1000     0.9946    0.0200
            10,000   >0.9999   <0.0001
    π/60    10       0.2120    0.0884
            100      0.4615    0.2455
            1000     0.5007    0.2879
            10,000   0.4999    0.2890
    π/40    10       0.2036    0.0921
            100      0.3403    0.2477
            1000     0.1239    0.1689
            10,000   0.0001    0.0016
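The non-convergence at the boundary point can be reproduced with a short Monte Carlo study, a Python analogue of what listing3.r computes in R (the seed and replication count are our choices). Since μ_Y₁(π/60) sits exactly on the upper edge of B̂₁(π/60), c_n(π/60) tends to a uniform draw on (0, 1), so its mean stays near 0.5 no matter how large n gets:

```python
import math
import random

def Phi(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def c_n(y_hat, l, h, sample_mean, n):
    s = math.sqrt(n)
    return Phi(s * (y_hat + h - sample_mean)) - Phi(s * (y_hat - l - sample_mean))

rng = random.Random(1)
d = math.pi / 60
mu = 1 + d + math.sin(10 * d) / 4    # = (1 + d) + 0.125: exactly on the upper edge
n, reps = 10_000, 2_000
vals = []
for _ in range(reps):
    m_bar = rng.gauss(mu, 1 / math.sqrt(n))   # distribution of the sample mean
    vals.append(c_n(1 + d, 0.125, 0.125, m_bar, n))
print(round(sum(vals) / reps, 2))  # stays close to 0.5, as in Table 1
```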
Having finished the discussion for a single element of the input parameter set, we proceed to the evaluation of the overall interval metamodel confidence C.

Consider a set A3 ⊆ D consisting of points d where the distribution F_{d,n} is not consistent or μ_Y(d) ∈ {Ŷ(d) − l(d), Ŷ(d) + h(d)}, case (A3) above. We assume that ν(A3) = 0. This means that almost surely we can omit those cases where C_n(d) does not converge to C(d). We take it that the prior distributions of μ_Y(d) in different points are independent and identical. This reflects the assumption that there are no prior assumptions about the shape of μ_Y(d).¹⁴

We independently sample m points from the set D with respect to the measure ν. By D_i we denote a random variable representing the result of the i-th draw. Define X_{i,n} = C_n(D_i). Note that because the D_i are independent and have an identical distribution, the X_{i,n} are also independent and have an identical distribution. The random variable X_{i,n} can be interpreted as a distribution of c_n(d) before the drawing of d and taking n samples from Y(d). Therefore, the set of computed values c_n(d_i), where d_i is a realization of D_i, can be seen as m independent samples of the random variable X_{i,n}, and the empirical cumulative distribution function built using them approximates the cumulative distribution function of X_{i,n}. Denote the average of the c_n(d_i) as:

    c_{n,m} = (1/m) Σ_{i=1}^{m} c_n(d_i).    (7)

It is a realization of a random variable:

    C_{n,m} = (1/m) Σ_{i=1}^{m} X_{i,n}.    (8)
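Eqs. (7) and (8) suggest a direct Monte Carlo estimator of C. The following Python sketch mirrors, under our own choice of n, m and seed, what listing4.r does in R for the running example:

```python
import math
import random

def Phi(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def c_n(y_hat, l, h, sample_mean, n):
    # Eq. (3) for a normal likelihood with sd 1 and uniform improper prior
    s = math.sqrt(n)
    return Phi(s * (y_hat + h - sample_mean)) - Phi(s * (y_hat - l - sample_mean))

rng = random.Random(7)
n, m = 10_000, 500
cs = []
for _ in range(m):
    d = rng.uniform(0, math.pi)                # d_i drawn from the uniform measure
    mu = 1 + d + math.sin(10 * d) / 4          # true response surface of Y1
    m_bar = rng.gauss(mu, 1 / math.sqrt(n))    # sample mean of n runs of Y1(d)
    cs.append(c_n(1 + d, 0.125, 0.125, m_bar, n))
c_nm = sum(cs) / m                             # Eq. (7)
print(round(c_nm, 2))  # close to the analytic C = 1/3
```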
In what follows we analyze the asymptotic properties of X_{i,n} and C_{n,m}. Consider a set of elementary events Ω = D × S^∞. It denotes a choice of d ∈ D and then an infinite sequence of draws from the random variable Y(d).¹⁵ We define a random variable W : Ω → {0, 1}, where W(ω) = C(d_ω) for the d_ω implied by ω ∈ Ω. Observe that W has a Bernoulli distribution with the probability of success equal to C. We will show that as n tends to infinity, X_{i,n} almost surely converges to W.

Lemma 1. If ν(A3) = 0 then X_{i,n} almost surely converges to a Bernoulli random variable with the probability of success equal to C as n tends to infinity.

Proof. First we split the set Ω into three disjoint subsets A1, A2 and A3. By d_ω we will denote the element of D implied by ω. Define A1 as the set of such ω where the distribution F_{d_ω,n} is consistent and μ_Y(d_ω) ∈ (Ŷ(d_ω) − l(d_ω), Ŷ(d_ω) + h(d_ω)). Analogously define A2 as the set of such ω where the distribution F_{d_ω,n} is consistent and μ_Y(d_ω) ∉ [Ŷ(d_ω) − l(d_ω), Ŷ(d_ω) + h(d_ω)].
¹⁴ This assumption is often made in the Bayesian approach to ranking and selection problems [12]. The difference here is that these methods assume that only a few alternatives are compared, while we allow the decision space D to be an infinite subset of ℝᵏ.
¹⁵ We consider infinite sequences of draws in order to be able to perform asymptotic analysis. In practice a researcher will observe only finite sequences.
A3 follows the definition of A3 given earlier. Observe that A1, A2 and A3 are disjoint and A1 ∪ A2 ∪ A3 = Ω. By assumption ν(A3) = 0, so we can concentrate on the sets A1 and A2. If ω ∈ A1 ∪ A2 then we have W(ω) = C(d_ω). But then X_{i,n}(ω) = C_n(d_ω)(ω) and we have shown above that C_n(d_ω) almost surely converges to C(d_ω). This implies that X_{i,n} almost surely converges to W as n tends to infinity. □

Note that this result implies that lim_{n→+∞} E(X_{i,n}) = C, and consequently lim_{n→+∞} E(C_{n,m}) = C. However, for finite m, even as n tends to infinity, C_{n,m} is a random variable (in the limiting case it is an average of m Bernoulli distributed random variables). Now we move on to the analysis of the case where both n and m tend to infinity.

Theorem 1. If ν(A3) = 0 then C_{n,m} converges in probability to C as n and m jointly tend to infinity.

Proof. The statement of the theorem can be equivalently expressed as:

    ∀ε > 0 : lim_{(n,m)→(+∞,+∞)} Pr(|C_{n,m} − C| ≥ ε) = 0.    (9)
Observe that:

    0 ≤ Pr(|C_{n,m} − C| ≥ ε) ≤ Pr(|C_{n,m} − E(C_{n,m})| + |E(C_{n,m}) − C| ≥ ε).    (10)

We have shown that lim_{n→+∞} E(C_{n,m}) = C, so there exists an n₀ such that:

    ∀n > n₀ : |E(C_{n,m}) − C| < ε/2.    (11)

Note that n₀ is independent of m as E(C_{n,m}) = E(X_{i,n}). Thus our problem reduces to proving that for n > n₀:

    ∀ε > 0 : lim_{(n,m)→(+∞,+∞)} Pr(|C_{n,m} − E(C_{n,m})| ≥ ε/2) = 0.    (12)

From Chebyshev's inequality we have that:

    Pr(|C_{n,m} − E(C_{n,m})| ≥ ε/2) ≤ 4 D²(C_{n,m})/ε² = 4 D²(X_{i,n})/(mε²).    (13)

Now observe that X_{i,n} is limited to the [0, 1] interval, so D²(X_{i,n}) ≤ 1/4, and thus we have:

    Pr(|C_{n,m} − E(C_{n,m})| ≥ ε/2) ≤ 1/(mε²).    (14)
Substituting this back into Eq. (12) we can see that for n > n₀ we get the required convergence to 0 as m tends to infinity. This conclusion finishes the proof. □

Note that in the proof we need both n and m to jointly tend to infinity. From the proof it follows that an increase of n reduces the bias of C_{n,m} from C and an increase of m reduces its variance. In order to converge we need to make both the bias and the variance tend to 0.

In the proposed procedure we assume no structure of the simulation model Y because we seek to validate the existing metamodel. This approach is expressed by the assumption that the a priori distributions are uninformative and independent for all the elements of the set D. It should be noted at this point that there exists literature presenting models where an assumption about the correlation structure of the response surface μ_Y(d) for different input parameters is made; see for example [13,12] for the Bayesian setting or [14,15] for the frequentist treatment. Such approaches allow authors to get stronger results, mainly in terms of reduced requirements for the number of samples and the ability to perform statistical inference about input parameters d ∈ D that were not sampled, yet at the cost of additional assumptions that might not be met by the response surface μ_Y(d).

In summary, we have the following conclusions:

(1) the measure c_{n,m} can be used to assess the probability that the response surface μ_Y lies in the range defined by the interval metamodel B̂;
(2) the empirical cumulative distribution function calculated using the c_n(d_i) approximates the cumulative distribution of X_{i,n}; in combination with the diagnostic values v_n(d_i) this allows a researcher to identify points d_i for which the sample size n was insufficient to precisely determine if μ_Y(d_i) ∈ B̂(d_i);
(3) it is possible to build a model that explains c_n(d_i) by d_i; this can help a researcher to identify a subspace of D where the interval metamodel B̂ does not fit the response surface μ_Y well.
Example. Let us present the above properties for the simulation model Y₁ and the interval metamodel B̂₁ taken from the earlier examples. First we want to follow the procedure described above for n = 10,000 and m = 1000. The code performing this computation is given in listing4.r. We get the value c_{n,m} ≈ 0.3001, which approximates the value 1/3 that was calculated analytically above. The empirical cumulative distribution function of c_{n,m} is presented in Fig. 3. We note that in most of the points we get a strong indication to accept the hypothesis that μ_Y(d) ∈ B̂(d) (the jump in the empirical distribution function at 1) or reject it (a similar jump around 0). There are, however, some points d_i in which the value of c_n(d_i) is not close to these values. We investigate them in Fig. 4. By comparing it with Fig. 1 we can note that the problems occur in points where μ_Y₁(d) is near the boundary of the interval defined by B̂₁(d). In such points more samples per point would be required to accurately verify if μ_Y₁(d) ∈ B̂₁(d).

Now we turn to the analysis of the bias and variance of C_{n,m}. We take n ∈ {10, 100, 10,000}, m ∈ {10, 50, 100} and report the mean value and the standard deviation of C_{n,m} computed using the simulation given in the code from listing5.r. The results are presented in Table 2. We note three things. Firstly, the bias is reduced by the increase of n and is unaffected by m. Secondly, the standard deviation decreases with m. Lastly, the standard deviation increases with n. □

The first two observations from the above example are intuitive and follow the proof of convergence of C_{n,m} to C. The third result may be surprising and is explained by the fact that by increasing n we make c_n(d) closer to 0 and 1 (we are able to identify if μ_Y(d) ∈ B̂(d) with greater confidence), thus increasing the variance. For fixed m and n = 0 we have zero variance (we use identical a priori distributions for all d_i that are unaffected by any observations) and in the limiting case n → ∞ we obtain the variance of a binomial distribution with m trials that have the probability of success C. This observation implies that the standard deviation of C_{n,m} would not be a good measure of the uncertainty associated with the measurement of C. In the following part of the section a superior measure is proposed.

Consider the result of running the simulation Y in each of m points n times. We obtain m points d_i and probabilities c_n(d_i) that μ_Y(d_i) ∈ B̂(d_i). Therefore, given our posterior beliefs in the fixed points d_i, the random variable Z describing the number of observed points where the response surface μ_Y lies in the bandwidth defined by the interval metamodel B̂ follows the Poisson binomial distribution¹⁶ with probability parameters c_n(d_i). The distribution of Z captures our posterior uncertainty about the number of actually observed points where μ_Y(d_i) ∈ B̂(d_i). Observe that D²(Z) = Σ_{i=1}^{m} (1 − c_n(d_i))c_n(d_i), so the more precise the evaluation of the probabilities we have (the closer they are to 0 or 1), the lower the uncertainty of Z.

Now assume that we have a prior on C having a Beta distribution with parameters a = 1, b = 1.¹⁷ We make this assumption to allow for Bayesian inference about points in D that were not sampled for the calculation of c_{n,m}. Some kind of additional assumption is needed because in the above analysis we assumed that the distributions of μ_Y(d) are independent for different d-s, and the assumption proposed here is relatively weak. Observe that it is less restrictive than, for example: (a) the need to specify a correlation function of a Gaussian random field used in different varieties of stochastic kriging models [13,14], (b) kernel specification in local regression models or smoothness structure assumptions in spline models [17], (c) specification of the functional form of the input–output relationship in Bayesian models [18]. All those models assume some type of regularity of the function μ_Y(d).¹⁸ In practice the response surface can be more complex than the shapes allowed under these restrictions – for example discontinuous, as is the case for the simulation model presented in Section 3 – so we prefer not to make such assumptions in a procedure that is aimed at validating interval metamodels. Additionally, setting a prior on C allows us to tackle those cases where D is a subset of a high-dimensional space (as it is independent from the specification of D), while the above mentioned alternatives have a limited application in such cases.

Furthermore, we denote the probability density function of the beta distribution with parameters a and b by p(x; a, b). Assume that we have sampled m points d_i and in each point drawn n observations of the simulation model Y. Using this data we have derived the distribution of Z following the procedure described above. We want to calculate the posterior probability density function of C conditional on Z. It will be denoted as p(x|Z). Observe that:
p(x|Z) = Σ_{z=0}^{m} Pr(Z = z) p(x|Z = z) = Σ_{z=0}^{m} Pr(Z = z) p(x; 1 + z, 1 + m − z).
Therefore p(x|Z) is the probability density function of a mixture of beta distributions. We can calculate the expected value of this distribution:

μ_C = Σ_{z=0}^{m} ((1 + z)/(2 + m)) Pr(Z = z) = (1 + E(Z))/(2 + m) = (1 + m c̄_{n,m})/(2 + m)
^16 See [16] for the definition of this distribution.
^17 In this text we assume a uniform a priori distribution, but in practice other priors for (a, b) can of course be used.
^18 See also, for example, [19], which gives, in particular, a review of the restrictions introduced by metamodels based on Gaussian processes for deterministic simulations; the case of stochastic simulations, as considered in this paper, is more complex, so all the problems described there are also valid here.
Fig. 3. Empirical cumulative distribution function of C_{10,000;1000}.
Fig. 4. Scatterplot of (d, c_n(d)).

Table 2
Estimated mean value and standard deviation of C_{n,m} for n ∈ {10; 100; 10,000} and m ∈ {10; 50; 100}.

m     n       Mean     sd
10    10      0.2045   0.0291
10    100     0.3722   0.0852
10    10,000  0.3343   0.1426
50    10      0.2042   0.0131
50    100     0.3719   0.0378
50    10,000  0.3346   0.0648
100   10      0.2044   0.0091
100   100     0.3723   0.0273
100   10,000  0.3338   0.0454
and its variance:

σ²_C = Σ_{z=0}^{m} Pr(Z = z) [(3 + m)(z − m c̄_{n,m})² + (1 + z)(1 + m − z)] / [(2 + m)²(3 + m)].

The mean value μ_C has a small bias relative to c̄_{n,m} that results from the assumption of a uniform prior on C.
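These formulas can be cross-checked numerically. The sketch below (an illustration only; the original study's computations used R scripts such as listing6.r, and all function names here are ours) builds the Poisson binomial distribution of Z by direct convolution and evaluates μ_C and σ²_C for a vector of evaluated probabilities c_n(d_i):

```python
import numpy as np

def poisson_binomial_pmf(probs):
    """PMF of Z = number of successes among independent Bernoulli(probs[i]) trials,
    computed by repeated convolution; pmf[z] = Pr(Z = z)."""
    pmf = np.array([1.0])
    for p in probs:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

def posterior_C(probs):
    """Mean and variance of the mixture-of-beta posterior p(x|Z)
    under the uniform Beta(1, 1) prior on C."""
    m = len(probs)
    pmf = poisson_binomial_pmf(probs)
    z = np.arange(m + 1)
    # mu_C = sum_z Pr(Z=z) (1+z)/(2+m) = (1 + m*cbar)/(2+m)
    mu = float(np.sum(pmf * (1 + z) / (2 + m)))
    cbar = float(np.mean(probs))  # \bar c_{n,m}, since E(Z) = m*cbar
    var = float(np.sum(pmf * ((3 + m) * (z - m * cbar) ** 2 + (1 + z) * (1 + m - z))
                       / ((2 + m) ** 2 * (3 + m))))
    return mu, var
```

For instance, with probs = [0.1, 0.9, 0.5, 1.0, 0.0] we have m = 5 and c̄ = 0.5, so μ_C = (1 + 2.5)/7 = 0.5, in agreement with the closed form above.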
Example. We use the simulation model Y_1 and the interval metamodel B̂_1. First we visualize how the distribution of Z and the derived density p(x|Z) look. Using the code from listing6.r we calculate them for n = 100 and m = 20. The results are presented in Fig. 5. Now let us investigate how p(x|Z) changes when n and m change. Using the code from listing7.r we simulate it for n, m ∈ {100; 500; 2500}. In Table 3 we report the means of μ_C and σ²_C. Similarly to the results presented in Table 2, we observe that increasing m does not significantly influence the mean of μ_C^19 and increasing n reduces bias.

^19 The only influence is that by increasing m we reduce the deviation of μ_C from c̄_{n,m}.
Fig. 5. Distribution of Z/m for n = 100 and m = 20 (gray points) and the density p(x|Z) derived from it (black line).
Table 3
Simulation of p(x|Z) for n, m ∈ {100; 500; 2500}.

m     n     Mean of μ_C   Mean of σ²_C
100   100   0.3729        0.0616
100   500   0.3542        0.0546
100   2500  0.3411        0.0499
500   100   0.3709        0.0280
500   500   0.3560        0.0249
500   2500  0.3363        0.0226
2500  100   0.3717        0.0125
2500  500   0.3541        0.0111
2500  2500  0.3361        0.0101
On the other hand, the dispersion (variance) of the posterior distribution of C decreases with both n and m; however, m has a much larger influence on σ²_C. This gives an example of the result shown earlier in the proof of Theorem 1: the choice between n and m can be seen as a bias-variance trade-off. □

In summary, the following interval metamodeling procedure can be proposed:

1. simulate model Y and estimate metamodel Ŷ and interval metamodel B̂ using the sample;
2. execute new independent simulations to perform interval metamodel B̂ verification using the methods presented in this section.

There are two practical points worth mentioning. Firstly, we have assumed in the analysis that n is fixed between design points d_i. However, our Bayesian procedure can easily be adjusted to allow the number of simulation replications to vary between design points, in order to ensure that the evaluated point posterior probabilities are sufficiently close to 0 or 1. A potential drawback of this approach is that in some cases very large values of n would be needed (when μ_Y(d) is very close to the boundary of B̂(d)), which might not be acceptable due to simulation time requirements.

The second point is that step two of the proposed procedure can be performed iteratively. When we perform an initial verification of the interval metamodel we obtain the values of v_n(d_i) and the variance σ²_C. In the cases where v_n(d_i) is high we conclude that n was too small and additional samples are needed (it is impossible to give a general rule for how many more samples are required, but it is possible to compute it using formula (6) when the specification of F_{d,n} is known). If σ²_C is too high from the perspective of the researcher, then m should be increased.
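The iterative increase of m can be sketched as a small planning helper (an illustration under the simplifying assumption, also used in the sample-size discussion that follows, that each additional test point lies within the bandwidth with probability c̄_{n,m}; the function names are ours, not from the original study):

```python
import numpy as np

def sigma2_C(probs):
    """Posterior variance of C under the Beta(1, 1) prior, for evaluated
    probabilities probs[i] = c_n(d_i); Z is Poisson binomial."""
    m = len(probs)
    pmf = np.array([1.0])
    for p in probs:                      # Poisson binomial PMF by convolution
        pmf = np.convolve(pmf, [1.0 - p, p])
    z = np.arange(m + 1)
    cbar = float(np.mean(probs))
    return float(np.sum(pmf * ((3 + m) * (z - m * cbar) ** 2 + (1 + z) * (1 + m - z))
                        / ((2 + m) ** 2 * (3 + m))))

def required_m(probs, target_var, max_m=2000):
    """Smallest total number of test points m for which the posterior variance of C
    drops to target_var or below, assuming each extra point lies inside the
    bandwidth with probability cbar (a hypothetical planning helper)."""
    cbar = float(np.mean(probs))
    probs = list(probs)
    while len(probs) <= max_m:
        if sigma2_C(probs) <= target_var:
            return len(probs)
        probs.append(cbar)
    return None
```

For example, starting from 10 points that were each evaluated at c_n(d_i) = 0.9, the helper appends hypothetical points with probability 0.9 until the variance target is met.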
If the researcher knows the target value of the variance of C, then using the formula for σ²_C it is possible to check directly how an increase of m changes the variance (we treat c̄_{n,m} as known and assume that the additional points on average have this probability of lying within the interval metamodel bandwidth; this is a typical approach used in sample size calculation procedures).

A general observation is that the proposed method is conservative in the sense that it biases c_n(d) away from the true values of 0 and 1 towards the interior of the [0, 1] interval. This means that we can expect the procedure to have the highest bias of c̄_{n,m} if μ_Y(d) ∈ B̂(d) everywhere or μ_Y(d) ∉ B̂(d) everywhere. If μ_Y(d) ∈ B̂(d) only in some subset of D, then the upward bias in the cases where μ_Y(d) ∉ B̂(d) partially cancels out the downward bias in the cases where μ_Y(d) ∈ B̂(d), which reduces the mean bias. Of course this behavior is not guaranteed, as it depends on the assumed a priori distribution of μ_Y(d).

Lastly, note that the whole analysis can be reinterpreted in terms of input modeling [20]. In this text we have not made a distinction between directly observable inputs and parameters that are obtained using statistical inference [21]. With this distinction, it can be observed that the measure defined on D may be interpreted as a representation of the uncertainty of
measurement of the simulation parameters. Thus the proposed methodology can be seen as a way of assessing the impact of input uncertainty on simulation output.^20

3. Example application

We use the classical Schelling segregation model [23] and its implementation [24] in NetLogo 5.0.3 [25] as an example. The simulation model has two parameters: n is the number of agents in the simulation and sw is the percentage of similar agents wanted by every agent in her neighborhood. In our experiment we set sw ∈ [20, 70] and n ∈ [500, 2000]. The simulation is run until no agents want to move.^21 The simulation output parameter is the percentage of neighbors similar to the agent, averaged over all agents (sp).

In order to collect the data required to estimate the metamodel we generate 2601 sample points using a grid design over the space [500, 2000] × [20, 70], where each dimension is divided into 51 equidistant values. For each point one simulation is run. The collected data is stored in the train.txt file. The observed sp values vary from 56.77 to 100. Using this data we estimate a polynomial of degree 2 with interactions on the parameters sw and 1/ln(n) to obtain the following sp prediction formula^22 (all the computations are performed using the code from listing8.r):
Ŷ(sw, n) = ŝp = −145.4795 + 3.9919 sw − 0.0137 sw² + 1492.4188/ln(n) − 1803.7888/ln(n)² − 14.3349 sw/ln(n)   (15)
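For convenience, Eq. (15) can be written as a small function (a sketch; the helper name sp_hat is ours, and the negative signs on the intercept, sw², 1/ln(n)² and interaction coefficients are assumptions recovered from the fitted model's reported output range of roughly 57 to 100):

```python
import math

def sp_hat(sw, n):
    """Polynomial metamodel of Eq. (15): predicted mean percentage of similar
    neighbours (sp) for similarity threshold sw and number of agents n."""
    u = 1.0 / math.log(n)  # the second explanatory variable, 1/ln(n)
    return (-145.4795 + 3.9919 * sw - 0.0137 * sw ** 2
            + 1492.4188 * u - 1803.7888 * u ** 2 - 14.3349 * sw * u)
```

With these signs the predictions stay within the observed sp range, e.g. sp_hat(70, 2000) evaluates to roughly 100, matching the upper end of the observed values.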
We define our interval metamodel by setting h(sw, n) = l(sw, n) = 3. Note that these values will usually not be strongly related to the standard confidence intervals that can be calculated for linear regression predictions. This is because we know that our model is probably conditionally biased, while confidence intervals in linear regression are calculated under the assumption that the model is properly specified. For example, in our model the mean width of the 95% confidence interval of E(sp|sw, n) obtained after estimation is around 0.4 and its maximum value is around 0.9.^23 Both these values are much lower than 3 and, as we will see later, are overly optimistic, because the metamodel given by Eq. (15) is conditionally biased.

Having specified the interval metamodel, let us move to its validation. In order to do this we uniformly draw a sample of 64 integer tuples such that sw ∈ [20, 70] and n ∈ [500, 2000]. At each sample point we replicate the simulation 64 times. The collected observations are stored in the test.txt file. For the computation of the posterior distributions F_{(sw_i, n_i),64}, and consequently the probabilities c_64(sw_i, n_i) at the m = 64 sampled points, we used the Bayesian bootstrap [26] with 10,000 replications. We obtain c̄_{64,64} = 0.8907656. Table 4 presents the counts of the calculated values of c_64(sw_i, n_i). Some of the values are exactly equal to 0 or 1 due to the fact that the computations are performed with limited precision and the Bayesian bootstrap used a finite number of replications. However, we have 6 observations with intermediate values.

First we run a diagnostic test to check whether the posterior distributions have a much higher dispersion than the interval width l(d) + h(d) = 6. In order to do this we calculate v_64(sw_i, n_i) following the formula given in Eq. (6). We find that for all (sw_i, n_i) this metric evaluates to 0, so we do not have this problem (the sample size is large enough to avoid it). Therefore, these 6 points are cases where the true value of μ_Y(sw, n) is close to the boundary of the interval B̂(sw, n) and 64 samples per point were not enough to give us sufficient information whether μ_Y(sw, n) ∈ B̂(sw, n). This case can be considered typical, especially for expensive simulations. The procedure proposed in this paper allows us to handle this issue. We calculate the posterior distribution of C using the mixture of beta distributions, as explained in Section 2. The results are presented in Fig. 6. The mean and the standard deviation of this distribution are μ_C = 0.8789242 and σ_C = 0.04098572.

Observe that a simplified way to approximate μ_C and σ_C would be to assume that c̄_{64,64} was calculated as a mean of exact observations of whether μ_Y(sw, n) ∈ B̂(sw, n) or not. Then, using standard Bayesian inference, we would again evaluate the mean as μ_C, but the standard deviation would be lower and amount to 0.03985569; this value could be called the intrinsic dispersion. It is 2.76% lower than in the case of the proposed procedure. The higher value of σ_C is a result of the extra uncertainty in the evaluation of the test μ_Y(sw, n) ∈ B̂(sw, n) in the 6 cases that were not conclusive. Obviously, the larger the number of inconclusive probabilities c_64(sw_i, n_i), the higher the inflation of variance. One could imagine an extreme case where for all observations we have c_64(sw_i, n_i) = c̄_{64,64} = 0.8907656. In such a situation we would get even more dispersion, since in this case σ_C = 0.05474093, which is 37.35% higher than the intrinsic dispersion. Note that an increase in the number of samples per point reduces the inflation of the intrinsic dispersion. On the other hand, an increase in the number of different sampled points brings down the intrinsic dispersion itself.

^20 A wider discussion of input uncertainty assessment can be found in [22] and references therein.
^21 Given the assumed range of parameters such a situation is always possible. For very high values of the parameters, e.g. n = 2500 and sw = 100, the simulation might not terminate in a fixed state.
^22 We use 1/ln(n) rather than n, ln(n) or 1/n as the second explanatory variable in the formula as it provides the best fit of the metamodel without increasing its complexity.
^23 The confidence intervals reported here have a frequentist interpretation, because we used least squares estimation and standard statistical inference. However, it can be checked that Bayesian estimation of this regression with standard priors would yield similar, overly optimistic, values.
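The Bayesian bootstrap evaluation of c_n(d) can be sketched as follows (a Python illustration rather than the study's original R code; the function name and seeding are ours). Following Rubin [26], posterior draws of the mean response at a design point are formed as Dirichlet-weighted averages of the n observed outputs, and c_n(d) is the fraction of draws falling inside the bandwidth:

```python
import numpy as np

def c_hat(samples, lo, hi, reps=10_000, seed=0):
    """Bayesian-bootstrap estimate of Pr(mu_Y(d) in [lo, hi] | data)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(samples, dtype=float)
    w = rng.dirichlet(np.ones(len(y)), size=reps)  # (reps, n) resampling weights
    post_means = w @ y                              # draws from the posterior of the mean
    return float(np.mean((post_means >= lo) & (post_means <= hi)))
```

For a test point with simulated outputs y and metamodel prediction p, the probability c_64(sw, n) for the bandwidth l = h = 3 would then be approximated by c_hat(y, p - 3, p + 3): near 1 when the posterior of the mean sits well inside the band, near 0 when it sits well outside, and intermediate when the mean is close to the band boundary.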
Table 4
The counts of the observed values of c_64(d_i) in the example given in Section 3.

c_64(sw_i, n_i)   0.000   0.095   0.132   0.914   0.934   0.959   0.974   1.000
Count             5       1       1       1       1       1       1       53
Fig. 6. Posterior distribution of C (gray line) and point evaluation c̄_{64,64} (vertical solid black line).
The proposed methodology allows us to visualize where the interval metamodel does not fit the simulation data well. Fig. 7 presents a scatter plot of the sampled points with a visualization of c_64(sw_i, n_i). We can observe that the points where this probability is evaluated as lower than 1 concentrate around vertical lines (0 values are observed for sw equal to 25, 33, 34 and 50). This suggests that μ_Y(sw, n) might have jumps near these values.

We verify the above hypothesis for the example case of n = 1000. The results of the calculations are presented in Fig. 8. We observe that the averages of 10 simulations (black points) have jumps when we change sw between the values 20–21, 25–26, 33–34, 40–41, 50–51, 60–61 and 66–67. However, only when the jumps are large do the sampled value averages fall outside the interval metamodel B̂(sw, n), as for sw equal to 25, 33, 34 and 50 in Fig. 7. Similarly, in the same figure, the light gray points (probabilities greater than 0 and less than 1) are also located near the jumps (sw values 21, 25, 26, 32, 36, 49).^24 It should be noted that it would be possible to create a metamodel that properly models these jumps; however, it would have a much more complex structure than a simple polynomial of degree 2 in the variables sw and 1/ln(n).

To conclude this section, we shall discuss how interval metamodels can be built in practice. The construction of the interval metamodel involves two steps: (1) selection and estimation of the base metamodel Ŷ and (2) selection of h(d) and l(d). The step of building the base metamodel can be performed using standard procedures, as discussed for example by Kleijnen and Sargent [2]. However, as was already remarked in the introduction, when building interval metamodels we are most concerned with improving the understanding of the relationship between the input and output of simulation models, so linear or low-order polynomial metamodels are the most natural choice. In the example that follows we consider the metamodel given by Eq. (15) and compare it to a simpler metamodel, a linear regression on sw and n estimated using the same data:
Ŷ(sw, n) = ŝp = 62.2122 + 0.7204 sw − 0.0059 n.   (16)
In the second step we have to choose l(d) and h(d). The methodology proposed in this text is general and allows them to be arbitrary functions of d. However, in practice it can be expected that the most frequent choice will be a symmetric and constant bandwidth, i.e. l(d) = h(d) = const.^25 In this way the interval metamodel formula is as simple as the original metamodel. The choice of the appropriate constant can be aided by an analysis of max_i v_n(d_i) and c̄_{n,m}. In general max_i v_n(d_i) should be very close to 0, as otherwise we know that at some points our sample size is too small. The choice of the value for c̄_{n,m} is more subjective and depends on the objective of the researcher. A natural choice is a value similar to a standard p-value threshold, equal to 90%, 95% or 99%.

This process is presented in Fig. 9. We can see that for the polynomial metamodel (Eq. (15)) the values of max_i v_64(sw_i, n_i) indicate that the constant value of l(d) and h(d) should not be less than 0.5. Similarly, we get 90% coverage by setting this value to 3, as was done in the first part of this section. For the linear metamodel (Eq. (16)), max_i v_64(sw_i, n_i) indicates that the minimum constant should be 0.75, and when it reaches 5 then c̄_{64,64} obtains the value 0.8681.

^24 The reason for the presence of response surface jumps in the NetLogo implementation of the Schelling segregation model is that each agent has a finite number of neighbors present (at most 8), so a change of agent decision is only possible for sw thresholds equal to i/j, where j ∈ {1, 2, ..., 8} and i ∈ {0, 1, ..., j}.
^25 Another plausible option would be to use a bandwidth that is equal to a certain percentage of the metamodel prediction: l(d)/Ŷ(d) = h(d)/Ŷ(d) = const.
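The coverage curve underlying this kind of plot can be approximated with a few lines of code (a simplified sketch that ignores the posterior uncertainty at each point, i.e. it treats every c_n(d_i) as exactly 0 or 1; the function name and inputs are ours):

```python
import numpy as np

def coverage_curve(residual_means, bandwidths):
    """Fraction of test points whose observed mean response lies within +/- b of
    the metamodel prediction, for each candidate constant bandwidth b.
    residual_means[i] = mean simulated output at point i minus the prediction."""
    r = np.abs(np.asarray(residual_means, dtype=float))
    return np.array([float(np.mean(r <= b)) for b in bandwidths])
```

Evaluating such a curve for two competing metamodels over a common grid of bandwidths reproduces the kind of comparison shown in Fig. 9: the metamodel whose curve reaches the target coverage (e.g. 90%) at a smaller bandwidth fits the data more tightly.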
Fig. 7. Classification of the points sampled in the test phase in the example given in Section 3. Black points indicate the situation where the posterior probability that μ_Y(sw, n) ∈ B̂(sw, n) is equal to 1. Gray circles represent the case where this probability is equal to 0. Light gray points indicate intermediate probability values.
Fig. 8. Test of the relationship between sw and sp for n = 1000. Black points represent the means of 10 sample simulations made for a given sw value. The solid line represents the metamodel given by Eq. (15). Dashed lines give the bandwidth of the interval metamodel where l(sw, n) = h(sw, n) = 3.
Fig. 9. Comparison of the metamodels given by Eqs. (15) and (16) with varying interval bandwidth l(d) = h(d) (the two panels show c̄_{64,64} and max_i v_64(sw_i, n_i) as functions of l(d) = h(d)). The solid line represents the results for the interval metamodel based on Eq. (15) and the dashed line for the interval metamodel based on Eq. (16).
Lastly, it is worth remarking that the plots in Fig. 9 allow the researcher to quickly compare the fit of two metamodels. For example, the researcher can see that the polynomial metamodel with bandwidth 2 has roughly equivalent coverage to the linear metamodel with bandwidth 4.3. Given such information it is possible to balance the complexity of the metamodel against its goodness of fit.
4. Concluding remarks

In this paper we have analyzed the use of simulation metamodels for understanding the relationship between the inputs and outputs of stochastic simulation models. Such metamodels are expected to have a simple structure, as this guarantees that they can be easily understood and interpreted by a researcher. However, this simplicity can lead to the problem that the predictions from the metamodel are conditionally biased. In order to overcome this limitation and capture the level of conditional bias in such models, we propose the use of interval metamodels, which assume that the prediction is an interval rather than a single point. The aim of this procedure is to ensure that the response surface of the simulation model lies within the bandwidth defined by the interval metamodel. In this way we can retain a simple structure for the metamodel and offset the problem of its conditional bias.

We have proposed a procedure that allows us to verify how well the interval metamodel covers the response surface of the simulation model. It uses the Bayesian framework and allows a researcher to:

(1) verify the probability c_n(d) that for a fixed input d the interval metamodel contains the expected value of the simulation output; additionally, the diagnostic measure v_n(d) can be used to help identify situations when the selected sample size n is too small;
(2) calculate the evaluation of the probability c̄_{n,m} that at a random point of the input space the interval metamodel covers the response surface;
(3) assess the uncertainty associated with the value of c̄_{n,m} due to sampling (the posterior distribution of C).

We have illustrated the proposed method on an example application to the Schelling segregation model and have shown that it can help a researcher to uncover non-obvious characteristics of the simulation model response surface by identifying the points where the simple metamodel is most biased.
In the discussion given in the text we have assumed that the metamodel Ŷ is estimated from training data, but the proposed methodology does not preclude a situation in which it is defined by the researcher based on the research question at hand. For example, one might want to verify whether the expected value of the simulation output is greater than some fixed value independent of the input parameter. We would then have Ŷ(d) = const, l(d) = 0 and h(d) = +∞ for all d ∈ D.

Interval metamodels can be used in two different ways. In the first approach a researcher may know the levels of acceptable bias of the metamodel and thus define l and h before the analysis. Subsequently, she will be interested in the evaluation of the interval metamodel confidence C. In the second approach l and h may be undefined beforehand and the researcher may wish to test several specifications of the bandwidth. In this way she can learn the appropriate width of the bandwidth (for example, to obtain 90%, 95% or 99% coverage of μ_Y by B̂) and in what subspace of D the interval metamodel does not fit the simulation well (an example of such an analysis was given in Section 3).

The proposed method uses the most straightforward approach to the design of an interval metamodel verification experiment. It assumes that m points are sampled randomly from the input parameter space and that n replications of the simulation are performed at each point. This totals m · n simulations. However, it is clear that in practice this computational budget could be allocated more efficiently. First of all, n does not have to be fixed but could be chosen dynamically, conditional on the collected data. Furthermore, one could consider a more efficient design for the sampling of the m points from the input parameter space. These issues of optimal allocation of the computing budget in an interval metamodel verification problem are an open area for further investigation.

Lastly, in this paper we show how to validate interval metamodels but give only basic guidance on methods for their estimation. Therefore, there are several open methodological questions for further work. The first is how to select the bandwidths l and h so that they give the assumed interval metamodel confidence C and have some additional desired properties (e.g. they give the shortest interval). The second is whether knowing that we want to obtain an interval metamodel would affect the way we estimate the base metamodel Ŷ.

Acknowledgment

Thanks are given to two anonymous reviewers who have provided detailed comments that helped to improve the paper.

References

[1] R.R. Barton, Metamodels for simulation input–output relations, in: J. Swain, D. Goldsman, R. Crain, J. Wilson (Eds.), Proceedings of the 1992 Winter Simulation Conference, IEEE, 1992, pp. 289–299.
[2] J.P. Kleijnen, R.G. Sargent, A methodology for fitting and validating metamodels in simulation, Eur. J. Oper. Res. 120 (1) (2000) 14–29.
[3] I.R. Santos, P.R. Santos, Simulation metamodels for modeling output distribution parameters, in: S. Henderson, B. Biller, M.-H. Hsieh, J. Shortle, J. Tew, R. Barton (Eds.), Proceedings of the 2007 Winter Simulation Conference, IEEE, 2007, pp. 910–918.
[4] G.G. Wang, S. Shan, Review of metamodeling techniques in support of engineering design optimization, J. Mech. Des. 129 (4) (2006) 370–380.
[5] R. Peng, Reproducible research in computational science, Science 334 (6060) (2011) 1226–1227.
[6] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2013.
[7] Y. Hong, On computing the distribution function for the Poisson binomial distribution, Comput. Stat. Data Anal. 59 (2012) 41–51.
[8] G.R. Warnes, B. Bolker, T. Lumley, gtools: Various R Programming Tools, R package version 3.3.1, 2014.
[9] P. Diaconis, D. Freedman, On the consistency of Bayes estimates, Ann. Stat. 14 (1) (1986) 1–26.
[10] B. Kamiński, Podejście wieloagentowe do modelowania rynków (in Polish), Oficyna Wydawnicza SGH, 2012.
[11] A. Gelman, J.B. Carlin, H.S. Stern, D.B. Rubin, Bayesian Data Analysis, Chapman & Hall/CRC, 2004.
[12] W.B. Powell, I.O. Ryzhov, Optimal Learning, Wiley, 2012.
[13] S.H. Ng, J. Yin, Bayesian kriging analysis and design for stochastic simulations, ACM Trans. Model. Comput. Simul. 22 (3) (2012) 17–26.
[14] B. Ankenman, B. Nelson, J. Staum, Stochastic kriging for simulation metamodeling, Oper. Res. 58 (2) (2010) 371–382.
[15] P. Salemi, J. Staum, B. Nelson, Generalized integrated Brownian fields for simulation metamodeling, in: R. Pasupathy, S.-H. Kim, A. Tolk, R. Hill, M. Kuhl (Eds.), Proceedings of the 2013 Winter Simulation Conference, IEEE, 2013, pp. 543–554.
[16] Y.H. Wang, On the number of successes in independent trials, Statistica Sinica 3 (2) (1993) 295–312.
[17] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2011.
[18] S.E. Chick, Bayesian analysis for simulation input and output, in: S. Andradóttir, K. Healy, D. Withers, B. Nelson (Eds.), Proceedings of the 1997 Winter Simulation Conference, IEEE, 1997, pp. 253–260.
[19] L.S. Bastos, A. O'Hagan, Diagnostics for Gaussian process emulators, Technometrics 51 (4) (2009) 425–438.
[20] S. Vincent, Input data analysis, in: J. Banks (Ed.), Handbook of Simulation, John Wiley & Sons, 1998, pp. 55–92.
[21] B. Zeigler, Theory of Modelling and Simulation, Wiley/Interscience, 1976.
[22] E. Song, B.L. Nelson, A quicker assessment of input uncertainty, in: R. Pasupathy, S.-H. Kim, A. Tolk, R. Hill, M. Kuhl (Eds.), Proceedings of the 2013 Winter Simulation Conference, IEEE, 2013, pp. 474–485.
[23] T. Schelling, Dynamic models of segregation, J. Math. Sociol. 1 (2) (1971) 143–186.
[24] U. Wilensky, NetLogo Segregation model, Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, 1997.
[25] U. Wilensky, NetLogo, Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL, 1999.
[26] D.B. Rubin, The Bayesian bootstrap, Ann. Stat. 9 (1) (1981) 130–134.