Probabilistic Engineering Mechanics 17 (2002) 219±225
www.elsevier.com/locate/probengmech
Estimation of input parameters in complex simulation using a Gaussian process metamodel Jeong-Soo Park a,*, Jongwoo Jeon b a
Department of Statistics, Info-Telecomm Research Institute, Chonnam National University, Kwangju 500-747, South Korea b Department of Statistics, Seoul National University, Seoul, South Korea
Abstract The unknown input parameters of a simulation code are usually adjusted by the nonlinear least squares estimation (NLSE) method which minimizes the sum of differences between computer responses and real observations. However, when a simulation program is very complex and takes several hours for one execution, the NLSE method may not be computationally feasible. In this case, one may build a statistical metamodel which approximates the complex simulation code. Then this metamodel is used as if it is the true simulation code in the NLSE method, which makes the problem computationally feasible. This `approximated' NLSE method is described in this article. A Gaussian process model is used as a metamodel of complex simulation code. The proposed method is validated through a toy-model study where the true parameters are known a priori. An application to nuclear fusion device is presented. q 2002 Elsevier Science Ltd. All rights reserved. Keywords: Approximated nonlinear least squares estimation; Kriging; Simulation metamodel; Computer experiments; Nuclear fusion; Projection pursuit regression
1. Introduction and motivation An objective of research in a large class of dynamic systems is to determine any unknown parameters in a theory which is implemented in a computer simulation code. If there is an available database of experimental model, the parameters can be determined by `tuning' the computer model to the real data so that the tuned code gives a good match to data. In other words, this method involves simulating a theory model with ranges of parameters and selecting elements from these ranges which best match the simulated data with the real data. Assuming the validity of the simulation code, these parameters are usually adjusted by the nonlinear least squares estimation (NLSE) method which minimizes the sum of squared differences between computer responses and real observations. This is an inverse problem which is usually encountered in the simulation practice. When a simulation program is, however, very complex and takes several hours for one execution of the program, the NLSE method may not be computationally feasible. In these cases, one may build a statistical metamodel (or a surrogate or an emulator) which approximates an unknown functional relationship that exists between a controllable set of input variables and a simulated response variable. Then * Corresponding author. Fax: 182-625-303-449. E-mail address:
[email protected] (J.-S. Park).
this metamodel is used as if it is the true simulation program in the NLSE method, which solves the problem and makes it computationally feasible. This `approximated' NLSE method is described in this article. We use a Gaussian process model (GPM) as a metamodel of complex simulation code (Section 2). This model has been successfully used in the design and analysis of computer experiments [1]. This approach provides a statistical basis for analyzing deterministic data, for designing experiments for ef®cient prediction and for comparison of computerencoded theory and experimental data in the analysis of complex systems. The method described here was motivated by a nuclear fusion research. There is a large database of results from nuclear fusion reactor known as tokamaks (a Russian acronym for `toroidal magnetic chamber'). See Wesson [2] for an introductory account. There is also a computer simulation code known as BALDUR [3] which was implemented based on a comprehensive mathematical model for tokamaks. This simulation code is computationally very expensive and contains four unknown transport coef®cients to be estimated (Section 5). The accurate estimation of these parameters is very important for designing the next generation of tokamak. The works related to this article are found in many areas: for example, Miller and Frenklach [4] in chemical kinetics, Gregory and Smith [5] in econometrics, Craig et al. [6] in
0266-8920/02/$ - see front matter q 2002 Elsevier Science Ltd. All rights reserved. PII: S 0266-892 0(02)00006-1
220
J.-S. Park, J. Jeon / Probabilistic Engineering Mechanics 17 (2002) 219±225
hydrocarbon reservoir research, Roberts and Vasta [7] in mechanical engineering, and Capodieci et al. [8] in electrical engineering. Roberts et al. [9] used a spectral metamodel and the approximated NLSE method for the estimation of non-linear system parameters in ship roll motion. This article deals with the approximated NLSE method where the Gaussian process metamodel is used. In Section 2, we outline the GPM for computer simulation experiments. The approximated NLSE formulation is given in Section 3. In Section 4, the proposed method is validated through a toy-model study where the true parameters are known a priori. In addition, the GPM is compared with the classical regression model and the projection pursuit regression model. It turns out that that the Gaussian process metamodel is overall the best among the three metamodels. These numerical experiments indicate the validity of our method. In the application to tokamak given in Section 5, a cheaper emulator of BALDUR code has been constructed, and the system parameters were estimated from the data of two tokamaks (named ASDEX and PDX). Finally, in Section 6, we provide some concluding remarks.
Z
x: In our application, the e i enters because there are Monte-Carlo integrations performed in BALDUR. Some possible choices of covariance function are from the power exponential family [1] which is given by " # d X 2 R
t; u exp 2u uti 2 ui u ;
3
2. A Gaussian process metamodel of simulation
F fl
si 1#i#n ;
We use a statistical model which treats the response Y
x as a realization of a random function superimposed on a regression model [1],
where
si ; ¼; sn are the design sites. Here s 2 V is a covariance matrix between observations (or design sites), and F is a so-called design matrix. For any prediction site x, the n £ 1 vector vx and k £ 1 vector fx are de®ned by
Y
x b0 1
k X j1
bj fj
x 1 Z
x;
1
COV
t; u s 2 R
t; u between Z(t) and Z(u), for t
t1 ; ¼; td ; u
u1 ; ¼; ud ; where s 2 is the process variance (a scale factor) and R
t; u is the correlation function. The rationale is that departures of the complex response from the simple regression model, though deterministic, may resemble a sample path of a (suitably chosen) stochastic process Z. Computer observations are sometimes on the subject of measurement errors which may be due to approximation, round-off error, or Monte-Carlo routines in the simulation code. For a given set of design sites {s 1 ; ¼; sn }; let y be an n £ 1 vector of observations from a computer experiment given by 1 # i # n;
where u $ 0: The non-negative parameter u determines the covariance structure of Z: small u re¯ects large correlations between nearby observations while large u re¯ects small nearby correlations. It is thus related to the smoothness of the response. Of course, many other covariance functions are possible [10]. Once a covariance function and its parameters are speci®ed, one can predict Y
x based on the model (1) using the observations y
s: For the prediction formula, de®ne the n £ n matrix V and n £ k matrix F by V R
si ; sj 1#i#n 1 g 2C I; 1#j#n
4
where g 2C s c2 =s 2 which is the ratio of noise versus signal variance, and 1#l#k
where fs are known functions and b s are unknown regression coef®cients. Here the random process Z(´) representing the systematic departure from the assumed linear model is assumed to be a Gaussian process with mean zero and covariance
y
si Y
si 1 ei ;
i1
2
where measurement errors e i are assumed to be uncorrelated, mean zero Normal random variables with constant variance Var
ei s c2 ; are assumed to be independent of
v 0x R
s1 ; x; ¼; R
sn ; x; f 0x f1
x; ¼; fk
x; respectively. Here vx is a correlation vector between design sites and a prediction site x. Then the best linear unbiased predictor (BLUP) of Y(x) given the observation vector y is [1,10] " # " # V F 21 y 0 0 ^ Y
x v x ; f x f 0x b^ 1 v 0x V 21
y 2 F b^ ; 0 F 0 0
5 where b^
F 0 V 21 F21 F 0 V 21 y is the generalized least squares estimator of b . This prediction (Eq. (5)) is called `universal Kriging' in spatial statistics literature [10]. The mean squared error of prediction is ^ ^ 2 Y
x2 ; MSE
x MSE
Y
x Eu Y
x
6
and the normalized mean squared error, mse
x MSE
x=s 2 is " # " # V F 21 vx 0 0 R
x; x 2 v x ; f x :
7 F0 0 fx In the absence of measurement error (s c2 0 in Eq. (2)), the prediction surface interpolates the observations [10]
J.-S. Park, J. Jeon / Probabilistic Engineering Mechanics 17 (2002) 219±225
^ i at a design point si has mse
si because the predictor Y
s ^ 0; i.e. Y
si Y
si : This is one of the reasons why the GPM is used for deterministic data analysis in computer experi^ ments. If s c2 . 0; then Y
x smoothes the observed data; 2 smaller u and larger g C give smoother predictions. This GPM has been successfully employed in the design and analysis of computer experiments [1,11]. See Ref. [12] for a review of metamodels in simulation. After computer simulation data have been collected at the design sites, maximum likelihood estimates (MLE) of the model parameters are computed to build a prediction model. Since we assume y
x has a multivariate Normal distribution with mean F b and covariance matrix s 2 V; the likelihood function of y is L y; u; b; s 2 ; g 2C ; x !
2ps 2 2n=2
y 2 F b 0 V 21
y 2 F b p : exp 2 2s 2 uVu
8
When u and g2C are speci®ed, the MLE of s 2 is given by 1 s^ 2
Y 2 F b^ 0 V 21
Y 2 F b^ ;
9 n where b^ is the generalized least squares estimator of b as in Eq. (5). Then 22 times the log likelihood function (except for constants) with b^ and s^ 2 plugged in is
l n log s^ 2 1 loguVu:
10
Since the likelihood equations do not lead to a closed form solution, a numerical optimization procedure is used [13]. We used a quasi-Newton optimizer with multiple initial values because of multi-modality of the likelihood surface. Now, by minimizing Eq. (10), the parameters of Gaussian process metamodel are estimated. 3. Approximated nonlinear least squares estimation In this section, we describe the approximated nonlinear least squares method for estimating unknown input parameters cp in a computer code using real experimental data (or database) and computer data. In collecting computer data, some candidates of cp are used as inputs of computer simulation code with other variables. We introduce the following notations for q input parameters: d cp xE yE c xC nC
dimension of input variables of computer code the true (unknown) parameters (q dimensional) the independent variables in experimental database (d±q dimensional) the experimental observations in database the input variables of computer code corresponding to cp (q dimensional) the input variables of computer code corresponding to xE (d±q dimensional), number of observations for computer code
nE YC
221
number of observations in experimental database the true computer code as a function of c and xC
The subscripts C and E stand for computer simulation and real experiment, respectively. Here s 0
c; xC represents a design site selected for a computer experiment. Thus the computer response y (or yC) is a function of s, i.e. yC Y
c; xC 1 e as in Eq. (2). Since we assume the true simulation code is close to the real experimental data with some variations, the following model is used: yE Y
cp ; xE 1 e;
11
where e is assumed to be independent and identically distributed with mean 0 and variance s e2 : A major dif®culty of the computer experiment in our application is that one run of BALDUR takes approximately 3±5 CPU minutes on a CRAY-2 supercomputer. If we use a nonlinear regression technique, then cp may be estimated by minimizing the residual sum of squares RSS
cp
nE h X i1
i2 yEi 2 Y cp ; xEi ;
12
where yEi is an observed response from real experiments and Y
cp ; xEi is the corresponding theoretical value (an output of BALDUR) at the experimental point xEi : Since there are 74 observations in the experimental data set, one evaluation of RSS
cp takes about 74 £ 4 min (5 h) on a CRAY-2 at Lawrence Livermore National Labs. It is too time-consuming to run the code as many times as needed for an iterative nonlinear optimizer to ®nd cp : The approximated NLSE method ®rst ®ts the model (1) by MLE using computer data alone, then by treating the ®tted prediction model as if it were the true model, ®nd the nonlinear least squares estimates (cbp ) so that the simulation predictor gives the best match to the experimental data. That is, ®nd cbp such that it minimizes RSSP
cp
nE h X i1
i2 yEi 2 Y^ cp ; xEi ;
13
^ p ; xE is the best linear unbiased prediction of the where Y
c i true computer response Y at
cp ; xEi : Since it is dif®cult to have a closed form of the ®rst derivative of Eq. (13) with respect to cp ; a numerical optimization routine is necessary to ®nd cbp : Note that Y^ is a computationally cheaper emulator of the expensive computer simulation code. This solves the problem and makes it computationally feasible. The advantages of this method are that it is reasonably easy and cheap to implement, and that the computer and experimental data are uncoupled. The prediction residuals ^ p ; xE can be used to check the validity of the yEi 2 Y
c i prediction model, cbp and the approximated NLSE method (see Fig. 1 in Section 5). Once cp has been estimated, some indications of the accuracy of the estimates are usually necessary. For these
222
J.-S. Park, J. Jeon / Probabilistic Engineering Mechanics 17 (2002) 219±225
purposes, we assumed that the prediction residuals yEi 2 ^ p ; xE are normally distributed. We computed the Y
c i empirical Fisher information matrix (FIM) for the negative `unconcentrated' log likelihood function:
lLS RSSP
c
p
=s e2
1 nE log
s e2 ;
14
Table 2 Result of performance evaluation for toy-model 2 (true values: cp1 2:0; cp2 4:0; cp3 1:0) Metamodel
c p estimates
RSSP value
Distance
Classical regression
cbp1 2:366 cbp2 2:727 cbp3 0:988
1.785
1.323
0.731
0.538
0.739
1.062
s e2
is the population variance of yE. The empirical where FIM is actually the Hessian matrix of l LS at cp cbp ce2 where s ce2 RSS
cbp =n : The explicit second and s e2 s P E ce2 is n =
s ce2 2 : Since derivative of l LS with respect to s e2 at s E p derivatives of l LS with respect to c are not explicitly obtainable, a numerical differentiation method was used to approximate the empirical FIM. Then the matrix was inverted, and the corresponding diagonal entries in the inverse matrix are approximately the asymptotic variances. 4. Validation through toy-model study To test the proposed method, three toy-models are considered with three different metamodels: classical regression model, GPM, and projection pursuit regression (ppreg) model [14]. The ppreg is an exploratory nonlinear regression method that models the response as a sum of nonparametric functions of projections of the independent variables. It is of the form Y Y 1
M X j1
bj fj aTj X 1 e;
15
for unit vectors a j and a dimension M to be chosen by the user. For large enough M such a model can approximate arbitrary continuous functions of X. The model parameters b j, fj and a j for j 1; 2; ¼; M in Eq. (15) are estimated by minimizing the mean squared error over all possible parameters. The number of terms M of ppreg is selected so that the data are ®tted very well but can still be ®tted in a parsimonious representation. See S-Plus User's manual [14] and Ref. [15] for the details of ppreg. In ®tting the classical regression model to the computer data, the best model was selected by the all possible regression method based on the Mallows' Cp statistic [16]. The three toy-model functions along with the `true' paraTable 1 Result of performance evaluation for toy-model 1 (true values: cp1 1:0; cp2 1:0) p
Metamodel
c estimates
RSSP value
Distance
Classical regression
cbp1 1:393 cbp2 1:089 cbp1 1:114 cbp2 1:105
6:13 £ 1024
0.403
1:41 £ 1023
0.153
8:03 £ 1023
0.179
GPM Ppreg
cbp1 1:067 cbp2 0:833
GPM
Ppreg
cbp1 2:434 cbp2 3:992 cbp3 1:318 cbp1 1:000 cbp2 1:178 cbp3 0:689
meter values are given in each of the following subsections. These toy-models were also tested in Ref. [17]. For each of the test functions below, nC values of
c; x were selected as inputs for the `computer code', i.e. the function Y
c; x as given is evaluated at nC values, and they were added in independent normal random errors. The input was chosen to be well spread around a reasonable space known to contain the true parameter value (actually Latin-hypercube designs [18] were used). Then, nE experimental data points were generated in a similar fashion, except that the experimental values all used the same cp as given. The results of performance comparison for each of toy-models are given in Tables 1±3. The fourth column is the Euclidian distance between the true cp values and the estimated values. 4.1. Toy-model 1 Computer data
nC 30 : Y
c1 ; c2 ; x1 c1 exp
2c2 x1 1 e; 0 # c1 ; c2 ; x1 # 3; e , N
0;
0:012 : Table 3 Result of performance evaluation for toy-model 3 (true values: cp1 2:0; cp2 1:0; cp3 4:0; cp4 4:0) Metamodel
c p estimates
RSSP value
Distance
Classical regression
cbp1 1:887 cbp2 0:919 cbp3 3:927
16.098
1.430
14.166
0.902
19.955
1.131
GPM
Ppreg
cbp4 2:583 cbp1 1:550 cbp2 1:199 cbp3 3:354 cbp4 3:769 cbp1 1:500 cbp2 1:012 cbp3 4:167 cbp4 5:000
J.-S. Park, J. Jeon / Probabilistic Engineering Mechanics 17 (2002) 219±225
Experimental data
nE 30 : cp1
cp2
1:0; e , N
0; 0:01:
4.2. Toy-model 2 Computer data
nC 20 : x1 q 2 2 Y
c; x c3 1
x2 1 x3 c2 x4 =x1 2 c3 1 x1 1 3x4 1 e; c1 1 # c1 ; c2 # 5; 0 # c3 # 2; e , N
0; 0:5; x1 , U
2; 4; x2 , U
2; 15; x3 , U
25; 1; x4 , U
0:5; 1: Experimental data
nE 20 : cp1
2:0; cp2 cp3 1:0; e , N
0; 1:0:
4.3. Toy-model 3 Computer data
nC 60 : Y
c; x c1 exp
ux1 1 x2 u 1 c2
x4 1 1:2x5 1 1=2:5 1 c3
c2 1 2x3 1 x4 1 c4 exp
x1 1 x3 2 x5 2 x6 1 3 cos6
x2 1 x3 1 x4 1 e; c1 , U
0; 3; c2 , U
0; 6; c3 , U
0; 5; c4 , U
0; 5; xi , U
26; 6; i 1; 2; ¼; 6; e , N
0; 4: Experimental data
nE 70 : p
c
2:0; 1:0; 4:0; 4:0; e , N
0; 6: It is apparent that the Gaussian process metamodel is overall the best among the three metamodels. Also this toy-model study suggests that the approximated NLSE method with GPM works well. 5. An application to nuclear fusion model An objective of research in nuclear fusion reactors (tokamaks) is to understand the transport mechanism governing the process, and in particular, to obtain an appropriate model for the global energy con®nement time and to determine the parameters (transport coef®cients or rate constants) in a transport model. Since some of the constants cannot be mathematically determined [19], a systematic statistical method is required. This method may need to use both experimental data and computer data obtained by running a tokamak simulation code called BALDUR [3], based on a theoretical model, because it is dif®cult to extract information on the constants from experimental data alone. One of the simple measures of energy ef®ciency in
223
tokamak is the global energy con®nement time t E. The theoretically based con®nement model may be written as [20]:
tE f
cp ; a; R; P; I; N; B;
16
where f is a known function (calculated by a complex code), a and R are the minor and major radii of the tokamak, respectively, P the input total power, I the plasma current, N the electron density, B the toroidal magnetic ®eld and cp
cp1 ; ¼; cp4 are the adjustable theory parameters determining energy transfer by turbulent modes known as drift waves, rippling, resistive ballooning and critical value for the ion temperature gradient mode [19], respectively. Of course, there are several other variables which are not the major ones. The experimental data were taken from the database collected by Kaye and Goldston [20] for two tokamaks: ASDEX (32) in Germany, PDX (42) in Princeton, which have ®xed values of R and a, with number of observations in parentheses. Following the previous statistical analysis [20] on the experimental data, we take log10 transformations to P, I, N, B, and t E. Therefore the variables considered in this study are c1, c2, c3, c4, log P; log I; log N; log B; and log tE : The ®rst four cs are input variables of the BALDUR code corresponding to the true coef®cients cp1 ; cp2 ; cp3 and cp4 : Note that the experimental data consists of only the second four variables and log tE whereas the computer data consists of eight independent variables and log tE obtained from BALDUR. Since BALDUR simulator needs 5 cpu minutes on a Cray supercomputer for one execution, a careful selection of input points is required, which is a statistical design problem for very complex simulation code. We used a data-adaptive sequential optimal design strategy which is reported in Park [21]. Following the procedure, we obtained 66 and 64 observations from BALDUR code corresponding to ASDEX and Table 4 MLE of parameters of the model (1) from ASDEX and PDX tokamak simulation data Symbol
Description
ASDEX value
PDX value
nC u b0
Computer data set sample size Common correlation coef®cient Regression coef®cient (intercept) Regression coef®cient for c1 Regression coef®cient for c2 Regression coef®cient for c3 Regression coef®cient for c4 Regression coef®cient for log P Regression coef®cient for log I Regression coef®cient for log N Regression coef®cient for log B Variance of Y Variance ratio s c2 =s 2
66 2.24 21.41
64 10.09 21.81
20.278 0.074 20.172 0.084 20.713 0.25 0.154 0.091 0.448 0.051
20.30 20.07 20.11 0.00 20.33 0.17 20.03 0.28 0.46 0:37 £ 1022
b1 b2 b3 b4 b5 b6 b7 b8 s2 g C2
224
J.-S. Park, J. Jeon / Probabilistic Engineering Mechanics 17 (2002) 219±225
à plot for tokamak experimental data: Octagon: ASDEX, cross: PDX. Fig. 1. Residual vs. prediction Y
Table 5 Estimates of BALDUR input parameters Symbol
Description
Estimated value
Standard error
c1p c2p c3p c4p
Drift waves coef®cient Rippling coef®cient Resistive ballooning coef®cient Critical value for the ion temperature gradient mode
0.140 2.282 2.397 0.546
0.429 1.630 1.261 0.332
PDX tokamaks, respectively. This application was also considered in [17]. Table 4 shows the parameter estimates obtained from computer data which are individual to a given machine. The values of b 2 and b 4 are small for both tokamaks, indicating that the effects of c2 and c4 are small. Note that the intercept value (b 0) for log tE of PDX is less than that of ASDEX. This indicates that PDX is cooler than ASDEX for similar input parameters. Also note that s c2
s 2 £ g 2C for ASDEX is bigger than that of PDX, which means that computer observations of ASDEX have larger `measurement' errors than those of PDX. Table 5 presents the results of cp estimationÐthe ones of most interest. In computing the RSSP of Eq. (13), each machine was treated independently (mainly because they have different geometries and design regions). Thus we used the objective function in optimization to ®nd cp as a sum of the corresponding function in Eq. (13) for each tokamak. We used a quasi-newton optimization routine to ®nd cbp : Several starting values were tried to avoid the local minima. The estimate of cp2 has a relatively large estimated standard error, indicating that our knowledge of it at this time is rather uncertain. Fig. 1 shows residual plot (residual vs. predicted values) for tokamak experimental data.
6. Concluding remarks We have considered the approximated NLSE of the
unknown parameters using the GPM which were ®tted by the computer data. So this method is different from the ordinary nonlinear regression where a known regression function is used instead of a predictor. Thus, we need some justi®cations that the estimation method works well in estimating the true constants. The simulation study using some toy-models reported in Section 4 suggests that our method works well. ^ p ; xE Note that the selection of a prediction model for Y
c i is important because the values of cbp may vary according to the selected prediction model. Actually several GPMs with various combination of non-zero b js and u is in Eqs. (1) and (3) can be used as the metamodel. We do not claim that the methodology presented here is the ®nal answer, but it seems to be a good start. Other methods based on maximum likelihood estimation are proposed in Refs. [17,21]. If there is prior knowledge about the parameter vector cp ; the Bayesian estimation approach may be reasonable (see Ref. [6] for this direction). The methodology we have proposed here may prove useful in other applications in other disciplines where the unknown parameters in a theory must be estimated using very complex computer codes and real experimental data. Acknowledgements Park's research was supported by the Ministry of Information and Telecommunication of Korea, and Jeon's research was supported by KOSEF through the Statistical Research Center for Complex Systems at Seoul National University. The authors would like to thank Professor Clifford Singer at the University of Illinois for help on tokamak data and the Baldur code, and Professor Dennis Cox at Rice University for valuable comments on this study. References [1] Sacks J, Welch W, Mitchell T, Wynn H. Design and analysis of computer experiment (with discussion). Stat Sci 1989;4:409±35.
J.-S. Park, J. Jeon / Probabilistic Engineering Mechanics 17 (2002) 219±225 [2] Wesson J. Tokamaks. Oxford: Clarendon Press, 1987. [3] Singer CE, Post DE, Mikkelsen DR, Redi MH, et al. BALDUR: a onedimensional plasma transport code. Comput Phys Commun 1988;49: 275±398. [4] Miller D, Frenklach M. Sensitivity analysis and parameter estimation in dynamic modeling of chemical kinetics. Int J Chem Kinet 1983;15: 677±96. [5] Gregory AW, Smith GW. Calibration as estimation. Econ Rev 1990; 9:57±89. [6] Craig PS, Goldstein M, Seheult AH, Smith JA. Bayes linear strategies for matching hydrocarbon reservoir history. In: Bernardo JM, et al., editors. Bayesian statistics, vol. 5. Oxford: Oxford University Press, 1996. p. 69±95. [7] Roberts JB, Vasta M. Parametric identi®cation of systems with non-Gaussian excitation using measured response spectra. Prob Engng Mech 2000;15(1):59±71. [8] Capodieci L, Subramanian R, Rangarajan B, Heavlin WD, et al. Novel methodology for postexposure bake calibration and optimization based on electrical linewidth measurement and process metamodeling. J Vacuum Sci Technol B 1998;16:3752±8. [9] Roberts JB, Dunne JF, Debonos A. A spectral method for estimation of non-linear system parameters from measured response. Prob Engng Mech 1995;10(4):199±207. [10] Ripley B. Spatial statistics. New York: Wiley, 1981. [11] Aslett A, Buck RJ, Duvall SG, Sacks J, Welch WJ. Circuit optimiza-
[12] [13] [14] [15] [16] [17] [18] [19] [20] [21]
225
tion via sequential computer experiments: design of an output buffer. J Royal Stat Soc C 1998;47:31±48. Kleijnen JPC, Sargent RG. A methodology for ®tting and validating metamodels in simulation. Eur J Oper Res 2000;120(1):14±29. Park JS, Baik JS. Ef®cient computation of maximum likelihood estimators in a spatial linear model with power exponential covariogram. Comput Geosci 2001;27(1):1±7. Friedman JH, Stuetzle W. Projection pursuit regression. J Am Stat Assoc 1981;76:817±23. StatSci. S-PLUS for windows user's manual, vol. 2, Seattle: Statistical Sciences Inc, 1993. Draper N, Smith H. Applied regression analysis 2/e. New York: Wiley, 1981. Cox DD, Park JS, Singer CE. A statistical method for tuning a computer code to a data base. Comp Stat Data Anal 2001;37:77±92. McKay MD, Conover WJ, Beckman RJ. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 1979;21:239±45. Singer CE. Theoretical particle and energy ¯ux formulas for tokamaks. Comments Plasma Phys Control Fusion 1988;11:165. Kaye SM, Goldston GC. Global energy con®nement scaling for neutral-beam-heated tokamak. Nucl Fusion 1985;25:65±9. Park JS. Tuning complex computer codes to data and optimal designs. Unpublished PhD Thesis, University of Illinois, Urbana-Champaign, 1991.