An active learning radial basis function modeling method based on self-organization maps for simulation-based design problems


Accepted Manuscript

An active learning radial basis function modeling method based on self-organization maps for simulation-based design problems

Qi Zhou, Yan Wang, Ping Jiang, Xinyu Shao, Seung-Kyum Choi, Jiexiang Hu, Longchao Cao, Xiangzheng Meng

PII: S0950-7051(17)30247-2
DOI: 10.1016/j.knosys.2017.05.025
Reference: KNOSYS 3923

To appear in: Knowledge-Based Systems

Received date: 27 October 2016
Revised date: 12 May 2017
Accepted date: 24 May 2017

Please cite this article as: Qi Zhou, Yan Wang, Ping Jiang, Xinyu Shao, Seung-Kyum Choi, Jiexiang Hu, Longchao Cao, Xiangzheng Meng, An active learning radial basis function modeling method based on self-organization maps for simulation-based design problems, Knowledge-Based Systems (2017), doi: 10.1016/j.knosys.2017.05.025

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Highlights
 A sensitive region pursuing based active learning RBF modeling approach is proposed
 The LOO error is taken as an indicator to measure the sensitivity to lost information
 The boundary of the sensitive regions is determined by self-organization maps
 Detailed comparisons with other approaches are made via several numerical cases
 The proposed approach is applied to three engineering cases


An active learning radial basis function modeling method based on self-organization maps for simulation-based design problems

Qi Zhou1,2, Yan Wang2, Ping Jiang1,*, Xinyu Shao1, Seung-Kyum Choi2, Jiexiang Hu1, Longchao Cao1, Xiangzheng Meng

1. The State Key Laboratory of Digital Manufacturing Equipment and Technology, School of Mechanical Science and Engineering, Huazhong University of Science & Technology, 430074 Wuhan, PR China


2. George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
Telephone: +86-027-87557742
Fax: +86-027-87543074
Email: [email protected]


Abstract: The radial basis function (RBF) has been widely used to construct metamodels as response surfaces. Yet it often faces an accuracy challenge when a sequential sampling strategy is used to insert samples and refine the model, especially under constrained computational resources. In this paper, a sensitive region pursuing based active learning radial basis function (SRP-ALRBF) metamodeling approach is proposed to sequentially exploit the already-acquired knowledge in the modeling process and obtain a desirable estimation of the relationship between the input design variables and the output response. In this method, the leave-one-out (LOO) error of each sample point is taken as an indicator to identify the boundaries of sensitive regions. According to the obtained LOO information, the original design space is divided into several subspaces by adopting self-organization maps (SOMs). The boundary of the most sensitive region, where the output response is multi-modal or non-smooth with abrupt changes, is determined by the topological graph generated by cluster analysis in SOMs. In the most sensitive region, infill sample point searching is performed based on an optimization formulation. Ten numerical examples are used to compare the proposed SRP-ALRBF with four existing active learning RBF metamodeling approaches. Results show the advantage of the proposed SRP-ALRBF approach in both prediction accuracy and robustness. It is also applied to three engineering cases to illustrate its ability to support complex engineering design.

Keywords: Radial basis function; Active learning; Sensitive region; Self-organization maps


1. Introduction


Modern products and system designs such as vehicles, civil structures, medical devices, and drugs rely on physics-based simulations. Simulations help to predict the performance of systems when exploring the design space and searching for optimal designs. Nevertheless, accurate predictions require high-fidelity and expensive simulations. Even with the latest computing power, it is still impractical to rely only on high-fidelity simulations to yield full-scale relationships between design variables as inputs and system performance as the output (Christelis & Mantoglou, 2016; Zhou, Shao, Jiang, Gao, Zhou, et al., 2016). For instance, it can take days on a computer cluster to produce a single performance prediction, whether simulating the collapse behavior of a complete ship structure using finite element analysis or the mechanical behavior of polycrystalline alloy specimens in a strength test using full atomistic molecular dynamics simulations. Design optimization requires such simulations to be run iteratively, with performance predictions made for various combinations of input values. Therefore, relying on full-scale high-fidelity

simulations to search for the optimum design is computationally prohibitive. An effective way to reduce the search time is to adopt metamodels or surrogate models to approximate the input-output responses (Tyan, Nguyen, & Lee, 2014; Viana, Simpson, Balabanov, & Toropov, 2014).


Many metamodeling techniques have been reported in support of engineering design, including Kriging (M. Li, Li, & Azarm, 2008; Tugrul & Polat, 2014), the radial basis function (RBF) (Liang, Zhu, & Wu, 2014; Qasem, Shamsuddin, & Zain, 2012), the polynomial response surface model (Eddy, Krishnamurty, Grosse, Wileden, & Lewis, 2015; Su & Chen, 2012), and support vector regression (Wang, Ni, & Wang, 2012; Xiao, Gao, Xiong, & Luo, 2015). Among these techniques, RBF metamodeling shows a good trade-off between prediction accuracy and modeling cost for high-dimensional and nonlinear problems (Hongbing Fang & Horstemeyer, 2006; Jin, Chen, & Simpson, 2001; Tyan, et al., 2014). It has been widely applied in the design of complex engineering systems, such as ship structures (Prebeg, Zanic, & Vazic, 2014; Volpi, et al., 2014), automotive vehicles (H. Fang, Rais-Rohani, Liu, & Horstemeyer, 2005; Sun, Li, Gong, He, & Li, 2011; Yin, Fang, Wang, & Wen, 2016), satellites (Peng, Liu, Long, & Yang, 2014; Shi, Liu, Long, & Liu, 2015), and compressor blades (Liu, Xu, Wang, Meng, & Yang, 2016). It is important to point out that all of the above applications require the RBF metamodel to have desirable prediction performance, since a low-fidelity RBF metamodel may yield inaccurate or even distorted input-output relationships for many black-box problems, especially when computational resources are limited.


To obtain an RBF metamodel with desirable approximation performance, various approaches have been presented in the literature. Broadly, these methods can be divided into two types: enhancing the metamodel itself and enhancing the sampling technique. In the first type, the prediction accuracy of the RBF metamodel is improved by tuning the parameters or the radial basis function types involved in the model. For example, Yao et al. (Yao, Chen, Zhao, & van Tooren, 2012) proposed a concurrent subspace width optimization method to determine the optimum width parameters for RBF metamodels. Mullur et al. (Mullur & Messac, 2005) introduced non-radial basis functions and included more than one basis function for each sample to provide a more flexible way of RBF metamodeling. Sarimveis et al. (Sarimveis, Alexandridis, Mazarakis, & Bafas, 2004) proposed a dynamic RBF modeling approach using a specially designed genetic algorithm (GA) to obtain the model parameters. Meng et al. (Meng, Dong, & Wong, 2009) combined differential evolution and fuzzy c-means to configure the structure of the RBF. Liu et al. (Liu, et al., 2016) made full use of the local and global prediction errors of different radial basis functions to calculate their weights in building ensembles of RBF metamodels. In the second type, enhancing the sampling technique, research focuses on how to collect as much information as possible with limited computational resources through special design of experiment (DOE) or active learning methods. Some approaches distribute the newly added sample points as evenly as possible in the design space. For example, Crombecq et al. (Karel Crombecq, De Tommasi, Gorissen, & Dhaene, 2009) proposed a Voronoi-based sampling approach, where a new sample in the largest Voronoi cell is selected to refresh the metamodel. Jin et al. (Jin, Chen, & Sudjianto, 2002) proposed a maximin scaled distance approach for active learning RBF modeling. Because the characteristics of the outputs are ignored, these approaches often fail to improve the accuracy of RBF metamodels for nonlinear problems. Some approaches address this issue by


building an infill optimization problem that simultaneously considers the input and output parameter spaces. For instance, Yao et al. (Yao, Chen, & Luo, 2009) put forward a gradient-enhanced active learning RBF modeling method, where the adaptive sample points that have the maximum expected gradient in the present RBF metamodel are selected to update the model. Wei et al. (Wei, Wu, & Chen, 2012) developed a curvature-enhanced active learning approach for RBF models. Ye et al. (Ye, Pan, Huang, & Shi, 2015) proposed a sequential RBF metamodel that takes the extreme points on the response surface and the minimum density function into consideration. Zhou et al. (P. Jiang, et al., 2015; Zhou, Shao, Jiang, Gao, Wang, et al., 2016) proposed a weighted accumulative error (WAE) based active learning approach. However, because the simultaneous consideration of input and output characteristics usually generates a highly nonlinear infill optimization objective over the whole design space, it often results in sub-optimal samples (K. Crombecq, Laermans, & Dhaene, 2011; Zhou, Shao, Jiang, Gao, Wang, et al., 2016). As suboptimal information propagates during the active learning process, the cumulative effect tends to be detrimental to the prediction performance of the obtained RBF metamodel.


To address the issue of nonlinearity in sequential sampling, in this paper a sensitive region pursuing based active learning RBF (SRP-ALRBF) metamodeling approach is proposed to analyze the data acquired in previous iterations and arrange new experiments in regions that are more difficult to approximate. First, the leave-one-out (LOO) error of each sample point is taken as an indicator of the sensitivity of the RBF metamodel to the lost information at that point. According to the obtained LOO information, the whole design space is divided into several subspaces by adopting self-organization maps (SOMs). Second, the boundary of the sensitive region, a subspace where the output responses are multi-modal and non-smooth with abrupt changes, is determined by the topological graph generated by cluster analysis in SOMs. Finally, in the sensitive region, an infill mathematical model is introduced and sample point searching is performed. The prediction performance of the SRP-ALRBF approach is illustrated using ten numerical cases and three real-life cases, and a comparison between SRP-ALRBF and four other active learning RBF metamodeling approaches is made. The results indicate that SRP-ALRBF produces more accurate and robust RBF metamodels.


The remainder of this paper is organized as follows. In Section 2, a brief introduction of radial basis functions and self-organizing maps is provided. Details of the proposed SRP-ALRBF metamodeling approach are described in Section 3. In Section 4, ten numerical examples with different levels of complexity are used to compare the proposed approach with existing active learning RBF metamodeling approaches in terms of prediction performance, considering different sample sizes and degrees of nonlinearity. In Section 5, the proposed approach is applied to three real-life cases, followed by a summary of this work and future work in Section 6.

2. Background

In this section, RBF metamodeling and SOM are introduced.

2.1 Radial basis function metamodeling

RBF metamodeling was first proposed by Hardy (Hardy, 1971) to approximate irregular topographic contours of geographical data. Typically this method employs a three-layer feedforward neural network with an input layer, a hidden layer of radial units, and an output layer.

Let the sample set generated by DOE be $X = \{x_1, x_2, \ldots, x_M\}$, with corresponding responses $Y = \{y_1, y_2, \ldots, y_M\}$. The RBF metamodel is a linear combination of radial basis functions with weight coefficients:

$\hat{f}(x) = \sum_{i=1}^{M} \lambda_i \phi(\|x - x_i\|)$    (1)

where $\hat{f}(x)$ is the predicted value at the un-sampled point $x$, $M$ is the number of sample points, and $x_i$ is the $i$th sample point in $X$. $\|x - x_i\|$ is the Euclidean distance between the prediction site $x$ and the $i$th sample point:

$\|x - x_i\| = \sqrt{(x - x_i)^T (x - x_i)}$    (2)

$\phi(\cdot)$ denotes the radial basis function. Commonly used radial basis functions are (Tripathy, 2010):
(1) Bi-harmonic: $\phi(r) = r$
(2) Thin-plate spline: $\phi(r) = r^2 \log(r)$
(3) Multiquadric: $\phi(r) = \sqrt{r^2 + c^2}$
(4) Cubic: $\phi(r) = (r + c)^3$
(5) Gaussian: $\phi(r) = e^{-c r^2}$
(6) Inverse-multiquadric: $\phi(r) = 1/\sqrt{r^2 + c^2}$

where $c$ is a constant with $0 < c \le 1$. The unknown interpolation coefficients $\lambda_i$ are obtained by minimizing the sum of squared deviations:

$J(\lambda) = \sum_{i=1}^{M} \left[ f(x_i) - \sum_{j=1}^{M} \lambda_j \phi(\|x_i - x_j\|) \right]^2$    (3)

Solving the above optimization problem, the coefficient vector $\lambda = [\lambda_1, \lambda_2, \ldots, \lambda_M]$ is obtained as

$\lambda = (\Phi^T \Phi + \Lambda)^{-1} \Phi^T Y$    (4)

where $\Lambda$ is all zeros except for the regularization parameters along its diagonal, and $\Phi$ is the design matrix

$\Phi = \begin{bmatrix} \phi(\|x_1 - x_1\|) & \phi(\|x_1 - x_2\|) & \cdots & \phi(\|x_1 - x_M\|) \\ \phi(\|x_2 - x_1\|) & \phi(\|x_2 - x_2\|) & \cdots & \phi(\|x_2 - x_M\|) \\ \vdots & \vdots & \ddots & \vdots \\ \phi(\|x_M - x_1\|) & \phi(\|x_M - x_2\|) & \cdots & \phi(\|x_M - x_M\|) \end{bmatrix}$    (5)
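As a concrete sketch, Eqs. (1)-(5) can be implemented in a few lines of NumPy. The Gaussian basis, the value c = 0.5, and the small uniform ridge standing in for the regularization matrix $\Lambda$ are illustrative assumptions, not the settings used in the paper:

```python
import numpy as np

def gaussian_basis(r, c=0.5):
    """Gaussian radial basis phi(r) = exp(-c * r^2), with 0 < c <= 1."""
    return np.exp(-c * r**2)

def fit_rbf(X, y, basis=gaussian_basis, reg=1e-8):
    """Solve Eq. (4): lambda = (Phi^T Phi + Lambda)^(-1) Phi^T y,
    with Lambda approximated by a small uniform ridge."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances, Eq. (2)
    Phi = basis(D)                                              # design matrix, Eq. (5)
    A = Phi.T @ Phi + reg * np.eye(len(X))
    return np.linalg.solve(A, Phi.T @ y)

def predict_rbf(X, lam, x_new, basis=gaussian_basis):
    """Evaluate Eq. (1): f_hat(x) = sum_i lambda_i * phi(||x - x_i||)."""
    r = np.linalg.norm(x_new[None, :] - X, axis=-1)
    return basis(r) @ lam
```

Because the Gaussian kernel matrix is positive definite for distinct sample points, the regularized least-squares solution of Eq. (4) effectively interpolates the training data.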

2.2 Self-organization maps


The self-organization map (SOM) is an unsupervised learning neural network originally proposed by Kohonen et al. (Kohonen, 1982). SOMs have been widely applied in the fields of data mining (Bekel, Heidemann, & Ritter, 2005; Nikkilä, et al., 2002), machine learning (Jiang, Luo, Beggs, Cheung, & Scorgie, 2015), and visualization-assisted design optimization (Chu, Gao, Qiu, Li, & Shao, 2010). The SOM has several interesting characteristics. First, it is able to map the complex high-dimensional relationships between inputs and outputs into a low-dimensional space while maintaining the topological structure of the original data. Second, it provides the flexibility to cluster data with similar properties.

ACCEPTED MANUSCRIPT

Competitive layer

x1

x2

... xm

CR IP T

Input layer

Figure 1. Sketch map of SOMs


Fig. 1 plots a schematic diagram of a SOM. The network consists of two layers: the input layer and the competitive layer. All neurons on the input layer are connected to the neurons on the competitive layer by different weights. The initial values of the weights are generated using random or linear assignment methods. The neuron nodes on the competitive layer, whose weight vectors have the same dimension as the input design variables, form a two-dimensional grid. In the training process, the Euclidean distances between the design vector and the weight vectors of the neuron nodes on the competitive layer are taken as the indicators to determine the best matching unit. Once the best matching unit is obtained, its weight vector as well as those of its neighboring neurons are updated to move closer to the design vector. This training process is repeated until the design vectors and the neuron nodes on the competitive layer are fully matched. Finally, the distribution of the neuron nodes on the competitive layer preserves the distribution of the original data.

3. Proposed approach


The goal of the proposed SRP-ALRBF approach is to obtain an estimation of the relationship between the design variables as inputs and the output response with sufficient accuracy subject to limited computational resources. Similar to existing active learning RBF modeling approaches, SRP-ALRBF is based on an iterative learning process. An initial sample set is generated as a start. Then the computational simulation models or physical experiments are conducted at these sample points. Based on the sampling data, an initial RBF metamodel is constructed. If the maximum number of available sample points has not been reached, a new sample point is selected and added to the current sample set in each iteration to update the RBF metamodel. The novelty of the proposed SRP-ALRBF approach lies in its way of selecting infill sample points in the input design space. In the process of searching for new sample points, the leave-one-out errors at the sample points are taken as indicators of the sensitivity of the RBF metamodel to the lost information at each point. With the obtained LOO information, the original design space is divided into several subspaces by adopting SOMs. Then the boundary of the sensitive region, where the output responses are multi-modal or non-smooth with abrupt changes, is determined by the topological graph generated by the cluster analysis in SOMs. Finally, in the sensitive region, an infill mathematical model is built and sample point searching is performed. In Fig. 2, the framework of the proposed SRP-ALRBF approach is presented. In the following subsections, the details of each step are described. The numerical case presented in Section 4.1 is used as an auxiliary example to illustrate the key ideas of the proposed approach throughout this section.

Start
Step 1: Construct the initial RBF metamodel
 Generate the initial sample points using OLHD
 Obtain the response values at these sample points
 Construct the initial RBF metamodel
Step 2: Obtain the LOO error at each sample point
 Construct an RBF with each point left out
 Obtain the LOO error for each sample point
Step 3: Obtain the sensitive region by SOMs
 Map the original samples by SOMs
 Obtain the topological graph for LOO errors
 Determine the lower and upper boundaries of the SR
Step 4: Obtain the new sample point
 Determine whether the SR needs to be expanded
 Create the space-filling criterion in the SR
 Search for the newly added sample point
Step 5: Update the RBF metamodel
 Add the newly obtained sample point into the sample set
 Construct the RBF metamodel on the current sample set
Step 6: Is the stopping criterion met? (No: return to Step 2; Yes: continue)
Step 7: Output the final RBF metamodel
Stop

Figure 2. Flowchart of the proposed SRP-ALRBF approach.

3.1 Step 1: Construct the initial RBF metamodel


The proposed approach starts with an initial RBF metamodel. First, an initial sampling set is generated to obtain preliminary knowledge of the design space. Then the response vector at the initial sampling set is obtained by running the computer simulation models or conducting physical experiments, which are typically expensive. Based on the sampled data, an initial RBF metamodel is constructed. In this paper, optimal Latin hypercube sampling (LHS) is adopted to uniformly cover the design space. Unlike traditional LHS, where the factor levels are randomly combined, in optimal LHS the combination of factor levels is optimized according to some special criterion (Zhou, Shao, Jiang, Zhou, & Shu, 2015). Many approaches are available for obtaining an optimal LHS. Here, an enhanced stochastic evolutionary algorithm (van Dam, Husslage, den Hertog, & Melissen, 2007; Xiong, Xiong, Chen, & Yang, 2009; Jin, Chen, & Sudjianto, 2005) is adopted, where the maximin distance criterion of potential points in the design space is used.

3.2 Step 2: Obtain the LOO errors at each sample point

The leave-one-out (LOO) method is a type of cross-validation that can be used to assess the prediction accuracy of metamodels. For a given sampling set $X^o = \{x_1^o, x_2^o, \ldots, x_t^o\}$ that consists of $t$ sample points of $d$-dimensional variables, together with the corresponding response vector $Y = \{y_1, y_2, \ldots, y_t\}$, the basic procedure for obtaining the LOO error at each sample point is demonstrated in Algorithm 1.


Algorithm 1. Obtaining the LOO errors for sampling points.
Input: The current sampling set $X^o = \{x_1^o, x_2^o, \ldots, x_t^o\}$ and responses $Y = \{y_1, y_2, \ldots, y_t\}$.
Output: The LOO errors for the sampling points.
1 Begin
2 for $i = 1$ to $t$ do
3   $X_{-i}^o = \{x_1^o, \ldots, x_{i-1}^o, x_{i+1}^o, \ldots, x_t^o\}$  // Remove the $i$th sample point
4   $\hat{f}_{-i} \leftarrow$ Build a metamodel using the remaining sample points
5   $\hat{y}_{-i}(x_i^o) \leftarrow$ Predict the response at the $i$th sample point
6   $e_{LOO}(x_i^o) \leftarrow$ Calculate the absolute difference between $\hat{y}_{-i}(x_i^o)$ and $y_i$
7 end for
8 end

It is obvious that the LOO method cannot be used to predict the error at points that are not in the current sample set. However, it can provide some local information about the metamodel (G. Li, Aute, & Azarm, 2009), namely the sensitivity of the metamodel to the loss of sample information in the current sample set. Taking the Modified Easom (ME) function presented in Section 4.1 as an example, as illustrated in Fig. 3(a) and Fig. 3(b), sample point #13 is located in a region where the function values are non-smooth and have abrupt changes, whereas sample point #11 is located in a relatively smooth region. Two metamodels, built with sample point #13 and sample point #11 left out, are shown in Fig. 3(c) and Fig. 3(e), respectively. The corresponding errors between the actual function and these two metamodels are shown in Fig. 3(d) and Fig. 3(f). It can be seen from Fig. 3(c) and Fig. 3(d) that there is a significant difference between the actual function and the RBF metamodel built without the information at sample point #13. The large LOO error at sample point #13, along with the large actual errors between the actual function and the RBF metamodel in the region near sample point #13, indicates that sample point #13 is likely located in a sensitive region and that more experiments are required to explore the characteristics of this region. On the other hand, as illustrated in Fig. 3(e) and Fig. 3(f), although sample point #11 is left out, the RBF metamodel can still capture the general tendency of the actual function.
A small LOO error at sample point #11 implies that the accuracy of the metamodel is not sensitive to the loss of sample point #11, i.e., the RBF metamodel already has sufficient prediction accuracy in the region near sample point #11. Hence, the LOO errors at the existing sample points can be adopted as indicators to discover the regions where more experiments are required to improve the prediction accuracy of the metamodel.
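Algorithm 1 can be sketched as follows. The minimal Gaussian-RBF interpolator `_rbf_fit_predict` and its constant c = 0.5 are illustrative stand-ins for whatever metamodel is rebuilt at each leave-one-out step:

```python
import numpy as np

def _rbf_fit_predict(X_train, y_train, x_query, c=0.5):
    """Minimal Gaussian-RBF interpolator used only for this sketch."""
    D = np.linalg.norm(X_train[:, None] - X_train[None, :], axis=-1)
    Phi = np.exp(-c * D**2)
    # Small jitter keeps the interpolation system well conditioned
    lam = np.linalg.solve(Phi + 1e-10 * np.eye(len(X_train)), y_train)
    r = np.linalg.norm(x_query - X_train, axis=-1)
    return np.exp(-c * r**2) @ lam

def loo_errors(X, y):
    """Algorithm 1: leave each sample out, refit on the rest, and record
    the absolute prediction error at the left-out point."""
    errors = np.empty(len(X))
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        y_hat = _rbf_fit_predict(X[mask], y[mask], X[i])
        errors[i] = abs(y_hat - y[i])
    return errors
```

The resulting error vector is exactly the sensitivity indicator described above: large entries flag sample points whose removal noticeably degrades the metamodel.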

Figure 3. An illustration of using the LOO error to provide local information for the RBF metamodel: (a) 3D surface of the ME function with all sample points; (b) contour map of the ME function; (c) RBF metamodel without sample #13; (d) actual errors of the RBF metamodel without sample #13; (e) RBF metamodel without sample #11; (f) actual errors of the RBF metamodel without sample #11.

3.3 Step 3: Obtain the sensitive region by SOMs

As discussed in Section 3.2, assigning additional sample points in the sensitive region will improve the prediction accuracy of the RBF metamodel. However, the LOO errors alone do not indicate where the additional sample points should be positioned: although the location of the sensitive region can be estimated from the LOO errors at the existing sample points, the boundary of the sensitive region cannot be determined. In this regard, SOM can be used to generate a topological graph of the LOO errors in the design variable space and hence divide the original design space. In this study SOM is applied to determine the boundary of the sensitive region. For a given sample set $X^o = \{x_1^o, x_2^o, \ldots, x_t^o\}$, together with the corresponding LOO error vector $Y^{LOO} = \{y_1^{LOO}, y_2^{LOO}, \ldots, y_t^{LOO}\}$ obtained using Algorithm 1, and assuming that there are $k$ nodes in the output layer, the process of obtaining the topological graph by cluster analysis in SOMs consists of the following six steps.

Step I: Generate an initial weight vector $m = \{m_1, m_2, \ldots, m_k\}$ for the output layer neuron nodes using the linear initialization method and define a threshold $t_{max}$ for the number of training iterations.

Step II: Input the training data $[X^o, Y^{LOO}]$ and obtain the best matching neuron unit for each sampling point by calculating the minimum distance between the sampling point and the output layer neuron nodes:

$\|x^o - m_c\| = \min_i \{\|x^o - m_i\|\}$    (6)

In Eq. (6), $\|\cdot\|$ denotes the Euclidean distance, and $m_c$ is defined as the best matching unit. Through Eq. (6), all input samples can be mapped to the corresponding output neurons.

Step III: Collect the neurons that are mapped into the affiliate domain of each neuron. In this work, a Gaussian function is used as the neighborhood function:

$h(i, c) = e^{-\|m_i - m_c\|^2 / \sigma_h^2}$    (7)

In Eq. (7), $\sigma_h$ denotes the range of the neighborhood, which shrinks over training as

$\sigma_h(t) = \sigma_{h0}\, e^{-t / t_{max}}$    (8)

where $\sigma_{h0}$ is the initial value of the range of the neighborhood and $t$ is the current training iteration.

Step IV: Update the weights $w$ of all the output neurons; the best matching unit and the neurons in its affiliate domain are moved toward the input vector. The update rule for the $i$th neuron is

$w_i(t+1) = w_i(t) + \alpha(t)\, h(t)\, (x(t) - m_i(t))$    (9)

where $\alpha(t)$ is the learning factor.

Step V: Check whether the maximum number of training iterations has been reached. If so, proceed to Step VI; otherwise go back to Step II.

Step VI: Generate the topological graph of the LOO errors in the design variable space.
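The six steps can be sketched in a compact SOM trainer. The random initialization (the paper uses linear initialization), the grid size, and the linearly decaying learning factor $\alpha(t)$ are illustrative assumptions:

```python
import numpy as np

def train_som(data, grid=(6, 6), t_max=500, alpha0=0.5, sigma0=2.0, seed=0):
    """Minimal SOM following Steps I-VI: initialization, BMU search
    (Eq. (6)), Gaussian neighborhood (Eq. (7)) with shrinking radius
    (Eq. (8)), and the weight update of Eq. (9)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each neuron, used for neighborhood distances
    coords = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    # Step I: initialize weight vectors (random init for simplicity)
    m = rng.random((rows * cols, data.shape[1]))
    for t in range(t_max):
        x = data[rng.integers(len(data))]
        # Step II: best matching unit, Eq. (6)
        c = np.argmin(np.linalg.norm(x - m, axis=1))
        # Step III: Gaussian neighborhood with radius sigma_h(t), Eqs. (7)-(8)
        sigma = sigma0 * np.exp(-t / t_max)
        h = np.exp(-np.linalg.norm(coords - coords[c], axis=1)**2 / sigma**2)
        # Step IV: move weights toward x, Eq. (9)
        alpha = alpha0 * (1 - t / t_max)
        m += alpha * h[:, None] * (x - m)
    # Steps V-VI: after t_max iterations the trained weights define the map
    return m, coords
```

Plotting the trained LOO-error component of the weight vectors over `coords` yields the topological graph used in the next step.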

As a demonstration, the topological graph shown in Fig. 4 is generated using SOMs, where the training data are the LOO errors at the sample points in Fig. 3(b). It can be seen from Fig. 4 that the relatively higher values of LOO errors are distributed over the top-right section. The corresponding ranges of the design variables x1 and x2 can be easily read from the topological graph with different colors.

Figure 4. An illustration of the topological graph of the LOO errors in the design variable space.


Once the topological graph is obtained, the boundary of the sensitive region can be determined by mapping the topological information back to the original design variable space. Fig. 5 plots the obtained sensitive region in the original design variable space. The highlighted nodes represent the sample points in the current sample set. The large rectangle is the range of the design variable space. The smaller rectangle represents the boundary of the sensitive region; its vertices and their coordinates are marked with arrows. As shown in Fig. 5, sample point #13 falls into the sensitive region, which is expected because the region around sample point #13 is not sampled well and thus more sample points are needed to reduce the prediction error in this region, as discussed in Section 3.2.


Figure 5. The demonstration of the sensitive region obtained by SOMs.

3.4 Step 4: Obtain the new sample point


According to the literature and experience, the corner points of the design variable space may contain valuable information about the studied system. To take this into consideration, the criterion described in Algorithm 2, based on the work of Chu et al. (Chu, et al., 2010), is introduced to determine whether the sensitive region needs to be extended: if the distance between the center of the sensitive region and the nearest vertex of the original design variable space is relatively small, there will be no distinct difference in the clustering results between them, and the sensitive region is extended to include that vertex. In Algorithm 2, $\beta$ is defined as the distance factor. It has been found that $0.15 \le \beta \le 0.3$ is proper in most cases, and we adopt $\beta = 0.2$ in all test cases presented in this work. According to Algorithm 2, the sensitive region depicted in Fig. 5 does not need to be extended in the active learning RBF metamodeling process. This is illustrated in Fig. 6.

Algorithm 2. The criterion to determine whether the sensitive region needs to be extended.
Input: The coordinates of the vertices of the original design variable space $x_v^o = \{x_{v_1}^o, x_{v_2}^o, \ldots, x_{v_{2^d}}^o\}$, the boundary of the original design variable space $[x_{low}^o, x_{up}^o]$, the boundary of the sensitive region $[x_{low}^{sr}, x_{up}^{sr}]$, and the distance factor $\beta$.
Output: The finally obtained sensitive region.
1 Begin
2   $x_{center}^{sr} = (x_{low}^{sr} + x_{up}^{sr}) / 2$  // Obtain the center of the sensitive region
3   $d_0 = \|x_{up}^o - x_{low}^o\|_2$  // Obtain the longest diagonal of the original design space
4   for $i = 1$ to $2^d$ do
5     $d_{v_i}^o = \|x_{v_i}^o - x_{center}^{sr}\|_2$  // Distance between the center of the sensitive region and each vertex
6   end for
7   $d_{v_{min}}^o = \min(d_{v_i}^o)$  // Minimum center-to-vertex distance; denote the nearest vertex $m_0$
8   if $d_{v_{min}}^o \le \beta d_0$
9     $[x_{low}^{sr}, x_{up}^{sr}] \leftarrow$ Extend the sensitive region to include the vertex $m_0$
10  else
11    $[x_{low}^{sr}, x_{up}^{sr}] \leftarrow$ Do not extend the sensitive region
12  end
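A sketch of Algorithm 2, assuming box-shaped regions described by lower/upper corner vectors (the function name `maybe_extend_sr` is ours, not the paper's):

```python
import numpy as np
from itertools import product

def maybe_extend_sr(lo_o, up_o, lo_sr, up_sr, beta=0.2):
    """Algorithm 2: extend the sensitive region (SR) to include a corner of
    the design space if that corner lies closer to the SR center than
    beta times the space diagonal."""
    lo_o, up_o = np.asarray(lo_o, float), np.asarray(up_o, float)
    lo_sr, up_sr = np.asarray(lo_sr, float), np.asarray(up_sr, float)
    center = (lo_sr + up_sr) / 2.0                 # line 2
    d0 = np.linalg.norm(up_o - lo_o)               # line 3: longest diagonal
    # Lines 4-6: all 2^d vertices of the original design space
    vertices = np.array([[lo_o[k] if b == 0 else up_o[k]
                          for k, b in enumerate(bits)]
                         for bits in product([0, 1], repeat=len(lo_o))])
    dists = np.linalg.norm(vertices - center, axis=1)
    i = np.argmin(dists)                           # line 7: nearest vertex m0
    if dists[i] <= beta * d0:                      # lines 8-9
        lo_sr = np.minimum(lo_sr, vertices[i])     # stretch the SR box to m0
        up_sr = np.maximum(up_sr, vertices[i])
    return lo_sr, up_sr
```

With the SR of Fig. 5 inside a design space whose corners are all farther away than 0.2 times the diagonal, the box is returned unchanged, matching the non-extension outcome shown in Fig. 6.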


Figure 6. The obtained sensitive regions based on Algorithm 2.

When searching for a fill-in sample point in the sensitive region, unnecessary clustering of sample points should be avoided. Because the available computational or experimental resources are limited, other potential sensitive regions may otherwise have no chance to be sampled. Hence, the fill-in sample point should have a large distance to the existing sample points in the sensitive region and be sufficiently far away from the existing sample points in the current sampling set. To achieve this goal, an optimization formulation is established as

max f(x) = Σ_{i=1}^{p} ||x − x_i^sr||_2
s.t. g(x) = ||x − x_i^o||_2 ≥ 0.5·mean(min(||x_i^o − x_j^o||_2)), x_i^o, x_j^o ∈ X^o (i ≠ j)        (10)
x ∈ [x_low^sr, x_up^sr]

In Eq. (10), the objective function f(x) denotes the accumulated distance between the potential sample point and the existing sample points in the sensitive region, and p is the number of sample points in the sensitive region. The constraint g(x) keeps the space-filling property of the whole design space; its threshold can be obtained in a two-step procedure:
Step 1: For each existing sample point, calculate the minimum distance from this sample point to the remaining sample points in the current sampling set X^o.
Step 2: Obtain the average of these minimum distances.
By solving the optimization problem in Eq. (10), the obtained new fill-in sample point is expected to have a reasonable distance from the existing sample points and, at the same time, to significantly improve the prediction accuracy of the RBF metamodel. In this work, a genetic algorithm (Coello, 2000) with a penalty method (Homaifar, Qi, & Lai, 1994) for constraint handling is adopted to solve the optimization problem. The newly added sample point for the case in Fig. 3(b) is shown in Fig. 7. As illustrated in Fig. 7, the newly added sample point falls within the boundaries of the sensitive region and at the same time keeps a desirable space-filling property.
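The selection of the fill-in point can be sketched as follows. The paper solves Eq. (10) with a genetic algorithm and a penalty function; this illustration substitutes a simple rejection-based random search so that the sketch stays self-contained, and all names are illustrative.

```python
import math
import random

def fill_in_point(sr_pts, all_pts, sr_low, sr_up, n_trials=5000, seed=0):
    """Search the sensitive region for a fill-in point that maximizes the
    accumulated distance to the points in the region (objective of Eq. (10)),
    subject to the mean-minimum-distance constraint, here handled by
    rejection instead of the paper's penalized GA."""
    rng = random.Random(seed)
    # constraint threshold: 0.5 * mean over i of min_{j != i} ||x_i - x_j||
    mins = [min(math.dist(a, b) for b in all_pts if b is not a) for a in all_pts]
    thresh = 0.5 * sum(mins) / len(mins)
    best, best_f = None, -1.0
    for _ in range(n_trials):
        x = [rng.uniform(lo, up) for lo, up in zip(sr_low, sr_up)]
        if min(math.dist(x, p) for p in all_pts) < thresh:
            continue  # infeasible: too close to an existing sample point
        f = sum(math.dist(x, p) for p in sr_pts)  # accumulated distance
        if f > best_f:
            best, best_f = x, f
    return best
```

The returned point lies inside the sensitive region while respecting the space-filling constraint over the whole sampling set.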


Figure 7. The illustration of an added fill-in sample point in one of the iterations.

3.5 Step 5: Update the RBF metamodel

Once the new fill-in sample point is obtained, it is evaluated and the information at this point is added into the current sampling set. Then the RBF metamodel is rebuilt based on the updated sampling data.

3.6 Step 6: Check the stopping criterion

Two important aspects are considered when choosing the stopping criterion: (a) achieving a pre-determined metamodeling accuracy always needs a large number of sample points (Ajdari & Mahlooji, 2013; Eason & Cremaschi, 2014; Xu, Liu, Wang, & Jiang, 2014); and (b) practically, the available resource for running the simulation models or conducting physical experiments is limited (sometimes the limitation is known a priori). The proposed approach therefore uses the maximum number of available sample points as the stopping criterion, with the expectation that the final RBF metamodel achieves the required prediction accuracy.

3.7 Step 7: Output the final RBF metamodel

Once the stopping criterion is reached, the final RBF metamodel is built and output.
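Step 5 amounts to re-solving the RBF interpolation system on the enlarged sampling set. A minimal sketch with a Gaussian basis follows; the basis function and its width c are illustrative assumptions, as the paper's exact RBF formulation is specified elsewhere.

```python
import numpy as np

def fit_rbf(X, y, c=1.0):
    """Fit an RBF interpolant s(x) = sum_i w_i * phi(||x - x_i||) with a
    Gaussian basis phi(r) = exp(-c*r^2); called again after each update."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    r2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)  # pairwise squared distances
    w = np.linalg.solve(np.exp(-c * r2), y)                   # interpolation weights

    def predict(x):
        d2 = ((X - np.asarray(x, dtype=float)) ** 2).sum(axis=-1)
        return float(np.exp(-c * d2) @ w)
    return predict

# Step 5 in use (hypothetical names): evaluate the new fill-in point and rebuild
# X.append(x_new); y.append(expensive_simulation(x_new)); s = fit_rbf(X, y)
```

Because the interpolant passes exactly through the sampling data, refitting after each new point keeps the metamodel consistent with all evaluated samples.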

4. Numerical experiments

4.1 Numerical examples

Ten numerical examples from Aute et al. (Aute, Saleh, Abdelaziz, Azarm, & Radermacher, 2013), Li et al. (G. Li, et al., 2009), Jin et al. (Jin, et al., 2001) and Zhou et al. (Zhou, et al., 2015) are utilized to illustrate the effectiveness of the proposed ALRBF metamodeling approach. These test examples are classified into two groups according to the nonlinearity of their behaviors: the first group exhibits low-order nonlinear behavior, and the second exhibits high-order nonlinearity. The ten numerical examples are summarized in Table 1 and their analytical functions are described as follows. Note that although the analytical functions are explicit, the relationships between the input variables and the corresponding responses are

assumed to be unknown, because the analytical functions are only used to obtain the responses at the given sample points.

• Six-hump Camel-back function (SC):
f(x1, x2) = 4x1^2 − 2.1x1^4 + x1^6/3 + x1x2 − 4x2^2 + 4x2^4;  x1 ∈ [−2, 2], x2 ∈ [−2, 2]        (11)

• Aute's function (Aute):
f(x1, x2) = x2^2 − x1;  x1 ∈ [−8, 8], x2 ∈ [−8, 8]        (12)

• Two-dimensional Rosenbrock function (RB):
f(x1, x2) = 100(x1^2 − x2)^2 + (x1 − 1)^2;  x1 ∈ [−2.048, 2.048], x2 ∈ [−2.048, 2.048]        (13)

• Jin's function (Jin):
f(x1, x2) = 0.5x1^2 + x2^2 − x1x2 − 7x1 − 7x2;  x1 ∈ [−2, 2], x2 ∈ [−2, 2]        (14)

• Booth function (Booth):
f(x1, x2) = (x1 + 2x2 − 7)^2 + (2x1 + x2 − 5)^2;  x1 ∈ [−10, 10], x2 ∈ [−10, 10]        (15)

• Modified Easom function (ME):
f(x1, x2) = −cos(x1)·cos(x2)·exp(−(x1 − 2)^2 − (x2 − 2)^2);  x1 ∈ [−5, 5], x2 ∈ [−5, 5]        (16)

• Li's function (Li):
f(x1, x2) = cos(x1^2 + x2^2);  x1 ∈ [−5, 5], x2 ∈ [−5, 5]        (17)

• Dixon & Price function (DP):
f(x1, ..., x4) = (x1 − 1)^2 + Σ_{i=2}^{4} i·(2xi^2 − x_{i−1})^2;  xi ∈ [−10, 10], i = 1, ..., 4        (18)

• Hartman function with n = 6 (Hart6):
f(x1, ..., x6) = −(1/1.94)·(2.58 + Σ_{i=1}^{4} ci·exp[−Σ_{j=1}^{6} aij·(xj − pij)^2]),  xj ∈ [0, 1]        (19)

[ci] = [1, 1.2, 3, 3.2]^T

[aij] = | 10    3    17    3.5  1.7  8  |
        | 0.05  10   17    0.1  8    14 |
        | 3     3.5  1.7   10   17   8  |
        | 17    8    0.05  10   0.1  14 |

[pij] = | 0.1312  0.1696  0.5569  0.0124  0.8283  0.5886 |
        | 0.2329  0.4135  0.8307  0.3736  0.1004  0.9991 |
        | 0.2348  0.1451  0.3522  0.2883  0.3047  0.6650 |
        | 0.4047  0.8828  0.8732  0.5743  0.1091  0.0381 |

• Borehole model function (BM):
f_borehole = 2π·x3·(x4 − x6) / { ln(x2/x1)·[1 + 2·x3·x7/(ln(x2/x1)·x1^2·x8) + x3/x5] }        (20)
x1 ∈ [0.05, 0.15], x2 ∈ [100, 50000], x3 ∈ [63070, 115600], x4 ∈ [990, 1110],
x5 ∈ [63.1, 116], x6 ∈ [700, 820], x7 ∈ [1120, 1680], x8 ∈ [9855, 12045]

Table 1. Summary of the mathematical test cases.

Type 1 (smooth in general):
  Six-hump Camel-back function (SC), 2 dimensions
  Aute's function (Aute), 2 dimensions
  Two-dimensional Rosenbrock function (RB), 2 dimensions
  Jin's function (Jin), 2 dimensions
  Booth function (Booth), 2 dimensions
Type 2 (nonlinear in some regions, with several local minima):
  Modified Easom function (ME), 2 dimensions
  Li's function (Li), 2 dimensions
  Dixon & Price function (DP), 4 dimensions
  Hartman function (Hart6), 6 dimensions
  Borehole model function (BM), 8 dimensions

4.2 Sampling

Two different sample sizes, small (n = 40) and large (n = 80), are used to study the prediction performance of the proposed SRP-ALRBF metamodeling approach. The labels "small" and "large" are relative to each other. In this work, one half of the total number of sample points is generated as an initial sampling set to provide a global view of the design space. A discussion of the effect of the number of initial sample points on the proposed approach is given in Section 4.5.
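The paper does not prescribe the design of experiments for the initial set in this subsection; Latin hypercube sampling is one common choice for space-filling initial designs and is sketched below purely for illustration.

```python
import random

def latin_hypercube(n, bounds, seed=0):
    """Generate an n-point Latin hypercube sample within the given bounds
    (one (lo, up) pair per dimension). An illustrative DoE choice, not
    necessarily the one used in the paper."""
    rng = random.Random(seed)
    cols = []
    for lo, up in bounds:
        # one stratified value per bin of n equal-width bins, then shuffle
        col = [lo + (up - lo) * (i + rng.random()) / n for i in range(n)]
        rng.shuffle(col)
        cols.append(col)
    return [tuple(c[i] for c in cols) for i in range(n)]
```

Every one-dimensional projection of the result contains exactly one point per bin, which is what gives the initial set its global coverage of the design space.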

ED

4.3 Accuracy and robustness evaluation metrics

To assess the prediction accuracy of different active learning RBF metamodeling techniques, two accuracy metrics, the relative maximum absolute error (RMAE) and the relative root mean square error (RRMSE), are adopted. RMAE and RRMSE indicate the local and global accuracy of the RBF metamodel, respectively; low values of both are desired. These two accuracy metrics can be mathematically described as

RMAE = (1/STD)·max|yi − ŷi|, i = 1, ..., N
RRMSE = (1/STD)·sqrt((1/N)·Σ_{i=1}^{N} (yi − ŷi)^2)        (21)
STD = sqrt((1/(N−1))·Σ_{i=1}^{N} (yi − ȳ)^2)
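The accuracy metrics of Eq. (21) translate directly into code, for example:

```python
import math

def rmae_rrmse(y_true, y_pred):
    """Relative maximum absolute error and relative root mean square error,
    both normalized by the sample standard deviation of the true responses."""
    n = len(y_true)
    mean = sum(y_true) / n
    std = math.sqrt(sum((y - mean) ** 2 for y in y_true) / (n - 1))
    rmae = max(abs(t - p) for t, p in zip(y_true, y_pred)) / std
    rrmse = math.sqrt(sum((t - p) ** 2
                          for t, p in zip(y_true, y_pred)) / n) / std
    return rmae, rrmse
```

Normalizing by STD makes the two metrics comparable across test functions with very different response scales.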

where N is the number of validation points, yi is the actual response at the i-th validation point, and ŷi is the predicted value from the RBF metamodel at the i-th validation point; ȳ and STD are the mean and standard deviation of all observed response values, respectively. The robustness of a metamodel refers to its capability of achieving favorable accuracy for different types of problems with different sample sizes (Zhao & Xue, 2010). In this paper, the standard deviations of RMAE and RRMSE calculated using Eq. (21) are used as the

robustness metrics. The smaller the values of the standard deviation are, the more robust the metamodel is.

4.4 Results and discussion


For comparison, four other active learning RBF metamodeling techniques were tested: (1) a maximin scaled distance based active learning RBF metamodeling approach (MMD-ALRBF) (Jin, et al., 2002); (2) a curvature enhanced active learning RBF metamodeling approach (CE-ALRBF) (Wei, et al., 2012); (3) a gradient enhanced active learning RBF metamodeling approach (GE-ALRBF) (Yao, et al., 2009); and (4) a weighted accumulative error based active learning RBF metamodeling approach (WAE-ALRBF) (P. Jiang, et al., 2015). In each numerical case, an additional 1600 validation points within the design space are randomly selected to calculate the accuracy and robustness metrics for each metamodeling technique under different circumstances (e.g., different problem characteristics and different sample sizes). For each metamodeling approach, 60 different runs are performed on the numerical examples, and the average values of the prediction accuracy metrics are taken as the final results to avoid unrepresentative numerical results. The Aute (Type 1) and Hart6 (Type 2) functions are used as an illustration. Fig. 8 and Fig. 9 depict the standard deviations as error bars of the RMAE and RRMSE for the different approaches under the two sample sizes for the Aute (Type 1) and Hart6 (Type 2) functions, where the lower bounds (one standard deviation below the mean value) and upper bounds (one standard deviation above the mean value) are shown. As illustrated in Fig. 8 and Fig. 9, SRP-ALRBF performs better than the other ALRBF approaches, especially with a small sample size. The error bars of the RMAE and RRMSE for the remaining cases are presented in the supplemental materials. Although some error bars for SRP and the other approaches overlap, the magnitudes of the error metrics in most cases are relatively small (e.g., on the order of 1E-3 or 1E-5).
Table 2 summarizes the mean values and standard deviations (in parentheses) of the accuracy metrics for the different metamodeling approaches under the two sample sizes. In Table 2, the ranking of each mean value among the five metamodeling approaches is marked with curly brackets ({}). Two intuitive conclusions can be drawn from Table 2: (1) on average, the proposed SRP-ALRBF approach performs better than the other four metamodeling approaches; (2) for most numerical cases, the proposed SRP-ALRBF is demonstrated to be the best in terms of both global and local accuracy, especially for the examples of Type 2. Table 3 summarizes the average ranking results for the five approaches considering all numerical cases. As observed from Table 3, the average ranking of the proposed SRP-ALRBF is the best among all approaches with both small and large sample sets. WAE-ALRBF ranks second and third with the small and large sample sets, respectively. The average rank of GE-ALRBF is the worst with the small sample set, while MMD-ALRBF ranks last with the large sample set. In addition, p-values are used to test the differences between approaches over multiple data sets. The Bergmann-Hommel procedure is adopted to calculate the adjusted p-values (Garcia & Herrera, 2008), as listed in Table 4. All p-values are less than 0.01, indicating that there are significant differences in prediction performance between the proposed SRP-ALRBF approach and the four existing active learning RBF metamodeling techniques. More detailed analysis of the comparison results is presented in the following subsections.


Figure 8. Error bars of RMAE and RRMSE for the different approaches on the Aute function: (a) RMAE under the small set; (b) RRMSE under the small set; (c) RMAE under the large set; (d) RRMSE under the large set.


Figure 9. Error bars of RMAE and RRMSE for the different approaches on the Hart6 function: (a) RMAE under the small set; (b) RRMSE under the small set; (c) RMAE under the large set; (d) RRMSE under the large set.

Table 2. Summary of the accuracy results for the numerical cases. For each test function and accuracy metric, the five entries per sample set are the mean values of MMD-ALRBF, CE-ALRBF, GE-ALRBF, WAE-ALRBF, and SRP-ALRBF, in that order, with the rank in curly brackets and the standard deviation in parentheses.

SC, RMAE
  Small set (n=40): 0.0643{2}(0.0049), 0.0797{3}(0.0044), 0.0866{5}(0.0051), 0.0837{4}(0.0055), 0.0614{1}(0.0050)
  Large set (n=80): 0.0414{5}(0.0056), 0.0063{2}(0.0005), 0.0155{4}(0.0049), 0.0088{3}(0.0004), 0.0019{1}(0.0002)
SC, RRMSE
  Small set (n=40): 0.0115{2}(0.0022), 0.0122{3}(0.0018), 0.0144{4}(0.0020), 0.0176{5}(0.0021), 0.0114{1}(0.0023)
  Large set (n=80): 0.0018{5}(1.73E-04), 0.0006{2}(5.10E-05), 0.0016{4}(1.93E-04), 0.0011{3}(2.01E-04), 0.0002{1}(2.18E-05)
Aute, RMAE
  Small set (n=40): 0.2539{3}(0.0433), 0.2804{4}(0.0419), 0.3199{5}(0.0399), 0.2213{2}(0.0335), 0.0643{1}(0.0086)
  Large set (n=80): 0.0832{5}(0.0108), 0.0679{3}(0.0114), 0.0703{4}(0.0097), 0.0596{2}(0.0078), 0.0571{1}(0.0072)
Aute, RRMSE
  Small set (n=40): 0.0325{3}(0.0043), 0.0334{4}(0.0051), 0.0408{5}(0.0054), 0.0221{2}(0.0019), 0.0133{1}(0.0012)
  Large set (n=80): 0.0199{5}(0.0016), 0.0166{4}(0.0017), 0.0132{3}(0.0016), 0.0117{1}(0.0020), 0.0119{2}(0.0025)
RB, RMAE
  Small set (n=40): 0.1755{2}(0.0198), 0.2527{5}(0.0195), 0.2473{4}(0.0165), 0.2139{3}(0.0236), 0.1272{1}(0.0209)
  Large set (n=80): 0.0090{3}(0.0012), 0.0133{5}(0.0020), 0.0109{4}(0.0018), 0.0035{1}(0.0005), 0.0046{2}(0.0005)
RB, RRMSE
  Small set (n=40): 0.0100{1}(0.0021), 0.0222{4}(0.0019), 0.0270{5}(0.0023), 0.0107{3}(0.0021), 0.0101{2}(0.0024)
  Large set (n=80): 0.0006{5}(4.94E-05), 0.0002{1.5}(4.93E-05), 0.0005{4}(4.48E-05), 0.0003{3}(5.03E-05), 0.0002{1.5}(5.34E-05)
Jin, RMAE
  Small set (n=40): 0.0151{4}(0.0018), 0.0188{5}(0.0018), 0.0123{2}(0.0018), 0.0138{3}(0.0020), 0.0055{1}(0.0004)
  Large set (n=80): 7.00E-04{5}(5.5E-05), 5.03E-04{3}(5.62E-05), 2.01E-04{2}(5.04E-05), 5.06E-04{4}(4.85E-05), 1.98E-04{1}(4.97E-05)
Jin, RRMSE
  Small set (n=40): 0.0011{1}(2.18E-04), 0.0016{5}(1.88E-04), 0.0013{3}(2.12E-04), 0.0014{4}(1.81E-04), 0.0012{2}(2.22E-04)
  Large set (n=80): 5.05E-05{4}(3.21E-06), 2.38E-05{3}(2.7E-06), 9.91E-06{1}(2.9E-06), 5.65E-05{5}(3.03E-06), 1.64E-05{2}(3.01E-06)
Booth, RMAE
  Small set (n=40): 0.0100{3}(0.0020), 0.0232{4}(0.0046), 0.0251{5}(0.0043), 0.0067{2}(0.0015), 0.0066{1}(0.0014)
  Large set (n=80): 0.0028{5}(2.16E-04), 0.0014{2}(1.42E-04), 0.0020{3}(1.56E-04), 0.0023{4}(2.36E-04), 0.0013{1}(1.25E-04)
Booth, RRMSE
  Small set (n=40): 0.0010{1}(1.36E-04), 0.0020{3}(2.15E-04), 0.0021{4}(2.04E-04), 0.0037{5}(1.89E-04), 0.0012{2}(1.50E-04)
  Large set (n=80): 1.44E-04{2}(1.44E-05), 2.13E-04{3}(2.05E-05), 2.52E-04{4}(2.06E-05), 3.65E-04{5}(1.95E-05), 1.21E-04{1}(1.59E-05)
ME, RMAE
  Small set (n=40): 3.5983{5}(0.4653), 2.3176{1}(0.2819), 2.5729{4}(0.2951), 2.4873{3}(0.3371), 2.3413{2}(0.3070)
  Large set (n=80): 3.0165{5}(0.4825), 2.0681{2}(0.2966), 2.1131{4}(0.3230), 2.0698{3}(0.3133), 1.6261{1}(0.2051)
ME, RRMSE
  Small set (n=40): 0.7174{5}(0.0967), 0.5202{3}(0.0885), 0.5290{4}(0.0665), 0.4777{1}(0.0755), 0.4806{2}
  Large set (n=80): 0.5149{5}, 0.3229{3}, 0.3977{4}, 0.2406{2}, 0.2126{1}
Li, RMAE
  Small set (n=40): 0.8472{5}(0.0799), 0.3866{2}(0.0363), 0.4649{4}(0.0421), 0.3951{3}(0.0381), 0.2341{1}(0.0193)
  Large set (n=80): 0.1294{5}(8.99E-03), 0.0050{2}(4.64E-04), 0.0207{4}(2.00E-03), 0.0074{3}(4.41E-04), 0.0038{1}(2.65E-04)
Li, RRMSE
  Small set (n=40): 0.0681{5}(0.0073), 0.0393{2}(0.0042), 0.0417{3}(0.0035), 0.0438{4}(0.0041), 0.0222{1}(0.0016)
  Large set (n=80): 0.0098{5}(1.00E-03), 0.0003{1.5}(4.22E-04), 0.0012{4}(8.53E-05), 0.0008{3}(9.87E-05), 0.0003{1.5}(3.87E-04)
DP, RMAE
  Small set (n=40): 3.7695{5}(0.0412), 2.6995{2}(0.0381), 3.4009{4}(0.0366), 2.7387{3}(0.0279), 2.2768{1}(0.0027)
  Large set (n=80): 2.2944{5}(0.0266), 1.6337{2}(0.0225), 2.0228{4}(0.0243), 1.7738{3}(0.0204), 1.5369{1}(0.0011)
DP, RRMSE
  Small set (n=40): 0.7863{5}(0.0791), 0.4640{3}(0.0500), 0.6262{4}(0.0495), 0.3539{2}, 0.3501{1}(0.0410)
  Large set (n=80): 0.4706{5}(0.0393), 0.3376{3}(0.0292), 0.3418{4}(0.0250), 0.2906{1}(0.0185), 0.2917{2}(0.0242)
Hart6, RMAE
  Small set (n=40): 3.6222{5}(0.3925), 1.9957{2}(0.2431), 2.4085{4}(0.3506), 2.1621{3}(0.2123), 1.2072{1}(0.1002)
  Large set (n=80): 2.8418{5}(0.3226), 1.4960{2}(0.2178), 1.7528{4}(0.2375), 1.6080{3}(0.1800), 0.9492{1}(0.0870)
Hart6, RRMSE
  Small set (n=40): 0.7400{5}(0.0935), 0.4105{4}(0.0386), 0.3874{2}(0.0461), 0.4084{3}(0.0368), 0.3286{1}(0.0297)
  Large set (n=80): 0.5988{5}(0.0562), 0.3635{4}(0.0429), 0.3198{2}(0.0330), 0.3409{3}(0.0330), 0.2851{1}(0.0267)
BM, RMAE
  Small set (n=40): 3.2027{5}(0.0386), 2.4308{3}(0.0818), 3.1820{4}(0.0950), 1.8685{2}(0.0397), 1.1451{1}(0.0228)
  Large set (n=80): 2.6941{5}(0.0300), 1.5076{3}(0.0173), 2.4748{4}(0.0087), 1.2210{2}(0.0279), 0.8697{1}(0.0165)
BM, RRMSE
  Small set (n=40): 0.6791{5}(0.0705), 0.5005{3}(0.0466), 0.6097{4}(0.0616), 0.4960{2}(0.0439), 0.4794{1}(0.0532)
  Large set (n=80): 0.5452{5}(0.0501), 0.4023{3}(0.0338), 0.5095{4}(0.0450), 0.3692{2}(0.0448), 0.2816{1}(0.0260)

Table 3. Average ranking results for the five approaches considering all numerical cases.

                       Small set (n=40)                               Large set (n=80)
Accuracy    MMD-    CE-     GE-     WAE-    SRP-        MMD-    CE-     GE-     WAE-    SRP-
metrics     ALRBF   ALRBF   ALRBF   ALRBF   ALRBF       ALRBF   ALRBF   ALRBF   ALRBF   ALRBF
RMAE        3.9     3.1     4.1     2.8     1.1         4.8     2.6     3.7     2.8     1.1
RRMSE       3.3     3.4     3.8     3.1     1.4         4.6     2.8     3.4     2.8     1.4

Table 4. Adjusted p-values obtained in the numerical examples by the Bergmann-Hommel dynamic procedure.

Small set (n=40)
i   hypothesis                    pi-value
1   GE-ALRBF vs. SRP-ALRBF        6.664E-07
2   MMD-ALRBF vs. SRP-ALRBF       1.560E-05
3   CE-ALRBF vs. SRP-ALRBF        2.534E-04
4   WAE-ALRBF vs. SRP-ALRBF       0.003

Large set (n=80)
i   hypothesis                    pi-value
1   MMD-ALRBF vs. SRP-ALRBF       5.200E-11
2   GE-ALRBF vs. SRP-ALRBF        2.535E-05
3   WAE-ALRBF vs. SRP-ALRBF       0.008
4   CE-ALRBF vs. SRP-ALRBF        0.008

ACCEPTED MANUSCRIPT 4.4.1 Overall performance

RMAE

RRMSE (a) Accuracy

CR IP T

AN US

STD

MEAN

Fig. 10 shows the means and standard deviations of the RMAE and RRMSE metrics for all the ALRBF metamodeling approaches for numerical cases of small and large sample sizes. As illustrated in Fig. 10 (a), the accuracy of the SRP-ALRBF approach performs the best in terms of both local and global accuracy metrics, and WAE-ALRBF performs the second best, followed by the CE-ALRBF and GE-ALRBF. MMD-ALRBF has the largest mean RMAE and RRMSE values among the other four metamodeling approaches. As seen in Fig. 10 (b), the SRP-ALRBF approach also exhibits the best in terms of robustness. The robustness performance of WAE-ALRBF is close to CE-ALRBF. GE-ALRBF is slightly better than MMD-ALRBF in RRMSE. In summary, the prediction performance of the proposed SRPALRBF approach is the best in terms of both average accuracy and robustness for all numerical examples with small and large sample sizes.

RMAE

RRMSE

(b) Robustness

M

Figure 10. The overall performance of different ALRBF metamodeling techniques (a) Accuracy (b) Robustness.

ED

4.4.2 Performance for different types of problems under small and large sample sizes

AC

CE

PT

Fig.11 and Fig. 12 depict the prediction accuracy and robustness of each metamodeling approach for different types of problems with small and large sample sizes. In Fig. 11 and Fig.12, “Type 1” and “Type 2”denote the types of the test cases, while “Small” and “Large” represent the sizes of the sample points. That is, “Type1 Small” indicates the problem of Type 1 with small sample size. As illustrated in Fig. 11 (a) and Fig. 11 (b), the local average prediction accuracy and robustness of the proposed SRP-ALRBF are better than the other four metamodeling approaches for Type 2 problems with both small and large sample sizes. WAE-ALRBF has a comparable local prediction performance as CE-ALRBF, except that its local prediction robustness is slightly inferior to CE-ALRBF for Type 2 problems under large sample sizes. The local prediction accuracy and robustness of GE-ALRBF is between WAE-ALRBF and MMD-ALRBF for different types of problems under different sample sizes. As seen in Fig. 11 (c) and Fig. 11 (d), SRP-ALRBF performs the best in terms of global accuracy and robustness except for its slightly lower global prediction robustness than WAE-ALRBF for Type 2 problems with small sample sizes. It is noted that MMD-ALRBF shows good global accuracy and robustness for Type 1 problems with small sample sizes. This can be explained by the fact that the response characteristics of these test cases are smooth everywhere. As a result, MMD-ALRBF is a better match due to that it focuses on filling the entire input space

ACCEPTED MANUSCRIPT

CR IP T

STD

MEAN

uniformly. In terms of the global prediction performance for Type 1 problems with large sample sizes, there are no distinct differences among these five ALRBF methods.

Type 1 Small Type 1 Large Type 2 Small Type 2 Large (b) Robustness in RMAE

M

STD

MEAN

AN US

Type 1 Small Type 1 Large Type 2 Small Type 2 Large (a) Accuracy in RMAE

Type 1 Small Type 1 Large Type 2 Small Type 2 Large (c) Accuracy in RRMSE

Type 1 Small Type 1 Large Type 2 Small Type 2 Large (d) Robustness in RRMSE

ED

Figure 11.Prediction performances of different ALRBF metamodeling techniques for two types of problems under small and large sample sizes. (a) Accuracy in RMAE (b) Robustness in RMAE (c) Accuracy in RRMSE (d) Robustness in RRMSE.

AC

CE

PT

In Fig.12, the prediction accuracy and robustness are derived for single contributing factors, i.e., different problem types and different sample sizes. It is concluded that all ALRBF metamodeling techniques perform better for Type 1 problems than that for Type 2 problems. Meanwhile, it is obvious that the average accuracy and robustness of all the metamodeling techniques will improve as the sample set is increased from a small to large size. For the small sample size, it is observed that the local accuracy and robustness of the SRP-ALRBF approach outperforms other four metamodeling approaches, while its global accuracy and robustness is close to those of WAE-ALRBF. For the large sample size, it is concluded that the proposed SRP-ALRBF approach is the best in terms of both RMAE and RRMSE. CE-ALRBF achieves the next best prediction robustness, while its mean accuracy is worse than WAE-ALRBF. Considering the problem types, the proposed SRP-ALRBF method outperforms the other four approaches for Type 2, which is attributed to its ability to trace the sensitive region and then arrange a high-priority for this region to be filled with experiments. The proposed SRP-ALRBF approach has a comparable prediction performance as that of WAE-ALRBF and CE-ALRBF for Type1, in which the test cases exhibit smooth behavior in the design space.

Large Set Type 1 (a) Accuracy in RMAE

Type 2

Small Set

Large Set Type 1 (c) Accuracy in RRMSE

Type 2

Small Set

Large Set Type 1 (b) Robustness in RMAE

AN US

STD

MEAN

Small Set

CR IP T

STD

MEAN

ACCEPTED MANUSCRIPT

Large Set Type 1 (d) Robustness in RRMSE

Type 2

M

Small Set

Type 2

ED

Figure 12. Prediction performances comparison for single contributing factors (a) Accuracy in RMAE (b) Robustness in RMAE (c) Accuracy in RRMSE (d) Robustness in RRMSE. 4.5 Sensitivity analysis of the number of initial sample points for the proposed approach

AC

CE

PT

It is found that the number of initial sample points may influence the performance of the proposed method. Some problems may prefer fewer initial sample points to allow more fill-in points selected in the active learning process, while some other problems may be the opposite. Generally, it is suggested that the number of initial sample points depends on the dimension of the problems under study and the problem properties. To investigate the impact of initial sample points on the prediction performance of the proposed approach, five different levels of initial sample sizes, 30%, 40%, 50%, 60%, and 70% of the total available sample points in large sample sets, are taken into consideration for the SC (Type 1) and Hart6 (Type 2) functions. Taking the values of the two accuracy metrics under the 50% of the total available sample points as a reference, Fig. 13 plots the relative improvement/decrement percentages in RMAE and RRMSE for different levels of initial sample sizes compared to 50% of the total available sample points. In Fig. 13, a positive value implies a superior local/global performance than that of the 50% of the total available sample points. The larger the value is, the higher level of excellence is. It can be concluded from Fig.13 that the initial sample size, half of the total available sample points, adopted in this study is suitable for those numerical test cases. In addition, it can also be observed that for these numerical cases, a very large initial sample size will reduce the available number of sample points in the active learning sampling process, leading to the problem that some

ACCEPTED MANUSCRIPT

(a) Effects of initial sample sizes on SC function

CR IP T

Values of RMAE and RRMSE

Values of RMAE and RRMSE

sensitive regions cannot be fully exploited. Yet, a very small initial sample size may not be large enough to obtain a full knowledge of the original design space, which also results in some sensitive but undiscovered regions.

(b) Effects of initial sample sizes on Hart6 function

Figure 13.The sensitive analysis results of the initial sample sizes.

AN US

4.6 Running time of each active learning RBF approach

PT

ED

M

In this research, all the models are constructed on the computational platform with a 3.30GHz Intel (R) Core (TM)i3 CPU and 4GB RAM. Computational times required to obtain a new sample point and update the RBF metamodel for the numerical test cases are recorded. MMD-ALRBF is the most efficient approach, which requires less than five seconds to conduct new sample points searching and metamodel updating, even for the nonlinear cases with high dimensions. SRP-ALRBF needs about 10-20 seconds to obtain a new sample point and update the RBF metamodel, followed by WAE-ALRBF that requires 0.5-1 minutes to select a new sample point for updating the RBF metamodel. GE-ALRBF and CE-ALRBF require about 12 minutes, which attributes to the relatively time-consuming process of deriving the RBF functions in GE-ALRBF and CE-ALRBF. It is noted that although additional computational cost is required in the active learning process for each RBF approach, this cost is more likely offset by the computationally intensive simulation and analysis to obtain the response value at each sample point.

CE

5. Engineering cases

AC

In this section, three engineering cases, prediction of the maximum stress for a SWATH, prediction of the maximum stress for a long cylinder pressure vessel, and modeling aerodynamic data for four-digit NACA airfoil, are used to test the performance of the proposed approach. Since it is time consuming to obtain simulation results at sample points, only a single run is performed for all approaches in these three engineering cases. 5.1 Engineering case I: Prediction of the maximum stress for a SWATH The Small Waterplane Area Twin Hull (SWATH) catamaran integrating the advantages of submarines, hydrofoil boats and catamarans has gained great attention in recent years. In the concept design of a SWATH catamaran, the maximum structural stress generated by the waves hit should be considered because an excess structural stress will lead to the failure of the hull structure. For a SWATH catamaran, the external load mainly contains the transverse load in the wave. The maximum structural stress occurs in circumstances when the hull body

ACCEPTED MANUSCRIPT is subject to horizontal wave force. To simplify this engineering problem, in this work the SWATH catamaran at zero speed is considered. The structure of the SWATH catamaran is sketched in Fig.14. Three independent design variables are considered: strut shell thickness tt , strut thickness t s , and transverse bulk heads thickness th . Other design parameters are fixed in finite element modeling. According to the guide of design the SWATH from the China Classification Society (CCS), the horizontal wave force can be calculated by (22) FS  9.81DTL where  is the displacement, and D and T can be calculated by D  3.2376  0.5452log  T =1.754  d / 1/3 L  0.75  0.35tanh(0.5Lsd  6.0)

CR IP T

(23) (24) (25)

AN US

In the above equations, d is the designed draft, tanh denotes the hyperbolic tangent function, and Lsd can be calculated as (26) Lsd  3.2984  I S / ()1/3 where I S is the length of the pillar at the waterline The ranges and values for these design variables are listed in Table 5. The material property is shown in Table 6.

ED

M

Gravitational load

AC

CE

PT

Transverse bulkhead Pillar Waterline Torpedo-shaped lower body Baseline

Transverse force Buoyancy

Figure 14.Schematic plot of the structure of SWATH.

ACCEPTED MANUSCRIPT

Table 6. Material properties. Parameters

Values 300t 36m 3.9m 24m

CR IP T

Table 5. Ranges and values of the inputs. Inputs Ranges Strut shell thickness tt 5.5-8 mm Strut thickness ts 600-1000 mm Transverse bulk heads thickness th 5.5-8 mm Displacement  Whole hull length Loa Designed draft d Length of the pillar I S -

Values

M

AN US

2 105 Mpa Young’s Modulus E 8 103 kg / m3 Density  0.3 Poisson’s Ratio  The relationships between the maximum structural stress and the three design variables cannot be explicitly expressed. It is a typical black-box problem. The proposed SRP-ALRBF metamodeling technique is applied to predict the maximum structural stress levels for different combinations of design variables. In this paper, ANSYS 17.0 is adopted as a simulation tool to obtain the maximum structural stress. Due to the symmetry of the SWATH structural only half of the model is analyzed, which is shown in Fig. 15. In Fig. 15 (a), the structured grid is used in the finite element model and the total number of notes is about 7,000. One of the simulation results is shown in Fig. 15 (b).

AC

CE

PT

ED

Number of nodes:7,000

(a)Grid model

(b)Simulation analysis

Figure 15. The finite element model for the structure of the SWATH

To make a comparison between the ALRBF metamodeling approaches, the available sample points for the finite element simulation are limited to 30 and the initial number of sample points is set to be 15 in this engineering case. Additional 10 verification points are randomly generated to calculate the RMAE and RRMSE metrics. The sampling points and corresponding maximum structural stress for each metamodeling approach are summarized in

ACCEPTED MANUSCRIPT the supplemental material. Table 7 lists the comparison prediction result of each metamodeling approach. It is observed that the proposed SRP-ALRBF provides the most accurate metamodel in which both RMAE and RRMSE are the smallest. The accuracy of the metamodel constructed by CE-ALRBF is somewhere in between that of WAE-ALRBF and SRP-ALRBF. Fig. 16 plots the actual simulation values and the corresponding predicted values from the SRP-ALRBF metamodel. In Fig. 16, the red straight line denotes that the true response values are equal to the predicted values from the metamodel. It is found that the points of the proposed SRP-ALRBF approach are very close to the straight line. Table 7. Accuracy comparison results for the first engineering case.

 max

Accuracy metrics RMAE RRMSE

MMDALRBF 1.2499 0.6160

CEALRBF 1.2254 0.5343

WAEALRBF 1.2453 0.5962

SRPALRBF 0.4788 0.2099

M

AN US

Predicted values at verification points

( Mpa )

GEALRBF 4.1397 2.2609

CR IP T

Maximum stress

( Mpa )

ED

True values at verification points

PT

Figure 16. True and predicted values at the verification points for the proposed SRP-ALRBF approach.

CE

5.2 Engineering case II: Prediction of the maximum stress for a long cylinder pressure vessel

AC

The second engineering example is the prediction of the maximum stress for a long cylinder pressure vessel. The geometry, model parameters, and loading force of the long cylinder pressure vessel are illustrated in Fig. 17 (Zhou, et al., 2015). The cylinder pressure vessel is subject to a uniformly distributed load P  23MPa . Five continuous design variables are included: the height of the end part h1 , the inside diameter of the end part r1 , the thickness of the end part t1 , the inside diameter of the body part r2 and the thickness of the body part t2 . The range of the design variables are listed in Table 8. Other geometric parameters are predefined and fixed. For more details of this engineering case, please refer to Zhou et al. (Zhou, et al., 2015).


Table 8. Ranges of the design variables.

Design variable                          Range (mm)
Height of the end part h1                280-320
Inside diameter of the end part r1       40-50
Thickness of the end part t1             19-27
Inside diameter of the body part r2      165-205
Thickness of the body part t2            13-23


Figure 17. Schematic plot of the cylinder pressure vessel (Zhou, et al., 2015).


In this work, ANSYS 17.0 is used as the simulation tool for the stress response. A 3-D finite element model with hexahedral meshes, which exploits the axial symmetry of the vessel, is adopted; it is depicted in Fig. 18. To compare the ALRBF metamodeling approaches, the available sample points for the FEM simulation are limited to 20 and the initial number of sample points is set to 10 in this engineering case. An additional 10 verification points are randomly generated to calculate the RMAE and RRMSE metrics. The sampling points and the corresponding maximum structural stress for each metamodeling approach are summarized in the supplemental material. Table 9 lists the prediction results of each metamodeling approach. It is observed that the proposed SRP-ALRBF provides the most accurate metamodel, with both the smallest RMAE and the smallest RRMSE.


Figure 18. 3-D simulation model for the cylinder pressure vessel: (a) grid model; (b) simulation analysis.

Table 9. Accuracy comparison results for the second engineering case.

Response: maximum stress σmax

Accuracy metric   MMD-ALRBF   CE-ALRBF   GE-ALRBF   WAE-ALRBF   SRP-ALRBF
RMAE              1.1937      1.3892     0.9490     1.0428      0.7496
RRMSE             0.6906      1.0075     0.5377     0.4925      0.3556

5.3 Engineering case III: Modeling aerodynamic data for a four-digit NACA airfoil


In this subsection, the problem of modeling aerodynamic data for a four-digit NACA airfoil is considered (Leifsson & Koziel, 2015). The proposed SRP-ALRBF approach is utilized to generate a metamodel for the lift coefficient CL as a function of three independent variables: the maximum ordinate of the mean camber line as a fraction of chord m, the chordwise position of the maximum ordinate p, and the thickness-to-chord ratio t/c. The ranges of the independent variables are 0 ≤ m ≤ 0.03, 0.4 ≤ p ≤ 0.8, and 0.07 ≤ t/c ≤ 0.12, respectively. A C-type computational domain, as shown in Fig. 19, is used for the CFD analysis of the airfoil. The boundaries are placed at 25 times the airfoil length above and below the airfoil, 25 times the airfoil length in front of it, and 50 times the airfoil length behind it.
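The three variables m, p, and t/c are the parameters of the standard NACA four-digit parameterization, so the airfoil geometry fed to the CFD solver can be reconstructed from them. A hedged sketch of the standard camber-line and thickness formulas follows (the lift coefficient itself still requires the CFD analysis and is not computed here):

```python
import numpy as np

def naca4_camber(x, m, p):
    # Mean camber line of a NACA four-digit airfoil (chord-normalized x):
    #   y_c = m/p^2 * (2*p*x - x^2)                  for 0 <= x <= p
    #   y_c = m/(1-p)^2 * ((1 - 2*p) + 2*p*x - x^2)  for p <  x <= 1
    x = np.asarray(x, dtype=float)
    return np.where(x <= p,
                    m / p**2 * (2 * p * x - x**2),
                    m / (1 - p)**2 * ((1 - 2 * p) + 2 * p * x - x**2))

def naca4_thickness(x, t):
    # Half-thickness distribution for thickness-to-chord ratio t
    # (standard open trailing-edge coefficients).
    x = np.asarray(x, dtype=float)
    return 5 * t * (0.2969 * np.sqrt(x) - 0.1260 * x - 0.3516 * x**2
                    + 0.2843 * x**3 - 0.1015 * x**4)
```

The camber line attains its maximum m at x = p, consistent with the variable ranges above.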

Figure 19. Flow domain of the NACA 0012 airfoil for CFD analysis.

In this work, FLUENT 17.0 is used as the simulation tool for obtaining the lift coefficient. The grid of the simulation model consists of about 77,000 elements, as shown in Fig. 20. To compare the ALRBF metamodeling approaches, the available sample points for the CFD simulation are limited to 20 and the initial number of sample points is set to 10 in this engineering case. An additional 10 verification points are randomly generated to calculate the RMAE and RRMSE metrics. The sampling points and corresponding responses for each metamodeling approach are summarized in the supplemental material. Table 10 summarizes the accuracy comparison results of the five approaches. It is again observed that the proposed SRP-ALRBF performs the best.

Figure 20. CFD grids of the simulation model for the four-digit NACA airfoil: (a) farfield of the simulation model; (b) close to the airfoil surface.

Table 10. Accuracy comparison results for the third engineering case.

Response: lift coefficient CL

Accuracy metric   MMD-ALRBF   CE-ALRBF   GE-ALRBF   WAE-ALRBF   SRP-ALRBF
RMAE              1.2175      1.4040     1.7070     1.3625      0.9210
RRMSE             0.7053      1.1162     0.6714     0.6825      0.4069

6. Conclusion


In this paper, an active learning RBF metamodeling approach is proposed to obtain an RBF metamodel with desirable accuracy under the constraint of limited computational resources. In the SRP-ALRBF approach, the whole design space is divided into several subspaces by SOMs. The division criterion is developed according to LOO errors, which reflect the sensitivity of the RBF metamodel to the loss of sample information. The boundary of the sensitive region is then determined by the topological graph generated by cluster analysis in the SOMs. Finally, an optimization model is constructed in the sensitive region to find the new sample points. The proposed SRP-ALRBF approach is compared to four other active learning RBF metamodeling techniques (MMD-ALRBF, CE-ALRBF, GE-ALRBF and WAE-ALRBF) using ten numerical examples with different degrees of complexity and three engineering design problems. For the same number of simulation evaluations and in terms of both local and overall accuracy, the proposed SRP-ALRBF metamodeling approach significantly outperforms the other four metamodeling methods, especially for the Type 2 problems, in which the relationships between the response characteristics and the design variables are multi-modal and non-smooth. The prediction performance of CE-ALRBF is close to that of WAE-ALRBF, whereas its computational cost is about ten times higher than that of WAE-ALRBF. It is also observed that although MMD-ALRBF is the most efficient approach, it provides the worst local and global prediction accuracy and robustness among the five, because it does not consider the characteristics of the output space.
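The LOO-error indicator at the core of SRP-ALRBF can be illustrated with a short sketch: for each sample, an RBF model is refit on the remaining points, and the prediction error at the left-out point is recorded. This is only a minimal illustration built on SciPy's `RBFInterpolator` (SciPy >= 1.7) with its default thin-plate-spline kernel; the paper's own RBF formulation and the subsequent SOM clustering of the errors are not reproduced here.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def loo_errors(X, y):
    # Leave-one-out errors of an RBF interpolant: large values flag
    # regions where the metamodel is sensitive to losing a sample.
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i          # drop the i-th sample
        rbf = RBFInterpolator(X[mask], y[mask])
        errs[i] = abs(rbf(X[i:i + 1])[0] - y[i])
    return errs
```

Samples with the largest LOO errors mark the sensitive region in which the new infill point is then sought.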

Since some complex engineering problems may yield multiple responses in one simulation, as part of our future work the proposed SRP-ALRBF approach will be extended to solve engineering problems with multiple output responses. Overall, as a novel active learning RBF metamodeling approach, SRP-ALRBF shows good potential for simulation-based design problems.

Acknowledgements

This research has been supported by the National Natural Science Foundation of China (NSFC) under Grant No. 51505163, No. 51421062 and No. 51323009, the National Basic Research Program (973 Program) of China under Grant No. 2014CB046703, and the Fundamental Research Funds for the Central Universities, HUST: Grant No. 2016YXMS272. The authors would also like to thank the anonymous referees for their valuable comments.

Appendix A. Supplementary material

Supplementary data associated with the three engineering cases can be found in the attached supplementary material.

References


Ajdari, A., & Mahlooji, H. 2013. An Adaptive Exploration-Exploitation Algorithm for Constructing Metamodels in Random Simulation Using a Novel Sequential Experimental Design. Communications in Statistics - Simulation and Computation, 43, 947-968. Aute, V., Saleh, K., Abdelaziz, O., Azarm, S., & Radermacher, R. 2013. Cross-validation based single response adaptive design of experiments for Kriging metamodeling of deterministic computer simulations. Structural and Multidisciplinary Optimization, 48, 581-605. Bekel, H., Heidemann, G., & Ritter, H. 2005. Interactive image data labeling using selforganizing maps in an augmented reality scenario. Neural Networks, 18, 566-574. Christelis, V., & Mantoglou, A. 2016. Pumping Optimization of Coastal Aquifers Assisted by Adaptive Metamodelling Methods and Radial Basis Functions. Water Resources Management, 1-15. Chu, X.-Z., Gao, L., Qiu, H.-B., Li, W.-D., & Shao, X.-Y. 2010. An expert system using rough sets theory and self-organizing maps to design space exploration of complex products. Expert Systems with Applications, 37, 7364-7372. Coello, C.A.C. 2000. Use of a self-adaptive penalty approach for engineering optimization problems. Computers in Industry, 41, 113-127. Crombecq, K., De Tommasi, L., Gorissen, D., & Dhaene, T., 2009. A novel sequential design strategy for global surrogate modeling, Winter Simulation Conference. Publishing, pp. 731-742. Crombecq, K., Laermans, E., & Dhaene, T. 2011. Efficient space-filling and non-collapsing sequential design strategies for simulation-based modeling. European Journal of Operational Research, 214, 683-696. Eason, J., & Cremaschi, S. 2014. Adaptive sequential sampling for surrogate model generation with artificial neural networks. Computers & Chemical Engineering, 68, 220-232.



Eddy, D.C., Krishnamurty, S., Grosse, I.R., Wileden, J.C., & Lewis, K.E. 2015. A predictive modelling-based material selection method for sustainable product design. Journal of Engineering Design, 26, 365-390. Fang, H., & Horstemeyer, M.F. 2006. Global response approximation with radial basis functions. Engineering Optimization, 38, 407-424. Fang, H., Rais-Rohani, M., Liu, Z., & Horstemeyer, M.F. 2005. A comparative study of metamodeling methods for multiobjective crashworthiness optimization. Computers & Structures, 83, 2121-2136. Garcia, S., & Herrera, F. 2008. An extension on``statistical comparisons of classifiers over multiple data sets''for all pairwise comparisons. Journal of Machine Learning Research, 9, 2677-2694. Hardy, R.L. 1971. Multiquadric equations of topography and other irregular surfaces. Journal of geophysical research, 76, 1905-1915. Homaifar, A., Qi, C.X., & Lai, S.H. 1994. Constrained optimization via genetic algorithms. Simulation, 62, 242-253. Jiang, N., Luo, K., Beggs, P.J., Cheung, K., & Scorgie, Y. 2015. Insights into the implementation of synoptic weather‐type classification using self‐organizing maps: an Australian case study. International Journal of Climatology, 35, 3471-3485. Jiang, P., Shu, L., Zhou, Q., Zhou, H., Shao, X., & Xu, J. 2015. A novel sequential exploration-exploitation sampling strategy for global metamodeling. IFACPapersOnLine, 48, 532-537. Jin, R., Chen, W., & Simpson, T.W. 2001. Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and Multidisciplinary Optimization, 23, 1-13. Jin, R., Chen, W., & Sudjianto, A., 2002. On sequential sampling for global metamodeling in engineering design, ASME 2002 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. Publishing, pp. 539-548. Jin, R., Chen, W., & Sudjianto, A. 2005. An efficient algorithm for constructing optimal design of computer experiments. 
Journal of Statistical Planning and Inference, 134, 268-287. Kohonen, T. 1982. Self-organized formation of topologically correct feature maps. Biological cybernetics, 43, 59-69. Leifsson, L., & Koziel, S. 2015. Simulation-driven aerodynamic design using variable-fidelity models. World Scientific. Li, G., Aute, V., & Azarm, S. 2009. An accumulative error based adaptive design of experiments for offline metamodeling. Structural and Multidisciplinary Optimization, 40, 137-155. Li, M., Li, G., & Azarm, S. 2008. A Kriging Metamodel Assisted Multi-Objective Genetic Algorithm for Design Optimization. Journal of Mechanical Design, 130, 031401. Liang, H., Zhu, M., & Wu, Z. 2014. Using Cross-Validation to Design Trend Function in Kriging Surrogate Modeling. AIAA Journal, 52, 2313-2327. Liu, H., Xu, S., Wang, X., Meng, J., & Yang, S. 2016. Optimal Weighted Pointwise Ensemble of Radial Basis Functions with Different Basis Functions. AIAA Journal, 1-17. Meng, K., Dong, Z., & Wong, K. 2009. Self-adaptive radial basis function neural network for short-term electricity price forecasting. IET generation, transmission & distribution, 3, 325-335. Mullur, A.A., & Messac, A. 2005. Extended Radial Basis Functions: More Flexible and Effective Metamodeling. AIAA Journal, 43, 1306-1315. Nikkilä, J., Törönen, P., Kaski, S., Venna, J., Castrén, E., & Wong, G. 2002. Analysis and visualization of gene expression data using self-organizing maps. Neural networks, 15, 953-966. Peng, L., Liu, L., Long, T., & Yang, W. 2014. An efficient truss structure optimization framework based on CAD/CAE integration and sequential radial basis function metamodel. Structural and Multidisciplinary Optimization, 50, 329-346.

Prebeg, P., Zanic, V., & Vazic, B. 2014. Application of a surrogate modeling to the ship structural design. Ocean engineering, 84, 259-272. Qasem, S.N., Shamsuddin, S.M., & Zain, A.M. 2012. Multi-objective hybrid evolutionary algorithms for radial basis function neural network design. Knowledge-Based Systems, 27, 475-497. Sarimveis, H., Alexandridis, A., Mazarakis, S., & Bafas, G. 2004. A new algorithm for developing dynamic radial basis function neural network models based on genetic algorithms. Computers & chemical engineering, 28, 209-217. Shi, R., Liu, L., Long, T., & Liu, J. 2015. An efficient ensemble of radial basis functions method based on quadratic programming. Engineering Optimization, 48, 1202-1225. Su, P.-L., & Chen, Y.-S. 2012. Implementation of a genetic algorithm on MD-optimal designs for multivariate response surface models. Expert Systems with Applications, 39, 32073212. Sun, G., Li, G., Gong, Z., He, G., & Li, Q. 2011. Radial basis functional model for multiobjective sheet metal forming optimization. Engineering Optimization, 43, 1351-1366. Tripathy, M. 2010. Power transformer differential protection using neural network principal component analysis and radial basis function neural network. Simulation Modelling Practice and Theory, 18, 600-611. Tugrul, B., & Polat, H. 2014. Privacy-preserving kriging interpolation on partitioned data. Knowledge-Based Systems, 62, 38-46. Tyan, M., Nguyen, N.V., & Lee, J.-W. 2014. Improving variable-fidelity modelling by exploring global design space and radial basis function networks for aerofoil design. Engineering Optimization, 47, 885-908. van Dam, E.R., Husslage, B., den Hertog, D., & Melissen, H. 2007. Maximin Latin Hypercube Designs in Two Dimensions. Operations Research, 55, 158-169. Viana, F.A., Simpson, T.W., Balabanov, V., & Toropov, V. 2014. Special Section on Multidisciplinary Design Optimization: Metamodeling in Multidisciplinary Design Optimization: How Far Have We Really Come? 
AIAA Journal, 52, 670-690. Volpi, S., Diez, M., Gaul, N.J., Song, H., Iemma, U., Choi, K.K., Campana, E.F., & Stern, F. 2014. Development and validation of a dynamic metamodel based on stochastic radial basis functions and uncertainty quantification. Structural and Multidisciplinary Optimization, 51, 347-368. Wang, Y., Ni, H., & Wang, S. 2012. Nonparametric bivariate copula estimation based on shape-restricted support vector regression. Knowledge-Based Systems, 35, 235-244. Wei, X., Wu, Y.-Z., & Chen, L.-P. 2012. A new sequential optimal sampling method for radial basis functions. Applied Mathematics and Computation, 218, 9635-9646. Xiao, M., Gao, L., Xiong, H., & Luo, Z. 2015. An efficient method for reliability analysis under epistemic uncertainty based on evidence theory and support vector regression. Journal of Engineering Design, 1-25. Xiong, F., Xiong, Y., Chen, W., & Yang, S. 2009. Optimizing Latin hypercube design for sequential sampling of computer experiments. Engineering Optimization, 41, 793-810. Xu, S., Liu, H., Wang, X., & Jiang, X. 2014. A Robust Error-Pursuing Sequential Sampling Approach for Global Metamodeling Based on Voronoi Diagram and Cross Validation. Journal of Mechanical Design, 136, 071009. Yao, W., Chen, X., & Luo, W. 2009. A gradient-based sequential radial basis function neural network modeling method. Neural Computing and Applications, 18, 477-484. Yao, W., Chen, X., Zhao, Y., & van Tooren, M. 2012. Concurrent subspace width optimization method for RBF neural network modeling. IEEE transactions on neural networks and learning systems, 23, 247-259. Ye, P., Pan, G., Huang, Q., & Shi, Y., 2015. A New Sequential Approximate Optimization Approach Using Radial Basis Functions for Engineering Optimization, International Conference on Intelligent Robotics and Applications. Publishing, pp. 83-93. Yin, H., Fang, H., Wang, Q., & Wen, G. 2016. 
Design optimization of a MASH TL-3 concrete barrier using RBF-based metamodels and nonlinear finite element simulations. Engineering Structures, 114, 122-134.

Zhao, D., & Xue, D. 2010. A comparative study of metamodeling methods considering sample quality merits. Structural and Multidisciplinary Optimization, 42, 923-938. Zhou, Q., Shao, X., Jiang, P., Gao, Z., Wang, C., & Shu, L. 2016. An active learning metamodeling approach by sequentially exploiting difference information from variable-fidelity models. Advanced Engineering Informatics, 30, 283-297. Zhou, Q., Shao, X., Jiang, P., Gao, Z., Zhou, H., & Shu, L. 2016. An active learning variable-fidelity metamodelling approach based on ensemble of metamodels and objective-oriented sequential sampling. Journal of Engineering Design, 1-27. Zhou, Q., Shao, X., Jiang, P., Zhou, H., & Shu, L. 2015. An adaptive global variable fidelity metamodeling strategy using a support vector regression based scaling function. Simulation Modelling Practice and Theory, 59, 18-35.