A robust and efficient stepwise regression method for building sparse polynomial chaos expansions

Accepted Manuscript

Simon Abraham, Mehrdad Raisee, Ghader Ghorbaniasl, Francesco Contino, Chris Lacor

PII: S0021-9991(16)30668-4
DOI: http://dx.doi.org/10.1016/j.jcp.2016.12.015
Reference: YJCPH 7014
To appear in: Journal of Computational Physics
Received date: 19 November 2015
Revised date: 9 December 2016
Accepted date: 13 December 2016

Please cite this article in press as: S. Abraham et al., A robust and efficient stepwise regression method for building sparse polynomial chaos expansions, J. Comput. Phys. (2016), http://dx.doi.org/10.1016/j.jcp.2016.12.015


A robust and efficient stepwise regression method for building sparse polynomial chaos expansions

Simon Abraham^a,*, Mehrdad Raisee^b, Ghader Ghorbaniasl^a, Francesco Contino^a, Chris Lacor^a

^a Vrije Universiteit Brussel (VUB), Department of Mechanical Engineering, Research Group Fluid Mechanics and Thermodynamics, Pleinlaan 2, 1050 Brussels, Belgium
^b School of Mechanical Engineering, College of Engineering, University of Tehran, P.O. Box 11155-4563, Tehran, Iran

Abstract

Polynomial Chaos (PC) expansions are widely used in various engineering fields for quantifying uncertainties arising from uncertain parameters. The computational cost of classical PC solution schemes is unaffordable, as the number of deterministic simulations to be calculated grows dramatically with the number of stochastic dimensions. This considerably restricts the practical use of PC at the industrial level. A common approach to address such problems is to make use of sparse PC expansions. This paper presents a non-intrusive regression-based method for building sparse PC expansions. The most important PC contributions are detected sequentially through an automatic search procedure. The variable selection criterion is based on efficient tools relevant to probabilistic methods. Two benchmark analytical functions are used to validate the proposed algorithm. The computational efficiency of the method is then illustrated by a more realistic CFD application, consisting of the non-deterministic flow around a transonic airfoil subject to geometrical uncertainties. To assess the performance of the developed methodology, a detailed comparison is made with the well-established LAR-based selection technique. The results show that the developed sparse regression technique is able to identify the most significant PC contributions describing the problem. Moreover, the most important stochastic features are captured at a reduced computational cost compared to the LAR method. The results also demonstrate the superior robustness of the method by repeating the analyses using random experimental designs.

Keywords: Uncertainty quantification, Regression-based polynomial chaos, Sparse polynomial chaos expansion, Least angle regression, Stepwise regression

* Corresponding author. Email address: [email protected] (Simon Abraham)

Preprint submitted to Journal of Computational Physics, December 15, 2016

1. Introduction

Over the last years, Uncertainty Quantification (UQ) has become very popular, owing not only to the significant increase in computational power but also, and most importantly, to the growing industrial demand. The goal of UQ is to quantify uncertainties in engineering system outputs propagated from uncertain inputs [1]. In this context, the use of surrogate models [2] has been an essential tool for a successful and efficient propagation of uncertainties.

The Polynomial Chaos (PC) expansion [1, 3, 4] is probably the most widely used metamodel for propagating uncertainties. PC was originally proposed by Wiener [5], under the name of Homogeneous Chaos, to model the stochastic response resulting from a stochastic process with normally distributed random variables using Hermite polynomials. Only years later, Xiu and Karniadakis [6, 7] generalized the method to other types of statistical distributions (uniform, beta, gamma, ...) and showed that the relative error in the statistical moments decreases exponentially when the Askey scheme is used [8]. The method is now better known as generalized PC (gPC). In gPC, the stochastic solution is expanded into a series of orthogonal polynomials in the space of random variables. Those orthogonal polynomials are chosen in accordance with the probability density function of the input parameters following the so-called Askey scheme of polynomials [8].

The first applications of PC were intrusive, in the sense that the PC expansion was inserted into the differential equations describing the problem. Intrusive applications to Computational Fluid Dynamics (CFD) problems can be found in [9, 10, 11, 12]. Dinescu et al. [11] showed the first application of intrusive PC to the 3D Navier-Stokes equations. Intrusive PC, however, requires important changes to the CFD software, making it less attractive for industries relying on their own well-validated codes, or for use with commercial software. As a result, attention has focused on non-intrusive approaches requiring no change to the CFD software [13, 14, 15, 16].

To compute the PC coefficients non-intrusively, two classes of methods are generally applied [17], namely (i) projection methods and (ii) regression methods. The former approach requires performing a multi-dimensional numerical integration in the stochastic space, while in the latter the PC coefficients are calculated by solving an over-determined linear system of equations with least squares. A practical application of these techniques to CFD problems is available in [18, 19]. In both cases, a series of deterministic simulations for different realizations of the uncertain parameters must be calculated. A major drawback is that the number of model evaluations grows exponentially with both the number of uncertainties and the PC expansion order. In the literature [14, 16, 20], this drastic increase in computational cost with the number of random variables is often referred to as the curse of dimensionality. This issue seriously restricts the practical use of PC in industrial applications, which are inherently characterized by a large number of uncertainties. Therefore, an efficient UQ scheme must be defined to overcome this issue.

Several solutions have been proposed in the literature. For example, Raisee et al. [16] developed a POD-based model reduction technique, where the computational expense is reduced by expanding the model output into its principal components using an optimal expansion [16, 20]. Other efficient methods were built upon the assumption that the PC representation of the model response is sparse. As a matter of fact, the model inputs are most often not equally relevant, in the sense that some parameters may contribute more significantly to the variation of the output than other, less important parameters. This translates into sparsity in the PC expansion. Some examples of well-established methods relying on this assumption are the compressive sampling method [21, 22, 23, 24] and the sparse regression method of Blatman and Sudret [25, 26, 27]. The sparse regression method identifies sequentially the most relevant basis functions in the PC expansion from only a few samples. The selection criterion for retaining the basis functions in the PC expansion can be based, e.g., on the determination coefficient R^2 [25, 26]. A more efficient strategy was proposed by the same authors in [27]. This strategy relies upon a Least Angle Regression (LAR) algorithm [28] and has been successfully applied to mathematical problems as well as problems involving structural mechanics. A backward elimination strategy was also proposed by Choi et al. [29], where several modules relevant to probabilistic methods are adopted to find the main stochastic features describing the problem. Despite its high potential, the method is impractical for tackling high-dimensional stochastic problems (d > 10), as a full PC expansion must first be calculated before discarding irrelevant contributions.

In this paper, a non-intrusive sparse regression method is proposed for an efficient propagation of uncertainties in high-dimensional stochastic problems. The most important PC contributions are added sequentially using efficient tools derived from probabilistic methods. A major breakthrough compared to the work of [29] is that the most significant stochastic features driving response variability are now detected "on-the-fly", starting from an empty regression model. Two benchmark analytical functions are used for validating and assessing the performance of the proposed variable selection technique. A detailed comparison is made with the well-established LAR method. The computational efficiency of the method is also illustrated by a more realistic CFD application.

The remainder of this paper is organized as follows. In Section 2, the classical description of the PC representation is introduced. Section 3 gives the general methodology that is followed. It starts by presenting the general regression problem in a probabilistic context. Then, the classical regression-based PC solution scheme is described. The detailed description of the sparse regression method introduced in the present work closes this core section. Numerical applications are eventually given in Section 4.


2. Polynomial chaos representation

Assume that Y(\xi) denotes the exact deterministic model, representing a complex engineering system. Let Y(\xi) be referred to as the expensive model. It is a function of a set of independent random variables \xi = (\xi_1, \xi_2, ..., \xi_d), where d is the dimension of the stochastic space. The Polynomial Chaos (PC) representation of the model response can be expressed as follows:

\hat{Y}(\xi) = \sum_{i=0}^{P} \hat{u}_i \, \psi_i(\xi)    (1)

where \hat{u}_i represents the PC coefficients and \psi_i the PC basis functions. The basis functions are multivariate orthogonal polynomials in the input variables [27]. The orthogonal polynomials are chosen in accordance with the probability distributions of the input variables following the so-called Askey scheme of polynomials. For example, Hermite polynomials are chosen if the random variables are normally distributed, and Legendre polynomials are selected if the random variables are uniformly distributed. The PC expansion basis is commonly truncated by prescribing the total expansion order p [27]. As a result, the total number of terms retained in the PC expansion is given by [27]:

P + 1 = \binom{p+d}{p} = \frac{(p+d)!}{p! \, d!}    (2)

The number of terms grows exponentially with both the number of stochastic dimensions and the PC expansion order. This is referred to as the curse of dimensionality. To compute the polynomial coefficients \hat{u}_i, the preferred approach is the regression method [1]. The use of a regression model is an essential ingredient in the present work, since it will enable statistical information to be derived from the model using probabilistic tools [30], as will be shown below.
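As an illustration of Eq. (2), the number of terms P + 1 can be computed with a short helper (the function name is ours, not from the paper); the rapid growth with d at a fixed order p is precisely the curse of dimensionality discussed above:

```python
from math import comb, factorial

def n_pc_terms(p: int, d: int) -> int:
    """Number of terms P + 1 in a total-order-p PC expansion in d variables, Eq. (2)."""
    return comb(p + d, p)  # equals (p + d)! / (p! * d!)

# The binomial and factorial forms of Eq. (2) agree
assert n_pc_terms(10, 3) == factorial(13) // (factorial(10) * factorial(3))

# Growth with the stochastic dimension d at fixed order p = 3
for d in (3, 10, 20, 200):
    print(d, n_pc_terms(3, d))  # d = 200 already yields over a million terms
```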

3. Methodology

3.1. Fitting regression models

The regression method is a statistical technique used to study the relationship between a dependent variable and one or more independent variables. The general linear regression equation can be expressed in matrix form as follows [30, 31]:

y = X u + e    (3)

where y is an n-dimensional vector of observations (or responses) and X denotes an n x m matrix of constants (n is the number of observations and m is the number of basis functions), which contains information on the basis functions employed in the model. The term u is an m-dimensional column vector of regression coefficients and e is the error made by approximating the exact model by the regression model. In regression analysis, the basis functions are often referred to as predictors, predictor variables or regressors. For a given regression model, the method of least squares is generally applied to estimate the regression coefficients [31]:

\hat{u} = (X^T X)^{-1} X^T y    (4)

The fitted regression model \hat{y} and the residuals \hat{e} are:

\hat{y} = X \hat{u}, \qquad \hat{e} = y - \hat{y}    (5)

The variance-covariance matrix of \hat{u}, cov(\hat{u}), is [29, 31]:

cov(\hat{u}) \overset{def}{=} E\left[ (\hat{u} - E[\hat{u}]) (\hat{u} - E[\hat{u}])^T \right] = \sigma^2 (X^T X)^{-1}    (6)

where E[\cdot] denotes the expectation operator and \sigma^2 is the variance of the error. It can be shown [31] that the sampling distribution of the regression coefficients tends to be normal asymptotically, no matter the distribution of the errors. This approximation is made throughout this paper. Based on this assumption, a confidence interval on the regression coefficients u_i is given by:

u_i \in \hat{u}_i \pm z_{1-\alpha/2} \sqrt{V(\hat{u}_i)}    (7)

where V(\hat{u}_i) is the i-th diagonal term of Eq. (6) and z_{1-\alpha/2} is the 1 - \alpha/2 quantile of the standard normal distribution. A 95% confidence level (\alpha = 0.05) is considered in this work. The size of the confidence interval is often measured relative to the estimate. The measure is then called the relative margin of error or relative standard deviation. It is defined as the ratio between half the width of the confidence interval and the estimate, and is expressed in percent.
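The quantities of Eqs. (4)-(7) can be sketched with NumPy on synthetic data (a minimal illustration, not the paper's implementation; for brevity, the error variance sigma^2 is assumed known here, whereas in practice it is estimated from the residuals):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 40, 3
X = rng.uniform(-1.0, 1.0, size=(n, m))      # n x m matrix of predictors
u_true = np.array([2.0, -1.0, 0.0])          # last coefficient is truly zero
sigma = 0.1
y = X @ u_true + sigma * rng.normal(size=n)  # Eq. (3): y = X u + e

XtX_inv = np.linalg.inv(X.T @ X)
u_hat = XtX_inv @ X.T @ y                    # Eq. (4): least-squares estimate
cov_u = sigma**2 * XtX_inv                   # Eq. (6): variance-covariance matrix
half_width = 1.96 * np.sqrt(np.diag(cov_u))  # Eq. (7): z_{0.975} ~ 1.96

# Relative margin of error (percent): half CI width over the estimate
rel_moe = 100.0 * half_width / np.abs(u_hat)

for i in range(m):
    lo, hi = u_hat[i] - half_width[i], u_hat[i] + half_width[i]
    print(f"u_{i}: {u_hat[i]:+.3f}  95% CI [{lo:+.3f}, {hi:+.3f}]  rel. margin {rel_moe[i]:.1f}%")
```

A coefficient whose confidence interval includes zero, or whose relative margin of error is large, is exactly what the backward step and the truncation strategy of Section 3.3 target for removal.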

3.2. Regression-based PC

The regression-based PC is a particular type of regression problem where the predictors \psi_i are multivariate orthogonal polynomials in the input variables [27, 33]. In PC, the classical regression method consists in calculating a series of deterministic simulations for n realizations \{\xi^i, i = 1, ..., n\} of the uncertain parameters. The response vector Y_i = Y(\xi^i) is computed by evaluating the deterministic solver at these points. The following linear system of equations is obtained:

\begin{pmatrix}
\psi_0(\xi^1) & \psi_1(\xi^1) & \cdots & \psi_P(\xi^1) \\
\psi_0(\xi^2) & \psi_1(\xi^2) & \cdots & \psi_P(\xi^2) \\
\vdots & \vdots & \ddots & \vdots \\
\psi_0(\xi^n) & \psi_1(\xi^n) & \cdots & \psi_P(\xi^n)
\end{pmatrix}
\begin{pmatrix} \hat{u}_0 \\ \hat{u}_1 \\ \vdots \\ \hat{u}_P \end{pmatrix}
=
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}    (8)

or, in matrix form:

A \hat{u} = y    (9)

where A is called the design matrix and y is the response vector. It is generally recommended to use over-sampling in order to improve the accuracy of the polynomial coefficients. The regression coefficients \hat{u} are then calculated using a least-squares method, where the sum of the squares of the error is minimized. The least-squares solution is given by:

\hat{u} = (A^T A)^{-1} A^T y    (10)
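Eqs. (8)-(10) can be sketched for a one-dimensional uniform variable using NumPy's Legendre utilities (an illustrative stand-in: the model function is hypothetical, not one of the paper's test cases):

```python
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(1)
p = 4                                # maximum polynomial order, so P + 1 = 5 terms
n = 2 * (p + 1)                      # over-sampling ratio of 2, as recommended below
xi = rng.uniform(-1.0, 1.0, size=n)  # n realizations of a single uniform variable

def model(x):
    """Hypothetical expensive model standing in for the deterministic solver."""
    return np.sin(np.pi * x)

y = model(xi)

# Design matrix A of Eq. (8): column j holds the Legendre polynomial psi_j at each sample
A = legendre.legvander(xi, p)        # shape (n, p + 1)

# Least-squares solution of Eq. (10); lstsq is numerically preferable
# to forming (A^T A)^{-1} explicitly
u_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print("PC coefficients:", np.round(u_hat, 4))
```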

In the literature [34], it was found that an over-sampling ratio of 2 gives a better approximation of the polynomial coefficients. In terms of computational cost, this translates into the calculation of n = 2(P + 1) samples. The computational cost of the regression method is thus directly proportional to the number of terms in the PC expansion. As a result, solving a full PC expansion rapidly becomes unaffordable as the number of stochastic dimensions increases. To circumvent this issue, an efficient stepwise regression method that builds sparse PC expansions is developed below.

3.3. Stepwise regression method

When building a regression model, the most important basis functions describing the problem are a priori unknown and must be selected from a (possibly large) pool of candidate basis functions, e.g. a set of orthogonal polynomials. Those candidates are usually stored in a so-called dictionary. One of the most difficult challenges in regression analysis is the selection of the set of predictors to be employed in the model [32]. This becomes a growing issue especially when many predictors are available for only a limited number of samples, as is typically the case in regression-based PC (Section 3.2).

In the sequel, an automatic search procedure is proposed to find adaptively the most important PC terms from a (potentially large) pool of candidate basis functions. The search procedure exploits the probabilistic tools introduced earlier.
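The resulting forward-backward procedure, detailed in Section 3.3.1, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses the selection criterion of Eq. (11) with a signal-to-noise score, the confidence-interval check of Eq. (7), and takes the error variance sigma^2 as known for brevity:

```python
import numpy as np

def stepwise_select(Psi, y, sigma2, max_terms=10, z=1.96):
    """Greedy forward-backward selection over candidate basis columns Psi (n x (P+1)).

    Forward: regress the current residual on each candidate individually and add
    the one maximizing |u_hat| / sqrt(V(u_hat)) (Eq. (11)).
    Backward: drop any selected term whose confidence interval (Eq. (7)) includes zero.
    """
    n, P1 = Psi.shape
    selected, resid = [], y.copy()
    for _ in range(max_terms):
        # Forward step: one-predictor regressions on the residual
        best_j, best_score = None, -np.inf
        for j in range(P1):
            if j in selected:
                continue
            psi = Psi[:, j]
            gram = psi @ psi
            u_hat = (psi @ resid) / gram         # Eq. (4) for a single predictor
            var_u = sigma2 / gram                # Eq. (6) for a single predictor
            score = abs(u_hat) / np.sqrt(var_u)  # Eq. (11)
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            break
        selected.append(best_j)

        # Refit the multi-term model and update the residual, Eq. (5)
        A = Psi[:, selected]
        u, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ u

        # Backward step: remove terms whose CI (Eq. (7)) includes zero
        cov = sigma2 * np.linalg.inv(A.T @ A)
        half = z * np.sqrt(np.diag(cov))
        keep = [j for j, uj, hj in zip(selected, u, half) if abs(uj) > hj]
        if keep != selected:
            selected = keep
            A = Psi[:, selected]
            u, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ u
    return selected
```

On a toy problem where only two candidate columns actually drive the response, the sketch recovers them while the backward step prunes coefficients whose intervals straddle zero.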

125

this phase, a design of experiment X = [ξ 1 , ..., ξ n ] is created using standard sampling technique such as random sampling or quasi-random Sobol sampling. The samples are evaluated using the deterministic solver and the outputs are stored in y = [Y1 , Y2 , ..., Yn ]. The number of samples n is a user-defined constant chosen according to the available computational resources. Moreover, the

130

pool of candidate basis functions is defined from the multivariate orthogonal polynomials ψj corresponding to a full PC expansion. The pool is initialized by setting the maximum PC expansion order p, defining a set of P + 1 candidate

8

FIRST FORWARD STEP

START Initialization

Regress y on each 𝜓j individually

Add 𝜓j* with maximum selection criterion Eq.(11)

Update êi = y - ŷi

Is stopping condition met?

YES STOP

NO

Remove irrelevant basis function(s)

YES

Any confidence interval includes zero?

Regress êi on each 𝜓j not in the current model

Add 𝜓j* with maximum selection criterion Eq.(11)

FORWARD STEP

NO

BACKWARD STEP

Figure 1: Flowchart of the stepwise regression technique used to sequentially build sparse polynomial chaos expansion. At start, the regression model is empty, i.e. it contains no basis functions. The pool of candidate basis functions is defined from the multivariate orthogonal polynomials ψj corresponding to a full PC expansion. The pool is initialized by setting the maximum PC expansion order p, defining a set of P + 1 candidate basis functions. At each step, basis functions are added and/or deleted from the current regression model based on the probabilistic tools introduced earlier. The stopping criterion is based either on a maximum number of iterations or a maximum number of predictors entered in the regression model.

9

basis functions (Eq. (1)). The algorithm starts with an empty regression model, containing no basis functions.

The next step after initialization is a first forward step (Fig. 1). At the end of this phase, the very first predictor is added to the current regression model. The selection procedure works as follows. Each candidate basis function \psi_j is assessed individually (one by one) by fitting the response y with one-predictor regression models: if the pool of candidate predictor variables is made of P + 1 candidates, then P + 1 one-predictor regression models are constructed independently from each other. The one-predictor regression models are solved using Eq. (4) and the variance of each regression coefficient is calculated using Eq. (6). Based on these data, the "best" candidate basis function \psi_{j^*} is found using:

j^* = \arg\max_j \left\{ \frac{|\hat{u}_j|}{\sqrt{V(\hat{u}_j)}}, \; j = 0, ..., P \right\}    (11)

where \hat{u}_j and V(\hat{u}_j) are respectively the estimate and the variance of the regression coefficient when the sole candidate predictor \psi_j is used for fitting the response. The candidate predictor variable that maximizes this quantity is deemed the "best" one among all candidates. As such, it is added to the current regression model. As a result, basis functions with a large regression coefficient and a small standard deviation are more likely to enter the regression model. This ensures the effectiveness and robustness of the method. Once the "best" predictor \psi_{j^*} has entered the regression model, the current residual \hat{e}_i is updated using Eq. (5) and \psi_{j^*} is removed from the pool of candidate predictor variables.

At this stage, the current regression model contains one predictor. The algorithm now enters a loop which combines a forward and a backward step. The forward step (Fig. 1) follows the exact same principle as the first forward step, apart from the fact that the one-predictor regression models are fitted on the current residual \hat{e}_i. Also, as the inclusion of any new predictor in the regression model may have affected the importance of the predictors already entered in the model, the forward step is directly followed by a backward step (Fig. 1). The backward step consists in checking the confidence intervals of the regression coefficients already entered in the model using Eq. (7). If any confidence interval includes zero, the corresponding predictor(s) is (are) removed from the current regression model and the residual is updated using Eq. (5). This process continues until a maximum number of predictors has entered the model or until a maximum number of iterations is reached (user-defined constants).

At the end of the stepwise procedure (Fig. 1), a first set of important predictors is derived. The number of important predictors captured is mostly driven by the number of iterations the algorithm performs. As a result, the resulting set of predictors might not be optimal. This was already pointed out in the work of Blatman [35]. Thus, as a second and final phase of the algorithm, an extra criterion is defined to come up with an optimal subset of predictors.

3.3.2. Optimal PC truncation strategy

Generally, a regression coefficient with a wide confidence interval indicates

that there is little knowledge of the true value of the estimate, due to a lack of information. If so, the coefficient estimate is inaccurate. A poor quality of the parameter estimates can deteriorate the quality of the global model. Therefore, a truncation strategy is proposed to come up with an "optimal" final PC expansion. The truncation strategy operates as follows. The "optimal" PC expansion is obtained by retaining, from the final regression model generated by the stepwise algorithm, the predictor variables whose regression coefficients are accurately estimated. The accuracy of the coefficient estimates is measured using the relative margin of error. More specifically, all the predictors whose relative margin of error is smaller than a given threshold, e.g. 10-20%, are considered accurate predictors and are therefore retained in the final model. The others are deemed inaccurate predictors and, as such, are simply removed from the regression model. When all the inaccurate predictors have been removed, two possibilities exist: either the remaining coefficients are kept unchanged, or the model is refitted with ordinary least squares. In the present work, the latter solution is preferred over the former. The choice of the cutoff parameter is rather arbitrary and can be determined using a trial-and-error approach. This truncation strategy will be validated through several application examples.

3.3.3. Efficient treatment of large dictionaries

The size of the dictionary rapidly becomes too large when dealing with high-dimensional problems and/or high PC orders. This may induce extra cost during the forward step, where the computational cost is directly proportional to the number of candidate predictor variables stored in the dictionary.

To circumvent this issue, Blatman and Sudret [27, 35] introduced the so-called hyperbolic index set, which is based on the sparsity-of-effects principle. While of practical interest (it enables dealing with much higher stochastic dimensions), the hyperbolic index set does not rely on any physical observation related to the problem itself. In particular, the choice of the q parameter, defining the shape of the hyperbola, is rather arbitrary. Hence, some important contributions may be discarded if an inappropriate q is chosen.

As an alternative, the use of multiple processors is a simple and easy solution to push back the limits arising from the curse of dimensionality. If nCPU processors are available, the total number of candidates is divided into nCPU bins processed in parallel. This solution was preferred in this study, where a parallel programming framework was developed in order to handle a large pool of candidates efficiently. In Fig. 2, the speed-up achieved to examine a pool of candidates of a certain size is plotted against the number of CPUs. The results are averaged over 10 scans. A parallel efficiency of more than 90% is achieved when more than one million candidates are examined on 8 processors. This roughly corresponds to a case with 200 random variables and a maximum PC order of 3. It is worth noting that, in the latter plot, the design matrix was not stored, due to the potentially high memory requirement. In terms of computational cost, slightly less than 8 minutes is needed for examining those candidates on 16 Intel Xeon E5-2697 v3 processors (2.6 GHz, 35 MB cache),

compared to 100 minutes on a single such processor. Notice also that if the design matrix had been stored, the time needed to scan the dictionary would have been much smaller, e.g. only 30 seconds (versus 8 minutes without storage) for more than a million candidates in the dictionary (not shown). But, as already mentioned, the latter solution is more demanding from a memory point of view. Note also that one scan is necessary per iteration of the algorithm. One can therefore conclude that the computational cost associated with the screening phase has become, to a large extent, negligible compared to the cost of a single CFD simulation.

Figure 2: Speed-up achieved to assess a pool of candidate basis functions of a given size (speed-up versus nCPU, for pool sizes of O(1e4), O(1e5) and O(1e6) candidates, with the ideal case shown for reference). A parallel efficiency of more than 90% is achieved when one million candidates are assessed on 8 processors. It is worth pointing out that the full design matrix was not stored due to the potentially high memory requirement.
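The bin-parallel screening described above can be sketched with Python's standard concurrent.futures module (an illustrative skeleton, not the authors' framework; the data and scoring function are hypothetical, with the score being the one-predictor criterion of Eq. (11)):

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

# Module-level data so that worker processes can score candidates (illustrative only)
rng = np.random.default_rng(3)
N_SAMPLES, N_CANDIDATES = 64, 10_000
PSI = rng.normal(size=(N_SAMPLES, N_CANDIDATES))  # candidate basis columns
RESID = rng.normal(size=N_SAMPLES)                # current residual
SIGMA2 = 1.0                                      # error variance (assumed known)

def score_bin(js):
    """Return the best candidate index and score in one bin, using Eq. (11)."""
    best_j, best_score = -1, -np.inf
    for j in js:
        psi = PSI[:, j]
        gram = psi @ psi
        u_hat = (psi @ RESID) / gram
        score = abs(u_hat) / np.sqrt(SIGMA2 / gram)
        if score > best_score:
            best_j, best_score = j, score
    return best_j, best_score

if __name__ == "__main__":
    n_cpu = 4
    bins = np.array_split(np.arange(N_CANDIDATES), n_cpu)  # one bin per processor
    with ProcessPoolExecutor(max_workers=n_cpu) as pool:
        results = list(pool.map(score_bin, bins))
    j_star, _ = max(results, key=lambda r: r[1])           # reduce over the bins
    print("best candidate:", j_star)
```

Because each bin is scored independently and only a (index, score) pair is reduced at the end, the scan is embarrassingly parallel, which is consistent with the near-ideal speed-up reported in Fig. 2.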

4. Results and discussion

The present section is dedicated to the validation and assessment of the proposed sparse regression method. Two benchmark analytical functions are first considered. The first one is the well-known Ishigami function (d = 3), widely used by the UQ community for benchmarking purposes [36]. The second one is a more challenging high-dimensional analytical function (d = 20). In both cases, the proposed stepwise regression procedure (Fig. 1) is run for building sparse polynomial chaos expansions. The performance (effectiveness, robustness) of the method is compared with the LAR-based selection technique. For the sake of consistency, the same settings are used in both analyses, unless explicitly stated otherwise. The quality of the various regression models is measured by building an independent testing set. The testing set is made of a large number of randomly selected samples which are evaluated exactly. The predictions \hat{Y}_i are then confronted with the exact responses Y_i and the relative \ell_2 error \hat{\epsilon} is calculated as follows:

\hat{\epsilon} \overset{def}{=} \frac{\sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)^2}{\sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2}, \qquad N = 50{,}000    (12)

where \bar{Y} \overset{def}{=} \frac{1}{N} \sum_{i=1}^{N} Y_i.
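The error metric of Eq. (12) is straightforward to compute; a minimal sketch follows (the exact model and the surrogate here are illustrative stand-ins, not the paper's test cases):

```python
import numpy as np

def relative_l2_error(y_exact, y_pred):
    """Relative l2 error of Eq. (12): residual sum of squares over total sum of squares."""
    y_exact = np.asarray(y_exact, dtype=float)
    y_bar = y_exact.mean()
    return np.sum((y_exact - y_pred) ** 2) / np.sum((y_exact - y_bar) ** 2)

# Illustration on a random testing set with a truncated-Taylor surrogate
rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, size=50_000)
y_exact = np.sin(np.pi * x)                                 # stand-in exact model
t = np.pi * x
y_pred = t - t**3 / 6.0 + t**5 / 120.0                       # stand-in surrogate
print(f"relative l2 error: {relative_l2_error(y_exact, y_pred):.3e}")
```

A value of 0 means a perfect surrogate, while a value of 1 corresponds to a surrogate no better than the constant mean of the testing responses.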

To further prove the good performance of the developed methodology, the computational efficiency of the method is eventually demonstrated on a more realistic CFD application, consisting of a 2D RAE2822 transonic airfoil subject to 10 geometrical uncertainties (d = 10).

4.1. Test case 1: Ishigami function

The Ishigami function [36] is first considered:

Y(\xi) = \sin(\xi_1) + a \sin^2(\xi_2) + b \, \xi_3^4 \sin(\xi_1)    (13)

where a = 7.0, b = 0.1 and \xi_i (i = 1, 2, 3) are random variables uniformly distributed over [-\pi, \pi]. The Ishigami function is an interesting test case for UQ because it can be well approximated with a series of polynomials, as the function consists of sine terms. As a result, the sparsity of the PC solution is guaranteed. As the random variables are uniformly distributed, the basis functions considered in this initial study are the multivariate Legendre polynomials.

4.1.1. Assessment of the method effectiveness

The performance of the proposed stepwise algorithm is compared with the LAR-based selection technique. Building upon the work of Blatman [35], two

quasi-random Sobol experimental designs of size n = 75 and n = 100 are created. The maximum PC expansion order is fixed at 10, which means that the initial pool of candidate basis functions contains \binom{3+10}{10} = 286 multivariate Legendre polynomials. An arbitrary threshold of 20% is selected for the "optimal" truncation of the final regression model created by the stepwise regression algorithm.

The results are reported in Table 1. It is observed that the stepwise algorithm yields sparser and more accurate metamodels than the LAR algorithm. For example, when the LAR procedure is run with n = 100, the best regression model contains 57 terms, while a PC expansion consisting of only 17 terms is obtained using the stepwise algorithm. At the same time, the error is reduced by almost one order of magnitude. It is worth pointing out that the 17 predictor variables found by the stepwise procedure correspond to the "exact" solution. As a matter of fact, the actual important variables can be identified by writing the Taylor series expansion of Eq. (13) and truncating it after order 10.

It is also found that the results obtained with n = 75 using the stepwise algorithm are consistent, as exactly the same 17 predictors are retained in the final regression model. The only difference is that the relative standard deviation of the regression coefficients is smaller with n = 100, resulting in a more accurate estimation of the regression coefficients and hence a more accurate model, as shown in Table 1 by the smaller relative \ell_2 error. In particular, the maximum relative standard deviation of the regression coefficients is approximately 15% with n = 75, whereas it is reduced to about 10% with n = 100. This makes sense as more information is available in the second case. Overall, these findings indicate that the stepwise algorithm is more efficient than the LAR

Table 1: Ishigami function - Comparison between the performance of the LAR-based selection technique and the performance of the proposed stepwise procedure to build a sparse PC expansion. In both cases, a fixed PC order of 10 is chosen, resulting in a dictionary made of 286 multivariate Legendre polynomials. The experimental designs are based on a quasi-random Sobol sequence. The LAR results are reported in [35].

                         LAR                        Stepwise
                         n = 75       n = 100       n = 75       n = 100
Relative l2 error        2.3 x 10^-5  1.2 x 10^-5   4.5 x 10^-6  3.7 x 10^-6
Level of sparsity        49           57            17           17

algorithm. Some further comparisons are made available in Fig. 3 where the actual predictors captured by the stepwise selection technique are faced with the full PC solution. In this same figure, the effectiveness of the method is demonstrated by increasing the maximum PC order from p = 10 to p = 20 while keeping the

280

same number of samples (n = 100). The tenth- and twentieth- order full PC expansions were calculated using regression with an over-sampling ratio of 2. The red crosses indicate the actual predictors selected by the stepwise method. It is clear, from Fig. 3, that the stepwise method is capable of capturing the most important predictor variables.


Finally, the convergence rate of the aforementioned cases is reported in Fig. 4. It is worth mentioning that both adaptive techniques are run using degree adaptivity, as described in [27], to exploit the full potential of the methods. The adaptive regression techniques present a faster convergence rate than the expensive full PC expansions. This conclusion is fully consistent with the work of Blatman [35]. In addition, the proposed method yields faster convergence than the LAR technique. In particular, the LAR-based selection technique requires 150 samples to achieve an accuracy of 10⁻¹⁰, whereas the stepwise method needs approximately 50% fewer samples (80 samples) to achieve a similar level of accuracy.

Figure 3: Ishigami function - Comparison between the predictors captured by the stepwise regression algorithm using p = 10 (left, 17 predictors captured among 286) and p = 20 (right, 30 predictors captured among 1771). In both cases, exactly the same experimental design (quasi-random Sobol sequence, n = 100) is considered. The tenth- and twentieth-order full PC expansions were calculated using regression with an over-sampling ratio of 2. Red crosses indicate the most significant contributions detected by the stepwise algorithm.


4.1.2. Assessment of the method robustness

The robustness of both methods is now assessed by replicating the analyses using random experimental designs of the same size (n = 100). The results are provided in Fig. 5 in the form of box plots. It turns out that both the median and the interquartile range are significantly smaller for the stepwise method. It should also be mentioned that most of the sparse metamodels generated by the LAR-based selection technique differ from each other, in the sense that they are made of different numbers of predictors. In contrast, a large majority of the metamodels (85%) generated by the stepwise selection technique are made of the same 17 terms. This is illustrated in Fig. 6 by means of a histogram. This demonstrates the superior robustness of the proposed stepwise regression method over the LAR-based selection technique.

4.2. Test case 2: High dimensional function

The second analytical function is a high dimensional arbitrary function, originally proposed in UQLab [37] as a benchmark example. The function has been

Figure 4: Ishigami function - Convergence curves of full and sparse PC expansions (relative ℓ2 error versus number of samples) using quasi-random Sobol sequences. On the one hand, the full PC expansions (order: 2, 4, 6, 8, 10 and 12) are calculated using regression with an over-sampling ratio of 2. On the other hand, the sparse PC expansions are built using degree adaptivity to exploit the full potential of the techniques. The LAR results were obtained using UQLab [37].

slightly simplified compared to its original formulation to keep the calculation of the reference full PC expansion affordable. The function is defined as:

Y(ξ) = 3 − (5/d) Σ_{k=1}^{d} k ξ_k + (1/d) Σ_{k=1}^{d} k ξ_k³ + ln( (1/(3d)) Σ_{k=1}^{d} k (ξ_k² + ξ_k⁴) )    (14)

where d = 20 and ξ_i (i = 1, 2, ..., 20) are random variables uniformly distributed over [1, 2]. As a result, the multivariate Legendre polynomials are used as candidate predictors in the following analyses.

4.2.1. Assessment of the method effectiveness

The developed methodology is used to identify the most important predictors among a dictionary consisting of multivariate Legendre polynomials. As in the previous example, the performance of the proposed method is compared with
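Eq. (14) can be transcribed directly into code. The sketch below is illustrative (the function name and the sampled point are not from the paper); `xi` holds one sample of the d = 20 uniform inputs:

```python
import numpy as np

def benchmark_function(xi):
    """High dimensional benchmark of Eq. (14); xi is an array of d values in [1, 2]."""
    d = len(xi)
    k = np.arange(1, d + 1)
    return (3.0
            - (5.0 / d) * np.sum(k * xi)
            + (1.0 / d) * np.sum(k * xi**3)
            + np.log((1.0 / (3.0 * d)) * np.sum(k * (xi**2 + xi**4))))

rng = np.random.default_rng(1)
xi = rng.uniform(1.0, 2.0, 20)
print(benchmark_function(xi))
```

As a simple consistency check, at ξ = (1, ..., 1) the three sums all equal Σk = 210, so Y = 3 − 52.5 + 10.5 + ln(7) ≈ −37.054.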


Figure 5: Ishigami function - Box plots of the relative ℓ2 error corresponding to 100 random experimental designs. The box is characterized by the first quartile (bottom line), the median (red line) and the third quartile (upper line). The whiskers indicate the variability of the data outside the first and third quartiles. The ends of the whiskers lie at a distance of 1.5 interquartile ranges from the first/third quartile. Outliers are represented by blue crosses. The simulations were performed at equal settings, i.e. p = 10, n = 100.


the LAR-based selection technique. The LAR results are obtained with UQLab, a well-validated MATLAB-based UQ software developed by Marelli and Sudret [37]. The analyses are performed using two quasi-random experimental designs of size n = 200 and n = 300. The samples are directly imported from UQLab into our framework so that exactly the same information is provided to both methods as input. The adaptive methods are run using degree adaptivity to exploit the full potential of both techniques. A threshold of 10% is chosen for an "optimal" truncation of the final regression model created by the stepwise regression algorithm.

The results are gathered in Table 2. It turns out that the stepwise algorithm yields sparser and more accurate PC expansions than the LAR algorithm.
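For reference, a quasi-random Sobol design of this kind can be generated with SciPy's `scipy.stats.qmc` module. This is only a sketch under that assumption (the authors import their samples from UQLab, and the sample count below is illustrative, chosen as a power of two as the Sobol sampler prefers):

```python
from scipy.stats import qmc

# Sobol sequence in [0, 1)^20, then mapped to the input range [1, 2]^20.
sampler = qmc.Sobol(d=20, scramble=False)
u = sampler.random_base2(m=8)                              # 2^8 = 256 points
xi = qmc.scale(u, l_bounds=[1.0] * 20, u_bounds=[2.0] * 20)
print(xi.shape)
```

Using an unscrambled sequence makes the design reproducible, so both selection methods can be fed exactly the same samples, as done in the text.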


Figure 6: Ishigami function - Histograms showing the level of sparsity of the sparse metamodels based on 100 runs using random designs of experiments. On the one hand, the stepwise method is robust, as 85% of the models are made of exactly the same 17 predictors, resulting in good prediction capabilities (Fig. 5). On the other hand, the output of the LAR-based basis selection technique results in a wide range of different models, showing a lack of robustness with respect to changes in the experimental design.

Those observations are in line with previous findings. It is also observed that, with n = 200, the proposed stepwise method is capable of capturing the 3rd order contributions whereas LAR is not.

As in the previous example, the actual predictors captured by the algorithm are compared against a reference solution, which is in this case a full PC expansion of order 3. The 3rd order full PC expansion was calculated using regression with an over-sampling ratio of 2 and a quasi-random experimental design. The calculation of the reference solution requires the evaluation of 2 × C(20+3, 3) = 3,542 samples. The results are shown in Fig. 7, where red crosses denote the actual basis functions selected by the proposed stepwise regression method. In light of the above, it is clear that the stepwise algorithm is perfectly capable of


Table 2: High dimensional function - Comparative study between the LAR-based basis selection technique and the stepwise method to build sparse PC expansions. The comparison is performed using two quasi-random experimental designs of size n = 200 and n = 300. The best PC order reported in the table is the one leading to the lowest leave-one-out cross-validation error [35], to prevent over-fitting the data.

                        LAR                      Stepwise
                    n = 200     n = 300      n = 200     n = 300
Relative ℓ2 error   1.7 × 10⁻³  3.5 × 10⁻⁴   2.3 × 10⁻⁴  9.4 × 10⁻⁵
Level of sparsity   86          184          52          60
Best PC order       2           3            3           3
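The leave-one-out cross-validation error used to pick the best PC order in Table 2 does not require refitting the model n times: for least-squares regression it has a closed form based on the hat matrix, e_i = r_i / (1 − h_ii). A self-contained sketch (the function name and the toy data are illustrative, not the authors' code):

```python
import numpy as np

def loo_error(X, y):
    """Relative leave-one-out CV error of an OLS fit, via the hat-matrix
    shortcut e_i = r_i / (1 - h_ii) (no refitting needed)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    H = X @ np.linalg.pinv(X)            # hat matrix X (X'X)^-1 X'
    r = y - X @ beta                     # ordinary residuals
    e = r / (1.0 - np.diag(H))           # predicted (LOO) residuals
    return np.mean(e**2) / np.var(y)     # normalized by the output variance

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(50), rng.uniform(-1, 1, 50)])
y = 1.0 + 3.0 * X[:, 1] + 0.1 * rng.standard_normal(50)
print(loo_error(X, y))
```

Because each candidate expansion can be scored this way from a single fit, comparing many PC orders on a fixed experimental design stays cheap.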

Figure 7: High dimensional function - Comparison between the predictors captured by the stepwise algorithm using two quasi-random designs of experiments of size n = 200 (left) and n = 300 (right). The proposed stepwise algorithm successfully captures the most important basis functions. Moreover, the more samples are available, the more relevant predictors are captured, showing consistency in the results. These results are the ones reported in Table 2.

identifying the most relevant predictors describing the problem. The successful identification of the most relevant basis functions is associated with a large reduction in computational cost. It is also worth mentioning the consistency of the results: the more samples are calculated, the more relevant predictors are properly identified, as shown in Fig. 7.

For the sake of completeness, convergence curves are provided in Fig. 8. They illustrate again the major benefit of using sparse PC expansions, as faster convergence is achieved with a relatively small sample size. It is also found that, for a sufficiently large sample size, the two sparse regression methods converge to the same solution, where in both cases the major contributions are properly captured.

Figure 8: High dimensional function - Convergence curves of full and sparse polynomial chaos expansions (relative ℓ2 error versus number of samples) using quasi-random Sobol sequences. On the one hand, the full PC expansions (order: 1, 2, 3) are computed using regression with an over-sampling ratio of 2. On the other hand, the sparse PC expansions are built using degree adaptivity to exploit the full potential of the techniques.

4.2.2. Assessment of the method robustness

As in the previous test case, the robustness of the method is assessed by replicating the analyses using 100 random designs of experiments. The results are provided in Fig. 9 in the form of box plots with n = 200. The variation of the results is in this case comparable, as shown by the similar interquartile range for both methods. The robustness of the method is actually better illustrated by the consistency of the generated metamodels when changing the experimental design. The level of sparsity of the resulting metamodels is reported in Fig. 10 in the form of a histogram. The proposed stepwise method consistently builds models containing approximately the same number of predictors. In particular, it is shown that 96% of the sparse regression models contain between 52 and 57 predictors, whereas the sparse models built with the LAR method vary much more, containing from fewer than 60 to more than 100 predictors. This analysis confirms the superior robustness of the proposed method against the state-of-the-art technique.

Figure 9: High dimensional function - Box plots of the relative ℓ2 error based on 100 random designs of experiments. The box is characterized by the first quartile (bottom line), the median (red line) and the third quartile (upper line). The whiskers indicate the variability of the data outside the first and third quartiles. The ends of the whiskers lie at a distance of 1.5 IQR from the first/third quartile. Outliers are represented by blue crosses. The simulations were performed at optimal settings, allowing the full potential of the methods to be exploited.


Figure 10: High dimensional function - Histograms showing the level of sparsity of the sparse metamodels based on 100 runs using random designs of experiments. Most of the sparse metamodels generated by the proposed stepwise method are made of the same basis functions (96% of the metamodels contain between 52 and 57 terms). In contrast, the level of sparsity of the models generated by the LAR method varies much more (from fewer than 60 to more than 100 terms).

4.3. Test case 3: 2D RAE2822

The last application consists of the non-deterministic flow around a 2D RAE2822 transonic airfoil at M = 0.734, an angle of attack of 2.79° and a Reynolds number of 6.5 × 10⁶. This application is used to demonstrate the computational efficiency of the method when applied to a more realistic test case.

4.3.1. Description of the test case

The model is directly taken from [20]. A detailed description of the test case is available in the aforementioned reference; only a brief summary is provided in the following. The geometry of the airfoil is assumed uncertain. This is a simple way to introduce roughness or manufacturing tolerances in the model.

The following Gaussian shaped covariance with zero mean is considered:

C(s_i, s_j) = σ(s_i) σ(s_j) exp( −(s_i − s_j)² / (2b²) )    (15)

where s_i and s_j are surface coordinates along the airfoil, with s = 0 at the trailing edge. The random field can be written as a linear combination of modes, using a Karhunen-Loève (KL) expansion [20]:

X(s, ξ) ≈ X̄(s) + Σ_{k=1}^{d} √λ_k φ_k(s) ξ_k · n    (16)

where X is the airfoil geometry at sample ξ, X̄ is the airfoil mean geometry, and φ_k and λ_k are respectively the eigenfunctions and eigenvalues, solutions of a so-called Fredholm integral equation [38]. The term n is the direction normal to the profile and the ξ_k are random variables which are assumed to follow a uniform law over [−1, 1]. This assumption has been used in the literature by many researchers [20, 23, 39]. The correlation length and standard deviation of the stochastic process are set to b = 0.2 and σ = 0.002, respectively. The resulting KL expansion Eq. (16) is truncated after 10 modes, which means the stochastic problem is described with 10 random variables. The nominal geometry and some realizations are depicted in Fig. 11.

Figure 11: RAE2822 - Nominal geometry and a few realizations. The nominal geometry is subject to 10 geometrical uncertainties. A Gaussian process is assumed with b = 0.2 and σ = 0.002.


The CFD results are computed by solving the compressible Reynolds-averaged Navier-Stokes (RANS) equations. The Spalart-Allmaras one-equation turbulence model is used for the flow predictions, along with a second-order upwind scheme for the approximation of the non-linear convective terms in all transport equations. The grid is a C-type mesh made of 4.4 × 10⁴ nodes, as illustrated in [20]. The CFD results were validated against experimental data in [20].

4.3.2. Stochastic analysis

As the random variables are assumed uniformly distributed, the multivariate Legendre polynomials are used in the analyses. Initially, a convergence study is performed to determine the optimal PC order to achieve convergence of the first two statistical moments. This investigation showed that a 3rd order PC is sufficient to get an accurate estimate of the statistics of interest. Therefore, the latter is chosen as the reference solution for the analyses. The calculation of the reference solution requires 2 × C(10+3, 3) = 572 full model evaluations. The adaptive techniques are run using a quasi-random experimental design of size 50. A cut-off level of 20% is applied for the proposed stepwise regression method.

The estimated statistical moments of the aerodynamic coefficients are summarized in Table 3. It is clear that the statistical information is accurately captured, but at a much cheaper cost than the reference estimates. The adaptive methods estimate both the mean and the standard deviation accurately. In each case, the stepwise algorithm builds a sparser metamodel than the LAR algorithm (not shown).
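The post-processing behind such moment estimates is a standard property of PC expansions on orthonormal bases: the mean is the zeroth coefficient and the variance is the sum of the squared higher-order coefficients. A minimal sketch with hypothetical coefficients (not the actual values behind Table 3):

```python
import numpy as np

def pc_moments(coeffs):
    """Mean and standard deviation of a PC expansion with orthonormal basis:
    mean = c_0, variance = sum of c_k^2 for k >= 1."""
    c = np.asarray(coeffs, dtype=float)
    return c[0], np.sqrt(np.sum(c[1:]**2))

# Hypothetical coefficients in drag counts, for illustration only.
mean, std = pc_moments([240.3, 6.0, 8.0, 3.0])
print(mean, std)
```

This is why the statistics in Table 3 come essentially for free once the sparse expansion has been fitted: no additional model evaluations are needed.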

5. Conclusion

In the present paper, a computationally efficient framework for uncertainty quantification was implemented and validated. A major contribution of this paper is the development of an adaptive regression method for building sparse polynomial chaos expansions. The criterion adopted for finding the most important basis functions is based on tools relevant to probabilistic methods (e.g. variance of the regression coefficients, confidence intervals). Moreover, in order

Table 3: 2D RAE2822 - Estimated statistical moments of the aerodynamic coefficients using various sparse regression techniques. The reference solution is calculated using a full PC of order 3, which requires the evaluation of 572 CFD samples. In each case, a quasi-random experimental design is chosen.

                          Reference          LAR                Stepwise
                          Mean     Std.      Mean     Std.      Mean     Std.
Drag coefficient*         240.3    10.4      239.9    10.1      239.9    10.1
Lift coefficient          0.76     0.0093    0.76     0.0091    0.76     0.0093
Moment coefficient        0.093    0.0032    0.093    0.0032    0.093    0.0032
Full model evaluations    572                50                 50

* in drag count

to efficiently handle a large number of candidate basis functions, a parallel programming framework was suggested, considerably extending the range of applications of the proposed methodology.

The strength of the proposed methodology is twofold. Firstly, the basis selection process is efficient and reliable, as the most important stochastic features are successfully captured. Secondly, the methodology is robust in the sense that the dependence on the experimental design is weak. In particular, the proposed methodology tends to build equally accurate regression models consisting of approximately the same basis functions.

In addition, a new truncation criterion was proposed for building an "optimal" PC expansion. It aims at removing inaccurate predictors from the sparse PC expansion generated by the stepwise algorithm to enhance the quality of the metamodel. Those inaccurate contributions are identified by calculating the confidence intervals on the regression coefficients. All predictors whose associated relative standard deviation exceeds some pre-defined threshold (cut-off value) are removed. In the present work, a cut-off level between 10% and 20% was shown to be effective based on a trial-and-error approach. An automatic procedure shall be proposed in the future.

435

rical uncertainties, was investigated. This application was used to demonstrate the computational efficiency of the method when applied to CFD. It was shown that the first two statistical moments are recovered using as few as 50 samples. Overall, it can be concluded that the proposed method achieves superior performance compared to the LAR-based selection technique both in terms of

440

accuracy and number of samples. Moreover, the method can still be improved by using an adaptive design of experiment and/or an adaptive dictionary, as already suggested in [35]. A future area of research will be to perform further validations on relevant industrial applications and to incorporate this efficient uncertainty quantification procedure into a global optimization framework.

445

Acknowledgement This work was funded by the SBO EUFORIA project (IWT-140068). This support is gratefully acknowledged. References [1] H. Najm, Uncertainty Quantification and Polynomial Chaos Techniques

450

in Computational Fluid Dynamics, Annu. Rev. Fluid Mech. 41 (1) (2009) 35–52. [2] A. Forrester, A. Sobester, A. Keane, Engineering Design via Surrogate Modelling: A Practical Guide, Wiley, ISBN: 978-0-470-06068-1, 2008. 28

[3] H. Cheng, A. Sandu, Efficient uncertainty quantification with the polyno455

mial chaos method for stiff systems, Math. Comput. Simulat. 79 (11) (2009) 3278–3295. [4] A. De Gennaro, C. Rowley, L. Martinelli, Uncertainty Quantification for Airfoil Icing Using Polynomial Chaos Expansions, J. Aircraft 52 (5) (2015) 1404–1411.

460

[5] N. Wiener, The Homogeneous Chaos, Am. J. Math. 60 (4) (1938) 897–936. [6] D. Xiu, G. Karniadakis, The Wiener-Askey polynomial chaos for stochastic differential equations, SIAM J. Sci. Comput. 24 (2) (2002) 619–644. [7] D. Xiu, G. Karniadakis, Modeling uncertainty in flow simulations via generalized polynomial chaos, J. Comput. Phys. 187 (1) (2003) 137–167.

465

[8] R. Askey, J. Wilson, Some Basic Hypergeometric Orthogonal Polynomials That Generalize Jacobi Polynomials, AMS, 1985. [9] C. Lacor, S. Smirnov, Uncertainty Propagation in the solution of compressible Navier-Stokes Equations using Polynomial Chaos Decomposition, in: CD Rom Proc. of NATO AVT symposium, Athens, 13, 2007.

470

[10] S. Smirnov, C. Lacor, Non-Deterministic Compressible Navier-Stokes Simulations using Polynomial Chaos, in: Proc. ECCOMAS Conf, Venice, 2008. [11] C. Dinescu, S. Smirnov, C. Hirsch, C. Lacor, Assessment of intrusive and non-intrusive non-deterministic CFD methodologies based on polynomial chaos expansions, Int. J. of Eng. Systems Modeling and Simulations 2 (1)

475

(2010) 87–98. [12] D. Kumar, C. Lacor, Heat conduction in a 2D domain with geometrical uncertainty using intrusive polynomial chaos method, in: Proc. of the 9th National Congress on Theoretical and Applied Mechanics, Brussels (Belgium), 2012.

29

480

[13] G. Loeven, J. Witteveen, H. Bijl, Probabilistic collocation: an efficient nonintrusive approach for arbitrarily distributed parametric uncertainties, in: Proceedings of the 45th AIAA Aerospace Sciences Meeting and Exhibit, vol. 6, AIAA Paper 2007-317, Reno, Nevada, 3845–3858, 2007. [14] M. Eldred, J. Burkardt, Comparison of Non-Intrusive Polynomial Chaos

485

and Stochastic Collocation Methods for Uncertainty Quantification, in: Proceedings of the 47th AIAA Aerospace Sciences Meeting including The New Horizons Forum and Aerospace Exposition, AIAA Paper 2009-976, 2009. [15] S. Hosder, Stochastic response surfaces based on non-intrusive polynomial

490

chaos for uncertainty quantification, International Journal of Mathematical Modelling and Numerical Optimisation 3 (1-2) (2012) 117–139. [16] M. Raisee, D. Kumar, C. Lacor, A non-intrusive model reduction approach for polynomial chaos expansion using proper orthogonal decomposition, Int. J. Numer. Meth. Eng. 103 (4) (2015) 293–312.

495

[17] M. Pettersson, G. Iaccarino, J. Nordström, Polynomial Chaos Methods for Hyperbolic Partial Differential Equations, Springer, ISBN: 978-3-31910714-1, 2015. [18] M. Reagan, H. Najm, R. Ghanem, O. Knio, Uncertainty quantification in reacting-flow simulations through non-intrusive spectral projection, Com-

500

bust Flame 132 (3) (2003) 545–555. [19] M. Berveiller, B. Sudret, M. Lemaire, Stochastic finite element:

A

non intrusive approach by regression, Revue Européenne de Mécanique Numérique 15 (1) (2006) 81–92. [20] D. Kumar, M. Raisee, C. Lacor, An efficient non-intrusive reduced basis 505

model for high dimensional stochastic problems in CFD, Comput. Fluids 138 (2016) 67–82.

30

[21] D. Donoho, Compressed sensing, IEEE Trans. Inform. Theory 52 (4) (2006) 1289–1306. [22] E. Candès, M. Wakin, An introduction to compressive sampling, IEEE Sig. 510

Proc. Mag. 25 (2) (2008) 21–30. [23] A. Doostan, H. Owhadi, A non-adapted sparse approximation of PDEs with stochastic inputs, J. of Comput. Phys. 230 (8) (2011) 3015–3034. [24] J. Hampton, A. Doostan, Compressive sampling of polynomial chaos expansions: Convergence analysis and sampling strategies, J. Comput. Phys.

515

280 (2015) 363–386. [25] G. Blatman, B. Sudret, Sparse polynomial chaos expansions and adaptive stochastic finite elements using a regression approach, C.R. Mecanique 336 (2008) 518–523. [26] G. Blatman, B. Sudret, An adaptive algorithm to build up sparse poly-

520

nomial chaos expansions for stochastic finite element analysis, Probabilist. Eng. Mech. 25 (2) (2010) 183–197. [27] G. Blatman, B. Sudret, Adaptive sparse polynomial chaos expansion based on Least Angle Regression, J. Comput. Phys. 230 (2011) 2345–2367. [28] B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, Least angle regression,

525

Ann. Stat. 32 (2) (2004) 407–499. [29] S. Choi, R. Grandhi, R. Canfield, C. Pettit, Polynomial chaos expansion with Latin Hypercube sampling for estimating response variability, AIAA Journal 42 (6) (2004) 1191–1198. [30] N. Draper, H. Smith, Applied regression analysis, Wiley series in Probabil-

530

ity and Statistics, Wiley-Interscience, 3rd edn., ISBN: 978-0-471-17082-2, 2014. [31] J. Fox, Applied regression analysis and generalized linear models, Sage, third edn., 2016. 31

[32] J. Neter, W. Wasserman, M. Kutner, Applied linear regression models, 535

R.D. Irwin, 1985. [33] S. Hosder, R. Walters, Non-intrusive polynomial chaos methods for uncertainty quantification in fluid dynamics, in:

AIAA 2010-129, 48th

AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, Orlando, Florida, 2010. 540

[34] S. Hosder, R. Walters, M. Balch, Efficient sampling for non-intrusive polynomial chaos applications with multiple uncertain input variables, in: AIAA 2007-1939, 48th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Material Conference, Honolulu, Hawaii, 2007. [35] G. Blatman, Adaptive sparse polynomial chaos expansions for uncertainty

545

propagation and sensitivity analysis, Ph.D. thesis, Université Blaise Pascal, Clermont Ferrand, 2009. [36] T. Ishigami, T. Homma, An importance quantification technique in uncertainty analysis for computer models, in: Proc. ISUMA’90, First Int. Symp. Uncertain Mod. An., University of Maryland, 398–403, 1990.

550

[37] S. Marelli, B. Sudret, UQLab, Retrieved November 10, 2015, from http: //www.uqlab.com/#!sensitivity-analysis---high-dimensional-/ cpvc, 2015. [38] R. G. Ghanem, P. D. Spanos, Stochastic Finite Element: A Spectral Approach, Springer New York, ISBN: 978-1-4612-7795-8, 1991.

555

[39] D. Xiu, D. Tartakovsky, Numerical methods for differential equations in random domains, SIAM J. Sci. Comput. 28 (3) (2006) 1167–1185.

32