Parameter identification and uncertainty quantification in stochastic state space models and its application to texture analysis


Applied Numerical Mathematics 146 (2019) 38–54


B. Pedretscher a,b,∗, B. Kaltenbacher a, O. Pfeiler b

a Alpen-Adria-Universität, Klagenfurt, Austria
b KAI – Kompetenzzentrum Automobil- und Industrieelektronik GmbH, Villach, Austria

Article history: Received 26 June 2019. Accepted 26 June 2019. Available online 5 July 2019.

Keywords: Stochastic state space model; Parameter identification; Uncertainty quantification; Profile likelihood; Adjoint approach; Fokker-Planck equation; Thermo-mechanical fatigue; Texture analysis

Abstract: In this paper, a computational framework is presented which enables efficient and robust parameter identification, as well as uncertainty quantification, in state space models based on Itô stochastic processes. For optimization, a Maximum Likelihood approach based on the system's corresponding Fokker-Planck equation is followed. Gradient information is included by means of an adjoint approach, which is based on the Lagrangian of the optimization problem. To quantify the uncertainty of the Maximum-A-Posteriori estimates of the model parameters, a Bayesian inference approach based on Markov Chain Monte Carlo simulations, as well as profile likelihoods, are implemented and compared in terms of runtime and accuracy. The framework is applied to experimental electron backscatter diffraction data of a fatigued metal film, where the aim is to develop a model which consistently and physically meaningfully captures the metal's microstructural changes caused by external loading. © 2019 IMACS. Published by Elsevier B.V. All rights reserved.

1. Introduction

Dynamical system modeling in finance, biology and mechanics by stochastic process models has become popular and accepted in recent years. Usually, parameter identification in real applications is not straightforward, due to a lack of observability of the quantities of interest. This means that most frequently the states of the dynamical system cannot be observed directly, but mostly (if at all) only indirectly, which complicates investigations of the physical or economic law that triggers the observed effects. In addition, observations are usually prone to random measurement error, causing ill-posedness of the problem.

Nowadays, power semiconductor devices are used in a large and increasing number of applications, including driving assistance, where safe operation and reliable lifetime models are obligatory for obvious reasons. In the automotive industry it has therefore become mandatory to test the robustness of the devices not only at nominal use, but also at overload conditions by accelerated lifetime tests. During such a power cycling stress test, thermo-mechanical loads are applied by electric pulses, causing material degradation and, in the worst case, device failure. To forecast the safe operating area of the devices, reliable lifetime models are indispensable.

∗ Corresponding author at: KAI – Kompetenzzentrum Automobil- und Industrieelektronik GmbH, Villach, Austria. E-mail address: [email protected] (B. Pedretscher).

https://doi.org/10.1016/j.apnum.2019.06.020 0168-9274/© 2019 IMACS. Published by Elsevier B.V. All rights reserved.


Previously used regression models, based on Arrhenius-type equations or the Coffin–Manson law, show prediction and extrapolation weaknesses because they are not able to capture the complex degradation process of the devices [15,41]. Thus, the need for mathematical degradation modeling, based on physical laws that describe the degradation process, became apparent. First investigations in this direction were done in a diploma thesis [39], where a state space model (SSM) based on ordinary differential equations was developed. Detailed analysis showed that the developed model could not reasonably explain the damage process, and thus the lifetime data, due to a lack of understanding of the material response to loading. Therefore, further research on the characteristic behavior of the used materials, as well as an extension of the mathematical models and methods, is necessary.

In this work, we focus on the behavior of a metal film, which is investigated with the help of a test structure that will be introduced and explained later in the text. Different types of measurements of the fatigued metal, provided by scanning acoustic microscopy, scanning electron microscopy or electron backscatter diffraction (EBSD), combined with literature knowledge from material science, allow the conclusion that stochastic processes are present, which need to be considered and included in any reasonable model. Since the entire degradation process of the structure is not yet completely understood nor modeled, a mathematical framework that makes it possible to check and compare different modeling hypotheses in a reasonable amount of computation time is required. Exactly this requirement is the reason why spatial models based on finite elements cannot be used for this purpose, because of their high computational cost.
More specifically, the aim of this work is to provide a framework that, first, enables combining the available indirect observation data with the internal states of the system, and, second, can be efficiently implemented and numerically solved. Therefore, state space modeling combined with adjoint-based gradient computation, as well as profile-likelihood-based uncertainty quantification (UQ), is used. In general, SSMs allow one to link internal (not directly measurable) system behavior with noisy experimental data under consideration of the experimental setting. The experimental setting refers to the input variables, the internal dynamics refer to the states, and the observables to the output. Input, states and output are linked by means of a parametrized state and observation equation (system). The aim is to identify the modeling parameters of this complex system. At this point note that when speaking of identifiability one has to distinguish between structural and practical identifiability; these concepts are discussed in section 2.4.

As internal stochastic dynamics need to be taken into account, as mentioned above, an SSM framework based on Itô stochastic processes needs to be considered. Due to the noise and sparsity of the observations, parameter identification is likely to be an ill-posed inverse problem. To nonetheless ensure reliable and robust parameter estimates, the system's corresponding Fokker-Planck (FP) equation is considered for the optimization task; an approach proposed by Dunker and Hohage in 2014 [22]. A similar approach is used for optimal control in the articles of Annunziato and Borzì [3,4]. Basically, the FP equation is a partial differential equation (PDE) of parabolic type, which describes the evolution of the transition probability density of the states over time. Due to the use of probability densities, non-negativity and mass conservation of the solution of the FP equation have to be ensured at all times.
For that reason, adequate boundary conditions, as well as the so-called Chang-Cooper (CC) discretization scheme, are implemented [18,35]. The density data required for the optimization is computed from the experimental data by means of kernel density estimation (KDE). Under certain smoothness conditions, it can be shown that the kernel density estimate is asymptotically normal, see [26,19,38,27,50]. This result is important to validly define the total error in the observation equation, as not only the random measurement error but also the approximation error of the KDE needs to be included. The considered l2 loss function of the squared distances between the densities given by the FP equation and the densities estimated from the empirical observations is weighted by the variance and centered around the mean of the total observation error. Since in the limit the loss function coincides with the negative log-likelihood of the system, these terms are used synonymously.

Effective optimization requires gradient information, for which reason the problem's Lagrangian is introduced. By combining optimality conditions, the properties of Nemytskii operators and the adjoint system, it is possible to give a representation of the gradient of the loss function with respect to the parameters, which can be computed accurately and efficiently compared to routines based on finite differences or sensitivity analysis. The quantification of the uncertainty of the parameter estimates is mandatory in any application. Here, a profile likelihood approach is followed, as the literature reports it to be most promising in terms of accuracy and numerical efficiency. It is based on repeated local optimizations; further details are given in [16,49,34,42]. The parameter fitting procedure and UQ framework are validated in a synthetic test setting where an Ornstein-Uhlenbeck (OU) state process is considered.
After validation, the framework is applied to EBSD data of the fatigued metal film of interest, where it is investigated whether the loading-induced changes in microstructure can be explained by an FP approach. Understanding the texture evolution is important, as it is the process driving force for local stress concentrations triggering degradation. The goodness-of-fit is illustrated visually and checked by means of the Kullback-Leibler (KL) divergence. Further, the UQ results obtained by profile likelihoods are compared with the credible sets computed by parallel tempering (PT) Markov Chain Monte Carlo (MCMC) simulations. The implementation is done in MATLAB®, with the help of the open source GitHub repository toolboxes MTEX [7,8], which is used for the pre-processing of the data, and AMICI [24] and PESTO [46], which are used for the parameter estimation and UQ.


2. Mathematical & statistical fundamentals

In this chapter the modeling framework is introduced and the mathematics required for efficient parameter estimation and UQ is summarized. For clarity, the chapter is divided into four sections, introducing

- the modeling framework based on stochastic SSMs,
- the CC discretization of the FP equation,
- the gradient computation by an adjoint approach, as well as
- the UQ based on profile likelihoods.
2.1. Problem statement based on stochastic state space modeling

As outlined in the introduction, the state dynamics of interest are captured by a stochastic process

$$X^\theta = \left\{ X_t^\theta : t \in I \subseteq \mathbb{R}_0^+ \right\}, \tag{1}$$

i.e., a family of random variables

$$X_t^\theta : \Omega \to D \subseteq \mathbb{R}^m \tag{2}$$

defined on a probability space $(\Omega, \mathcal{F}, P)$. More precisely, the process in our setting is an Itô process defined by an SDE of the form

$$dX_t = a^\theta(t, X_t)\,dt + b^\theta(t, X_t)\,dW_t, \tag{3}$$

where $a^\theta$ corresponds to the drift, $b^\theta$ to the volatility or diffusion coefficient, and $W$ to a standard Wiener process or Brownian motion [32]. The superscript $\theta$ indicates the model parameters $\theta \in \Theta \subset \mathbb{R}^d$, which have to be identified by means of the available observation data. To do so, instead of using the SDE (3) directly, its corresponding FP equation is used as state equation in the SSM framework [22]. In statistics, the FP equation is also known as the forward Kolmogorov equation. Basically, it is a linear PDE of parabolic type that models the evolution of the transition probability density







$$u(t, x) = u\left(t, x \,\middle|\, X_0 \sim u_0^\theta\right), \tag{4}$$

for $t > 0$ and $x \in D$. The time interval $I$ in our application refers to the finite and closed interval $[0, T]$. By means of

$$A^\theta(t, x) = \left( \tfrac{1}{2}\, \nabla \cdot \left( b^\theta b^{\theta\,T} \right) - a^\theta \right)(t, x) \tag{5}$$

and

$$B^\theta(t, x) = \tfrac{1}{2} \left( b^\theta b^{\theta\,T} \right)(t, x), \tag{6}$$

the FP equation can be expressed as a continuity equation in divergence form as follows:

$$\frac{\partial u}{\partial t}(t, x) = \nabla \cdot \left( B^\theta \nabla u + A^\theta u \right)(t, x) =: -\nabla \cdot J^\theta(t, x), \tag{7}$$

where $J^\theta$ denotes the probability flux, and $\nabla \cdot C = \left( \sum_{j=1}^{m} \frac{\partial}{\partial x_j} C_{ij} \right)_{i=1,\dots,m}^{T}$ refers to a matrix divergence.
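The SDE (3) and the FP equation (7) describe the same dynamics from two viewpoints, which can be sanity-checked by direct path simulation: the histogram of many sample paths approximates the density governed by (7). A minimal Euler-Maruyama sketch in Python (the paper's implementation is in MATLAB; the drift and volatility below are illustrative placeholders, not the application model):

```python
import numpy as np

def euler_maruyama(a, b, x0, t_grid, n_paths, rng):
    """Sample end points of dX_t = a(t, X_t) dt + b(t, X_t) dW_t, eq. (3)."""
    x = np.full(n_paths, x0, dtype=float)
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        dt = t1 - t0
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Wiener increments
        x = x + a(t0, x) * dt + b(t0, x) * dw
    return x

# Illustrative mean-reverting drift and constant volatility
rng = np.random.default_rng(0)
x_T = euler_maruyama(a=lambda t, x: 1.0 * (1.2 - x),
                     b=lambda t, x: 0.3 * np.ones_like(x),
                     x0=0.0, t_grid=np.linspace(0.0, 2.0, 401),
                     n_paths=20000, rng=rng)
print(x_T.mean(), x_T.std())  # a histogram of x_T approximates u(2, .)
```

For this linear drift the exact moments are known, so the empirical mean and standard deviation of `x_T` can be compared against them directly.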

Due to the fact that the PDE in (7) models the evolution of the probability density $u$, its solution needs to stay non-negative for every $t \in I$, and needs to conserve its mass, i.e., the integral $\int_D u(t, x)\,dx$ has to be normed to one at any time $t \in I$. For the latter reason, a no-flux boundary condition

$$J^\theta(t, x) \cdot n_D(x) = 0, \qquad x \in \partial D, \tag{8}$$

ensuring that no probability mass can enter or leave the system, is used. Non-negativity requires further studies; in particular, maximum principles do not apply directly. The case with a scalar, positive diffusion coefficient was investigated by Carillo et al. in 2011 [17]. The authors showed the decay of the relative entropy, which allowed them to conclude that the solution converges to a stationary solution (equilibrium) and stays non-negative. Generalizing to the multidimensional isotropic case, characterized by a diagonal diffusion matrix with equal positive elements, non-negativity is ensured provided that the initial density is non-negative, the diffusion matrix is symmetric and positive definite, which is obviously fulfilled, and diffusion dominates convection at the boundary, which can be formulated mathematically as:


$$A^\theta \cdot n_D > 0 \quad \text{on } \partial D. \tag{9}$$

The general anisotropic case without boundary conditions was investigated in Arnold et al. [6], where the authors also proved that the relative entropy is decreasing with time. It is expected that a combination of the techniques presented in the two mentioned papers ensures non-negativity in the anisotropic case even with no-flux boundary conditions. At this point it is also interesting to mention the results of Droniou and Vázquez [21], who studied elliptic PDEs with Neumann boundary conditions comparable to the no-flux condition. Due to this similarity, the results can be applied to the stationary FP equation. One of their results gives almost everywhere positive or negative, or constant, solutions if certain smoothness conditions on the domain $D$ and the coefficients $A^\theta$ and $B^\theta$ are fulfilled.

For parameter estimation we assume to have indirect, discrete density data $f(\cdot, x_{j^\dagger})$ at prescribed time points $0 < t_1 < \cdots < t_{n^\dagger} = T$, at the spatial positions $x_{j^\dagger} \in D$, $j^\dagger \in \{0, \dots, N^\dagger\}$. This leads to the following strong hybrid formulation of the SSM on the $k^\dagger$-th sub time interval:

$$\Sigma_{k^\dagger}(\theta, u) : \begin{cases} \dfrac{\partial u}{\partial t}(t, x) = -\nabla \cdot J^\theta(t, x) & \text{in } (t_{k^\dagger-1}, t_{k^\dagger}) \times D, \\[4pt] J^\theta(t, x) \cdot n_D = 0 & \text{on } (t_{k^\dagger-1}, t_{k^\dagger}) \times \partial D, \\[4pt] f_{k^\dagger} = h^\theta_{k^\dagger}\left( u(t_{k^\dagger}, \cdot) \right) + \eta_{k^\dagger}, \end{cases} \qquad k^\dagger \in \{1, \dots, n^\dagger\}, \tag{10}$$

which in variational form reads as follows [29]:

$$\Sigma^{\mathrm{weak}}_{k^\dagger}(\theta, u) : \begin{cases} \displaystyle \int_D \left( \frac{\partial u}{\partial t}(t, x)\, v(x) - J^\theta(t, x) \cdot \nabla v(x) \right) dx = 0 & \forall v \in H_0^1(D), \\[6pt] f_{k^\dagger} = h^\theta_{k^\dagger}\left( u(t_{k^\dagger}, \cdot) \right) + \eta_{k^\dagger}, & k^\dagger \in \{1, \dots, n^\dagger\}. \end{cases} \tag{11}$$

The operators $h^\theta_{k^\dagger}$ in the observation equation are maps of the form

$$h^\theta_{k^\dagger} : C(D; \mathbb{R}) \to \mathbb{R}^{l}. \tag{12}$$

The observation equation accounts for the total error $\eta_{k^\dagger} \in \mathbb{R}^{l}$, which combines the random measurement error and the KDE approximation error. In our application, we consider independent Gaussian measurement noise for every time instance $t_{k^\dagger}$ of available observations. Further, the density is estimated from empirical data by KDE, whose pointwise approximation error is asymptotically normal, see e.g. [19,38,50]. Thus, the vector of total noise $\eta_{k^\dagger}$ additionally contains a systematic component $\mu_{k^\dagger}$. Summarized, we consider independent (with respect to the time instances $t_{k^\dagger}$) normal noise of the form

$$\eta_{k^\dagger} \sim \mathcal{N}\left( \mu_{k^\dagger}, \left( \sigma^\theta_{k^\dagger} \right)^2 I \right) \quad \text{for } k^\dagger \in \{1, \dots, n^\dagger\}.$$

As $l^2$ loss function, we use the asymptotic negative log-likelihood function for the problem (10) or (11), respectively, which due to independence and normality of the error reads as follows:

$$\psi(f) = \frac{1}{2} \sum_{k^\dagger=1}^{n^\dagger} \left[ l \, \log\left( 2\pi \left( \sigma^\theta_{k^\dagger} \right)^2 \right) + \left\| \frac{f_{k^\dagger} - h_{k^\dagger}\left( u(t_{k^\dagger}, \cdot) \right) - \mu_{k^\dagger}}{\sigma^\theta_{k^\dagger}} \right\|_2^2 \right]. \tag{13}$$
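Equation (13) transcribes directly into code. The helper below is an illustrative sketch (array names and shapes are assumptions, not the authors' implementation), with one scalar noise level per time instance and $l$ observation positions:

```python
import numpy as np

def neg_log_likelihood(f, h_u, mu, sigma):
    """Asymptotic negative log-likelihood, eq. (13).

    f, h_u, mu : arrays of shape (n_obs, l) -- data, model output h(u(t_k)),
                 and systematic KDE bias per time instance.
    sigma      : array of shape (n_obs,)   -- noise level per time instance.
    """
    n_obs, l = f.shape
    res = (f - h_u - mu) / sigma[:, None]          # variance-weighted residuals
    return 0.5 * np.sum(l * np.log(2.0 * np.pi * sigma**2)
                        + np.sum(res**2, axis=1))

# Perfect fit: only the normalization term 0.5*n_obs*l*log(2*pi) survives
f = np.zeros((3, 4))
print(neg_log_likelihood(f, f, np.zeros_like(f), np.ones(3)))  # 6*log(2*pi) ~ 11.03
```

With unit noise and a perfect fit the residual norm vanishes, which is a convenient quick check of the implementation.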

Reliable numerical simulation further requires well-posedness of the SSM in finite and infinite time. Standard analysis [23, Theorems 2, 3, 4, Section 7.1.2] guarantees well-posedness of (7) in finite time if the coefficients $A^\theta$ and $B^\theta$ are in $L^\infty((0,T) \times D)$, $u_0^\theta \in L^2(D)$, and $B^\theta$ is uniformly symmetric and positive definite on $(0,T) \times D$. Regarding the parameter-to-drift and parameter-to-volatility map, respectively, this means that well-posedness is guaranteed if

$$\theta \mapsto a^\theta \in C\left( \mathbb{R}^d; L^\infty((0,T) \times D) \right), \quad \text{and} \quad \theta \mapsto b^\theta \in C\left( \mathbb{R}^d; \left\{ b^\theta \in L^\infty\left( (0,T); W^{1,\infty}(D) \right) : \left\| \left( b^\theta \right)^{-1} \right\|_{L^\infty((0,T) \times D)} \leq M \right\} \right),$$

for a constant $M > 0$. Global-in-time well-posedness requires further that condition (9) is fulfilled, as mentioned in [29]. Finally, parameter identification amounts to solving the following optimization problem:

$$\min_{\theta, u} \; \mathcal{J}(\theta, u; f) \quad \text{s.t.} \quad \Sigma(\theta, u; f), \tag{14}$$

where the operator $\mathcal{J}$ is defined by the loss function $\psi$ of equation (13), and $\Sigma$ refers to the SSM on $I$, which is defined by the sub-SSMs (10) or (11), respectively.


2.2. A mass and non-negativity conserving discretization of the Fokker-Planck equation due to Chang and Cooper

For simplicity of notation, we skip the superscript $\theta$ in this subsection and consider a cubic domain $D$. To numerically solve the FP equation, a semi-discretization, based on a spatial discretization of the PDE by a method-of-lines approach, is implemented. For this purpose, an equidistant spatial discretization of the bounded hypercube $D = x_0 + L \cdot Q_m$, where $Q_m$ is the $m$-dimensional unit hypercube $[0,1]^m$ and $x_0$ is an arbitrary corner of $D$, with mesh size $\Delta x = L/N$, is considered. This gives the discretized hypercube





$$D_{\Delta x} = \left\{ x_j \in \mathbb{R}^m : x_j = x_0 + j\,\Delta x,\ j \in \mathbb{N}^m \right\} \cap D, \tag{15}$$

where $j = (j_1, \dots, j_m)$ is a multi-index that indicates the spatial position. Time is discretized based on the steps $t_k = k\,\Delta t$, for $k = 1, \dots, n$. Thus, the solution $u$ of the FP equation is represented by the discrete set $u_j^k = u_j(t_k) = u(t_k, x_j)$. Since standard finite difference schemes do not guarantee mass and non-negativity conservation of the solution of the FP equation, the CC scheme, which conserves these two intrinsic properties, is introduced and summarized in this section. More precisely, in [18] Chang and Cooper demonstrated that commonly used difference schemes (even implicit schemes) are not necessarily particle (mass) conserving.¹ Indeed, they showed that the change in the number of particles per time step is proportional to the squared mesh size $\Delta x^2$. Thus, even if very fine grids are applied, the numerical error can become severe because of accumulation over time until a quasi-equilibrium of the PDE is reached [18]. To overcome this issue, the authors discretized the FP equation according to





$$D_{\Delta t} u_j(t_{k+1}) \approx \frac{1}{\Delta t}\left( u_j(t_{k+1}) - u_j(t_k) \right) = \sum_{i=1}^{m} \frac{1}{\Delta x} \left( J^{i,k}_{j+1_i/2} - J^{i,k}_{j-1_i/2} \right), \tag{16}$$

where $J^{i,k}_{j+1_i/2}$ represents the generalized flux at $x_{j+1_i/2}$ in direction $i$, expressed in implicitly known values of $u$. Due to the implicit formulation in (16), the term $u^{k+1}_{j+1_i/2}$, which is required to compute $J^{i,k}_{j+1_i/2}$, is not known directly, for which reason it was approximated by

$$u^{k+1}_{j+1_i/2} \approx \frac{1}{2}\left( u^{k+1}_{j+1_i} + u^{k+1}_j \right) \tag{17}$$

in a first approach. The spatial derivative $\frac{\partial u}{\partial x}$ in the $i$-th direction at $x_{j+1_i/2}$ in $J^{i,k}_{j+1_i/2}$ was approximated according to

$$\left. \frac{\partial u^{k+1}}{\partial x} \right|_{x_{j+1_i/2}} \approx \frac{1}{\Delta x}\left( u^{k+1}_{j+1_i} - u^{k+1}_j \right). \tag{18}$$

The problem that arises at this point lies in the fact that by applying (17), negative eigenvalues can emerge unless a very restrictive condition on the mesh $\Delta x$ is imposed. Due to runtime issues, such a condition is not suitable for real data simulations. Chang and Cooper solved the problem by means of a convex combination of values of $u$ at adjacent spatial positions and defined $u^{k+1}_{j+1_i/2}$ as

$$u^{k+1}_{j+1_i/2} = \left( 1 - \delta^{i,k}_j \right) u^{k+1}_{j+1_i} + \delta^{i,k}_j\, u^{k+1}_j, \tag{19}$$

for values $\delta^{i,k}_j \in [0, 1]$ to be chosen as below. This approach is consistent with the difference scheme.² Due to this definition, $u_{j+1_i/2}$ does not automatically refer to the mid-point between $u_{j+1_i}$ and $u_j$, as is the case with the common central difference scheme, but denotes any value between these two adjacent points. Thus, depending on the choice of $\delta^{i,k}_j$, the scheme switches between backward, centered and forward differencing. Based on the introduced convex combination in (19), the generalized flux in the $i$-th direction at position $x_{j+1_i/2}$ can be defined implicitly as

$$J^{i,k}_{j+1_i/2} = \left[ \left( 1 - \delta^{i,k}_j \right) A^{i,k}_{j+1_i/2} + \frac{1}{\Delta x} B^{i,k}_{j+1_i/2} \right] u^{k+1}_{j+1_i} - \left[ \frac{1}{\Delta x} B^{i,k}_{j+1_i/2} - \delta^{i,k}_j A^{i,k}_{j+1_i/2} \right] u^{k+1}_j, \tag{20}$$

where the value of $\delta^{i,k}_j$ is chosen in such a way that the scheme restores equilibria and a non-negative spectrum. To preserve the equilibrium condition, remember that the flux is zero in an equilibrium, which gives that

¹ Mass and particle conservation are used as synonyms in this context.
² Originally, the scheme was developed for FP type equations with positive coefficients $A^\theta$ and $B^\theta$, leading $\delta$ to be any value in the interval $[0, 1/2]$. In later works it was proved that it is sufficient to use a positive diffusion matrix $B^\theta$ and a Lipschitz continuous convection term $A^\theta$, which increased the interval for $\delta$ up to $[0, 1]$, see [36,25].

$$\frac{u^{k+1}_{j+1_i}}{u^{k+1}_j} = \frac{\frac{1}{\Delta x} B^{i,k}_{j+1_i/2} - \delta^{i,k}_j A^{i,k}_{j+1_i/2}}{\left( 1 - \delta^{i,k}_j \right) A^{i,k}_{j+1_i/2} + \frac{1}{\Delta x} B^{i,k}_{j+1_i/2}}. \tag{21}$$

The ratio in (21) should coincide with the exact ratio

$$\frac{u^{k+1}_{j+1_i}}{u^{k+1}_j} = \exp\left( -\int_{x_j}^{x_{j+1_i}} \frac{A^i(t_{k+1}, x)}{B^i(t_{k+1}, x)}\, dx_i \right), \tag{22}$$

which can be approximated by

$$\frac{u^{k+1}_{j+1_i}}{u^{k+1}_j} \approx \exp\left( -\frac{A^{i,k}_{j+1_i/2}}{B^{i,k}_{j+1_i/2}}\, \Delta x \right), \tag{23}$$

for sufficiently small values of $\Delta x$. By combining (21) and (23), $\delta^{i,k}_j$ can be approximated by

$$\delta^{i,k}_j = \frac{1}{w^{i,k}_j} - \frac{1}{\exp\left( w^{i,k}_j \right) - 1}, \tag{24}$$

with

$$w^{i,k}_j = \Delta x\, \frac{A^{i,k}_{j+1_i/2}}{B^{i,k}_{j+1_i/2}}. \tag{25}$$

To obtain the range for $\delta^{i,k}_j$, the limits $w^{i,k}_j \to \pm\infty$ have to be evaluated, which gives that $\delta^{i,k}_j \in [0, 1]$, see [25]. Non-negativity of the solution of the FP equation can be ensured for a Lipschitz continuous convection term $A$, see equation (7):

$$|A(t, x + \Delta x) - A(t, x)| \leq \gamma\, \Delta x, \tag{26}$$

where $\gamma > 0$ is the Lipschitz constant. In fact, it was proved that if $\Delta t < \frac{1}{\gamma}$, the CC discretization scheme preserves the non-negativity of the solution of the FP equation [36, Theorems 3.3, 3.6]. Finally, to preserve the mass, a no-flux boundary condition as in equation (8) is used. This condition is implemented by introducing the ghost variables $x_{-1}$ and $x_{N+1}$, and fulfilling

$$J^k_{-1_i/2} = J^k_{N+1_i/2} = 0, \tag{27}$$

for $i = 1, \dots, m$. To further guarantee particle conservation for any backward time differencing scheme used to approximate the partial derivative $\frac{\partial u}{\partial t}$ on the left-hand side of equation (16), note that Lemma 3.1 in the work of Mohammadi and Borzì [36], which treats the special case of the backward Euler method, can be generalized by induction with respect to time. Additionally, in [36] stability of the schemes is proven and local error estimates are established, leading to optimal convergence rates of first and second order in time for CC difference schemes. Summarizing, the implicit CC scheme takes the form

$$\begin{aligned} D_{\Delta t} u_j(t_{k+1}) &\approx \frac{1}{\Delta t}\left( u^{k+1}_j - u^k_j \right) \\ &= \sum_{i=1}^m \frac{1}{\Delta x} \Bigg[ \left( \left( 1 - \delta^{i,k}_j \right) A^{i,k}_{j+1_i/2} + \frac{1}{\Delta x} B^{i,k}_{j+1_i/2} \right) u^{k+1}_{j+1_i} \\ &\qquad - \left( \frac{1}{\Delta x}\left( B^{i,k}_{j+1_i/2} + B^{i,k}_{j-1_i/2} \right) + \left( 1 - \delta^{i,k}_{j-1_i} \right) A^{i,k}_{j-1_i/2} - \delta^{i,k}_j A^{i,k}_{j+1_i/2} \right) u^{k+1}_j \\ &\qquad + \left( \frac{1}{\Delta x} B^{i,k}_{j-1_i/2} - \delta^{i,k}_{j-1_i} A^{i,k}_{j-1_i/2} \right) u^{k+1}_{j-1_i} \Bigg], \end{aligned} \tag{28}$$

where (16), (19), (20), (24), (25) and (27) are combined. Due to the numerical no-flux boundary condition in (27), the corresponding summands in (28) vanish. Finally, by replacing $t_k$ in (28) by an arbitrary time instance $t \in (0, T)$, and assuming $A$ and $B$ to be time independent, a system of ordinary differential equations (ODEs)

$$\dot{u} = \mathbf{A} u, \tag{29}$$


where $\mathbf{A} = \mathbf{A}(t, x, \delta, \theta)$ represents the discretization matrix in (28),³ remains to be solved. Note that depending on the dimension $m$ of the space $D$, one obtains a band matrix consisting of the main diagonal and the two secondary diagonals in one dimension, and a block band matrix with two further block diagonals in two dimensions. The scheme can be generalized to non-equidistant spatial grids, as shown in [14]. Simulations show that, depending on the problem, such an adaptive mesh can improve the numerical results significantly.

2.3. Adjoint based gradient computation with discrete time observations

Efficient parameter identification techniques require gradient information to compute descent directions for the optimization. To compute the gradient of the negative log-likelihood $\psi$ with respect to the unknown parameters $\theta$ at low computational cost, an adjoint approach, based on the Lagrangian of the system, is followed. At this point note that we use the semi-discretization (29) of the state equation and additionally introduce a space discretization for the observations by means of the function

$$h^{N^\dagger}_{k^\dagger} : \mathbb{R}^{N^\dagger+1} \to \mathbb{R}^{l}, \tag{30}$$

which satisfies

$$h^{N^\dagger}_{k^\dagger}(v) \approx h_{k^\dagger}(I v), \qquad \forall v \in \mathbb{R}^{N^\dagger+1}, \tag{31}$$

where $I : \mathbb{R}^{N^\dagger+1} \to C(D; \mathbb{R})$ is a piecewise multilinear interpolant such that $I v(x_{j^\dagger}) = v_{j^\dagger}$, $\forall j^\dagger \in \{0, \dots, N^\dagger\}$. We assume that $h^{N^\dagger}_{k^\dagger}$ is differentiable with Jacobian $\left( h^{N^\dagger}_{k^\dagger} \right)' \in L^\infty\left( \mathbb{R}^{N^\dagger+1} \right)$.
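To make the discretization matrix $\mathbf{A}$ of (29) concrete, the following one-dimensional sketch builds it from the CC flux (20), with the weights (24)-(25) computed inline and the no-flux condition (27), and checks mass conservation and positivity under implicit Euler stepping. The coefficient choices are hypothetical, for illustration only:

```python
import numpy as np

def cc_matrix(a_face, b_face, dx):
    """1-D Chang-Cooper discretization matrix A of u' = A u, eqs. (24)-(29).

    a_face, b_face: convection A and diffusion B at the interior cell faces
    x_{j+1/2}; the boundary fluxes are set to zero (no-flux condition (27)).
    """
    w = dx * a_face / b_face                               # eq. (25)
    delta = np.where(np.abs(w) < 1e-8, 0.5,
                     1.0 / w - 1.0 / np.expm1(w))          # eq. (24)
    # generalized face flux (20): F_{j+1/2} = c_plus*u_{j+1} - c_minus*u_j
    c_plus = (1.0 - delta) * a_face + b_face / dx
    c_minus = b_face / dx - delta * a_face
    n = a_face.size + 1                                    # number of grid nodes
    A = np.zeros((n, n))
    for j in range(n - 1):  # du_j/dt gains +F_{j+1/2}/dx, du_{j+1}/dt gains -F_{j+1/2}/dx
        A[j, j + 1] += c_plus[j] / dx
        A[j, j] -= c_minus[j] / dx
        A[j + 1, j + 1] -= c_plus[j] / dx
        A[j + 1, j] += c_minus[j] / dx
    return A

# Mean-reverting toy coefficients on [-1, 1]; implicit Euler preserves the mass
dx = 0.1
faces = np.arange(-1.0 + dx / 2, 1.0, dx)                  # 20 interior faces
A = cc_matrix(a_face=faces, b_face=0.05 * np.ones_like(faces), dx=dx)
u = np.ones(21) / 21.0                                     # normalized density
for _ in range(200):
    u = np.linalg.solve(np.eye(21) - 0.01 * A, u)
print(u.sum())  # stays 1 up to round-off; u also stays non-negative
```

Because every face flux enters one node with a plus and its neighbor with a minus sign, the columns of `A` sum to zero; this is exactly the discrete counterpart of mass conservation.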

Under consideration of the above introduced discretizations of the state and observation equation in case of empirical observations, the Lagrangian of the discretized problem can be formulated as

$$L(u, \theta, p) = \psi^{N^\dagger}(f) + \sum_{k^\dagger=1}^{n^\dagger} \int_{t_{k^\dagger-1}}^{t_{k^\dagger}} p_{k^\dagger}^T \left( \dot{u} - \mathbf{A} u \right) dt. \tag{32}$$



To set up the adjoint problem, the optimality condition $L_u s = 0$ for $s \in C^1\left( [0, T]; \mathbb{R}^{N^\dagger+1} \right)$ with $s(0) = 0$, integration by parts with respect to time on the sub-intervals, and the differentiability properties of linear and Nemytskii operators [2] are combined. Therewith it can be concluded that

$$\begin{aligned} L_u(u, \theta, p)\, s &= \left( \psi^{N^\dagger} \right)_u s + \sum_{k^\dagger=1}^{n^\dagger} \int_{t_{k^\dagger-1}}^{t_{k^\dagger}} p_{k^\dagger}^T \left( \dot{s} - \mathbf{A} s \right) dt \\ &= \left( \psi^{N^\dagger} \right)_u s + \sum_{k^\dagger=1}^{n^\dagger} \left[ -\int_{t_{k^\dagger-1}}^{t_{k^\dagger}} \left( \dot{p}_{k^\dagger} + \mathbf{A}^T p_{k^\dagger} \right)^T s \, dt + p_{k^\dagger}^T(t_{k^\dagger})\, s(t_{k^\dagger}) - \lim_{t \to t_{k^\dagger-1}^+} p_{k^\dagger}(t)^T s(t_{k^\dagger-1}) \right] \\ &\overset{!}{=} 0, \end{aligned} \tag{33}$$

where $t_0 := 0$ and





$$\left( \psi^{N^\dagger} \right)_u s = -\sum_{k^\dagger=1}^{n^\dagger} \left( \sigma^\theta_{k^\dagger} \right)^{-2} \left( f_{k^\dagger} - h^{N^\dagger}_{k^\dagger}\left( u(t_{k^\dagger}) \right) \right)^T \left( h^{N^\dagger}_{k^\dagger} \right)'\left( u(t_{k^\dagger}) \right) s(t_{k^\dagger}). \tag{34}$$



Based on (33), the adjoint system for the discretized problem on each sub-interval $(t_{k^\dagger-1}, t_{k^\dagger})$ is an ODE of the form

$$\dot{p}_{k^\dagger}(t) = -\mathbf{A}^T p_{k^\dagger}(t), \tag{35}$$

with final conditions

$$p_{k^\dagger}(t_{k^\dagger}) = \begin{cases} \rho_{n^\dagger} & k^\dagger = n^\dagger, \\[4pt] \displaystyle \lim_{t \to t_{k^\dagger}^+} p_{k^\dagger+1}(t) - \rho_{k^\dagger} & k^\dagger = 1, \dots, n^\dagger - 1, \end{cases} \tag{36}$$

³ In case $A$ and $B$ depend on time in a smooth way, (28) gives a valid approximation of (29).

where $\rho_{k^\dagger} = \left( \sigma^\theta_{k^\dagger} \right)^{-2} \left( f_{k^\dagger} - h^{N^\dagger}_{k^\dagger}\left( u(t_{k^\dagger}) \right) \right)^T \left( h^{N^\dagger}_{k^\dagger} \right)'\left( u(t_{k^\dagger}) \right)$. The Lagrange multipliers $p_{k^\dagger}$ are referred to as the adjoint states of the boundary value problem. If the FP equation is not discretized by the CC scheme but the PDE setting is used directly, the adjoint system is a PDE of the form

$$\begin{aligned} -\frac{\partial p_{k^\dagger}}{\partial t} - \frac{1}{2}\left( b^\theta b^{\theta\,T} \right)^T : \nabla^2 p_{k^\dagger} - a^\theta \nabla p_{k^\dagger} &= 0, && \text{in } (t_{k^\dagger-1}, t_{k^\dagger}) \times D, \\ \left( \frac{1}{2}\left( b^\theta b^{\theta\,T} \right) \nabla p_{k^\dagger} \right) \cdot n_D &= 0, && \text{on } (t_{k^\dagger-1}, t_{k^\dagger}) \times \partial D, \end{aligned} \tag{37}$$

where “:” denotes the Frobenius inner product. This equation represents the Kolmogorov backward equation combined with no-flux boundary conditions [29,32]. However, in both settings the gradient of the negative log-likelihood with respect to the unknown parameters is obtained by applying the chain rule to the Lagrangian. Therewith, the gradient can be expressed as

$$\nabla_\theta \psi = L_u u_\theta + L_p p_\theta + L_\theta, \tag{38}$$

where the first two summands vanish since $u$ and $p$ are the solutions of the FP and adjoint equation, respectively. Thus, it only remains to compute the direct derivative of the Lagrangian with respect to the parameters, giving

$$\nabla_\theta \psi = L_\theta. \tag{39}$$

The time-consuming computation of the gradient via sensitivity analysis, where sensitivities of the states and the output would have to be computed successively through further PDE systems, is avoided by the adjoint approach, which makes it highly advantageous compared to straightforward gradient computation methods [48].

2.4. Profile likelihood based uncertainty quantification

Since predictions, apart from the prediction error, are affected by the estimation error of the parameter estimates, which is unavoidable when calibrating a model with experimental data, it is mandatory to infer how well model parameters are determined by experimental data. Therefore, identifiability and parameter confidence have to be studied in detail. The literature offers different approaches to compute confidence intervals, where one has to distinguish between asymptotic and likelihood-based confidence intervals. In a general SSM setting, likelihood-based confidence intervals are used to quantify the confidence of the parameter estimates. Assuming a Gaussian likelihood, the likelihood ratio statistic is approximately $\chi^2$ distributed, giving the following definition of the hyperelliptic confidence region in the parameter space $\mathbb{R}^d$ for a $1-\alpha$ confidence level:

$$CR_{1-\alpha}(\hat{\theta}) = \left\{ \theta \in \Theta \;\middle|\; 2\left( \psi(\theta) - \psi(\hat{\theta}) \right) \leq \Delta_{1-\alpha} \right\}, \tag{40}$$

where $\Delta_{1-\alpha}$ denotes the $1-\alpha$ quantile of the chi-squared distribution $\chi^2_{df}$ with $df = \#\theta = d$, see [49,34,42]. To compute confidence intervals of single parameters, a profile likelihood approach is followed. This concept is based on exploring the parameter space by re-optimizing the likelihood in such a way that it is reduced to a function of a single parameter, which is achieved by treating the remaining parameters as nuisance parameters. Mathematically speaking, the profile likelihood of the $j$-th parameter at value $\beta$ is given as

$$PL_j(\beta) = \min_{\theta \in \Theta_j(\beta)} \psi(\theta), \tag{41}$$

where $\Theta_j(\beta) = \left\{ \theta \in \Theta : \theta_j = \beta \right\}$, see [49]. By means of the results of (41) for different values of $\beta$, likelihood-based confidence intervals are obtained as

$$CI_{1-\alpha}(\hat{\theta}_j) = \left\{ \beta \;\middle|\; 2\left( PL_j(\beta) - \psi(\hat{\theta}) \right) \leq \Delta_{1-\alpha} \right\} \tag{42}$$

$$\qquad\qquad\;\; = \left\{ \beta \;\middle|\; \exists\, \theta \in CR_{1-\alpha}(\hat{\theta}) \wedge \theta_j = \beta \right\}, \tag{43}$$

where $\Delta_{1-\alpha}$ is the $1-\alpha$ quantile of $\chi^2_{df}$ with $df = 1$. If the confidence interval of a parameter is finite, the parameter is said to be practically identifiable. On the contrary, unbounded confidence intervals indicate non-identifiability of a parameter, resulting from data insufficiency (practical non-identifiability) or modeling/parameter redundancy (structural non-identifiability) [47,42]. At this point it is important to mention that in our application we only investigate whether or not a parameter is practically identifiable. This means we investigate whether the credible intervals are finite for sufficiently small noise. The concept of practical identifiability differs from the usual concept of identifiability, where the lengths of the confidence intervals additionally need to converge to zero for “infinitely” large samples.
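For a model that is linear in the nuisance parameter, the re-optimization in (41) has a closed form, so the whole profile-and-threshold recipe (41)-(42) fits in a few lines. The exponential-decay model below is a synthetic stand-in, not the paper's application:

```python
import numpy as np

def profile_theta2(beta, t, y, sigma):
    """Profile likelihood (41): for fixed theta2 = beta, the nuisance theta1
    of the toy model y = theta1*exp(-theta2*t) has a closed-form optimum."""
    g = np.exp(-beta * t)
    theta1 = (y @ g) / (g @ g)                    # least-squares re-optimization
    return 0.5 * np.sum(((y - theta1 * g) / sigma) ** 2)

t = np.linspace(0.0, 3.0, 30)
rng = np.random.default_rng(1)
y = 2.0 * np.exp(-0.7 * t) + 0.05 * rng.normal(size=t.size)

betas = np.linspace(0.3, 1.3, 201)
pl = np.array([profile_theta2(b, t, y, 0.05) for b in betas])
# threshold with the chi-squared quantile, eq. (42): Delta_0.95 ~ 3.84 for df = 1
inside = 2.0 * (pl - pl.min()) <= 3.84
ci = (betas[inside][0], betas[inside][-1])
print(ci)   # a finite interval around the true value 0.7
```

Since the interval comes out finite and well inside the scanned range, the decay-rate parameter is practically identifiable in the sense discussed above; an interval hitting the grid boundary would instead flag practical non-identifiability.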


3. Validation and application

In order to validate the introduced parameter identification framework, an exemplary setting based on the Ornstein-Uhlenbeck process is considered. This standard example is used as errors in the implementation would become visible immediately, and the drawbacks of MCMC simulations can be highlighted in this simple setting. Then, the approach is applied to measurement data of a polycrystalline metal film, which reveals the change in microstructural properties due to applied stress.

3.1. Validation of the parameter identification framework

For the test setting, a process of the form

dX_t = ζ(μ − X_t) dt + σ dW_t,   (44)

with the parameters (ζ, μ, σ), where it is of importance that ζ > 0, σ > 0 and μ ∈ ℝ, is considered as the state SDE. In the literature, such a process is referred to as an OU process. Generally speaking, it describes the movement of particles under Brownian motion subject to friction [45]. For ease of exposition, an identity observation operator h ≡ Id is considered in this setting, leading to the SSM

(ζ, μ, σ):  { ∂u/∂t (t, x) = ∇ · ( (1/2) σ² ∇u − ζ(μ − x) u )(t, x),
              f_{k†}(x_{j†}) = u_{k†}(x_{j†}) + η_{k†,j†},   (45)

with no-flux boundary conditions, where x_{j†} ∈ D_x, 0 < t_1 < … < t_{n†} = T. The noise η_{k†,j†} accounts for the bias and variation of the KDE. To obtain the density data f_{k†}(x_{j†}), trajectories of the state SDE representing the empirical data are required first. Usually one would (if possible) apply e.g. a Lamperti transformation, such that the volatility becomes state independent, and then do path sampling in the transformed setting by means of the Milstein scheme with sufficiently small time increments. However, since the OU process is explicitly solvable, namely

X_t = μ + (X_0 − μ) e^{−ζt} + σ ∫_0^t e^{−ζ(t−s)} dW_s
    = μ + (X_0 − μ) e^{−ζt} + σ e^{−ζt} ∫_0^t e^{ζs} dW_s,   (46)

the analytic solution is used for sampling instead. For this purpose, the stochastic integral in equation (46), i.e.,

∫_0^t e^{ζs} dW_s,   (47)

is transformed by Doob's transformation [1,20]. To this end, the Dambis–Dubins–Schwarz theorem [43] is applied, which guarantees the existence of a Wiener process fulfilling

∫_0^t e^{ζs} dW_s = W_{τ(t)},   t ≥ 0,   (48)

where τ(t) = (e^{2ζt} − 1)/(2ζ). Combining (46) with (48) gives the analytic solution of the OU process:

X_t = X_0 e^{−ζt} + μ (1 − e^{−ζt}) + σ e^{−ζt} W_{τ(t)}.   (49)

Based on this representation, ten thousand trajectories of the process within the time interval [0, 2], with the parameters (ζ, μ, σ) = (1, 1.2, 0.3) and a uniform initial distribution X_0 ∼ U[0, 0.1], were sampled. The gained sample paths are visualized in Fig. 1, where the horizontal dotted line indicates the mean or long-time equilibrium of the process. In particular, one can show for the large-time behavior of the process that E[lim_{t→∞} X_t] = μ and Var[lim_{t→∞} X_t] = σ²/(2ζ), resulting in converging sample paths that fluctuate around the long-time mean. Thus, if the domain D for the FP equation is chosen large enough, in the sense that the sample paths neither touch nor cross the boundary ∂D, the introduced fitting procedure can be applied, as the no-flux boundary condition does not falsely affect the evolution of the density.
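A sketch of this exact sampling scheme in Python (NumPy), using the time change τ(t) to generate the Brownian motion W_{τ(t)} on a grid, could look as follows; the parameter values are those of the test setting above.

```python
import numpy as np

def sample_ou_exact(t_grid, zeta, mu, sigma, x0, rng):
    """Exact OU paths via (49): X_t = X_0 e^{-zeta t} + mu (1 - e^{-zeta t})
    + sigma e^{-zeta t} W_{tau(t)}, with time change tau(t) = (e^{2 zeta t} - 1)/(2 zeta)."""
    tau = (np.exp(2.0 * zeta * t_grid) - 1.0) / (2.0 * zeta)
    # Brownian increments over the transformed time grid tau(t_k)
    dW = rng.normal(0.0, np.sqrt(np.diff(tau)), size=(len(x0), len(t_grid) - 1))
    W_tau = np.concatenate([np.zeros((len(x0), 1)), np.cumsum(dW, axis=1)], axis=1)
    decay = np.exp(-zeta * t_grid)
    return x0[:, None] * decay + mu * (1.0 - decay) + sigma * decay * W_tau

rng = np.random.default_rng(42)
t = np.linspace(0.0, 2.0, 101)
x0 = rng.uniform(0.0, 0.1, 10_000)              # X_0 ~ U[0, 0.1]
paths = sample_ou_exact(t, zeta=1.0, mu=1.2, sigma=0.3, x0=x0, rng=rng)
```

The empirical moments of paths[:, -1] can be checked against the analytic mean E[X_0] e^{−ζT} + μ(1 − e^{−ζT}) and variance Var[X_0] e^{−2ζT} + σ²(1 − e^{−2ζT})/(2ζ), which makes this a convenient sanity check before fitting.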


Fig. 1. Trajectories of the OU process with true parameters.

Fig. 2. KDE estimated FP density and its temporal evolution.

For parameter estimation, density data f_{k†}(x_{j†}) out of the sampled trajectories at the prescribed time instances

t_{k†} = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0)   (50)

was computed by KDE, where a uniform kernel was used. The estimated density at the time instances (50) is shown in Fig. 2. At this point note that estimated data with an absolute value smaller than 1E−30 is neglected and set to NaN; thus, for parameter identification only the data highlighted by the black lines at the spatial grid points in Fig. 2 is used. The optimization routine based on the CC discretization and the adjoint approach is implemented in MATLAB®, where in particular the toolboxes AMICI and PESTO are used for the parameter identification and UQ tasks [24,46]. It is worth mentioning that the model needs to be implemented symbolically, as the toolboxes build on the symbolic toolbox of MATLAB®. Latin hypercube sampling within the lower [0.1, 0.5, 0.1] and upper [2.0, 2.9, 0.5] bounds for the parameters was performed to obtain random start parameter vectors for the optimization. Note that the ℓ2 loss function is treated as the negative log-likelihood function, as this is true in the asymptotic sense. Further, prior information for the parameters, namely uniform distributions within the lower and upper bounds, is applied. For these two reasons, the estimate is biased and called the Maximum-A-Posteriori (MAP) estimate. Thus, by following Bayes' law, the global MAP estimate (ζ̂, μ̂, σ̂) = (1.0006, 1.2025, 0.2970) was reached. A comparison with the true parameters gives relative errors of (0.06%, 0.21%, 1%). Goodness-of-fit, as well as identifiability, are checked with the help of the posterior probability profiles of the parameters, which are shown in Fig. 3. These profiles are obtained by following the profile likelihood approach described in section 2.4. Due to the usage of the prior knowledge, posterior profiles instead of only likelihood profiles could be computed. For the computation of the profiles, an optimization based method, highlighted by the red lines in Fig. 3, is applied.
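A uniform-kernel KDE of the kind used to generate the density data can be written in a few lines; the normal samples below are illustrative stand-ins for the trajectory values at one time instance, and the bandwidth is a hypothetical choice.

```python
import numpy as np

def uniform_kde(samples, grid, h):
    """Kernel density estimate with the uniform (box) kernel K(u) = 1/2 on |u| <= 1:
    f_hat(x) = 1/(n h) * sum_i K((x - X_i) / h)."""
    u = (grid[:, None] - samples[None, :]) / h
    return 0.5 * (np.abs(u) <= 1.0).mean(axis=1) / h

rng = np.random.default_rng(1)
samples = rng.normal(1.2, 0.3, 10_000)      # illustrative sample values at one t_k
grid = np.linspace(0.0, 2.4, 241)
f_hat = uniform_kde(samples, grid, h=0.05)
mass = f_hat.sum() * (grid[1] - grid[0])    # Riemann sum, should be close to 1
```

The bandwidth h controls the bias-variance trade-off of the estimate; grid values below the 1E−30 threshold would then be masked to NaN as described above.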
For further details on these methods, the reader is referred to [16]. A close look at the profiles indicates that each of the parameters is practically identifiable. This can be concluded by visual inspection alone, since every statistical confidence level gives finite confidence intervals for each of the three parameters. To highlight the efficiency of the above introduced parameter estimation technique based on optimization and adjoint based gradient information, the results were compared with estimates obtained by following a Bayesian inference approach based on MCMC simulations. More precisely, a parallel tempering (PT) algorithm was used to simulate Markov chains for the three unknown parameters. Generally speaking, PT rescales the negative log-likelihood by introducing an artificial temperature level. Therewith, multi-modality of any cost functional can be handled reliably, as the chains do not stop in local minima


Fig. 3. Posterior probability profiles of the parameters of the OU process. (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)

Fig. 4. Results of PT MCMC simulations.

if an appropriate rescaling is implemented [9,10,44]. Here, six different temperature levels were considered to generate the Markov chains shown in Fig. 4. In total, each of the chains consists of one hundred thousand samples. A detailed look at the figure indicates that each of the three chains converged to a stationary posterior probability after the Burn-In phase, i.e., the (initial) time the chain is run until it is close to stationarity, has passed. When comparing the confidence intervals for the parameter estimates gained by sampling with those gained by computing the posterior probability profiles, one can recognize differences, even when the Burn-In numbers are neglected. For this, compare Figs. 5 and 6, where (80%, 90%, 95%) confidence intervals with and without Burn-In are plotted. For explanation, the longest intervals (lightest color) refer to the 95% credible intervals, the middle ones (medium color intensity) to the 90% credible intervals, and the shortest (darkest color) to the 80% credible intervals. The abbreviations on the y-axis indicate whether the intervals are gained by MCMC (Bay., Sampl.) or based on the profile likelihood approach (Profile L.) and the corresponding posterior probability profiles, respectively, which are shown in Fig. 3. At this point note the


Fig. 5. Confidence intervals due to MCMC (Burn-In included) and profile likelihood.

Fig. 6. Confidence intervals due to MCMC (no Burn-In included) and profile likelihood.

different scaling of the horizontal axes in Figs. 5 and 6 and the fact that the intervals computed by Bayesian sampling are excessively extended to the left. One can see that even for the simple example with an OU process, confidence intervals gained by MCMC simulations are not as accurate as those generated via posterior probability profiles; a fact confirmed by the literature. In fact, it can be seen that the credible intervals based on sampling and profile likelihoods differ. Due to the fact that in practical applications the Burn-In period is mostly defined heuristically, i.e., it is the user who declares this threshold by visual inspection of the Markov chain, the profile likelihood based confidence intervals are more reliable and independent of user actions. Further, it has to be mentioned that the computation time for the MCMC simulation is much higher than the time needed for the optimization routine that includes the gradient information, highlighting the usability of the introduced approach in practical applications. However, in case only a few trajectories of a process are observed, KDE is very inaccurate, for which reason it is advisable to perform MCMC simulations directly on the path observations. Here, the Euler approximation, which is used to sample the trajectories of the SDE, is the basis for the likelihood construction [28]. To finally visualize the difference between the density estimated from the empirical data (due to the true θ) and the fitted density (due to the MAP θ̂), the Kullback-Leibler (KL) divergence is shown in Fig. 7. Note that due to the exact inclusion of the initial density as initial condition in the optimization setting, the KL distance is zero at the beginning.
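The tempering mechanism used in the MCMC comparison above can be sketched in a few lines. The bimodal toy target, step sizes, and temperature ladder below are illustrative choices for demonstration, not the settings of the paper; the cold (T = 1) chain targets the untempered posterior, while hot chains explore the rescaled, flattened cost landscape and exchange states with their neighbors.

```python
import numpy as np

def neg_log_post(x):
    """Toy bimodal target: mixture of N(-3, 0.5^2) and N(3, 0.5^2)."""
    return -np.log(np.exp(-0.5 * ((x + 3) / 0.5) ** 2)
                   + np.exp(-0.5 * ((x - 3) / 0.5) ** 2) + 1e-300)

def parallel_tempering(n_iter, temps, rng, step=1.0):
    """Random-walk Metropolis at each temperature T_k, targeting the rescaled
    density exp(-neg_log_post(x) / T_k); adjacent levels exchange states."""
    x = np.zeros(len(temps))
    cold = np.empty(n_iter)
    for it in range(n_iter):
        for k, T in enumerate(temps):          # within-temperature MH moves
            prop = x[k] + step * np.sqrt(T) * rng.normal()
            if np.log(rng.random()) < (neg_log_post(x[k]) - neg_log_post(prop)) / T:
                x[k] = prop
        k = int(rng.integers(len(temps) - 1))  # propose one adjacent swap
        log_r = (neg_log_post(x[k]) - neg_log_post(x[k + 1])) \
                * (1.0 / temps[k] - 1.0 / temps[k + 1])
        if np.log(rng.random()) < log_r:
            x[k], x[k + 1] = x[k + 1], x[k]
        cold[it] = x[0]                        # the T = 1 chain samples the posterior
    return cold

rng = np.random.default_rng(0)
cold = parallel_tempering(20_000, temps=(1.0, 2.0, 4.0, 8.0, 16.0, 32.0), rng=rng)
```

Without the hot levels and swaps, a single T = 1 random-walk chain started in one mode would rarely cross the deep barrier between the modes, which is exactly the multi-modality problem PT addresses.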


Fig. 7. Temporal evolution of KL distance between prescribed and fitted FP density.
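A KL distance of the kind plotted in Fig. 7 can be evaluated from two densities discretized on a common grid; in the small sketch below, the two normal densities are illustrative stand-ins for the prescribed and fitted FP densities.

```python
import numpy as np

def kl_divergence(p, q, dx):
    """Discretized Kullback-Leibler divergence D_KL(p || q) = int p log(p/q) dx;
    grid points with p = 0 contribute zero by convention."""
    mask = p > 0.0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])) * dx)

x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
p = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)            # N(0, 1)
q = np.exp(-0.5 * (x - 0.5) ** 2) / np.sqrt(2.0 * np.pi)    # N(0.5, 1)
d = kl_divergence(p, q, dx)   # analytic value (mu1 - mu2)^2 / 2 = 0.125
```

Note that D_KL is non-negative and zero exactly when both densities coincide, but it is not symmetric in its arguments, so the order of the prescribed and fitted density matters.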

3.2. Application to EBSD data of a fatigued metal to explain texture evolution

EBSD is a prominent investigation tool in material science to study the microstructure and the characteristic properties of materials. As mentioned previously, the focus of this work lies on the characterization of a metal film, which shows textural changes because of applied thermo-mechanical stress. More precisely, the metal of interest is a polycrystalline solid that consists of a large number of single crystals, termed grains. During loading, these grains change their integrity, which leads to local stress concentrations at the grain boundaries, triggering the initialization of intergranular voids. The voids grow, coalesce and finally accumulate to cracks, which in the worst case spread through the film. The mentioned observations stem from experiments on a test structure; further details can be found in [37]. To reliably model and predict the degradation of the metallization, it is first important to reasonably model the changes in texture that are caused by loading, as they are the driving force of the degradation process. A useful concept in this context is the so-called Grain Boundary Character Distribution (GBCD) – a distribution of relative lengths (2D) or facets (3D) with a given misorientation. In the literature, this statistic has been studied heavily by Barmak et al. and Kinderlehrer et al., who modeled coarsening cellular networks based on curvature driven growth of grain boundaries [12,11,30,31]. In a simplified setting, the authors were able to prove a FP type evolution for the GBCD, which is defined as



ρ(α, t) = relative length of arc of misorientation α ∈ D at time t,   normalized such that ∫_D ρ dα = 1.   (51)

In some more detail, they started from a local dissipation principle, which was upscaled from the local level by means of the GBCD. To account for critical events like grain deletion, they added an entropy term and included the kinetic energy, which finally gave them a FP equation for the GBCD:

μ ∂ρ/∂t = ∂/∂α ( λ ∂ρ/∂α + (dζ/dα) ρ ),   in D, 0 < t < ∞,   (52)

where μ > 0 is a constant, ζ = ζ(α) ≥ 0 is the surface energy potential of the grain boundaries, which depends on the misorientation α between adjacent grains, λ denotes the diffusion parameter of the metal, and D is the space of misorientation angles [11]. The underlying Langevin equation of (52) shows high similarity to the SDE model of Barmak et al. [12], who in a first approach studied the relation between coarsening and random critical events by an SDE of the form:

dα_t = −(dγ/dα) dt + σ dW_t,   (53)

with energy potential γ and volatility σ. However, the authors found that this SDE model is too naive, as it only considers the misorientation angle. This was the reason why they started modeling based on curvature driven growth and the GBCD, which incorporates the relative lengths of the grain boundaries. In the following, we follow the framework introduced in the sections above, i.e., we consider the FP equation (52) as state equation. For the definition of an appropriate observation equation it has to be remarked that the GBCD can directly be computed out of the experimental observation data, giving an identity observation operator in the SSM framework (10). To reliably set up the model, the misorientation space D and the surface energy ζ have to be defined carefully. Due to crystal symmetry reasons, the interval for the misorientation angles α is the closed interval [0, 62.8°], see [33], and the surface energy is assumed to be of periodic nature. Mathematically speaking, in a first approach the surface energy is modeled by a periodic potential of the form ζ(α) = ε sin(fα), where ε is a modeling parameter and the frequency f is fixed to f = 180/125.6. Summarized, we end up with the following FP state equation:


Fig. 8. The empirical GBCD and its evolution over stress cycles.

Fig. 9. Confidence intervals based on MCMC simulations and profile likelihoods.

μ ∂ρ/∂t = ∂/∂α ( λ ∂ρ/∂α + ε f cos(fα) ρ ),   in D, 0 < t < ∞,   (54)

where D = [0, 62.8°], μ and ε are the free modeling parameters that need to be inferred from the measurement data, and the diffusion parameter λ is fixed by means of the material's characteristic activation energy and the applied stress condition in the experiment. The empirical data is allocated from a time series of EBSD scans of the fatigued metal film. The post-processing of the data, including the removal of inaccurate measurements and the extraction of the GBCD, is implemented in MATLAB®, where the open source toolbox MTEX is used [7,8]. The evolution of the experimental GBCD is shown in Fig. 8, where the misorientations are given in rad. On the basis of the data shown in Fig. 8, the Maximum-A-Posteriori estimates for the model parameters are computed, giving a maximum KL value of 0.0608, which indicates a reasonable fit. To quantify the uncertainty of the estimates, the parameters' corresponding confidence intervals are computed, where the results of MCMC simulations are compared against profile likelihood results, as shown in Figs. 9 and 10. Visual inspection of Fig. 9 enables the conclusion that both modeling parameters are practically identifiable, as each of the CIs is finite. Further, the difference between the CIs out of MCMC simulations and the credible sets obtained from profile likelihoods is eye-catching. The mentioned difference is illustrated in more detail in Fig. 10, where the histograms corresponding to the MCMC results are plotted against the posterior probability profiles for the parameters. The convoluted results of the multistart optimization are highlighted by the red circles (MS – conv.). The results presented in this section are important to better understand the process of void formation. Interesting at this point is the decrease of probability mass around 60°, which was already reported in the work of Bigl [13], who showed a decrease of twin boundaries. Although the heating method of his experiments was different, the results are comparable to the observations in this experiment. A detailed discussion of the model can be found in [40].
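A mass-conserving discretization of the state equation (54) with no-flux boundaries can be sketched as an explicit finite-volume scheme. This is a simplified stand-in for the Chang-Cooper (CC) discretization actually used in the framework, and the parameter values below are hypothetical, chosen only to illustrate the structure of the solver.

```python
import numpy as np

def fp_step(rho, dalpha, dt, mu, lam, eps, f, alpha_faces):
    """One explicit finite-volume step for (54):
    mu * d(rho)/dt = d/dalpha( lam * d(rho)/dalpha + eps*f*cos(f*alpha) * rho ),
    with the flux set to zero at both domain ends (no-flux boundary condition)."""
    drift = eps * f * np.cos(f * alpha_faces)            # dzeta/dalpha at cell faces
    flux = lam * np.diff(rho) / dalpha + drift * 0.5 * (rho[1:] + rho[:-1])
    flux = np.concatenate([[0.0], flux, [0.0]])          # no-flux boundaries
    return rho + (dt / mu) * np.diff(flux) / dalpha

mu, lam, eps, f = 1.0, 0.05, 0.1, 180.0 / 125.6          # hypothetical values
alpha = np.linspace(0.0, 62.8 * np.pi / 180.0, 201)      # D = [0, 62.8 deg] in rad
dalpha = alpha[1] - alpha[0]
alpha_faces = 0.5 * (alpha[1:] + alpha[:-1])
rho = np.ones_like(alpha) / (alpha[-1] - alpha[0])       # start from a uniform GBCD
mass0 = rho.sum() * dalpha
dt = 0.2 * mu * dalpha ** 2 / lam                        # conservative explicit step
for _ in range(2000):
    rho = fp_step(rho, dalpha, dt, mu, lam, eps, f, alpha_faces)
mass = rho.sum() * dalpha
```

Because the update telescopes over the face fluxes and the boundary fluxes vanish, the total mass Σ_i ρ_i Δα is conserved, mirroring the conservation property of the no-flux FP problem that the fitting procedure relies on.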


Fig. 10. Detailed comparison between MCMC results and posterior probability profiles.

4. Conclusion

In this paper, parameter identification and UQ for SSMs based on Itô stochastic processes were performed via the system's corresponding FP equation. For efficient numerical optimization, the gradient of the cost functional with respect to the unknown parameters was computed by an adjoint approach. The UQ results based on profile likelihoods were compared with the results reached by PT MCMC simulations. The algorithmic optimization approach based on gradient information shows benefits over the commonly applied MCMC simulations for two reasons: first, the confidence intervals obtained by computing the likelihood or posterior probability profiles, respectively, are more accurate; second, the computation time for the optimization is much lower than the expensive time required for the MCMC simulations. Further, the presented fitting results for the GBCD of a fatigued Cu film are important as they help in understanding the response of the metal to thermo-mechanical fatigue, which is crucial for future degradation modeling. Future work incorporates the extension of the fitting procedure to Lévy processes, as the void initialization has to be modeled by a Poisson process, whose intensity is assumed to depend on the change in local misorientation density. For this purpose, the work of Annunziato and Gottschalk [5], who present an optimal control approach for the calibration of a model based on a Lévy process, will be considered.

Acknowledgement

The authors want to thank Ralf Hielscher (TU Chemnitz, Germany) and Paul Stapor (Helmholtz Zentrum Munich) for their assistance, as well as Michael Nelhiebel (KAI GmbH, Austria), Max Hermann Poech (Fraunhofer-Institut für Siliziumtechnologie (ISIT), Germany), Florian Peter Pribahsnik (KAI GmbH, Austria), Johannes Zechner (KAI GmbH, Austria) and Thomas Svensson (Ingenjörsstatistik, Borås, Sweden) for fruitful discussions. This work was funded by the Austrian Research Promotion Agency (FFG, Project No. 863947).
Moreover, the authors wish to thank all reviewers for their very careful reading of the manuscript and their detailed reports with valuable comments and suggestions that have led to an improved version of the paper.

References

[1] L. Alili, P. Patie, J.L. Pedersen, Representations of the first hitting time density of an Ornstein-Uhlenbeck process, Stoch. Models 21 (2005) 967–980, https://doi.org/10.1080/15326340500294702.
[2] H. Amann, J. Escher, Analysis II, Grundstudium Mathematik, Birkhäuser, Basel, 2008.
[3] M. Annunziato, A. Borzì, Optimal control of probability density functions of stochastic processes, Math. Model. Anal. 15 (2010) 393–407, https://doi.org/10.3846/1392-6292.2010.15.393-407.
[4] M. Annunziato, A. Borzì, A Fokker–Planck control framework for multidimensional stochastic processes, J. Comput. Appl. Math. 237 (2013) 487–507, https://doi.org/10.1016/j.cam.2012.06.019.
[5] M. Annunziato, H. Gottschalk, Calibration of Lévy processes using optimal control of Kolmogorov equations with periodic boundary conditions, Math. Model. Anal. 23 (2018), https://doi.org/10.3846/mma.2018.024.
[6] A. Arnold, E. Carlen, Q. Ju, Large-time behavior of non-symmetric Fokker-Planck type equations, Commun. Stoch. Anal. (2008) 153–175.
[7] F. Bachmann, R. Hielscher, H. Schaeben, Texture analysis with MTEX – free and open source software toolbox, in: Texture and Anisotropy of Polycrystals III, Trans Tech Publications Ltd., 2010, pp. 63–68.
[8] F. Bachmann, R. Hielscher, H. Schaeben, Grain detection from 2D and 3D EBSD data – specification of the MTEX algorithm, Ultramicroscopy 111 (2011) 1720–1733, https://doi.org/10.1016/j.ultramic.2011.08.002.
[9] B. Ballnus, S. Hug, K. Hatz, L. Görlitz, J. Hasenauer, F.J. Theis, Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems, BMC Syst. Biol. 11 (2017) 63, https://doi.org/10.1186/s12918-017-0433-1.
[10] B. Ballnus, S. Schaper, F.J. Theis, J. Hasenauer, Bayesian parameter estimation for biochemical reaction networks using region-based adaptive parallel tempering, Bioinformatics 34 (2018) i494–i501, https://doi.org/10.1093/bioinformatics/bty229.
[11] K. Barmak, E. Eggeling, M. Emelianenko, Y. Epshteyn, D. Kinderlehrer, R. Sharp, S. Ta'asan, Critical events, entropy, and the grain boundary character distribution, Phys. Rev. B 83 (2011) 134117, https://doi.org/10.1103/PhysRevB.83.134117.
[12] K. Barmak, M. Emelianenko, D. Golovaty, D. Kinderlehrer, S. Ta'asan, Towards a statistical theory of texture evolution in polycrystals, SIAM J. Sci. Comput. 30 (2008) 3150–3169, https://doi.org/10.1137/070692352.
[13] S. Bigl, S. Wurster, M.J. Cordill, D. Kiener, Advanced characterisation of thermo-mechanical fatigue mechanisms of different copper film systems for wafer metallizations, Thin Solid Films 612 (2016) 153–164, https://doi.org/10.1016/j.tsf.2016.05.044.
[14] J.P.S. Bizarro, P. Rodrigues, Numerical Fokker–Planck calculations in nonuniform grids, Phys. Plasmas 8 (2001) 1903–1909, https://doi.org/10.1063/1.1348331.
[15] O. Bluder, Prediction of Smart Power Device Lifetime Based on Bayesian Modeling, Ph.D. thesis, Alpen-Adria-Universität, Klagenfurt, 2011.
[16] R. Boiger, J. Hasenauer, S. Hroß, B. Kaltenbacher, Integration based profile likelihood calculation for PDE constrained parameter estimation problems, Inverse Probl. 32 (2016) 125009.
[17] J.A. Carrillo, S. Cordier, S. Mancini, A decision-making Fokker-Planck model in computational neuroscience, J. Math. Biol. 63 (2011) 801–830, https://doi.org/10.1007/s00285-010-0391-3.
[18] J.S. Chang, G. Cooper, A practical difference scheme for Fokker-Planck equations, J. Comput. Phys. 6 (1970) 1–16.
[19] Y.C. Chen, A tutorial on kernel density estimation and recent advances, Biostat. Epidemiol. 1 (2017) 161–187, https://doi.org/10.1080/24709360.2017.1396742.
[20] J.L. Doob, The Brownian movement and stochastic equations, Ann. Math. 43 (1942) 351–369, https://doi.org/10.2307/1968873.
[21] J. Droniou, J.L. Vázquez, Noncoercive convection-diffusion elliptic problems with Neumann boundary conditions, Calc. Var. Partial Differ. Equ. 34 (2009) 413–434, https://doi.org/10.1007/s00526-008-0189-y.
[22] F. Dunker, T. Hohage, On parameter identification in stochastic differential equations by penalized maximum likelihood, Inverse Probl. 30 (2014) 095001, https://doi.org/10.1088/0266-5611/30/9/095001.
[23] L.C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, vol. 19, American Mathematical Society, Providence, RI, 1998.
[24] F. Fröhlich, J. Hasenauer, D. Weindl, P. Stapor, AMICI – Advanced Matlab Interface to CVODES and IDAS, GitHub repository, http://icb-dcm.github.io/AMICI/.
[25] B. Gaviraghi, M. Annunziato, A. Borzì, Analysis of splitting methods for solving a partial integro-differential Fokker–Planck equation, Appl. Math. Comput. 294 (2017) 1–17.
[26] A. Gramacki, Nonparametric Kernel Density Estimation and Its Computational Aspects, Studies in Big Data, Springer International Publishing, 2017.
[27] W. Härdle, M. Müller, S. Sperlich, A. Werwatz, Nonparametric and Semiparametric Models, Springer Series in Statistics, Springer, Berlin, Heidelberg, 2012.
[28] S. Hermann, Bayesian Prediction for Stochastic Process Models in Reliability, Ph.D. thesis, Technische Universität Dortmund, 2016.
[29] B. Kaltenbacher, B. Pedretscher, Parameter estimation in SDEs via the Fokker–Planck equation: likelihood function and adjoint based gradient computation, J. Math. Anal. Appl. 465 (2018) 872–884, https://doi.org/10.1016/j.jmaa.2018.05.048.
[30] D. Kinderlehrer, C. Liu, Evolution of grain boundaries, Math. Models Methods Appl. Sci. 11 (2001) 713–729, https://doi.org/10.1142/S0218202501001069.
[31] D. Kinderlehrer, I. Livshits, S. Ta'asan, A variational approach to modeling and simulation of grain growth, SIAM J. Sci. Comput. 28 (2006) 1694–1715, https://doi.org/10.1137/030601971.
[32] P.E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations, Applications of Mathematics – Stochastic Modelling and Applied Probability, vol. 23, Springer, Berlin, Heidelberg, 2010.
[33] J.K. Mackenzie, Second paper on statistics associated with the random disorientation of cubes, Biometrika 45 (1958) 229–240, https://doi.org/10.1093/biomet/45.1-2.229.
[34] W.Q. Meeker, L.A. Escobar, Teaching about approximate confidence regions based on maximum likelihood estimation, Am. Stat. 49 (1995) 48–53, https://doi.org/10.1080/00031305.1995.10476112.
[35] M. Mohammadi, Analysis of Discretization Schemes for Fokker-Planck Equations and Related Optimality Systems, Ph.D. thesis, Julius-Maximilians-Universität, Würzburg, 2015.
[36] M. Mohammadi, A. Borzì, Analysis of the Chang–Cooper discretization scheme for a class of Fokker–Planck equations, J. Numer. Math. 23 (2015) 271–288, https://doi.org/10.1515/jnma-2015-0018.
[37] S. Moser, G. Zernatto, M. Kleinbichler, M. Nelhiebel, J. Zechner, M.J. Cordill, R. Pippan, A novel setup for in-situ monitoring of thermo-mechanically cycled thin film metallizations, JOM J. Mineral. Metal. Mater. Soc. (2019), submitted for publication.
[38] E. Parzen, On estimation of a probability density function and mode, Ann. Math. Stat. 33 (1962) 1065–1076, https://doi.org/10.1214/aoms/1177704472.
[39] B. Pedretscher, An Inverse Problem to Simulate the Degradation Process of Semiconductor Devices, Master's thesis, Alpen-Adria-Universität, Klagenfurt, 2015.
[40] B. Pedretscher, M. Nelhiebel, B. Kaltenbacher, Applying a statistical model to the observed texture evolution of fatigued metal films, J. Mech. Phys. Solids (2019), submitted for publication.
[41] K. Plankensteiner, Application of Bayesian Networks and Stochastic Models to Predict Smart Power Switch Lifetime, Ph.D. thesis, Alpen-Adria-Universität, Klagenfurt, 2015.
[42] A. Raue, C. Kreutz, T. Maiwald, J. Bachmann, M. Schilling, U. Klingmüller, J. Timmer, Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood, Bioinformatics 25 (2009) 1923–1929, https://doi.org/10.1093/bioinformatics/btp358.
[43] D. Revuz, M. Yor, Continuous Martingales and Brownian Motion, Grundlehren der mathematischen Wissenschaften, Springer, Berlin, Heidelberg, 2013.
[44] M. Sambridge, A parallel tempering algorithm for probabilistic sampling and multimodal optimization, Geophys. J. Int. 196 (2014) 357–374, https://doi.org/10.1093/gji/ggt342.
[45] R. Serfozo, Basics of Applied Stochastic Processes, Probability and Its Applications, Springer, Berlin, Heidelberg, 2009.
[46] P. Stapor, D. Weindl, B. Ballnus, S. Hug, C. Loos, A. Fiedler, S. Krause, S. Hroß, F. Fröhlich, J. Hasenauer, PESTO: Parameter EStimation TOolbox, Bioinformatics 34 (2018) 705–707, https://doi.org/10.1093/bioinformatics/btx676.
[47] I. Swameye, T.G. Müller, J. Timmer, O. Sandra, U. Klingmüller, Identification of nucleocytoplasmic cycling as a remote sensor in cellular signaling by data-based modeling, Proc. Natl. Acad. Sci. 100 (2003) 1028–1033, https://doi.org/10.1073/pnas.0237333100.
[48] F. Tröltzsch, Optimal Control of Partial Differential Equations: Theory, Methods, and Applications, Graduate Studies in Mathematics, vol. 112, American Mathematical Society, 2010.
[49] D.J. Venzon, S.H. Moolgavkar, A method for computing profile-likelihood-based confidence intervals, J. R. Stat. Soc., Ser. C, Appl. Stat. 37 (1988) 87–94, https://doi.org/10.2307/2347496.
[50] L. Wasserman, All of Nonparametric Statistics, Springer Texts in Statistics, Springer, New York, 2006.