
Development of a standard calibration procedure for the DEM parameters of cohesionless bulk materials – Part II: Efficient optimization-based calibration

Christian Richterᵃ, Thomas Rößlerᵇ, Günter Kunzeᵃ, Andre Katterfeldᵇ, Frank Willᵃ

ᵃ Endowed Chair of Construction Machines, Technical University Dresden, Münchner Platz 3, 01062 Dresden, Germany
ᵇ Chair of Material Handling, University of Magdeburg, Universitätsplatz 2, 39106 Magdeburg, Germany

Abstract

The numerical complexity of Discrete Element Method (DEM) simulations generally forces an idealisation of DEM models, making the calibration process the key to realistic simulation results. When calibrating cohesionless, free-flowing bulk materials, individual simple experiments, such as various angle of repose tests, are commonly used as reference for the calibration. Regardless of the experiment, the calibration is regularly performed by trial and error, by systematic variation of the parameters, or by optimization algorithms, until a suitable combination of parameters is found. This paper deals with the development and testing of a highly efficient optimization-based calibration procedure. First, a brief overview of various optimization methods is given, explaining why the use of surrogate models seems the best choice for DEM calibration. Subsequently, a new modular algorithm called "generalized surrogate modeling-based calibration" (GSMC) is presented. For testing the new algorithm, a modified draw down test is used.

Keywords: Discrete Element Method, Calibration, Optimization, Surrogate Model, Draw Down

Email addresses: [email protected] (Christian Richter), [email protected] (Thomas Rößler), [email protected] (Günter Kunze), [email protected] (Andre Katterfeld), [email protected] (Frank Will)


1. Introduction

A variety of work and production processes are characterized by the handling of and interaction with bulk materials. For an efficient design of these processes, as well as the correct design of the associated machines and plants, it is necessary to be able to reproduce the behavior of the corresponding materials in a realistic manner. The discrete element method (DEM) offers a solution to the problem of simulating granular materials. Although developed in the 1970s [1], the method has only been used increasingly in recent years [2, 3]. This is due to advances in affordable computational power, reflected in a large number of publications and application examples from a wide range of industrial sectors, such as construction machinery [4, 5], mining [6], agriculture [7], geotechnics [8] or pharmaceutics [9].

The core idea of DEM is the abstraction of granular materials through a finite number of distinct elements (also called particles). Particles interact with each other as soon and as long as they are in contact. The manner of interaction is controlled by so-called contact models. Each contact model has a variety of different parameters that must be assigned numeric values. These parameters, as well as other particle properties, influence the macroscopic behavior of the simulated material. Since there is no computational approach to derive numerical parameter values directly from a system response, the models must be calibrated. Calibration is mostly an iterative process that attempts to identify a parameter set whose resulting system response corresponds as closely as possible to a desired one. Due to the very long calculation times of DEM simulation runs, the number of necessary computations should be kept as low as possible.

Calibration confronts the user with a number of problems. Identifying parameter values in a pure trial-and-error procedure is usually very inefficient, requires an experienced user and rarely leads to an optimal parameter set.


Improving the calibration process has been a topic of scientific research for years [10]. One approach is the use of design of experiments (DoE) methods [11, 12]. The main disadvantage of these methods is the usually high number of necessary simulation runs. In addition, the optimal parameter set often remains hidden because it is located somewhere between the generated sampling points. An alternative approach is to use optimization methods. Many different optimization methods have already been tested and investigated. For example, Hess [4] uses particle swarm optimization to find an optimal parameter set, while Do [13] uses genetic algorithms. Further optimization-based calibration techniques use surrogate models such as response surface models [14], artificial neural networks [15] or Kriging [16]. Although nearly all of these methods are capable of finding an optimum, they still have some weaknesses. First, it should be noted that none of the above methods is truly efficient in terms of practical application. While Hess, Do and Benvenuti still need several hundred simulation runs to identify an optimum, Rackl uses very small parameter ranges to facilitate optimization. From the authors' point of view, a practicable optimization method must be capable of handling large parameter ranges as well as finding an optimum within a few simulation runs. Second, many of the algorithms mentioned concentrate on optimizing a single objective function value. As already shown in Part I [17], multiple objective function values must be optimized simultaneously to take into account the ambiguity of the parameter sets.

This paper deals with the development and testing of a highly efficient optimization-based calibration procedure. For this purpose, different optimization methods are presented and the requirements that the DEM places on these methods are clarified. Furthermore, a new optimization algorithm is presented, which was designed especially for the requirements of expensive optimization problems, i.e. problems with long computation times. Using the draw down test as standard test and gravel as non-cohesive material, the new algorithm is tested and benchmarked.


2. Calibration procedure

Calibration, also referred to as parameter identification [15], is the process of finding an adequate parameter set that results in nearly the same simulated bulk material behavior as observed in real-world experiments. In general, each calibration procedure can be described using the flow chart shown in figure 1. At the beginning, $I \in \mathbb{N}$ initial parameter sets forming the parameter matrix $X = \{x_1, \dots, x_I\}$ are generated. This can be done manually by the user or automatically using DoE methods. Each parameter set $x_i = \{x_1, \dots, x_N\}$ with $i = 1, \dots, I$ contains $N \in \mathbb{N}$ parameter values. These values act as inputs for an objective function $f(x): \mathbb{R}^N \to \mathbb{R}^M$ calculating $M \in \mathbb{N}$ objectives. The main difference between ordinary optimization and calibration is that in the case of calibration, the objective function is subdivided into two steps. First, some simulation results $y_i' = \{y_1', \dots, y_M'\}$ are calculated. Afterwards, these results are assessed by comparing them to given reference values coming from real-world experiments. After all objective sets $y_i = \{y_1, \dots, y_M\}$ have been calculated, a decision takes place: either a stopping criterion has been reached or not. If yes, the calibration procedure is over. If not, $J \in \mathbb{N}$ new parameter sets must be generated. Generating new parameter sets is done by some kind of supervisor. This can either be a human or an algorithm. For automatic calibration procedures, optimization algorithms can be used.

Figure 1: General calibration procedure
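As an illustration, a minimal Python sketch of this generic loop is given below. All names are hypothetical: `run_simulation` stands in for the DEM solver, `assess` for the comparison against reference data, and `propose` for the supervisor generating new parameter sets.

```python
import numpy as np

def calibrate(run_simulation, assess, propose, x_init, max_evals=50):
    """Generic calibration loop: simulate, assess against reference
    data, then stop or let a supervisor propose new parameter sets."""
    X = list(x_init)                                  # I initial parameter sets
    Y = [assess(run_simulation(x)) for x in X]        # M objective values each

    while len(Y) < max_evals:                         # exhaustion-based stop
        if min(np.sum(y) for y in Y) < 0.01:          # example fitness threshold
            break
        for x in propose(np.array(X), np.array(Y)):   # J new parameter sets
            X.append(x)
            Y.append(assess(run_simulation(x)))

    best = int(np.argmin([np.sum(y) for y in Y]))
    return X[best], Y[best]
```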

3. Optimization methods

3.1. Overview

In the literature there are many proposals for classifying optimization methods. Depending on the achievable accuracy, Törn and Žilinskas [18] specify a subdivision into exact and approximate procedures.


A further possibility for classification consists in the distinction between local, global or hybrid search. While global methods search the entire allowed range, local methods focus on finding an optimum near the starting point. In general, the number of required iterations or computations in local methods is much lower. These methods can be applied to global search by running them from several randomly selected starting points. Hybrid methods are a combination of slow global and fast local methods. A classification of optimization methods based on the basic concept of their operation is proposed in [19]. Figure 2 shows the different types of classification. It is expressly pointed out that this figure does not claim to be exhaustive. The mentioned methods are briefly presented below and their properties explained.

Figure 2: Overview and classification of different optimization methods

3.1.1. Gradient-based methods

As the name implies, these methods use first-order or higher-order derivatives of the objective function for calculating new parameter sets. The prerequisite for this is that the objective function is continuous and differentiable. Constrained optimization problems usually cannot be solved with these methods. Since the parameter vector always converges towards the closest optimum, gradient-based methods are among the local optimization methods. A successful search for a global optimum significantly depends on the choice of a good starting point. For topologies with many local optima these methods are unsuitable.

3.1.2. Simplex- and complex-based methods

The simplex- and complex-based methods belong to the derivative-free optimization methods. Therefore these methods are suitable for tasks with difficult-to-calculate gradients. A big disadvantage is the slow convergence. In order to realize a larger search volume and thereby achieve better robustness against rough surfaces and local optima, a set of linked simplices can be used. These are called complexes.


3.1.3. Stochastic algorithms

Stochastic algorithms are characterized by the fact that they use essentially the same methods as are used in statistical analysis. The search for an optimum usually requires many iterations, resulting in poor convergence. The main advantages are that arbitrary limitations of the search space are supported and the computation can be parallelized very well. The methods are also very suitable for global optimization tasks.

3.1.4. Population-based methods

Population-based optimization methods attempt to exploit nature's optimization strategies by abstracting and copying them. They always consider a set of different individuals (each individual represents a parameter set), which together form a population. Through interactions between individuals, the population changes as the number of cycles progresses. This approach brings a lot of advantages. First of all, no gradients are necessary. In addition, it is above all the high efficiency and high performance that qualify this kind of optimization method for high-dimensional problems and complex tasks. The disadvantages are the relatively slow convergence and the high number of necessary objective function calls, which make the application unfavorable for time-consuming simulations.

3.1.5. Surrogate modeling-based methods

Surrogate modeling techniques are of particular interest for expensive optimization problems in which the objective function calculation takes a lot of time. Here, the surrogate model uses the available information to approximate the entire objective function topology. Such methods can be used to quickly find global optima. In addition, their suitability for filtering numerical noise, solving multi-objective problems and the possibility of parallelizing the computation are advantageous.


3.1.6. Hybrid methods

Hybrid methods try to combine different optimization methods in order to improve their properties in terms of convergence speed, number of necessary objective function calls and the handling of rough topologies.

3.2. Choice of an optimization method

According to the no free lunch theorem of Wolpert and Macready [20], there is no ultimate optimization method that can solve all optimization problems equally well. Therefore, each optimization problem must be considered separately, and the choice or development of an optimization algorithm should take place in relation to the special requirements of this problem. Hence it is important to get as much a priori information about the optimization problem as possible. For discrete element simulations the following facts can be summarized:

• nonlinear objective function topology
• expensive computation
• gradients cannot be obtained directly
• limitations of the search space in form of parameter boundaries (constrained optimization problem)
• there can be more than one optimum or many local optima
• rough objective function surface due to stochastic effects and numerical noise
• more than one objective function value is necessary for calibration (see [17])
• discontinuities can occur (e.g. bridge building in funnel flow experiments due to high friction values)


Referring to the characteristics above, most of the methods are not suitable for the given calibration problem. Considering the facts of expensive computation and that gradients cannot be obtained directly, gradient-based methods are out of the question. Numerical determination of gradients would dramatically increase the computational effort as well as calibration times. Simplex- and complex-based methods would be a good choice if there were no, or just a few, local optima. On a rough objective function surface these methods would rapidly converge into a local optimum. Furthermore, these methods are not good at handling multi-objective problems. The same facts apply to stochastic algorithms. Population-based methods are good for rough surfaces but often need a lot of objective function calls to identify a global optimum. This contradicts the target of low computational effort. All in all, an algorithm must be capable of identifying a global optimum on an objective function surface with many local optima within a few iterations. Most surrogate modeling-based methods fulfil this requirement. Furthermore, most of these methods can handle rough surfaces as well as parameter limitations and multi-objective problems. There are also concepts for handling discontinuities [21]. In summary, surrogate modeling-based methods seem like a very good choice for the calibration of DEM models.

4. Generalized surrogate modeling-based calibration

In this section, the generalized surrogate modeling-based calibration (GSMC) is presented. There are a lot of optimization methods using surrogate models [22, 23, 24]. Differing terminology often leads to confusion when it comes to the analysis and classification of these methods. Just for the term surrogate model, a multiplicity of synonyms exists, such as response surface model, meta model, approximation model or emulator [25]. Furthermore, most of these algorithms have a similar basic structure and differ only in individual points, such as the type of surrogate model used. Despite all these similarities, most of these algorithms are not compatible with each other. This also means that an exchange of individual components usually results in a completely new algorithm. Strictly speaking, GSMC is an attempt to define a common terminology and classification for the variety of existing optimization methods that use surrogate models. At the same time, this classification forms the basis for a modularized optimization algorithm that can be modified quickly and easily by exchanging certain components. By virtue of different components, almost any existing surrogate modeling-based algorithm can be modelled. In addition, there is the possibility of new component combinations. Hence GSMC is an extension of existing algorithms. Figure 3 shows the basic component structure and process of GSMC. The individual components are briefly explained below.

Figure 3: Component structure and process of generalized surrogate modeling-based calibration (GSMC)

4.1. Initial sampling strategy

The initial sampling strategy is used to generate the first parameter sets at the beginning of every calibration process. The aim is to obtain a good coverage of the entire parameter space with a few sampling points. Every sampling is based on a specific design of experiments (DoE). The difference is that the values of a design are normalized to 1, while sampling points refer to the real parameter limits. Frequently used, and most simple, are randomly distributed designs (RDD). These are generated using Monte Carlo methods. To measure the quality of a sampling, various metrics can be used. Widely accepted for all kinds of design issues are metrics measuring the space-filling properties of the design. Probably the most important representatives are the maximin and minimax criteria [26]. Considering these criteria, RDDs are often not very good and therefore rarely used for calibration. Also popular are full factorial designs (FFD), creating a grid of uniformly distributed points. For many parameters, however, the number of required sampling points is very high. Nevertheless, this method is still state of the art for calibrating discrete element models.

To reduce the number of necessary samples, fractional factorial designs as well as Plackett-Burman designs (PBD) [27] can be used. For constructing response surface models from simulation data, Box-Behnken designs (BBD) are well suited [28]. One disadvantage is that they can only be constructed for 3 parameters or more. Latin hypercube designs (LHD) [29] combine the advantages of FFDs and RDDs. Samples are randomly, yet evenly, distributed across the parameter space. Orthogonal Latin hypercube designs (OLHD) try to reach a better space filling than normal LHDs by dividing the parameter space into equally probable subspaces and ensuring that each subspace is sampled with the same density. Ye [30] shows that OLHDs are very good for fitting second-order response surface models. A limitation of OLHDs is that the number of sampling points increases dramatically as the number of parameters grows. Symmetric Latin hypercube designs (SLHD) remove this limitation [31]. Figure 4 shows examples of different designs for 2 parameters.

Figure 4: Various designs of experiments for 2 parameters
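As a minimal sketch, generating a Latin hypercube design with SciPy's `qmc` module and scoring it with the maximin criterion (the smallest pairwise point distance, to be maximized) could look as follows; the best-of-`n_trials` selection anticipates the design optimization described below.

```python
import numpy as np
from scipy.stats import qmc
from scipy.spatial.distance import pdist

def best_lhd(n_points, n_dims, n_trials=100, seed=0):
    """Create n_trials Latin hypercube designs and keep the one with
    the largest minimum pairwise distance (maximin criterion)."""
    rng = np.random.default_rng(seed)
    best, best_score = None, -np.inf
    for _ in range(n_trials):
        design = qmc.LatinHypercube(d=n_dims, seed=rng).random(n_points)
        score = pdist(design).min()   # smallest point-to-point distance
        if score > best_score:
            best, best_score = design, score
    return best

# scale the normalized design to the real parameter limits,
# e.g. two friction coefficients in [0.1, 0.8]
samples = qmc.scale(best_lhd(9, 2), l_bounds=[0.1, 0.1], u_bounds=[0.8, 0.8])
```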

Many of the methods mentioned use random variables, which means that even designs created by the same method can differ from each other. As a result, the quality of a design can also vary widely. Therefore, every design or sampling should be optimized using the metrics mentioned above. For example, this can be done by creating 100 designs and selecting the one with the best metric value. As shown in Figure 3, such a valuation of initial samplings takes place in GSMC.

4.2. Objective function

The task of the objective function is, as already mentioned, the calculation of the objective function values $y = f(x) = \{y_1, \dots, y_M\}$. For calibration tasks the objective function is subdivided into a result function and a distance metric. If the analytical form of the result function is not known, it is called a black box function (BBF). Every numerical simulation is such a BBF. Among the characteristics of BBFs is that they usually require very long computation times, gradients cannot be obtained directly and the function topology has many local optima. In the present case the result function is represented by a discrete element simulation. The simulation results, which can be both scalar and vectorial quantities (e.g. time series), are then evaluated by a distance metric. The aim is to compare the current results with previously determined reference data. The best-known distance measures for scalar values are the absolute error (1) and the relative error (2).

$$AE = |y_{calc} - y_{meas}| \quad (1)$$

$$RE = \frac{y_{calc} - y_{meas}}{y_{meas}} \quad (2)$$

For the calculation of distance measures between vectors, a variety of approaches and solutions exist in the literature. Very widespread are $L_p$ metrics (3), which represent a generalization of the Euclidean distance measure.

$$d_{L_p} = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{\frac{1}{p}} \quad (3)$$

Examples of other distance measures are the Dissimilarity Metric (DISSIM) [32], Minimal Jump Cost Dissimilarity (MJC) [33] or Dynamic Time Warping (DTW) [34]. For multi-objective optimization problems it is possible to combine all distance values into a single objective function value. A common method for this is the weighted sum $WS$ (4).

$$WS(y) = \sum_{m=1}^{M} (w_m \cdot y_m) \quad \text{with} \quad \sum_{m=1}^{M} w_m = 1 \quad (4)$$
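A small sketch of the scalar distance measures used later in this paper, i.e. the relative error (2) combined into a weighted sum (4); taking the absolute value of the relative error, so that positive and negative deviations cannot cancel, is an assumption of this sketch rather than a statement of the paper.

```python
import numpy as np

def relative_errors(y_sim, y_ref):
    """Relative error (eq. 2) of each simulated objective value
    against its measured reference value, taken as absolute value."""
    y_sim, y_ref = np.asarray(y_sim, float), np.asarray(y_ref, float)
    return np.abs((y_sim - y_ref) / y_ref)

def weighted_sum(errors, weights=None):
    """Weighted sum (eq. 4); the weights must add up to 1.
    Equal weights are used if none are given."""
    errors = np.asarray(errors, float)
    w = np.full(errors.size, 1.0 / errors.size) if weights is None else np.asarray(weights, float)
    assert np.isclose(w.sum(), 1.0)
    return float(np.dot(w, errors))

# e.g. four objectives such as angle of repose, shear angle, mass, mass flow
# (the numbers here are made up for illustration)
ws = weighted_sum(relative_errors([33.0, 29.0, 20.1, 3.9], [31.5, 28.0, 20.5, 4.1]))
```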

Whether the objective function values should be combined strongly depends on whether there exists a global optimum that minimizes all objectives. If this is not the case, the criterion of Pareto efficiency should be considered. Here, specific points are sought that are not dominated by other points. This condition is also called Pareto optimality and means a state in which it is not possible to improve one objective function value without at the same time having to worsen another one. Figure 5 shows an example of the solutions dominated by an arbitrary objective function vector $y$ as well as the solutions dominating $y$. Points or solutions in the white areas are considered neither better nor worse.

Figure 5: Solutions dominated by an arbitrary objective function vector y as well as solutions dominating y (M = 2)
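A dominance check of this kind is straightforward to express in code; the following sketch returns the non-dominated (Pareto-optimal) subset of a set of objective vectors, assuming all objectives are to be minimized.

```python
import numpy as np

def pareto_mask(Y):
    """Boolean mask of the non-dominated rows of Y (one row per solution,
    one column per objective, all objectives minimized). A point is
    dominated if another point is no worse in every objective and
    strictly better in at least one."""
    Y = np.asarray(Y, float)
    mask = np.ones(len(Y), dtype=bool)
    for i, y in enumerate(Y):
        others = np.delete(Y, i, axis=0)
        dominated = np.any(np.all(others <= y, axis=1) &
                           np.any(others < y, axis=1))
        mask[i] = not dominated
    return mask

# Y = np.array([...])  # objective vectors, e.g. relative errors per parameter set
# pareto_set = Y[pareto_mask(Y)]
```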

4.3. Stopping criterion

To prevent the optimization or calibration process from running forever, it is necessary to define one or more suitable stopping criteria. One of the simplest and most popular methods for avoiding infinite loops is termination after a predefined maximum number of calculation cycles, evaluations or amount of CPU time has been reached. This approach, also referred to as an exhaustion-based criterion [35], should always be implemented to ensure the algorithm will stop in any case. Other often-used stopping criteria are the definition of minimum fitness values or convergence detection [36]. Further stopping criteria for multi-objective optimization problems have been proposed in [37]. If no stopping criterion is met, new points will be generated by the adaptive sampling strategy.

4.4. Adaptive sampling strategy

The task of the adaptive sampling strategy is to search for new points which should be evaluated next by the objective function. For this, an optimizer looks for points that maximize an acquisition function, which evaluates the estimates calculated by a surrogate model.

4.4.1. Surrogate model

A surrogate model $\hat{f}$ represents an approximation of the real objective function $f$; it therefore attempts to map the relationship between the inputs of a DEM simulation and the evaluated outputs of the distance function as well as possible. The goal is to provide an estimate $\hat{y} = (\hat{y}_1, \dots, \hat{y}_M)$ for an unknown parameter set $x^*$ with little computational effort. Meanwhile, a variety of surrogate model types exist, which have different properties.

The most popular representatives include linear regression models (LR). The use of such models is often referred to as the response surface method (RSM). Other examples of commonly used surrogate models are radial basis functions (RBF) [38], artificial neural networks (ANN), support vector machines (SVM) [39], multivariate adaptive regression splines (MARS) [40], Kriging (KRG) and Gaussian process regression (GPR). Basically, all surrogate models can be divided into two groups: probabilistic and non-probabilistic models. While the latter only provide an estimated value $\hat{y}$, probabilistic models additionally provide an estimate $\hat{s}$ of the estimation uncertainty. An assignment of the individual model types to the different groups is given in figure 6.

Figure 6: Overview and classification of surrogate models

4.4.2. Acquisition function

The values predicted by the surrogate model are evaluated by an acquisition function $a$ in order to assign a scalar fitness value $z$ to the unknown parameter set $x^*$.

$$z = a(f(x^*)) \quad (5)$$

Again, a distinction can be made between probabilistic and non-probabilistic functions. The simplest non-probabilistic acquisition function for single-objective problems just measures the improvement ($I$) between the currently best objective function value $y_{min}$ and an estimated value $\hat{y} = \hat{f}(x^*)$.

$$I(x^*) = \max\{y_{min} - \hat{f}(x^*), 0\} \quad (6)$$

The lower the number of training data, the greater the uncertainties of the surrogate model. To take this into account, probabilistic acquisition functions are used, which include the approximated estimation error $\hat{s}$ in the calculation. One of the best-known probabilistic acquisition functions is expected improvement ($EI$) [41]. Other representatives for problems with only one objective function value are the upper confidence bound ($UCB$) [42], the probability of improvement ($POI$) [43] and the generalized expected improvement ($GEI$) [44]. Especially at the beginning of an optimization, the predictive inaccuracy of a surrogate model is often still very high, but it improves as the number of training data increases. To account for this, most of these functions have additional weighting parameters that can be used to change the behavior of the function. Depending on these weights, the function either looks for points improving the model or for points representing an optimum. Often these weights are adjusted during optimization. For multi-objective problems the expected improvement of hypervolume ($EIHV$) [45] can be used. This is done by determining the Pareto-optimal solutions and calculating the dominated hypervolume between these solutions and a reference point $y_{ref}$. The coordinates of the reference point should contain the maximum errors for each objective function value. For example, using the relative error $RE$ as distance metric, a reference point $y_{ref} = (1, 1, \dots, 1)$ can be assumed. More acquisition functions can be found in [46]. At this point it should be mentioned that using probabilistic surrogate models in combination with probabilistic acquisition functions is often referred to as Bayesian optimization.
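As a hedged sketch, the expected improvement for a minimization problem and a simple way of maximizing it over random candidate points could look as follows; the surrogate is assumed to expose a scikit-learn-style `predict(X, return_std=True)`, and an evolutionary or population-based optimizer could replace the candidate search (cf. section 4.4.3).

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_min):
    """Expected improvement over the currently best value y_min, given
    the surrogate's predicted means mu and standard deviations sigma."""
    sigma = np.maximum(sigma, 1e-12)          # guard against zero variance
    z = (y_min - mu) / sigma
    return (y_min - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def next_point(surrogate, bounds, y_min, n_candidates=10_000, seed=0):
    """Pick the candidate that maximizes the acquisition function."""
    rng = np.random.default_rng(seed)
    cand = rng.uniform(bounds[:, 0], bounds[:, 1],
                       size=(n_candidates, len(bounds)))
    mu, sigma = surrogate.predict(cand, return_std=True)
    return cand[int(np.argmax(expected_improvement(mu, sigma, y_min)))]
```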

4.4.3. Optimizer

An optimizer is used to find a parameter set $x^*$ that fulfills the following requirement:

$$\max_{x^* \in B} \; a(f(x^*)) \quad (7)$$

For this, robust optimization methods should be used that perform a global search, e.g. population-based methods.

5. Experimental investigations

The experimental setup used for the calibration is the same as described by Roessler [17] and will be briefly explained below. For all experimental investigations a sample of dry coarse gravel classified as 16/8 mm is used. The material is easy to handle and represents a typical cohesionless, free-flowing material which can be simulated using the real particle size distribution (PSD) in reasonable computing time. The moisture content is measured by mass to 0.5%. The measured averaged bulk density is 1478 kg/m³. The wall friction coefficients between the bulk material sample and stainless steel and acrylic plastic were determined with a Jenike wall friction tester to be 0.30 and 0.36, respectively. All measurements were conducted at least five times.

The experiment used for calibration is the draw down test. The widths and heights of the boxes are given in Figure 7; the depths are 100 mm for the upper and 180 mm for the lower box. Both boxes are made of acrylic plastic. The bottom of the upper box is made of steel and has a variably adjustable central rectangular opening, which can be opened suddenly to let the bulk material flow out of the upper box. Furthermore, the test allows the mass flow rate to be measured during the experiment. For this purpose, the upper box is positioned on load cells to measure the change of the total bulk material mass inside the upper box during the discharge.

Figure 7: Principle of the draw down test

In the conducted experiment a total mass of 32 kg of coarse gravel is evenly poured into the upper box, resulting in a filling height of 405 mm. The outlet is set to a width of 100 mm. After opening the outlet the material flows into the bottom box and the mass flow rate is measured. The outflowing bulk material forms a pile in the bottom box, while the remaining bulk material forms two slopes in the upper box. After the discharge the angle of repose β and the shear angle ϕ of the slopes are measured via image processing. The test is repeated three times. All in all, the test provides 4 measurement values that can be used for calibration: the angle of repose β, the shear angle ϕ, the mass in the lower box m and the average mass flow rate ṁ. Table 1 summarizes the experimental results.

Table 1: Measurement values
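Because the average mass flow rate is derived from the load cell signal of the upper box, a short sketch of that post-processing step is given below. The 5-95% window used to cut off the start and end transients is an assumed convention for illustration, not taken from the paper.

```python
import numpy as np

def average_mass_flow(t, m_upper):
    """Average mass flow rate from the load cell signal of the upper box:
    discharged mass divided by discharge time. t in s, m_upper in kg
    (decreasing while the material flows out)."""
    discharged = m_upper[0] - m_upper[-1]
    # evaluate only between 5% and 95% of the discharged mass to
    # exclude the opening transient and the trickling at the end
    lo = m_upper[0] - 0.05 * discharged
    hi = m_upper[0] - 0.95 * discharged
    flowing = (m_upper <= lo) & (m_upper >= hi)
    dm = m_upper[flowing][0] - m_upper[flowing][-1]
    dt = t[flowing][-1] - t[flowing][0]
    return dm / dt
```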


6. Calibration Setup

Since GSMC just provides a generalized and modularized algorithm for surrogate modeling-based calibration, the individual components used for the calibration still need to be defined. This includes choosing a suitable surrogate model, an initial sampling strategy, a stopping criterion and a distance function, and, last but not least, defining the simulation setup as well as the objectives, calibration parameters and associated parameter ranges.

6.1. Structure and procedure of the simulations

For the DEM simulations the open source software LIGGGHTS® in version 3.8.0 is used. In DEM simulations the flow behavior of the particles, including pile formation and mass flow rates, is mainly influenced by the contact parameters particle-on-particle friction µ and rolling friction µr [47]. Since spherical particles are usually used in the DEM for the simulation of real, non-spherical particles, the coefficient of rolling friction must take the shape of the real particles into account [48]. For calibration, only these two parameters are varied. The parameter boundaries are set to 0.1 and 0.8. All other simulation parameters are listed in table 2.

Table 2: DEM simulation parameters

The actual number of particles in the upper box before the discharge starts depends strongly on the mentioned parameters. To ensure a constant initial mass of 32 kg, the density of the particles is therefore adjusted after the filling process.
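Since the total mass of a fixed particle arrangement scales linearly with the particle density, this adjustment reduces to a simple rescaling; a one-line sketch (the function name is hypothetical):

```python
def adjusted_density(rho_init, m_target, m_actual):
    """Rescale the particle density so that the filled assembly reaches
    the target mass (32 kg here), whatever particle count the filling
    produced. Mass is proportional to density for fixed particle volumes."""
    return rho_init * m_target / m_actual
```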

6.2. Choice of surrogate model

As already mentioned, there are several surrogate model types that can be used. In order to check which model performs best in the present case, various representatives were selected and tested. The tested model types are artificial neural networks (ANN), Gaussian process regression (GPR), multivariate adaptive regression splines (MARS) and universal kriging (UK). For testing and benchmarking the different model types, data from a full factorial design with 64 points is used. The aim of a surrogate model should be to provide a good approximation of the entire objective function space with as few training sets as possible. In order to check the approximation accuracy of the models, the number of points used for training the models is gradually increased from 8 to 64. After each training, the estimates of all 64 points are calculated and compared to the real values. As metrics for evaluation, the coefficient of determination $R^2$ (8) and the mean absolute error $MAE$ (9) between estimates and actual values are used. Ideal values would be $R^2 = 1$ and $MAE = 0$.

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \quad (8)$$

$$MAE = \frac{1}{I} \sum_{i=1}^{I} |\hat{y}_i - y_i| \quad (9)$$

The results of the investigations are shown in Figure 8. It can be seen that GPR performs best on both metrics. Therefore, this model type was chosen for the subsequent calibration. How Gaussian process regression works is briefly explained below.

Figure 8: Benchmark of different surrogate models
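The two benchmark metrics are quickly computed by hand (scikit-learn's `r2_score` and `mean_absolute_error` would give the same results); a minimal sketch:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination (eq. 8)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Mean absolute error (eq. 9)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))
```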

Gaussian Process Regression. Gaussian processes (GPs) are stochastic processes in which every finite subset of random variables follows a multivariate Gaussian distribution. Expressed differently, it is assumed that the results $y_m = \{y_{1,m}, \dots, y_{I,m}\}^T$ with $m = 1, \dots, M$ produced by any point set $X = \{x_1, \dots, x_I\}^T$ are subject to a multivariate normal (Gaussian) distribution with mean vector $m$ and covariance matrix $K$ (eq. 10).

$$p(y_m \mid X, \Psi) = \mathcal{N}(y_m \mid m, K) \quad (10)$$

In contrast to a Gaussian distribution, a Gaussian process is specified by a mean function $\mu(x)$ and a covariance function or kernel $k(x, x', \Psi)$ (eq. 11). Most covariance functions need some additional parameters called hyperparameters $\Psi$, which basically control the smoothness of the model.

$$f(x) \sim \mathcal{GP}(\mu(x), k(x, x', \Psi)) \quad (11)$$

For surrogate modeling we use Gaussian process regression, which is very well suited for the regression of nondeterministic systems. For this, a mean function and a kernel must be chosen a priori. Several examples of different covariance functions are given by Williams and Rasmussen [49]. For the following investigations a combination of the squared exponential kernel (12) and the Matérn kernel (13) was used. In both equations, $d$ denotes the Euclidean distance between the points $x$ and $x'$, while $l$ and $\nu$ are hyperparameters; $K_\nu$ is the modified Bessel function of the second kind. Furthermore, the mean function is assumed to be constant.

$$k_{SE}(d, l) = \exp\left(-\frac{d^2}{2l^2}\right) \quad (12)$$

$$k_{Matern}(d, \nu, l) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\sqrt{2\nu}\, d}{l}\right)^{\nu} K_\nu\!\left(\frac{\sqrt{2\nu}\, d}{l}\right) \quad (13)$$

Since the mean function and the kernel are chosen a priori, the hyperparameters must be determined by computing the posterior function $\hat{f}$ from a given dataset $D = \{(x_i, y_i)\}$. The hyperparameter $\nu$ is excluded here, since it was set to $3/2$ a priori. The exact procedure for determining the posterior function as well as the hyperparameters is omitted here; for further reading see [49]. Once a posterior has been determined, the function value $\hat{y} = \hat{f}(x^*)$ for an unknown point $x^*$ can be estimated. Furthermore, the model offers the possibility to calculate the standard deviation $\hat{\sigma}(x^*)$, which can be used to evaluate the prediction accuracy. Figure 9 shows an example of a simple 1D Gaussian process regression.


Figure 9: Example of Gaussian process regression. The solid line describes the mean function of the approximate posterior process model and the gray area covers 2 standard deviations. The red points are the given dataset that was used for training.
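A brief sketch of such a model with scikit-learn is shown below. The paper states that a squared exponential and a Matérn kernel (ν = 3/2) were combined but not how, so the product used here is an assumption; `normalize_y=True` centers the targets, matching the constant-mean assumption. The training data are made up for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

# squared exponential (RBF) kernel combined with a Matern(nu=3/2) kernel;
# the product is one plausible combination, chosen here for illustration
kernel = RBF(length_scale=1.0) * Matern(length_scale=1.0, nu=1.5)

gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                               n_restarts_optimizer=10, random_state=0)

X_train = np.array([[0.2, 0.3], [0.5, 0.6], [0.7, 0.2]])  # made-up parameter sets
y_train = np.array([0.12, 0.05, 0.20])                    # made-up error values
gpr.fit(X_train, y_train)   # length scales fitted by maximum likelihood

# mean estimate and standard deviation at an unknown parameter set
mu, sigma = gpr.predict(np.array([[0.4, 0.5]]), return_std=True)
```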

6.3. Components and settings for the calibration algorithm

As the initial sampling strategy, a symmetric Latin hypercube sampling with 9 points is used, which is optimized using the maximin criterion. Distance measurement is done by calculating the weighted sum of the relative errors of all objectives. To search for new points to be evaluated next by the objective function, a combined acquisition function is used: first, we look for a point that maximizes the expected improvement, then for a point that maximizes the probability of improvement. Thus, two new points are determined per iteration, which can then be evaluated in parallel. A calibration sequence is stopped when a maximum number of 21 evaluations has been reached, which is equivalent to a reduction of the computational effort by 67.2% compared to the full factorial design used in [17]. With the first 9 points evaluated in parallel and 2 new points per iteration, the maximum number of iterations will not exceed 7 cycles.
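Put together, the configured calibration sequence could be sketched as follows. The sketch reuses the hypothetical helpers from the earlier listings (`best_lhd`, `relative_errors`, `weighted_sum`, `next_point` and the `gpr` model) and simplifies the combined EI/POI acquisition to a single new point per iteration; `run_dem` stands in for one draw down simulation returning the four objectives.

```python
import numpy as np
from scipy.stats import qmc

def gsmc_calibrate(run_dem, y_ref, bounds, max_evals=21):
    """GSMC as configured in this section: 9-point Latin hypercube start,
    GPR surrogate, improvement-based adaptive sampling, hard stop after
    21 objective function evaluations (exhaustion-based criterion)."""
    bounds = np.asarray(bounds, float)
    design = best_lhd(9, len(bounds))                        # initial sampling
    X = list(qmc.scale(design, bounds[:, 0], bounds[:, 1]))
    Y = [weighted_sum(relative_errors(run_dem(x), y_ref)) for x in X]

    while len(Y) < max_evals:
        gpr.fit(np.array(X), np.array(Y))                    # retrain surrogate
        x_new = next_point(gpr, bounds, min(Y))              # adaptive sampling
        X.append(x_new)
        Y.append(weighted_sum(relative_errors(run_dem(x_new), y_ref)))

    best = int(np.argmin(Y))
    return np.asarray(X[best]), Y[best]

# e.g. bounds = [[0.1, 0.8], [0.1, 0.8]] for sliding and rolling friction
```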

7. Results and discussion

Table 3 shows the results of 5 independent calibration runs. It can be seen that each time a different parameter set has been identified as the optimum.

Table 3: Calibration results - the best found parameter sets of 5 independent calibration runs as well as the corresponding objective function values

To check whether the results are correct, Figure 10 shows them in the parameter space. Using the data from the FFD, the weighted sum of the relative errors was calculated and is displayed as a contour plot. The white area in the upper right corner indicates invalid simulation runs. This was due to the high coefficients of friction causing bridging and preventing material flow. It can be seen that all identified parameter sets are approximately at µ = 0.2 and µr = 0.8. This result agrees with the optimum already identified by Roessler [17].

Figure 10: Presentation of calibration results in the parameter space. The weighted sum of relative errors is color coded and represents the objective function surface.

To find out whether the results are within the systemic stochastic fluctuation range, the results of 5 different FFDs with 64 parameter sets each have been examined. In each case, the standard deviation from the common mean value was calculated for each parameter set and objective function value as well as for the weighted sum of errors. The variation of the standard deviations, shown in Figure 11, represents a measure of the stochastic fluctuation range of the DEM simulations. It can be seen that the median of the standard deviation σ of the weighted sum WS is approximately 1.5%. This in turn means that a calibration result more accurate than 1.5% is hardly possible. Thus, all solutions are within the permissible range, which covers the stochastic deviations. This proves that the developed algorithm reliably and efficiently provides optimal calibration results.

Figure 11: Standard deviation of relative errors and weighted sum of errors

8. Conclusions

In the presented paper, the problem of developing and testing a highly efficient optimization-based calibration procedure for DEM models is discussed. After comparing the pros and cons of various optimization methods, surrogate modeling-based methods appear to be the most appropriate for DEM calibration. The limitation to surrogate modeling-based methods does not yet clarify which type of surrogate model is best for the present case and how, for example, initial points should be generated. To facilitate the search for a suitable algorithm, a novel method called generalized surrogate modeling-based calibration (GSMC) is presented.


This method attempts to provide a generalized and modularized flowchart for surrogate modeling-based calibration. The modular design allows certain components to be replaced quickly and easily. Thus, for example, new surrogate models or initial sampling strategies can be quickly checked and compared against each other. For validating and benchmarking the new algorithm, the behavior of gravel as a non-cohesive bulk material has been examined. The calibration experiment used was the draw down test, and the calibration parameters were the particle sliding and rolling friction. Examining various surrogate models has shown that Gaussian process regression maps the objective function surface best. The investigations have shown that in all cases the algorithm led to a parameter set which lies in the optimal range, defined by the stochastic fluctuation range of the DEM simulations.

Further research should focus on the following contents. The present work has shown that the calibration of two parameter values leads to good results. In a next step, it should be checked whether the developed algorithm is also suitable for tasks with more parameter values. Another aim should be a benchmark of the proposed algorithm against other optimization methods. A meaningful comparison with the methods proposed in other publications is not possible at the moment. This is due to the fact that the calibration tests as well as the objectives, parameters and parameter ranges differ from the calibration setups used by other authors. In order to obtain an exact comparison, all algorithms must be tested under the same conditions.

9. Acknowledgement

This work was supported by the DFG - Deutsche Forschungsgemeinschaft [grant number: KU1074/16-1, 2017].


10. References

[1] P. A. Cundall, O. D. L. Strack, A discrete numerical model for granular assemblies, Géotechnique 29 (1) (1979) 47–65. doi:10.1680/geot.1979.29.1.47.

[2] H. Zhu, Z. Zhou, R. Yang, A. Yu, Discrete particle simulation of particulate systems: Theoretical developments, Chemical Engineering Science 62 (13) (2007) 3378–3396. doi:10.1016/j.ces.2006.12.089.

[3] A. Katterfeld, C. Wensrich, Understanding granular media: from fundamentals and simulations to industrial application, Granular Matter 19 (2017). doi:10.1007/s10035-017-0765-y.

[4] G. Hess, C. Richter, A. Katterfeld, et al., Simulation of the dynamic interaction between bulk material and heavy equipment: Calibration and validation, in: 12th International Conference on Bulk Materials Storage, Handling and Transportation (ICBMH 2016), Engineers Australia, 2016, p. 427.

[5] M. Obermayr, C. Vrettos, Anwendung der Diskrete-Elemente-Methode zur Vorhersage von Kräften bei der Bodenbearbeitung, geotechnik 36 (4) (2013) 231–242. doi:10.1002/gete.201300009.

[6] A. Grima, P. Wypych, Discrete element simulation of a conveyor impact-plate transfer: Calibration, validation and scale-up, Australian Bulk Handling Review (2010) 64–72.

[7] E. Tijskens, H. Ramon, J. De Baerdemaeker, Discrete element modelling for process simulation in agriculture, Journal of Sound and Vibration 266 (3) (2003) 493–514. doi:10.1016/S0022-460X(03)00581-9.

[8] J. Zhao, J. Xiao, M. L. Lee, Y. Ma, Discrete element modeling of a mining-induced rock slide, SpringerPlus 5 (1) (2016) 1633. doi:10.1186/s40064-016-3305-z.

[9] M. Kodam, J. Curtis, B. Hancock, C. Wassgren, Discrete element method modeling of bi-convex pharmaceutical tablets: Contact detection algorithms and validation, Chemical Engineering Science 69 (1) (2012) 587–601. doi:10.1016/j.ces.2011.11.011.

[10] C. Coetzee, Review: Calibration of the discrete element method, Powder Technology 310 (2017) 104–142. doi:10.1016/j.powtec.2017.01.015.

[11] J. Favier, D. Curry, R. LaRoche, Calibration of DEM material models to approximate bulk particle characteristics, in: Proceedings of the 6th World Congress on Particle Technology, Nuremberg, Germany, 2010.

[12] M. W. Johnstone, Calibration of DEM models for granular materials using bulk physical tests (2010).

[13] H. Q. Do, A. M. Aragón, D. L. Schott, A calibration framework for discrete element model parameters using genetic algorithms, Advanced Powder Technology 29 (6) (2018) 1393–1403. doi:10.1016/j.apt.2018.03.001.

[14] J. Yoon, Application of experimental design and optimization to PFC model calibration in uniaxial compression simulation, International Journal of Rock Mechanics and Mining Sciences 44 (6) (2007) 871–889. doi:10.1016/j.ijrmms.2007.01.004.

[15] L. Benvenuti, C. Kloss, S. Pirker, Identification of DEM simulation parameters by artificial neural networks and bulk experiments, Powder Technology 291 (2016) 456–465. doi:10.1016/j.powtec.2016.01.003.

[16] M. Rackl, K. J. Hanley, A methodical calibration procedure for discrete element models, Powder Technology 307 (2017) 73–83. doi:10.1016/j.powtec.2016.11.048.

[17] T. Roessler, C. Richter, A. Katterfeld, F. Will, Development of a standard calibration procedure for the DEM parameters of cohesionless bulk materials – Part I: Solving the problem of ambiguous parameter combinations, Powder Technology (2018). doi:10.1016/j.powtec.2018.11.034.

[18] A. Törn, A. Žilinskas, Global Optimization, Springer-Verlag, New York, 1989.

[19] J. Meier, Parameterbestimmung mittels inverser Verfahren für geotechnische Problemstellungen, Univ.-Verlag, 2009.

[20] D. H. Wolpert, W. G. Macready, No free lunch theorems for optimization, IEEE Transactions on Evolutionary Computation 1 (1) (1997) 67–82.

[21] C. Boursier Niutta, E. J. Wehrle, F. Duddeck, G. Belingardi, Surrogate modeling in design optimization of structures with discontinuous responses, Structural and Multidisciplinary Optimization 57 (5) (2018) 1857–1869. doi:10.1007/s00158-018-1958-7.

[22] F. A. C. Viana, R. T. Haftka, L. T. Watson, Efficient global optimization algorithm assisted by multiple surrogate techniques, Journal of Global Optimization 56 (2) (2013) 669–689. doi:10.1007/s10898-012-9892-5.

[23] C. Wang, Q. Duan, W. Gong, A. Ye, Z. Di, C. Miao, An evaluation of adaptive surrogate modeling based optimization with two benchmark problems, Environmental Modelling & Software 60 (2014) 167–179. doi:10.1016/j.envsoft.2014.05.026.

[24] W. Gong, Q. Duan, J. Li, C. Wang, Z. Di, A. Ye, C. Miao, Y. Dai, Multiobjective adaptive surrogate modeling-based optimization for parameter estimation of large, complex geophysical models, Water Resources Research 52 (3) (2016) 1984–2008. doi:10.1002/2015WR018230.

[25] Z.-H. Han, K.-S. Zhang, Surrogate-based optimization, in: Real-World Applications of Genetic Algorithms, InTech, 2012.

[26] M. Johnson, L. Moore, D. Ylvisaker, Minimax and maximin distance designs, Journal of Statistical Planning and Inference 26 (1990) 131–148.

[27] R. L. Plackett, J. P. Burman, The design of optimum multifactorial experiments, Biometrika 33 (4) (1946) 305–325. doi:10.1093/biomet/33.4.305.

[28] A. I. Khuri, S. Mukhopadhyay, Response surface methodology, Wiley Interdisciplinary Reviews: Computational Statistics 2 (2) (2010) 128–149. doi:10.1002/wics.73.

[29] M. D. McKay, R. J. Beckman, W. J. Conover, A comparison of three methods for selecting values of input variables in the analysis of output from a computer code, Technometrics 21 (2) (1979) 239–245.

[30] K. Q. Ye, Orthogonal column Latin hypercubes and their application in computer experiments, Journal of the American Statistical Association 93 (444) (1998) 1430–1439.

[31] W. Li, K. Ye, Optimal symmetric Latin hypercube designs (1998).

[32] E. Frentzos, K. Gratsias, Y. Theodoridis, Index-based most similar trajectory search, in: 2007 IEEE 23rd International Conference on Data Engineering (ICDE), IEEE, 2007, pp. 816–825.

[33] J. Serrà, J. L. Arcos, A competitive measure to assess the similarity between two time series, in: International Conference on Case-Based Reasoning, Springer, 2012, pp. 414–427.

[34] Dynamic time warping, in: Information Retrieval for Music and Motion, Springer, Berlin, Heidelberg, 2007, pp. 69–84. doi:10.1007/978-3-540-74048-3_4.

[35] K. Zielinski, R. Laur, Stopping criteria for differential evolution in constrained single-objective optimization, in: Advances in Differential Evolution, Springer, 2008, pp. 111–138.

[36] G. Haeser, V. V. de Melo, Convergence detection for optimization algorithms: Approximate-KKT stopping criterion when Lagrange multipliers are not available, Operations Research Letters 43 (5) (2015) 484–488. doi:10.1016/j.orl.2015.06.009.

[37] I. Voutchkov, A. Keane, Multiobjective optimization using surrogates (2006). URL: https://eprints.soton.ac.uk/37984/

[38] M. J. Orr, et al., Introduction to radial basis function networks (1996).

[39] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20 (3) (1995) 273–297.

[40] J. H. Friedman, Multivariate adaptive regression splines, The Annals of Statistics 19 (1) (1991) 1–67.

[41] D. R. Jones, M. Schonlau, W. J. Welch, Efficient global optimization of expensive black-box functions, Journal of Global Optimization 13 (4) (1998) 455–492. doi:10.1023/A:1008306431147.

[42] N. Srinivas, A. Krause, S. M. Kakade, M. W. Seeger, Gaussian process bandits without regret: An experimental design approach, CoRR abs/0912.3995 (2009). arXiv:0912.3995.

[43] J. Snoek, H. Larochelle, R. P. Adams, Practical Bayesian optimization of machine learning algorithms, in: Advances in Neural Information Processing Systems, 2012, pp. 2951–2959.

[44] M. Schonlau, W. J. Welch, D. R. Jones, Global versus local search in constrained optimization of computer models, Vol. 34 of Lecture Notes–Monograph Series, Institute of Mathematical Statistics, Hayward, CA, 1998, pp. 11–25. doi:10.1214/lnms/1215456182.

[45] M. Emmerich, Single- and multi-objective evolutionary design optimization assisted by Gaussian random field metamodels, Dissertation, LS11, FB Informatik, Universität Dortmund, Germany (2005).

[46] T. Wagner, M. Emmerich, A. Deutz, W. Ponweiser, On expected-improvement criteria for model-based multi-objective optimization, in: R. Schaefer, C. Cotta, J. Kolodziej, G. Rudolph (Eds.), Parallel Problem Solving from Nature, PPSN XI, Springer, Berlin, Heidelberg, 2010, pp. 718–727.

[47] Z. Yan, S. K. Wilkinson, E. H. Stitt, M. Marigo, Discrete element modelling (DEM) input parameters: understanding their impact on model predictions using statistical analysis, Computational Particle Mechanics 2 (3) (2015) 283–299. doi:10.1007/s40571-015-0056-5.

[48] C. Wensrich, A. Katterfeld, Rolling friction as a technique for modelling particle shape in DEM, Powder Technology 217 (2012) 409–417.

[49] C. K. Williams, C. E. Rasmussen, Gaussian Processes for Machine Learning, MIT Press, 2006.


Highlights

• Overview of various optimization methods and evaluation of their suitability for the calibration of DEM contact models
• Presentation of a new calibration procedure called "generalized surrogate modeling-based calibration" (GSMC)
• Benchmarking and comparison of various surrogate models
• Test of the new procedure for the calibration of cohesionless bulk material