Optimal experimental design of ill-posed problems: The METER approach





Computers and Chemical Engineering 32 (2008) 115–124

André Bardow a,b,*

a Institute of Polymers, Department of Materials, ETH Zurich, 8093 Zurich, Switzerland
b Lehrstuhl für Prozesstechnik, RWTH Aachen University, 52056 Aachen, Germany

* Correspondence address: Lehrstuhl für Prozesstechnik, RWTH Aachen University, 52056 Aachen, Germany. Tel.: +49 241 809 48 55; fax: +49 241 809 23 26. E-mail address: [email protected].

Received 30 October 2006; received in revised form 12 April 2007; accepted 6 May 2007 Available online 16 May 2007

Abstract

With modern measurement techniques it has become possible to extract unknown functional dependencies directly from the data. The underlying inverse problems, however, are much more demanding than standard parameter estimation. Still, systematic strategies for the experimental design of these generally ill-posed problems are missing. An optimal experimental design criterion for ill-posed problems is proposed in this work. It is based on the minimization of the expected total error (abbreviated as METER) between the true and the estimated function. Thereby, the sound integration of the bias–variance trade-off critical to the solution of ill-posed problems is achieved. While standard optimal experimental design criteria are shown to give even qualitatively wrong predictions, METER designs prove to be robust and sound. The approach is introduced for linear problems and exemplified in case studies from reaction kinetics and particle sizing by light scattering.

Keywords: Experimental design; Inverse problem; Parameter estimation; Reaction kinetics; Numerical differentiation; Robustness; Light scattering

1. Introduction

In model-based experimentation, the goal is often to extract an unknown functional relationship from the data. Standard examples are, e.g., reaction rates or phase equilibria as functions of the state variables. The usual approach to this identification problem is to reduce the problem complexity by introducing a two-step procedure (e.g. Asprey & Macchietto, 2000):

• first, a model structure is specified or determined from a set of model candidates in a model discrimination step;
• then, the unknown parameter values contained in the model are determined to sufficient precision in a parameter estimation step.

As is well known, e.g. in the context of process design (Yee, Grossmann, & Kravanja, 1990), such a two-step divide-and-conquer strategy may lead to inferior solutions as later


steps depend on the choices in the first step. In the context of model identification, a poor selection of candidate models might exclude the possibility of finding the relationship actually searched for and thus lead to failure of the experimental endeavor. It is therefore desirable to avoid the separation of the problem into two parts and to determine the unknown function directly (Beck & Woodbury, 1998). With the advent of high-resolution measurement techniques, modern process information management systems and advanced mathematical methods, this direct route is now becoming increasingly feasible (Marquardt, 2005).

Still, the identification of unknown functions represents an infinite-dimensional inverse problem. In addition, these problems are generally ill-posed, i.e. the solution is not unique or does not depend continuously on the data (Engl, Hanke, & Neubauer, 1996). The solution of ill-posed problems for function estimation therefore poses much higher requirements on the data than standard parameter estimation problems, where a finite number of parameters are determined within a known model structure.

Despite the increased complexity of the estimation problem, the systematic generation of optimal experimental conditions for ill-posed problems has received little attention. For



pure parameter estimation, model-based optimal experimental design theory – pioneered by Wald (1943) and Box and Lucas (1959) – is now well established. For ill-posed problems, the design approaches available are generally direct extensions of these classical design methods (Bitterlich & Knabner, 2003; Ito & Kunisch, 1995; Liu, 2001; Sylte et al., 2002). Since they are set in the maximum likelihood framework, they assume unbiased estimates. However, in the solution of ill-posed problems, bias is systematically introduced to stabilize the problem. The trade-off between variance and bias is actually the key element (Engl et al., 1996). A sound approach to optimal experimental design for ill-posed problems therefore has to incorporate this trade-off. However, none of the approaches currently available includes the bias effect. While these approaches might provide reasonable designs in some cases, in general they lead to experimental schemes that are even qualitatively wrong, as shown below.

While previous approaches to optimal experimental design for ill-posed problems started from the design theory of Box and Lucas (1959), the present work rather takes the perspective of Box and Draper (1959, 1963). These authors identified the bias–variance problem in the design context for response surface models. Hence, only restricted function classes such as polynomial or spline models are regarded (Draper, Guttman, & Lipow, 1977). Usually, the design is chosen to guard against the impact of higher order terms. Due to the focus on local approximations for this problem class, Box and Draper (1959) analyzed the integrated mean squared error between estimated and true function over the region of interest. The integrated mean squared error consists of a bias and a variance contribution. In practice, usually only the bias contribution is minimized, giving so-called all-bias designs.

In this work, the idea of using the error between estimated and true function as design measure is extended to general ill-posed problems. A criterion for the optimal experimental design of ill-posed problems is proposed based on the Minimum Expected Total ERror (METER) between the true and the estimated function. The total error in particular includes both error contributions: bias and variance. Thereby, the sound integration of the bias–variance trade-off critical to the solution of ill-posed problems is achieved.

In Section 2, ill-posed problems and regularization solution techniques are briefly introduced to prepare the derivation of the METER design criterion in Section 3. In order to limit the discussion to the essence of the method, the presentation focuses on linear problems. Linear ill-posed problems occur in many practical applications ranging from the estimation of relaxation time spectra in polymers (Honerkamp & Weese, 1989) to inverse heat conduction problems (Lüttich, Mhamdi, & Marquardt, 2005). An illustrative numerical example drawn from the study of reaction kinetics is presented in Section 4. In Section 5, the major assumptions underlying the approach are discussed and tailored formulations for different levels of a priori knowledge are developed. The METER approach is applied to particle sizing by light scattering in Section 6. Finally, conclusions are given in Section 7.

2. Ill-posed problems and regularization

2.1. Direct problem

In experiments, the quantity of interest is often not directly accessible but has to be inferred from measurement data in a so-called inverse problem. Linear ill-posed problems are often obtained in the form of integral equations (Engl et al., 1996):

$$g(t) = \int_T K(t, s; d)\, f(s)\, ds, \qquad (1)$$

where K denotes the kernel function acting on the unknown function f to be recovered from the measurements g. The domain of integration is denoted by T. Data g̃(t_i) are usually available only at discrete points t_i and corrupted by measurement errors (assumed here to be additive, i.e. g̃ = g + ε, Gaussian with zero mean and variance σ²). The kernel function K is usually known and describes the experimental setup. It contains design parameters d that can be chosen by the experimenter. It is the goal of experimental design to find the optimal settings for these parameters.

2.2. Tikhonov regularization

In many cases, the direct problem f → g has a smoothing property. Therefore, direct inversion of Eq. (1) leads to unstable solutions, since measurement noise is strongly amplified, rendering the solution meaningless. Thus, the inverse problem in Eq. (1) is generally ill-posed. Regularization methods have to be employed for its solution. The most common approach is Tikhonov regularization, where the estimate f̂ for the unknown function f is determined as (Engl et al., 1996):

$$\hat{f} = \arg\min_f\; \frac{1}{\sigma^2} \sum_{i=1}^{n} \left( \tilde{g}(t_i) - \int_T K(t_i, s; d)\, f(s)\, ds \right)^2 + \lambda \|L f\|^2. \qquad (2)$$

Here, the first term is the data error and corresponds to the common least-squares type objective of parameter estimation. For multivariate measurements, extensions from nonlinear parameter estimation can directly be employed (Bard, 1974). The second term adds a penalty to ensure smoothness of the solution. For the operator L, the identity or the second derivative are frequently used. The regularization parameter λ gives the relative weight to both contributions of the objective. In the case of several unknown functions, further penalty terms may be added. Due to their additive nature, the introduction of additional terms is straightforward. This work therefore focuses on the base case of a single measurement quantity and a single unknown function.

A regularization method thus modifies the original estimation problem to be solved. By imposing the smoothness penalty, the noise amplification is contained such that stable solutions are obtained. While the variance in the solution is thereby reduced to a meaningful level, it comes at the price of not solving the original problem. The computed estimate f̂ will therefore generally be biased. The choice of the regularization parameter λ is crucial as it sets the balance between bias and variance. Various regularization parameter choice methods are discussed in the literature (see, e.g., Engl et al., 1996; Hansen, 1998).
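For the discretized problem, the minimizer of Eq. (2) is available in closed form, with the noise weighting absorbed into λ (consistent with Eq. (5) below). A minimal sketch (Python with numpy is assumed; the kernel matrix, penalty operator and data vector are placeholders to be supplied by the application):

```python
import numpy as np

def tikhonov_estimate(K, g_tilde, lam, L=None):
    """Closed-form Tikhonov estimate for the discretized problem,
    f_hat = (K^T K + lam * L^T L)^{-1} K^T g_tilde, cf. Eq. (2).

    K       : (n, m) discretized kernel matrix
    g_tilde : (n,) noisy measurement vector
    lam     : regularization parameter lambda >= 0
    L       : (p, m) penalty operator; the identity is used if None
    """
    m = K.shape[1]
    if L is None:
        L = np.eye(m)  # identity penalty, one of the common choices for L
    return np.linalg.solve(K.T @ K + lam * (L.T @ L), K.T @ g_tilde)
```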



2.3. Working cycle

Model-based experimentation is typically performed in an iterative working cycle built of at least three core elements: experimental design, the experiment itself and data analysis (Fig. 1) (cf. Asprey & Macchietto, 2000; Bard, 1974; Marquardt, 2005). The experimental design yields the settings d for the experiment. Based on the experimental data g̃, an estimate f̂ is computed in the data analysis step. This estimate may then be used to refine the experimental design. This working cycle is iterated until a termination criterion, e.g. on the accuracy, is fulfilled.

Fig. 1. Iterative working cycle of model-based experimentation.

While a sound formulation for the data analysis problem has been presented in Eq. (2), the question of experimental design for ill-posed problems is approached in the next section.

3. Minimum expected total error criterion for experimental design

In optimal experimental design, the free settings d of the experimental setup are chosen such that the maximal information on the unknown function can be gathered. The gathered data should enable efficient determination of a robust estimate of the function f. Thus, a successful experiment would give an estimate f̂ that is as close as possible to the true solution f. A useful measure for the quality of an experimental design is therefore the statistically expected value of the total error between the estimate f̂ and the true function f. It is therefore proposed here to obtain the optimal experimental design by minimizing the expected total error with respect to the design variables d. The minimum expected total error (METER) design for ill-posed problems is thus computed from

$$(d^*, \lambda^*) = \arg\min_{d,\lambda}\; E\big(\|f - \hat{f}(d, \lambda; f)\|^2\big). \qquad (3)$$

For Tikhonov regularization (2), the expected total error can be computed in the linear case as (Honerkamp & Weese, 1990; Weese, 1992):

$$E\big(\|f - \hat{f}(d, \lambda; f)\|^2\big) = \underbrace{\|f - K^+(\lambda) K f\|^2}_{\text{bias}} + \underbrace{\sigma^2\, \mathrm{trace}\!\left[K^+(\lambda)\, (K^+(\lambda))^T\right]}_{\text{variance}}, \qquad (4)$$

where

$$K^+(\lambda) = (K^T K + \lambda L^T L)^{-1} K^T. \qquad (5)$$

The first term of the METER criterion for Tikhonov regularization in Eq. (4) reflects the bias introduced by the penalty term, whereas the second term quantifies the variance in the estimate. In this way, the bias–variance trade-off is properly incorporated into the METER design criterion. It should be noted that this structure of the expected total error is rather general in function estimation (Geman, Bienenstock, & Doursat, 1992).

The approach assumes that an initial guess for the true solution f and for the measurement noise level σ² are available. While an initial assumption on the true solution is required, the current estimate f̂_k may be used as the guess for the true solution f in any subsequent iteration (k + 1) of the working cycle of model-based experimentation (Fig. 1). The regularization parameter λ integrates naturally as an additional free variable of the design optimization problem (3) and is determined along with the experimental settings. The METER criterion thus provides a consistent rule to determine λ, where previous approaches had to rely on a priori knowledge (Ito & Kunisch, 1995; Liu, 2001). The implications of these properties are discussed in detail in light of a numerical example in the following sections. In particular, formulations requiring less a priori knowledge are derived in Section 5.
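Given a discretized kernel K(d), a penalty operator L and a guess for f, the objective of Eqs. (4) and (5) can be evaluated directly. A minimal sketch (numpy assumed; the function and variable names are illustrative, not from the paper):

```python
import numpy as np

def meter_objective(K, L, lam, f_guess, sigma):
    """Expected total error of Eq. (4) for Tikhonov regularization.

    Returns the total error together with its bias and variance parts,
    so that both contributions can be analyzed separately.
    """
    # Regularized inverse K+(lambda) = (K^T K + lam L^T L)^{-1} K^T, Eq. (5)
    K_plus = np.linalg.solve(K.T @ K + lam * (L.T @ L), K.T)
    bias = float(np.sum((f_guess - K_plus @ (K @ f_guess)) ** 2))
    variance = sigma**2 * float(np.trace(K_plus @ K_plus.T))
    return bias + variance, bias, variance
```

A METER design then results from minimizing the first return value over the design variables d (entering through K) and λ, e.g. by a grid scan or a standard optimizer.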

4. Illustrative example: reaction rate estimation

4.1. Problem formulation

The specific merits of the new approach are discussed in the light of an example. For this purpose, the determination of reaction rates as functions of time from measured concentration data is considered (Bardow & Marquardt, 2004a). The core of the underlying mathematical problem is the differentiation of experimental data. This by itself is a standard problem in chemical engineering beyond the area of reaction kinetics, since often not the measured quantity itself but its derivative is of interest. In particular, such model-based rate measurements typically form the starting point for the identification of constitutive equations (Bardow, Göke, Koß, Lucas, & Marquardt, 2005; Bardow & Marquardt, 2004b; Brendel, Bonvin, & Marquardt, 2006; Kristensen, Madsen, & Jorgensen, 2004; Mahoney, Doyle, & Ramkrishna, 2002; Marquardt, 2005; Michalik et al., 2007; Timmer, Rust, Horbelt, & Voss, 2000). Thus, while the chosen example is conceptually simple, it is already of immediate practical interest.



The inverse problem of numerical differentiation can be written in the form of a Fredholm integral equation (1) (Liu, 2001):

$$g(t) = \int_0^{t_{\max}} K(t, s)\, f(s)\, ds, \quad 0 \le t \le t_{\max}, \qquad (6)$$

where

$$K(t, s) = \begin{cases} 1 & \text{for } t \ge s, \\ 0 & \text{otherwise}. \end{cases} \qquad (7)$$

Discretization of Eq. (6) is usually less critical than in the solution of direct problems, since the discretization error is typically much smaller than the regularization and data errors (Weese, 1993). While more advanced schemes for numerical differentiation are known (e.g. Abramovich & Silverman, 1998; Hanke & Scherzer, 2001), the finite difference scheme is still often employed in practice to determine the unknown derivative f = dg/dt from the measurements g̃(t_i) and serves here as an example. Equidistant measurements with sampling interval dt = t_i − t_{i−1} = const. are assumed. The discretized kernel K is then a lower triangular matrix with all entries identical to dt (Liu, 2001):

$$g = K \cdot f, \quad \text{where} \quad K = \mathrm{d}t \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 1 & \cdots & \cdots & 1 \end{pmatrix}. \qquad (8)$$

In an experiment, the sampling interval dt can be chosen by the experimenter; it thus serves as the design parameter. It is well known that if the sampling is too coarse, the approximation will be poor. However, in the inverse problem, too fine a sampling can also lead to an amplification of the error, since the measurement noise will corrupt the result and the variance increases (Engl et al., 1996; Hanke & Scherzer, 2001).

The METER design criterion (3) is now applied to determine the optimal sampling interval for finite differences. No additional regularization parameter λ is required, as the sampling interval itself has a regularizing effect. For the sound incorporation of the bias effect, the first term in the objective (4) is computed using the piecewise constant finite difference estimate and the true solution on a finer grid. Thus, the METER criterion (4) is approximated by

$$E\big(\|f - \hat{f}(d, \lambda; f)\|^2\big) = \mathrm{d}t_{\mathrm{fine}} \sum_{i=1}^{n_{\mathrm{fine}}} \big(f(t_i) - \hat{f}(t_i)\big)^2 + \mathrm{d}t\, \sigma^2\, \mathrm{trace}\!\left[(K^T K)^{-1}\right], \qquad (9)$$

where f̂(t_i) is the piecewise constant approximation to f on the estimation grid defined by dt. The finer interpolation grid uses dt_fine = 10⁻³ here. In the example, a first-order reaction, leading to an exponential decay, serves as the true function, i.e. f = exp(−t) with 0 ≤ t ≤ 10. The measurement standard deviation is σ = 0.01.
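The resulting one-dimensional design problem can be solved by a simple scan over candidate sampling intervals. A minimal numerical sketch of the evaluation of Eq. (9) for this setting (numpy assumed; the candidate grid and the clipping of grid tails are illustrative simplifications):

```python
import numpy as np

T, sigma, dt_fine = 10.0, 0.01, 1e-3      # example setting of Section 4.1
f = lambda t: np.exp(-t)                  # true reaction rate
g = lambda t: 1.0 - np.exp(-t)            # noise-free data, g(t) = int_0^t f(s) ds
t_fine = np.arange(dt_fine, T, dt_fine)   # fine interpolation grid

def meter_fd(dt):
    """Approximate METER objective of Eq. (9) for sampling interval dt."""
    n = int(T / dt)
    t = dt * np.arange(1, n + 1)
    # Expected (noise-free) finite-difference estimate on the coarse grid,
    # held piecewise constant on the fine grid (bias term of Eq. (9)).
    f_hat = (g(t) - g(t - dt)) / dt
    idx = np.minimum((t_fine / dt).astype(int), n - 1)  # clip grid tails
    bias = dt_fine * np.sum((f(t_fine) - f_hat[idx]) ** 2)
    # Variance term with the kernel matrix K of Eq. (8).
    K = dt * np.tril(np.ones((n, n)))
    variance = dt * sigma**2 * np.trace(np.linalg.inv(K.T @ K))
    return bias + variance

dts = np.linspace(0.05, 1.0, 96)
dt_opt = dts[np.argmin([meter_fd(dt) for dt in dts])]
print(f"METER-optimal sampling interval: dt = {dt_opt:.2f}")
```

A scan of this kind underlies the objective curve discussed next; the optimum reported in the paper for this setting is dt = 0.48.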

4.2. Optimal METER design

The optimal sampling interval dt for finite difference estimation of the reaction rate is determined by the METER design problem (3) as dt = 0.48. In order to gain a deeper understanding of the proposed design approach, the METER objective (4) has been evaluated for 0 ≤ dt ≤ 1 (Fig. 2). The expected bias–variance trade-off is indeed found in the METER objective: the variance contribution dominates for small sampling intervals, while bias is dominant for large sampling intervals.

Fig. 2. METER design objective as a function of the sampling interval dt. Bias (dashed) and variance (dotted) contributions to the objective are also shown (left axis). The E-optimal design criterion (thin full line) is shown on the right axis.

Criteria proposed previously for the design of ill-posed problems would have suggested a very different sampling policy, since they solely focus on the variance contribution (Bitterlich & Knabner, 2003; Ito & Kunisch, 1995; Liu, 2001; Sylte et al., 2002). In these methods, the so-called Fisher information matrix is usually introduced as a variance measure. The Fisher matrix corresponds here to the inverse of the term inside the trace in Eq. (4). As an example of these design criteria, the E-optimal experimental design criterion used, e.g., by Ito and Kunisch (1995), Liu (2001), and Bitterlich and Knabner (2003) is plotted on the right axis in Fig. 2. In E-optimal design, the smallest eigenvalue of the Fisher information matrix is maximized (Walter & Pronzato, 1990).

It can be seen that the classical design criteria suggest the use of the maximum sampling time. From the perspective of E-optimal design the result is meaningful, as measurement noise will have the least effect on the finite difference estimate if the points used in the computation are farthest apart. Variance will thus be minimal. However, the estimate will be meaningless, as a constant function will be recovered in any case. The computation thus demonstrates that classical design criteria are not able to reflect the specific nature of ill-posed problems.

In order to assess the quantitative accuracy of the METER criterion, a simulation study was performed. Simulated measurement data was corrupted with random noise and the finite difference scheme was applied to this data. The basis for the METER criterion, i.e. the deviation of the estimate from the true signal, was then evaluated and averaged over 10,000 replications. In addition, the measurement error standard deviation was varied in the range σ = 10⁻³, …, 10⁻¹. The total error obtained from the simulation data is compared to the error predictions used in the METER criterion in Fig. 3.

Fig. 3. Estimation error between true and estimated function as a function of sampling time dt. Left: predicted estimation error as used in the METER criterion. Right: estimation error averaged over 10,000 cases based on simulated measurement data.



It can be seen that the METER criterion indeed captures the actual behavior found in the experiment. The error level corresponds well to the predicted value. The small deviations are due to the approximate handling of the bias term by interpolation in the present example. In summary, it can be concluded that the METER criterion is able to find the best sampling time with good accuracy.

Fig. 3 further shows that the measurement variance σ² indeed serves only as a weighting factor between the two error contributions: the bias contribution dominating for large dt stays constant, while the variance contribution increases according to the multiplicative variance factor (cf. Eq. (9)). In the present case, increasing measurement errors shift the optimal sampling time to larger values, which is also intuitively the correct behavior.

The example of numerical differentiation studied here is well suited to demonstrate the specific properties of the METER approach. However, it is also special, since the design variable, the sampling time dt, serves at the same time as an implicit regularization parameter. The success of the approach therefore already shows that the new method is also able to initialize a regularization parameter. This step was missing in previous approaches (Ito & Kunisch, 1995; Liu, 2001).
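The simulation study can be mimicked in outline as follows; a sketch (numpy assumed; the replication count is reduced for speed, and the error is evaluated on the coarse grid rather than on the fine interpolation grid, a simplification of the procedure above):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def simulated_total_error(dt, sigma, n_rep=1000, T=10.0):
    """Monte Carlo estimate of the total error for the finite-difference
    scheme applied to noisy data of the example f = exp(-t)."""
    n = int(T / dt)
    t = dt * np.arange(1, n + 1)
    g_clean = 1.0 - np.exp(-t)   # noise-free measurements
    f_true = np.exp(-t)
    total = 0.0
    for _ in range(n_rep):
        g_noisy = g_clean + sigma * rng.standard_normal(n)
        f_hat = np.diff(g_noisy, prepend=0.0) / dt   # finite differences
        total += dt * np.sum((f_true - f_hat) ** 2)
    return total / n_rep
```

Averaging this quantity over a grid of dt values and noise levels yields curves of the type shown in Fig. 3 (right).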

5. Discussion

5.1. Dependency on initial guess

The crucial assumption in using the METER criterion for optimal experimental design is the knowledge of the function f, i.e. the function actually sought in the experiment. One may thus wonder about the practicality of the approach. However, it should be noted that this is the standard dilemma in experimental design theory. For nonlinear problems, it cannot be avoided even in the simpler case of design for parameter estimation (Box & Lucas, 1959). An iterative experiment cycle is thus usually required to find the desired solution (Fig. 1). This strategy may also be applied to the METER approach. Still, even in the initial stage of an analysis, the METER criterion can be adapted to the level of a priori knowledge available, as outlined in the following sections.

In order to assess the impact of a wrong initial guess, the previous scenario is slightly modified. While the true unknown model is still represented by f = exp(−t), a linear and a sine function are used as initial guesses in the METER design procedure. The true model and the two wrong initial guesses are shown in Fig. 4.

Fig. 4. True exponential model f = exp(−t) and the wrong functions assumed as initial guesses in the METER design study.

The METER design criterion evaluated using these wrong initial guesses is compared to the result obtained with the correct initial guess in Fig. 5. In the cases with misspecification, the general shape of the METER objective, in particular the bias–variance trade-off, is still retained. The shape of the objective is largely determined by the mathematical problem (here: numerical differentiation) and depends less on the specific function studied. The optimal time step predicted from both wrong functions turns out to be slightly smaller. Using this smaller time step, the average deviation would be expected to increase by a factor of 4. In summary, while there is a loss associated with using a wrong initial guess, the qualitative picture is still captured correctly and the quantitative results are quite robust.

Fig. 5. METER design objective for different guesses of the function f.

5.2. Knowledge on function class: robust METER designs

The previous case study shows that a misspecification of the initial guess still leads to reasonable METER designs. However,




if the experimenter is aware of the uncertainty of his initial guess on the unknown function f, robust design formulations could be employed. Even though the experimenter might not know the desired solution exactly, he still often has at least some qualitative knowledge about the general class of functions the solution should belong to (e.g. exponential decay for reaction rates, peak-shaped functions for spectra, polynomials for transport coefficients). The METER criterion may then be used in simulation studies to explore the influence of the design variables for the expected function class. This may already give important insight into the proper design and deepen the problem understanding.

For more quantitative design rules, robust design formulations (Asprey & Macchietto, 2002; Pronzato & Walter, 1985, 1988; Walter & Pronzato, 1987) that take uncertainty in the initial guess explicitly into account can be adapted. Two robust design formulations are prevalent in the literature (Walter & Pronzato, 1990): design in the min–max sense and in the average sense. The first method, the min–max (or worst-case) approach, aims at optimizing the worst possible performance for any function f in a specified class F, i.e.

$$(d^*_{\mathrm{mm}}, \lambda^*_{\mathrm{mm}}) = \arg\min_{d,\lambda}\; \max_{f \in F}\; E\big(\|f - \hat{f}(d, \lambda; f)\|^2\big). \qquad (10)$$

The min–max design approach (10) chooses the best design if the function f turns out to be the worst candidate within the specified range F. The second method, the average design, tries to optimize the expected value of the design objective, i.e. in the case of the METER design criterion:

$$(d^*_{\mathrm{av}}, \lambda^*_{\mathrm{av}}) = \arg\min_{d,\lambda}\; E_f\big[E\big(\|f - \hat{f}(d, \lambda; f)\|^2\big)\big] \qquad (11)$$
$$= \arg\min_{d,\lambda}\; \int P(f)\, E\big(\|f - \hat{f}(d, \lambda; f)\|^2\big)\, \mathrm{d}f. \qquad (12)$$

The worst-case method (10) relies on the specification of an admissible function class F. The average design approach (11) requires knowledge of the probability distribution P(f) for the function f to compute the expectation E_f (Asprey & Macchietto, 2002).
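For a finite set of candidate functions, both robust criteria reduce to elementary operations on the corresponding METER objective values. A minimal sketch (the callable design_objective, evaluating the expected total error of Eq. (4) for the current design, is an assumed interface, not from the paper):

```python
import numpy as np

def robust_criteria(design_objective, candidates, weights=None):
    """Evaluate the min-max (Eq. (10)) and average (Eqs. (11)-(12)) robust
    METER criteria at one design point over a finite candidate set.

    design_objective : callable mapping a candidate f to E(||f - f_hat||^2)
    candidates       : list of candidate functions from the class F
    weights          : probabilities P(f); uniform if None
    """
    errors = np.array([design_objective(f) for f in candidates])
    if weights is None:
        weights = np.full(len(candidates), 1.0 / len(candidates))
    worst_case = float(errors.max())     # inner maximization of Eq. (10)
    average = float(weights @ errors)    # quadrature approximation of Eq. (12)
    return worst_case, average
```

The outer minimization over (d, λ) then proceeds on worst_case or average; for the example that follows, the candidates would be exponentials exp(−kt) with k sampled from the interval considered there.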

Fig. 6. Robust min–max METER design objective for the function f = exp(−kt) with different rate constants k.

For the min–max method, a solution algorithm consisting of the iterative solution of the two optimization problems is presented in Asprey and Macchietto (2002). The average design can be computed by multidimensional quadrature (Asprey & Macchietto, 2002). For response surface model design theory (Box & Draper, 1959, 1963), average designs have recently been introduced by Kobilinsky and co-workers as reported in Goos, Kobilinsky, O'Brien, and Vandebroek (2005).

The robust METER design formulations, Eqs. (10) and (11), are now illustrated for the problem of reaction rate estimation by numerical differentiation introduced in Section 4.1. It is assumed that the experimenter expects the unknown function to be of exponential form, i.e. f = exp(−kt). The uncertainty considered is in the rate constant k. As a simple example, the unknown function is expected here to be within the following set:

$$f = \exp(-kt) \quad \text{with} \quad k \in [0.1, 10]. \qquad (13)$$

The true solution is given by k = 1 as before. For the study of the min–max design, it can be observed that, within the parametrized function set F given in Eq. (13), the METER objective is monotonically increasing with the rate constant k (Fig. 6). The worst choice for f is thus the one with the largest k-value. This behavior could be anticipated, as in this case there is less time to collect data before the reactants are gone. More complex dependencies are, however, to be expected in general. The best time step dt computed in the min–max design ensures an upper bound on the estimation error even for the worst choice of f. However, it may lead to suboptimal solutions for other cases. This is the cost of robustness, which is particularly severe in min–max designs as the worst case might occur only very infrequently (Asprey & Macchietto, 2002).

For the average design approach (11), the rate constant k is assumed here to be uniformly distributed in [0.1, 10]. Thus, the probability P(f) is obtained by parameterizing f and assuming a probability for the parameters contained. The average design objective (11) is shown as a function of the sampling time dt in Fig. 7. For comparison, the nominal case with k = 1 as well as the min–max design objective are also plotted. The average



design also leads to smaller time steps while being less extreme than the min–max case. The selection of smaller time steps is induced by the choice of the asymmetric uncertainty interval around k = 1 and the uniform distribution for k. Since large values k > 1 are more likely, the average design carries the properties of the designs for larger k.

Fig. 7. Robust average METER design objective for the function f = exp(−kt) in comparison to the nominal solution (k = 1) and the min–max design.

The robust formulations of the METER design approach introduced in this section allow for the sound incorporation of uncertainty in the initial guess. These methods are expected to be of even more importance for nonlinear problems in order to also capture the effect of local linearization.

5.3. No a priori knowledge: constrained METER designs

This section deals with the case when there is really no reasonable assumption on the unknown function f available. An ad hoc solution to the design problem could simply neglect the first term of the METER criterion (4) (f = 0). In this case, only the expected variance of the estimate f̂ would be minimized. The criterion then corresponds to a direct extension of the well-known A-optimal design criterion (Walter & Pronzato, 1990) to ill-posed problems. Such a design is therefore expected to provide at least a reasonable initial experiment.

As shown above, minimum variance designs may however yield very unfavorable settings. It therefore still seems beneficial to refine the A-optimal type solution to the design problem discussed in the previous paragraph by inclusion of the bias effect. Such a refinement is possible in the linear case studied here, since the impact of the design variables on the bias can be bounded even without assuming any a priori knowledge on the functional form of f. After discretization, the bias contribution is given by (cf. Eq. (4)):

$$\|(I - K^+(\lambda) K)\, f\| \le \|I - K^+(\lambda) K\|\, \|f\|, \qquad (14)$$

where I is the identity matrix. The right-hand side of the inequality follows from the submultiplicative property of the matrix norm.

Under the physically sound assumption that the true solution itself is bounded, an analysis of the bias term can be based on the first matrix norm of the right-hand side of Eq. (14). In this way, an upper bound for the bias can be analyzed, e.g. first in simulation studies. This corresponds to standard design theory, where a lower bound for the variance from the Cramér–Rao theorem is used (Walter & Pronzato, 1990). Quantitative design predictions can be obtained by imposing a constraint on the bias term. Designs with constraints have been discussed by Montepiedra and Fedorov (1997). In this way, constrained METER designs are computed from the following optimization problem:

$$\min_d\; \mathrm{trace}\!\left[K^+(\lambda)\, (K^+(\lambda))^T\right] \qquad (15)$$
$$\text{s.t.}\; \left\|(K^T K + \lambda L^T L)^{-1}\, \lambda L^T L\right\| \le M. \qquad (16)$$

Constrained METER designs do not require any assumption on the unknown function f. The variance of the estimate is minimized while limited bias is ensured. Note that the trace in the objective function could be replaced by any other design criterion known from standard optimal experimental design theory (cf. Walter & Pronzato, 1990).

Constrained METER designs require two parameters to be specified: the regularization parameter λ as well as the bias bound M. With respect to the regularization parameter, constrained METER designs have to resort to a priori knowledge or to strategies discussed, e.g., by Ito and Kunisch (1995), where regularization is only gradually introduced until stable computations are possible. In the reaction rate estimation example studied above, this issue does not occur at all, as the design variable dt itself serves as the regularization parameter. For the selection of the bias bound M, the following strategy is proposed: a minimum bias design is computed first by minimizing the matrix norm of Eq. (16). Thereby, a lower bound M_min on the bias is achieved. The bias bound M is then taken as a multiple of this lower bound M_min.

While the actual computations do not apply due to the lack of an additional regularization parameter, the strategy can still be nicely visualized for the numerical differentiation problem (Fig. 8). In this case, the objective function (15) favors the maximal allowable time step such that the bias constraint (16) becomes active and determines the sampling step.

Fig. 8. Objective function (15) and constraint (16) values of the constrained METER design as a function of sampling time dt.

This trade-off is expected for design variables that have a regularizing property. In other cases, both error contributions, bias and variance, may also show qualitatively similar dependencies on the design variable. As can be seen from Eqs. (15) and (16), both error contributions are expected to be small if the matrix KᵀK is large. In these cases, the bias constraint is inactive and the usual minimum variance designs are determined from the constrained METER design problem (15).
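A constrained METER design of this type can be computed by a feasibility-filtered scan over candidate designs. A sketch under stated assumptions (numpy; K_of_d, mapping a design setting to its discretized kernel matrix, is a problem-specific placeholder; the induced 1-norm is used for the bias bound, one reading of the "first matrix norm" mentioned above):

```python
import numpy as np

def constrained_meter_design(designs, K_of_d, L, lam, factor=2.0):
    """Constrained METER design, Eqs. (15)-(16): minimize the variance trace
    over candidate designs subject to a bound on the bias norm, with the
    bound M chosen as a multiple of the minimum achievable norm M_min."""
    def bias_norm(K):
        # ||(K^T K + lam L^T L)^{-1} lam L^T L||_1, cf. Eq. (16)
        A = K.T @ K + lam * (L.T @ L)
        return np.linalg.norm(np.linalg.solve(A, lam * (L.T @ L)), ord=1)

    def variance_trace(K):
        K_plus = np.linalg.solve(K.T @ K + lam * (L.T @ L), K.T)
        return float(np.trace(K_plus @ K_plus.T))

    kernels = {d: K_of_d(d) for d in designs}
    M = factor * min(bias_norm(K) for K in kernels.values())  # minimum bias design first
    feasible = [d for d in designs if bias_norm(kernels[d]) <= M]
    return min(feasible, key=lambda d: variance_trace(kernels[d]))
```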



5.4. Work process for optimal METER design

The METER design problem (3) for ill-posed problems relies on an initial guess for the function f actually to be determined in the experiment. The discussion in the previous sections introduced formulations tailored to the level of a priori knowledge.

A work process for the optimal experimental design of ill-posed problems that reflects the level of a priori knowledge is proposed in Fig. 9. This work process refines the experimental design step in Fig. 1. If a reasonable initial guess is available, the standard METER design criterion (3) should be used. In the case of only general knowledge about the expected function class, min–max (Eq. (10)) and average METER designs (Eq. (11)) give robust design predictions. Finally, constrained METER designs (Eqs. (15) and (16)) allow studying problems without any reasonable assumption on the unknown function by analyzing a bound for the bias term.

Fig. 9. Work process for optimal METER designs of ill-posed problems.

6. Application to particle sizing

The METER criterion is further examined by application to an inverse scattering problem. A typical ill-posed problem in engineering and science is the determination of the particle size distribution by light scattering and extinction. The technique is applied, e.g., in aerosol science, multiphase flow and chemistry, but also in astronomy (van de Hulst, 1957). The problem formulation is taken here from Liu (2001). From Mie theory, the particle sizing problem can be formulated as follows:

$$g(s) = (Kf)(s) = \int_0^\infty \pi r^2\, k(sr)\, N(r)\, \mathrm{d}r = \int_0^\infty k(sr)\, f(r)\, \mathrm{d}r, \qquad (17)$$

where N(r) denotes the number of droplets with radius r, g(s) the attenuation factor of light with frequency s, and k is the Mie extinction factor. The latter can be given for non-absorbing spheres with refractive index near one as follows (Liu, 2001):

$$k(x) = 2 - 4\,\frac{\sin(x)}{x} + 4\,\frac{1 - \cos(x)}{x^2}. \qquad (18)$$
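The kernel of Eqs. (17) and (18) is straightforward to discretize. A sketch (numpy; the rectangle rule and the radius grid resolution are illustrative choices, not taken from the paper):

```python
import numpy as np

def mie_extinction(x):
    """Mie extinction factor of Eq. (18) for non-absorbing spheres with
    refractive index near one (requires x > 0)."""
    return 2.0 - 4.0 * np.sin(x) / x + 4.0 * (1.0 - np.cos(x)) / x**2

def kernel_matrix(s0, M=51, r_min=1.0, r_max=10.0, n_r=200):
    """Discretized kernel K of Eq. (17) acting on f(r) = pi r^2 N(r),
    for the sampling frequencies of Eq. (19) below with offset s0 > 0."""
    s = s0 + 2.0 * np.arange(M) / (M - 1)                 # Eq. (19)
    r, dr = np.linspace(r_min, r_max, n_r, retstep=True)  # radius grid
    return mie_extinction(np.outer(s, r)) * dr            # K[i, j] = k(s_i r_j) dr
```

Combined with the tikhonov_estimate and meter_objective sketches above, a scan over s0 and λ yields a design study of the kind reported in this section.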

The droplet number N(r) is assumed as a single Gaussian distribution with mean N̄ = 5.5 and variance σ_N² = 1.5². The goal is to recover the function N(r) in the interval [1, 10]. The design problem studied here is the selection of the measurement frequencies s. It is assumed that an interval of length 2 can be sampled. Thus, measurements are available at

$$s_i = s_0 + \frac{2(i-1)}{M-1}, \quad i = 1, \ldots, M, \qquad (19)$$

with a resolution of M = 51 assumed here. The design question is then where to place the sampling interval, i.e. how to choose s0. The measurement variance is chosen as σ² = 1. A typical measurement set is shown in Fig. 10.

Fig. 10. Measurement data set for particle sizing in comparison to noise-free data.

Tikhonov regularization (2) is used to solve the data analysis problem. The METER criterion (3) is applied to determine the optimal sampling interval specified by s0. The METER criterion is plotted for different s0-values in Fig. 11. The optimal sampling interval offset found is s0 = 1.6. In addition, the METER criterion yields an optimized value for the regularization parameter of λ = 0.13, for which the results are shown.

Fig. 11. METER design objective for the particle sizing problem as a function of the sampling interval offset s0. Bias (dotted) and variance (dashed) contributions to the objective are also shown.

Due to the nontrivial kernel function (18), the shape of the METER criterion is much more complex than in the illustrative example. It can be seen that both bias and variance contributions are of importance throughout the design space. While the general trend of both contributions is similar, they differ in their local features.

In particular, the global minimum is determined through the bias contribution.

In order to validate the design prediction, the estimation problem has been solved using simulated data corrupted by adding Gaussian noise for 10,000 replications. The estimation error found from these simulated experiments is compared to the METER prediction in Fig. 12.

Fig. 12. Total estimation error between true and estimated particle size number averaged over 10,000 cases based on simulated measurement data. The dashed lines indicate the standard deviations in the error estimate.

From comparison with Fig. 11, it can be seen that the METER criterion prediction indeed corresponds quantitatively to the behavior found in the simulated experiments. The maximum deviation is 1.3%, which would not be resolvable from the plot. In particular, the location of the optimal sampling offset is accurately predicted. The METER design thus leads on average to the best particle size number estimate.

7. Conclusions

An optimal design criterion for ill-posed problems is introduced in this work. It is based on the recognition that the common strategy of variance minimization in experimental design is not suitable for ill-posed problems. The bias–variance trade-off, well known in the solution of ill-posed problems, should also be reflected in the design procedure. This is achieved by the minimum expected total error (METER) design criterion (3) introduced in this work.

The METER criterion has been validated and analyzed in case studies from reaction kinetics and light scattering. In particular, the main assumption, knowledge of the unknown and desired function f, has been discussed in detail. Several variants of the METER approach have been developed for different scenarios of a priori knowledge. They have been summarized in a practical work procedure.

It is also an important feature of the METER criterion (3) that it identifies the individual contributions of bias and variance. A deeper problem understanding can be gained by a separate analysis of the dependence of bias and variance on the design variables.

In summary, the minimum expected total error (METER) design criterion seems to provide the first sound framework for the experimental design of ill-posed problems. As discussed above, the method even yields design guidelines with minimal a priori knowledge. This property underlines the practical utility of the approach.

Acknowledgments

The author thanks Olaf Kahrs, Ernesto Kriesten and Claas Michalik (RWTH Aachen) for helpful comments and gratefully acknowledges financial support by the Deutsche Forschungsgemeinschaft (DFG) through grant BA 2884/1-2 and within the Collaborative Research Center (SFB) 540 "Model-based Experimental Analysis of Kinetic Phenomena in Fluid Multi-phase Reactive Systems".

References

Abramovich, F., & Silverman, B. (1998). Wavelet decomposition approaches to statistical inverse problems. Biometrika, 85(1), 115–129.
Asprey, S. P., & Macchietto, S. (2000). Statistical tools for optimal dynamic model building. Computers & Chemical Engineering, 24(2–7), 1261–1267.
Asprey, S. P., & Macchietto, S. (2002). Designing robust optimal dynamic experiments. Journal of Process Control, 12, 545–556.
Bard, Y. (1974). Nonlinear parameter estimation. New York: Academic Press.
Bardow, A., Göke, V., Koß, H. J., Lucas, K., & Marquardt, W. (2005). Concentration-dependent diffusion coefficients from a single experiment using model-based Raman spectroscopy. Fluid Phase Equilibria, 228, 357–366.
Bardow, A., & Marquardt, W. (2004a). Identification of diffusive transport by means of an incremental approach. Computers & Chemical Engineering, 28(5), 585–595.
Bardow, A., & Marquardt, W. (2004b). Incremental and simultaneous identification of reaction kinetics: Methods and comparison. Chemical Engineering Science, 59(13), 2673–2684.
Beck, J. V., & Woodbury, K. A. (1998). Inverse problems and parameter estimation: Integration of measurements and analysis. Measurement Science & Technology, 9(6), 839–847.
Bitterlich, S., & Knabner, P. (2003). Experimental design for outflow experiments based on a multi-level identification method for material laws. Inverse Problems, 19(5), 1011–1030.



Box, G. E. P., & Draper, N. R. (1959). A basis for the selection of a response-surface design. Journal of the American Statistical Association, 54(287), 622–654.
Box, G. E. P., & Draper, N. R. (1963). Choice of a second order rotatable design. Biometrika, 50(3/4), 335–352.
Box, G. E. P., & Lucas, H. L. (1959). Design of experiments in non-linear situations. Biometrika, 46(1/2), 77–90.
Brendel, M., Bonvin, D., & Marquardt, W. (2006). Incremental identification of kinetic models for homogeneous reaction systems. Chemical Engineering Science, 61(16), 5404–5420.
Draper, N. R., Guttman, I., & Lipow, P. (1977). All-bias designs for spline functions joined at axes. Journal of the American Statistical Association, 72(358), 424–429.
Engl, H. W., Hanke, M., & Neubauer, A. (1996). Regularization of inverse problems. Dordrecht: Kluwer.
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4, 1–58.
Goos, P., Kobilinsky, A., O'Brien, T. E., & Vandebroek, M. (2005). Model-robust and model-sensitive designs. Computational Statistics & Data Analysis, 49, 201–216.
Hanke, M., & Scherzer, O. (2001). Inverse problems light: Numerical differentiation. American Mathematical Monthly, 108, 512–521.
Hansen, P. C. (1998). Rank-deficient and discrete ill-posed problems. Philadelphia: SIAM.
Honerkamp, J., & Weese, J. (1989). Determination of the relaxation spectrum by a regularization method. Macromolecules, 22, 4372–4377.
Honerkamp, J., & Weese, J. (1990). Tikhonov regularization method for ill-posed problems: A comparison of different methods for the determination of the regularization parameter. Continuum Mechanics and Thermodynamics, 2, 17–30.
Ito, K., & Kunisch, K. (1995). Maximizing robustness in nonlinear ill-posed inverse problems. SIAM Journal on Control and Optimization, 33(2), 643–666.
Kristensen, N. R., Madsen, H., & Jorgensen, S. B. (2004). A method for systematic improvement of stochastic grey-box models. Computers & Chemical Engineering, 28(8), 1431–1449.
Liu, J. (2001). Optimal experimental designs for linear inverse problems. Inverse Problems in Engineering, 9, 287–314.
Lüttich, T., Mhamdi, A., & Marquardt, W. (2005). Design, formulation, and solution of multidimensional inverse heat conduction problems. Numerical Heat Transfer Part B: Fundamentals, 47(2), 111–133.

Mahoney, A. W., Doyle, F., & Ramkrishna, D. (2002). Inverse problems in population balances: Growth and nucleation from dynamic data. AIChE Journal, 48(5), 981–990.
Marquardt, W. (2005). Model-based experimental analysis of kinetic phenomena in multi-phase reactive systems. Chemical Engineering Research & Design, 83(A6), 561–573.
Michalik, C., Schmidt, T., Zavrel, M., Ansorge-Schumacher, M., Spiess, A., & Marquardt, W. (2007). Application of the incremental identification method to the formate dehydrogenase. Chemical Engineering Science, 62, 5592–5597.
Montepiedra, G., & Fedorov, V. V. (1997). Minimum bias designs with constraints. Journal of Statistical Planning and Inference, 63(1), 97–111.
Pronzato, L., & Walter, E. (1985). Robust experiment design via stochastic approximation. Mathematical Biosciences, 75, 103–120.
Pronzato, L., & Walter, E. (1988). Robust experiment design via maximin optimization. Mathematical Biosciences, 89, 161–176.
Sylte, A., Ebeltoft, E., Grimstad, A. A., Kulkarni, R., Nordtvedt, J. E., & Watson, A. T. (2002). Design of two-phase displacement experiments. Inverse Problems in Engineering, 10(1), 65–84.
Timmer, J., Rust, H., Horbelt, W., & Voss, H. U. (2000). Parametric, nonparametric and parametric modelling of a chaotic circuit time series. Physics Letters A, 274, 123–134.
van de Hulst, H. C. (1957). Light scattering by small particles. New York: Wiley.
Wald, A. (1943). On the efficient design of statistical investigations. Annals of Mathematical Statistics, 14, 134–140.
Walter, E., & Pronzato, L. (1987). Robust experiment design: Between qualitative and quantitative identifiabilities. In E. Walter (Ed.), Identifiability of parametric models (pp. 104–113). Oxford: Pergamon Press.
Walter, E., & Pronzato, L. (1990). Qualitative and quantitative experiment design for phenomenological models: A survey. Automatica, 26(2), 195–213.
Weese, J. (1992). A reliable and fast method for the solution of Fredholm integral equations of the first kind based on Tikhonov regularization. Computer Physics Communications, 69(1), 99–111.
Weese, J. (1993). A regularization method for nonlinear ill-posed problems. Computer Physics Communications, 77(3), 429–440.
Yee, T. F., Grossmann, I. E., & Kravanja, Z. (1990). Simultaneous optimization models for heat integration. 1. Area and energy targeting and modeling of multi-stream exchangers. Computers & Chemical Engineering, 14(10), 1151–1164.