On the connection and equivalence of two methods for solving an ill-posed inverse problem based on FRAP data

On the connection and equivalence of two methods for solving an ill-posed inverse problem based on FRAP data

Accepted Manuscript On the connection and equivalence of two methods for solving an ill-posed inverse problem based on FRAP data ˇ Pap´acˇ ek C. Maton...

643KB Sizes 1 Downloads 12 Views

Accepted Manuscript On the connection and equivalence of two methods for solving an ill-posed inverse problem based on FRAP data ˇ Pap´acˇ ek C. Matonoha, S. PII: DOI: Reference:

S0377-0427(15)00327-1 http://dx.doi.org/10.1016/j.cam.2015.05.028 CAM 10196

To appear in:

Journal of Computational and Applied Mathematics

Received date: 15 October 2014 Please cite this article as: C. Matonoha, . Pap´acˇ ek, On the connection and equivalence of two methods for solving an ill-posed inverse problem based on FRAP data, Journal of Computational and Applied Mathematics (2015), http://dx.doi.org/10.1016/j.cam.2015.05.028 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Click here to view linked References

On the connection and equivalence of two methods for solving an ill-posed inverse problem based on FRAP data ˇ Pap´ C. Matonoha1 , S. aˇcek2 1 Institute

of Computer Science, Academy of Sciences of the Czech Republic, Pod Vod´ arenskou vˇ eˇz´ı 2, 182 07 Prague 8, Czech Republic, 2 University of South Bohemia in Cesk´ ˇ e Budˇ ejovice, FFPW, CENAKVA, Institute of Complex Systems, 373 33 Nov´ e Hrady, Czech Republic

Abstract Two methods for solving an ill-posed inverse problem based on Fickian diffusion equation and spatio-temporal data from FRAP measurements are presented. The most usual method is the Tikhonov regularization. Nevertheless, in our specific problem we have detected difficulties residing in determination of the optimal regularization parameter α. Hence, an equivalent method based on least squares with a quadratic constraint regularization is proposed. This latter approach naturally takes into account the noise level in the data and corresponds to the Morozov’s discrepancy principle as well. The equivalence of both methods is rigorously proven and on a simple numerical example with synthetic input data practically documented. Keywords: inverse problem, parameter identification, Tikhonov regularization, optimization, least squares with a quadratic constraint, L-curve, FRAP

1. Introduction In the last decades, the field of image processing has certainly been one of the fastest growing areas in informatics and applied mathematics. Many new applications, e.g. in biology and medicine, rise up every day. Nevertheless, there is a gap between the sophisticated equipment for the data acquisition and the quality of further data processing. Our main goal concerns the elaboration of reliable methods for the data processing part. More specifically, we deal with the inverse problem of model parameters identification using the spatio-temporal images acquired by the so-called FRAP method, cf. [1]. The FRAP (Fluorescence Recovery After Photobleaching) method is based on the measuring the change in fluorescence intensity in a region of interest, e.g. in a finite 2D domain representing the part of a membrane, in response to an external stimulus, a high-intensity laser pulse provided by the confocal

Preprint submitted to Journal of Computational and Applied Mathematics October 15, 2014

laser scanning microscopy (CLSM). The stimulus, also called bleaching, causes an irreversible loss in fluorescence in the bleached area without any damage to intracellular structures. After the bleach, the observed recovery in fluorescence reflects the mobility (related to diffusion) of fluorescent compounds from the area outside the bleach. CLSM is a technique allowing to obtain highresolution optical images with deep selectivity, rejecting the information coming from the out-of-focus planes. However, the small energy level emitted by the fluorophore and the amplification performed by the photon detector introduces a non-negligible measurement noise. In our previous work, cf. [2], we presented the problem formulation taking into account that the diffusion coefficient D is space independent (isotropic), but it can vary in time. The key idea behind the two coupled formulations of the inverse problem, i.e. a single parameter identification on one hand and the identification of the sequence {Dj }, where j is the index of time level, on the other hand, resides in fact that both the biological reality and the numerical process are under control. In particular, the time dependent diffusion coefficient can discover either an irregularity in the measurement process or a new dynamic phenomenon. Moreover, the whole experimental protocol consists of the sequence (time series) of data (levels of signal on generally rectangular 2D domain) naturally inducing M separated problem formulations between (j −1)th and j th time level, where j ∈ {1, . . . , M }.1 Last, but definitively not least, based on the sensitivity analysis, we can formulate and solve the problem of data reduction, i.e. we can select the irrelevant data which are almost not reducing the confidence interval of the estimated parameters, cf. [3]. Let us underline that due to the noisy data, the parameter identification problem is highly unstable, i.e. ill-posed in Hadamard’s sense, cf. [4]. Thus, in order to get reliable results for the sequence {Dj }, an adequate numerical technique, e.g. Tikhonov regularization [5, 6], is mandatory. In this paper, we present an alternative method to the Tikhonov regularization. We show how both methods are connected and prove their equivalence. 2. Inverse Problem Formulation In order to present our idea as clear as possible, we disregard the possibility of the fluorescent particles to react2 , i.e. there is no reaction term. Assuming (i) local homogeneity, i.e. the concentration profile of fluorescent particles is smooth, and (ii) isotropy, i.e. the diffusion coefficient D is space-invariant, the following Fickian diffusion equation on two-dimensional domain Ω ⊂ R2 represents the governing equation describing the unbleached particle concentration y(r, t) ∈ R: ∂y − ∇ · (D∇y) = 0 , (1) ∂t 1 The

0th time level corresponds to the first post-bleach measurement. the so-called bleaching during scanning (the time dependent attenuation of the fluorescent signal) is taking into account. 2 Neither

2

in (t0 , T +t0 )×Ω, where r ∈ R2 is a spatial coordinate in physical units, t0 is the initial time, and the time interval between the initial and the last measurements is T . Moreover, if we assume (iii) the possibility of one-dimensional simplification, we get y as a function of dimensionless quantities: spatial coordinate x := Lr (L is a characteristic length), time τ := Tt , and p := D LT2 (re-scaled diffusion coefficient): ∂y ∂2y −p 2 =0 , ∂τ ∂x

(2)

in (t0 , T + t0 ) × [0, 1], with the initial and Dirichlet boundary conditions3 y(x, τ0 ) = f (x), y(0, τ ) = g0 (τ ),

x ∈ [0, 1],

y(1, τ ) = g1 (τ ),

(3) τ ≥ τ0 .

(4)

Having both the process model (with all model parameters identified) and the data determining the initial condition and boundary conditions, the time evolution of the dependent variable y can be simulated in the above formulated initial boundary value forward problem (2)–(4). The inverse problem resides in the identification of model parameters based on the minimization of mismatch between the spatio-temporal data and the simulated values obtaining as the solution of the forward problem. In the next, we define the objective function and FRAP data structure, as well. 2.1. Spatio-temporal FRAP data Based on FRAP experiments, we have a 2D dataset in form of a table with (N + 1) rows corresponding to the number of spatial points where the values are measured, and (m + M + 1) columns with m pre-bleach and M + 1 post-bleach experimental values forming 1D profiles yexp (xi , τj ),

i = 0 . . . N,

j = −m . . . M,

where x0 = 0 and xN = 1. In fact, the process is determined by m columns of pre-bleach data containing the information about the steady state and optical distortion,4 and M + 1 columns of post-bleach data, cf. Fig. 2, containing the information about the transport of unbleached particles (due to the diffusion) through the boundary. 3 The other parameter important for the FRAP measurement is the level of recovery. This quantity can be controlled by the flow of unbleached particles through the boundary. It would be the reason for using the special form of time-dependent Neumann boundary conditions. 4 The noise identification can be performed using the pre-bleach data as well.

3

2.2. Objective function We construct an objective function Y (p) representing the disparity between the experimental and simulated time-varying concentration profiles, and then within a suitable method we look for such a value p = (p1 , . . . , pM )T ∈ RM minimizing Y , i.e. we allow the variation of p in time. A usual form of the objective function is the sum of squared differences between the experimentally measured and numerically simulated time-varying concentration profiles. Taking separately temporal (sub-index j) and spatial (sub-index i) data points, we get: Y (p) =

M ∑ N ∑ j=1 i=0

2

[yexp (xi , τj ) − ysim (xi , τj , pj )] ,

(5)

where ysim (xi , τj , pj ) are the simulated values resulting from the solution of problem (2)–(4), yexp (xi , τ0 ), i = 0 . . . N, represent the initial condition f (x), cf. (3), and pj is the dimensionless diffusion coefficient in the j th time interval [τj−1 , τj ]. The left and right Dirichlet boundary conditions g0 (τ ) and g1 (τ ), cf. (4), are represented by yexp (0, τj ) and yexp (1, τj ), j = 1 . . . M, respectively. Furthermore, in order to keep control over the possible time dependence of parameter p ∈ RM , we define the following objective functions Yj (pj ) =

N ∑ i=0

2

[yexp (xi , τj ) − ysim (xi , τj , pj )] , j ∈ {1 . . . M }.

(6)

2.3. Ill-posedness and Tikhonov regularization Let us denote p∗j , j = 1 . . . M, approximate solutions of minimization problems p∗j = arg min Yj (pj ) , j ∈ {1 . . . M }. (7) pj

Problem (7) is ill-posed in the sense that the solution, i.e. the diffusion coefficients p∗1 . . . p∗M , do not depend continuously on the initial experimental data. This led us to the necessity of using some stabilizing procedure. Usually, one has to use a priori information in order to solve the problem in a stable manner. Such a priori information as smoothness of the solution p∗ ∈ RM is expressed, in this work, in form of the following (Tikhonov) regularized cost functions: Fj (pj , preg , α) =

N ∑ i=0

2

[yexp (xi , τj ) − ysim (xi , τj , pj )] + α (pj − preg )2

(8)

for j = 1 . . . M , where α ≥ 0 is a regularization parameter and preg ∈ R is an expected value. Taking α = 0, function F (p, preg , α) =

M ∑

Fj (pj , preg , α)

(9)

j=1

is identical to Y (p) and thus turns to (5). In the next subsection we formulate the Tikhonov’s idea residing in the minimization of function (9), instead of (6), with respect to p and preg for a given α. 4

2.4. Optimization method Minimization of (9) is a two-level minimization problem. It is formulated as { } p∗reg , p∗ = min min F (p, preg , α) (10) preg

p≥0

where p∗reg and p∗ = (p∗1 , . . . , p∗M )T are approximate solutions. Minimization over the variable preg ∈ R represents the outer minimization problem and minimization over the variable p ∈ RM represents the inner minimization problem. The inner minimization problem p∗ = arg min F (p, preg , α) = arg min p≥0

p≥0

M ∑

Fj (pj , preg , α)

j=1

for fixed preg can be solved by minimizing each function Fj (pj , preg , α) separately. Minimization of Fj with respect to pj ≥ 0 then represents a onedimensional optimization problem. Basic optimization method is an iteration (0) process starting from an initial point pj and generating a sequence of iterates (1)

(2)

pj , pj , . . . leading to a value p∗j such that (l+1)

pj

(l)

= pj + σ (l) d(l) ,

where d(l) is a direction vector and σ (l) > 0 is a step-length. The direction vector is determined on the basis of values (k)

(k)

(k)

(k)

pj , Fj (pj , ., .), Fj′ (pj , ., .), Fj′′ (pj , ., .),

0 ≤ k ≤ l,

and the step-length is determined on the basis of behavior of the function (l) (l) Fj (pj , ., .) in the neighborhood of pj . We have used, in the Section 5, the variable metric method implemented in the UFO system [7] for finding the solution p∗j . Having computed solutions p∗j of inner minimization problems for j = 1, . . . , M, the outer minimization problem p∗reg = arg min F (p∗ , preg , α) preg

p∗reg ,

gives due to (8), as the average value of {p∗1 , . . . , p∗M }. Note that the solutions depend on the value α, so we further use the notations p∗ (α), p∗j (α) and p∗reg (α). Now we are facing the problem to get a unique solution p∗ (α∗ ), i.e. a unique sequence of diffusion coefficients p∗j (α∗ ), j = 1, . . . , M, for a particular value of α∗ . In other words, we are looking for in some sense optimal parameter α∗ , e.g., such a value for which the variance (relative standard deviation) of the solution is “small enough” and at the same time the solution is not oversmoothed, i.e. possible variations (e.g. due to the time dependent potential field in the computational domain) are not “killed” by the excessive value of the regularization parameter α. The problem of choosing such a value α∗ is discussed in the next section. 5

3. Tikhonov regularization and properties of the L-curve 3.1. The L-curve A useful tool to visualize the relation between the residuum for different values of regularization parameter α and the norm of a solution or relative standard deviation of the solution or some other measure of variability of the solution, is the so-called L-curve [8]. Usually, this parametric plot, in our case with Y (p∗ (α)) (regularization term is not considered here) in the abscissa, and ∥p∗ (α) − p∗reg (α)∥2 in the ordinate, is L-shaped (hence the name). Note that ∥p∗ (α) − p∗reg (α)∥2 =

M ∑ [ ∗ ]2 pj (α) − p∗reg (α) j=1

where p∗reg (α) is according to a context either a scalar or M −dimensional vector of the same quantities. Further we announce four lemmas pointing out some properties of considered functions related to the L-curve. Lemma 1. Let α ≥ 0 be given and p∗ (α) be a solution of problem [ ] p∗ (α) = arg min F (p, preg , α) = arg min Y (p) + α∥p − preg ∥2 p, preg

p, preg

where Y (p) has form (5). Then

1. ||p∗ (α) − p∗reg (α)||2 is non-increasing as a function of α. 2. Y (p∗ (α)) is non-decreasing as a function of α. Proof. As p∗reg is the average value of solutions p∗1 (α), . . . , p∗M (α), we can only consider the inner minimization problem for pj . Consider two arbitrary values 0 ≤ α1 < α2 ≡ α1 + β, β > 0. Let [ ] p∗ (αi ) = arg min Y (p) + αi ||p − preg ||2 , (11) p∈RM

with corresponding values p∗reg (αi ), i = 1, 2.

1. Suppose ||p∗ (α2 ) − p∗reg (α2 )||2 > ||p∗ (α1 ) − p∗reg (α1 )||2 . As p∗ (α1 ) is a solution of (11) for α1 , it holds that Y (p∗ (α1 ))+α1 ||p∗ (α1 )−p∗reg (α1 )||2 ≤ Y (p∗ (α2 ))+α1 ||p∗ (α2 )−p∗reg (α2 )||2 . Thus Y (p∗ (α1 )) + α1 ||p∗ (α1 ) − p∗reg (α1 )||2 + β||p∗ (α1 ) − p∗reg (α1 )||2 < Y (p∗ (α2 )) + α1 ||p∗ (α2 ) − p∗reg (α2 )||2 + β||p∗ (α2 ) − p∗reg (α2 )||2 = Y (p∗ (α2 )) + α2 ||p∗ (α2 ) − p∗reg (α2 )||2

which is a contradiction with a solution p∗ (α2 ) of (11) for α2 . 6

2. Suppose Y (p∗ (α2 )) < Y (p∗ (α1 )). From 1. it holds that ||p∗ (α2 ) − p∗reg (α2 )||2 ≤ ||p∗ (α1 ) − p∗reg (α1 )||2 . Thus Y (p∗ (α2 ))+α1 ||p∗ (α2 )−p∗reg (α2 )||2 < Y (p∗ (α1 ))+α1 ||p∗ (α1 )−p∗reg (α1 )||2 which is a contradiction with a solution p∗ (α1 ) of (11) for α1 .

2

So, for α1 < α2 we have that Y (p∗ (α1 )) ≤ Y (p∗ (α2 ))

and

||p∗ (α1 ) − p∗reg (α1 )||2 ≥ ||p∗ (α2 ) − p∗reg (α2 )||2 .

Lemma 2. Let assumptions of Lemma 1 be satisfied. Then Y (p∗ (α1 )) = Y (p∗ (α2 ))



||p∗ (α1 ) − p∗reg (α1 )||2 = ||p∗ (α2 ) − p∗reg (α2 )||2 .

Proof. 1. Let ||p∗ (α1 ) − p∗reg (α1 )||2 = ||p∗ (α2 ) − p∗reg (α2 )||2 . Then Y (p∗ (α2 )) + α1 ||p∗ (α1 ) − p∗reg (α1 )||2 + β||p∗ (α1 ) − p∗reg (α1 )||2 Y (p (α2 )) + α1 ||p (α2 ) − ∗



Y (p (α1 )) + α1 ||p (α1 ) − ∗



p∗reg (α2 )||2 p∗reg (α1 )||2

+ β||p (α2 ) − ∗

+ β||p (α1 ) − ∗

p∗reg (α2 )||2 p∗reg (α1 )||2

= ≤

because p∗ (α2 ) is a solution of (11) for α2 = α1 + β. Thus Y (p∗ (α2 )) ≤ Y (p∗ (α1 )) which implies Y (p∗ (α2 )) = Y (p∗ (α1 )) due to Lemma 1. 2. Let Y (p∗ (α1 )) = Y (p∗ (α2 )). Then Y (p∗ (α2 )) + α1 ||p∗ (α1 ) − p∗reg (α1 )||2 Y (p (α1 )) + α1 ||p (α1 ) − ∗



Y (p (α2 )) + α1 ||p (α2 ) − ∗



p∗reg (α1 )||2 p∗reg (α2 )||2

= ≤

because p∗ (α1 ) is a solution of (11) for α1 . Thus ||p∗ (α1 ) − p∗reg (α1 )||2 ≤ ||p∗ (α2 ) − p∗reg (α2 )||2 which implies ||p∗ (α1 ) − p∗reg (α1 )||2 = ||p∗ (α2 ) − p∗reg (α2 )||2 due to Lemma 1. 2 Lemma 3. Let α1 < α2 and p∗ (α1 ), p∗ (α2 ) be solutions of (11). Let Y (p∗ (α1 )) = Y (p∗ (α2 )) Then

or

||p∗ (α1 ) − p∗reg (α1 )||2 = ||p∗ (α2 ) − p∗reg (α2 )||2 .

Y (p∗ (α1 )) = Y (p∗ (˜ α)) = Y (p∗ (α2 ))

and ||p∗ (α1 ) − p∗reg (α1 )||2 = ||p∗ (˜ α) − p∗reg (˜ α)||2 = ||p∗ (α2 ) − p∗reg (α2 )||2 for all α ˜ ∈ [α1 , α2 ]. Proof. The assertion of the Lemma follows directly from Lemmas 1 and 2. 7

3.2. Tikhonov regularization and the optimality conditions Tikhonov regularization [6] is based on adding a regularization term in (5) getting (9) and solving the problem [ ] p∗ (α) = arg min Y (p) + α∥p − preg ∥2 , st. p ≥ 0. (12) p, preg

As p∗reg (α) is the average value of p∗1 (α), . . . , p∗M (α), necessary Karush-KuhnTucker (KKT) optimality conditions for a solution have the form [9]: ( ) ∂Yj (pj ) 1 + 2α(pj − preg ) 1 − = λj , j = 1, . . . , M, ∂pj M pj ≥ 0,

λj pj = 0,

λj ≥ 0,

j = 1, . . . , M,

where λ ∈ RM is a vector of Lagrange multipliers. Note that, in fact, the diffusion coefficients p∗j (α) are strictly positive, so the KKT conditions reduce into the simple form ( ) ∂Yj (pj ) 1 + 2α(pj − preg ) 1 − = 0, j = 1, . . . , M. (13) ∂pj M The following Lemma says that the L-curve is decreasing and convex function. Lemma 4. Let assumptions of Lemma 1 be satisfied. Then ||p∗ (α) − p∗reg (α)||2 is a decreasing and convex function of Y (p∗ (α)). Proof. Due to Lemma 1 and (13) it holds that ∂Y (p∗ (α)) ∂α

=

M ∑ ∂Yj (p∗j (α)) ∂p∗j (α) j=1

= −2α ∂||p∗ (α) − p∗reg (α)||2 ∂α

=

∂pj

( ) M ∑ ) ∂p∗j (α) ( ∗ 1 pj (α) − p∗reg (α) 1 − ≥ 0, ∂α M j=1 M ∑ ∂(p∗j (α) − p∗reg (α))2 ∂p∗j (α)

∂pj

j=1

=

(14)

∂α

2

M ∑ ∂p∗j (α) ( j=1

∂α

∂α

(15)

( ) ) 1 p∗j (α) − p∗reg (α) 1 − ≤ 0. M

When (14) is positive then there exists an inverse function α(Y ) and we have ∂||p∗ (α) − p∗reg (α)||2 ∂α ∂||p∗ (α) − p∗reg (α)||2 1 = =− <0 ∂Y ∂α ∂Y α

(16)

which means that ||p∗ (α) − p∗reg (α)||2 is a decreasing function of Y (p∗ (α)). 8

The second derivative implies that ∂ 2 ||p∗ (α) − p∗reg (α)||2 ∂ 1 1 ∂α(Y ) =− = 2 >0 (17) 2 ∂Y ∂Y α α ∂Y which means that ||p∗ (α) − p∗reg (α)||2 is a convex function of Y (p∗ (α)). 2 Figure 1 shows a typical theoretical behavior of the L-curve. In the upperleft part we have small values of α corresponding to the under-smoothing (the solution is corrupted by the noise in data). The lower-right part corresponds to the over-smoothing, i.e. the regularization term dominates for large α, making the solution smoother. More exactly, it holds that limα→0 p∗j (α) = p∗j (0), i.e. the solution tends to irregularized solution. For α → ∞ we have that (i) function values Y (p∗ (α)) become larger (although there is a supremum) and (ii) ∥p∗ (α) − p∗reg (α)∥2 → 0, i.e. the estimated parameter variance is diminishing or even p∗j (α) ≡ p∗reg (α), ∀j. This fact also follows from (14) or (15) where we have M ∑ ∂p∗j (α) j=1

∂α

(p∗j (α) − p∗reg (α)) ≤ 0,

(18)

which implies that ∥p∗ (α) − p∗reg (α)∥2 is diminishing. The question is how to choose a “right” (in some sense optimal) parameter α∗ . Hansen proposed in [8] the so-called L-curve criterion consisting in finding the point of maximal curvature on the L-curve. This point with corresponding solution p∗ (α∗ ) is called L-curve optimal. However, in most cases this point is hard to determine. An example of the L−curve 1 0.9

L−curve

α→ 0

0.8

||2

0.5

reg

0.6

||p−p

0.7

0.4

← α = [δ,L(δ)]

0.3 0.2 0.1 0 100

← α* 102

α→ ∞ 104

106

108

110

Y(p)

Figure 1: L-curve in a general case.

4. Least squares with a quadratic constraint To avoid the above mentioned complication of non-unambiguous choice of the parameter α∗ , other approaches can be used. We follow the ideas for linear 9

case Ax = b of Hansen [8] and formulate them for our nonlinear case, i.e. for solving initial boundary value problem (2)-(4). 4.1. Constraint based on determination of estimated parameter variance The first alternative approach to Hansen’s L-curve criterion consists in prescribing the value of ∥p∗ − p∗reg ∥2 in advance. As the norm ∥p∗ (α) − p∗reg (α)∥2 diminishes for α → ∞, assume that we have prescribed the variance in the solution with some value ξ. According to Hansen [8], we can solve the following equivalent optimization problem with a quadratic constraint p∗ (ξ) = arg min Y (p), p∈RM

st. ∥p − preg ∥2 ≤ ξ,

p ≥ 0,

(19)

where preg is the average value of p1 , . . . , pM . 4.2. Measurement noise based constraint Suppose that we either know or can estimate the noise in input data. If we δ denote yexp (xi , τj ) as real noisy data and yexp (xi , τj ) as ideal data that would be measured without the noise, then M ∑ N ∑ ]2 [ δ yexp (xi , τj ) − yexp (xi , τj ) ≤ δ j=1 i=0

where δ specifies the noise level (for the normally distributed non-correlated additive noise with the variance σ02 , we have δ ≈ M N σ02 ). This leads to another possibility to solve the optimization problem (10). As Hansen [8] claims, the following optimization problem is again equivalent to the previous ones p∗ (δ) = arg min ∥p − preg ∥2 , p∈RM

st. Y (p) ≤ δ,

p ≥ 0.

(20)

4.3. On the equivalence of all types of methods The necessary KKT optimality conditions for a solution of problem (19) have the form [9]: ( ) ∂Yj (pj ) 1 (1) = λj − 2µ(1) (pj − preg ) 1 − , j = 1, . . . , M, ∂pj M ∥p − preg ∥2 ≤ ξ, pj ≥ 0,

(1)

[ ] µ(1) ξ − ∥p − preg ∥2 = 0,

λj pj = 0,

(1)

λj

≥ 0,

µ(1) ≥ 0,

j = 1, . . . , M,

while the KKT optimality conditions for a solution of problem (20) have the form [9]: ) ( 1 ∂Yj (pj ) (2) = λj − µ(2) , j = 1, . . . , M, 2(pj − preg ) 1 − M ∂pj 10

Y (p) ≤ δ, pj ≥ 0,

(2)

µ(2) [δ − Y (p)] = 0,

λj pj = 0,

(2)

λj

≥ 0,

µ(2) ≥ 0, j = 1, . . . , M,

where, again, λ(i) ∈ RM , µ(i) ∈ R, i = 1, 2 are Lagrange multipliers. Now we show that the solutions of problems (19)-(20) are attained on the boundary. Lemma 5. (Lawson, Hanson [10]) Let α ≥ 0 be fixed, let p∗ (α) be a solution of problem (12). Denote Y (p∗ (α)) = δ and ||p∗ (α) − p∗reg (α)||2 = ξ. Then 1. p∗ (α) is a solution of problem (19). 2. p∗ (α) is a solution of problem (20).

Proof. Let there exist p˜ so that ||˜ p − p˜reg ||2 ≤ ξ Then

and Y (˜ p) < Y (p∗ (α)).

Y (˜ p) + α||˜ p − p˜reg ||2 < Y (p∗ (α)) + α||p∗ (α) − p∗reg (α)||2

which is a contradiction with the fact that p∗ (α) solves (12). The second part of the Lemma can be proved in the same way. 2 We see that the solutions of problems (19)-(20) are attained on the boundary. Thus the parts of KKT optimality conditions in square brackets are zero, µ(1) , µ(2) ≥ 0 are arbitrary numbers, and the solutions of problems (19) and (20) are also the solution of problem (12). All three problems (12),(19),(20) are equivalent because they lead to the same KKT optimality conditions. Relations between Lagrange multipliers and regularization parameter α are the following: 1. α = µ(1) = (1)

2. λj = λj 3. λj =

(1) λj

1 µ(2)



=

(2) λj µ(2)

=

(2) λj .

µ(1) µ(2) = 1,

,

It follows that either µ(1) = µ(2) = 1 which is impossible because α can be arbitrary or (1) (2) λj = λj = λj = 0 which is in agreement with the fact that the solutions, i.e. the diffusion coefficients p∗j (α), j = 1, . . . , M, are expected to be strictly positive. As mentioned above, the KKT conditions then correspond to the unconstrained case and have the form (13). Note that from (13) and from the fact that p∗reg (α) is the average value of ∗ p1 (α), . . . , p∗M (α), it holds that M ∑ ∂Yj (pj ) j=1

∂pj

11

= 0.

(21)

If we return to the Figure 1, then each value δ (specifying the noise level) corresponds the value L(δ) = ξ on the L-curve so that Y (p) = δ



∥p − preg ∥2 = L(δ).

Moreover, this point also corresponds to a certain Tikhonov regularization parameter α, i.e. α ≡ [δ, L(δ)]. We call the respective α∗ for a given noise δ ∗ as noise optimal and the solution p∗ (δ ∗ ) corresponds to the solution found by applying the Morozov’s discrepancy principle [11]. Note that further we will prefer to solve problem (20) rather than problem (19) because the estimation of the noise in data is more straightforward and from the practical point of view (as mentioned above) a routine task. 5. Numerical example In this section we illustrate both the difficulties caused by the ill-posedness of our problem and the proposed solution. We have performed numerical computations with the synthetic input data. The input test data yexp (xi , τj ) of the “experimental” diffusion process is 11 time series each with 21 spatial points (N = 21 and M = 10). The results, i.e. the diffusion coefficients p∗ (α) were obtained using our software CA-FRAP5 . The input data are shown in Fig. 2. On the left there are exact data (without the noise) while on the right there are data with the Gaussian noise added. More exactly, they were obtained from the exact data by adding the normally distributed Gaussian noise with the zero mean and coefficient of variation (defined as the ratio of the standard deviation to the exact value) equal to 0.1. Only three time series corresponding to indices j = 0, 5, 10 are shown. The resulting values of respective parameters are plotted in Fig. 3. On the left part there is a solution of problem (7) for both exact and noisy data when no regularization is used (α = 0). On the right part there are solutions of problem (12) for noisy data using three different values of α > 0. The smoothing effect of regularization is clearly visible. In Fig. 4 (a)-(c) we plot the function values for the computed solutions p∗ (α) when Tikhonov regularization was used. In part (a) we see non-decreasing values of Y (p∗ (α)) and in part (b) we see non-increasing values of ∥p∗ (α) − p∗reg (α)∥2 , which is in agreement with Lemma 1. In part (c) there is a parametric plot of the L-curve, i.e. values ∥p∗ (α) − p∗reg (α)∥2 against Y (p∗ (α)). We realize that in this particular case the method based on finding a point with maximal curvature which corresponds to α∗ fails. To complete the overall information about the behavior of p∗reg (α), the dependence of p∗reg (α) on α is shown in part (d). The practical illustration of the Hansen’s conjecture about the equivalency of Tikhonov approach (12) and measurement noise based constraint approach (20) is shown in Figure 5. In order to get this plot, we first computed solutions p∗ (α) 5 [email protected]

12

of problem (12) with corresponding values Y (p∗ (α)) and Pα ≡ ∥p∗ (α)−p∗reg (α)∥2 for α = 1, . . . , 1000. Then we took each value Y (p∗ (α)) for solving problem (20) subject to constraint Y (p) ≤ δ = Y (p∗ (α)) and getting the solution p∗ (δ) with corresponding function value Pδ ≡ ∥p∗ (δ) − p∗reg (δ)∥2 . As the solution p∗ (δ) is attained on the boundary, see Lemma 5, it holds Y (p∗ (δ)) = δ. Finally, in order to compare both methods mutually, we computed differences |Pα − Pδ | for all particular cases Y (p∗ (α)) = Y (p∗ (δ)) = δ. We can see that the differences are negligible, hence, the results do not depend on the chosen method. Test data (with the Gaussian noise) 11

10

10

9

9

8

8

7

7

6

6

yexp

yexp

Test data (without noise) 11

5

5

4

4

3

3

2

2

1

1

0

5

10

15

0

20

5

10

i

15

20

i

Figure 2: Test data without the noise and with the additive Gaussian noise.

*

*

Test data − values pj (α) for α=0

Test data − values pj (α) for different α

1

1 without noise Gaussian noise

0.8

0.7

0.7

0.6

0.6

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1 0 1

α=0 α = 10 α = 100 α = 1000

0.9

0.8

p*j (α)

j

p*(α)

0.9

0.1 2

3

4

5

6

7

8

9

0 1

10

j

2

3

4

5

6

7

8

9

10

j

Figure 3: Left: solution of problem (7) for exact and noisy data, no regularization is used. Right: solutions of problem (12) for noisy data using different Tikhonov regularization parameter α.

6. Conclusions We have presented two methods for the solution of the apparently simple parameter identification problem. Due to the noisy data from the spatio-temporal 13

Test data − values Y(p*(α)) vs. α

Test data − values ||p*(α)−p* (α)||2 vs. α reg

173.5

0.35

Values ||p*(α)−p*reg(α)||2

173 0.3

2

172.5

||p (α)−preg(α)||

0.2

* 171

0.15

*

Y(p*(α))

172 171.5

0.25

170.5

0.1

170 0.05

169.5

Values Y(p*(α))

169 0

200

400

α

*

600

*

800

0 0

1000

2

*

200

400

α

600

800

1000

Test data − values p*reg(α) vs. α

Test data − ||p (α)−preg(α)|| vs. Y(p (α)) 0.35

0.46

L−curve 0.3

0.42

0.2

preg(α)

0.15

0.4

*

||p*(α)−p*reg(α)||2

0.44

0.25

0.38

0.1

0.36

0.05

Values p* (α) 0 169

reg

170

171

172

173

0.34 0

174

Y(p*(α))

200

400

α

600

800

1000

Figure 4: Function values for the computed solution p∗ (α): (a) upper-left: values Y (p∗ (α)), (b) upper-right: values ∥p∗ (α) − p∗reg (α)∥2 , (c) lower-left: L-curve, (d) lower-right: values p∗reg (α).

FRAP measurement and ill-posedness of the problem, we have to look for a stable numerical process, in order to get reliable results. The most usual method is the Tikhonov regularization. Nevertheless, in our specific nonlinear problem we had to deal with the complicated task of determining the optimal regularization parameter α∗ . Fortunately, there is an equivalent method based on least squares with a quadratic constraint regularization. While the Tikhonov regularization method does not use the information about the noise level in data, the latter approach naturally takes into account the noise level and corresponds to the Morozov’s discrepancy principle as well. The equivalence of both methods is theoretically proven and practically illustrated on a numerical example. Acknowledgement This work was supported by the long-term strategic development financing of the Institute of Computer Science (RVO:67985807), and by the Ministry of Education, Youth and Sports of the Czech Republic - projects CENAKVA (No. CZ.1.05/2.1.00/01.0024) and CENAKVA II (No. LO1205 under the NPU I program). 14

−3

7

x 10

Test data − values |Pα−Pδ| vs. δ Values |P −P | α

δ

6

4

α

δ

|P −P |

5

3 2 1 0 169

170

171

δ

172

173

174

Figure 5: Differences (in absolute value) between Pα ≡ ∥p∗ (α) − p∗reg (α)∥2 computed by Tikhonov approach (12) and Pδ ≡ ∥p∗ (δ) − p∗reg (δ)∥2 computed by measurement noise based constraint approach (20).

References [1] R. Kaˇ na, O. Pr´aˇsil, C. W. Moullineaux, Immobility of phycobilins in the thylakoid lumen of a cryptophyte suggests that protein diffusion in the lumen is very restricted, FEBS letters 583 (2009) 670–674. ˇ Pap´ [2] S. aˇcek, R. Kaˇ na, C. Matonoha, Estimation of diffusivity of phycobilisomes on thylakoid membrane based on spatio-temporal FRAP images, Math. Comput. Modelling 57 (7-8) (2013) 1907–1912. ˇ Pap´ [3] S. Kindermann, S. aˇcek, Optimizing data space selection for parameter identification in a reaction-diffusion model based on frap measurement, Manuscript in preparation (2014) 22 p. [4] J. Hadamard, Lectures on the Cauchy Problem in Linear Partial Differential Equations, Yale University Press, New Haven, 1923. [5] H. W. Engl, M. Hanke, A. Neubauer, Regularization of inverse problems, Vol. 375 of Mathematics and its Applications, Kluwer Academic Publishers Group, Dordrecht, 1996. [6] A. N. Tikhonov, V. Y. Arsenin, Solutions of ill-posed problems, V. H. Winston & Sons, Washington, D.C.: John Wiley & Sons, New York-Toronto, Ont.-London, 1977, translated from the Russian, Preface by translation editor Fritz John, Scripta Series in Mathematics. ˇ ska, uma, C. Matonoha, J. Vlˇcek, N. Rameˇsov´ a, M. Siˇ [7] L. Lukˇsan, M. T˚ J. Hartman, Ufo 2014, interactive system for universal functional optimization, Tr1191, ICS AS CR, Prague (2014). URL http://www.cs.cas.cz/luksan/ufo.html 15

[8] P. C. Hansen, Rank-deficient and discrete ill-posed problems, SIAM Monographs on Mathematical Modeling and Computation, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1998, numerical aspects of linear inversion. [9] A. Antoniou, W.-S. Lu, Practical optimization, Springer, New York, 2007, algorithms and engineering applications. [10] C. L. Lawson, R. J. Hanson, Solving least squares problems, Vol. 15 of Classics in Applied Mathematics, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1995, revised reprint of the 1974 original. [11] V. A. Morozov, On the solution of functional equations by the method of regularization, Soviet Math. Dokl. 7 (1966) 414–417.

16