Towards hybrid system modeling of uncertain complex dynamical systems


Nonlinear Analysis: Hybrid Systems 2 (2008) 383–393 www.elsevier.com/locate/nahs

Thordur Runolfsson

School of Electrical and Computer Engineering, The University of Oklahoma, Norman, OK 73019, United States

Received 17 May 2006; accepted 30 May 2006

Abstract

In this paper we consider the problem of finding a low dimensional approximate model for a discrete time Markov process. This problem is of particular interest in systems that exhibit so-called metastable behavior, i.e. systems whose behavior is principally concentrated on a finite number of disjoint components of the state space. The approach developed here is based on a proper orthogonal decomposition and, unlike most existing approaches, does not require the Markov chain to be reversible. An example is presented to illustrate the effectiveness of the proposed method.

© 2006 Elsevier Ltd. All rights reserved.

1. Introduction

In this paper we consider the problem of finding a low dimensional approximate model of a Markov process. This problem is of particular interest in systems that exhibit so-called metastable behavior, i.e. there exists a decomposition of the state space into a finite number of components with the characteristic that once the system enters one of the components it spends a long period of time there, but eventually transitions to one of the other components, where the behavior repeats itself. Systems that exhibit this kind of behavior arise, for instance, in complex dynamical systems that have many stable limit sets and are perturbed by a small noise process. For complex systems that exhibit such metastable behavior, a model simplification in the form of a hybrid system may be of considerable interest. A first step in such a process of model simplification is the characterization of the metastable behavior. This is the problem considered in the present paper.

The characterization of metastable behavior has recently received considerable attention in the literature. In particular, in [1,3] set theoretic methods are developed to characterize metastable behavior, in [5] spectral theoretic methods are used to characterize metastable sets, and in [11,8] a spectral decomposition of the Markov process is developed. In the above treatments that use the spectral properties of the system, it is assumed that the underlying Markov process is reversible, and their development depends critically on this assumption. In the present paper, we develop an alternative approach for identifying a low dimensional approximate model. The method developed is based on ideas from proper orthogonal decomposition (principal component analysis) and does not require the reversibility of the underlying Markov process.

E-mail address: [email protected]. 1751-570X/$ - see front matter. doi:10.1016/j.nahs.2006.05.004


The paper is organized as follows: in Section 2 we discuss the type of complex behavior that is of interest; in Section 3 we formulate the problem, present the principal mathematical ideas of the approach and state the main results; in Section 4 we consider the simpler finite dimensional case and present an example.

2. Complex behavior

In this section, we discuss the type of complex behavior that gives rise to the metastable behavior discussed in the Introduction. The intention of this discussion is to explore the type of models that will give rise to this type of behavior, but at the same time we emphasize that systems that exhibit this type of behavior are not limited to the models discussed here. Consider the stochastic system
$$\xi_{n+1} = b(\xi_n) + \varepsilon w_n \qquad (1)$$

where $w_n$ is a sequence of iid standard Gaussian random variables and $\varepsilon$ is a constant parameter. The above system is a perturbation of the deterministic system
$$\zeta_{n+1} = b(\zeta_n). \qquad (2)$$

Assume first that $\zeta = 0$ is a unique equilibrium point of the deterministic system that is (globally) asymptotically stable, and let $D$ be a bounded set that contains the origin and is an invariant set for (2). Then it can be shown that for any initial state in the interior of $D$ the solution of (1) will exit $D$ at some (random) time instant with probability one. Furthermore, for small $\varepsilon$, it can be shown using the techniques in [2] that $\Pr(\xi_n \notin D \mid \xi_{n-1} \in D) \propto e^{-c/\varepsilon^2}$ for some appropriately defined constant $c$. In particular, if the noise intensity $\varepsilon$ is small, then the time until first exit from $D$ is large.

Assume now that (2) has several stable equilibrium points, say $\zeta^0, \ldots, \zeta^k$, and let $D_0, \ldots, D_k$ be the corresponding domains of attraction. Then, for small $\varepsilon$, it follows that once the solution of (1) enters one of the $D_i$'s, it will stay there for a long period of time, but will eventually exit and enter one of the other domains of attraction. This exit time problem has been studied in considerable detail for diffusion processes in [2], and again it can be shown that the probability of a transition from one domain of attraction to another is of the order of $e^{-c/\varepsilon^2}$. More complex random perturbations of the dynamics of (2) than the additive perturbations in (1) are also possible. Indeed, small random perturbations of the dynamics of (2), as defined in [6,7], or the worst case perturbations in [15], are other candidate formulations. Irrespective of the origin of the formulation, the resulting system has the characteristic that there exists a finite collection of disjoint sets such that the system spends long periods in one of the sets before transitioning to another, i.e. the system has metastable behavior. A system that exhibits the complex metastable behavior discussed above can be, and frequently is, an ergodic process.
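As a concrete illustration (a sketch, not from the paper), the following simulates a scalar instance of (1) in which the deterministic map has two stable equilibria; the map $b$, the step size, and the noise level are illustrative choices. With small noise the trajectory lingers near one equilibrium for long stretches and only occasionally hops to the other:

```python
import numpy as np

# Hypothetical scalar instance of system (1): b(x) = x + h*(x - x^3)
# has stable equilibria at x = -1 and x = +1 with domains of attraction
# (-inf, 0) and (0, inf). Small additive noise makes the two domains
# metastable sets: long residences, rare transitions.

def b(x, h=0.1):
    # one Euler step of the double-well drift x' = x - x^3
    return x + h * (x - x**3)

def simulate(n_steps=200_000, eps=0.12, x0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    xi = np.empty(n_steps)
    xi[0] = x0
    for n in range(1, n_steps):
        xi[n] = b(xi[n - 1]) + eps * rng.standard_normal()
    return xi

xi = simulate()
switches = np.count_nonzero(np.diff(np.sign(xi)))  # hops between the wells
occupancy = np.mean(np.abs(xi) > 0.5)              # fraction of time near +/-1
print(switches, occupancy)
```

Shrinking `eps` lengthens the residences rapidly, consistent with the $e^{-c/\varepsilon^2}$ scaling of the transition probability.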
Consequently, if the principal interest is the long term average behavior, one may consider an infinite horizon or ergodic performance measure for the system. However, one should note that when the system exhibits metastable behavior as described above, an average cost criterion may be unsatisfactory. In particular, if the metastable behavior evolves on a time scale that is comparable to the (long) time scale of interest, an ergodic design criterion may lead to oversimplified results or conclusions.

While in general it may not be practical or possible to measure or observe the full process dynamics, it is often feasible to observe or identify the metastable behavior. We note that this identification involves coarse graining in both the spatial and time scales. Usually, identifying the metastable behavior involves identifying in which of a finite number of metastable states the process resides. However, characterizing the metastable behavior may require the identification of a metastable partition of the phase space of the process $\xi$. In general, this is a difficult problem that has been solved only for special cases in the literature. In this paper, we develop a general approach for the direct identification of an approximate operator that captures the metastable behavior of a general Markov chain process. The developed approach does not require the identification of a metastable partition of the state space, and is based on the spectral properties of an associated operator.

3. Problem formulation and metastability

Let $X$ be a Polish space (complete, separable metric space), let $\mathcal{B} = \mathcal{B}(X)$ be the Borel $\sigma$-algebra on $X$, and consider a Markov chain $(\xi_0, \xi_1, \ldots)$ defined on the state space $X$ with transition function $p(\xi, A)$ defined for all $\xi \in X$ and Borel sets $A \in \mathcal{B}$, i.e. $p(\xi, A)$ satisfies


1. $p(\cdot, A)$ is measurable for each fixed $A \in \mathcal{B}$,
2. $p(\xi, \cdot)$ is a probability measure on $(X, \mathcal{B})$ for each fixed $\xi \in X$.

Let $M_1 = M_1(X)$ be the space of all probability measures on $X$ and assume that the initial state $\xi_0$ has distribution $\nu \in M_1$. Then the distribution of $\xi_1$ is
$$\nu_1(A) = \int_X p(\xi, A)\,\nu(d\xi) = (P\nu)(A) \qquad (3)$$

where $P : M_1 \to M_1$ is the Perron–Frobenius operator. It follows that the distribution of $\xi_n$ is $\nu_n(A) = (P^n \nu)(A)$. A measure $\mu$ is said to be an invariant measure for the Markov chain if $(P\mu)(A) = \mu(A)$ for all Borel sets $A$. We assume that $\mu$ is the unique invariant probability measure for $p(\xi, A)$. We note that if $\nu \ll \mu$, then there exists a density $g$ such that $d\nu = g\,d\mu$. Furthermore, it can be shown [13] that $\nu_n \ll \mu$ for all $n$, and consequently for any such $\nu$ we can consider $P$ as an operator on $L^1(\mu)$, i.e. $P$ maps the density $g$ of $\nu$ into the density $g_1 = Pg$ of $\nu_1$. Note that since $\mu$ is a finite measure on $X$, we can view $P$ as an operator on $L^m(\mu)$ for any integer $m \ge 1$.

Let $A$ and $B$ be Borel sets in $X$, assume that the distribution of the initial state is the invariant measure $\mu$, and define the transition probability from $B$ into $A$ as
$$p(A|B) = \Pr(\xi_1 \in A \mid \xi_0 \in B) = \frac{\Pr(\xi_1 \in A,\; \xi_0 \in B)}{\Pr(\xi_0 \in B)} = \frac{\int_B p(\xi, A)\,\mu(d\xi)}{\mu(B)}$$
if $\mu(B) > 0$, and $p(A|B) = 0$ otherwise. We note that $p(A|B)$ characterizes the dynamical fluctuations of the distribution of the Markov chain within the invariant distribution $\mu$. The following definition is given in [5].

Definition 1. A Borel set $A$ is said to be invariant if $p(A|A) = 1$, and metastable if $p(A|A) \approx 1$.

Therefore, a metastable set is almost invariant. Consider a measurable partition $A_1, \ldots, A_N$ of the state space $X$, i.e. $A_i \in \mathcal{B}$, $\bigcup_{i=1}^N A_i = X$ and $A_i \cap A_j = \emptyset$ unless $i = j$. We are interested in finding a partition such that $\sum_{i=1}^N p(A_i|A_i) \approx N$, i.e. a metastable partition. For such a partition, if one exists, we can approximate the Markov chain with transition function $p(\xi, A)$ on $X$ by an appropriately defined finite state Markov chain (defined on a state space $S$ of dimension $M$). In particular, the approximate Markov chain is obtained by a projection of the original chain onto the span of the characteristic functions of the metastable partition.
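For intuition (a sketch, not the paper's algorithm), the coarse transition probabilities $p(A_i|A_j)$ of Definition 1 can be estimated from a single long trajectory by counting transitions between partition cells; the chain and the two-set partition $A_1 = \{x < 0\}$, $A_2 = \{x \ge 0\}$ below are illustrative:

```python
import numpy as np

# Estimate p(A_i | A_j) for a double-well chain and the hypothetical
# partition A_1 = (-inf, 0), A_2 = [0, inf). For a metastable partition
# the diagonal entries are close to 1, so the trace is close to N = 2.
rng = np.random.default_rng(1)
h, eps, n_steps = 0.1, 0.15, 100_000
xi = np.empty(n_steps)
xi[0] = 1.0
for n in range(1, n_steps):
    drift = xi[n - 1] + h * (xi[n - 1] - xi[n - 1] ** 3)
    xi[n] = drift + eps * rng.standard_normal()

labels = (xi >= 0).astype(int)            # cell index along the trajectory
counts = np.zeros((2, 2))
for j0, j1 in zip(labels[:-1], labels[1:]):
    counts[j1, j0] += 1                   # column j0: transitions out of cell j0
p_cond = counts / counts.sum(axis=0)      # p(A_i | A_j); columns sum to 1
print(p_cond, np.trace(p_cond))
```

A trace close to 2 confirms that the sign partition is metastable for this chain; a poorly chosen partition would give a noticeably smaller trace.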
The characterization of a metastable partition is in general a difficult problem. This problem has been considered in considerable detail in [5,16] for the special case where the Markov chain is reversible, i.e. $p(\xi, d\eta)\,\mu(d\xi) = \mu(d\eta)\, p(\eta, d\xi)$, and therefore the operator $P$ is self-adjoint on $L^2(\mu)$. It is shown in [5,16] that in this case the characterization of the metastable partition can be related to spectral properties of $P$, with the indicator (characteristic) functions of the metastable sets "identified" with the eigenfunctions corresponding to the dominant (i.e. closest to one) eigenvalues of $P$. Intuitively, this is most easily understood if one writes
$$p(A|A) = \frac{\langle P\chi_A, \chi_A \rangle_\mu}{\langle \chi_A, \chi_A \rangle_\mu}$$
where $\chi_A$ is the indicator of the set $A$ and $\langle \cdot, \cdot \rangle_\mu$ is the inner product on $L^2(\mu)$. If $\delta \approx 1$ is an eigenvalue of $P$ with eigenfunction $\varphi \in L^2(\mu)$, then $\frac{\langle P\varphi, \varphi \rangle_\mu}{\langle \varphi, \varphi \rangle_\mu} = \delta$. Thus, if $\varphi$ is supported and approximately constant on some Borel set $A$, then that set will be a metastable set. The extension of this result to the non-reversible case is technically very difficult. An extension that involves characterizing a reversible chain that has the same almost invariant sets as the original chain is discussed in [6]. An alternative approach for characterizing metastable behavior, one that does not require identification of metastable sets and is based on spectral properties of the Markov chain, is developed for reversible finite dimensional chains in [11,8]. There is a large class of systems that do not satisfy the reversibility assumption. This is best illustrated by a simple example.

Example 2. Consider the linear system
$$\xi_{n+1} = F\xi_n + w_n \qquad (4)$$


where $F \in \mathbb{R}^{r \times r}$ is a stable matrix (i.e. all eigenvalues inside the unit disk) and $w_n$ is a sequence of iid Gaussian random variables with zero mean and covariance $W > 0$. Then $p(\xi, A)$ has density
$$p(\xi, \eta) = \frac{1}{(2\pi)^{r/2} \sqrt{\det W}} \, e^{-\frac{1}{2}(\eta - F\xi)^\top W^{-1} (\eta - F\xi)}$$
and the stationary distribution for (4) has density
$$p(\xi) = \frac{1}{(2\pi)^{r/2} \sqrt{\det \Sigma}} \, e^{-\frac{1}{2} \xi^\top \Sigma^{-1} \xi}$$
where the stationary covariance satisfies $\Sigma = F \Sigma F^\top + W$. The process is reversible if the detailed balance equation
$$p(\xi)\, p(\xi, \eta) = p(\eta)\, p(\eta, \xi) \qquad (5)$$
is satisfied. A straightforward calculation shows that (5) is satisfied if and only if
$$W^{-1} = F^\top W^{-1} F + \Sigma^{-1}, \qquad F^\top W = W F.$$
Obviously, these conditions are only satisfied in very special cases. Therefore, even for a simple linear dynamical system, the reversibility conditions are rarely satisfied.

In this paper, we suggest an alternative direct method for identifying metastable behavior that applies to the general non-reversible case. The approach is based on proper orthogonal decomposition (POD). We start with a brief review of POD.

3.1. Proper orthogonal decomposition

Let $H$ be a Hilbert space with inner product $\langle \cdot, \cdot \rangle$ and let $P : H \to H$ be a bounded linear operator. Let $\varphi_1, \ldots, \varphi_N \in H$ and define $u_i = P\varphi_i$, $i = 1, \ldots, N$. We view the $\varphi_i$'s as test functions and the $u_i$'s as the resulting data. Typically the test functions $\varphi_1, \ldots, \varphi_N$ are chosen so that, as $N \to \infty$, the projection of $H$ onto the span of the $\varphi_i$'s converges to the identity in an appropriate way. Let $\mathcal{U} = \mathrm{span}\{u_1, \ldots, u_N\}$ and let $d = \dim(\mathcal{U})$. We seek a subspace $\mathcal{W}$ of $H$ of finite dimension $M < d$ such that the mean square error $\sum_{i=1}^N \|u_i - Q_{\mathcal{W}} u_i\|^2$ is minimized. Here $Q_{\mathcal{W}}$ is the orthogonal projection of $H$ onto $\mathcal{W}$, and $\|\cdot\|$ is the norm on $H$. Note that since $Q_{\mathcal{W}}$ is an orthogonal projection, we have $\|u_i\|^2 = \|u_i - Q_{\mathcal{W}} u_i\|^2 + \|Q_{\mathcal{W}} u_i\|^2$. Thus the minimization of the mean square error is equivalent to the maximization of the "energy" $\sum_{i=1}^N \|Q_{\mathcal{W}} u_i\|^2$ in the projection. It is well known (see e.g. [14,17]) that the solution of this problem is given by the projection
$$Q_{\mathcal{W}} \psi = \sum_{j=1}^M \langle \psi, w_j \rangle \, w_j$$
where $w_1, \ldots, w_M$ are the eigenfunctions corresponding to the $M$ largest eigenvalues of the self-adjoint linear operator $R : H \to H$ given by
$$R\psi = \sum_{i=1}^N u_i \langle \psi, u_i \rangle.$$
Note that $\langle \psi, R\psi \rangle = \sum_{i=1}^N |\langle \psi, u_i \rangle|^2 \ge 0$ for all $\psi \in H$. Therefore, the eigenvalues of $R$ are real and nonnegative, and the eigenfunctions are orthonormal. The orthonormal eigenfunctions $w_1, \ldots, w_M$ are frequently called the POD basis functions.

In order to characterize the subspace $\mathcal{W}$ and the orthogonal projection $Q_{\mathcal{W}}$, we need to calculate the POD basis functions. We note that the range of $R$ is $\mathcal{U}$. Therefore, the eigenfunction of $R$ corresponding to the eigenvalue $\lambda$ can

be written as $w = \sum_{i=1}^N c_i u_i$ for some constant coefficients $c_i$, and
$$Rw = R \sum_{i=1}^N c_i u_i = \sum_{i=1}^N c_i R u_i = \sum_{i=1}^N c_i \sum_{j=1}^N u_j \langle u_i, u_j \rangle = \sum_{i=1}^N c_i \sum_{j=1}^N u_j U_{ji}$$
where the symmetric $N \times N$ matrix $U$ has entries $U_{ij} = \langle u_j, u_i \rangle$. Furthermore, since $\lambda w = \lambda \sum_{i=1}^N c_i u_i$, it follows that $c = (c_1 \ldots c_N)^\top$ is an eigenvector of $U$ corresponding to the eigenvalue $\lambda$. Consequently, the problem reduces to finding the eigenvalues $\lambda_1, \ldots, \lambda_N$ of the positive semi-definite matrix $U$ and the associated orthonormal eigenvectors $c^1, \ldots, c^N$. In fact, we have the following.

Proposition 3. Suppose that $M < d = \dim(\mathcal{U})$ and let $U$ be the positive semi-definite matrix with entries

$U_{ij} = \langle u_j, u_i \rangle$, $1 \le i, j \le N$. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \ge 0$ and $c^1, \ldots, c^N \in \mathbb{R}^N$ be the eigenvalues and corresponding orthonormal eigenvectors of $U$. Then the $M$ POD basis functions are
$$w_j = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^N c_i^j u_i, \qquad j = 1, \ldots, M. \qquad (6)$$

Proof. As indicated above, we know that the eigenfunctions of $R$ can be written as $w_j = \sum_{i=1}^N c_i^j u_i$ for some constants $c_i^j$. We note that if $d = \dim(\mathcal{U}) < N$, then there exists a subset of $d$ independent functions in $\mathcal{U}$ such that the $w_j$'s can be written as linear combinations of those functions. To simplify the notation, we assume that $d = N$ and note that the $d < N$ case can be treated similarly. Recall that
$$R w_j = \sum_{i=1}^N c_i^j \sum_{k=1}^N u_k U_{ki} = \lambda_j \sum_{i=1}^N c_i^j u_i = \lambda_j w_j \qquad (7)$$
and therefore
$$\sum_{k=1}^N u_k \left( \sum_{i=1}^N U_{ki} c_i^j - \lambda_j c_k^j \right) = 0.$$
Due to the independence of the $u_i$'s, it follows that $\sum_{i=1}^N U_{ki} c_i^j - \lambda_j c_k^j = 0$ for each $k$, or $U c^j = \lambda_j c^j$. Next, note that since $\operatorname{rank} U = d$, we know that $\lambda_M > 0$. Furthermore,
$$\left\| w_j \right\|^2 = \left\langle \sum_{k=1}^N c_k^j u_k, \; \sum_{i=1}^N c_i^j u_i \right\rangle = \sum_{k=1}^N \sum_{i=1}^N c_k^j c_i^j U_{ki} = (c^j)^\top U c^j = \lambda_j \left\| c^j \right\|^2 = \lambda_j. \qquad (8)$$
Therefore, (6) follows after we normalize $w_j$ in (8) by $\sqrt{\lambda_j} > 0$, $1 \le j \le M$. □
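A minimal numerical sketch of Proposition 3 (with synthetic data, not from the paper): sampling the data functions $u_i$ on a grid, forming the Gram matrix $U$, and assembling the basis from its eigenvectors via formula (6) reproduces the orthonormality established in the proof:

```python
import numpy as np

# POD basis via the Gram matrix, as in Proposition 3. The data u_i are
# synthetic functions sampled on a grid; the discrete dot product plays
# the role of the inner product <.,.>.
rng = np.random.default_rng(0)
N, n_grid, M = 8, 500, 3
u = rng.standard_normal((N, n_grid)) / np.sqrt(n_grid)  # rows: u_1..u_N

U = u @ u.T                        # Gram matrix, U_ij = <u_j, u_i>
lam, C = np.linalg.eigh(U)         # ascending eigenvalues of symmetric U
lam, C = lam[::-1], C[:, ::-1]     # reorder: lam_1 >= ... >= lam_N

# w_j = (1/sqrt(lam_j)) * sum_i c_i^j u_i, formula (6); columns of W
W = (u.T @ C[:, :M]) / np.sqrt(lam[:M])
print(W.T @ W)                     # POD basis is orthonormal
```

The Gram-matrix route costs an $N \times N$ eigenproblem regardless of the grid resolution, which is the usual reason for computing the POD basis this way.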

As we mentioned earlier, if the set of test functions is sufficiently rich, then the accuracy of the POD approximation can be made arbitrarily good.

Proposition 4. Assume that the projection $Q_N$ of $H$ onto $\mathrm{span}\{\varphi_1, \ldots, \varphi_N\}$ converges pointwise to the identity on $H$ as $N \to \infty$. Let $u$ be any point in the range of $P$ and let $\varphi \in H$ be such that $u = P\varphi$. Then for any $\varepsilon > 0$ there exists an integer $\bar N$ such that for any $N \ge \bar N$ there exist constants $k_1, \ldots, k_N$ such that
$$\left\| u - \sum_{i=1}^N k_i u_i \right\| < \varepsilon \|\varphi\|.$$


Proof. For the given $\varepsilon > 0$, select an integer $\bar N$ such that $\|\varphi - Q_N \varphi\| < \frac{\varepsilon \|\varphi\|}{\|P\|}$ for all $N \ge \bar N$, where $\|P\| < \infty$ is the induced norm of $P$ on $H$. Let $k_1, \ldots, k_N$ be coefficients so that $Q_N \varphi = \sum_{i=1}^N k_i \varphi_i$. Then
$$\left\| u - \sum_{i=1}^N k_i u_i \right\| = \left\| P\varphi - \sum_{i=1}^N k_i P\varphi_i \right\| = \| P(\varphi - Q_N \varphi) \| \le \|P\| \, \|\varphi - Q_N \varphi\| < \varepsilon \|\varphi\|. \qquad \square$$


We are now in a position to state the following result on the POD approximation of the bounded linear operator $P$.

Theorem 5. Assume that the projection $Q_N$ of $H$ onto $\mathrm{span}\{\varphi_1, \ldots, \varphi_N\}$ converges pointwise to the identity on $H$ as $N \to \infty$. Then, for any $\varphi \in H$ and $\varepsilon > 0$, there exist an integer $\bar N$ and a constant $k$ such that for $N \ge \bar N$ there exists $M \le N$ such that
$$\| P\varphi - Q_{\mathcal{W}} P Q_N \varphi \| < \varepsilon (\|\varphi\| + k).$$

Proof. By Proposition 4, if we select $N$ sufficiently large, we have
$$\| P\varphi - Q_{\mathcal{W}} P Q_N \varphi \| = \left\| P\varphi - Q_{\mathcal{W}} P \sum_{i=1}^N k_i \varphi_i \right\| = \left\| P\varphi - Q_{\mathcal{W}} \sum_{i=1}^N k_i u_i \right\|$$
$$= \left\| P\varphi - \sum_{i=1}^N k_i u_i + \sum_{i=1}^N k_i u_i - Q_{\mathcal{W}} \sum_{i=1}^N k_i u_i \right\| \le \left\| P\varphi - \sum_{i=1}^N k_i u_i \right\| + \left\| \sum_{i=1}^N k_i u_i - Q_{\mathcal{W}} \sum_{i=1}^N k_i u_i \right\|$$
$$\le \varepsilon \|\varphi\| + \left\| \sum_{i=1}^N k_i (u_i - Q_{\mathcal{W}} u_i) \right\|.$$
Next, we note that for a given $N$ we can select $M$, the dimension of $\mathcal{W}$, large enough so that $\sum_{i=1}^N \|u_i - Q_{\mathcal{W}} u_i\| < \varepsilon$. Therefore,
$$\| P\varphi - Q_{\mathcal{W}} P Q_N \varphi \| \le \varepsilon \|\varphi\| + \sum_{i=1}^N \| k_i (u_i - Q_{\mathcal{W}} u_i) \| \le \varepsilon \|\varphi\| + \left( \max_i |k_i| \right) \sum_{i=1}^N \| u_i - Q_{\mathcal{W}} u_i \| \le \varepsilon (\|\varphi\| + k)$$
where $k = \max_i |k_i|$. □



Note that in order to calculate $U$, we need to evaluate the inner products $\langle u_j, u_i \rangle$. This is discussed next in terms of transfer operators of a Markov process.

3.2. Identification of a low dimensional approximation for a Markov chain

We now return to the problem of identifying a low dimensional approximation for the Markov chain on $X$ with transition function $p(\xi, A)$. Assume that there exists a unique invariant measure $\mu$ for $p(\xi, A)$, i.e.
$$(P\mu)(A) = \int_X p(\xi, A)\,\mu(d\xi) = \mu(A)$$


for all Borel sets $A$, where $P$ is the Perron–Frobenius operator defined by (3). We view $P$ as a linear operator on the Hilbert space $L^2(\mu)$. The inner product on $L^2(\mu)$ is
$$\langle u, v \rangle_\mu = \int_X u(\xi)\, v(\xi)\,\mu(d\xi).$$
The adjoint operator of $P$ is the (expectation) operator
$$(K f)(\xi) = \int_X f(\eta)\, p(\xi, d\eta).$$
Note that $p(\xi, A) = (K\chi_A)(\xi)$, where $\chi_A$ is the indicator (characteristic) function of the set $A$, i.e. $\chi_A(\xi) = 1$ if $\xi \in A$ and $0$ otherwise. Let $A_1, \ldots, A_N$ be a measurable partition of $X$ and let $\varphi_i(\xi) = \chi_{A_i}(\xi)$ and $u_i(\xi) = (K\varphi_i)(\xi) = p(\xi, A_i)$. We view the functions $u_i$ as the POD data for the operator $K$, and seek the $M < N$ dimensional subspace of $L^2(\mu)$ and the corresponding orthogonal projection that minimizes the mean square error $\sum_{i=1}^N \|u_i - Q_{\mathcal{W}} u_i\|^2$. As we saw in the previous section, the optimal $\mathcal{W}$ is $\mathcal{W} = \mathrm{span}\{w_1, \ldots, w_M\}$, where $w_j = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^N c_i^j u_i$, and $\lambda_1 \ge \cdots \ge \lambda_N \ge 0$ and $c^j = (c_1^j \ldots c_N^j)^\top$, $1 \le j \le N$, are the eigenvalues and corresponding orthonormal eigenvectors of the $N \times N$ positive semi-definite matrix $U$ with entries $U_{ij} = \langle u_j, u_i \rangle_\mu$.

Assume now that the invariant measure $\mu$ is an ergodic invariant measure of the Markov chain $(\xi_0, \xi_1, \ldots)$ with transition function $p(\xi, A)$. Then for any $f \in L^1(\mu)$
$$\lim_{K \to \infty} \frac{1}{K} \sum_{k=0}^{K-1} f(\xi_k) = \int_X f(\xi)\,\mu(d\xi).$$
Assume that we are given the data sequence $(\xi_0, \xi_1, \ldots, \xi_{S-1})$ from either experimental or simulation data. Then, since $u_j(\xi)\, u_i(\xi) = p(\xi, A_j)\, p(\xi, A_i) \in L^1(\mu)$, we have
$$U_{ij} = \langle u_j, u_i \rangle_\mu = \int_X u_j(\xi)\, u_i(\xi)\,\mu(d\xi) = \lim_{L \to \infty} \frac{1}{L} \sum_{k=0}^{L-1} u_j(\xi_k)\, u_i(\xi_k) \approx \frac{1}{S} \sum_{k=0}^{S-1} p(\xi_k, A_j)\, p(\xi_k, A_i).$$
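As a sketch (an illustrative chain, not the paper's example), the time-average estimate of $U_{ij}$ above can be checked against the exact value $U = P D P^\top$ for a small finite-state chain whose transition probabilities are known:

```python
import numpy as np

# Monte Carlo estimate U_ij ~= (1/S) sum_k p(xi_k, A_j) p(xi_k, A_i)
# for a finite-state chain with A_i = {x_i}, so u_i(x_k) = P[i, k].
rng = np.random.default_rng(2)
N, S = 4, 100_000
P = rng.random((N, N))
P /= P.sum(axis=0)                 # column-stochastic: P[j, i] = Pr(next=j | cur=i)

xi = np.empty(S, dtype=int)        # one long trajectory of the ergodic chain
xi[0] = 0
for k in range(1, S):
    xi[k] = rng.choice(N, p=P[:, xi[k - 1]])

u_on_path = P[:, xi]               # row i holds u_i(xi_k) = p(xi_k, A_i)
U_hat = (u_on_path @ u_on_path.T) / S

mu = np.linalg.matrix_power(P, 200)[:, 0]   # invariant distribution (approx.)
U_exact = P @ np.diag(mu) @ P.T
print(np.abs(U_hat - U_exact).max())        # ergodic-average error, O(1/sqrt(S))
```

Only trajectory data and the one-step probabilities along the trajectory are needed; the invariant measure itself never has to be computed in the estimate.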

In summary, we have the following method for the identification of the optimal finite dimensional subspace.

Method:
• Pick a partition $A_1, \ldots, A_N$ of $X$ and calculate the corresponding $u_i(\xi) = p(\xi, A_i)$, $i = 1, \ldots, N$.
• For the given "data" $u_i$, $i = 1, \ldots, N$, calculate the positive semi-definite matrix $U$ with entries $U_{ij} = \langle u_j, u_i \rangle_\mu$.
• Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \ge 0$ be the eigenvalues of $U$, and let $c^1, \ldots, c^N$ be the corresponding eigenvectors of $U$.
• Assume there exists an $M < N$ such that $\lambda_1, \ldots, \lambda_M$ are of comparable size and $\lambda_M \gg \lambda_{M+1}$, so that $\lambda_1 \ge \cdots \ge \lambda_M > 0$ are the dominant (largest) eigenvalues of $U$.
• For $j = 1, \ldots, M$ define $w_j = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^N c_i^j u_i$ and $\mathcal{W} = \mathrm{span}\{w_1, \ldots, w_M\} \subseteq L^2(\mu)$.

Let $w_j = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^N c_i^j u_i$, $j = 1, \ldots, M$, be generated using the above procedure. Substituting $u_i(\xi) = p(\xi, A_i)$ into the formula for $w_j(\xi)$ gives
$$w_j(\xi) = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^N c_i^j \, p(\xi, A_i) = \frac{1}{\sqrt{\lambda_j}} \sum_{i=1}^N c_i^j \int_X \chi_{A_i}(\eta)\, p(\xi, d\eta) = \int_X \sum_{i=1}^N \frac{c_i^j}{\sqrt{\lambda_j}} \chi_{A_i}(\eta)\, p(\xi, d\eta) = \int_X v_j(\eta)\, p(\xi, d\eta).$$


We note that $v_j(\eta) = \sum_{i=1}^N \frac{c_i^j}{\sqrt{\lambda_j}} \chi_{A_i}(\eta)$ characterizes the contributions of the original sets $A_i$ to the $j$th POD component.

3.3. Computational considerations

We note that the evaluation of the matrix $U$ requires the determination (calculation) of the inner products $\langle u_j, u_i \rangle_\mu$, which in turn requires the evaluation of the transition probabilities $u_i(\xi) = p(\xi, A_i)$. The evaluation of these quantities is in general a challenging computational problem, and the development of efficient methods for their evaluation is of principal interest. In the previous section, we indicated a method for calculating the $U_{ij}$'s for the case of an ergodic process. However, we note that in this calculation we assumed that we had at our disposal the probabilities $u_i(\xi) = p(\xi, A_i)$. The efficient computation of these integrals is a challenging problem by itself; candidate approaches include polynomial chaos based techniques [4,9] as well as Monte Carlo and quasi-Monte Carlo methods [10,12]. For the special case of small random perturbations, discussed in Section 2, we note that large deviation (Laplace integral) approximations can be used for the evaluation of the transition probabilities $p(\xi, A_i)$. In particular, in this case, for each $A_j$ the transition probabilities $p(\xi, A_i)$, $\xi \in A_j$, are principally determined by the characteristics of the deterministic flow; e.g. if $A_j$ is contained in an invariant set of the deterministic system, then the probability $p(\xi, A_i)$ is large if $A_i$ belongs to the same invariant set but small otherwise. Furthermore, in both cases $p(\xi, A_i)$ can be approximated by a constant.

3.4. Model reduction

In this section, we identify a finite dimensional approximate operator for the expectation operator $K$ of the Markov chain defined by $p(\xi, A)$. This is achieved by the projection of the operator $K$ onto $\mathcal{W} = \mathrm{span}\{w_1, \ldots, w_M\}$, where the $w_i$'s are orthonormal eigenfunctions of $R$ corresponding to the $M$ largest eigenvalues. The projection $Q_{\mathcal{W}} : L^2(\mu) \to \mathcal{W}$ is defined for $g \in L^2(\mu)$ by
$$Q_{\mathcal{W}} g = \sum_{i=1}^M \langle g, w_i \rangle_\mu \, w_i.$$

Let $g = K\varphi$ for some $\varphi \in L^2(\mu)$. Then
$$Q_{\mathcal{W}} K \varphi = \sum_{i=1}^M \langle K\varphi, w_i \rangle_\mu \, w_i.$$
Assume now that $\varphi \in \mathcal{W}$. Then $\varphi = \sum_{j=1}^M \alpha_j w_j$ and
$$\langle K\varphi, w_i \rangle_\mu = \left\langle K \sum_{j=1}^M \alpha_j w_j, \; w_i \right\rangle_\mu = \sum_{j=1}^M \alpha_j \langle K w_j, w_i \rangle_\mu.$$
Define an $M \times M$ matrix $S$ as the matrix with entries $s_{ji} = \langle K w_j, w_i \rangle_\mu$. Consequently
$$Q_{\mathcal{W}} K \varphi = \sum_{i=1}^M \left( \sum_{j=1}^M \alpha_j s_{ji} \right) w_i.$$
Note that the coefficients of $Q_{\mathcal{W}} K \varphi$ are given by $\alpha^\top S$, where $\alpha$ is the vector with entries $\alpha_j$, $j = 1, \ldots, M$. For an arbitrary $g \in L^2(\mu)$, we first find the projection $\varphi = Q_{\mathcal{W}} g = \sum_{j=1}^M \langle g, w_j \rangle_\mu w_j$. Then the projected operator is
$$Q_{\mathcal{W}} K Q_{\mathcal{W}} g = \sum_{i=1}^M \left( \sum_{j=1}^M \langle g, w_j \rangle_\mu \, s_{ji} \right) w_i.$$

We note finally that the approximate operator obtained above may not correspond to an expectation operator for any Markov chain.


4. Finite dimensional case

We now assume that the state space $X$ is a finite set, e.g. $X = \{x_1, \ldots, x_N\}$. In this case, we pick $A_i = \{x_i\}$ and the transition function is replaced by the transition probabilities $p_{ji} = \Pr(\xi_{n+1} = x_j \mid \xi_n = x_i)$. We let $P$ be the stochastic matrix with entries $p_{ij}$ (note that it is more common to define the stochastic matrix as the transpose of $P$, but we have defined it this way here to be consistent with the operator notation in Section 3). Assume that $P$ (and the Markov chain) is irreducible and aperiodic. Then there exists a unique invariant distribution $\mu$ that satisfies the equation $\mu = P\mu$, i.e. $\mu$ is a right eigenvector of $P$ corresponding to the eigenvalue 1. In this case, the operator $K$ has the matrix representation $P^\top$, and the symmetric matrix $U$ has entries
$$U_{ij} = \langle u_j, u_i \rangle_\mu = \sum_{k=1}^N p_{jk}\, p_{ik}\, \mu_k = (P D P^\top)_{ij}$$
where $D = \mathrm{diag}(\mu_1, \ldots, \mu_N)$. Let $\lambda_1 \ge \cdots \ge \lambda_N \ge 0$ be the eigenvalues of $U$ with corresponding eigenvectors $c^1, \ldots, c^N$, and assume that there exists an $M < N$ such that $\lambda_1, \ldots, \lambda_M$ are the dominant eigenvalues of $U$. Define the orthonormal basis functions $w_1, \ldots, w_M \in \mathbb{R}^N$ by
$$w_j = P^\top \hat c^j$$
where $\hat c^j = \frac{c^j}{\sqrt{\lambda_j}}$. Then the entries of the matrix $S$ are given by
$$s_{ij} = \langle P^\top w_i, w_j \rangle_\mu = w_i^\top P D w_j = w_j^\top D P^\top w_i, \qquad 1 \le i, j \le M.$$
In particular, it is easy to see that $S = W_M^\top D P^\top W_M$, where $W_M = [w_1 \cdots w_M] \in \mathbb{R}^{N \times M}$. We note that the projected operator is now, for $\varphi \in \mathbb{R}^N$,
$$Q_{\mathcal{W}} P^\top \varphi = \sum_{i=1}^M \langle P^\top \varphi, w_i \rangle_\mu \, w_i = \sum_{i=1}^M (\varphi^\top P D w_i)\, w_i = \sum_{i=1}^M w_i w_i^\top D P^\top \varphi = W_M W_M^\top D P^\top \varphi.$$
In particular, the projected operator has the matrix representation $K_M = W_M W_M^\top D K$. Furthermore, if $\varphi \in \mathcal{W}$, then $\varphi = W_M \alpha$ for some $\alpha \in \mathbb{R}^M$ and $K_M \varphi = W_M W_M^\top D K W_M \alpha = W_M S \alpha$. Thus, the coefficients of $Q_{\mathcal{W}} K \varphi = K_M \varphi$ are in this case $\beta = S\alpha$, i.e. $K_M \varphi = W_M \beta$.

Consider now the evolution of the projected operator $K_M$, i.e.
$$K_M^n \varphi = K_M^{n-1} K_M \varphi = K_M^{n-1} W_M W_M^\top D K \varphi = W_M S^{n-1} W_M^\top D K \varphi,$$
where $n \ge 1$ is an integer. Note that if $\varphi \in \mathcal{W}$, then $\varphi = W_M \alpha$ and $K_M^n \varphi = W_M S^n \alpha$. In particular, the evolution of the projected operator is characterized by the repeated application of the $M \times M$ coefficient matrix $S$.
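The finite dimensional construction can be sketched end to end as follows (a randomly generated chain with illustrative sizes; not the paper's code):

```python
import numpy as np

# Finite-dimensional reduction of Section 4: U = P D P^T, POD basis
# w_j = P^T c^j / sqrt(lam_j), projected operator K_M = W_M W_M^T D K.
rng = np.random.default_rng(3)
N, M = 12, 3
P = rng.random((N, N))
P /= P.sum(axis=0)                        # column-stochastic, mu = P mu

vals, vecs = np.linalg.eig(P)             # invariant distribution: eigenvalue 1
mu = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
mu /= mu.sum()
D = np.diag(mu)

U = P @ D @ P.T                           # U_ij = <u_j, u_i>_mu
lam, C = np.linalg.eigh(U)
lam, C = lam[::-1], C[:, ::-1]            # descending eigenvalues

W_M = P.T @ (C[:, :M] / np.sqrt(lam[:M]))  # columns w_j = P^T c^j / sqrt(lam_j)
K = P.T                                    # matrix representation of K
K_M = W_M @ W_M.T @ D @ K                  # projected (reduced) operator

# weighted POD error Tr[(K - K_M)^T D (K - K_M)] relative to Tr[K D K^T]
err = np.trace((K - K_M).T @ D @ (K - K_M))
total = np.trace(K @ D @ K.T)
print(err / total)
```

The basis satisfies $W_M^\top D W_M = I$, i.e. it is orthonormal in the $\mu$-weighted inner product, so $K_M$ is the orthogonal projection of $K$ onto $\mathrm{span}\{w_1, \ldots, w_M\}$ in that geometry.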

Example 6. Consider a Markov chain with $N = 12$ that is constructed as follows. Let $P_1 \in \mathbb{R}^{12 \times 12}$ be a block diagonal matrix having three blocks with positive entries, let $P_2 \in \mathbb{R}^{12 \times 12}$ be a matrix with positive entries, let $\varepsilon$ be a small parameter, and define $\hat P = (1 - \varepsilon) P_1 + \varepsilon P_2$. We select the entries of $P_1$ and $P_2$ randomly from a uniform distribution on $[0, 1]$. Define the normalized matrix with entries $p_{ij} = \hat p_{ij} / \sum_{i=1}^N \hat p_{ij}$. Then $P$ is a transition matrix of a Markov chain. We assume that the $P$ so chosen is irreducible and aperiodic. For a particular choice of $P$, we obtain the matrix $U$ with eigenvalues
$$(\lambda_1, \ldots, \lambda_{12}) = (0.0956, \; 0.0456, \; 0.0366, \; 0.0084, \; \cdots, \; 0.00004).$$
As expected, $M = 3$ and the reduced system is $K_3 = W_3 W_3^\top D K$. The POD error between the two systems is
$$\sum_{j=1}^N \sum_{k=1}^N \left( K(k, j) - K_3(k, j) \right)^2 \mu(k) = \mathrm{Tr}\left[ (K - K_3)^\top D (K - K_3) \right] = 0.0331.$$
Note that $\mathrm{Tr}[K D K^\top] = 0.2118$, so the relative error is 15.6%. In order to compare $K$ and $K_3$, it is informative to look at the left eigenvector corresponding to the maximal eigenvalue of the two matrices. Obviously, for $K$ this is the invariant distribution for the Markov chain. An easy computation gives
$$e_1^K = \mu = (0.2483,\, 0.3196,\, 0.2388,\, 0.3969,\, 0.4747,\, 0.1864,\, 0.3113,\, 0.2051,\, 0.1889,\, 0.2941,\, 0.2542,\, 0.1860)^\top$$
and
$$e_1^{K_3} = (0.2508,\, 0.3168,\, 0.2377,\, 0.3953,\, 0.4817,\, 0.1814,\, 0.3138,\, 0.2062,\, 0.1816,\, 0.2984,\, 0.2567,\, 0.1772)^\top.$$

We remark that the leading eigenvector of $K_3$ is indeed a very good approximation to the invariant distribution, with a relative error of 1.6%. We also note that the relative error between $K^{10}$ and $K_3^{10}$ is 0.4%. In particular, the error between $K^n$ and $K_3^n$ seems to be decreasing as a function of $n$.

Acknowledgement

This research is supported by AFOSR under grant FA9550-05-1-0441.

References

[1] M. Dellnitz, O. Junge, On the approximation of complicated dynamical behavior, SIAM Journal on Numerical Analysis 36 (1999) 491–515.
[2] M.I. Freidlin, A.D. Wentzell, Random Perturbations of Dynamical Systems, Springer, New York, 1998.
[3] G. Froyland, M. Dellnitz, Detecting and locating near-optimal, almost-invariant sets and cycles, 2001. Preprint.
[4] R. Ghanem, J. Red-Horse, Propagation of probabilistic uncertainty in complex physical systems using a stochastic finite element approach, Physica D 133 (1999) 137–144.
[5] W. Huisinga, B. Schmidt, Metastability and dominant eigenvalues of transfer operators, 2004. Preprint.
[6] O. Junge, J.E. Marsden, I. Mezic, Uncertainty in the dynamics of conservative maps, in: Proceedings of the 2004 IEEE Conference on Decision and Control, 2004.
[7] Y. Kifer, Random Perturbations of Dynamical Systems, Birkhäuser, New York, 1988.
[8] S. Lafon, A.B. Lee, Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning and data set parametrization, 2005. Preprint.
[9] S.V. Lototsky, R. Mikulevicius, B.L. Rozovskii, Nonlinear filtering revisited: A spectral approach, SIAM Journal on Control and Optimization 35 (2) (1997) 435–461.
[10] N. Metropolis, S.M. Ulam, The Monte Carlo method, Journal of the American Statistical Association 44 (1949) 333–341.
[11] B. Nadler, S. Lafon, R.R. Coifman, I.G. Kevrekidis, Diffusion maps, spectral clustering and eigenfunctions of Fokker–Planck operators, 2005. Preprint.
[12] P. Glasserman, P. Acworth, M. Broadie, A comparison of some Monte Carlo and quasi-Monte Carlo techniques for option pricing, in: Monte Carlo and Quasi-Monte Carlo Methods: Proceedings of a Conference at the University of Salzburg, 1996, pp. 1–18.
[13] D. Revuz, Markov Chains, North-Holland, Amsterdam, 1975.
[14] C.W. Rowley, T. Colonius, R.M. Murray, Model reduction for compressible flows using POD and Galerkin projection, Physica D 189 (2004) 115–129.


[15] T. Runolfsson, Nonparametric modeling of uncertain complex systems, 2005. Preprint.
[16] Ch. Schütte, W. Huisinga, P. Deuflhard, Transfer operator approach to conformational dynamics in biomolecular systems, in: Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, pp. 191–223.
[17] S. Volkwein, Bounds for condition numbers of mass and stiffness matrices arising in POD Galerkin approximations, 2003. Preprint.