A randomized balanced proper orthogonal decomposition technique


Journal Pre-proof

A randomized balanced proper orthogonal decomposition technique

Dan Yu, Suman Chakravorty

PII: S0377-0427(19)30545-X
DOI: https://doi.org/10.1016/j.cam.2019.112540
Reference: CAM 112540

To appear in: Journal of Computational and Applied Mathematics

Received date: 17 January 2019
Revised date: 7 October 2019

Please cite this article as: D. Yu and S. Chakravorty, A randomized balanced proper orthogonal decomposition technique, Journal of Computational and Applied Mathematics (2019), doi: https://doi.org/10.1016/j.cam.2019.112540.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Elsevier B.V. All rights reserved.

A Randomized Balanced Proper Orthogonal Decomposition Technique

Dan Yu^a, Suman Chakravorty^b,*

^a College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
^b Department of Aerospace Engineering, Texas A&M University, College Station, TX, 77840, USA

Abstract


In this paper, we consider the model reduction problem for large-scale linear systems, such as systems obtained through the discretization of partial differential equations. We propose a randomized proper orthogonal decomposition (RPOD*) technique that obtains the reduced order model by perturbing the primal and adjoint systems with Gaussian white noise. We show that the computations required by the RPOD* algorithm are orders of magnitude lower than those of other state of the art algorithms, while its performance is much better. We also relate the RPOD* algorithm to Krylov subspace methods and show that it constitutes a randomized approach to computational linear algebra problems that utilize Krylov subspace methods.

Keywords: Model Reduction, Proper Orthogonal Decomposition (POD), Krylov subspace methods, Randomized Algorithm

1. Introduction

In this paper, we are interested in the model reduction of large-scale linear systems such as those governed by partial differential equations (PDEs), whose dimension is large due to the discretization of the PDE. For instance, consider the atmospheric dispersion of an air pollutant [1]. The emission of contaminants at ground level is shown in Fig. 1, with four point sources labeled S1 to S4. This is a three-dimensional problem, and after discretizing the PDE, the dimension of the system is of the order of 10^6. Therefore, we are interested in constructing a reduced order model (ROM) that can capture the input/output characteristics

This work was supported by NSFC project No. 61801213.
* Corresponding author. Email addresses: [email protected] (Dan Yu), [email protected] (Suman Chakravorty).

Preprint submitted to Journal of Computational and Applied Mathematics October 8, 2019


[Figure 1: Air pollutant problem. Contour plot of the ground-level contaminant concentration over x (m) and y (m), with the four point sources S1 to S4 marked.]


of the large model, such that the ROM can then be used by a filtering algorithm, such as the Kalman filter, for updating the states of the field. Also, actuators and sensors could be placed anywhere in this field, which leads to a model reduction problem for a large-scale system with a large number of inputs/outputs.

In the past few decades, two classes of model reduction techniques, singular value decomposition (SVD)-based methods and Krylov subspace methods, have been studied in detail [2]. Exhaustive surveys on model reduction methods can be found in, e.g., [3, 4]. For large-scale systems, Krylov subspace methods [2, 5], which iteratively compute subspace bases using the Lanczos [6] and Arnoldi [7] procedures, have been widely used. Compared with SVD-based methods, the computational and storage costs of Krylov subspace methods are reduced, while, in general, important properties of the original system such as stability and passivity are not preserved [3, 5]. Among SVD-based methods, balanced proper orthogonal decomposition (BPOD) [8, 9] was proposed as an approximate version of balanced truncation [10]. Balancing transformations are constructed using impulse responses of both the primal and adjoint systems, and hence the most controllable and observable modes can be kept in the ROM. In 1978, Kung [11] presented a new model reduction algorithm in conjunction with the SVD technique, and the eigensystem realization algorithm (ERA) [12] was developed based on this technique. BPOD is equivalent to the ERA procedure [13], but forms the Hankel matrix using the primal and adjoint system simulations as opposed to the input-output data as in ERA. The primary drawback of BPOD and ERA is that for a large-scale system with a large number of inputs/outputs, such as that obtained by discretizing a PDE, the computational burden incurred is very high.
There are two main parts to the computation: the first is to collect datasets from computationally expensive primal and adjoint simulations in order to generate the Hankel matrix. The second part is to solve the SVD problem for the resulting Hankel matrix.


To reduce the computational cost, various methods have been proposed in the literature. For example, [9] proposed a BPOD output projection method to address the problem when the number of outputs is large, albeit the method still faces a very high computational burden when both the number of inputs and the number of outputs are large. In [14], a method is proposed to reduce the number of snapshots; however, the primary problem regarding a large number of inputs/outputs remains. More recently, [15] proposed a tangential interpolation-based ERA (TERA) method, which projects the impulse responses of the original system onto suitably chosen directions and constructs a projected Hankel matrix. In [16], a CUR-ERA method constructs a sub-Hankel matrix by searching over columns and rows of the full Hankel matrix. Similarities and differences between the above mentioned algorithms and the RPOD* algorithm are discussed in detail in section 5.2.

Recently, randomized algorithms, such as random sampling algorithms and random projection algorithms, have received a lot of attention in computer science [17, 18] and in the control community [19]. For a large-scale matrix H, random sampling algorithms construct a rank-k approximation matrix Ĥ by choosing and rescaling some columns of H according to certain sampling probabilities [20, 17], so that the errors are bounded with high probability. In random projection methods [18], a Gaussian test matrix Ω is generated, and an orthonormal matrix Q is constructed by performing a QR factorization of the matrix product HΩ. The large matrix H is projected onto Q such that the error is bounded with high probability. Some recent papers have also applied these randomized SVD techniques to model reduction problems for dynamical systems [21, 22, 23]. We had introduced an RPOD algorithm in [24] that randomly chooses a subset of the input/output trajectories. A sub-Hankel matrix is constructed using these sampled trajectories, which is then used to form a ROM in the usual BPOD fashion. The major problem associated with this algorithm is that different choices of sampling algorithms lead to different lower bounds, and the choice of a good sampling algorithm other than the uniform distribution is unclear.

In this paper, we propose the RPOD* algorithm, in which we perturb the primal and adjoint systems with Gaussian white noise, and we show that the Markov parameters of the ROM constructed using RPOD* are close to the Markov parameters of the full order system. The main contribution of the RPOD* method is that only one primal and one adjoint simulation are needed, and the construction and storage of the full Hankel matrix are avoided; hence, it is computationally far less expensive than the BPOD/ERA algorithms for a large-scale system with a large number of inputs/outputs. Fundamentally, RPOD* constructs a randomized version of the standard Krylov subspace, which is orders of magnitude less time- and memory-consuming than Krylov subspace methods, while finding the orthonormal basis for the randomized subspace reduces to a small symmetric eigenvalue problem that can be accurately solved using standard methods. This paper is a significantly extended version of the conference paper that first proposed the RPOD* algorithm [25].

The rest of the paper is organized as follows. In section 2, the problem is


formulated, and in section 3, we introduce the partial realization problem and review the BPOD algorithm. The RPOD* algorithm is proposed in section 4, along with formal proofs and results; we also discuss some implementation issues in that section. In section 5, we compare the RPOD* algorithm with related algorithms. In section 6, we provide computational results comparing RPOD* with other state of the art algorithms on three numerical examples. The notation used in this paper is summarized as follows.

• (A, B, C): system matrices.

• x(k), ω(k), z(k): states, inputs and outputs of the primal system.

• y(k), ω_a(k), z_a(k): states, inputs and outputs of the adjoint system.

• (Λ, V, U): eigenvalues and right/left eigenvectors of the system matrix A, with (λ_i, v_i, u_i) as the i-th eigenvalue-eigenvector pair.

• R(A, B), O(A, C): reachability and observability matrices.

• (Φ, Ψ): biorthogonal bases.

• (X, Y, H): primal and adjoint snapshot ensembles and the resulting Hankel matrix.

• (Σ, M, N): singular values and left/right singular vectors.

• Ω, Ω_a: Gaussian random matrices.

• (X̄, Ȳ, H̄, Σ̄, M̄, N̄, Ψ̄, Φ̄, Ā, B̄, C̄): RPOD* randomized snapshot ensembles, Hankel matrix, singular values/vectors, projection bases and ROM.

• (X̄_R, Ȳ_R, H̄_R, Σ̄_R, M̄_R, N̄_R, Ψ̄_R, Φ̄_R, Ā_R, B̄_R, C̄_R): ideal R-mode randomized snapshot ensembles, Hankel matrix, singular values/vectors, projection bases and ROM.

• n_x, n_ω, n_z: number of states, inputs and outputs of the primal system.

• Δt: sampling time.

• S: a finite number such that ||A^S|| ≈ 0, where ||·|| is the spectral norm of the matrix.

• R: number of RPOD* snapshots / rank of the snapshot ensembles.

2. Problem Formulation

Consider a linear discrete-time input-output system:

x(k + 1) = Ax(k) + Bω(k),  z(k) = Cx(k),   (1)


where x(k) ∈ ℝ^{n_x}, ω(k) ∈ ℝ^{n_ω} and z(k) ∈ ℝ^{n_z} are the states, inputs and outputs of the primal system. The corresponding adjoint system is:

y(k + 1) = A′y(k) + C′ω_a(k),  z_a(k) = B′y(k),   (2)

where y(k) ∈ ℝ^{n_x}, ω_a(k) ∈ ℝ^{n_z} and z_a(k) ∈ ℝ^{n_ω} are the states, inputs and outputs of the adjoint system, and (·)′ denotes the transpose. We assume that A is stable and non-defective (Assumption 1), so that it admits the eigendecomposition:

A = VΛU′ = [v_1 ⋯ v_{n_x}] diag(λ_1, ⋯, λ_{n_x}) [u_1 ⋯ u_{n_x}]′,   (3)

where λ_i is the i-th eigenvalue and (v_i, u_i) are the corresponding right and left eigenvectors, i = 1, 2, ⋯, n_x. In this paper, we consider the model reduction problem for large-scale systems with a large number of inputs/outputs, i.e., n_x, n_ω and n_z are all large. For system (1), the input-output relation with zero initial condition is given by:

z(k) = Σ_{i=1}^{k} CA^{i−1}Bω(k − i),   (4)

which is fully determined by the Markov parameters h_i = CA^{i−1}B. The goal is to construct a ROM (Ā, B̄, C̄) such that, given the same input ω(k), the outputs z̄(k) of the ROM are close to the outputs z(k) of the full order system, i.e., z(k) − z̄(k) is small for k = 1, 2, ⋯. This problem can be solved by matching a finite sequence of Markov parameters, and is known as the partial realization problem.

3. Partial Realization Problem


In this section, we first introduce the partial realization problem, and then briefly review the BPOD algorithm, an approach used to solve it.


3.1. Preliminaries

Denote the finite reachability matrix R(A, B) and the finite observability matrix O(A, C), respectively, as


R(A, B) = [B, AB, ⋯, A^{S−1}B] ∈ ℝ^{n_x × Sn_ω},
O(A, C) = [C′, A′C′, ⋯, (A′)^{S−1}C′]′ ∈ ℝ^{Sn_z × n_x},   (5)


where S is a finite number. The span of columns of the matrix R(A, B) is denoted as span col R(A, B), and is referred to as a Krylov subspace. Given a finite sequence of Markov parameters, the partial realization problem is defined as follows.


Definition 2 (Partial Realization Problem [2, 26]). Given a finite sequence of Markov parameters h_k, k = 1, ⋯, 2S − 1, the partial realization problem consists of finding a positive integer n_x and constant matrices (A, B, C) such that h_k = CA^{k−1}B, with A ∈ ℝ^{n_x × n_x}, B ∈ ℝ^{n_x × n_ω}, C ∈ ℝ^{n_z × n_x}.

If the biorthogonal bases (Φ, Ψ), with Φ′Ψ = I, satisfy

span col Ψ = span col R(A, B),  span col Φ = span col O(A, C)′,   (6)

then the ROM Â = Φ′AΨ, B̂ = Φ′B, Ĉ = CΨ is a partial realization of (A, B, C) and matches 2S − 1 Markov parameters.
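The Markov-parameter characterization underlying the definition above can be illustrated numerically. The Python sketch below, on a small hypothetical system (all sizes are assumptions for the example), checks that the output of (1) under zero initial condition equals the convolution of the input with the Markov parameters h_i = CA^{i−1}B.

```python
import numpy as np

# Hypothetical small system (A, B, C); sizes are illustrative only.
rng = np.random.default_rng(0)
nx, nw, nz = 6, 2, 3
A = 0.5 * rng.standard_normal((nx, nx)) / np.sqrt(nx)  # scaled so A is stable
B = rng.standard_normal((nx, nw))
C = rng.standard_normal((nz, nx))

# Markov parameters h_i = C A^{i-1} B, i = 1, 2, ...
def markov(A, B, C, n):
    h, Ak = [], np.eye(A.shape[0])
    for _ in range(n):
        h.append(C @ Ak @ B)
        Ak = Ak @ A
    return h

# Output at time K via the convolution sum z(k) = sum_{i=1}^k C A^{i-1} B w(k-i).
K = 10
w = rng.standard_normal((K, nw))          # inputs w(0), ..., w(K-1)
h = markov(A, B, C, K)
z_conv = sum(h[i - 1] @ w[K - i] for i in range(1, K + 1))

# Same output by simulating x(k+1) = A x(k) + B w(k) from x(0) = 0.
x = np.zeros(nx)
for k in range(K):
    x = A @ x + B @ w[k]
z_sim = C @ x

assert np.allclose(z_conv, z_sim)
```

The two computations agree because, with zero initial condition, the state is exactly the convolution of the input sequence with A^i B.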

The biorthogonal bases (Φ, Ψ) can be constructed using different approaches, such as Krylov subspace methods and balanced truncation methods. In the following, we briefly review the BPOD algorithm, which constructs R(A, B) and O(A, C) using primal and adjoint simulations and constructs the biorthogonal bases by solving an SVD problem.

3.2. BPOD Algorithm

Consider the stable linear system (1)-(2). Let B = [b_1, ⋯, b_{n_ω}], where b_j, j = 1, ⋯, n_ω, denotes the j-th column of B, and C = [c_1′, ⋯, c_{n_z}′]′, where c_j, j = 1, ⋯, n_z, denotes the j-th row of C. The BPOD algorithm [9] is summarized in Algorithm 1. For simplicity, we assume that in both the primal and adjoint simulations, we take S snapshots at discrete times 0, 1, ⋯, S − 1.


4. Randomized Proper Orthogonal Decomposition (RPOD∗ )

In this section, we first provide a heuristic derivation of the proposed RPOD* algorithm to explain the intuition behind the technique. We follow it with a detailed development of performance bounds for the RPOD* algorithm. We discuss implementation issues of the algorithm in section 4.3.


Algorithm 1 BPOD Algorithm

1. Collect impulse responses X of the primal system (1): use b_j, j = 1, ⋯, n_ω, as initial conditions for (1) with ω(k) = 0, k ≥ 0. Take S snapshots at discrete times 0, 1, ⋯, S − 1; then

X = [x_1(0), ⋯, x_{n_ω}(0), ⋯, x_1(S − 1), ⋯, x_{n_ω}(S − 1)],   (7)

where x_j(k) is the state snapshot at time instant k with b_j as the initial condition, k = 0, 1, ⋯, S − 1 and j = 1, 2, ⋯, n_ω.

2. Collect impulse responses Y of the adjoint system (2): use c_j′, j = 1, ⋯, n_z, as initial conditions for (2) with ω_a(k) = 0, k ≥ 0. Take S snapshots at times 0, 1, ⋯, S − 1, and

Y = [y_1(0), ⋯, y_{n_z}(0), ⋯, y_1(S − 1), ⋯, y_{n_z}(S − 1)],   (8)

where y_j(k) is the state snapshot of the adjoint system at time k with c_j′ as the initial condition, k = 0, 1, ⋯, S − 1 and j = 1, 2, ⋯, n_z.

3. Construct the Hankel matrix H and solve the SVD problem:

H = Y′X = [M M_1] diag(Σ, Σ_1) [N N_1]′,   (9)

where Σ contains the first R non-zero singular values and (M, N) are the corresponding left and right singular vectors.

4. Construct the BPOD bases:

Ψ = XNΣ^{−1/2},  Φ = YMΣ^{−1/2}.   (10)

5. The ROM is:

A_r = Φ′AΨ,  B_r = Φ′B,  C_r = CΨ.   (11)
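As a concrete illustration, the following Python sketch runs Algorithm 1 on a small hypothetical system (the sizes, the diagonal A, and the tolerances are assumptions for the demo, not values from the paper). When all non-zero singular values are retained, the resulting ROM reproduces the leading Markov parameters of the full system.

```python
import numpy as np

# Illustrative BPOD sketch on a small stable system (all sizes are assumptions).
rng = np.random.default_rng(1)
nx, nw, nz, S = 8, 2, 2, 30
A = 0.6 * np.diag(rng.uniform(0.1, 1.0, nx))          # stable diagonal A for simplicity
B = rng.standard_normal((nx, nw))
C = rng.standard_normal((nz, nx))

# Steps 1-2: impulse-response snapshot ensembles X (primal) and Y (adjoint).
X = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(S)])
Y = np.hstack([np.linalg.matrix_power(A.T, k) @ C.T for k in range(S)])

# Step 3: Hankel matrix H = Y'X and its SVD, truncated to the numerical rank R.
H = Y.T @ X
M, s, Nt = np.linalg.svd(H)
R = int(np.sum(s > 1e-10 * s[0]))
M, s, N = M[:, :R], s[:R], Nt[:R, :].T

# Steps 4-5: balancing bases and ROM (A_r, B_r, C_r).
Sinv = np.diag(s ** -0.5)
Psi, Phi = X @ N @ Sinv, Y @ M @ Sinv
Ar, Br, Cr = Phi.T @ A @ Psi, Phi.T @ B, C @ Psi

# The ROM matches the leading Markov parameters C A^{k-1} B.
for k in range(1, 6):
    full = C @ np.linalg.matrix_power(A, k - 1) @ B
    rom = Cr @ np.linalg.matrix_power(Ar, k - 1) @ Br
    assert np.allclose(full, rom, atol=1e-8)
```

Note that Φ′Ψ = Σ^{−1/2}M′HNΣ^{−1/2} = I by construction, so (Φ, Ψ) are biorthogonal bases in the sense of section 3.1.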


4.1. RPOD*: The Heuristic Idea

The primary issue with the BPOD algorithm arises when the number of inputs/outputs is high. The generation and storage of the reachable and observable subspaces R(A, B) and O(A′, C′) is highly time- and memory-consuming. Moreover, the SVD of the resulting Hankel matrix is also highly computationally intensive due to the large size of the problem. In the following, we propose a randomized technique that addresses each of these issues in a highly efficient fashion.

Under Assumption 1, there exists a finite number S such that ||A^S|| ≈ 0, where ||·|| denotes the spectral norm of the matrix. In general, we can only find S such that ||A^S|| < e for an arbitrarily small e; however, for simplicity, we will assume that ||A^S|| = 0, the error in the response being O(e) under this approximation.

Consider a primal snapshot ensemble X̄ constructed by perturbing the primal system (1) with normally distributed white noise ω(k) and collecting R snapshots at times t_1Δt, t_2Δt, ⋯, t_RΔt, i.e., X̄ = [x(t_1), x(t_2), ⋯, x(t_R)], where t_1, ⋯, t_R are non-consecutive integers and t_1 ≥ S. Moreover, let us also assume that the snapshots are spaced such that t_k − t_{k−1} > S. The state snapshot x(k) at time k is:

x(k) = Σ_{i=0}^{k−1} A^i Bω(k − i),  k = 1, 2, ⋯,   (12)

where ω(k) is white noise. The snapshot ensemble X̄ can be written as follows:

X̄ = [x(t_1) x(t_2) ⋯ x(t_R)] = [B AB ⋯ A^{S−1}B] [ω^{(1)}(1) ⋯ ω^{(R)}(1); ⋮ ; ω^{(1)}(S) ⋯ ω^{(R)}(S)] = R(A, B) Ω,   (13)

where R(A, B) is the controllability (reachability) matrix. In the above expression, each ω^{(i)}(j) is an n_ω × 1 vector with i.i.d. standard Gaussian entries (mean 0 and variance 1), and each such vector is independent of the others. Denote Ω = [ω̄^{(1)} ω̄^{(2)} ⋯ ω̄^{(R)}], where ω̄^{(i)} is the i-th column of Ω. Thus, Ω is an Sn_ω × R matrix with standard Gaussian i.i.d. entries. This allows us to explain the basic intuition behind the RPOD* technique. Let us assume that R(A, B) is really only a rank-R matrix with R ≪ Sn_ω. Then, the randomized snapshot ensemble X̄ spans the same subspace as R(A, B). To see this, note that Ω is a "tall" random Gaussian matrix, which implies that Ω′Ω ≈ Sn_ω I_R, where I_R is the R × R identity matrix. Therefore, Ω has full column rank and, moreover, is almost perfectly conditioned. Therefore, the rank
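The near-orthogonality claim for tall Gaussian matrices is easy to check empirically. The sketch below (sizes are illustrative assumptions) verifies that Ω′Ω ≈ Sn_ω I_R and that Ω is well conditioned.

```python
import numpy as np

# For S*nw >> R, a standard Gaussian Omega satisfies Omega'Omega ~ S*nw * I_R,
# so Omega has full column rank and condition number close to 1.
rng = np.random.default_rng(2)
S_nw, R = 5000, 20                    # illustrative sizes, S*nw >> R
Omega = rng.standard_normal((S_nw, R))

G = Omega.T @ Omega / S_nw            # should be close to the R x R identity
off = np.abs(G - np.eye(R)).max()     # worst-case entrywise deviation
cond = np.linalg.cond(Omega)          # close to 1: almost perfectly conditioned

assert off < 0.1 and cond < 1.3
```

The deviations shrink like 1/sqrt(Sn_ω), which is the content of the random matrix results invoked in section 4.2.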


of X̄ is also R, and hence X̄ spans the same subspace as R(A, B). An analogous argument is valid for the randomized adjoint ensemble Ȳ = O(A′, C′)Ω_a, where Ω_a is made of n_z × 1 random Gaussian vectors ω_a^{(i)}(j), making Ω_a an Sn_z × R random Gaussian matrix with standard entries. Thus, instead of using the controllability and observability matrices as in the BPOD algorithm, one can use the randomized primal and adjoint snapshot ensembles X̄, Ȳ. The size of these ensembles is n_x × R, and the resulting randomized Hankel matrix has dimension R × R. Therefore, owing to the fact that R ≪ Sn_ω, Sn_z, the snapshot generation effort and the storage requirements for the Hankel matrix are drastically reduced, while the resulting SVD problem is also a small R × R problem. The heuristic arguments above can be rigorously justified using elements of random matrix theory and a perturbation analysis of the underlying SVD problem; we do this in the next section. The RPOD* algorithm is summarized in Algorithm 2.

Algorithm 2 RPOD* Algorithm

1. Perturb the primal system (1) with white noise ω(k); collect R snapshots at times t_1, t_2, ⋯, t_R, where t_1 ≥ S and t_k − t_{k−1} ≥ S. Denote the snapshot ensemble X̄ as:

X̄ = [x(t_1) x(t_2) ⋯ x(t_R)].   (14)

2. Perturb the adjoint system (2) with white noise ω_a(k); collect R snapshots at times t_1, t_2, ⋯, t_R. Denote the adjoint snapshot ensemble Ȳ as:

Ȳ = [y(t_1) y(t_2) ⋯ y(t_R)].   (15)

3. Solve the SVD problem:

H̄ = Ȳ′X̄ = [M̄ M̄_o] diag(Σ̄, Σ̄_o) [N̄ N̄_o]′,   (16)

where Σ̄ contains the first R non-zero singular values σ̄_1 ≥ σ̄_2 ≥ ⋯ ≥ σ̄_R > 0, and (M̄, N̄) are the corresponding left and right singular vectors.

4. Construct the bases:

Ψ̄ = X̄N̄Σ̄^{−1/2},  Φ̄ = ȲM̄Σ̄^{−1/2}.   (17)

5. The ROM is:

Ā = Φ̄′AΨ̄,  B̄ = Φ̄′B,  C̄ = CΨ̄.   (18)
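The following Python sketch runs Algorithm 2 end to end. For a clean check, a hypothetical system is constructed so that exactly R modes are controllable and observable (the "ideal R-mode" situation analyzed in section 4.2); in that case the R-th order ROM reproduces the full-order Markov parameters. All sizes, times and tolerances are assumptions for the demo.

```python
import numpy as np

# Illustrative RPOD* sketch (Algorithm 2) on a hypothetical system in which only
# the first R of nx modes are controllable and observable.
rng = np.random.default_rng(3)
nx, nw, nz, R = 40, 3, 3, 12
lam = np.concatenate([rng.uniform(0.3, 0.9, R), rng.uniform(0.05, 0.7, nx - R)])
A = np.diag(lam)
B = rng.standard_normal((nx, nw)); B[R:, :] = 0.0   # modes beyond R uncontrollable
C = rng.standard_normal((nz, nx)); C[:, R:] = 0.0   # ... and unobservable

S = 150                                   # ||A^S|| <= 0.9^150, negligibly small
times = [S * (k + 1) for k in range(R)]   # t_1 >= S, t_k - t_{k-1} >= S

def noisy_snapshots(F, G, times):
    # Drive x(k+1) = F x(k) + G w(k) with Gaussian white noise; keep x(t), t in times.
    x, keep, snaps = np.zeros(F.shape[0]), set(times), []
    for k in range(times[-1] + 1):
        if k in keep:
            snaps.append(x.copy())
        x = F @ x + G @ rng.standard_normal(G.shape[1])
    return np.column_stack(snaps)

Xbar = noisy_snapshots(A, B, times)       # primal ensemble, nx x R
Ybar = noisy_snapshots(A.T, C.T, times)   # adjoint ensemble, nx x R

# Steps 3-5: randomized R x R Hankel matrix, SVD, bases and ROM.
Hbar = Ybar.T @ Xbar
M, s, Nt = np.linalg.svd(Hbar)
Sinv = np.diag(s ** -0.5)
Psi, Phi = Xbar @ Nt.T @ Sinv, Ybar @ M @ Sinv
Abar, Bbar, Cbar = Phi.T @ A @ Psi, Phi.T @ B, C @ Psi

# The R-th order ROM reproduces the Markov parameters of the full system.
for k in range(6):
    full = C @ np.linalg.matrix_power(A, k) @ B
    rom = Cbar @ np.linalg.matrix_power(Abar, k) @ Bbar
    assert np.allclose(full, rom, atol=1e-6)
```

Only one primal and one adjoint simulation are run, and the SVD is of a small R × R matrix, which is the computational point of the method. When the truncated modes are not exactly zero, the match degrades gracefully, as quantified by the bounds in section 4.2.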


4.2. Performance Bounds for the RPOD* Algorithm


We assume that A is non-defective (Assumption 1). In the following, ||·|| denotes the spectral norm if the argument is a matrix, or the Euclidean 2-norm if the argument is a vector. Let Cv_i = O_b^{(i)} and u_i′B = C_o^{(i)}, and let ||O_b^{(i)}|| = o_b^{(i)}, ||C_o^{(i)}|| = c_o^{(i)}. We arrange the eigenmodes in an order such that |λ_1|o_b^{(1)}c_o^{(1)} > |λ_2|o_b^{(2)}c_o^{(2)} > ⋯ > |λ_{n_x}|o_b^{(n_x)}c_o^{(n_x)}. This ordering defines the dominant modes in the input-output response of the system. Further, let

X = [B, AB, A²B, ⋯, A^{S−1}B] = [Σ_{i=1}^{n_x} v_iu_i′B, Σ_{i=1}^{n_x} λ_iv_iu_i′B, ⋯, Σ_{i=1}^{n_x} λ_i^{S−1}v_iu_i′B],   (19)

X_R = [Σ_{i=1}^{R} v_iu_i′B, Σ_{i=1}^{R} λ_iv_iu_i′B, ⋯, Σ_{i=1}^{R} λ_i^{S−1}v_iu_i′B],   (20)

δX = X − X_R,   (21)

and denote

X̄ = XΩ,  X̄_R = X_RΩ,  δX̄ = X̄ − X̄_R = δXΩ,   (22)

where Ω is an Sn_ω × R Gaussian random matrix with i.i.d. standard normal entries.

Remark 1. Note that X = R(A, B) is the reachability matrix, and X̄ is the primal snapshot ensemble constructed using RPOD*. X_R is a surrogate matrix constructed using the exact first R eigenvalue-eigenvector pairs, and X̄_R is the ideal R-mode randomized snapshot ensemble.

al

Similarly, we define:

urn

Y = [C 0 , A0 C 0 , · · · A0S−1 C 0 ] = [

nx X

ui vi0 C 0 ,

i=1

nx X i=1

λi ui vi0 C 0 , · · ·

R R R X X X YR = [ ui vi0 C 0 , λi ui vi0 C 0 , · · · λS−1 ui vi0 C 0 ], i i=1

i=1

nx X

λS−1 ui vi0 C 0 ], i

i=1

(23) (24)

i=1

δY = Y − YR , Y¯R = YR Ωa , Y¯ = Y Ωa , δ Y¯ = Y¯ − Y¯R = δY Ωa ,

(25) (26)

Jo

where Ωa is an Snz × R Gaussian random matrix with i.i.d standard normal entries. ¯ Φ ¯ be the result of applying RPOD* to ensembles X, ¯ Y¯ , and H ¯ = Y¯ 0 X ¯ Let Ψ, 0 ∗ ¯ ¯ ¯ ¯ as the resulting Hankel matrix with SVD H = M ΣN . According to RPOD , ¯ =X ¯N ¯Σ ¯ −1/2 , and Φ ¯ = Y¯ M ¯Σ ¯ −1/2 , and it follows that Ψ ¯Φ ¯0 = X ¯H ¯ −1 Y¯ 0 . Ψ


Let Ψ̄_R, Φ̄_R be the result of applying RPOD* to the ensembles X̄_R, Ȳ_R: Ȳ_R′X̄_R = H̄_R = M̄_RΣ̄_RN̄_R′, Ψ̄_R = X̄_RN̄_RΣ̄_R^{−1/2}, Φ̄_R = Ȳ_RM̄_RΣ̄_R^{−1/2}, and Ψ̄_RΦ̄_R′ = X̄_RH̄_R^{−1}Ȳ_R′.

We have H̄ = H̄_R + E, where E = Ȳ_R′δX̄ + δȲ′X̄_R + δȲ′δX̄. Define

Δ = Ψ̄Φ̄′ − Ψ̄_RΦ̄_R′ = (X̄_R + δX̄)(H̄_R + E)^{−1}(Ȳ_R + δȲ)′ − X̄_RH̄_R^{−1}Ȳ_R′.   (27)

Assuming that ||H̄_R^{−1}E|| < 1, we may expand (H̄_R + E)^{−1} = H̄_R^{−1} − H̄_R^{−1}EH̄_R^{−1} + H̄_R^{−1}EH̄_R^{−1}EH̄_R^{−1} + HOT. Then, Δ can be expanded as Δ = Σ_{l=1}^{∞} (−1)^l Δ_l, where Δ_l = Δ_{l1} + Δ_{l2} + Δ_{l3} with

Δ_{l1} = (X̄_RH̄_R^{−1}Ȳ_R′ − I) δX̄ (H̄_R^{−1}E)^{l−1} H̄_R^{−1}Ȳ_R′,
Δ_{l2} = X̄_R (H̄_R^{−1}E)^{l−1} H̄_R^{−1} δȲ′ (X̄_RH̄_R^{−1}Ȳ_R′ − I),
Δ_{l3} = X̄_RH̄_R^{−1} δȲ′ δX̄ (H̄_R^{−1}E)^{l−1} H̄_R^{−1}Ȳ_R′ − δX̄ (H̄_R^{−1}E)^{l−1} H̄_R^{−1} δȲ′.   (28)

Consider now the difference between the impulse response of the idealized ROM (Ā_R, B̄_R, C̄_R), resulting from applying RPOD* to the R-mode ensembles X̄_R, Ȳ_R, and that of the actual ROM (Ā, B̄, C̄), resulting from the true ensembles X̄, Ȳ:

C̄Ā^kB̄ − C̄_RĀ_R^kB̄_R = C(Ψ̄_RΦ̄_R′ + Δ)(A(Ψ̄_RΦ̄_R′ + Δ))^k B − C(Ψ̄_RΦ̄_R′)(A(Ψ̄_RΦ̄_R′))^k B ≡ F_k(Δ) + S_k(Δ),

F_k(Δ) = Σ_{l=0}^{k} C((Ψ̄_RΦ̄_R′)A)^l Δ (A(Ψ̄_RΦ̄_R′))^{k−l} B,   (29)

where F_k(Δ) denotes the first order effect of the perturbation Δ and S_k(Δ) denotes the second and higher order effects of the perturbation, for k = 0, 1, 2, ⋯. Given the above developments, the outline of the proof is as follows:

• First, we bound the difference between the Markov parameters C̄Ā^kB̄ of the true RPOD* ROM and the Markov parameters C̄_RĀ_R^kB̄_R of the idealized R-mode RPOD* ROM in terms of Δ.

• Then, we use random matrix theory results to establish probabilistic large deviation bounds on ||C̄Ā^kB̄ − C̄_RĀ_R^kB̄_R||.

• Next, we bound the difference between the true system Markov parameters and the idealized R-mode Markov parameters, ||CA^kB − C̄_RĀ_R^kB̄_R||.

• Finally, we use the above three steps to develop a probabilistic large deviation bound on the difference between the true system response z(t) and the RPOD* response z̄(t).

We have the following results regarding the first order effect F_k(Δ) of the perturbation. The term Ψ̄_RΦ̄_R′ plays an important role in our subsequent analysis, and thus we first characterize it in the following result.


Lemma 1. Let Ψ̄_R, Φ̄_R be the result of applying RPOD* to the ensembles X̄_R and Ȳ_R. Then, Ψ̄_RΦ̄_R′ = Σ_{i=1}^{R} v_iu_i′.

Proof. Note that Ȳ_R = U_Rβ and X̄_R = V_Rα, where U_R = [u_1, ⋯, u_R] and V_R = [v_1, ⋯, v_R] contain the first R left and right eigenvectors, respectively, and α, β are coefficient matrices. Thus, it follows that:

Ȳ_R′X̄_R = β′U_R′V_Rα = β′α = M̄_RΣ̄_RN̄_R′.   (30)

We also have that Ā_R = Φ̄_R′AΨ̄_R = Σ̄_R^{−1/2}M̄_R′Ȳ_R′AX̄_RN̄_RΣ̄_R^{−1/2}, but Ȳ_R′AX̄_R = β′U_R′AV_Rα = β′Λ_Rα, where Λ_R = diag(λ_1, ⋯, λ_R). Therefore:

Ā_R = Φ̄_R′AΨ̄_R = (Σ̄_R^{−1/2}M̄_R′β′) Λ_R (αN̄_RΣ̄_R^{−1/2}) =: P Λ_R P^{−1}.   (31)

The two outer factors in (31) are inverses of each other, since using (30): Σ̄_R^{−1/2}M̄_R′β′αN̄_RΣ̄_R^{−1/2} = Σ̄_R^{−1/2}M̄_R′M̄_RΣ̄_RN̄_R′N̄_RΣ̄_R^{−1/2} = I_R.

Then, it follows that Ψ̄_RP = V_R(αN̄_RΣ̄_R^{−1/2})(Σ̄_R^{−1/2}M̄_R′β′) = V_RP^{−1}P = V_R. In an analogous fashion, P^{−1}Φ̄_R′ = U_R′. Therefore, Ψ̄_RΦ̄_R′ = (Ψ̄_RP)(P^{−1}Φ̄_R′) = V_RU_R′ = Σ_{i=1}^{R} v_iu_i′.

al

¯ RΦ ¯ 0 )k B, Proof. Consider first Fk (∆l1 ). It can be seen that Fk (∆l1 ) = C∆l1 (AΨ R since any other term in Fk (∆l1 ) has the form: ¯ RΦ ¯ 0R A)n (Ψ ¯ RΦ ¯ 0R − I)δ X( ¯ H ¯ −1 E)l−1 H ¯ −1 Y¯R0 (AΨ ¯ RΦ ¯ 0 )k−n B, C(Ψ R R | {z R }

(32)

∆l1

urn

¯ RΦ ¯ 0 −I) = 0 if n 6= 0, because Ψ ¯ RΦ ¯ 0 = PR vi u0 (Lemma ¯ RΦ ¯ 0 A)n (Ψ and C(Ψ i R R R i=1 P 0 ¯ RΦ ¯ 0 A)n = R λn Cvi u0 , and I − Ψ ¯ RΦ ¯ 0 = Pnx 1). Therefore C(Ψ i R R i=1 i i=R+1 vi ui . Noting the orthonormality of the left and right eigenvectors, it follows that ¯ RΦ ¯ 0 A)n (Ψ ¯ RΦ ¯ 0 − I) = 0. C(Ψ R R ¯ RΦ ¯ 0 A)k Using an analogous argument, it can be seen that Fk (∆l2 ) = C(Ψ R ∆l2 B. Furthermore,

Jo

¯ RΦ ¯ 0 )k B|| = k ||(AΨ R

and it follows that:

R X i=1

λki vi u0i B k ≤ |{z} (i)

Co

R X i=1

|λi |k ||vi ||||Co(i) || ≤ R|λ1 |k c(1) o ,

kFk (∆l1 )k ≤ R|λ1 |k kC∆l1 kc(1) o . 12

(33)

(34)


Similarly, ||C(Ψ̄_RΦ̄_R′A)^k|| ≤ R|λ_1|^k o_b^{(1)}, and therefore ||F_k(Δ_{l2})|| ≤ R|λ_1|^k o_b^{(1)}||Δ_{l2}B||. Furthermore, by the same token,

||F_k(Δ_{l3})|| ≤ Σ_{j=0}^{k} ||C(Ψ̄_RΦ̄_R′A)^j|| ||Δ_{l3}|| ||(AΨ̄_RΦ̄_R′)^{k−j}B|| ≤ (k + 1)R²c_o^{(1)}o_b^{(1)}|λ_1|^k ||Δ_{l3}||.   (35)

Hence, the result follows.

Pr e-

Theorem 1 (Extremal Singular Values of a Random Gaussian Matrix [28]). Suppose that Ω is an Sn_ω × R matrix whose elements are i.i.d. standard Gaussian random variables. Then:

P(√(Sn_ω) − √R − t ≤ σ_min(Ω) ≤ σ_max(Ω) ≤ √(Sn_ω) + √R + t) ≥ 1 − 2e^{−t²/2}.

An important corollary follows for a tall matrix, i.e., when Sn_ω ≫ R, which is the case in this paper; the proof is by direct substitution and is omitted.

Corollary 1. Let t = (ε − √γ)√(Sn_ω), where R = γSn_ω. Then:

P((1 − ε)√(Sn_ω) ≤ σ_min(Ω) ≤ σ_max(Ω) ≤ (1 + ε)√(Sn_ω)) ≥ 1 − 2e^{−(ε−√γ)²Sn_ω/2}.

This allows us to form the following large deviation bounds on the spectral norms of the random matrices X̄_R, Ȳ_R and δX̄, δȲ. Noting that X̄_R = X_RΩ, it follows that ||X̄_R|| ≤ ||X_R|| ||Ω||. However, ||Ω|| ≤ (1 + ε)√(Sn_ω) with probability at least 1 − δ_1, where δ_1 = 2e^{−(ε−√γ)²Sn_ω/2}. Hence, with probability at least 1 − δ_1, ||X̄_R|| ≤ (1 + ε)√(Sn_ω)||X_R||. Similarly, with probability at least 1 − δ_1, ||δX̄|| ≤ (1 + ε)√(Sn_ω)||δX||. Let R = ζSn_z and δ_2 = 2e^{−(ε−√ζ)²Sn_z/2}. Then, using analogous arguments on Ȳ_R = Y_RΩ_a and δȲ = δYΩ_a, with probability at least 1 − δ_2, ||Ȳ_R|| ≤ (1 + ε)√(Sn_z)||Y_R|| and ||δȲ|| ≤ (1 + ε)√(Sn_z)||δY||. Next, we can derive a large deviation bound for ||H̄_R^{−1}||. Since H̄_R = Ω_a′Y_R′X_RΩ, we have σ_min(H̄_R) ≥ σ_min(Ω_a)σ_min(Y_R)σ_min(X_R)σ_min(Ω). This implies that, with probability at least 1 − (δ_1 + δ_2), σ_min(H̄_R) ≥ S√(n_ωn_z)(1 − ε)²σ_min(Y_R)σ_min(X_R), which implies that ||H̄_R^{−1}|| ≤ (S√(n_ωn_z)(1 − ε)²σ_min(Y_R)σ_min(X_R))^{−1} with probability at least 1 − (δ_1 + δ_2). This now allows us to prove the following result.


Lemma 3. Let max(||X_R||, ||Y_R||) = D, max(||δX||, ||δY||) = d, γ = R/(Sn_ω), ζ = R/(Sn_z), σ̄ = min(σ_min(X_R), σ_min(Y_R)), and ν = Dd/σ̄². Given any ε such that max(√γ, √ζ) < ε < 1, with probability at least 1 − (δ_1 + δ_2), the following bounds hold: ||CΔ_{l1}|| ≤ 2^{l−1}ρ^{2l}ν^{l+1}, ||Δ_{l2}B|| ≤ 2^{l−1}ρ^{2l}ν^{l+1}, ||Δ_{l3}|| ≤ 2^{l−1}Gρ^{2l+2}ν^{l+1}, where δ_1 = 2e^{−(ε−√γ)²Sn_ω/2}, δ_2 = 2e^{−(ε−√ζ)²Sn_z/2}, ρ = (1 + ε)/(1 − ε), and G = 1 + σ̄²/(D²ρ²).



Proof. We have:

||Δ_{l3}|| ≤ ||X̄_R|| ||Ȳ_R|| ||H̄_R^{−1}||² ||δȲ|| ||δX̄|| ||H̄_R^{−1}E||^{l−1} + ||δX̄|| ||H̄_R^{−1}|| ||H̄_R^{−1}E||^{l−1} ||δȲ||.

Using the definitions of ρ, ν and d, together with the bounds for ||X̄_R||, ||Ȳ_R||, ||δX̄||, ||δȲ|| and ||H̄_R^{−1}||, we have that

||Δ_{l3}|| ≤ (ρ⁴ν² + ρ²(d²/σ̄²)) ||H̄_R^{−1}E||^{l−1},   (36)

where, using the definition of E, we have:

||H̄_R^{−1}E|| ≤ ρ²(2ν + d²/σ̄²) = (2 + e)ρ²ν,   (37)

where e = d/D. Therefore, using the definition of G, we have:

||Δ_{l3}|| ≤ Gρ⁴ν² ||H̄_R^{−1}E||^{l−1} ≤ (2 + e)^{l−1}Gρ^{2l+2}ν^{l+1}.   (38)

Jo

urn

al

Corollary 2. The following bounds hold with probability at least 1 − (δ1 + δ2 ): (1) 2 (1) (1) 4 (1) 2 2 k ¯ − C¯R A¯k B ¯ ||C¯ A¯k B R R || ≤ R|λ1 | {(co + ob )ρ + RG(k + 1)co ob ρ }ν + o(ν ), −2 2 for all k = 0, 1, · · · , S and limν→0 ν o(ν ) = 0. P∞ ¯ C¯R A¯k B ¯ Proof. Recall that ||C¯ A¯k B− R R || ≤ ||Fk (∆1 )||+ l=2 ||Fk (∆l )||+||Sk (∆)||. From Lemma 2 and Lemma 3, we see that with probability 1−δ1 −δ2 , ||Fk (∆1 )|| ≤ (1) (1) (1) (1) R|λ1 |k {ρ2 (co + ob ) + RG(k + 1)co ob ρ4 }ν 2 , and is an o(ν) function in the −1 sense that ν o(ν) → 0 as ν → 0. Using the same results, with probability at l−1 2l+2 l+1 lest 1 − δ1 − δ2 , ||Fk (∆l )|| ≤ K(2 + e)P ρ ν for l = 2, 3..., where K < ∞ is P∞ ∞ 2 l −2 4 a uniform constant. P Therefore ν l=0 ((2 + e)ρ ν) . l=2 ||Fk (∆l )|| ≤ Kρ ν ∞ 2 l 2 −1 For ν small enough, < ∞, and hence , it P∞ l=0 ((2 + e)ρ ν) = (1 − (2 + e)ρ ν) P ∞ follows that ν −2 l=2 ||Fk (∆l )|| → 0 as ν → 0, and hence, l=2 ||Fk (∆l )|| is an 2 o(ν ) function with probability at least 1−δ1 −δ2 . Furthermore, due to a similar argument, the second andP higher order terms in ||Sk (∆)|| is an o(ν 3 ) function. ∞ Therefore, it follows that l=2 ||Fk (∆l )|| + ||Sk (∆)|| is an o(ν 2 ) function with probability at least 1 − δ1 − δ2 . This completes the proof of the result.

We next prove a simple result about the best R-mode approximation to the response, obtained by applying RPOD* to the idealized R-mode ensembles $\bar X_R, \bar Y_R$.


Lemma 4. Suppose that RPOD* is applied to the idealized ensembles $\bar X_R, \bar Y_R$ to obtain the ROM $(\bar A_R, \bar B_R, \bar C_R)$. Then, the following result holds for all $k$:

$\|CA^kB - \bar C_R\bar A_R^k\bar B_R\| \le \sum_{i=R+1}^{n_x}|\lambda_i|^k\,o_b^{(i)}c_o^{(i)}.$

Proof. Using Lemma 1, we know that $\bar\Psi_R\bar\Phi_R' = \sum_{i=1}^{R}v_iu_i'$. Therefore, it follows that $\bar C_R\bar A_R^k\bar B_R = C\bar\Psi_R\bar\Phi_R'(A\bar\Psi_R\bar\Phi_R')^kB = \sum_{i=1}^{R}\lambda_i^k(Cv_i)(u_i'B) = \sum_{i=1}^{R}\lambda_i^kO_b^{(i)}C_o^{(i)}$, and $CA^kB = \sum_{i=1}^{n_x}\lambda_i^k(Cv_i)(u_i'B) = \sum_{i=1}^{n_x}\lambda_i^kO_b^{(i)}C_o^{(i)}$. Therefore, it follows that $\|CA^kB - \bar C_R\bar A_R^k\bar B_R\| \le \sum_{i=R+1}^{n_x}|\lambda_i|^k\|O_b^{(i)}\|\|C_o^{(i)}\| = \sum_{i=R+1}^{n_x}|\lambda_i|^k\,o_b^{(i)}c_o^{(i)}$.

This now allows us to provide the following error bound on the RPOD* algorithm.




Theorem 2 (RPOD* error bound). Let $z(t)$ be the response of the true system $(A, B, C)$ and let $\bar z(t)$ be the response of the RPOD* ROM $(\bar A, \bar B, \bar C)$ to a bounded input $\omega(t)$, such that $\|\omega(t)\| \le W$ for all $t$. Given any $t > S$, with probability at least $1-(\delta_1+\delta_2)$, the following bound holds:

$\|z(t) - \bar z(t)\| \le W\sum_{i=R+1}^{n_x}\frac{o_b^{(i)}c_o^{(i)}}{1-|\lambda_i|} + RW\left\{\frac{c_o^{(1)}+o_b^{(1)}}{\rho^2(1-|\lambda_1|)} + \frac{RGc_o^{(1)}o_b^{(1)}}{(1-|\lambda_1|)^2}\right\}\rho^4\nu^2 + o(\nu^2),$

where $\delta_1 = 2e^{-(\epsilon-\sqrt{\gamma})^2Sn_\omega/2}$, $\delta_2 = 2e^{-(\epsilon-\sqrt{\zeta})^2Sn_z/2}$, $\rho = \frac{1+\epsilon}{1-\epsilon}$, $G = 1 + \frac{\bar\sigma^2}{D^2\rho^2}$, and $\lim_{\nu\to0}\nu^{-2}o(\nu^2) = 0$.

Proof. Let $\bar z_R(t)$ be the response of the idealized R-mode RPOD* reduced order model $(\bar A_R, \bar B_R, \bar C_R)$ to the input $\omega(t)$. Then, using Corollary 2, we have that:

$\|\bar z_R(t) - \bar z(t)\| \le W\sum_{k=t-S}^{t}\|\bar C_R\bar A_R^{t-k}\bar B_R - \bar C\bar A^{t-k}\bar B\| \le W\sum_{l=0}^{S}\|\bar C_R\bar A_R^{l}\bar B_R - \bar C\bar A^{l}\bar B\|$
$\le WR\left\{\sum_{l=0}^{S}|\lambda_1|^l\rho^2\nu^2(c_o^{(1)}+o_b^{(1)}) + \sum_{l=0}^{S}RG(l+1)|\lambda_1|^lc_o^{(1)}o_b^{(1)}\rho^4\nu^2\right\} + o(\nu^2)$
$\le RW\left\{\frac{c_o^{(1)}+o_b^{(1)}}{\rho^2(1-|\lambda_1|)} + \frac{RGc_o^{(1)}o_b^{(1)}}{(1-|\lambda_1|)^2}\right\}\rho^4\nu^2 + o(\nu^2), \qquad (39)$

where we have used the fact that $\sum_{l=0}^{S}(l+1)|\lambda_1|^l \le \frac{1}{(1-|\lambda_1|)^2}$ in the last line of the above set of inequalities. From Lemma 4, we also have that:

$\|z(t) - \bar z_R(t)\| \le W\sum_{k=0}^{S}\sum_{i=R+1}^{n_x}|\lambda_i|^k\,o_b^{(i)}c_o^{(i)} \le W\sum_{i=R+1}^{n_x}\frac{o_b^{(i)}c_o^{(i)}}{1-|\lambda_i|}. \qquad (40)$

Noting that (39) holds with probability $1-\delta_1-\delta_2$, using (40), and the triangle inequality $\|z(t) - \bar z(t)\| \le \|z(t) - \bar z_R(t)\| + \|\bar z_R(t) - \bar z(t)\|$, the result follows.


Remark 2 (Approximate Error). It may be shown that $G \approx 1$ and $\frac{RGc_o^{(1)}o_b^{(1)}}{(1-|\lambda_1|)^2} \gg \frac{c_o^{(1)}+o_b^{(1)}}{\rho^2(1-|\lambda_1|)}$. Also, $\|z(t) - \bar z_R(t)\| \ll \|\bar z_R(t) - \bar z(t)\|$. Therefore, the approximate RPOD* error bound is given by (with high probability):

$\|z(t) - \bar z(t)\| \le \frac{R^2Wo_b^{(1)}c_o^{(1)}}{(1-|\lambda_1|)^2}\rho^4\nu^2. \qquad (41)$

The key point to note here is that the RPOD* error is $O(\nu^2)$ in the small parameter $\nu$, where we recall that $\nu$ is an upper bound on the scaled error $\|\bar H_R^{-1}E\|$.

Remark 3 (The small parameter $\nu$). It is worthwhile to look at the parameter $\nu$ in a slightly different fashion than presented thus far. Note that $\nu = \frac{Dd}{\bar\sigma^2} = \frac{D^2}{\bar\sigma^2}\frac{d}{D} \ge \kappa(H_R)\,e$, where $e$ is the normalized error $\frac{\max(\|\delta X\|, \|\delta Y\|)}{\max(\|X_R\|, \|Y_R\|)}$, and $\kappa(\cdot)$ represents the condition number of a matrix. The term $\frac{D^2}{\bar\sigma^2}$ is an upper bound on $\kappa(H_R)$ and heuristically acts as a surrogate for it; thus, as long as the matrix $H_R$ is well conditioned, the error parameter $\nu$ can be expected to be small. Note that the result above is, at its heart, about the perturbation of the inverse of $H_R$ (by $E$), and as such, it is only natural that the condition number $\kappa(H_R)$ is central to the small parameter $\nu$.


4.3. RPOD* Implementation

From the analysis above, we only need to collect R snapshots from the primal and adjoint simulations. However, R is not known a priori; thus, in practice, we estimate the number of snapshots to be collected as follows. Suppose we collect $m$ snapshots. We start with a random guess $m \ll n_x$, or we can choose $m$ from experience. For instance, in a fluid system with $n_x = 10^6$, R is usually $O(10) \sim O(10^2)$, and hence, we start with $m = O(10)$. Then we check the rank of $\bar Y'\bar X$. If $\bar Y'\bar X$ has full rank, i.e., $\mathrm{rank}(\bar Y'\bar X) = m$, it is possible that we did not take enough snapshots, and we increase $m$ until $\mathrm{rank}(\bar Y'\bar X) < m$.

Selecting snapshots varies for different problems, and here we provide a heuristic approach. More specifically, we assume WLOG that the $m$ snapshots are taken at time steps $T\Delta t, 2T\Delta t, \cdots, mT\Delta t$, and we discuss how to choose $T$. Our analysis requires that $T \ge S$. In Fig. 2, we show one simulation result comparing the accuracy of the ROMs using smaller values of $T = 5, 10, 20, 30, 50, 100, 600$ for the atmospheric dispersion problem introduced in Section 6. Here, $S = 5000$ and $R = 350$. The error is defined in (46), and we take the average of the error for each ROM. The averaged error as a function of $T$ is shown in Fig. 2(a), and the total time to collect the primal and adjoint snapshot ensembles is shown in Fig. 2(b). It can be seen that as $T$ increases, the ROM becomes more accurate, while it takes longer to generate the snapshots. Thus, there is a trade-off between accuracy and computational efficiency. The basic issue is that the selection of $T$ affects the numerical conditioning of the randomized ensembles, and thus the accuracy of the results. A smaller $T < S$ means that the elements of $\Omega$ and $\Omega_a$ can no longer be assumed to be i.i.d., thereby affecting the conditioning of the matrices. However, as the empirical evidence shows, this effect is essentially mitigated for $T$ much smaller than $S$.
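The rank-checking heuristic above can be sketched as follows. This is only an illustration of the adaptive selection of $m$; the callables `collect_primal` and `collect_adjoint`, the initial guess `m0`, and the growth factor are hypothetical placeholders for the problem-specific simulation routines.

```python
import numpy as np

def estimate_snapshot_count(collect_primal, collect_adjoint, m0=10, growth=2, tol=1e-10):
    """Heuristic from Sec. 4.3: grow m until Ybar' Xbar becomes rank deficient.

    collect_primal(m) / collect_adjoint(m) are assumed to return the
    n_x x m snapshot ensembles Xbar, Ybar from the white-noise-forced
    primal/adjoint simulations (hypothetical callables).
    """
    m = m0
    while True:
        X = collect_primal(m)      # n_x x m primal snapshot ensemble
        Y = collect_adjoint(m)     # n_x x m adjoint snapshot ensemble
        H = Y.T @ X                # small m x m matrix, cheap to analyze
        if np.linalg.matrix_rank(H, tol=tol) < m:
            return m, H            # rank R < m: enough snapshots were taken
        m *= growth                # full rank: possibly missed modes, grow m
```

The loop only ever forms the small $m \times m$ product, so each rank check costs $O(m^3)$ rather than anything scaling with $n_x$.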


Figure 2: Snapshot Selection: Effect of T. (a) Averaged Error; (b) Time.


5. Comparison with Related Algorithms

Now we consider the RPOD∗ algorithm from a more general point of view. The RPOD∗ snapshot ensemble can be written as

$\bar X = [B, AB, \cdots, A^{S-1}B]\,\Omega, \qquad (42)$

where $\Omega$ is the random matrix formed from the Gaussian white noise input realizations. Thus, with probability one,

$\mathrm{span\,col}\,\bar X = \mathrm{span\,col}\,\mathcal{R}(A, B), \qquad (43)$

where $\mathcal{R}(A, B) = [B, AB, \cdots, A^{S-1}B]$ denotes the reachability (Krylov) matrix. Note that this is the fundamental difference from all Krylov subspace methods. Constructing Krylov subspaces, and finding their bases, is both time and memory expensive when the dimension of the system and the number of inputs/outputs are large. RPOD∗ can be viewed as constructing a randomized version of the Krylov subspace that is orders of magnitude less time and memory consuming than a typical Krylov subspace method, while solving a much smaller symmetric $R \times R$ eigenvalue problem to find orthonormal bases such that

$\mathrm{span\,col}\,\bar\Psi = \mathrm{span\,col}\,\bar X. \qquad (44)$
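The span equality (43) can be checked numerically on a toy system. The sketch below is illustrative only: the system $(A, B)$ and all sizes are made up for the demonstration, and the sketch matrix $\Omega$ is applied explicitly rather than arising from a forced simulation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes (assumed): n_x = 20, n_omega = 2, S = 8.
nx, nw, S = 20, 2, 8
# Block-nilpotent A (four 5x5 shift blocks), so A^5 = 0 and the
# reachability matrix R(A, B) has rank at most 5 * n_omega = 10.
A = np.kron(np.eye(4), np.diag(np.ones(4), k=1))
B = rng.standard_normal((nx, nw))

# Explicit Krylov/reachability matrix R(A, B) = [B, AB, ..., A^{S-1}B]
K = np.hstack([np.linalg.matrix_power(A, j) @ B for j in range(S)])

# Randomized ensemble Xbar = R(A, B) @ Omega with a thin Gaussian Omega,
# mimicking (42): the white-noise-forced trajectory lives in the Krylov space.
m = 12                                     # m >= rank(K), yet m < S * n_omega
Omega = rng.standard_normal((S * nw, m))
Xbar = K @ Omega

rK = np.linalg.matrix_rank(K)
rX = np.linalg.matrix_rank(Xbar)
print(rK, rX)   # equal ranks: the sketch spans the same column space w.p. 1
```

Since $\bar X = K\Omega$ always satisfies $\mathrm{span\,col}\,\bar X \subseteq \mathrm{span\,col}\,K$, equal ranks imply equal column spaces, which is the content of (43).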


5.1. Comparison with BPOD

As we mentioned in Section 3.2, $n_\omega + n_z$ simulations are needed for the BPOD algorithm. It is expensive to store $(n_\omega + n_z)S$ snapshots, and it is expensive to solve the resulting SVD problem due to its large size. For the RPOD∗ algorithm, only one primal and one adjoint simulation are needed, and only $2R$ snapshots, where $R \ll S$, need to be stored. The resulting SVD problem is also easy to solve.

Another practical problem with impulse-response snapshots is that the snapshots after some time are dominated by very few slow modes. In contrast, RPOD∗ trajectories are white-noise forced, and all the modes are present in all snapshots due to the persistent excitation of the white noise terms [29]. Hence, RPOD∗ snapshots can be taken up to time $S$ with the assurance that all of the relevant modes will be captured.


5.2. Comparison with BPOD output projection, RPOD, Tangential interpolation-based ERA (TERA) and CUR-ERA

As we mentioned in Section 1, various methods have been proposed to reduce the computational cost of solving the SVD problem for the large Hankel matrix when the number of inputs/outputs is large. In Table 1, the methods are divided into two groups: methods based on BPOD and methods based on ERA. Further, the methods are categorized by the scheme used to construct the reduced order Hankel matrix.

Table 1: Comparison of Related Methods

                 BPOD based                    ERA based
By Projection    BPOD output projection [9]    TERA [15]
By Sampling      RPOD∗, RPOD [24]              CUR-ERA [16]


BPOD output projection [9] projects the outputs onto a smaller subspace with a projection matrix $\Theta_o$, TERA [15] interpolates the response along tangential directions $W_1, W_2$, and CUR-ERA [16] samples rows and columns of the full Hankel matrix according to index sets $I_1, I_2$. The corresponding reduced Hankel matrices are:

$H_o = \Theta'H, \qquad H_t = W_1'HW_2, \qquad H_{cur} = I_3\,H\,I_4, \qquad (45)$

where $\Theta$ is a block diagonal matrix with $\Theta_o$ on the diagonal entries, $W_1$, $W_2$ are block diagonal matrices with the tangential directions $W_1$, $W_2$ on the diagonal entries respectively, and $I_3$, $I_4$ are the row and column selection matrices corresponding to the index sets $I_1$ and $I_2$.

Computational Complexity Analysis. Denote $\Delta T$ as the time to propagate the primal/adjoint system once. Both the TERA and CUR-ERA algorithms need to collect $2S$ Markov parameters, which takes time $2n_\omega S\Delta T$. The BPOD output projection algorithm and the RPOD algorithm need $(n_\omega + n_o)$ simulations and $(\hat n_\omega + \hat n_z)$ simulations respectively; thus, the total computation time to generate the snapshot ensembles is $(n_\omega + n_o)S\Delta T$ and $(\hat n_\omega + \hat n_z)S\Delta T$. RPOD∗ needs 1 primal and 1 adjoint simulation until time $t_R$, which corresponds to a simulation time of at most $2RS\Delta T$.

It can be seen that RPOD∗ and CUR-ERA construct the smallest projected Hankel matrices. Solving the SVD problems for $\bar H$ and $H_{cur}$ takes time $O(R^3)$, but CUR-ERA requires the storage of the full Hankel matrix, which is avoided in RPOD∗. CUR-ERA also requires the solution of a large combinatorial optimization problem to find the optimal set of inputs and outputs, which can be very time consuming and which is not required by RPOD*.
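The snapshot-generation costs above (in units of $\Delta T$) can be compared directly. The specific values of $n_\omega$, $n_z$, $S$, $R$ below are illustrative, borrowed from the atmospheric dispersion example in Section 6; note this counts propagation steps only, not the very different SVD and storage costs.

```python
# Snapshot-generation cost in units of Delta_T (one primal/adjoint propagation),
# using the formulas from the text. Illustrative sizes (assumed):
n_omega, n_z, S, R = 10, 10**5, 5000, 350

cost_bpod = (n_omega + n_z) * S   # (n_omega + n_z) impulse simulations of length S
cost_tera = 2 * n_omega * S       # 2S Markov parameters
cost_rpod_star = 2 * R * S        # 1 primal + 1 adjoint run, at most 2RS steps

print(cost_bpod, cost_tera, cost_rpod_star)
```

With full-field outputs ($n_z = 10^5$), the BPOD generation cost dwarfs the others, which is the regime where RPOD∗ is advantageous.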


6. Computational Results

In this section, we compare RPOD∗ with BPOD/BPOD output projection on two examples, a one-dimensional heat transfer problem and a three-dimensional atmospheric dispersion problem, to illustrate the proposed algorithm. We then compare RPOD∗ with the CUR-ERA algorithm on a large mass-spring-damper example to further validate the advantages of RPOD∗. First, we define the output error:

$E_{time}(t) = \frac{\|z(t) - z_r(t)\|}{\|z(t)\|}, \qquad (46)$

where $z(t)$ are the outputs of the full order system at time $t$, and $z_r(t)$ are the outputs of the reduced order system. The frequency response error for multiple input, multiple output systems is defined as:

$E_{fre}(j\omega) = \|G(j\omega) - G_r(j\omega)\|, \qquad (47)$

where $G(j\omega)$ is the transfer function of the full order system as a function of frequency $\omega$, and $G_r(j\omega)$ is the transfer function of the ROM.

All of the experiments reported in this paper were performed using Matlab 2017b on a Lenovo T4900d with an Intel(R) Core(TM) i7-7770 CPU, 3.60 GHz, and 8 GB RAM. For the first two experiments below, there are design parameters that need to be chosen manually, such as the rank of the output projection, when to take the snapshots, and the size of the ROM. We show the results for the best selections of these parameters for the BPOD/BPOD output projection algorithms and the RPOD∗ algorithm. We note here that BPOD/BPOD output projection are not particularly well suited for these problems due to the large number of inputs and outputs involved, but we use them solely for the purpose of comparison.
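The two error metrics (46) and (47) can be computed as follows. This is a minimal sketch assuming the discrete-time state-space model of (1), so that $G(j\omega) = C(e^{j\omega\Delta t}I - A)^{-1}B$; the function names and the `dt` parameter are illustrative.

```python
import numpy as np

def output_error(z, zr):
    """Relative output error E_time(t) as in (46)."""
    return np.linalg.norm(z - zr) / np.linalg.norm(z)

def freq_error(A, B, C, Ar, Br, Cr, w, dt=1.0):
    """Frequency response error E_fre(jw) as in (47), spectral norm of the
    difference of the discrete-time transfer functions evaluated on the
    unit circle at z = exp(jw*dt)."""
    s = np.exp(1j * w * dt)
    G = C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)
    Gr = Cr @ np.linalg.solve(s * np.eye(Ar.shape[0]) - Ar, Br)
    return np.linalg.norm(G - Gr, 2)
```

Sweeping `freq_error` over a grid of frequencies reproduces the kind of error curves shown in the frequency-response figures below.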

6.1. Heat Transfer

The equation for heat transfer by conduction along a slab is given by the partial differential equation:

$\frac{\partial T}{\partial t} = \alpha\frac{\partial^2T}{\partial x^2} + f, \qquad (48)$

with boundary conditions

$T|_{x=0} = 0, \qquad \frac{\partial T}{\partial x}\Big|_{x=L} = 0, \qquad (49)$

where $\alpha = 4.2\times10^{-6}\,(m^2/s)$ is the thermal diffusivity, $L = 1\,m$, and $f$ denotes the external sources. Two point sources are located at $x = 0.15\,m$ and $x = 0.45\,m$. The system is discretized using the finite difference method on 100 equally spaced grid points. We take full field measurements, i.e., measurements at every node. In the following, we compare the ROMs constructed using RPOD∗, BPOD and BPOD output projection.

In this problem, $S = 1500$, i.e., $\|A^S\| \to 0$, and the partial realization problem matches the first 1500 Markov parameters. For the BPOD algorithm, we collect the first 1500 impulse responses for both the primal and adjoint simulations, where we need $n_\omega = 2$ primal simulations and $n_z = 100$ adjoint simulations. The rank of the constructed Hankel matrix is 55. For the BPOD output projection algorithm, the rank of the output projection is 30, and hence, 30 adjoint simulations are needed. The rank of the resulting Hankel matrix is 40. For RPOD∗, the system is perturbed by white noise with distribution $N(0, I_{2\times2})$. Only 1 primal and 1 adjoint simulation are needed, and in each simulation, we collect 75 equispaced snapshots at times $S, 2S, \cdots, 75S$. The rank of the resulting Hankel matrix is 55. The sizes of the matrices and the computational times of the three algorithms are summarized in Table 2.

In Fig. 3(a), we compare the spectral norm of the Markov parameters of the ROMs constructed using BPOD and RPOD∗ with the true Markov parameters of the full order system. The BPOD output projection errors are orders of magnitude greater than those of the other algorithms, and hence, are omitted here. We perturb the system with random Gaussian noise and compare the relative output errors in Fig. 3(b). It can be seen that all three methods are sufficiently accurate, while the performance of the RPOD∗ and BPOD algorithms is much better than that of the BPOD output projection algorithm. The errors of BPOD and RPOD∗ are of the order $10^{-10} \sim 10^{-12}$, although the BPOD algorithm performs slightly better than RPOD∗.
As we discussed before, RPOD∗ needs $(1+1)$ simulations and collects snapshots at times $[S, 2S, \cdots, 75S]$, while BPOD needs $(2+100)$ simulations and collects snapshots at times $[1, 2, \cdots, S]$; hence, the time to generate the snapshot ensembles using RPOD∗ is longer. However, in this problem, the dominant computational cost is solving the SVD problem. Also, for both BPOD and BPOD output projection, a large Hankel matrix needs to be stored.


Table 2: Heat Transfer Example

                              RPOD∗        BPOD            BPOD output projection
Primal snapshot ensemble X    100 × 75     100 × 3000      100 × 3000
Adjoint snapshot ensemble Z   100 × 75     100 × 150000    100 × 45000
Hankel matrix Z′X             75 × 75      150000 × 3000   45000 × 3000
ROM size                      55           55              40
Generate X                    0.6114s      0.02s           0.02s
Generate Z                    1.0446s      0.822s          0.358s + 13.092s (projection)
Construct Z′X                 0.007s       1.203s          0.294s
Solve SVD                     0.001s       1685.4s         39.158s
Total Time                    1.664s       1687.445s       52.922s


Figure 3: Heat Transfer Problem Time Response. (a) Comparison of Markov Parameters; (b) Comparison of Output Errors.


RPOD∗ constructs the ROM in about 1.6s, while BPOD output projection needs 53s, and the BPOD algorithm needs 1687s. Therefore, for a system with a large number of inputs and outputs, RPOD∗ reduces the computation by orders of magnitude compared with BPOD/BPOD output projection, while being as accurate as BPOD.


6.2. Atmospheric Dispersion Problem

The three-dimensional advection-diffusion equation describing contaminant transport in the atmosphere is:

$\frac{\partial c}{\partial t} + \nabla\cdot(c\vec u) = \nabla\cdot(K(\vec X)\nabla c) + Q\delta(\vec X - \vec X_s), \qquad (50)$

where $c(\vec X, t)$ denotes the mass concentration at location $\vec X = (x, y, z)$, $\vec X_s = (x_s, y_s, z_s)$ denotes the location of the point source, and $Q$ is the contaminant emission rate. The simulation domain is $x \in [0, 2000]\,(m)$, $y \in [-100, 400]\,(m)$, $z \in [0, 50]\,(m)$, and the wind velocity is $\vec u = (4\,m/s, 0, 0)$. $\nabla$ denotes the gradient operator, and $K(\vec X) = \mathrm{diag}(K_x(x), K_y(x), K_z(x))$ is a diagonal matrix whose entries are turbulent eddy diffusivities. The boundary conditions are:

$c(0, y, z) = 0, \quad c(\infty, y, z) = 0, \quad c(x, \pm\infty, z) = 0, \quad c(x, y, \infty) = 0, \quad K_z\frac{\partial c}{\partial z}(x, y, 0) = 0. \qquad (51)$


The system is discretized using the finite difference method on $100\times100\times10$ equally spaced grid points. There are 10 point sources. We take full field measurements (except at the nodes on $x = 0, \infty$ and $y = \pm\infty$). In this example, since the system dimension is $n_x = 10^5$, constructing the ROM with full field measurements using BPOD is computationally infeasible, and thus, we only compare the RPOD∗ algorithm with the BPOD output projection algorithm.

For the RPOD∗ algorithm, the system is perturbed by white noise with distribution $N(0, I_{10\times10})$. One primal simulation and one adjoint simulation are needed. In Section 4.3, we showed the accuracy of the ROM as a function of $T$ in Fig. 2; in this example, we choose $T = 50$, collect 500 snapshots at times $T, 2T, \cdots, 500T$ (a tradeoff between accuracy and computational efficiency, as discussed), and extract 350 modes. For the BPOD output projection algorithm, we collect impulse responses from $n_\omega = 10$ primal simulations. We could not collect all of the first 5000 impulse responses due to the memory limits of the platform. Hence, for the best performance available in this example, we collect the first 600 snapshots of each primal simulation. The rank of the output projection is 200, and hence, 200 adjoint simulations are needed, in which we collect the first 50 snapshots. We extract 350 modes.

The sizes of the constructed matrices and the comparison of computational times are shown in Table 3. It can be seen that for this example, the dominant computational cost of the BPOD output projection algorithm is the construction of $Z'X$


(the multiplication of a $10^5\times6000$ matrix and a $10^5\times10^4$ matrix). In this problem, the RPOD∗ algorithm constructs the ROM in about 7.7 minutes, while BPOD output projection constructs the ROM in about 4.4 hours. Moreover, the implementation of the BPOD output projection algorithm is limited by the memory of the platform. Note that even if we choose $T = 600$, the total computational time using RPOD∗ is about 1.6 hours, as shown in Fig. 2, which is still faster than BPOD output projection, while the RPOD∗ ROM is more accurate.

Table 3: Atmospheric Dispersion Example

                              RPOD∗             BPOD output projection
Primal snapshot ensemble X    10^5 × 500        10^5 × 6000
Adjoint snapshot ensemble Z   10^5 × 500        10^5 × 10^4
Hankel matrix Z′X             500 × 500         10^4 × 6000
ROM size                      350               350
Generate X                    223.1s            53.1s
Generate Z                    237.4s            105s + 126.6s (projection)
Construct Z′X                 0.34s             15380s
Solve SVD                     0.03s             124.1s
Total Time                    460.87s           15788.8s


In Fig. 4(a), we compare the Markov parameters of the ROMs constructed using RPOD∗ and BPOD output projection with those of the full order system. RPOD∗ outperforms BPOD output projection by three orders of magnitude, and hence, the RPOD∗ errors are negligible. We perturb the system with random Gaussian noise and compare the output errors in Fig. 4(b). In Fig. 5, we show the frequency response errors.


Figure 4: Atmospheric Dispersion Problem Time Responses. (a) Comparison of Markov Parameters; (b) Comparison of Output Errors.


Figure 5: Atmospheric Dispersion Problem Frequency Responses


It can be seen that the RPOD∗ algorithm outperforms the BPOD output projection algorithm by orders of magnitude. To improve the accuracy of the BPOD output projection algorithm, more snapshots would need to be taken in both the primal and adjoint simulations, which is not computationally feasible on the current platform.

6.3. Mass-Spring-Damper Model

In this section, we compare the RPOD∗ algorithm with the so-called CUR-ERA algorithm on a mass-spring-damper model. The CUR-ERA is an optimized balanced truncation method especially suited for systems with a large number of inputs and outputs. The code for the model and the CUR-ERA algorithm are from reference [16].

The system consists of 500 mass-spring-damper elements, and is formulated as a discrete-time input-output model as in (1). The system dimension is $n_x = 1000$, and there are 30 inputs and 30 outputs. In the CUR-ERA algorithm, the first 1000 Markov parameters are generated, and the resulting full Hankel matrix is $15000\times15000$. After constructing the full order Hankel matrix, CUR-ERA solves the CUR decomposition of the Hankel matrix. For RPOD∗, the first 1000 snapshots are generated for both the primal and adjoint simulations, and we collect 100 equispaced snapshots, i.e., at $k = 10, 20, \ldots, 1000$; hence, the resulting Hankel matrix is $100\times100$. For a fair comparison, the order of the reduced model is chosen to be 80 for both algorithms, the same as that used in [16]. The comparison of computational times is summarized in Table 4.

Table 4: Mass-Spring-Damper Example

                          RPOD∗                                     CUR-ERA
Generate Hankel matrix    0.15s (X) + 0.16s (Z) + 0.0003s (Z′X)     2.16s
Solve SVD                 0.0014s                                   5.5886s
Total Time                0.3117s                                   7.7486s

To compare the performance, in Fig. 6(a) we show the transfer functions of the full order model and the two reduced order models identified by RPOD∗ and CUR-ERA, and in Fig. 6(b) we show the frequency response errors.
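The RPOD∗-style pipeline used in these experiments — force the primal and adjoint systems with white noise, subsample the trajectories every $T$ steps, form the small $\bar Y'\bar X$ matrix, and build the ROM from its SVD — can be sketched as follows. This is a simplified illustration, not the paper's exact algorithm (the precise bi-orthonormal basis construction of Section 3 is condensed into a balanced-POD-style projection), and the function name and parameters are hypothetical.

```python
import numpy as np

def rpod_star_rom(A, B, C, m=100, T=10, r=8, rng=None):
    """Simplified RPOD*-style ROM sketch for a discrete-time stable (A, B, C):
    white-noise-forced primal/adjoint runs, subsampled snapshots, small SVD."""
    rng = rng or np.random.default_rng(0)
    nx, nw = B.shape
    nz = C.shape[0]
    x = np.zeros(nx)
    y = np.zeros(nx)
    X, Y = [], []
    for k in range(1, m * T + 1):
        x = A @ x + B @ rng.standard_normal(nw)      # primal, white-noise forced
        y = A.T @ y + C.T @ rng.standard_normal(nz)  # adjoint, white-noise forced
        if k % T == 0:                               # subsample every T steps
            X.append(x.copy())
            Y.append(y.copy())
    X = np.column_stack(X)                           # n_x x m primal ensemble
    Y = np.column_stack(Y)                           # n_x x m adjoint ensemble
    U, s, Vt = np.linalg.svd(Y.T @ X)                # small m x m problem
    Sih = np.diag(s[:r] ** -0.5)
    Phi = X @ Vt[:r].T @ Sih                         # primal modes
    Psi = Y @ U[:, :r] @ Sih                         # adjoint modes
    return Psi.T @ A @ Phi, Psi.T @ B, C @ Phi       # (Ar, Br, Cr)

# Toy sanity check: with r = n_x the ROM is a similarity transform of (A, B, C).
rng = np.random.default_rng(2)
A = np.diag([0.9, 0.5, 0.1, 0.05])
B = rng.standard_normal((4, 1))
C = rng.standard_normal((1, 4))
Ar, Br, Cr = rpod_star_rom(A, B, C, m=40, T=5, r=4)
```

By construction $\Psi'\Phi = I$, so for $r$ equal to the state dimension the reduced triple reproduces the full system's Markov parameters exactly, which is a convenient correctness check for the pipeline.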


Figure 6: Mass-Spring-Damper Problem Frequency Responses. (a) Comparison of Frequency Responses; (b) Comparison of Frequency Response Errors.

It can be seen from the plots that the RPOD∗ algorithm performs better than the CUR-ERA algorithm over most of the frequency range of interest, while at the same time, the RPOD∗ algorithm is more computationally efficient, since it does not need to construct the full Hankel matrix and does not need to solve the additional optimization problems required by the CUR method.

7. Conclusion

In this paper, we have introduced a randomized POD procedure for the extraction of ROMs from large scale systems such as those governed by PDEs. The ROM is constructed by perturbing the primal and adjoint systems with Gaussian white noise, which reduces the computational cost of constructing the snapshot ensembles compared to perturbing the primal and adjoint systems with impulses. It also leads to a much smaller SVD problem, and to an orders-of-magnitude reduction in the computation required to construct ROMs relative to other related algorithms. Probabilistic performance bounds are provided for RPOD*. The computational results show that the accuracy of RPOD∗ is also much higher than that of other competing algorithms. We also show that RPOD* represents a simple yet fundamentally different randomized approach to Krylov subspace methods.

In future work, one possible direction is the extension of RPOD* to nonlinear problems. Since the snapshot POD is routinely used for the model reduction of nonlinear systems, it is plausible that there is a suitable extension of RPOD* to nonlinear systems.


References

[1] J. M. Stockie, The Mathematics of Atmospheric Dispersion Modeling, SIAM Review 53, No.2 (2011) 349–372.


[2] A. Antoulas, Approximation of Large Scale Dynamical Systems, SIAM, Philadelphia, PA, 2005. [3] U. Baur, P. Benner, L. Feng, Model Order Reduction for Linear and Nonlinear Systems: A System-Theoretic Perspective, Arch. Computat. Methods Eng. 21 (2014) 331–358.


[4] P. Benner, S. Gugercin, K. Willcox, A Survey of Projection-Based Model Reduction Methods for Parametric Dynamical Systems, SIAM Review 57 (4) (2015) 483–531. [5] Z. Bai, Krylov subspace techniques for reduced-order modeling of large-scale dynamical systems, Applied Numerical Mathematics 43 (2002) 9–44. [6] P. Feldmann, R. W. Freund, Efficient Linear Circuit Analysis by Padé Approximation via the Lanczos Process, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 14 (5) (1995) 639–649. [7] I. M. Jaimoukha, E. M. Kasenally, Implicitly Restarted Krylov Subspace Methods for Stable Partial Realizations, SIAM Journal on Matrix Analysis and Applications 18 (3) (1997) 633–652.


[8] K. Willcox, J. Peraire, Balanced Model Reduction via the Proper Orthogonal Decomposition, AIAA Journal 40 (2002) 2323–2330.


[9] C. W. Rowley, Model reduction for fluids using balanced proper orthogonal decomposition, International Journal of Bifurcation and Chaos 15 (2005) 997–1013. [10] B. C. Moore, Principal component analysis in linear systems: Controllability, observability and model reduction, IEEE Transactions on Automatic Control 26, No.1 (1981) 17–32.


[11] S. Kung, A new identification method and model reduction algorithm via singular value decomposition, 12th Asilomar Conference on Circuits, Systems and Computers (Nov. 1978) 705–714. [12] J.-N. Juang, R. S. Pappa, An eigensystem realization algorithm for model parameter identification and model reduction, Journal of Guidance, Control, and Dynamics 8, No.5 (1985) 620–627.


[13] Z. Ma, S. Ahuja, C. W. Rowley, Reduced order models for control of fluids using the eigensystem realization algorithm, Theoret. Comput. Fluid Dyn. 25 (2011) 233–247.


[14] J. H. Tu, C. W. Rowley, An improved algorithms for balanced pod through an analytic treatment of impulse response tails, Journal of Computational Physics 231 (June, 2012) 5317–5333.


[15] B. Kramer, S. Gugercin, Tangential interpolation-based eigensystem realization algorithm for MIMO systems, Mathematical and Computer Modelling of Dynamical Systems 22 (4) (2016) 282–306. [16] B. Kramer, A. A. Gorodetsky, System Identification via CUR-factored Hankel Approximation, SIAM J. Sci. Comput. 40 (2) (2018) A848–A866.


[17] M. W. Mahoney, Randomized Algorithms for Matrices and Data, Found. Trends Mach. Learn. 3 (2) (2011) 123–224. [18] N. Halko, P. Martinsson, J. Tropp, Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions, SIAM review 52(2) (2011) 217–288. [19] G. Calafiore, M. Campi, The scenario approach to robust control design, IEEE Transactions on Automatic Control 51 (2006) 742–753. [20] P. Drineas, et al., Fast monte carlo algorithms for matrices ii: Computing a low rank approximation to a matrix, SIAM J. Computing 36 (2006) 158– 183.

al

[21] O. Balabanov, A. Nouy, Randomized linear algebra for model reduction. Part I: Galerkin methods and error estimation, arXiv e-prints (2018) arXiv:1803.02602arXiv:1803.02602.

urn

[22] A. Buhr, K. Smetana, Randomized Local Model Order Reduction, SIAM Journal on Scientific Computing 40 (April 2018). [23] A. Alla, J. N. Kutz, Randomized Model Order Reduction, Advances in Computational Mathematics 45(3) (2019) 1251–1271. [24] D. Yu, S. Chakravorty, A Randomized Proper Orthogonal Decomposition Technique, in: Proceedings of American Control Conference, 2015, pp. 1127–1142.

Jo

[25] D. Yu, S. Chakravorty, A Computationally Optimal Randomized Proper Orthogonal Decomposition Technique, in: Proceedings of American Control Conference, 2016. [26] A. M. King, U. B. Desai, R. E. Skelton, A Generalized Approach to q-Markov Covariance Equivalent Realizations for Discrete Systems, Automatica 24 (4) (1988) 507–515.


[27] M. Bastug, et al., Model Reduction by Nice Selections for Linear Switched Systems, IEEE Transactions on Automatic Control 61 (11) (Nov. 2016) 3422 – 3437.


[28] M. Rudelson, R. Vershynin, Non asymptotic theory of random matrices: Extreme singular values, in: Proceedings of the International Congress of Mathematics, Vol. 4, 2010.


[29] Y. Zhu, Chapter 3 - Identification Test Design and Data Pretreatment, in: Multivariable System Identification For Process Control, Pergamon, Oxford, 2001, pp. 31 – 63.
