ON GLOBAL ITERATIVE SCHEMES BASED ON HESSENBERG PROCESS FOR (ILL-POSED) SYLVESTER TENSOR EQUATIONS

Mehdi Najafi-Kalyani§, Fatemeh Panjeh Ali Beik∗,§ and Khalide Jbilou†

§ Department of Mathematics, Vali-e-Asr University of Rafsanjan, P.O. Box 518, Rafsanjan, Iran
e-mails: [email protected], [email protected]

† Laboratory LMPA, 50 Rue F. Buisson, ULCO Calais cedex, France
e-mail: [email protected]

∗ Corresponding author.

Abstract. This paper is concerned with studying the performance of some global iterative schemes based on the Hessenberg process for solving the Sylvester tensor equation (STE). Furthermore, we briefly mention the implementation of flexible versions. Under certain conditions, we devote part of the work to deriving upper bounds for the condition number of matrices with a special Kronecker product structure, which can be helpful for obtaining error bounds related to the perturbation analysis of the mentioned STE. In addition, we propose using the Tikhonov regularization technique in conjunction with the tensor form of the flexible Hessenberg process when the STE is ill-posed. Numerical examples are examined for two test problems arising from the discretization of three-dimensional partial differential equations. We also present some experimental results related to the restoration of color images.

Keywords: Hessenberg process, Sylvester tensor equation, Kronecker product, perturbation analysis, ill-posed problem, Tikhonov regularization.

2010 AMS Subject Classification: 65F10, 15A09.

1. Introduction

A tensor is a multidimensional array; the number of its indices (called modes or ways) determines its order. In the following, vectors and matrices are respectively denoted by lowercase and capital letters, and tensors of order three (or higher) are represented by Euler script letters. For a given $N$-mode tensor $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$, the notation $x_{i_1 i_2\ldots i_N}$ stands for element $(i_1, i_2, \ldots, i_N)$ of $\mathcal{X}$. Before stating the main problem, we present some notations and review some basic concepts which are used subsequently.

For a given square matrix $A$ with real eigenvalues, we denote the minimum and maximum eigenvalues of $A$ by $\lambda_{\min}(A)$ and $\lambda_{\max}(A)$, respectively. The set of all eigenvalues (spectrum) of $A$ is signified by $\sigma(A)$. The symmetric and skew-symmetric parts of $A$ are respectively denoted by $H(A)$ and $S(A)$, i.e.,
$$H(A) = \frac{1}{2}(A + A^T) \quad\text{and}\quad S(A) = \frac{1}{2}(A - A^T).$$
By the condition number of an invertible matrix $A$, we mean $\mathrm{cond}(A) = \|A\|_2\|A^{-1}\|_2$, where $\|\cdot\|_2$ is the matrix 2-norm. For $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$, tensors of the form
$$\mathcal{X}_{\underbrace{::\cdots:}_{(N-1)\text{ times}}\,k}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_{N-1}}, \qquad k = 1, 2, \ldots, I_N,$$
are called frontal slices or column tensors of $\mathcal{X}$ (see [20] for background). Also, the vector $\mathrm{vec}(\mathcal{X})$ is obtained by applying the standard vectorization operator with respect to the frontal slices of $\mathcal{X}$. For some discussions of the structured matrix computations arising in the context of tensor decompositions, one may refer to [26, 33].


The inner product between two same-size tensors $\mathcal{X}, \mathcal{Y}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ is given by
$$\langle\mathcal{X},\mathcal{Y}\rangle = \sum_{i_1=1}^{I_1}\sum_{i_2=1}^{I_2}\cdots\sum_{i_N=1}^{I_N} x_{i_1 i_2\cdots i_N}\, y_{i_1 i_2\cdots i_N},$$
and the norm of a tensor $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ is defined by
$$\|\mathcal{X}\|^2 = \sum_{i_1=1}^{I_1}\sum_{i_2=1}^{I_2}\cdots\sum_{i_N=1}^{I_N} x_{i_1 i_2\cdots i_N}^2.$$

Definition 1.1 (Kolda and Bader [20]). The $n$-mode (matrix) product of a tensor $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ with a matrix $U\in\mathbb{R}^{J\times I_n}$ is denoted by $\mathcal{X}\times_n U$ and is of size
$$I_1\times\cdots\times I_{n-1}\times J\times I_{n+1}\times\cdots\times I_N,$$
and its elements are defined as follows:
$$(\mathcal{X}\times_n U)_{i_1\cdots i_{n-1}\,j\,i_{n+1}\cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2\cdots i_N}\, u_{j i_n}.$$

Remark 1.2 (Kolda and Bader [20]). The $n$-mode (vector) product of a tensor $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ with a vector $v\in\mathbb{R}^{I_n}$ is signified by $\mathcal{X}\,\bar{\times}_n\, v$, which is of order $N-1$, i.e., of size
$$I_1\times\cdots\times I_{n-1}\times I_{n+1}\times\cdots\times I_N.$$
Elementwise,
$$(\mathcal{X}\,\bar{\times}_n\, v)_{i_1\cdots i_{n-1} i_{n+1}\cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2\cdots i_N}\, v_{i_n}.$$
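As a small aid to the reader, the following NumPy sketch (our own illustration; the experiments of Section 5 use MATLAB with the Tensor Toolbox [4]) implements the $n$-mode matrix product of Definition 1.1 by moving mode $n$ to the front and contracting:

```python
import numpy as np

def mode_n_product(X, U, n):
    """n-mode (matrix) product X x_n U of Definition 1.1 (0-based mode n);
    U has shape (J, I_n) and the result replaces mode n of X by J."""
    Xn = np.moveaxis(X, n, 0)                  # bring mode n to the front
    Yn = np.tensordot(U, Xn, axes=([1], [0]))  # contract over i_n
    return np.moveaxis(Yn, 0, n)

# Check one entry against the elementwise definition.
X = np.random.rand(3, 4, 5)
U = np.random.rand(6, 4)
Y = mode_n_product(X, U, 1)                    # size 3 x 6 x 5
assert np.isclose(Y[0, 2, 3], sum(X[0, i, 3] * U[2, i] for i in range(4)))
```

The $n$-mode vector product of Remark 1.2 is the special case in which $U$ has a single row and the resulting singleton mode is dropped.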

Consider the Sylvester tensor equation (STE)
$$\mathcal{X}\times_1 A^{(1)} + \mathcal{X}\times_2 A^{(2)} + \cdots + \mathcal{X}\times_N A^{(N)} = \mathcal{D}, \tag{1.1}$$
where the right-hand-side tensor $\mathcal{D}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ and the coefficient matrices $A^{(n)}\in\mathbb{R}^{I_n\times I_n}$ ($n = 1, 2, \ldots, N$) are known, and $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ is the unknown tensor to be determined. The above tensor equation may arise from the discretization of a partial differential equation in high dimension [5, 6, 8, 25, 27, 32]. For instance, we may need to solve a STE after discretization of the three-dimensional radiative transfer equation by the Chebyshev collocation spectral method, see [23]. For $N = 3$, Eq. (1.1) may also appear when a fully three-dimensional microscale dual phase lag problem is discretized using a mixed-collocation and finite difference scheme, see [24] for more details. One may refer to [19] for more background on tensor numerical methods for solving partial differential equations in many dimensions.

It is well known that the following linear system of equations is equivalent to (1.1),
$$\mathcal{A}x = b, \tag{1.2}$$
with $x = \mathrm{vec}(\mathcal{X})$, $b = \mathrm{vec}(\mathcal{D})$, and
$$\mathcal{A} = \sum_{j=1}^{N} I^{(I_N)}\otimes\cdots\otimes I^{(I_{j+1})}\otimes A^{(j)}\otimes I^{(I_{j-1})}\otimes\cdots\otimes I^{(I_1)}, \tag{1.3}$$
see [20] for further details. Here, we comment that the above Kronecker structure is not formed in practice, whereas it is useful for deriving theoretical results.
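To make the equivalence between (1.1) and (1.2)–(1.3) concrete, the following hedged NumPy sketch (our own illustration on small random data) forms $\mathcal{A}$ explicitly and checks that $\mathcal{A}\,\mathrm{vec}(\mathcal{X}) = \mathrm{vec}(\mathcal{M}(\mathcal{X}))$, where $\mathrm{vec}$ stacks the frontal slices (column-major order):

```python
import numpy as np
from functools import reduce

def sylvester_op(X, A):
    """M(X) = sum_n X x_n A^(n) for a list A of square matrices."""
    Y = np.zeros_like(X)
    for n, An in enumerate(A):
        Xn = np.moveaxis(X, n, 0)
        Y += np.moveaxis(np.tensordot(An, Xn, axes=([1], [0])), 0, n)
    return Y

I1, I2, I3 = 3, 4, 5
A = [np.random.rand(m, m) for m in (I1, I2, I3)]
X = np.random.rand(I1, I2, I3)

# Kronecker form (1.3): identity factors in every position except j.
eyes = [np.eye(m) for m in (I1, I2, I3)]
Abig = sum(reduce(np.kron, [A[j] if k == j else eyes[k] for k in (2, 1, 0)])
           for j in range(3))
assert np.allclose(Abig @ X.flatten(order='F'),
                   sylvester_op(X, A).flatten(order='F'))
```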


Some iterative methods based on Krylov subspaces have already been proposed in the literature for solving (1.1) numerically. For instance, when the right-hand side in (1.1) is a tensor of low rank, Kressner and Tobler [21] proposed Krylov subspace methods based on the (extended) Arnoldi process. In fact, they first transformed the original STE into a low-dimensional STE; then a tensorized Krylov subspace is exploited, which consists of the Kronecker product of standard Krylov subspaces. For the case that $\mathcal{D}$ is not restricted to be a tensor of low rank, Chen and Lu [8] proposed the GMRES method based on tensor format (GMRES−BTF) for solving (1.1). In [8], a preconditioner is also proposed by developing the idea exploited in [13] for finding the nearest Kronecker product. The tensor form of the FOM algorithm (FOM−BTF) was presented in [7].

Recently, there has been growing interest in constructing algorithms based on the Hessenberg process to solve matrix equations [1] and multiple linear systems [2, 3]. The reason is that the Hessenberg process can lead to a Krylov subspace method with a lower computational cost per step than those constructed from the standard Arnoldi process. In some applications, such as STEs arising from the discretization of three-dimensional partial differential equations by spectral methods, the coefficient matrices in the STE are (possibly ill-conditioned and) full. This motivated us to propose the global Hessenberg process, which constructs a (computationally cheaper) nonorthogonal basis for the tensor Krylov subspace. In addition, the performance of flexible algorithms is briefly pointed out. Some experimental results are devoted to solving STEs with perturbed right-hand sides. For ill-posed problems, (flexible) algorithms are proposed by developing the idea of standard Tikhonov regularization in conjunction with the tensor forms of the Hessenberg and Arnoldi processes.

In [22, 31], the authors focused on the perturbation analysis of (1.1). The bounds obtained for the relative errors are either derived under restrictive assumptions or depend on the norm of $\mathcal{A}^{-1}$. Such bounds can be used for the error analysis of a STE with perturbed data (such as the right-hand side and/or the coefficient matrices). They may also be useful for assessing the conditioning of a problem that is solved by iterative methods. In the next section, we first point out that the perturbation analysis of (1.1) is closely related to deriving an upper bound for the condition number of $\mathcal{A}$. Then we discuss the conditioning of (1.1), for which we are particularly interested in extracting information from the coefficient matrices of the STE, and present an upper bound for the condition number of $\mathcal{A}$.

The remainder of this paper is organized as follows. In the second section, we focus on the perturbation analysis of STEs under certain conditions. In Section 3, the application of the (flexible) global Hessenberg process in tensor form is described to construct iterative algorithms for solving (1.1). Section 4 deals with proposing flexible global approaches to solve ill-posed problems in tensor form. Several test problems are solved numerically in Section 5 to examine the performance of the proposed iterative schemes in practice. Finally, we end the paper with brief concluding remarks in Section 6.

2. On the condition number of matrices with Kronecker structure

In this section, we aim to use some information from the matrices $A^{(j)}$, $j = 1, 2, \ldots, N$, to obtain an upper bound for the condition number of $\mathcal{A}$ given by (1.3). Evidently, the Hermitian and skew-Hermitian parts of $\mathcal{A}$ are respectively given by
$$\mathcal{H} := H(\mathcal{A}) = \sum_{j=1}^{N} I^{(I_N)}\otimes\cdots\otimes I^{(I_{j+1})}\otimes H(A^{(j)})\otimes I^{(I_{j-1})}\otimes\cdots\otimes I^{(I_1)}, \tag{2.1}$$
and
$$\mathcal{S} := S(\mathcal{A}) = \sum_{j=1}^{N} I^{(I_N)}\otimes\cdots\otimes I^{(I_{j+1})}\otimes S(A^{(j)})\otimes I^{(I_{j-1})}\otimes\cdots\otimes I^{(I_1)}. \tag{2.2}$$

In [31], Shi et al. focused on a perturbed version of the STE, i.e.,
$$\hat{\mathcal{X}}\times_1 (A^{(1)}+\Delta A^{(1)}) + \hat{\mathcal{X}}\times_2 (A^{(2)}+\Delta A^{(2)}) + \cdots + \hat{\mathcal{X}}\times_N (A^{(N)}+\Delta A^{(N)}) = \mathcal{D}+\Delta\mathcal{D}, \tag{2.3}$$
and obtained the following error bound:
$$\frac{\|\Delta\mathcal{X}\|}{\|\mathcal{X}\|} \le
\frac{\displaystyle\Big(\sum_{i=1}^{N}\|A^{(i)}\|_F\Big)\frac{\prod_{i=1}^{N}\mathrm{cond}(T_i)}{\min_{\lambda_i\in\sigma(A^{(i)})}\big|\sum_{i=1}^{N}\lambda_i\big|}\left(\frac{\|\Delta\mathcal{D}\|}{\|\mathcal{D}\|}+\frac{\sum_{i=1}^{N}\|\Delta A^{(i)}\|_F}{\sum_{i=1}^{N}\|A^{(i)}\|_F}\right)}
{1-\displaystyle\Big(\sum_{i=1}^{N}\|A^{(i)}\|_F\Big)\frac{\prod_{i=1}^{N}\mathrm{cond}(T_i)}{\min_{\lambda_i\in\sigma(A^{(i)})}\big|\sum_{i=1}^{N}\lambda_i\big|}\cdot\frac{\sum_{i=1}^{N}\|\Delta A^{(i)}\|_F}{\sum_{i=1}^{N}\|A^{(i)}\|_F}}, \tag{2.4}$$


provided
$$\left(\sum_{i=1}^{N}\|A^{(i)}\|_F\right)\cdot\frac{\prod_{i=1}^{N}\mathrm{cond}(T_i)}{\min_{\lambda_i\in\sigma(A^{(i)})}\big|\sum_{i=1}^{N}\lambda_i\big|}\cdot\frac{\sum_{i=1}^{N}\|\Delta A^{(i)}\|_F}{\sum_{i=1}^{N}\|A^{(i)}\|_F} < 1.$$
For deriving (2.4), the matrices $A^{(i)}$ are assumed to be diagonalizable, i.e., there exist nonsingular matrices $T_i$ and diagonal matrices $\Lambda_i$ such that $T_i^{-1}A^{(i)}T_i = \Lambda_i$ for $i = 1, 2, \ldots, N$. In [22], Liang and Zheng assumed that $A^{(1)} = A^{(2)} = \cdots = A^{(N)} = A$ with $A$ being positive stable. They obtained the following upper bound,
$$\frac{\|\Delta\mathcal{X}\|}{\|\mathcal{X}+\Delta\mathcal{X}\|} \le N\,\|A+\Delta A\|_F\,\|\mathcal{A}^{-1}\|_2\left(\frac{\|\Delta\mathcal{D}\|}{\|\mathcal{D}+\Delta\mathcal{D}\|} + \frac{\|\Delta A\|_F}{\|A+\Delta A\|_F}\right), \tag{2.5}$$
by which the authors mentioned deriving a better bound than (2.4).

The equality $\|\mathcal{X}\| = \|\mathrm{vec}(\mathcal{X})\|_2$ makes it clear that the perturbation analysis of (1.1) is closely related to obtaining bounds for the condition number of $\mathcal{A}$, i.e., $\mathrm{cond}(\mathcal{A}) = \|\mathcal{A}\|_2\|\mathcal{A}^{-1}\|_2$. In fact, considering the linear systems of equations $\mathcal{A}x = b$ and $\mathcal{A}(x+\Delta x) = b+\Delta b$, it is already known that
$$\frac{\|\Delta x\|_2}{\|x\|_2} \le \mathrm{cond}(\mathcal{A})\,\frac{\|\Delta b\|_2}{\|b\|_2}.$$
Also, if $(\mathcal{A}+\Delta\mathcal{A})(x+\Delta x) = b+\Delta b$ with $\|\mathcal{A}^{-1}\|_2\|\Delta\mathcal{A}\|_2 < 1$, then
$$\frac{\|\Delta x\|_2}{\|x\|_2} \le \frac{\mathrm{cond}(\mathcal{A})}{1-\mathrm{cond}(\mathcal{A})\frac{\|\Delta\mathcal{A}\|_2}{\|\mathcal{A}\|_2}}\left(\frac{\|\Delta\mathcal{A}\|_2}{\|\mathcal{A}\|_2} + \frac{\|\Delta b\|_2}{\|b\|_2}\right); \tag{2.6}$$

kAk2

we refer readers to [11] for more details about error analysis of perturbed linear system of equations. In the rest of this section, we focus on deriving upper bounds for condition number of A. In particular, we consider the following three cases: • Case I. The matrix A is symmetric. • Case II. The splitting A = H − S is contractive, i.e., for some matrix norm kH−1 Sk < 1.

• Case III. The matrix A has a large skew-symmetric part. More precisely, unlike to Case II, the matrix A has a dominant skew-Hermitian part. The established bound is also valid for the case A is only assumed to be (nonsymmetric) positive definite. We first need to recall the following proposition which is an immediate consequence of Weyl’s Theorem, see [16, Theorem 4.3.1]. Proposition 2.1. Suppose that A, B ∈ Rn×n are two symmetric matrices. Then, λmax (A + B) ≤ λmax (A) + λmax (B), λmin (A + B) ≥ λmin (A) + λmin (B).

Now we establish the following useful proposition, which plays a key role in deriving cheap upper bounds for the condition number of $\mathcal{A}$.

Proposition 2.2. Let $\mathcal{A}$ be given by (1.3), and let $\mathcal{H}$ and $\mathcal{S}$ denote its Hermitian and skew-Hermitian parts, respectively. Then, the following statements hold:

(1) The maximum eigenvalue of $\mathcal{S}^T\mathcal{S}$ and $\|\mathcal{H}\|_2$ are respectively given by
$$\lambda_{\max}(\mathcal{S}^T\mathcal{S}) = \left(\sum_{i=1}^{N}\sqrt{\lambda_{\max}(S_i^TS_i)}\right)^2,$$
$$\|\mathcal{H}\|_2 = \max_{\lambda\in\sigma(\mathcal{H})}|\lambda| = \max\left\{\left|\sum_{i=1}^{N}\lambda_{\min}(H_i)\right|, \left|\sum_{i=1}^{N}\lambda_{\max}(H_i)\right|\right\};$$

(2) the minimum eigenvalue of $\mathcal{S}^T\mathcal{S}$ and $\|\mathcal{H}^{-1}\|_2$ are respectively given by
$$\lambda_{\min}(\mathcal{S}^T\mathcal{S}) = \left(\min_{\lambda^{(i)}\in\sigma(S_i)}\left|\sum_{i=1}^{N}\mathrm{Im}(\lambda^{(i)})\right|\right)^2,$$
$$\|\mathcal{H}^{-1}\|_2^{-1} = \min_{\lambda\in\sigma(\mathcal{H})}|\lambda| = \min_{\lambda^{(i)}\in\sigma(H_i)}\left|\sum_{i=1}^{N}\lambda^{(i)}\right|,$$
where, for $i = 1, 2, \ldots, N$, $H_i = H(A^{(i)})$, $S_i = S(A^{(i)})$ and $\mathrm{Im}(\lambda^{(i)})$ denotes the imaginary part of $\lambda^{(i)}$.

Proof. In view of Proposition 2.1, we can conclude that
$$\lambda_{\max}(\mathcal{S}^T\mathcal{S}) \le \sum_{i=1}^{N}\lambda_{\max}(S_i^TS_i) + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N}\lambda_{\max}(-S_i\otimes S_j).$$
It is not difficult to verify that
$$\lambda_{\max}(-S_i\otimes S_j) = \sqrt{\lambda_{\max}(S_i^TS_i)}\sqrt{\lambda_{\max}(S_j^TS_j)},$$
which implies
$$\lambda_{\max}(\mathcal{S}^T\mathcal{S}) \le \sum_{i=1}^{N}\lambda_{\max}(S_i^TS_i) + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N}\sqrt{\lambda_{\max}(S_i^TS_i)}\sqrt{\lambda_{\max}(S_j^TS_j)} = \left(\sum_{i=1}^{N}\sqrt{\lambda_{\max}(S_i^TS_i)}\right)^2.$$
It is well known that $\imath\sqrt{\lambda_{\max}(S_i^TS_i)}$ is an eigenvalue of $S_i$ ("$\imath$" denotes the imaginary unit), so there exists $x_i$ such that $\|x_i\|_2 = 1$ and
$$S_i x_i = \imath\sqrt{\lambda_{\max}(S_i^TS_i)}\,x_i, \qquad i = 1, 2, \ldots, N.$$
Let $\bar{x} = x_N\otimes x_{N-1}\otimes\cdots\otimes x_1$; it can be seen that
$$\bar{x}^T\mathcal{S}^T\mathcal{S}\bar{x} = \sum_{i=1}^{N}\lambda_{\max}(S_i^TS_i) + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N}\sqrt{\lambda_{\max}(S_i^TS_i)}\sqrt{\lambda_{\max}(S_j^TS_j)} = \left(\sum_{i=1}^{N}\sqrt{\lambda_{\max}(S_i^TS_i)}\right)^2.$$
Therefore, we may conclude the first assertion from the fact that $\lambda_{\max}(\mathcal{S}^T\mathcal{S}) \ge \bar{x}^T\mathcal{S}^T\mathcal{S}\bar{x}$ with $\|\bar{x}\|_2 = 1$. The expressions for the extreme eigenvalues of $\mathcal{H}$ (in modulus) can be verified using the symmetry of $\mathcal{H}$. Notice that $\mathcal{S}^T\mathcal{S} = -\mathcal{S}^2 = (\imath\mathcal{S})^2$, so we can easily conclude the expression for the minimum eigenvalue of $\mathcal{S}^T\mathcal{S}$ using the fact that $\imath\mathcal{S}$ is a Hermitian matrix. □

Therefore, we may conclude the first assertion from the fact that λmax (S T S) ≥ x ¯T S T S x ¯ with k¯ xk2 = 1. The expressions for extreme eigenvalues of H (in modules) can be verified using the symmetry of H. Notice that S T S = −S 2 = (ıS)2 , so we can easily conclude the expression for the minimum eigenvalue of S T S using the fact that ıS is a Hermitian matrix.  The following two remarks are direct conclusions of the previous proposition. Remark 2.3. From Proposition 2.2, it can be seen that if matrices A(i) s for i = 1, 2, . . . , N are positive definite then λmax (H) =

N X

λmax (Hi ),

i=1

λmin (H) =

N X i=1

λmin (Hi ).


Remark 2.4. Note that in the case that $\mathcal{A}$ is symmetric, by Proposition 2.2 we have
$$\mathrm{cond}(\mathcal{A}) = \frac{\max_{\lambda\in\sigma(\mathcal{A})}|\lambda|}{\min_{\lambda\in\sigma(\mathcal{A})}|\lambda|} = \frac{\max\big\{\big|\sum_{i=1}^{N}\lambda_{\min}(H_i)\big|, \big|\sum_{i=1}^{N}\lambda_{\max}(H_i)\big|\big\}}{\min_{\lambda^{(i)}\in\sigma(H_i)}\big|\sum_{i=1}^{N}\lambda^{(i)}\big|}.$$
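As a sanity check of Remark 2.4, the following NumPy sketch (our own illustration, with small random symmetric matrices $A^{(i)}$) compares the right-hand side of the formula above with the condition number of $\mathcal{A}$ computed directly:

```python
import numpy as np
from functools import reduce
from itertools import product

def kron_sum(mats):
    """Assemble the Kronecker-structured matrix A of (1.3)."""
    eyes = [np.eye(m.shape[0]) for m in mats]
    return sum(reduce(np.kron, [mats[j] if k == j else eyes[k]
                                for k in reversed(range(len(mats)))])
               for j in range(len(mats)))

rng = np.random.default_rng(0)
A = [(lambda B: B + B.T)(rng.random((m, m))) for m in (3, 4, 5)]
spectra = [np.linalg.eigvalsh(Ai) for Ai in A]

num = max(abs(sum(s.min() for s in spectra)), abs(sum(s.max() for s in spectra)))
den = min(abs(sum(c)) for c in product(*spectra))  # min |l_1 + ... + l_N|
assert np.isclose(num / den, np.linalg.cond(kron_sum(A)))
```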

Now, we consider “Case II” and derive an upper bound for the condition number of A.

Proposition 2.5. Let $\mathcal{A}$ be defined as in (1.3), and suppose that
$$\sum_{i=1}^{N}\|S_i\|_2 < \min_{\lambda\in\sigma(\mathcal{H})}|\lambda|, \tag{2.7}$$
where $S_i$ denotes the skew-symmetric part of $A^{(i)}$ ($i = 1, 2, \ldots, N$). Then,
$$\mathrm{cond}(\mathcal{A}) \le \frac{1}{(1-\gamma)\min_{\lambda^{(i)}\in\sigma(H_i)}\big|\sum_{i=1}^{N}\lambda^{(i)}\big|}\left(\beta + \sum_{i=1}^{N}\|S_i\|_2\right),$$
where $\beta = \max\big\{\big|\sum_{i=1}^{N}\lambda_{\min}(H_i)\big|, \big|\sum_{i=1}^{N}\lambda_{\max}(H_i)\big|\big\}$ and $\gamma = \dfrac{\sum_{i=1}^{N}\|S_i\|_2}{\min_{\lambda^{(i)}\in\sigma(H_i)}\big|\sum_{i=1}^{N}\lambda^{(i)}\big|}$.

Proof. By Proposition 2.1, it can be seen that the assumption (2.7) is equivalent to $\|\mathcal{H}^{-1}\|_2\|\mathcal{S}\|_2 < 1$. Consequently, we have
$$\|\mathcal{A}^{-1}\|_2 = \big\|(I+\mathcal{H}^{-1}\mathcal{S})^{-1}\mathcal{H}^{-1}\big\|_2 \le \frac{\|\mathcal{H}^{-1}\|_2}{1-\|\mathcal{H}^{-1}\mathcal{S}\|_2} \le \frac{\|\mathcal{H}^{-1}\|_2}{1-\|\mathcal{H}^{-1}\|_2\|\mathcal{S}\|_2}.$$
Now we may conclude the assertion from Proposition 2.1 and the fact that $\|\mathcal{A}\|_2 = \|\mathcal{H}+\mathcal{S}\|_2 \le \|\mathcal{H}\|_2 + \|\mathcal{S}\|_2$. □

We end this part by presenting an upper bound for the condition number of $\mathcal{A}$ in the third case. To do so, we need to recall the following proposition, which can be immediately concluded from [12, Proposition 2.1].

Proposition 2.6. Let $\lambda_{\max}$ and $\lambda_{\min}$ be the largest and smallest eigenvalues of $\mathcal{H}$. If
$$\lambda_{\max}\lambda_{\min} > -\lambda_{\max}(\mathcal{S}^T\mathcal{S}),$$
then there exists an $\alpha$ for which $\|\tilde{\mathcal{S}}^{-1}\|_2\|\tilde{\mathcal{H}}\|_2 < 1$, where $\tilde{\mathcal{H}} = \mathcal{H}-\alpha I$ and $\tilde{\mathcal{S}} = \alpha I + \mathcal{S}$. In particular, the parameter $\alpha$ can be chosen as
$$\alpha_c = \frac{\lambda_{\max}+\lambda_{\min}}{2}, \tag{2.8}$$
for which the value of $\|\tilde{\mathcal{S}}^{-1}\|_2\|\tilde{\mathcal{H}}\|_2$ is minimized.

Notice that, in view of Proposition 2.2, the condition in Proposition 2.6 is equivalent to
$$\lambda_{\max}(\mathcal{H})\lambda_{\min}(\mathcal{H}) > -\left(\sum_{i=1}^{N}\|S_i\|_2\right)^2.$$


We comment that for the case that $\mathcal{A}$ is positive definite (or, equivalently, $\mathcal{H}$ is symmetric positive definite), the above condition is always satisfied. Therefore, the following remark can be used for the case that $\mathcal{A}$ is positive definite.

Remark 2.7. Under the same hypotheses as in Proposition 2.6 and for its provided parameter $\alpha$, we have
$$\mathrm{cond}(\mathcal{A}) \le \frac{\varphi^{-\frac{1}{2}}}{1-\varphi^{-\frac{1}{2}}\tilde{\beta}}\left(\beta + \sum_{i=1}^{N}\|S_i\|_2\right),$$
where $\varphi = \alpha^2 + \Big(\min_{\lambda^{(i)}\in\sigma(S_i)}\big|\sum_{i=1}^{N}\mathrm{Im}(\lambda^{(i)})\big|\Big)^2$, $\beta = \max\big\{\big|\sum_{i=1}^{N}\lambda_{\min}(H_i)\big|, \big|\sum_{i=1}^{N}\lambda_{\max}(H_i)\big|\big\}$ and $\tilde{\beta} = \max\big\{\big|\sum_{i=1}^{N}\lambda_{\min}(H_i)-\alpha\big|, \big|\sum_{i=1}^{N}\lambda_{\max}(H_i)-\alpha\big|\big\}$.

Proof. Following a strategy similar to that used in the proof of Proposition 2.5, we get
$$\mathrm{cond}(\mathcal{A}) \le \frac{\|\tilde{\mathcal{S}}^{-1}\|_2}{1-\|\tilde{\mathcal{S}}^{-1}\|_2\|\tilde{\mathcal{H}}\|_2}\left(\|\mathcal{H}\|_2+\|\mathcal{S}\|_2\right),$$
where $\mathcal{A} = \mathcal{H}+\mathcal{S}$, $\tilde{\mathcal{H}} = \mathcal{H}-\alpha I$ and $\tilde{\mathcal{S}} = \mathcal{S}+\alpha I$. It can be seen that $\tilde{\mathcal{S}}^T\tilde{\mathcal{S}} = \alpha^2 I - \mathcal{S}^2$. Hence, it is deduced that
$$\lambda_{\min}(\tilde{\mathcal{S}}^T\tilde{\mathcal{S}}) = \alpha^2 + \lambda_{\min}(-\mathcal{S}^2) = \alpha^2 + \lambda_{\min}(\mathcal{S}^T\mathcal{S}). \qquad\square$$

The bound provided for the condition number of $\mathcal{A}$ in the above remark is numerically examined for Example 5.1, where the value of $\alpha$ is computed by (2.8).

Remark 2.8. Considering Remark 2.7, we comment that if $\mathcal{S}$ is singular then $\varphi = \alpha^2$. Evidently, this case always happens when the dimension of $\mathcal{S}$ is odd.

3. Global iterative schemes and their flexible variants

In the current section, we discuss applying global variants of iterative schemes in tensor form based on the global Hessenberg process, which incorporates the tensor form of the Arnoldi process [7] as a special case. To this end, we first need to recall the definition of the contracted product between two tensors. Then, we propose the global Hessenberg and CMRH iterative methods in a tensor framework. Moreover, we briefly point out the application of flexible global iterative schemes to solve the mentioned STE.

3.1. Contracted product. In [7], the $\odot^N$ product between two $N$-mode tensors $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_{N-1}\times I_N}$ and $\mathcal{Y}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_{N-1}\times\tilde{I}_N}$ is defined as the $I_N\times\tilde{I}_N$ matrix whose $(i,j)$-th entry is
$$[\mathcal{X}\odot^N\mathcal{Y}]_{ij} = \mathrm{tr}\big(\mathcal{X}_{::\cdots:i}\odot^{N-1}\mathcal{Y}_{::\cdots:j}\big), \qquad N = 3, 4, \ldots,$$
where
$$X\odot^2 Y = X^TY, \qquad X\in\mathbb{R}^{I_1\times I_2},\ Y\in\mathbb{R}^{I_1\times\tilde{I}_2}.$$
We comment that the $\odot^N$ product between $\mathcal{X}\in\mathbb{R}^{I_1\times\cdots\times I_{N-1}\times I_N}$ and $\mathcal{Y}\in\mathbb{R}^{I_1\times\cdots\times I_{N-1}\times\tilde{I}_N}$ is a reformulation of a special case of the contracted product [9]. More precisely, the product $\mathcal{X}\odot^N\mathcal{Y}$ is the contracted product of the $N$-mode tensors $\mathcal{X}$ and $\mathcal{Y}$ along their first $N-1$ modes. It is immediate to see that for $\mathcal{X}, \mathcal{Y}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$, we have
$$\langle\mathcal{X},\mathcal{Y}\rangle = \mathrm{tr}\big(\mathcal{X}\odot^N\mathcal{Y}\big), \qquad N = 2, 3, \ldots,$$
and
$$\|\mathcal{X}\|^2 = \mathrm{tr}\big(\mathcal{X}\odot^N\mathcal{X}\big) = \mathcal{X}\odot^{N+1}\mathcal{X}, \tag{3.1}$$
for $\mathcal{X}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$.
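In NumPy terms, this product amounts to a single tensordot contraction over the first $N-1$ modes; a minimal sketch (our own illustration, not the authors' code):

```python
import numpy as np

def contracted_product(X, Y):
    """The product of two N-mode tensors sharing their first N-1 modes:
    an I_N x Itilde_N matrix of inner products of frontal slices."""
    axes = list(range(X.ndim - 1))
    return np.tensordot(X, Y, axes=(axes, axes))

X = np.random.rand(3, 4, 5)
Y = np.random.rand(3, 4, 6)
P = contracted_product(X, Y)                                  # 5 x 6 matrix
assert np.isclose(P[1, 2], np.sum(X[:, :, 1] * Y[:, :, 2]))   # tr(X_{::1}^T Y_{::2})
```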


We end the current subsection by recalling the following useful proposition from [7].

Proposition 3.1. Suppose that $\mathcal{B}\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N\times m}$ is an $(N+1)$-mode tensor with the column tensors $\mathcal{B}_1, \mathcal{B}_2, \ldots, \mathcal{B}_m\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ and $z = (z_1, z_2, \ldots, z_m)^T\in\mathbb{R}^m$. For an arbitrary $(N+1)$-mode tensor $\mathcal{A}$ with $N$-mode column tensors $\mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_m$, the following statement holds:
$$\mathcal{A}\odot^{N+1}\big(\mathcal{B}\,\bar{\times}_{N+1}\, z\big) = \big(\mathcal{A}\odot^{N+1}\mathcal{B}\big)z. \tag{3.2}$$

3.2. Tensor form of the Hessenberg process. Considering the linear operator
$$\mathcal{M}: \mathbb{R}^{I_1\times I_2\times\cdots\times I_N}\to\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}, \qquad \mathcal{X}\mapsto\mathcal{M}(\mathcal{X}) := \mathcal{X}\times_1 A^{(1)} + \mathcal{X}\times_2 A^{(2)} + \cdots + \mathcal{X}\times_N A^{(N)}, \tag{3.3}$$
the STE (1.1) can be written as follows:
$$\mathcal{M}(\mathcal{X}) = \mathcal{D}. \tag{3.4}$$
We consider the tensor Krylov subspace
$$\mathcal{K}_m(\mathcal{M}, \mathcal{R}_0) = \mathrm{span}\{\mathcal{R}_0, \mathcal{M}(\mathcal{R}_0), \ldots, \mathcal{M}^{m-1}(\mathcal{R}_0)\}, \tag{3.5}$$
where $\mathcal{R}_0 = \mathcal{D} - \mathcal{M}(\mathcal{X}_0)$ and $\mathcal{X}_0\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ is a given initial guess for the solution of (1.1). As a natural extension, the tensor form of the Hessenberg process (Hessenberg−BTF) produces a basis $\{\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}_m\}$ for $\mathcal{K}_m(\mathcal{M}, \mathcal{R}_0)$ such that
$$\langle\mathcal{V}_{i+1}, \mathcal{Y}_j\rangle = 0, \qquad j = 1, 2, \ldots, i,$$
where the tensors $\mathcal{Y}_i$ are chosen to be linearly independent.

Algorithm 1: Hessenberg−BTF process.
1: Input the initial guess $\mathcal{X}_0$ and the parameter $m$;
2: Set $\mathcal{R}_0 = \mathcal{D} - \mathcal{M}(\mathcal{X}_0)$, $\beta = \langle\mathcal{R}_0, \mathcal{Y}_1\rangle$ and $\mathcal{V}_1 = \mathcal{R}_0/\beta$;
3: for $j = 1, 2, \ldots, m$ do
4:   $\mathcal{W} = \mathcal{M}(\mathcal{V}_j)$;
5:   for $i = 1, 2, \ldots, j$ do
6:     $h_{ij} = \langle\mathcal{Y}_i, \mathcal{W}\rangle$;
7:     $\mathcal{W} = \mathcal{W} - h_{ij}\mathcal{V}_i$;
8:   end for
9:   $h_{j+1,j} = \langle\mathcal{Y}_{j+1}, \mathcal{W}\rangle$. If $h_{j+1,j} = 0$, then stop;
10:  $\mathcal{V}_{j+1} = \mathcal{W}/h_{j+1,j}$;
11: end for
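A compact NumPy sketch of Algorithm 1 (our own illustration; `M` is any callable implementing the operator $\mathcal{M}$, and choosing the freshly computed basis tensors as the $\mathcal{Y}_i$ would recover the Arnoldi−BTF process; here the $\mathcal{Y}_i$ are supplied as a prescribed list):

```python
import numpy as np

def hessenberg_btf(M, D, X0, Y, m):
    """Hessenberg-BTF process (Algorithm 1): builds tensors V_1..V_{m+1}
    spanning the tensor Krylov subspace and the Hessenberg entries h_ij,
    using <., Y_i> in place of orthogonal projections."""
    R0 = D - M(X0)
    beta = np.sum(R0 * Y[0])                 # <R0, Y1>
    V = [R0 / beta]
    H = np.zeros((m + 1, m))
    for j in range(m):
        W = M(V[j])
        for i in range(j + 1):
            H[i, j] = np.sum(Y[i] * W)       # h_ij = <Y_i, W>
            W = W - H[i, j] * V[i]
        H[j + 1, j] = np.sum(Y[j + 1] * W)
        if H[j + 1, j] == 0:                 # breakdown
            break
        V.append(W / H[j + 1, j])
    return V, H, beta
```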

It is obvious that if we set $\mathcal{Y}_i = \mathcal{V}_i$ for $i = 1, 2, \ldots$, then the Hessenberg−BTF process reduces to the Arnoldi−BTF process [7]. In particular, in Subsection 3.4 and the numerical experiments, each tensor $\mathcal{Y}_i$ is chosen to have only one nonzero entry, equal to one. The idea is an extension of the process exploited in [29].

Using the values of $h_{ij}$ computed in Lines 6 and 9 of Algorithm 1, we construct an $(m+1)\times m$ upper Hessenberg matrix $\bar{H}_m = [h_{ij}]$ and define an $m\times m$ matrix $H_m$ by removing the last row of $\bar{H}_m$. Suppose that $\tilde{\mathcal{V}}_m$ and $\tilde{\mathcal{Y}}_m$ are $(N+1)$-mode tensors with column tensors $\mathcal{V}_i$ and $\mathcal{Y}_i$ for $i = 1, 2, \ldots, m$. It can be seen that $\tilde{\mathcal{Y}}_m\odot^{N+1}\tilde{\mathcal{V}}_m = I_m$. The following theorem can be proved with a strategy quite similar to that used in [7, Theorem 3.1]; hence its proof is omitted.

Theorem 3.2. Let $\tilde{\mathcal{W}}_m$ be the $(N+1)$-mode tensor with column tensors $\mathcal{W}_j := \mathcal{M}(\mathcal{V}_j)$ for $j = 1, \ldots, m$. Then the following statements hold:
$$\tilde{\mathcal{W}}_m = \tilde{\mathcal{V}}_{m+1}\times_{N+1}\bar{H}_m^T, \tag{3.6}$$
$$\tilde{\mathcal{W}}_m = \tilde{\mathcal{V}}_m\times_{N+1}H_m^T + h_{m+1,m}\,\mathcal{Z}\times_{N+1}E_m, \tag{3.7}$$
in which $\mathcal{Z}$ is an $(N+1)$-mode tensor with $m$ column tensors $0, \ldots, 0, \mathcal{V}_{m+1}$ and $E_m$ is an $m\times m$ matrix of the form $E_m = [0, \ldots, 0, e_m]$, where $e_m$ is the $m$th column of the identity matrix of order $m$.


The following proposition is a direct conclusion of the previous theorem.

Proposition 3.3. Suppose that $\tilde{\mathcal{Y}}_m$ and $\tilde{\mathcal{W}}_m$ are $(N+1)$-mode tensors with the column tensors $\mathcal{Y}_j$ and $\mathcal{W}_j := \mathcal{M}(\mathcal{V}_j)$, where $\mathcal{V}_j$ is produced by Algorithm 1 for $j = 1, 2, \ldots, m$. Then the following equality holds:
$$\tilde{\mathcal{Y}}_m\odot^{N+1}\tilde{\mathcal{W}}_m = H_m. \tag{3.8}$$

Proof. Evidently, Eq. (3.8) can be deduced from (3.7) and the fact that $\tilde{\mathcal{Y}}_m\odot^{N+1}\tilde{\mathcal{V}}_m = I_m$. □



3.3. Global Hessenberg−BTF and CMRH−BTF methods. For a given initial guess $\mathcal{X}_0\in\mathbb{R}^{I_1\times I_2\times\cdots\times I_N}$ with the corresponding residual $\mathcal{R}_0$, we seek the $m$-th approximate solution such that
$$\mathcal{X}_m \in \mathcal{X}_0 + \mathcal{K}_m(\mathcal{M}, \mathcal{R}_0). \tag{3.9}$$
Note that $\mathcal{V}_1, \mathcal{V}_2, \ldots, \mathcal{V}_m$, produced via Algorithm 1, form a basis for $\mathcal{K}_m(\mathcal{M}, \mathcal{R}_0)$. Hence, from (3.9), we have
$$\mathcal{X}_m = \mathcal{X}_0 + \tilde{\mathcal{V}}_m\,\bar{\times}_{N+1}\, y_m, \tag{3.10}$$
$$\mathcal{R}_m = \mathcal{D} - \mathcal{M}(\mathcal{X}_m) = \mathcal{R}_0 - \tilde{\mathcal{W}}_m\,\bar{\times}_{N+1}\, y_m,$$
where $\tilde{\mathcal{W}}_m$ is defined as in Proposition 3.3, $\mathcal{V}_1 = \mathcal{R}_0/\beta$, and the unknown vector $y_m\in\mathbb{R}^m$ needs to be determined.

• Global Hessenberg−BTF method: One may find the vector $y_m$ such that
$$\langle\mathcal{R}_m, \mathcal{Y}_i\rangle = 0, \qquad i = 1, 2, \ldots, m. \tag{3.11}$$
In view of Proposition 3.1, the orthogonality condition (3.11) implies that
$$\big(\tilde{\mathcal{Y}}_m\odot^{N+1}\tilde{\mathcal{W}}_m\big)\, y_m = \tilde{\mathcal{Y}}_m\odot^{N+1}\mathcal{R}_0,$$
where $\tilde{\mathcal{Y}}_m$ is an $(N+1)$-mode tensor with column tensors $\mathcal{Y}_i$, $i = 1, 2, \ldots, m$. The facts $\tilde{\mathcal{Y}}_m\odot^{N+1}\tilde{\mathcal{W}}_m = H_m$ and $\tilde{\mathcal{Y}}_m\odot^{N+1}\mathcal{R}_0 = \tilde{\mathcal{Y}}_m\odot^{N+1}\big(\tilde{\mathcal{V}}_m\,\bar{\times}_{N+1}\,\beta e_1\big) = \beta e_1$ imply that $y_m$ is the solution of the linear system of equations
$$H_m y_m = \beta e_1. \tag{3.12}$$
Straightforward computations show that
$$\mathcal{R}_m = \mathcal{R}_0 - \tilde{\mathcal{V}}_m\,\bar{\times}_{N+1}(\beta e_1) - h_{m+1,m}\,\mathcal{Z}\,\bar{\times}_{N+1}\big(y_m^{(m)} e_m\big),$$
where $y_m^{(m)}$ denotes the last component of $y_m$. It is not difficult to verify that the above equation reduces to
$$\mathcal{R}_m = -h_{m+1,m}\, y_m^{(m)}\,\mathcal{V}_{m+1}. \tag{3.13}$$
Notice that the above relation is satisfied for any linearly independent set of tensors $\{\mathcal{Y}_i\}_{i=1,2,\ldots}$. Therefore, the expression (3.13) covers the previously established result for the global FOM−BTF method [7], and it provides a computationally cheap expression for the residual norm.

• Global CMRH−BTF method: In view of (3.6) and [7, Lemma 2.1], it is observed that
$$\mathcal{R}_m = \mathcal{R}_0 - \tilde{\mathcal{W}}_m\,\bar{\times}_{N+1}\, y_m = \tilde{\mathcal{V}}_{m+1}\,\bar{\times}_{N+1}\big(\beta e_1 - \bar{H}_m y_m\big).$$

In the global CMRH−BTF method, the vector $y_m$ is determined such that
$$y_m = \arg\min_{y\in\mathbb{R}^m}\big\|\beta e_1 - \bar{H}_m y\big\|_2.$$
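This small $(m+1)\times m$ Hessenberg least-squares problem can be solved with $m$ plane (Givens) rotations; a minimal NumPy sketch (our own illustration):

```python
import numpy as np

def hessenberg_lstsq(Hbar, beta):
    """Solve min_y || beta*e_1 - Hbar y ||_2 for the (m+1) x m upper
    Hessenberg matrix Hbar via Givens (plane) rotations."""
    m = Hbar.shape[1]
    R = Hbar.astype(float).copy()
    g = np.zeros(m + 1)
    g[0] = beta
    for j in range(m):
        a, b = R[j, j], R[j + 1, j]
        r = np.hypot(a, b)
        G = np.array([[a / r, b / r], [-b / r, a / r]])
        R[j:j + 2, j:] = G @ R[j:j + 2, j:]   # zero out R[j+1, j]
        g[j:j + 2] = G @ g[j:j + 2]
    return np.linalg.solve(np.triu(R[:m, :m]), g[:m])
```

After the sweep, the last entry of the rotated right-hand side gives the residual norm of the small problem for free.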

Notice that, unlike in the GMRES−BTF method, the norm $\|\mathcal{R}_m\|$ is not minimized (over $\mathcal{X}_0 + \mathcal{K}_m(\mathcal{M}, \mathcal{R}_0)$) since $\tilde{\mathcal{V}}_{m+1}\odot^{N+1}\tilde{\mathcal{V}}_{m+1}$ is not an identity matrix. In the numerical experiments, the corresponding least-squares problem is solved by using plane rotations [30, Chapter 6].

3.4. Flexible variants of global methods. In the rest of the paper, we only consider the case $N = 3$. Therefore, the linear operator $\mathcal{M}(\cdot)$ takes the following form:
$$\mathcal{M}(\mathcal{X}) = \mathcal{X}\times_1 A^{(1)} + \mathcal{X}\times_2 A^{(2)} + \mathcal{X}\times_3 A^{(3)}.$$

Here we briefly mention the implementation of some global schemes in tensor form in conjunction with a preconditioner consisting of solving the same STE inexactly. To this end, we apply the tensor form of the BiCGStab algorithm [8], which is a global version of the BiCGStab method for solving linear systems of equations [30, Chapter 7]. The resulting algorithms are referred to as flexible global schemes. The idea is a natural extension of the results given in [30] and [34]; therefore, we leave the details to conscientious readers.

Algorithm 2: Global BiCGStab−BTF
1: Input the coefficients $A^{(1)}, A^{(2)}$ and $A^{(3)}$, the right-hand side $\mathcal{D}$, $\mathcal{X}_0$ and the tolerance $\varepsilon$;
2: Compute $\mathcal{R}_0 = \mathcal{D} - \mathcal{M}(\mathcal{X}_0)$ and choose an arbitrary tensor $\mathcal{R}_0^*$;
3: $\mathcal{P}_0 = \mathcal{R}_0$;
4: for $j = 0, 1, \ldots,$ until convergence do
5:   $\alpha_j = \langle\mathcal{R}_j, \mathcal{R}_0^*\rangle / \langle\mathcal{M}(\mathcal{P}_j), \mathcal{R}_0^*\rangle$;
6:   $\mathcal{S}_j = \mathcal{R}_j - \alpha_j\mathcal{M}(\mathcal{P}_j)$;
7:   $\omega_j = \langle\mathcal{M}(\mathcal{S}_j), \mathcal{S}_j\rangle / \langle\mathcal{M}(\mathcal{S}_j), \mathcal{M}(\mathcal{S}_j)\rangle$;
8:   $\mathcal{X}_{j+1} = \mathcal{X}_j + \alpha_j\mathcal{P}_j + \omega_j\mathcal{S}_j$;
9:   $\mathcal{R}_{j+1} = \mathcal{S}_j - \omega_j\mathcal{M}(\mathcal{S}_j)$;
10:  if $\|\mathcal{R}_{j+1}\|/\|\mathcal{R}_0\| < \varepsilon$ then
11:    break;
12:  end if
13:  $\beta_j = \big(\langle\mathcal{R}_{j+1}, \mathcal{R}_0^*\rangle / \langle\mathcal{R}_j, \mathcal{R}_0^*\rangle\big)\times\big(\alpha_j/\omega_j\big)$;
14:  $\mathcal{P}_{j+1} = \mathcal{R}_{j+1} + \beta_j\big(\mathcal{P}_j - \omega_j\mathcal{M}(\mathcal{P}_j)\big)$;
15: end for

Algorithm 3: Flexible Arnoldi−BTF process.
1: Set $\mathcal{V}_1 = \mathcal{D}/\|\mathcal{D}\|$;
2: for $j = 1, \ldots, k$ do
3:   Apply Algorithm 2 to find $\mathcal{Z}_j$ as an approximate solution of $\mathcal{M}(\mathcal{Z}) = \mathcal{V}_j$ with respect to a prescribed tolerance $\varepsilon$; % For the Arnoldi−BTF process, skip this step and set $\mathcal{Z}_j = \mathcal{V}_j$.
4:   $\mathcal{V} = \mathcal{M}(\mathcal{Z}_j)$;
5:   for $i = 1, \ldots, j$ do
6:     $h_{i,j} = \langle\mathcal{V}, \mathcal{V}_i\rangle$;
7:     $\mathcal{V} = \mathcal{V} - h_{i,j}\mathcal{V}_i$;
8:   end for
9:   $h_{j+1,j} = \|\mathcal{V}\|$;
10:  if $h_{j+1,j} > 0$ then
11:    $\mathcal{V}_{j+1} = \mathcal{V}/h_{j+1,j}$;
12:  else
13:    exit;
14:  end if
15: end for


As pointed out earlier, in the following we apply the Hessenberg−BTF process such that, at each step, the $i$-th tensor $\mathcal{Y}_i$ has only one nonzero entry, equal to one. To make the performance of the global Hessenberg−BTF and CMRH−BTF methods and their flexible variants more stable, we need to find a triple of indices at which the maximum absolute value of a given tensor is attained. The idea is taken from the maximum strategy used in the global CMRH method for solving linear systems with multiple right-hand sides [17] and the pivoting strategy exploited in the CMRH method for solving linear systems of equations [29]. Therefore, we refer to the corresponding algorithm as the pivoting strategy for a 3-mode tensor, which is given by Algorithm 4. We summarize the flexible Hessenberg−BTF process with maximum strategy in Algorithm 5, and the pseudo-code for the proposed global iterative schemes in Algorithm 6.

Algorithm 4: Pivoting strategy for a 3-mode tensor
Require: Input a tensor $\mathcal{R}\in\mathbb{R}^{n_1\times n_2\times n_3}$;
Ensure: Triple $(i_0, j_0, k_0)$;
1: $[\sim, j] = \max(|\mathrm{vec}(\mathcal{R})|)$;
2: $k_0 = \lceil j/(n_1 n_2)\rceil$;
3: $\ell = j - (k_0-1)(n_1 n_2)$;
4: $j_0 = \lceil \ell/n_1\rceil$;
5: $i_0 = \ell - (j_0-1)n_1$;
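In NumPy, Algorithm 4 amounts to a single unravel of the maximizing index (a sketch; note that the paper's indices are 1-based while NumPy's are 0-based):

```python
import numpy as np

def pivot_3mode(R):
    """Return the (0-based) triple (i0, j0, k0) at which |R| is maximal,
    using column-major (MATLAB-style) vectorization as in Algorithm 4."""
    j = np.argmax(np.abs(R.flatten(order='F')))
    return np.unravel_index(j, R.shape, order='F')
```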

Algorithm 5: Flexible Hessenberg−BTF process with maximum strategy.
Require: Input an $n_1\times n_2\times n_3$ real tensor $\mathcal{V} = \mathcal{R}_0$ and the restart parameter $m$.
Ensure: The upper Hessenberg matrix $\bar{H}_m = [h_{i,j}]_{(m+1)\times m}$ and the 4-mode tensor $\tilde{\mathcal{Z}}_m$ with the column tensors $\mathcal{Z}_i$.
1: Determine the triple $(i_0, j_0, k_0)$ using Algorithm 4 for the input $\mathcal{V}$;
2: Set $\beta = \mathcal{V}_{i_0,j_0,k_0}$; $\mathcal{V}_1 = \mathcal{V}/\beta$; $p_{1,1} = i_0$; $p_{1,2} = j_0$; $p_{1,3} = k_0$;
3: for $j = 1, \ldots, m$ do
4:   Apply Algorithm 2 to find $\mathcal{Z}_j$ as an approximate solution of $\mathcal{M}(\mathcal{Z}) = \mathcal{V}_j$ with respect to a prescribed tolerance $\varepsilon$; % For the Hessenberg−BTF process, skip this step and set $\mathcal{Z}_j = \mathcal{V}_j$.
5:   $\mathcal{U} = \mathcal{M}(\mathcal{Z}_j)$;
6:   for $i = 1, \ldots, j$ do
7:     $h_{i,j} = \mathcal{U}_{p_{i,1},p_{i,2},p_{i,3}}$;
8:     $\mathcal{U} = \mathcal{U} - h_{i,j}\mathcal{V}_i$;
9:   end for
10:  Determine the triple $(i_0, j_0, k_0)$ using Algorithm 4 for the input $\mathcal{U}$;
11:  Set $h_{j+1,j} = \mathcal{U}_{i_0,j_0,k_0}$; $\mathcal{V}_{j+1} = \mathcal{U}/h_{j+1,j}$; $p_{j+1,1} = i_0$; $p_{j+1,2} = j_0$; $p_{j+1,3} = k_0$;
12: end for


Algorithm 6: Global schemes based on tensor forms.
Require: Input $\mathcal{X}_0$ as a zero tensor; Itermax: the maximum number of allowed iterations; tolerance $\epsilon > 0$; the restart parameter $m$.
1: for $k = 1, \ldots,$ Itermax do
2:   Compute $\mathcal{R}_0 = \mathcal{D} - \mathcal{M}(\mathcal{X}_0)$;
3:   Compute $\bar{H}_m$ and $\tilde{\mathcal{Z}}_m$ by applying Algorithm 3 to $\mathcal{K}_m(\mathcal{M}, \mathcal{R}_0)$ (FFOM-BTF/FGMRES-BTF) or Algorithm 5 to $\mathcal{K}_m(\mathcal{M}, \mathcal{R}_0)$ (FHess-BTF/FCMRH-BTF);
4:   Solve the problem $H_m y_m = \beta e_1$ (FFOM-BTF/FHess-BTF) or $y_m = \arg\min_{y\in\mathbb{R}^m}\|\bar{H}_m y - \beta e_1\|$ (FGMRES-BTF/FCMRH-BTF);
5:   Compute the approximate solution $\mathcal{X}_m = \mathcal{X}_0 + \tilde{\mathcal{Z}}_m\,\bar{\times}_4\, y_m$;
6:   Compute $\mathcal{R}_m = \mathcal{D} - \mathcal{M}(\mathcal{X}_m)$;
7:   if $\|\mathcal{R}_m\|/\|\mathcal{D}\| \le \epsilon$ then
8:     return $\mathcal{X}_m$;
9:   else
10:    $\mathcal{X}_0 = \mathcal{X}_m$, $\mathcal{R}_0 = \mathcal{R}_m$ and go to 3;
11:  end if
12: end for

4. Flexible iterative schemes to solve ill-posed problems

Consider the case that the right-hand side of the mentioned STE contains an error $\mathcal{E}\in\mathbb{R}^{n_1\times n_2\times n_3}$, called "noise", i.e., $\mathcal{D} = \tilde{\mathcal{D}} + \mathcal{E}$, where $\tilde{\mathcal{D}}$ is the unknown, unavailable noise-free right-hand side. Let $\tilde{\mathcal{X}}$ be the minimum norm solution of the unavailable STE with the error-free right-hand side, i.e.,
$$\mathcal{X}\times_1 A^{(1)} + \mathcal{X}\times_2 A^{(2)} + \mathcal{X}\times_3 A^{(3)} = \tilde{\mathcal{D}}. \tag{4.1}$$
Here the $n_i\times n_i$ real matrices $A^{(i)}$ (for $i = 1, 2, 3$) and $\mathcal{D}\in\mathbb{R}^{n_1\times n_2\times n_3}$ are known, and $\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is the unknown tensor to be determined. In this part, the matrices $A^{(1)}, A^{(2)}$ and $A^{(3)}$ are assumed to be extremely ill-conditioned, which makes it inefficient to approximate $\tilde{\mathcal{X}}$ by solving (4.1) with right-hand side $\mathcal{D}$ using the available global iterative methods without any regularization. Basically, using the idea of Tikhonov regularization, one remedy is to solve the minimization problem
$$\min_{\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}}\left\{\|\mathcal{M}(\mathcal{X}) - \mathcal{D}\|^2 + \lambda\|\mathcal{X}\|^2\right\}, \tag{4.2}$$
instead of solving $\mathcal{M}(\mathcal{X}) = \mathcal{D}$, in which $\lambda > 0$ is called the regularization parameter. Now we briefly describe how the tensor forms of the Arnoldi and Hessenberg processes can be applied to solve the large-scale Tikhonov regularization problem (4.2). The corresponding algorithms are called AT−BTF and HT−BTF, and their flexible versions are named FAT−BTF and FHT−BTF. After briefly pointing out the derivation of the algorithms, we summarize the iterative schemes in Algorithms 7 and 8.

Here we first explain how FAT−BTF can be constructed. Assume that $k$ steps of FAT−BTF were performed. Let $\tilde{\mathcal{W}}_k$ and $\tilde{\mathcal{V}}_{k+1}$ be tensors of mode four with column tensors $\mathcal{M}(\mathcal{Z}_j)$ and $\mathcal{V}_i$, with $j = 1, 2, \ldots, k$ and $i = 1, 2, \ldots, k+1$. It is not difficult to verify that the following statement holds:
$$\tilde{\mathcal{W}}_k = \tilde{\mathcal{V}}_{k+1}\times_4\bar{H}_k^T. \tag{4.3}$$

Without loss of generality, we assume that $\mathcal{X}_0$ is zero. Performing $k$ steps of Algorithm 3, we determine an approximate solution $\mathcal{X}_k$ for (4.2) such that
$$\mathcal{X}_k = \sum_{i=1}^{k} y_k^{(i)}\mathcal{Z}_i = \tilde{\mathcal{Z}}_k\,\bar{\times}_4\, y_k, \qquad\text{where } y_k = (y_k^{(1)}, y_k^{(2)}, \ldots, y_k^{(k)})^T. \tag{4.4}$$


In view of (4.3), it can be seen that
$$\mathcal{M}(\mathcal{X}_k) - \mathcal{D} = \tilde{\mathcal{W}}_k\,\bar{\times}_4\, y_k - \mathcal{D} = \tilde{\mathcal{V}}_{k+1}\,\bar{\times}_4\big(\bar{H}_k y_k - \|\mathcal{D}\|e_1\big).$$
Now, we conclude that
$$\|\mathcal{M}(\mathcal{X}_k) - \mathcal{D}\| = \big\|\bar{H}_k y_k - \|\mathcal{D}\|e_1\big\|_2.$$
As a result, we can reformulate the minimization problem (4.2) as the following low-dimensional standard Tikhonov regularization problem:
$$\min_{y\in\mathbb{R}^k}\left\{\big\|\bar{H}_k y - \|\mathcal{D}\|e_1\big\|_2^2 + \lambda\big\|\tilde{\mathcal{Z}}_k\,\bar{\times}_4\, y\big\|^2\right\}. \tag{4.5}$$
Note that
$$\big\|\tilde{\mathcal{Z}}_k\,\bar{\times}_4\, y\big\|^2 = y^T\big(\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k\big)y.$$
Since $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k$ is a nonsingular Gram matrix, we can compute its Cholesky factorization, i.e.,
$$\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k = L_k L_k^T.$$
Evidently, we can reformulate (4.5) as follows:
$$\min_{\hat{y}\in\mathbb{R}^k}\left\{\big\|\hat{H}_k\hat{y} - \|\mathcal{D}\|e_1\big\|_2^2 + \lambda\|\hat{y}\|_2^2\right\}, \tag{4.6}$$
where $\hat{y} = L_k^T y$ and $\hat{H}_k = \bar{H}_k L_k^{-T}$. It should be commented that the idea of rewriting $\|\tilde{\mathcal{Z}}_k\,\bar{\times}_4\, y\|^2$ using the Cholesky factor of $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k$ is an extension of the idea exploited in [18]. In our numerical experiments, the value of $\lambda$ is determined using the discrepancy principle with the coefficient "1.01" for the upper bound (for further information see [14, 15, 18, 28]).

The HT−BTF method can be derived with a similar strategy; namely, the corresponding minimization (4.2) reduces to
$$\min_{\hat{y}\in\mathbb{R}^k}\left\{\big\|\tilde{\mathcal{V}}_{k+1}\,\bar{\times}_4\big(\hat{H}_k\hat{y} - \beta e_1\big)\big\|^2 + \lambda\|\hat{y}\|_2^2\right\}. \tag{4.7}$$
In practice, the following least-squares problem is solved instead of (4.7):
$$\min_{\hat{y}\in\mathbb{R}^k}\left\{\big\|\hat{H}_k\hat{y} - \beta e_1\big\|_2^2 + \lambda\|\hat{y}\|_2^2\right\},$$
where $\beta$ is determined from Line 2 of Algorithm 5. Note that in the implementation of the AT−BTF algorithm, we do not need to form $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k$; it is indeed equal to the identity matrix.
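For concreteness, here is a hedged NumPy sketch (our own illustration; `Hhat` stands for the projected matrix $\hat{H}_k$ and `rhs0` equals $\|\mathcal{D}\|$ for AT−BTF or $\beta$ for HT−BTF) of solving the reduced Tikhonov problem and of the discrepancy function whose root, with $\eta = 1.01$, determines $\lambda$:

```python
import numpy as np

def reduced_tikhonov(Hhat, rhs0, lam):
    """min_y || rhs0*e_1 - Hhat y ||_2^2 + lam ||y||_2^2 via the
    equivalent augmented least-squares problem."""
    k = Hhat.shape[1]
    M = np.vstack([Hhat, np.sqrt(lam) * np.eye(k)])
    rhs = np.zeros(M.shape[0])
    rhs[0] = rhs0
    return np.linalg.lstsq(M, rhs, rcond=None)[0]

def discrepancy_gap(Hhat, rhs0, lam, eta, noise_norm):
    """Residual norm of the regularized solution minus eta*||E||; the
    discrepancy principle picks lam making this (approximately) zero."""
    y = reduced_tikhonov(Hhat, rhs0, lam)
    r = -Hhat @ y
    r[0] += rhs0
    return np.linalg.norm(r) - eta * noise_norm
```

In practice, $\lambda$ can then be found by a few steps of bisection or Newton's method on this scalar function.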

Algorithm 7: Flexible Arnoldi–Tikhonov based on tensor form (FAT−BTF)
Require: Input $A^{(1)}, A^{(2)}, A^{(3)}$ and the tensor $\mathcal{D}$;
1: for $k = 1, 2, \ldots$ until convergence do
2:   Construct $\tilde{\mathcal{Z}}_k$ with the column tensors $\mathcal{Z}_i$ and $\bar{H}_k$ by Algorithm 3;
3:   Compute the Cholesky factor $L_k$ of $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k$, i.e., $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k = L_k L_k^T$;
4:   Set $\hat{H}_k = \bar{H}_k L_k^{-T}$;
5:   Compute the regularization parameter $\lambda$ by the discrepancy principle with the user-chosen constant $\eta = 1.01$;
6:   Compute $\hat{y}_k = \arg\min_{\hat{y}\in\mathbb{R}^k}\big\{\|\hat{H}_k\hat{y} - \|\mathcal{D}\|e_1\|_2^2 + \lambda\|\hat{y}\|_2^2\big\}$;
7:   Set $y_k = L_k^{-T}\hat{y}_k$;
8:   Compute $\mathcal{X}_k = \sum_{i=1}^{k} y_k^{(i)}\mathcal{Z}_i = \tilde{\mathcal{Z}}_k\,\bar{\times}_4\, y_k$, where $y_k = (y_k^{(1)}, y_k^{(2)}, \ldots, y_k^{(k)})^T$;
9: end for


Algorithm 8: Flexible Hessenberg–Tikhonov based on tensor form (FHT−BTF)
Require: Input $A^{(1)}, A^{(2)}, A^{(3)}$ and the tensor $\mathcal{D}$;
1: Determine the triple $(i_0, j_0, k_0)$ using Algorithm 4 and set $\beta = \mathcal{D}_{i_0,j_0,k_0}$;
2: for $k = 1, 2, \ldots$ until convergence do
3:   Construct $\tilde{\mathcal{Z}}_k$ with the column tensors $\mathcal{Z}_i$ and $\bar{H}_k$ by Algorithm 5;
4:   Compute the Cholesky factor $L_k$ of $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k$, i.e., $\tilde{\mathcal{Z}}_k\odot^4\tilde{\mathcal{Z}}_k = L_k L_k^T$;
5:   Set $\hat{H}_k = \bar{H}_k L_k^{-T}$;
6:   Compute the regularization parameter $\lambda$ by the discrepancy principle with the user-chosen constant $\eta = 1.01$ [18];
7:   Compute $\hat{y}_k = \arg\min_{\hat{y}\in\mathbb{R}^k}\big\{\|\hat{H}_k\hat{y} - \beta e_1\|_2^2 + \lambda\|\hat{y}\|_2^2\big\}$;
8:   Set $y_k = L_k^{-T}\hat{y}_k$;
9:   Compute $\mathcal{X}_k = \sum_{i=1}^{k} y_k^{(i)}\mathcal{Z}_i = \tilde{\mathcal{Z}}_k\,\bar{\times}_4\, y_k$, where $y_k = (y_k^{(1)}, \ldots, y_k^{(k)})^T$;
10: end for

in

u=0

Ω = [0, 1] × [0, 1] × [0, 1],

in δΩ.

By using a standard finite difference discretization on a uniform grid for the diffusion term and a second-order convergent scheme (Fromms scheme) for the convection term with the mesh-size h = 1/(n + 1), the discrete system matrix of the form X ×1 A(1) + X ×2 A(2) + X ×3 A(3) = D,

such that

A(i)



2 −1 1   = 2 h  

−1 2 .. .

−1 .. . −1

..

. 2 −1

 3   1  c   i   +  4h   −1  2 

−5

1

3 .. .

−5 .. . 1

..

.

..

.

3 1

(5.1) 

   , 1  −5 3

i = 1, 2, 3.

We set $c_1 = 1$, $c_2 = 2$ and $c_3 = 4$. The initial guess $\mathcal{X}_0$ is chosen to be zero, and the right-hand side $\mathcal{D}$ is constructed so that $\mathcal{X}^* = \texttt{rand}(n,n,n)$ is the exact solution of (5.1). We examined the obtained bound for the condition number of $\mathcal{A}$ corresponding to Example 5.1. Owing to the memory capacity of our system, we could only compute the exact values of $\mathrm{cond}(\mathcal{A})$ for $\mathcal{A}\in\mathbb{R}^{n^3\times n^3}$ up to $n = 20$. We comment that, for computing $\mathrm{cond}(\mathcal{A})$ with the MATLAB function cond(·), we did not save the matrix $\mathcal{A}$ in sparse form. In Table 1, we disclose the condition numbers of $A^{(1)}, A^{(2)}$ and $A^{(3)}$ together with the derived upper bound on $\mathrm{cond}(\mathcal{A})$ given in Remark 2.7 for different values of $n$. We further plot the computed upper bound and the exact value of $\mathrm{cond}(\mathcal{A})$ in Figure 1.
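For reference, the coefficient matrices of this example can be assembled as follows (a NumPy sketch of the matrices displayed above, not the authors' MATLAB code):

```python
import numpy as np

def example_5_1_matrix(n, c):
    """A^(i) of Example 5.1: centered differences for the diffusion term
    plus Fromm's scheme for the convection term, with h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    F = 3 * np.eye(n) - 5 * np.eye(n, k=1) + np.eye(n, k=2) + np.eye(n, k=-1)
    return T / h**2 + c * F / (4 * h)

A = [example_5_1_matrix(25, c) for c in (1, 2, 4)]
print([f"{np.linalg.cond(Ai):.2f}" for Ai in A])  # compare with Table 1
```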


[Figure 1. The exact value of cond(A) versus the upper bound provided in Remark 2.7 for Example 5.1 with respect to different values of n.]

Table 1. Condition numbers of the matrices in Example 5.1.

n     cond(A(1))   cond(A(2))   cond(A(3))   Upper bound of cond(A)
25    280.33       281.78       269.49       308.13
50    1.06e+3      1.05e+3      990.46       1.22e+3
100   4.13e+3      4.07e+3      3.77e+3      4.27e+3

We applied the discussed iterative schemes to solve the corresponding STE; the algorithms were stopped as soon as
$$r_k := \frac{\|\mathcal{D}-\mathcal{M}(\mathcal{X}_k)\|}{\|\mathcal{D}\|} \le 10^{-8}.$$
As pointed out before, in all flexible iterative schemes the inner STE is solved by Algorithm 2. Here the inner iterations were terminated when the residual norm had been reduced by a factor of 10. The obtained numerical results are disclosed in Table 2, where "IterBiCGStab" and $e_k$ stand for the total number of inner iterations and the relative true error of the $k$-th computed approximate solution, respectively. The results show that the (flexible) iterative schemes based on the (flexible) Hessenberg−BTF process work slightly faster than those based on the (flexible) Arnoldi−BTF process, with the restart parameter in all algorithms set to $m = 10$. For more details, we also depict the convergence histories of the algorithms (for grid $50\times 50\times 50$) with respect to different restart numbers in Figure 2.

The experimental results obtained with the flexible variants of the iterative schemes motivated us to work with the flexible Hessenberg−BTF and Arnoldi−BTF processes in conjunction with the Tikhonov regularization problem (4.2) for solving STEs with ill-conditioned coefficient matrices, recalling that the corresponding methods are called FHT−BTF and FAT−BTF, or HT−BTF and AT−BTF whenever we skip Line 4 in Algorithm 5 and Line 3 in Algorithm 3, respectively. To this end, in Examples 5.2, 5.3 and 5.4, we report some numerical results for solving (1.1) with the right-hand side $\mathcal{D}$ contaminated by a tensor $\mathcal{E}$. The noise tensor $\mathcal{E}$ has normally distributed random entries with zero mean and with variance chosen so that $\ell = \|\mathcal{E}\|/\|\mathcal{D}\|$ for a prescribed value $\ell > 0$, which is called the level of noise. In the tables, we report the relative error
$$e_k := \frac{\|\mathcal{X}_{\lambda_k,k}-\hat{\mathcal{X}}\|}{\|\hat{\mathcal{X}}\|},$$

where $\hat{\mathcal{X}}$ is the exact solution and $\mathcal{X}_{\lambda_k,k}$ is the $k$th computed approximation. In both the HT−BTF and AT−BTF algorithms and their flexible versions (FHT−BTF and FAT−BTF), the iterations are terminated once
$$\frac{\|\mathcal{X}_{\lambda_k,k} - \mathcal{X}_{\lambda_{k-1},k-1}\|}{\|\mathcal{X}_{\lambda_{k-1},k-1}\|} \le \tau,$$
or the maximum number of 40 iterations is reached; the value of $\tau$ is reported at the top of the tables. In the flexible algorithms, we only use two steps of BiCGStab−BTF as the inner iteration.

[Figure 2. Convergence behaviours of the algorithms in Example 5.1 (for grid 50×50×50) with respect to the restart number m; the panels plot log10 r(k) versus the total number of iterations for m = 10, 30 and 50, for Hess−BTF, FHess−BTF, FOM−BTF, FFOM−BTF, CMRH−BTF, FCMRH−BTF, GMRES−BTF and FGMRES−BTF.]

In the following two examples, we consider color image restoration problems. In fact, for both examples, error-free right-hand sides are constructed such that the exact solutions of the STEs are color images.

Example 5.2. We consider the following Sylvester tensor equation,
$$\mathcal{X}\times_1 A^{(1)} + \mathcal{X}\times_2 A^{(2)} = \mathcal{D},$$

where $A^{(1)} = [a_{ij}]_{i,j=1}^{512}$ and $A^{(2)} = A^{(1)}$ are both Gaussian Toeplitz matrices given by
$$a_{ij} = \begin{cases} \dfrac{1}{2r-1}, & |i-j|\le r,\\[1mm] 0, & \text{otherwise.} \end{cases} \tag{5.2}$$

(5.2)

Here we set r = 4 which results cond(A(i) ) = 3.84 × 1016 , i = 1, 2. Using Remark 2.4, we obtain ˜ + E where D ˜ is cond(A) = 3.27e + 16 corresponding to A(1) and A(2) . As pointed earlier, D = D constructed such that the tensor of order 512 × 512 × 3, corresponding to a color image1, is the exact solution of (4.1). For more details, we depict original, noisy and restored images in Figure 3. We comment that using (2.6), the upper bound for the relative error is ` × 3.27e + 16 in which ` is the available level of noise. 1http://sipi.usc.edu/database/preview/misc/4.2.06.png


Table 2. Comparison results for Example 5.1.

Grid: 25 × 25 × 25
Method       rk        ek        CPU-times(sec)  IterBiCGStab  Restart  Iter(k)
Hess−BTF     8.85e-09  3.91e-07  0.139           –             21       5
FHess−BTF    8.74e-10  3.70e-08  0.091           16            1        8
FOM−BTF      8.68e-09  3.83e-07  0.136           –             17       9
FFOM−BTF     4.72e-09  8.53e-08  0.071           10            1        7
CMRH−BTF     8.06e-09  2.33e-05  0.097           –             16       6
FCMRH−BTF    6.17e-09  6.99e-05  0.062           15            1        6
GMRES−BTF    9.42e-09  5.66e-07  0.171           –             22       8
FGMRES−BTF   4.72e-09  2.58e-06  0.063           10            1        7

Grid: 50 × 50 × 50
Method       rk        ek        CPU-times(sec)  IterBiCGStab  Restart  Iter(k)
Hess−BTF     8.39e-09  4.20e-07  3.017           –             64       6
FHess−BTF    7.40e-09  8.17e-07  0.909           36            1        8
FOM−BTF      9.73e-09  9.17e-07  3.208           –             72       1
FFOM−BTF     8.84e-10  8.65e-08  0.813           33            1        8
CMRH−BTF     9.82e-09  2.23e-04  1.268           –             29       9
FCMRH−BTF    1.12e-09  2.38e-04  0.727           37            1        7
GMRES−BTF    9.43e-09  2.24e-06  2.495           –             60       8
FGMRES−BTF   8.83e-10  1.38e-06  0.755           33            1        8

Grid: 100 × 100 × 100
Method       rk        ek        CPU-times(sec)  IterBiCGStab  Restart  Iter(k)
Hess−BTF     8.63e-09  5.62e-06  102.362         –             169      5
FHess−BTF    7.70e-10  5.02e-07  19.255          47            1        8
FOM−BTF      9.79e-09  3.25e-06  157.352         –             241      9
FFOM−BTF     9.70e-09  7.27e-06  14.709          55            1        7
CMRH−BTF     9.47e-09  9.97e-04  41.392          –             69       5
FCMRH−BTF    7.28e-10  7.15e-04  13.077          56            1        6
GMRES−BTF    9.98e-09  1.03e-05  135.808         –             215      1
FGMRES−BTF   9.67e-09  3.37e-05  14.582          55            1        7

Example 5.3. We consider the case that a tensor of order 1019 × 1337 × 33 is the exact solution of (4.1). The exact solution is a tensor corresponding to a hyperspectral image of natural scenes²; for more details see [10]. The coefficient matrices $A^{(1)}$, $A^{(2)}$ and $A^{(3)}$ are the Gaussian Toeplitz matrices defined by (5.2) with suitable dimensions and $r = 2$, which results in $\mathrm{cond}(A^{(1)}) = 5.27\times 10^{16}$, $\mathrm{cond}(A^{(2)}) = 1.76\times 10^{17}$ and $\mathrm{cond}(A^{(3)}) = 3.62\times 10^{16}$. The original, noisy and restored images are plotted in Figure 4 for further details. We comment that, in view of Remark 2.4, we can derive $\mathrm{cond}(\mathcal{A}) = 5.76\mathrm{e}{+}16$ corresponding to this example, which yields $\ell\times 5.76\mathrm{e}{+}16$ as the upper bound for the relative error by (2.6), corresponding to the available level of noise $\ell$.

² http://personalpages.manchester.ac.uk/staff/d.h.foster

Our examined numerical tests are reported in Tables 3 and 4. The results illustrate that applying the flexible version of AT−BTF (FAT−BTF) can lead to better results in terms of accuracy.


Table 3. Results for Example 5.2 with τ = 5e−3.

Level of noise (ℓ)  Method    Iter(k)  ek      CPU-times(sec)
0.01                AT−BTF    6        0.0339  26.04
                    FAT−BTF   2        0.0315  16.23
                    HT−BTF    6        0.0357  38.71
                    FHT−BTF   5        0.0359  49.83
0.001               AT−BTF    3        0.0276  8.16
                    FAT−BTF   4        0.0168  45.10
                    HT−BTF    3        0.0261  11.01
                    FHT−BTF   3        0.0193  28.08

Table 4. Results for Example 5.3 with τ = 8e−3.

Level of noise (ℓ)  Method    Iter(k)  ek      CPU-times(sec)
0.01                AT−BTF    6        0.0373  85.99
                    FAT−BTF   2        0.0357  54.29
                    HT−BTF    7        0.0565  174.21
                    FHT−BTF   3        0.0640  105.86
0.001               AT−BTF    3        0.0342  27.24
                    FAT−BTF   4        0.0208  152.26
                    HT−BTF    7        0.0260  174.45
                    FHT−BTF   2        0.0269  51.81

The performance of HT−BTF and FHT−BTF is also good. However, for the above two examples, using the (flexible) Arnoldi process in conjunction with the Tikhonov regularization technique can lead to better results than the (flexible) Hessenberg process. As seen, when the level of noise increases, FHT−BTF may not work better than HT−BTF. For the preceding two examples, it is seen that using FAT−BTF instead of AT−BTF can lead to a more accurate solution. Now we consider a test problem in which the coefficient matrices are full and extremely ill-conditioned. This kind of STE may arise from the discretization of a fully three-dimensional microscale dual phase lag problem by a mixed-collocation finite difference method; see [24, 25, 27] for further details.

Example 5.4. Consider the STE (4.1) with a perturbed right-hand side such that $A^{(\ell)} = [a_{ij}]$ for $\ell = 1, 2, 3$ are defined by
$$a_{ij} = \begin{cases} -2\left(\dfrac{\pi}{L}\right)^2\dfrac{(-1)^{i+j}}{\sin^2\left(\frac{1}{2}\left[\frac{2\pi\xi_j}{L}-x_i\right]\right)}, & i\ne j,\\[2mm] -\left(\dfrac{\pi}{L}\right)^2\dfrac{n^2+2}{3}, & i = j, \end{cases}$$
where $x_i = \frac{2\pi(i-1)}{n}$, $\xi_j = \frac{(j-1)L}{n}$, $i, j = 1, 2, \ldots, n$, with $L = 500$. In [7], a similar STE was solved by global schemes for odd values of $n$, for which the coefficient matrices $A^{(i)}$ are very well-conditioned. Here, the value of $n$ is chosen to be even, which results in extremely ill-conditioned coefficient matrices. The right-hand side $\mathcal{D}$ is constructed so that $\mathcal{X}^* = \texttt{randn}(n,n,n)$ is the exact solution of (1.1).

The obtained numerical results for Example 5.4 are disclosed in Table 5. We recall that the values of $\mathrm{cond}(\mathcal{A})$ in the table are computed using Remark 2.4 for the different dimensions, for which, by (2.6), the associated upper bound for the relative error is $\ell\times\mathrm{cond}(\mathcal{A})$, where $\ell$ is the level of noise. As seen, for this example, the (flexible) HT−BTF method can outperform the (flexible) AT−BTF method, which is anticipated since the coefficient matrices in the corresponding STE are full.


[Figure 3. Example 5.2; Up, left to right: original and noisy images and the restored image by AT−BTF; Down, left to right: restored images by FAT−BTF, FHT−BTF and HT−BTF for noise level 0.01.]

6. Conclusions and future works

We first focused on the conditioning of Sylvester tensor equations (STEs) under certain conditions. The discussions on conditioning can be helpful for the error analysis of STEs with perturbed data. Then we developed some iterative schemes based on the global Hessenberg process in tensor framework to solve STEs. In particular, we discussed the implementation of the algorithms and their flexible versions. As a STE with extremely ill-conditioned coefficient matrices may appear in practice, we proposed using the standard Tikhonov regularization technique in conjunction with the global (flexible) Arnoldi and Hessenberg processes in tensor form to solve STEs with perturbed right-hand sides. Numerical test problems were examined with applications to color image restoration, a 3D convection–diffusion equation and a fully three-dimensional microscale dual phase lag problem. The reported numerical results revealed the feasibility of the proposed iterative schemes. The performance of the generalized Tikhonov regularization technique (which benefits from exploiting regularization matrices) in conjunction with the (generalized) Arnoldi process in tensor framework is currently under investigation. Experimental results show that when the right-hand side of the STE is restricted to be a low-rank tensor, using the extended Hessenberg process for the Kronecker product of extended Krylov subspaces leads to much faster iterative schemes than global iterative schemes. Studying the performance of extended Hessenberg processes in tensor framework is a project to be undertaken in the near future.


[Figure 4. Example 5.3; Up, left to right: original and noisy images and the restored image by AT−BTF; Down, left to right: restored images by FAT−BTF, FHT−BTF and HT−BTF for noise level 0.01.]

Acknowledgement

The authors would like to express their sincere gratitude to the anonymous referees for their useful comments which improved the quality of the paper.

References

[1] M. Addam, M. Heyouni, H. Sadok, The block Hessenberg process for matrix equations, Electron. Trans. Numer. Anal. 46 (2017) 460–473.
[2] S. Amini, F. Toutounian, M. Gachpazan, The block CMRH method for solving nonsymmetric linear systems with multiple right-hand sides, J. Comput. Appl. Math. 337 (2018) 166–174.
[3] S. Amini, F. Toutounian, Weighted and flexible versions of block CMRH method for solving nonsymmetric linear systems with multiple right-hand sides, Comput. Math. Appl. 76 (2018) 2011–2021.
[4] B. W. Bader, T. G. Kolda, MATLAB Tensor Toolbox Version 2.5, http://www.sandia.gov/~tgkolda/TensorToolbox.
[5] Z. Z. Bai, G. H. Golub, M. K. Ng, Hermitian and skew-Hermitian splitting methods for non-Hermitian positive definite linear systems, SIAM J. Matrix Anal. Appl. 24 (2003) 603–626.
[6] J. Ballani, L. Grasedyck, A projection method to solve linear systems in tensor format, Numer. Linear Algebra Appl. 20 (2013) 27–43.
[7] F. P. A. Beik, F. S. Movahed, S. Ahmadi-Asl, On the Krylov subspace methods based on tensor format for positive definite Sylvester tensor equations, Numer. Linear Algebra Appl. 23 (2016) 444–466.
[8] Z. Chen, L. Z. Lu, A projection method and Kronecker product preconditioner for solving Sylvester tensor equations, Science China Mathematics 55 (2012) 1281–1292.
[9] A. Cichocki, R. Zdunek, A. H. Phan, S. I. Amari, Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation, John Wiley & Sons, 2009.
[10] D. H. Foster, K. Amano, S. M. C. Nascimento, M. J. Foster, Frequency of metamerism in natural scenes, Journal of the Optical Society of America A 23 (2006) 2359–2372.
[11] G. H. Golub, C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, 1996.
[12] G. H. Golub, D. Vanderstraeten, On the preconditioning of matrices with skew-symmetric splittings, Numer. Algorithms 25 (2000) 223–239.
[13] G. H. Golub, S. Nash, C. F. Van Loan, A Hessenberg–Schur method for the problem AX + XB = C, IEEE Trans. Auto. Control 24 (1979) 909–913.
[14] M. Hanke, P. C. Hansen, Regularization methods for large-scale problems, Surv. Math. Ind. 3 (1993) 253–315.


Table 5. Comparison results for Example 5.4 with τ = 8e−3.

Grid: 30 × 30 × 30
Level of noise (ℓ)  cond(A)   Method    Iter(k)  ek      CPU-times(sec)
0.01                2.31e+16  AT−BTF    13       0.0570  0.15
                              FAT−BTF   8        0.0464  0.21
                              HT−BTF    13       0.0547  0.13
                              FHT−BTF   6        0.0428  0.11
0.001               1.33e+16  AT−BTF    12       0.0529  0.14
                              FAT−BTF   6        0.0220  0.12
                              HT−BTF    13       0.0459  0.22
                              FHT−BTF   7        0.0174  0.13

Grid: 60 × 60 × 60
Level of noise (ℓ)  cond(A)   Method    Iter(k)  ek      CPU-times(sec)
0.01                1.09e+18  AT−BTF    13       0.0564  1.28
                              FAT−BTF   7        0.0456  1.41
                              HT−BTF    11       0.0646  1.71
                              FHT−BTF   6        0.0435  1.06
0.001               9.51e+15  AT−BTF    12       0.0536  1.10
                              FAT−BTF   6        0.0226  1.02
                              HT−BTF    6        0.0541  2.20
                              FHT−BTF   6        0.0205  0.99

Grid: 90 × 90 × 90
Level of noise (ℓ)  cond(A)   Method    Iter(k)  ek      CPU-times(sec)
0.01                8.27e+16  AT−BTF    13       0.0540  5.23
                              FAT−BTF   7        0.0455  5.79
                              HT−BTF    13       0.0564  10.12
                              FHT−BTF   6        0.0446  4.14
0.001               5.20e+16  AT−BTF    12       0.0526  4.61
                              FAT−BTF   6        0.0217  4.21
                              HT−BTF    12       0.0544  6.97
                              FHT−BTF   7        0.0180  4.17

Grid: 120 × 120 × 120
Level of noise (ℓ)  cond(A)   Method    Iter(k)  ek      CPU-times(sec)
0.01                1.76e+16  AT−BTF    12       0.0585  11.35
                              FAT−BTF   7        0.0465  14.26
                              HT−BTF    11       0.0623  17.28
                              FHT−BTF   6        0.0447  10.33
0.001               4.32e+16  AT−BTF    12       0.0516  11.39
                              FAT−BTF   6        0.0210  10.40
                              HT−BTF    13       0.0497  17.33
                              FHT−BTF   5        0.0264  7.42

[15] P. C. Hansen, Regularization tools: A Matlab package for analysis and solution of discrete ill-posed problems, Numer. Algorithms 6 (1994) 1–35. Software is available at http://www.netlib.org
[16] R. A. Horn, C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, UK, 1985.
[17] M. Heyouni, The global Hessenberg and CMRH methods for linear systems with multiple right-hand sides, Numer. Algorithms 26 (2001) 317–332.
[18] G. Huang, L. Reichel, F. Yin, On the choice of subspace for large-scale Tikhonov regularization problems in general form, Numer. Algorithms (2018) 1–23. DOI:10.1007/s11075-018-0534-y.
[19] B. N. Khoromskij, Tensor-structured numerical methods in scientific computing: Survey on recent advances, Chemometrics and Intelligent Laboratory Systems 110 (2012) 1–19.
[20] T. G. Kolda, B. W. Bader, Tensor decompositions and applications, SIAM Review 51 (2009) 455–500.
[21] D. Kressner, C. Tobler, Low-rank tensor Krylov subspace methods for parametrized linear systems, SIAM J. Matrix Anal. Appl. 32 (2011) 1288–1316.
[22] L. Liang, B. Zheng, Sensitivity analysis of the Lyapunov tensor equation, Linear Multilinear Algebra 67 (2019) 555–572.
[23] B. W. Li, S. Tian, Y. S. Sun, Z. M. Hu, Schur-decomposition for 3D matrix equations and its application in solving radiative discrete ordinates equations discretized by Chebyshev collocation spectral method, J. Comput. Phys. 229 (2010) 1198–1212.
[24] A. Malek, Z. K. Bojdi, P. N. N. Golbarg, Solving fully three-dimensional microscale dual phase lag problem using mixed-collocation finite difference discretization, Journal of Heat Transfer 134 (2012) 094504.


[25] A. Malek, S. H. M. Masuleh, Mixed collocation–finite difference method for 3D microscopic heat transport problems, J. Comput. Appl. Math. 217 (2008) 137–147.
[26] C. Martin, D. Moravitz, C. F. Van Loan, A Jacobi-type method for computing orthogonal tensor decompositions, SIAM J. Matrix Anal. Appl. 30 (2008) 1219–1232.
[27] S. H. M. Masuleh, T. N. Phillips, Viscoelastic flow in an undulating tube using spectral methods, Comput. & Fluids 33 (2004) 1075–1095.
[28] V. A. Morozov, Methods for Solving Incorrectly Posed Problems, Springer, New York, 1984, Chapter 26.
[29] H. Sadok, CMRH: A new method for solving nonsymmetric linear systems based on the Hessenberg reduction algorithm, Numer. Algorithms 20 (1999) 303–321.
[30] Y. Saad, Iterative Methods for Sparse Linear Systems, second ed., Society for Industrial and Applied Mathematics, Philadelphia, 2003.
[31] X. Shi, Y. Wei, S. Ling, Backward error and perturbation bounds for high order Sylvester tensor equation, Linear Multilinear Algebra 61 (2013) 1436–1446.
[32] Y. S. Sun, M. Jing, B. W. Li, Chebyshev collocation spectral method for three-dimensional transient coupled radiative–conductive heat transfer, Journal of Heat Transfer 134 (2012) 092701–092707.
[33] C. F. Van Loan, Structured matrix problems from tensors, in: M. Benzi, V. Simoncini (Eds.), Exploiting Hidden Structure in Matrix Computations: Algorithms and Applications, Lecture Notes in Mathematics 2173, Springer, Cham, 2016.
[34] K. Zhang, C. Gu, Flexible global generalized Hessenberg methods for linear systems with multiple right-hand sides, J. Comput. Appl. Math. 236 (2014) 312–325.