A Sparse Rank-1 Approximation Algorithm for High-Order Tensors∗

Yiju Wang†    Manman Dong‡    Yi Xu§

Abstract

For the best sparse rank-1 approximation to higher-order tensors, we propose a proximal alternating minimization method in which each iteration can be computed easily. Its global convergence is established without any assumption, and numerical experiments are provided to show the efficiency of the proposed method.

Keywords. sparse rank-1 approximation, $l_1$-regularization, proximal alternating minimization.

1. Introduction
In recent decades, the rapid development of information processing technology has provided a broad data processing platform for modern society. Correspondingly, tensors have become a natural tool for higher-order and hierarchical data sets [14]. A tensor is a higher-order extension of a matrix and has wide applications in areas such as signal and image processing, continuum physics, biomedical science, machine learning, and exploratory multi-way data analysis; see [4, 6, 7, 8]. Hence, tensor analysis and computation have received much attention from researchers in recent decades, and they have developed into a new research branch of mathematics named multilinear algebra; see, e.g., [10, 11, 14, 21, 22, 23] and the references therein.

Computing the best rank-1 approximation of given multi-dimensional data is a classical feature extraction process in data mining [14]. For example, for a set of n-dimensional observations, principal component analysis (PCA) amounts to computing the best rank-1 approximation to the data matrix and projecting the n-dimensional data along several principal orthogonal eigenvectors [1]. In turn, the best rank-1 approximation to a higher-order tensor provides a unified framework for higher-order data analysis [14].

It should be noted that the density of the latent factors in the best rank-1 approximation may obscure the supporting information behind the data, and hence it cannot provide sufficient information [20]. For example, for a gene expression data set with 5000 genes for cancer patients, PCA can give a low-dimensional representation which helps cluster cancer versus healthy patients [9]. However, in reality we do not know in advance which genes should be expressed, and hence the dense factors cannot provide sufficient information. Thus, to acquire a clear and accurate inner structure, sparsity needs to be imposed on the rank-1 term so that one can associate cancer versus no cancer with a small group of genes; this results in the best sparse rank-1 approximation to a higher-order tensor [1]. The sparsity strategy is now a popular technique in areas such as signal processing and biomedical science [16, 20]. For tensor low-rank decomposition, Sun et al. [17] proposed a truncated power method that incorporates variable selection in the estimation of decomposition components, while Ruiters et al. [15] proposed a compression technique for sparse tensor decomposition.

In this paper, motivated by the proximal alternating linearization technique for optimization problems [2, 19], we develop a proximal alternating minimization method for the best sparse rank-1 approximation to higher-order tensors, and its global convergence is established without any assumption.

∗ This work was supported by the Natural Science Foundation of China (11671228) and Shandong Provincial Natural Science Foundation (ZR2019MA022).
† School of Management Science, Qufu Normal University, Rizhao, Shandong, 276800, China. E-mail: [email protected]
‡ School of Management Science, Qufu Normal University, Rizhao, Shandong, 276800, China. E-mail: dong [email protected]
§ School of Mathematics, Southeast University, Nanjing, Jiangsu, 211189, China. E-mail: [email protected]
To end this section, we give some notation used in the paper. We use $\mathbb{R}^n$ to denote the $n$-dimensional real Euclidean space. Vectors are denoted by bold lowercase letters, e.g., $x, y, \cdots$, matrices are denoted by bold capital letters, e.g., $A, B, \cdots$, and tensors are written as calligraphic capitals such as $\mathcal{A}, \mathcal{B}, \cdots$. In this paper, we focus on third-order tensors $\mathcal{A} = (a_{ijk})\in\mathbb{R}^{I\times J\times K}$ with indices $1\le i\le I$, $1\le j\le J$ and $1\le k\le K$; all discussions in this paper can be extended to tensors of arbitrary higher order. A third-order tensor $\mathcal{A}$ has column, row, and tube fibers, which are defined by fixing every index but one and are denoted by $a_{:jk}$, $a_{i:k}$ and $a_{ij:}$, respectively. Correspondingly, we obtain three matricizations of the tensor $\mathcal{A}$:
\[
\begin{aligned}
A_{(1)} &= [a_{:11}, a_{:21}, \cdots, a_{:J1}, a_{:12}, \cdots, a_{:J2}, \cdots, a_{:1K}, \cdots, a_{:JK}] \in \mathbb{R}^{I\times JK},\\
A_{(2)} &= [a_{1:1}, a_{2:1}, \cdots, a_{I:1}, a_{1:2}, \cdots, a_{I:2}, \cdots, a_{1:K}, \cdots, a_{I:K}] \in \mathbb{R}^{J\times IK},\\
A_{(3)} &= [a_{11:}, a_{21:}, \cdots, a_{I1:}, a_{12:}, \cdots, a_{I2:}, \cdots, a_{1J:}, \cdots, a_{IJ:}] \in \mathbb{R}^{K\times IJ}.
\end{aligned}
\]
If we stack the columns of the matrix $A_{(1)}$ one on another from the first to the last, then we obtain the vectorization of the tensor $\mathcal{A}$, denoted by $\mathrm{vec}(\mathcal{A})$, i.e.,
\[
\mathrm{vec}(\mathcal{A}) = (a_{:11}; a_{:21}; \cdots; a_{:J1}; a_{:12}; a_{:22}; \cdots; a_{:J2}; \cdots; a_{:1K}; a_{:2K}; \cdots; a_{:JK}).
\]
The outer product of nonzero vectors $x\in\mathbb{R}^I$, $y\in\mathbb{R}^J$ and $z\in\mathbb{R}^K$ is denoted by $x\circ y\circ z\in\mathbb{R}^{I\times J\times K}$ with entries
\[
(x\circ y\circ z)_{ijk} = x_i y_j z_k,\qquad 1\le i\le I,\quad 1\le j\le J,\quad 1\le k\le K.
\]
We use $\|\cdot\|_0$ to denote the $l_0$-norm of a vector, i.e., the number of its nonzero entries, $\|\cdot\|_1$ to denote the $l_1$-norm of a vector, i.e., the sum of the absolute values of its entries, $\|\cdot\|_2$ to denote the $l_2$-norm of a vector, and $\|\cdot\|_\infty$ to denote the maximum of the absolute values of the vector entries. The inner product of tensors $\mathcal{A}, \mathcal{B}\in\mathbb{R}^{I\times J\times K}$ is defined as $\langle\mathcal{A},\mathcal{B}\rangle := \sum_{i,j,k=1}^{I,J,K} a_{ijk}b_{ijk}$, and the Frobenius norm of the tensor $\mathcal{A}$ is defined as $\|\mathcal{A}\|_F = \sqrt{\langle\mathcal{A},\mathcal{A}\rangle}$.
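To make these conventions concrete, the following small NumPy sketch (ours, not part of the paper; the array sizes and variable names are illustrative) builds the three matricizations, the vectorization, and the outer product for a random third-order tensor.

```python
import numpy as np

I, J, K = 3, 4, 5
A = np.random.randn(I, J, K)                      # third-order tensor (a_ijk)

# Matricizations: columns are fibers, ordered as in the definitions above.
A1 = np.hstack([A[:, :, k] for k in range(K)])    # I x JK, columns a_{:jk} (j fast, k slow)
A2 = np.hstack([A[:, :, k].T for k in range(K)])  # J x IK, columns a_{i:k} (i fast, k slow)
A3 = np.hstack([A[:, j, :].T for j in range(J)])  # K x IJ, columns a_{ij:} (i fast, j slow)

# Vectorization: stack the columns of A_(1) from first to last.
vecA = A1.reshape(-1, order='F')

# Outer product x o y o z and the Frobenius norm.
x, y, z = np.random.randn(I), np.random.randn(J), np.random.randn(K)
T = np.einsum('i,j,k->ijk', x, y, z)              # (x o y o z)_{ijk} = x_i * y_j * z_k
frob = np.sqrt(np.sum(A * A))                     # ||A||_F

# Under this ordering, vec(x o y o z) coincides with the Kronecker vector z kron y kron x,
# which is the q-vector used later in Algorithm 4.1.
assert np.allclose(T.reshape(-1, order='F'), np.kron(z, np.kron(y, x)))
```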
2. Formulation of the best sparse rank-1 approximation
Mathematically, the best sparse rank-1 approximation to the tensor $\mathcal{A}$ can be formulated as the following optimization problem:
\[
\min\ \big\{\|x\|_0,\ \|y\|_0,\ \|z\|_0,\ \|\mathcal{A} - x\circ y\circ z\|_F^2\big\}.
\]
Due to the difficulty of tackling the $l_0$-norm, it is usually relaxed to an $l_1$-norm [3]. Further, if we associate each $l_1$-norm term with a positive multiplier to control the sparsity, then we obtain the following optimization problem:
\[
\min_{x,y,z}\ \Psi(x,y,z) = f(x,y,z) + \lambda_x\|x\|_1 + \lambda_y\|y\|_1 + \lambda_z\|z\|_1, \tag{2.1}
\]
where $f(x,y,z) = \frac{1}{2}\|\mathcal{A} - x\circ y\circ z\|_F^2$ and $\lambda_x, \lambda_y, \lambda_z > 0$ are regularization parameters. For simplicity, we denote $\omega = (x,y,z)$ in the subsequent analysis. For this problem, in view of Claim 1 in [13], we can obtain the following norm-balancing property.
Proposition 2.1 For any solution $(x^*, y^*, z^*)$ of problem (2.1) with regularization parameters $\lambda_x, \lambda_y, \lambda_z$, it holds that $\lambda_x\|x^*\|_1 = \lambda_y\|y^*\|_1 = \lambda_z\|z^*\|_1$.
It is well known that, for the Lasso model $\min_x \|Ax - b\|_2^2 + \lambda\|x\|_1$, when the regularization factor $\lambda > 0$ is sufficiently large, or more precisely, if $\lambda \ge 2\|A^\top b\|_\infty$, then its solution is the zero vector [12]. This conclusion was extended to the best sparse rank-1 approximation to high-order tensors, i.e., problem (2.1), by Wang et al. in [18], which provides theoretical guidance for the choice of the regularization parameters.

Proposition 2.2 If problem (2.1) has a unique solution for positive numbers $\lambda_x, \lambda_y, \lambda_z$, then the optimal solution is zero provided that $\sqrt[3]{\lambda_x\lambda_y\lambda_z} > \max\big\{2\|\mathcal{A}\|_F + 2,\ \tfrac{1}{3}\|\mathcal{A}\|_F^2\big\}$.
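In practice, Proposition 2.2 can be read as an upper bound that the geometric mean of $(\lambda_x,\lambda_y,\lambda_z)$ should stay below if one wants to avoid the provably trivial zero solution (staying below it does not by itself guarantee a nonzero solution). A tiny NumPy check of this kind, which is our own illustration and not part of the paper, might look as follows.

```python
import numpy as np

def zero_solution_threshold(A):
    """Threshold from Proposition 2.2: if the geometric mean of
    (lambda_x, lambda_y, lambda_z) exceeds this value (and the solution
    of (2.1) is unique), the optimal solution is the zero vector."""
    nrm = np.sqrt(np.sum(A * A))                 # Frobenius norm ||A||_F
    return max(2.0 * nrm + 2.0, nrm ** 2 / 3.0)

A = np.random.randn(20, 20, 30)
lams = (200.0, 150.0, 100.0)
geo_mean = (lams[0] * lams[1] * lams[2]) ** (1.0 / 3.0)
# True would mean the parameters provably force the trivial zero solution.
print(geo_mean > zero_solution_threshold(A))
```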
For problem (2.1), the $l_1$-regularization terms $\lambda_x\|x\|_1$, $\lambda_y\|y\|_1$ and $\lambda_z\|z\|_1$ guarantee that the optimal solution of the problem lies in a bounded set, and hence we have the following conclusion.

Proposition 2.3 A global optimal solution of problem (2.1) always exists.
3. Algorithm and convergence
To solve problem (2.1), we recall the proximal linearized minimization algorithm [2] for the following problem:
\[
\min_{u\in\mathbb{R}^n}\ g(u) + \lambda\|u\|_1,
\]
where $g:\mathbb{R}^n\to\mathbb{R}$ is continuously differentiable, its gradient $\nabla g(u)$ is Lipschitz continuous with constant $L_g$, and $\lambda > 0$ is a regularization parameter. The proximal linearized minimization algorithm [2] for the above problem generates a new iterate via the following iterative scheme:
\[
u^+ \in \arg\min_{v}\Big\{ g(u) + \langle v - u, \nabla g(u)\rangle + \frac{t}{2}\|v - u\|^2 + \lambda\|v\|_1 \Big\}, \tag{3.1}
\]
where $t > 0$ is a constant. For this iterative scheme, we have the following sufficient descent property [2].
Lemma 3.1 Let $g:\mathbb{R}^n\to\mathbb{R}$ be continuously differentiable with an $L_g$-Lipschitz continuous gradient $\nabla g$. Then for any $t > L_g$ and $u\in\mathbb{R}^n$, it holds that
\[
g(u^+) + \lambda\|u^+\|_1 \le g(u) + \lambda\|u\|_1 - \frac{t - L_g}{2}\|u - u^+\|^2,
\]
where $u^+$ is generated by (3.1).
Based on this conclusion, we may design the following alternating proximal minimization scheme for problem (2.1):
\[
x_{k+1} = \arg\min_{x}\Big\{ f(x_k,y_k,z_k) + \langle x - x_k, \nabla_x f(x_k,y_k,z_k)\rangle + \frac{t_k^x}{2}\|x - x_k\|^2 + \lambda_x\|x\|_1 \Big\}, \tag{3.2}
\]
\[
y_{k+1} = \arg\min_{y}\Big\{ f(x_{k+1},y_k,z_k) + \langle y - y_k, \nabla_y f(x_{k+1},y_k,z_k)\rangle + \frac{t_k^y}{2}\|y - y_k\|^2 + \lambda_y\|y\|_1 \Big\}, \tag{3.3}
\]
\[
z_{k+1} = \arg\min_{z}\Big\{ f(x_{k+1},y_{k+1},z_k) + \langle z - z_k, \nabla_z f(x_{k+1},y_{k+1},z_k)\rangle + \frac{t_k^z}{2}\|z - z_k\|^2 + \lambda_z\|z\|_1 \Big\}, \tag{3.4}
\]
where $t_k^x, t_k^y, t_k^z$ are positive numbers to be determined. Using the shrinkage operator [2]
\[
S(x,\alpha) = \operatorname{sign}(x)\max\{|x| - \alpha, 0\}
  = \begin{cases} x - \alpha, & x\in[\alpha,\infty),\\ 0, & x\in[-\alpha,\alpha],\\ x + \alpha, & x\in(-\infty,-\alpha], \end{cases}
\]
for $x\in\mathbb{R}$ and $\alpha > 0$, applied entrywise to vectors, we obtain an explicit formula for the iterative schemes (3.2)-(3.4):
\[
x_{k+1} = S\Big(x_k - \tfrac{1}{t_k^x}\nabla_x f(x_k,y_k,z_k),\ \tfrac{\lambda_x}{t_k^x}\Big), \tag{3.5}
\]
\[
y_{k+1} = S\Big(y_k - \tfrac{1}{t_k^y}\nabla_y f(x_{k+1},y_k,z_k),\ \tfrac{\lambda_y}{t_k^y}\Big), \tag{3.6}
\]
\[
z_{k+1} = S\Big(z_k - \tfrac{1}{t_k^z}\nabla_z f(x_{k+1},y_{k+1},z_k),\ \tfrac{\lambda_z}{t_k^z}\Big). \tag{3.7}
\]
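As a minimal illustration (ours, not the authors' code), the shrinkage operator and one x-update of (3.5) can be written in a few lines of NumPy; the function names and the step-size choice are assumptions consistent with the rule (3.8) given below.

```python
import numpy as np

def shrink(x, alpha):
    """Entrywise soft-thresholding S(x, alpha) = sign(x) * max(|x| - alpha, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def x_update(A1, x, y, z, lam_x, s=2.0):
    """One x-step (3.5): a gradient step on f followed by shrinkage.
    A1 is the mode-1 unfolding of A, so f = 0.5 * ||A1 - x u^T||_F^2 with u = z kron y."""
    u = np.kron(z, y)                     # u = z kron y, matching the column order of A_(1)
    t = s * max(u @ u, 1.0)               # step size exceeding the Lipschitz constant u^T u
    grad = (np.outer(x, u) - A1) @ u      # nabla_x f = (x u^T - A_(1)) u
    return shrink(x - grad / t, lam_x / t)
```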
To guarantee the descent property of the iterative scheme, by Lemma 3.1 we need to estimate the Lipschitz constants of the gradient of $f(x,y,z)$ with respect to $x$, $y$ and $z$. To this end, via the matricizations of the tensors $\mathcal{A}$ and $x\circ y\circ z$, the smooth term $f(x,y,z)$ in the objective function of (2.1) can be written as
\[
f(x,y,z) = \frac{1}{2}\|A_{(1)} - xu^\top\|_F^2 = \frac{1}{2}\|A_{(2)} - yv^\top\|_F^2 = \frac{1}{2}\|A_{(3)} - zw^\top\|_F^2,
\]
where $u = z\otimes y$, $v = z\otimes x$, $w = y\otimes x$, and $x\otimes y = (x_1y_1,\cdots,x_1y_J,\cdots,x_Iy_1,\cdots,x_Iy_J)^\top$. In particular, $\nabla_x f(x,y,z) = (xu^\top - A_{(1)})u = (u^\top u)x - A_{(1)}u$ is affine in $x$, so $\nabla_x f(\cdot,y,z)$ is Lipschitz continuous with constant $u^\top u$, and similarly for the other two blocks. Then, by Lemma 3.1, we may take
\[
t_k^x = s\max\{u_k^\top u_k, 1\},\qquad t_k^y = s\max\{v_k^\top v_k, 1\},\qquad t_k^z = s\max\{w_k^\top w_k, 1\}, \tag{3.8}
\]
where $s > 1$ and
\[
u_k = z_k\otimes y_k,\qquad v_k = z_k\otimes x_{k+1},\qquad w_k = y_{k+1}\otimes x_{k+1} \tag{3.9}
\]
for the iterative schemes (3.2)-(3.4). Based on this, we may establish the following algorithm for solving problem (2.1).

Algorithm 3.1
Input: A third-order tensor $\mathcal{A}$, three positive parameters $\lambda_x, \lambda_y, \lambda_z$, and a scalar $s > 1$.
Initial step: Take initial random non-zero vectors $x_0, y_0, z_0$ and set $k = 0$.
Iterative step: Compute $u_k$ by (3.9) and $t_k^x$ by (3.8), and compute $x_{k+1}$ by (3.5);
compute $v_k$ by (3.9) and $t_k^y$ by (3.8), and compute $y_{k+1}$ by (3.6);
compute $w_k$ by (3.9) and $t_k^z$ by (3.8), and compute $z_{k+1}$ by (3.7).
Output: The sparse rank-1 tensor $\bar{\mathcal{B}} = \bar{x}\circ\bar{y}\circ\bar{z}$, where $(\bar{x},\bar{y},\bar{z})$ is the limit of $\{(x_k,y_k,z_k)\}$.
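The following NumPy sketch of Algorithm 3.1 is our own illustration rather than the authors' implementation; the unfolding code, the stopping tolerance and all variable names are assumptions, and the iteration cap mirrors the limit used in the experiments of Section 4.

```python
import numpy as np

def shrink(x, alpha):
    # entrywise soft-thresholding S(x, alpha)
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def sparse_rank1(A, lam, s=2.0, max_iter=10000, tol=1e-8):
    """Proximal alternating minimization (Algorithm 3.1) for
    min 0.5*||A - x o y o z||_F^2 + lam_x*||x||_1 + lam_y*||y||_1 + lam_z*||z||_1."""
    I, J, K = A.shape
    lam_x, lam_y, lam_z = lam
    # mode-1/2/3 unfoldings; column orders match u = z kron y, v = z kron x, w = y kron x
    A1 = A.reshape(I, J * K, order='F')
    A2 = A.transpose(1, 0, 2).reshape(J, I * K, order='F')
    A3 = A.transpose(2, 0, 1).reshape(K, I * J, order='F')
    x, y, z = np.random.randn(I), np.random.randn(J), np.random.randn(K)
    for _ in range(max_iter):
        x_old, y_old, z_old = x, y, z
        u = np.kron(z, y)                                           # (3.9)
        t = s * max(u @ u, 1.0)                                     # (3.8)
        x = shrink(x - ((np.outer(x, u) - A1) @ u) / t, lam_x / t)  # (3.5)
        v = np.kron(z, x)
        t = s * max(v @ v, 1.0)
        y = shrink(y - ((np.outer(y, v) - A2) @ v) / t, lam_y / t)  # (3.6)
        w = np.kron(y, x)
        t = s * max(w @ w, 1.0)
        z = shrink(z - ((np.outer(z, w) - A3) @ w) / t, lam_z / t)  # (3.7)
        if max(np.linalg.norm(x - x_old), np.linalg.norm(y - y_old),
               np.linalg.norm(z - z_old)) < tol:
            break
    return x, y, z

# Example usage on a random tensor:
# x, y, z = sparse_rank1(np.random.randn(20, 20, 30), lam=(200.0, 150.0, 100.0))
```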
The following conclusion shows the descent property of the objective function in problem (2.1) at each iteration of Algorithm 3.1.

Lemma 3.2 For the sequence $\{(x_k,y_k,z_k)\}$ generated by Algorithm 3.1, it holds that
\[
\begin{aligned}
\Psi(x_k,y_k,z_k) - \Psi(x_{k+1},y_k,z_k) &\ge \tfrac{1}{2}(s-1)\|x_k - x_{k+1}\|^2,\\
\Psi(x_{k+1},y_k,z_k) - \Psi(x_{k+1},y_{k+1},z_k) &\ge \tfrac{1}{2}(s-1)\|y_k - y_{k+1}\|^2,\\
\Psi(x_{k+1},y_{k+1},z_k) - \Psi(x_{k+1},y_{k+1},z_{k+1}) &\ge \tfrac{1}{2}(s-1)\|z_k - z_{k+1}\|^2.
\end{aligned}
\]

Proof. Since the gradients $\nabla_x f(x,y_k,z_k)$, $\nabla_y f(x_{k+1},y,z_k)$ and $\nabla_z f(x_{k+1},y_{k+1},z)$ are Lipschitz continuous with Lipschitz constants $\max\{u_k^\top u_k,1\}$, $\max\{v_k^\top v_k,1\}$ and $\max\{w_k^\top w_k,1\}$, respectively, it follows from (3.2)-(3.4) and Lemma 3.1 that
\[
f(x_{k+1},y_k,z_k) + \lambda_x\|x_{k+1}\|_1 \le f(x_k,y_k,z_k) + \lambda_x\|x_k\|_1 - \frac{(s-1)\max\{u_k^\top u_k,1\}}{2}\|x_{k+1} - x_k\|^2,
\]
and therefore
\[
\Psi(x_k,y_k,z_k) - \Psi(x_{k+1},y_k,z_k) \ge \tfrac{1}{2}(s-1)\max\{u_k^\top u_k,1\}\|x_k - x_{k+1}\|^2 \ge \tfrac{1}{2}(s-1)\|x_k - x_{k+1}\|^2.
\]
The second and the third inequalities can be proved similarly.
Lemma 3.3 For the sequence $\{\omega_k\} = \{(x_k,y_k,z_k)\}$ generated by Algorithm 3.1, it holds that:
(i) the sequence $\{\omega_k\}$ is bounded and the sequence $\{\Psi(\omega_k)\}$ is nonincreasing; in particular, there is a scalar $\gamma > 0$ such that $\Psi(\omega_k) - \Psi(\omega_{k+1}) \ge \gamma\|\omega_k - \omega_{k+1}\|_F^2$ for any $k\in\mathbb{N}$;
(ii) $\lim_{k\to\infty}\|x_k - x_{k+1}\| = 0$, $\lim_{k\to\infty}\|y_k - y_{k+1}\| = 0$, $\lim_{k\to\infty}\|z_k - z_{k+1}\| = 0$.
Proof. The boundedness of the sequence $\{\omega_k\}$ follows from the regularization terms in the objective function $\Psi$, together with the descent property established in Lemma 3.2. For the sequence generated by Algorithm 3.1, it follows from Lemma 3.2 that
\[
\Psi(\omega_k) - \Psi(\omega_{k+1}) \ge \frac{s-1}{2}\|\omega_k - \omega_{k+1}\|_F^2. \tag{3.10}
\]
The first conclusion follows. For (ii), since the sequence $\{\Psi(\omega_k)\}$ is nonincreasing and bounded from below, we conclude that
\[
\lim_{k\to\infty}\{\Psi(\omega_k) - \Psi(\omega_{k+1})\} = 0.
\]
Combining this with (3.10) yields
\[
\lim_{k\to\infty}\|\omega_k - \omega_{k+1}\| = 0.
\]
Following the discussion in [19], we can obtain Lipschitz-type upper bounds for the subdifferentials of the function $\Psi(\omega)$ and establish the global convergence of Algorithm 3.1.

Lemma 3.4 Let $\{\omega_k\} = \{(x_k,y_k,z_k)\}$ be the sequence generated by Algorithm 3.1. Then there exist positive scalars $L_1, L_2, L_3$ such that for any $k$ there exist $\theta_k^x\in\partial_x\Psi(\omega_k)$, $\theta_k^y\in\partial_y\Psi(\omega_k)$ and $\theta_k^z\in\partial_z\Psi(\omega_k)$ satisfying
\[
\|\theta_k^x\| \le L_1\|\omega_k - \omega_{k-1}\|_F,\qquad \|\theta_k^y\| \le L_2\|\omega_k - \omega_{k-1}\|_F,\qquad \|\theta_k^z\| \le L_3\|\omega_k - \omega_{k-1}\|_F.
\]

Proof. From the iterative scheme (3.2) on $x$, we know that
\[
\nabla_x f(x_{k-1},y_{k-1},z_{k-1}) + t_{k-1}^x(x_k - x_{k-1}) + \lambda_x u_k^1 = 0,
\]
where $u_k^1\in\partial\|x_k\|_1$. Then
\[
\lambda_x u_k^1 = -\nabla_x f(x_{k-1},y_{k-1},z_{k-1}) - t_{k-1}^x(x_k - x_{k-1}).
\]
Hence, for $\theta_k^x = \nabla_x f(x_k,y_k,z_k) + \lambda_x u_k^1 \in \partial_x\Psi(\omega_k)$, it holds that
\[
\begin{aligned}
\|\theta_k^x\| &= \|\nabla_x f(x_k,y_k,z_k) + \lambda_x u_k^1\|\\
&= \|\nabla_x f(x_k,y_k,z_k) - \nabla_x f(x_{k-1},y_{k-1},z_{k-1}) - t_{k-1}^x(x_k - x_{k-1})\|\\
&\le \|\nabla_x f(x_k,y_k,z_k) - \nabla_x f(x_{k-1},y_{k-1},z_{k-1})\| + t_{k-1}^x\|x_k - x_{k-1}\|.
\end{aligned}
\]
By Lemma 3.3, the generated sequence $\{\omega_k\}$ is bounded, and hence the gradient $\nabla f(\omega)$ is Lipschitz continuous on a closed bounded set containing the sequence $\{\omega_k\}$. Thus, there exists a constant $L_1 > 0$ such that
\[
\|\theta_k^x\| \le L_1\|\omega_k - \omega_{k-1}\|_F.
\]
Similarly, we can show that there exist positive constants $L_2, L_3$ such that
\[
\|\theta_k^y\| \le L_2\|\omega_k - \omega_{k-1}\|_F,\qquad \|\theta_k^z\| \le L_3\|\omega_k - \omega_{k-1}\|_F.
\]
Theorem 3.1 For the sequence $\{\omega_k\}_{k\in\mathbb{N}}$ generated by Algorithm 3.1, any cluster point $\omega^*$ of it is a critical point of $\Psi(\omega)$, i.e., $0\in\partial\Psi(\omega^*)$.

Proof. By Lemma 3.2, for any $k$ it holds that
\[
\Psi(\omega_{k-1}) - \Psi(\omega_k) \ge \frac{s-1}{2}\|\omega_k - \omega_{k-1}\|^2.
\]
By Lemma 3.4, there exist a constant $L > 0$ and $\theta_k\in\partial\Psi(\omega_k)$ such that for any $k$,
\[
\|\theta_k\|_F \le L\|\omega_k - \omega_{k-1}\|_F.
\]
Combining this with the fact that the sequence $\{\Psi(\omega_k)\}$ is convergent, we obtain the desired result.
4. Numerical Experiments
To give a numerical experiment for the proposed method, we introduce a coefficient $\alpha\in\mathbb{R}$ into the concerned problem and normalize all the latent factors in each step to establish a practical version of the algorithm. In detail, we set
\[
f(x,y,z,\alpha) = \frac{1}{2}\|\mathcal{A} - \alpha\, x\circ y\circ z\|_F^2,
\]
and consider the following optimization problem:
\[
\min_{\alpha,\ \|x\|=\|y\|=\|z\|=1}\ \Psi(x,y,z,\alpha) = f(x,y,z,\alpha) + \lambda_x\|x\|_1 + \lambda_y\|y\|_1 + \lambda_z\|z\|_1.
\]
For this problem, based on Lemma 3.1, we may use the following iterative scheme to generate a new iterate:
\[
\bar{x}_{k+1} = \arg\min_{x}\Big\{ f(x_k,y_k,z_k,\alpha_k) + \langle x - x_k, \nabla_x f(x_k,y_k,z_k,\alpha_k)\rangle + \frac{t_k^x}{2}\|x - x_k\|^2 + \lambda_x\|x\|_1 \Big\},
\]
\[
x_{k+1} = \frac{\bar{x}_{k+1}}{\|\bar{x}_{k+1}\|},\qquad \alpha_k := \alpha_k\|\bar{x}_{k+1}\|;
\]
\[
\bar{y}_{k+1} = \arg\min_{y}\Big\{ f(x_{k+1},y_k,z_k,\alpha_k) + \langle y - y_k, \nabla_y f(x_{k+1},y_k,z_k,\alpha_k)\rangle + \frac{t_k^y}{2}\|y - y_k\|^2 + \lambda_y\|y\|_1 \Big\},
\]
\[
y_{k+1} = \frac{\bar{y}_{k+1}}{\|\bar{y}_{k+1}\|},\qquad \alpha_k := \alpha_k\|\bar{y}_{k+1}\|;
\]
\[
\bar{z}_{k+1} = \arg\min_{z}\Big\{ f(x_{k+1},y_{k+1},z_k,\alpha_k) + \langle z - z_k, \nabla_z f(x_{k+1},y_{k+1},z_k,\alpha_k)\rangle + \frac{t_k^z}{2}\|z - z_k\|^2 + \lambda_z\|z\|_1 \Big\},
\]
\[
z_{k+1} = \frac{\bar{z}_{k+1}}{\|\bar{z}_{k+1}\|},\qquad \alpha_k := \alpha_k\|\bar{z}_{k+1}\|;
\]
\[
\alpha_{k+1} = \arg\min_{\alpha}\ f(x_{k+1},y_{k+1},z_{k+1},\alpha) = \arg\min_{\alpha}\ \frac{1}{2}\|\mathcal{A} - [\alpha;\, x_{k+1},y_{k+1},z_{k+1}]\|_F^2,
\]
where
\[
t_k^x = s\max\{u_k^\top u_k, 1\},\qquad t_k^y = s\max\{v_k^\top v_k, 1\},\qquad t_k^z = s\max\{w_k^\top w_k, 1\},
\]
and
\[
u_k = \alpha_k\, z_k\otimes y_k,\qquad v_k = \alpha_k\, z_k\otimes x_{k+1},\qquad w_k = \alpha_k\, y_{k+1}\otimes x_{k+1},
\]
\[
\nabla_x f(x,y,z,\alpha) = (xu^\top - A_{(1)})u,\qquad \nabla_y f(x,y,z,\alpha) = (yv^\top - A_{(2)})v,\qquad \nabla_z f(x,y,z,\alpha) = (zw^\top - A_{(3)})w.
\]
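For completeness (this step is not spelled out in the paper), the $\alpha$-subproblem above admits a closed-form solution: since $x_{k+1}, y_{k+1}, z_{k+1}$ are unit vectors, $\|x_{k+1}\circ y_{k+1}\circ z_{k+1}\|_F = \|x_{k+1}\|\,\|y_{k+1}\|\,\|z_{k+1}\| = 1$, and hence
\[
\alpha_{k+1} = \frac{\langle\mathcal{A},\ x_{k+1}\circ y_{k+1}\circ z_{k+1}\rangle}{\|x_{k+1}\circ y_{k+1}\circ z_{k+1}\|_F^2}
= (z_{k+1}\otimes y_{k+1}\otimes x_{k+1})^\top \mathrm{vec}(\mathcal{A}),
\]
which is exactly the update $\alpha_{k+1} = q_{k+1}^\top\mathrm{vec}(\mathcal{A})$ used in Algorithm 4.1 below.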
Hence, we have the following practical version of Algorithm 3.1.
Algorithm 4.1
Input: A third-order tensor $\mathcal{A}$, regularization parameters $\lambda_x, \lambda_y, \lambda_z$, and a scalar $s > 1$.
Initial step: Take initial non-zero unit vectors $x_0\in\mathbb{R}^I$, $y_0\in\mathbb{R}^J$, $z_0\in\mathbb{R}^K$, compute $q_0 = z_0\otimes y_0\otimes x_0$ and $\alpha_0 = q_0^\top\mathrm{vec}(\mathcal{A})$. Set $k = 0$.
Iterative step:
Compute $u_k = \alpha_k z_k\otimes y_k$, $t_k^x = s\max\{u_k^\top u_k, 1\}$, and
\[
\bar{x}_{k+1} = S\Big(x_k - \tfrac{1}{t_k^x}\nabla_x f(x_k,y_k,z_k,\alpha_k),\ \tfrac{\lambda_x}{t_k^x}\Big);
\]
set $x_{k+1} = \bar{x}_{k+1}/\|\bar{x}_{k+1}\|$ and $\alpha_k := \alpha_k\|\bar{x}_{k+1}\|$.
Compute $v_k = \alpha_k z_k\otimes x_{k+1}$, $t_k^y = s\max\{v_k^\top v_k, 1\}$, and
\[
\bar{y}_{k+1} = S\Big(y_k - \tfrac{1}{t_k^y}\nabla_y f(x_{k+1},y_k,z_k,\alpha_k),\ \tfrac{\lambda_y}{t_k^y}\Big);
\]
set $y_{k+1} = \bar{y}_{k+1}/\|\bar{y}_{k+1}\|$ and $\alpha_k := \alpha_k\|\bar{y}_{k+1}\|$.
Compute $w_k = \alpha_k y_{k+1}\otimes x_{k+1}$, $t_k^z = s\max\{w_k^\top w_k, 1\}$, and
\[
\bar{z}_{k+1} = S\Big(z_k - \tfrac{1}{t_k^z}\nabla_z f(x_{k+1},y_{k+1},z_k,\alpha_k),\ \tfrac{\lambda_z}{t_k^z}\Big);
\]
set $z_{k+1} = \bar{z}_{k+1}/\|\bar{z}_{k+1}\|$ and $\alpha_k := \alpha_k\|\bar{z}_{k+1}\|$.
Compute $q_{k+1} = z_{k+1}\otimes y_{k+1}\otimes x_{k+1}$ and $\alpha_{k+1} = q_{k+1}^\top\mathrm{vec}(\mathcal{A})$.
Output: The sparse rank-1 tensor $\mathcal{B} = \alpha\, x\circ y\circ z$, where $(x,y,z,\alpha)$ is the limit of $\{(x_k,y_k,z_k,\alpha_k)\}$.
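A NumPy sketch of Algorithm 4.1, again our own illustration (the helper names, the zero-factor guard, and the fixed iteration count are assumptions), differs from the previous sketch only in the normalization of the factors and the handling of the scale $\alpha$:

```python
import numpy as np

def shrink(x, alpha):
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def sparse_rank1_practical(A, lam, s=2.0, max_iter=10000):
    """Sketch of Algorithm 4.1: unit-norm sparse factors x, y, z plus a scale alpha."""
    I, J, K = A.shape
    lam_x, lam_y, lam_z = lam
    A1 = A.reshape(I, J * K, order='F')
    A2 = A.transpose(1, 0, 2).reshape(J, I * K, order='F')
    A3 = A.transpose(2, 0, 1).reshape(K, I * J, order='F')
    vecA = A1.reshape(-1, order='F')                      # vec(A)
    x = np.random.randn(I); x /= np.linalg.norm(x)
    y = np.random.randn(J); y /= np.linalg.norm(y)
    z = np.random.randn(K); z /= np.linalg.norm(z)
    alpha = np.kron(z, np.kron(y, x)) @ vecA              # alpha_0 = q_0^T vec(A)

    def prox_step(Amat, fac, kron_other, lam_f, alpha):
        """Linearized proximal step followed by normalization;
        returns the new unit factor and the norm absorbed into alpha."""
        u = alpha * kron_other
        t = s * max(u @ u, 1.0)
        bar = shrink(fac - ((np.outer(fac, u) - Amat) @ u) / t, lam_f / t)
        nrm = np.linalg.norm(bar)
        if nrm == 0.0:  # all entries shrunk to zero: lambdas too large (cf. Proposition 2.2)
            raise ValueError("factor collapsed to zero; decrease the regularization parameters")
        return bar / nrm, nrm

    for _ in range(max_iter):
        x, nx = prox_step(A1, x, np.kron(z, y), lam_x, alpha); alpha *= nx
        y, ny = prox_step(A2, y, np.kron(z, x), lam_y, alpha); alpha *= ny
        z, nz = prox_step(A3, z, np.kron(y, x), lam_z, alpha); alpha *= nz
        alpha = np.kron(z, np.kron(y, x)) @ vecA          # closed-form alpha update
    return alpha, x, y, z
```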
To test the performance of the proposed method, we carry out a set of numerical experiments in MATLAB 2014b on a Lenovo computer with 8GB RAM and an i7-7700 processor. In the experiments, we take three third-order test tensors of distinct dimensions whose entries are generated from a normal distribution, and the algorithm terminates when the iteration number exceeds 10000. The numerical results are shown in Table 4.1, where "running time" is the computing time and "sparsity" is the number of nonzero entries in the output latent factors for each problem.
Table 4.1 Numerical results for Algorithm 4.1

dimension      parameter                        running time    sparsity
(20,20,30)     s = 2, λ = (200, 150, 100)       6.424           (1,1,1)
(20,30,40)     s = 2, λ = (45, 30, 17)          8.482           (6,6,5)
(30,40,50)     s = 2, λ = (487, 230, 120)       19.200          (1,4,3)
(30,40,50)     s = 2, λ = (485, 230, 120)       17.068          (4,2,2)
From Table 4.1, we can see that the algorithm provides a sparse rank-1 approximation to a third-order tensor within a short running time.
5. Conclusion
This paper presented a proximal alternating minimization method for the best sparse rank-1 approximation problem of higher-order tensors, and its global convergence is guaranteed without restrictive assumptions. A set of numerical experiments is given to show the efficiency of the method. From the numerical experiments, we can see that the proposed method is sensitive to the selection of the regularization parameters $\lambda_x, \lambda_y, \lambda_z$; thus, to improve the performance of the method, the choice of the regularization parameters should be further studied.

References
[1] A. Aspremont, F. Bach, L. Ghaoui, Optimal solutions for sparse principal component analysis, J. Mach. Learn. Res., 9 (2008) 1269-1294.
[2] J. Bolte, S. Sabach, M. Teboulle, Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program., 146 (2014) 459-494.
[3] E.J. Candès, M. Wakin, S. Boyd, Enhancing sparsity by reweighted l1 minimization, J. Fourier Anal. Appl., 14 (2008) 877-905.
[4] Y. Chen, Y. Dai, D. Han, W. Sun, Positive semidefinite generalized diffusion tensor imaging via quadratic semidefinite programming, SIAM J. Imaging Sci., 6 (2013) 1531-1552.
[5] L.B. Cui, M.H. Li, Y.S. Song, Preconditioned tensor splitting iterations method for solving multi-linear systems, Appl. Math. Lett., 96 (2019) 89-94.
[6] D.R. Han, L. Qi, H.H. Dai, Conditions for strong ellipticity of anisotropic elastic materials, J. Elasticity, 97 (2009) 1-13.
[7] S.L. Hu, L. Qi, G.F. Zhang, Computing the geometric measure of entanglement of multipartite pure states by means of non-negative tensors, Phys. Rev. A, 93 (2016) 012304.
[8] Z.H. Huang, L. Qi, Formulating an n-person noncooperative game as a tensor complementarity problem, Comput. Optim. Appl., 66 (2017) 557-576.
[9] E. Hyman, P. Kauraniemi, S. Hautaniemi, M. Wolf, et al., Impact of DNA amplification on gene expression patterns in breast cancer, Cancer Res., 62 (2002) 6240-6245.
[10] J. Ji, Y.M. Wei, The Drazin inverse of an even-order tensor and its application to singular tensor equations, Comput. Math. Appl., 75 (2018) 3402-3413.
[11] C. Ling, H. He, L. Qi, On the cone eigenvalue complementarity problem for higher-order tensors, Comput. Optim. Appl., 63 (2016) 1-26.
[12] M.R. Osborne, B. Presnell, B.A. Turlach, On the Lasso and its dual, J. Comput. Graph. Statist., 9 (2000) 319-337.
[13] E.E. Papalexakis, N.D. Sidiropoulos, R. Bro, From K-means to higher-way co-clustering: multilinear decomposition with sparse latent factors, IEEE Trans. Signal Process., 61 (2013) 493-506.
[14] L. Qi, H. Chen, Y. Chen, Tensor Eigenvalues and Their Applications, Springer, Singapore, 2018.
[15] R. Ruiters, R. Klein, BTF compression via sparse tensor decomposition, Eurographics Symposium on Rendering, 28 (2009) 1181-1188.
[16] N.D. Sidiropoulos, A. Kyrillidis, Multi-way compressed sensing for sparse low-rank tensors, IEEE Signal Process. Lett., 19 (2012) 757-760.
[17] W.W. Sun, J.W. Lu, H. Liu, Provable sparse tensor decomposition, J. Roy. Statist. Soc. Ser. B, 79 (2017) 899-916.
[18] Y. Wang, W. Liu, L. Caccetta, G. Zhou, Parameter selection for nonnegative l1 matrix/tensor sparse decomposition, Oper. Res. Lett., 43 (2015) 423-426.
[19] X.F. Wang, C. Navasca, Low-rank approximation of tensors via sparse optimization, Numer. Linear Algebra Appl., 25 (2018) e2136.
[20] D.M. Witten, R. Tibshirani, T. Hastie, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, Biostatistics, 10 (2009) 515-534.
[21] N. Zhao, Q.Z. Yang, Y.J. Liu, Computing the generalized eigenvalues of weakly symmetric tensors, Comput. Optim. Appl., 66 (2017) 285-307.
[22] L. Zhang, L. Qi, G. Zhou, M-tensors and some applications, SIAM J. Matrix Anal. Appl., 35 (2014) 437-452.
[23] X.Z. Zhang, Z.H. Huang, L. Qi, Comon's conjecture, rank decomposition, and symmetric rank decomposition of symmetric tensors, SIAM J. Matrix Anal. Appl., 37 (2016) 1719-1728.
CRediT Author Statement
Yiju Wang: Conceptualization, Methodology, Investigation, Writing - Reviewing and Editing.
Manman Dong: Development or design of methodology, Writing - Original draft preparation.
Yi Xu: Software, Validation.