Exact probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors

Linear Algebra and its Applications 438 (2013) 663–667


Göran Bergqvist
Matematiska institutionen, Linköpings universitet, SE-581 83 Linköping, Sweden

ARTICLE INFO

Article history: Received 12 May 2010; Accepted 16 February 2011; Available online 24 March 2011
Submitted by V. Mehrmann
AMS classification: 15A69; 15B52
Keywords: Tensors; Multi-way arrays; Typical rank; Random matrices

ABSTRACT

We show that the probability that a 2 × 2 × 2 tensor with elements from a standard normal distribution has rank 2 is π/4, and that the probability that a 3 × 3 × 2 tensor has rank 3 is 1/2. In the proof, results on the expected number of real generalized eigenvalues of random matrices are applied. For n × n × 2 tensors with n ≥ 4 we also present some new aspects of their rank. © 2011 Elsevier Inc. All rights reserved.

1. Introduction

The rank concept for multi-way arrays or tensors is not as simple as for matrices. If T is a real m × n × p 3-way array (or 3-tensor), then it can be expanded as

T = Σ_{i=1}^{r} ci ui ⊗ vi ⊗ wi   (ci ∈ R, ui ∈ R^m, vi ∈ R^n, wi ∈ R^p),   (1)

[2,8,10], where ⊗ denotes the tensor (or outer) product. This is called the CP expansion (or CANDECOMP for canonical decomposition, or PARAFAC for parallel factors) of T, and the extension to higher order tensors or arrays is straightforward. There are several rank concepts for tensors [1,3,4,7,10,11]. The rank of T is the minimal possible value of r in the CP expansion (1) and is always well defined. The column (mode-1) rank r1 of T is the dimension of the subspace of R^m spanned by the np columns of T (one such column for each fixed pair of indices (j, k)). The row (mode-2) rank r2 and the mode-3 rank r3 are defined analogously. The triple (r1, r2, r3) is called the multirank of T. A typical rank of T is a rank which appears with nonzero probability when the elements Tijk are randomly chosen from a continuous probability distribution. A generic rank is a typical rank which appears with probability 1.

In the matrix case, the number of terms r in the singular value expansion is always equal to the column and row ranks of the matrix. For tensors, however, r, r1, r2 and r3 can all be different. For matrices the typical and generic ranks of an m × n matrix are always min(m, n), but for a higher order tensor a generic rank over the real numbers does not necessarily exist (over the complex numbers a generic rank always exists; in this paper we only consider real tensors and real CP expansions). Both the typical and generic ranks of an m × n × p tensor may be strictly greater than min(m, n, p), and they are in general hard to calculate.

In 1989, using numerical simulations with each tensor element drawn from a normal distribution with zero mean, Kruskal [12] reported that the probabilities for a 2 × 2 × 2 tensor to have rank 2 or 3 are approximately 79% and 21%, respectively. Hence, ranks 2 and 3 are typical, and no generic rank exists (over the complex numbers it is 2). It was later shown [15], for all n ≥ 2, that for n × n × 2 tensors there is no generic rank and that the typical ranks are n and n + 1. The generic and typical ranks of m × n × p tensors for several small values of m, n and p have recently been determined [3,16]. While a generic rank seems to exist for most (m, n, p), another case with two typical ranks is 5 × 3 × 3 tensors, for which 5 and 6 are typical ranks. Until now, the probabilities for random tensors to be of the different possible typical ranks have only been studied by numerical simulations. Below we derive the first exact values of such probabilities, namely for 2 × 2 × 2 and 3 × 3 × 2 tensors.

For other aspects of the CP expansion, such as uniqueness of the expansion, estimates of maximal rank, and low rank approximations, we refer to the review paper [10] and the references therein; other types of tensor decompositions (e.g., Tucker and higher order singular value decompositions) and applications are also described there.
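Since the mode-k ranks are just matrix ranks of the unfoldings of T, they are easy to compute numerically. The following minimal sketch (using numpy, which is an assumption of this illustration and not part of the paper) computes the multirank of a random tensor; the CP rank itself admits no such simple formula.

# Minimal sketch (assuming numpy): the multirank (r1, r2, r3) of a random
# m x n x p tensor, computed as the matrix ranks of its three unfoldings.
import numpy as np

def multirank(T):
    m, n, p = T.shape
    r1 = np.linalg.matrix_rank(T.reshape(m, n * p))                     # span of mode-1 fibers in R^m
    r2 = np.linalg.matrix_rank(np.moveaxis(T, 1, 0).reshape(n, m * p))  # span of mode-2 fibers in R^n
    r3 = np.linalg.matrix_rank(np.moveaxis(T, 2, 0).reshape(p, m * n))  # span of mode-3 fibers in R^p
    return r1, r2, r3

T = np.random.default_rng(0).standard_normal((2, 2, 2))
print(multirank(T))   # (2, 2, 2) with probability 1, while the CP rank is 2 or 3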

2. A characterization of n × n × 2 tensors of rank n

We now assume that T is a real n × n × 2 tensor whose elements Tijk, 1 ≤ i, j ≤ n, 1 ≤ k ≤ 2, are picked from some continuous probability distribution. The theorem presented in this section is essentially given by ten Berge [14], although the probabilistic view was not used there (but see [13]). Define the n × n matrices T1 and T2, the frontal slices of T, by (T1)ij = Tij1 and (T2)ij = Tij2. We have

Theorem 1. With probability 1,

rank(T) = n  ⟺  det(T2 − λT1) = 0 has n real solutions.

Proof. Suppose first that rank(T) = n. Then we can write

T = Σ_{i=1}^{n} ui ⊗ vi ⊗ (di, ei)^T.   (2)

Following [14], we define the n × n matrices U with columns ui, V with columns vi, D diagonal with the elements di on the diagonal, and E diagonal with the elements ei on the diagonal. Then T1 = U D V^T and T2 = U E V^T. If we assume that T1 is invertible (true with probability 1), then det U ≠ 0 ≠ det V and all di ≠ 0. Hence the generalized eigenvalue equation 0 = det(T2 − λT1) = det U det(E − λD) det V = det U det V Π_{i=1}^{n} (ei − λdi) has n real solutions.

Conversely, if det(T2 − λT1) = 0 has n real solutions, then (again assuming T1 invertible) det(T2 T1⁻¹ − λI) = 0 has n real solutions. With probability 1, all eigenvalues of T2 T1⁻¹ are distinct and it can be diagonalized as T2 T1⁻¹ = P Λ P⁻¹, where Λ = diag(λ1, ..., λn) (with coinciding eigenvalues there could be non-trivial Jordan blocks). Define U = P and V^T = P⁻¹ T1. Then T1 = P V^T = U I V^T and T2 = P Λ P⁻¹ T1 = U Λ V^T, and T has the expansion

T = Σ_{i=1}^{n} ui ⊗ vi ⊗ (1, λi)^T   (3)

and is therefore of rank n. □

Notice that in the proof, the CP expansion of T is actually constructed. The exceptional (probability 0) cases were discussed in some detail by ten Berge [14] but are not relevant for the results of this paper.
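The construction in the proof is easy to carry out numerically. The sketch below is a hedged illustration (assuming numpy; it is not code from the paper): it diagonalizes T2 T1⁻¹ and, when all generalized eigenvalues are real, assembles the rank-n expansion (3); complex eigenvalues signal the other typical rank n + 1.

# Numerical sketch of the construction in the proof of Theorem 1 (assumes numpy).
import numpy as np

def cp_from_slices(T1, T2, tol=1e-10):
    n = T1.shape[0]
    # Diagonalize T2 T1^{-1} = P Lambda P^{-1}, as in the proof.
    lam, P = np.linalg.eig(T2 @ np.linalg.inv(T1))
    if np.max(np.abs(lam.imag)) > tol:
        return None                       # complex eigenvalues: rank is n + 1 (the other typical rank)
    lam, P = lam.real, P.real
    U = P                                 # columns u_i
    Vt = np.linalg.inv(P) @ T1            # rows v_i^T, so that T1 = U Vt
    # Assemble T = sum_i u_i (x) v_i (x) (1, lam_i)^T as an n x n x 2 array.
    T = np.zeros((n, n, 2))
    for i in range(n):
        outer = np.outer(U[:, i], Vt[i, :])
        T[:, :, 0] += outer               # slice T1
        T[:, :, 1] += lam[i] * outer      # slice T2
    return T

rng = np.random.default_rng(0)
T1, T2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
T = cp_from_slices(T1, T2)
if T is not None:
    print(np.allclose(T[:, :, 0], T1), np.allclose(T[:, :, 1], T2))   # True True when the rank is 3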

3. The expected number of real generalized eigenvalues of random matrices

Consider random matrices whose elements are independent random variables, normally distributed with mean 0 and variance 1. Kanzieper and Akemann [9] (see also Edelman [6]) have found the probabilities for such a random n × n matrix A to have k real eigenvalues, i.e., the probability of having k real solutions to det(A − λI) = 0. Since T2 T1⁻¹ is not normally distributed we cannot apply their result to it. The same problem for the generalized eigenvalue problem det(A − λB) = 0, with B having the same distribution as A, seems to be unsolved. In [5], however, Edelman et al. found both the expected number of real eigenvalues and the expected number of real generalized eigenvalues for such random matrices. Let En be the expectation value of the number of real solutions to det(A − λB) = 0 with A and B as above. Then [5]

En = √π Γ((n+1)/2) / Γ(n/2).   (4)

It turns out that this information is sufficient to solve our problem for n × n × 2 tensors with n = 2 and n = 3.
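Formula (4) is also easy to check numerically. The sketch below (assuming numpy and scipy, which are not part of the paper) compares the exact value of En with a Monte Carlo estimate obtained by counting real generalized eigenvalues of random Gaussian pairs.

# Check of formula (4): E_n = sqrt(pi) * Gamma((n+1)/2) / Gamma(n/2),
# against a Monte Carlo count of real solutions of det(A - lambda B) = 0.
import numpy as np
from scipy.special import gamma
from scipy.linalg import eigvals

def E_exact(n):
    return np.sqrt(np.pi) * gamma((n + 1) / 2) / gamma(n / 2)

def E_monte_carlo(n, trials=20000, seed=0):
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(trials):
        A = rng.standard_normal((n, n))
        B = rng.standard_normal((n, n))
        lam = eigvals(A, B)                     # generalized eigenvalues of (A, B)
        count += np.sum(np.abs(lam.imag) < 1e-10)
    return count / trials

for n in (2, 3, 4, 5):
    print(n, E_exact(n), E_monte_carlo(n))      # e.g. E_2 = pi/2, E_3 = 2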

4. Probabilities for typical ranks of 2 × 2 × 2 and 3 × 3 × 2 tensors

Now assume that T is an n × n × 2 tensor whose elements are independent random variables, normally distributed with mean 0 and variance 1. Then the slices T1 and T2 are random n × n matrices. The expected number En of real solutions to det(T2 − λT1) = 0 is given by (4). We also have

En = Σ_{k=1}^{n} k pn(k),   (5)

where pn(k) is the probability of having k real generalized eigenvalues. Since complex eigenvalues come in pairs we can write

En = Σ_{k=0}^{[(n−1)/2]} (n − 2k) pn(n − 2k).   (6)

The following is our main theorem.

Theorem 2. Suppose that T is an n × n × 2 tensor whose elements are independent random variables which are normally distributed with mean 0 and variance 1, and let Pn denote the probability that T has rank n. Then P2 = π/4 and P3 = 1/2.

Proof. By Theorem 1, Pn = pn(n) is equal to the probability that det(T2 − λT1) = 0 has n real solutions.

For n = 2, by (4), the expected number of real solutions is E2 = √π Γ(3/2)/Γ(1). Recall that Γ(m) = (m − 1)! if m is an integer and Γ(m/2) = √π (m − 2)!!/2^{(m−1)/2} if m is an odd integer. Hence E2 = √π · (√π/2)/1 = π/2. By (6), E2 = 2 p2(2), and we conclude that P2 = p2(2) = π/4.

For n = 3, again by (4), E3 = √π Γ(2)/Γ(3/2) = √π · 1/(√π/2) = 2. By (6), E3 = 3 p3(3) + 1 p3(1) = 3 p3(3) + (1 − p3(3)) = 2 p3(3) + 1 = 2 P3 + 1, so we conclude that P3 = 1/2. □
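A direct Monte Carlo check of Theorem 2 is straightforward. The following sketch (assuming numpy and scipy; an illustration only, not code from the paper) estimates Pn as the fraction of random n × n × 2 tensors for which all solutions of det(T2 − λT1) = 0 are real.

# Monte Carlo estimate of P_n (Theorem 1: rank n iff all generalized eigenvalues real).
import numpy as np
from scipy.linalg import eigvals

def rank_n_probability(n, trials=50000, seed=1):
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        T1 = rng.standard_normal((n, n))
        T2 = rng.standard_normal((n, n))
        lam = eigvals(T2, T1)                     # solutions of det(T2 - lambda T1) = 0
        hits += np.all(np.abs(lam.imag) < 1e-10)  # rank n iff all solutions are real
    return hits / trials

print(rank_n_probability(2), np.pi / 4)   # both close to 0.7854
print(rank_n_probability(3), 0.5)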

Since the only other typical rank of an n × n × 2 tensor is n + 1, we immediately have:

Corollary 3. Suppose that T is an n × n × 2 tensor whose elements are independent random variables which are normally distributed with mean 0 and variance 1, and let P̃n denote the probability that T has rank n + 1. Then P̃2 = 1 − π/4 and P̃3 = 1/2.

For n ≥ 4, there are at least three terms in (6), so the condition Σ_{k=0}^{[n/2]} pn(n − 2k) = 1 is not sufficient to determine Pn = pn(n). For n = 4 we get

2 p4(2) + 4 p4(4) = E4 = √π Γ(5/2)/Γ(2) = 3π/4;   p4(0) + p4(2) + p4(4) = 1,   (7)

and for n = 5

1 p5(1) + 3 p5(3) + 5 p5(5) = E5 = √π Γ(3)/Γ(5/2) = 8/3;   p5(1) + p5(3) + p5(5) = 1,   (8)

and so on for increasing values of n.
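Although (7) and (8) do not determine the individual probabilities, these can at least be estimated by simulation. The sketch below (assuming numpy and scipy; the estimates are illustrative and not results of the paper) approximates p4(0), p4(2) and p4(4).

# Simulation estimates of p_4(k), the probability of k real generalized eigenvalues.
import numpy as np
from scipy.linalg import eigvals

def p_estimates(n, trials=50000, seed=2):
    rng = np.random.default_rng(seed)
    counts = {}
    for _ in range(trials):
        A = rng.standard_normal((n, n))
        B = rng.standard_normal((n, n))
        k = int(np.sum(np.abs(eigvals(A, B).imag) < 1e-10))  # number of real solutions
        counts[k] = counts.get(k, 0) + 1
    return {k: c / trials for k, c in sorted(counts.items())}

print(p_estimates(4))   # rough estimates of p_4(0), p_4(2), p_4(4); note p_4(4) = P_4 by Theorem 1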

Remark. It is interesting to interpret the above results as statements about how often curves intersect. For the case n = 2, expanding det(T2 − xT1) = 0 into an equation of the type (a1 x − a2)(a3 x − a4) = (a5 x − a6)(a7 x − a8), we see that the probability for two parabolas with real roots to intersect is π/4 when the coefficients ai are chosen from a standard normal distribution. The equation has real solutions if B² − 4AC ≥ 0, where A = a1 a3 − a5 a7, B = a5 a8 + a6 a7 − a1 a4 − a2 a3, and C = a2 a4 − a6 a8 (for T, B² − 4AC is just its hyperdeterminant [4]). It is therefore very simple to verify that the value of P2 must be close to π/4 by running a large number of tests with each ai drawn from the normal distribution N(0, 1) and checking how often B² ≥ 4AC; a sketch of such a test is given below. Since π/4 ≈ 0.7854, we see that the value 79% from Kruskal's original simulation [12] is a good approximation. For n = 3 a corresponding result for cubic equations is obtained, and simple simulations likewise show that the value of P3 must be near the exact value 1/2 found above.
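The test described in the remark takes only a few lines. The sketch below (assuming numpy, not part of the paper) draws the coefficients ai from N(0, 1) and reports how often B² ≥ 4AC.

# Fraction of draws with B^2 >= 4AC; should be close to pi/4 ~ 0.7854.
import numpy as np

rng = np.random.default_rng(3)
trials = 200000
a = rng.standard_normal((trials, 8))                     # a1, ..., a8 for each trial
A = a[:, 0] * a[:, 2] - a[:, 4] * a[:, 6]                # A = a1 a3 - a5 a7
B = a[:, 4] * a[:, 7] + a[:, 5] * a[:, 6] - a[:, 0] * a[:, 3] - a[:, 1] * a[:, 2]
C = a[:, 1] * a[:, 3] - a[:, 5] * a[:, 7]                # C = a2 a4 - a6 a8
print(np.mean(B**2 >= 4 * A * C), np.pi / 4)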

References

[1] G. Bergqvist, E.G. Larsson, The higher-order singular value decomposition: theory and an application, IEEE Signal Proc. Mag. 27 (2010) 151–154.
[2] J.D. Carroll, J.J. Chang, Analysis of individual differences in multidimensional scaling via N-way generalization of Eckart–Young decomposition, Psychometrika 35 (1970) 283–319.
[3] P. Comon, J.M.F. ten Berge, L. De Lathauwer, J. Castaing, Generic and typical ranks of multi-way arrays, Linear Algebra Appl. 430 (2009) 2997–3007.
[4] V. De Silva, L.-H. Lim, Tensor rank and the ill-posedness of the best low-rank approximation problem, SIAM J. Matrix Anal. Appl. 30 (2008) 1084–1127.
[5] A. Edelman, E. Kostlan, M. Shub, How many eigenvalues of a random matrix are real?, J. Amer. Math. Soc. 7 (1994) 247–267.
[6] A. Edelman, The probability that a random real Gaussian matrix has k real eigenvalues, related distributions, and the circular law, J. Multivariate Anal. 60 (1997) 203–232.
[7] S. Friedland, On the generic rank of 3-tensors, 2009. Available from: arXiv:0805.3777v3.
[8] R. Harshman, Foundations of the PARAFAC procedure: models and conditions for an explanatory multi-modal factor analysis, UCLA Working Pap. Phonetics 16 (1970) 1–84.
[9] E. Kanzieper, G. Akemann, Statistics of real eigenvalues in Ginibre's ensemble of random real matrices, Phys. Rev. Lett. 95 (2005) 230201.
[10] T.G. Kolda, B.W. Bader, Tensor decompositions and applications, SIAM Rev. 51 (2009) 455–500.
[11] J.B. Kruskal, Three-way arrays: rank and uniqueness of trilinear decompositions, with applications to arithmetic complexity and statistics, Linear Algebra Appl. 18 (1977) 95–138.
[12] J.B. Kruskal, Rank, decomposition, and uniqueness for 3-way and N-way arrays, in: Multiway Data Analysis, North-Holland, Amsterdam, 1989, pp. 7–18.
[13] A. Stegeman, Degeneracy in CANDECOMP/PARAFAC explained for p × p × 2 arrays of rank p + 1 or higher, Psychometrika 71 (2006) 483–550.
[14] J.M.F. ten Berge, Kruskal's polynomial for 2 × 2 × 2 arrays and a generalization to 2 × n × n arrays, Psychometrika 56 (1991) 631–636.
[15] J.M.F. ten Berge, H.A.L. Kiers, Simplicity of core arrays in three-way principal component analysis and the typical rank of p × q × 2 arrays, Linear Algebra Appl. 294 (1999) 169–179.
[16] J.M.F. ten Berge, A. Stegeman, Symmetry transformations for squared sliced three-way arrays, with applications to their typical rank, Linear Algebra Appl. 418 (2006) 215–224.