Linear Algebra and its Applications 306 (2000) 45–57 www.elsevier.com/locate/laa
On sums of three square-zero matrices

K. Takahashi¹
Department of Mathematics, Hokkaido University, Sapporo 060, Japan

Received 23 September 1999; accepted 15 October 1999
Submitted by R.A. Brualdi

All correspondence to: Pei Yuan Wu, Department of Applied Mathematics, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan, ROC. E-mail address: [email protected] (P.Y. Wu).
¹ Deceased.
Abstract

Wang and Wu characterized matrices which are sums of two square-zero matrices, and proved that every matrix with trace zero is a sum of four square-zero matrices. Moreover, they gave necessary or sufficient conditions for a matrix to be a sum of three square-zero matrices. In particular, they proved that if an n × n matrix A is a sum of three square-zero matrices, then dim ker(A − αI) ≤ 3n/4 for every scalar α ≠ 0. Proposition 1 shows that this condition is not sufficient for A to be a sum of three square-zero matrices, and characterizes sums of three square-zero matrices among matrices with minimal polynomial of degree 2. © 2000 Elsevier Science Inc. All rights reserved.
Let C^{n×m} denote the space of all n × m complex matrices. The block diagonal matrix with diagonal blocks A1, . . . , Am is denoted by A1 ⊕ · · · ⊕ Am, and if A1 = · · · = Am = A, then we write A1 ⊕ · · · ⊕ Am = A^{(m)}. Let In denote the n × n identity matrix.

Proposition 1. Let A be an n × n matrix with tr A = 0 and assume that A is similar to
\[
\begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(m)} \oplus \alpha I_r,
\]
where α ≠ β and m, r ≥ 1. Then A is a sum of three square-zero matrices if and only if r is a divisor of 2m.

Lemma 1. Let
\[
A = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}^{(2)} \in C^{4\times 4}
\quad\text{and}\quad \alpha \ne \beta .
\]
Then, for any γ and δ such that γ ≠ δ and γ + δ = α + β, there is a square-zero matrix N such that A + N is similar to
\[
\begin{pmatrix} \gamma & 1 \\ 0 & \gamma \end{pmatrix} \oplus \begin{pmatrix} \delta & 1 \\ 0 & \delta \end{pmatrix}.
\]

Proof. Clearly, A is similar to
\[
\begin{pmatrix} \gamma I_2 & I_2 \\ cI_2 & \delta I_2 \end{pmatrix},
\]
where c = (α − γ)(γ − β). By considering the matrix
\[
S \begin{pmatrix} \gamma I_2 & I_2 \\ cI_2 & \delta I_2 \end{pmatrix} S^{-1}
\quad\text{for}\quad
S = I_1 \oplus \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \oplus I_1 ,
\]
we see that there is a square-zero matrix N such that A + N is similar to
\[
\begin{pmatrix}
\gamma & 1 & 1 & 0 \\
0 & \gamma & 0 & 1 \\
0 & 0 & \delta & -1 \\
0 & 0 & 0 & \delta
\end{pmatrix},
\]
which is similar to
\[
\begin{pmatrix} \gamma & 1 \\ 0 & \gamma \end{pmatrix} \oplus \begin{pmatrix} \delta & 1 \\ 0 & \delta \end{pmatrix}
\]
because γ ≠ δ.
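For illustration (this is an addition, not part of the original argument), the construction in the proof of Lemma 1 can be checked numerically for one choice of parameters. The explicit square-zero correction K below is our own choice, obtained by subtracting S M S^{-1} from the displayed 4 × 4 matrix; the proof only asserts that some such N exists.

```python
import numpy as np

# Parameters with alpha != beta, gamma != delta and gamma + delta = alpha + beta.
alpha, beta = 2.0, -1.0
gamma, delta = 3.0, -2.0
c = (alpha - gamma) * (gamma - beta)

I2 = np.eye(2)
M = np.block([[gamma * I2, I2], [c * I2, delta * I2]])   # similar to (alpha 0; 0 beta)^{(2)}
S = np.eye(4)
S[2, 1] = 1.0                                            # S = I_1 (+) (1 0; 1 1) (+) I_1

# The displayed 4x4 matrix of the proof, similar to J_2(gamma) (+) J_2(delta).
T = np.array([[gamma, 1.0, 1.0, 0.0],
              [0.0, gamma, 0.0, 1.0],
              [0.0, 0.0, delta, -1.0],
              [0.0, 0.0, 0.0, delta]])

# Hypothetical square-zero correction (not displayed in the paper): K = T - S M S^{-1}.
K = T - S @ M @ np.linalg.inv(S)
assert np.allclose(K @ K, 0)                             # K is square-zero

# M has the same spectrum as A = (alpha 0; 0 beta)^{(2)} ...
assert np.allclose(np.sort(np.linalg.eigvals(M).real),
                   np.sort([beta, beta, alpha, alpha]))
# ... and the perturbed matrix has eigenvalues gamma, delta with one Jordan
# block of size 2 for each.
X = S @ M @ np.linalg.inv(S) + K
assert np.allclose(X, T)
assert np.linalg.matrix_rank(X - gamma * np.eye(4)) == 3
assert np.linalg.matrix_rank(X - delta * np.eye(4)) == 3
print("A + N is similar to J_2(gamma) (+) J_2(delta) for this example.")
```

Running the script confirms that K is square-zero and that the perturbed matrix has the claimed Jordan structure.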
The following lemma is a special case of Proposition 1.

Lemma 2. Let A be an n × n matrix with tr A = 0 and suppose that A is similar to
\[
\begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(m)} \oplus \alpha I_r,
\]
where α and β are scalars with α ≠ β. If r ≤ 2, then A is a sum of three square-zero matrices.

Proof. Since tr A = 0, the condition α ≠ β is equivalent to α ≠ 0. The case when r = 0 or 1 follows from [1, Proposition 3.3] and its proof. (Indeed, since α ≠ β, the proof of [1, Proposition 3.3] with c = −α shows the case of r = 1.) Thus we
consider the case when r = 2. Suppose that m is even. Then A is similar to A1 ⊕ A1, where
\[
A_1 = \begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(m/2)} \oplus \alpha I_1
\quad\text{and}\quad \operatorname{tr} A_1 = 0 .
\]
As remarked above, A1 is a sum of three square-zero matrices, and hence so is A. Next, suppose that m is odd and m = 2k + 1. The condition tr A = 0 implies β + α = −2α/m. By Lemma 1, for 1 ≤ i ≤ k, there is a square-zero matrix Ni such that
\[
\begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(2)} + N_i
\]
is similar to
\[
\begin{pmatrix} -(2i+1)\alpha/m & 1 \\ 0 & -(2i+1)\alpha/m \end{pmatrix} \oplus
\begin{pmatrix} (2i-1)\alpha/m & 1 \\ 0 & (2i-1)\alpha/m \end{pmatrix}.
\]
Also, there is a square-zero matrix M such that the matrix
\[
\begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix} + M
\]
is similar to
\[
\begin{pmatrix} -\alpha/m & 1 \\ 0 & -\alpha/m \end{pmatrix}
\]
(see [1] or the proof of Lemma 1). Let
\[
N = N_1 \oplus \cdots \oplus N_k \oplus M \oplus \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
\]
Then N² = 0 and the matrix
\[
\left( \begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(m)} \oplus \alpha I_2 \right) + N
\]
is similar to
\[
B = \bigoplus_{i=1}^{k} \left[
\begin{pmatrix} -(2i+1)\alpha/m & 1 \\ 0 & -(2i+1)\alpha/m \end{pmatrix} \oplus
\begin{pmatrix} (2i-1)\alpha/m & 1 \\ 0 & (2i-1)\alpha/m \end{pmatrix}
\right] \oplus
\begin{pmatrix} -\alpha/m & 1 \\ 0 & -\alpha/m \end{pmatrix} \oplus
\begin{pmatrix} \alpha & 1 \\ 0 & \alpha \end{pmatrix}.
\]
Clearly, B is similar to −B, and so by [1, Theorem 2.11] B is a sum of two square-zero matrices. Therefore it follows that A is a sum of three square-zero matrices.

Lemma 3. Let A be an n × n matrix whose minimal polynomial is m(λ) = (λ − α)(λ − β), and let N be an n × n square-zero matrix. If γ is an eigenvalue of A + N and γ ≠ α, β, then α + β − γ is also an eigenvalue of A + N.
Proof. Since N² = 0, we can take an invertible matrix P such that
\[
P^{-1} N P = \begin{pmatrix} 0 & 0 \\ N_1 & 0 \end{pmatrix},
\]
where N1 ∈ C^{r×(n−r)}. Let
\[
P^{-1} A P = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},
\]
where A21 ∈ C^{r×(n−r)}. Then, since (A − αI)(A − βI) = 0, we have
\[
A_{11} A_{12} + A_{12} A_{22} = (\alpha + \beta) A_{12},
\]
and the invariant polynomials of the matrix polynomials
\[
[\, A_{11} - \lambda I, \; A_{12} \,]
\quad\text{and}\quad
\begin{pmatrix} A_{12} \\ A_{22} - \lambda I \end{pmatrix}
\]
are divisors of (λ − α)(λ − β). Hence the lemma follows from [2, Theorem 6(b)].
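As a sanity check of Lemma 3 (an illustrative sketch with randomly generated data, not part of the paper), the eigenvalue pairing can be verified numerically; the particular sizes, eigenvalue multiplicities and the way the square-zero N is generated are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha, beta = 8, 2.0, -1.0

# A with minimal polynomial (lambda - alpha)(lambda - beta): diagonalizable,
# with both alpha and beta as eigenvalues, in a random basis.
P = rng.standard_normal((n, n))
A = P @ np.diag([alpha] * 5 + [beta] * 3) @ np.linalg.inv(P)

# An independent random square-zero matrix N of rank 3: N = U V^T with V^T U = 0.
U = rng.standard_normal((n, 3))
V = rng.standard_normal((n, 3))
V = V - U @ np.linalg.solve(U.T @ U, U.T @ V)   # enforce V^T U = 0
N = U @ V.T
assert np.allclose(N @ N, 0)

eigs = np.linalg.eigvals(A + N)
for g in eigs:
    if not (np.isclose(g, alpha, atol=1e-5) or np.isclose(g, beta, atol=1e-5)):
        # Lemma 3: alpha + beta - g must also be an eigenvalue of A + N.
        assert np.any(np.isclose(eigs, alpha + beta - g, atol=1e-6))
print("Every eigenvalue g of A + N other than alpha, beta is paired with alpha + beta - g.")
```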
Proof of Proposition 1. Suppose that 2m = rs for some integer s. Then, according as r is odd or even, A is similar to C^{(r)} or D^{(r/2)}, where
\[
C = \begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(s/2)} \oplus \alpha I_1
\quad\text{and}\quad
D = \begin{pmatrix} \beta & 0 \\ 0 & \alpha \end{pmatrix}^{(s)} \oplus \alpha I_2 .
\]
In each case, the condition tr A = 0 implies tr C = 0 or tr D = 0. By Lemma 2, the matrices C and D are sums of three square-zero matrices, and so A is a sum of three square-zero matrices.

Conversely, assume that A is a sum of three square-zero matrices. Then there is a square-zero matrix N such that A + N is a sum of two square-zero matrices. Since rank(A − αI) < n/2 and rank N ≤ n/2 because N is square-zero, we have α ∈ σ(A + N), so it follows from [1, Theorem 2.11] that −α ∈ σ(A + N). Since tr A = 0, the conditions α ≠ β and r ≥ 1 imply that −(kα + (k − 1)β) ≠ β for every integer k ≥ 1 (and −α ≠ α). Therefore, if −(kα + (k − 1)β) ≠ α for all integers k ≥ 2, then it would follow from Lemma 3 and [1, Theorem 2.11] that kα + (k − 1)β ∈ σ(A + N) for all k, which is impossible. Thus we have −(kα + (k − 1)β) = α for some integer k ≥ 2, and therefore 2m = (k − 1)r because tr A = 0.

For a matrix A, let µA = max{dim ker(A − αI) : α ∈ C}. If tr A = 0 and µA = dim ker A, then the rational form of A shows that A is similar to A1 ⊕ · · · ⊕ Am ⊕ 0, where each Ai is a cyclic matrix of size at least 2. By [1, Proposition 3.3], A1 ⊕ · · · ⊕ Am is a sum of three square-zero matrices, hence so is A. Thus we consider matrices A with µA = dim ker(A − αI) for some α ≠ 0.

Lemma 4. Let A be an n × n matrix with µA ≥ n/2 and let N be an n × n square-zero matrix. Then there is an invertible matrix P such that
\[
P^{-1} A P = \begin{pmatrix}
\alpha I_{k_1} & 0 & 0 & 0 \\
\ast & B_{11} & B_{12} & 0 \\
\ast & B_{21} & B_{22} & 0 \\
\ast & \ast & \ast & \alpha I_{k_2}
\end{pmatrix}
\]
and
\[
P^{-1} (A + N) P = \begin{pmatrix}
\alpha I_{k_1} & 0 & 0 & 0 \\
\ast & B_{11} & B_{12} & 0 \\
\ast & B_{21} + M & B_{22} & 0 \\
\ast & \ast & \ast & \alpha I_{k_2}
\end{pmatrix},
\]
where α is a scalar such that dim ker(A − αI) = µA, k1 + k2 = 2µA − n, Bij (i, j = 1, 2) and M are (n − µA) × (n − µA) matrices and ∗ are some matrices.

Proof. Let r = n/2 or r = (n − 1)/2 according as n is even or odd. Since N is square-zero, we may assume that
\[
N = \begin{pmatrix} 0 & 0 \\ N_1 & 0 \end{pmatrix},
\]
where N1 ∈ C^{r×(n−r)}. We write
\[
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},
\]
where A21 ∈ C^{r×(n−r)}. Then
\[
\operatorname{rank} [\, A_{11} - \alpha I, \; A_{12} \,] \le \operatorname{rank}(A - \alpha I) = n - \mu_A
\quad\text{and}\quad
\operatorname{rank} \begin{pmatrix} A_{12} \\ A_{22} - \alpha I \end{pmatrix} \le n - \mu_A .
\]
Therefore there are invertible matrices Q1 ∈ C^{(n−r)×(n−r)} and Q2 ∈ C^{r×r} such that
\[
\begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix} A \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}^{-1}
= \begin{pmatrix}
\alpha I_{\mu_A - r} & 0 & 0 & 0 \\
\ast & B_{11} & B_{12} & 0 \\
\ast & B_{21} & B_{22} & 0 \\
\ast & \ast & \ast & \alpha I_{\mu_A + r - n}
\end{pmatrix},
\]
where Bij ∈ C^{(n−µA)×(n−µA)} (i, j = 1, 2), and we can write
\[
\begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix} N \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}^{-1}
= \begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
\ast & M & 0 & 0 \\
\ast & \ast & 0 & 0
\end{pmatrix}
\]
in the same block form as that of the matrix
\[
\begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix} A \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}^{-1}.
\]
This proves the lemma.

Proposition 2. Let A be an n × n matrix with tr A = 0, and suppose that µA = dim ker(A − αI) for some α ≠ 0.
(1) If n = 4m and µA = 3m, then A is a sum of three square-zero matrices if and only if A is similar to A1^{(m)}, where A1 = diag(−3α, α, α, α).
(2) If n = 4m − 1 and µA = 3m − 1, then A is a sum of three square-zero matrices if and only if A is similar to A1^{(m−1)} ⊕ A2, where A1 = diag(−3α, α, α, α) and A2 = diag(−2α, α, α).

Proof. The “if” parts of assertions (1) and (2) follow from the fact that A1 and A2 are sums of three square-zero matrices (see [1, Corollary 3.5]). So suppose that A is a sum of three square-zero matrices, or equivalently, that there is a square-zero matrix N such that A + N is a sum of two square-zero matrices.

(1) By Lemma 4, we may assume that
\[
A = \begin{pmatrix}
\alpha I_{k_1} & 0 & 0 & 0 \\
B_{10} & B_{11} & B_{12} & 0 \\
B_{20} & B_{21} & B_{22} & 0 \\
B_{30} & B_{31} & B_{32} & \alpha I_{k_2}
\end{pmatrix}
\]
and
\[
A + N = \begin{pmatrix}
\alpha I_{k_1} & 0 & 0 & 0 \\
B_{10} & B_{11} & B_{12} & 0 \\
\ast & B_{21} + M_1 & B_{22} & 0 \\
\ast & \ast & B_{32} & \alpha I_{k_2}
\end{pmatrix},
\]
where B11, B22 and M1 are m × m matrices and k1 = k2 = m. Let
\[
B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}
\quad\text{and}\quad
M = \begin{pmatrix} 0 & 0 \\ M_1 & 0 \end{pmatrix},
\]
which are 2m × 2m matrices. Since A + N is a sum of two square-zero matrices, it follows from [1, Theorem 2.11] that A + N is similar to −(A + N) and therefore σ(B + M) = {−α}, which implies that A + N is similar to
\[
\begin{pmatrix} \alpha I_{k_1} & 0 \\ \ast & \alpha I_{k_2} \end{pmatrix} \oplus (B + M).
\]
Then, since A + N and −(A + N) are similar, it follows that (B + M + αI)² = 0. On the other hand, the invertibility of B + M − αI implies
\[
\operatorname{rank} [\, B_{11} - \alpha I, \; B_{12} \,] = m = \operatorname{rank}(A - \alpha I).
\]
Hence there are matrices F and G such that
\[
F [\, B_{10}, \; B_{11} - \alpha I, \; B_{12} \,] = [\, B_{20}, \; B_{21}, \; B_{22} - \alpha I \,]
\quad\text{and}\quad
G [\, B_{10}, \; B_{11} - \alpha I, \; B_{12} \,] = [\, B_{30}, \; B_{31}, \; B_{32} \,],
\]
and therefore we have
\[
\begin{pmatrix}
I & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
0 & F & I & 0 \\
0 & G & 0 & I
\end{pmatrix}^{-1}
A
\begin{pmatrix}
I & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
0 & F & I & 0 \\
0 & G & 0 & I
\end{pmatrix}
= \begin{pmatrix}
\alpha I & 0 & 0 & 0 \\
B_{10} & B_{11} + B_{12}F & B_{12} & 0 \\
0 & 0 & \alpha I & 0 \\
0 & 0 & 0 & \alpha I
\end{pmatrix}
\]
and
\[
\begin{pmatrix} I & 0 \\ F & I \end{pmatrix}^{-1}
(B + M)
\begin{pmatrix} I & 0 \\ F & I \end{pmatrix}
= \begin{pmatrix} B_{11} + B_{12}F & B_{12} \\ M_1 & \alpha I \end{pmatrix}.
\]
Since (B + M + αI)² = 0, it follows that the matrix B12 is invertible and (B11 + B12F + 3αI)B12 = 0, so that B11 + B12F = −3αI. Therefore we can conclude that A is similar to A1^{(m)} because −3α ≠ α.

(2) An argument similar to the proof of (1), with k1 = m and k2 = m − 1, shows that the characteristic polynomial of B + M is p(λ) = λ(λ + α)^{2m−1} and that its minimal polynomial is a divisor of λ(λ + α)². Thus B + M − αI is invertible and, as in the proof of (1), we see that A and B + M are similar to
\[
\begin{pmatrix}
\alpha I & 0 & 0 & 0 \\
\ast & C & B_{12} & 0 \\
0 & 0 & \alpha I & 0 \\
0 & 0 & 0 & \alpha I
\end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} C & B_{12} \\ M_1 & \alpha I \end{pmatrix},
\]
respectively, where C ∈ C^{m×m}. We also have that rank(B + M + αI)² = 1. Hence rank(C + 3αI)B12 ≤ 1 and, since the invertibility of B + M − αI implies that of B12, dim ker(C + 3αI) ≥ m − 1. This, together with the identity tr B = −(2m − 1)α, shows that −2α is an eigenvalue of C and dim ker(C + 3αI) = m − 1 (α ≠ 0). Thus A is similar to A1^{(m−1)} ⊕ A2.

Lemma 5. Let A = B ⊕ αI_r, where B is an m × m cyclic matrix and r ≤ m − 2. Then, for any m + r scalars δ1, . . . , δ_{m+r} with Σ_{i=1}^{m+r} δi = tr A, there is a square-zero matrix N such that σ(A + N) = {δ1, . . . , δ_{m+r}}.

Proof. The case of r = 0 is shown in [1, Lemma 3.2], and if α ∉ σ(B), then B ⊕ αI_1 is cyclic. Thus, by considering B ⊕ αI_1 instead of B in this case, we may assume that α ∈ σ(B). Let p(λ) = Π_{i=1}^{m}(λ − βi) be the characteristic polynomial of B, where β1 = α, and let P be an (m + r) × (m + r) invertible matrix whose jth column pj is
\[
p_j = \begin{cases}
\displaystyle \prod_{i=1}^{j} (B - \beta_i I)x \;\oplus\; 0 & \text{for } j \le m-1, \\[4pt]
x \oplus 0 & \text{for } j = m, \\[4pt]
\displaystyle \prod_{i=1}^{j-m} (B - \beta_i I)x \;\oplus\; e_{j-m} & \text{for } m+1 \le j \le m+r,
\end{cases}
\]
where x is a cyclic vector of B and {e1, e2, . . . , er} is a basis for C^r. Then we have
\[
P^{-1} A P = \begin{pmatrix} A_{11} & A_{12} \\ 0 & \alpha I_{r+1} \end{pmatrix},
\]
where
\[
A_{11} = \begin{pmatrix}
\beta_2 & 0 & \cdots & 0 \\
1 & \beta_3 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 1 & \beta_m
\end{pmatrix} \in C^{(m-1)\times(m-1)}
\]
and A12 ∈ C^{(m−1)×(r+1)} is the matrix whose (i, i) entries are 1 for 1 ≤ i ≤ r + 1, whose (i, i + 1) entries are β_{i+1} − α for 1 ≤ i ≤ r, and whose remaining entries are 0. Since the pair (A11, A12) is of full range, that is,
\[
\operatorname{rank} [\, A_{12}, \; A_{11}A_{12}, \; \ldots, \; A_{11}^{m-2}A_{12} \,] = m - 1,
\]
and rank A12 = r + 1, by [2, Theorem 1] there is a matrix X such that
\[
\sigma \begin{pmatrix} A_{11} & A_{12} \\ X & \alpha I \end{pmatrix} = \{\delta_1, \ldots, \delta_{m+r}\}.
\]
Therefore, if N is the square-zero matrix defined by
\[
N = P \begin{pmatrix} 0 & 0 \\ X & 0 \end{pmatrix} P^{-1},
\]
then we have σ(A + N) = {δ1, . . . , δ_{m+r}}, which proves the lemma.
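For a concrete illustration (an addition, not part of the paper), the following sketch builds the matrix P of the proof for one particular cyclic B, taken here to be a lower bidiagonal matrix with a known cyclic vector (our choice for the example), and checks the block form of P^{-1}AP together with the full-range and rank conditions. Producing the matrix X with prescribed spectrum is the eigenvalue-assignment result of [2, Theorem 1] and is not attempted here.

```python
import numpy as np

# beta_1 = alpha must be an eigenvalue of B; B is cyclic.
alpha = 1.0
betas = np.array([alpha, 2.0, -3.0, 0.5, -0.5])   # beta_1, ..., beta_m
m, r = len(betas), 2                              # r <= m - 2
assert r <= m - 2

# A cyclic B with characteristic polynomial prod(lambda - beta_i):
# lower bidiagonal with the beta_i on the diagonal and 1's below it.
B = np.diag(betas) + np.diag(np.ones(m - 1), -1)
x = np.zeros(m)
x[0] = 1.0                                        # cyclic vector of B

A = np.block([[B, np.zeros((m, r))],
              [np.zeros((r, m)), alpha * np.eye(r)]])

def q(j):                                         # q_j = prod_{i<=j}(B - beta_i I) x
    v = x.copy()
    for i in range(j):
        v = (B - betas[i] * np.eye(m)) @ v
    return v

cols = [np.concatenate([q(j), np.zeros(r)]) for j in range(1, m)]   # j = 1, ..., m-1
cols.append(np.concatenate([q(0), np.zeros(r)]))                    # j = m: x (+) 0
for j in range(1, r + 1):                                           # j = m+1, ..., m+r
    e = np.zeros(r)
    e[j - 1] = 1.0
    cols.append(np.concatenate([q(j), e]))
P = np.column_stack(cols)

T = np.linalg.inv(P) @ A @ P
A11, A12 = T[:m - 1, :m - 1], T[:m - 1, m - 1:]

# Block upper triangular form with alpha I_{r+1} in the corner.
assert np.allclose(T[m - 1:, :m - 1], 0)
assert np.allclose(T[m - 1:, m - 1:], alpha * np.eye(r + 1))
# A11 is bidiagonal with beta_2, ..., beta_m on its diagonal.
assert np.allclose(np.diag(A11), betas[1:])

# Full-range (controllability) condition used to invoke [2, Theorem 1].
ctrl = np.hstack([np.linalg.matrix_power(A11, k) @ A12 for k in range(m - 1)])
assert np.linalg.matrix_rank(ctrl) == m - 1
assert np.linalg.matrix_rank(A12) == r + 1
print("P^{-1}AP has the block form used in the proof of Lemma 5.")
```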
Lemma 6. Let A = B ⊕ αI_{m−2}, where B is an m × m cyclic matrix such that α ∈ σ(B), and let 1 ≤ s ≤ m − 1. Then, for scalars γ and δ1, . . . , δ_{2(m−1)−s} such that
\[
s\gamma + \sum_{i=1}^{2(m-1)-s} \delta_i = \operatorname{tr} A
\quad\text{and}\quad
\delta_i \ne \gamma \quad\text{for } 1 \le i \le 2(m-1)-s,
\]
there is a square-zero matrix N such that A + N is similar to C1 ⊕ C2, where C1 ∈ C^{s×s} and C2 ∈ C^{(2(m−1)−s)×(2(m−1)−s)} are matrices such that
\[
(C_1 - \gamma I)^2 = 0
\quad\text{and}\quad
\sigma(C_2) = \{\delta_1, \ldots, \delta_{2(m-1)-s}\}.
\]
Proof. The proof of Lemma 5 with r = m − 2 shows that A12 is invertible, and so A is similar to
\[
\tilde A = \begin{pmatrix} A_1 & I \\ 0 & \alpha I \end{pmatrix},
\]
where A1 ∈ C^{(m−1)×(m−1)} is cyclic. Since D = A1 + αI is cyclic, for a cyclic vector x of D the matrix Q whose jth column is Π_{i=1}^{j−1}(D − d_i I)x, where d_i = δ_i + δ_{m−1+i} for 1 ≤ i ≤ m − 1 − s and d_i = δ_i + γ for m − s ≤ i ≤ m − 1, is invertible, and
\[
Q^{-1} D Q = \begin{pmatrix}
d_1 & 0 & \cdots & \cdots & c_1 \\
1 & d_2 & & & c_2 \\
0 & \ddots & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & c_{m-2} \\
0 & \cdots & \cdots & 1 & d_{m-1}
\end{pmatrix}
\]
for some scalars c1, c2, . . . , c_{m−2}. Let
\[
G = Q \begin{pmatrix}
\delta_1 & 0 & \cdots & \cdots & 0 \\
1 & \delta_2 & & & \vdots \\
0 & \ddots & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & \cdots & \cdots & 1 & \delta_{m-1}
\end{pmatrix} Q^{-1}
\]
and H = D − G. Then σ(G) = {δ1, . . . , δ_{m−1}} and
\[
H = \begin{pmatrix} H_{11} & H_{12} \\ 0 & H_{22} \end{pmatrix},
\]
where H11 = diag(δ_m, δ_{m+1}, . . . , δ_{2(m−1)−s}),
\[
H_{12} = \begin{pmatrix}
0 & \cdots & 0 & c_1 \\
\vdots & & \vdots & \vdots \\
0 & \cdots & 0 & c_{m-1-s}
\end{pmatrix} \in C^{(m-1-s)\times s}
\]
and
\[
H_{22} = \begin{pmatrix}
\gamma & 0 & \cdots & c_{m-s} \\
0 & \gamma & \ddots & \vdots \\
\vdots & \ddots & \ddots & c_{m-2} \\
0 & \cdots & 0 & \gamma
\end{pmatrix} \in C^{s\times s}.
\]
Since γ ≠ δi for all i, H is similar to diag(δ_m, . . . , δ_{2(m−1)−s}) ⊕ H22, and (H22 − γI)² = 0. We also have
\[
\begin{pmatrix} I & 0 \\ G - A_1 & I \end{pmatrix}^{-1}
\tilde A
\begin{pmatrix} I & 0 \\ G - A_1 & I \end{pmatrix}
= \begin{pmatrix} G & I \\ HG - \alpha A_1 & H \end{pmatrix},
\]
so that if N is the square-zero matrix defined by
\[
N = \begin{pmatrix} I & 0 \\ G - A_1 & I \end{pmatrix}
\begin{pmatrix} 0 & 0 \\ \alpha A_1 - HG & 0 \end{pmatrix}
\begin{pmatrix} I & 0 \\ G - A_1 & I \end{pmatrix}^{-1},
\]
then Ã + N is similar to
\[
\begin{pmatrix} G & I \\ 0 & H \end{pmatrix}.
\]
But, since γ ≠ δi for all i, the matrix
\[
\begin{pmatrix} G & I \\ 0 & H \end{pmatrix}
\]
is similar to
\[
\begin{pmatrix} G & J \\ 0 & H_{11} \end{pmatrix} \oplus H_{22},
\]
where
\[
J = \begin{pmatrix} I_{m-1-s} \\ 0 \end{pmatrix} \in C^{(m-1)\times(m-1-s)}.
\]
This proves the lemma.
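The conjugation identity and the square-zero matrix N used in the proof can be checked directly. In the sketch below (an addition for illustration), A1 and G are arbitrary random matrices, since cyclicity plays no role in the identity itself, H = A1 + αI − G as in the proof, and the block size and the value of α are our own choices.

```python
import numpy as np

rng = np.random.default_rng(1)
k, alpha = 4, 1.5                     # block size m - 1 and the scalar alpha
A1 = rng.standard_normal((k, k))      # stands for the cyclic block A_1
G = rng.standard_normal((k, k))       # stands for the matrix G of the proof
H = A1 + alpha * np.eye(k) - G        # H = D - G with D = A_1 + alpha*I

I, Z = np.eye(k), np.zeros((k, k))
Atilde = np.block([[A1, I], [Z, alpha * I]])
T = np.block([[I, Z], [G - A1, I]])

# The conjugation identity displayed in the proof of Lemma 6.
lhs = np.linalg.inv(T) @ Atilde @ T
rhs = np.block([[G, I], [H @ G - alpha * A1, H]])
assert np.allclose(lhs, rhs)

# The square-zero N of the proof, and the resulting similarity.
N = T @ np.block([[Z, Z], [alpha * A1 - H @ G, Z]]) @ np.linalg.inv(T)
assert np.allclose(N @ N, 0)
assert np.allclose(np.linalg.inv(T) @ (Atilde + N) @ T,
                   np.block([[G, I], [Z, H]]))
print("Atilde + N is similar to the block upper triangular matrix (G I; 0 H).")
```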
Note that the proof of Lemma 6 shows that, if s < m − 2, the matrix C1 in Lemma 6 can also be taken to be C1 = γI.

Proposition 3. Let A be an n × n matrix with tr A = 0, and let m be the number of its invariant polynomials of degree 2. If µA ≤ (2n − m)/3, then A is a sum of three square-zero matrices.

Proof. Let α be a scalar such that dim ker(A − αI) = µA, and let ℓ and r be the numbers of the invariant polynomials of A of degree at least 3 and of degree 1,
respectively. Using the rational form of A, we may assume that A = ⊕_{i=1}^{ℓ+m} Bi ⊕ αI_r, where each Bi is a cyclic matrix with α ∈ σ(Bi) whose size ki satisfies ki ≥ 3 for 1 ≤ i ≤ ℓ and ki = 2 for ℓ + 1 ≤ i ≤ ℓ + m. (Indeed, B_{ℓ+1} = · · · = B_{ℓ+m}.) Note that since n = Σ_{i=1}^{ℓ} ki + 2m + r and µA = ℓ + m + r, the condition µA ≤ (2n − m)/3 is equivalent to r ≤ Σ_{i=1}^{ℓ}(2ki − 3).

First suppose that r ≤ Σ_{i=1}^{ℓ}(ki − 2), which is equivalent to µA ≤ n/2. Take ℓ nonnegative integers r1, . . . , r_ℓ such that Σ_{i=1}^{ℓ} ri = r and ri ≤ ki − 2 for all i, and let Ai = Bi ⊕ αI_{ri} for 1 ≤ i ≤ ℓ and Ai = Bi for ℓ + 1 ≤ i ≤ ℓ + m. Then A is similar to Ã = ⊕_{i=1}^{ℓ+m} Ai. Let ti = tr Ai for 1 ≤ i ≤ ℓ + m and take a scalar c such that c > Σ_{i=1}^{ℓ+m} |ti|. As in the proof of [1, Proposition 3.3], we apply Lemma 5 to the matrices A1, . . . , A_{ℓ+m} to obtain square-zero matrices N1, . . . , N_{ℓ+m} such that
\[
\sigma(A_i + N_i) = \Bigl\{\, c - \sum_{j=1}^{i-1} t_j, \; \sum_{j=1}^{i} t_j - c, \; 0, \ldots, 0 \,\Bigr\}
\]
for 1 ≤ i ≤ ℓ + m. Then, since c − Σ_{j=1}^{i−1} tj and Σ_{j=1}^{i} tj − c are different nonzero numbers, each Ai + Ni is similar to
\[
\operatorname{diag}\Bigl( c - \sum_{j=1}^{i-1} t_j, \; \sum_{j=1}^{i} t_j - c, \; 0, \ldots, 0 \Bigr),
\]
and so the matrix Ã + N, where N = ⊕_{i=1}^{ℓ+m} Ni, is similar to −(Ã + N). Hence it follows from [1, Theorem 2.11] that Ã + N is a sum of two square-zero matrices. Since N² = 0, we can conclude that A is a sum of three square-zero matrices.

Next suppose that r > Σ_{i=1}^{ℓ}(ki − 2), and let s = r − Σ_{i=1}^{ℓ}(ki − 2). Then A is similar to Ã = (⊕_{i=1}^{ℓ+m} Ai) ⊕ αI_s, where Ai = Bi ⊕ αI_{ki−2} for 1 ≤ i ≤ ℓ + m. Since r ≤ Σ_{i=1}^{ℓ}(2ki − 3) by assumption, 0 < s ≤ Σ_{i=1}^{ℓ}(ki − 1), so we can take q integers s1, . . . , sq (q ≤ ℓ) such that Σ_{i=1}^{q} si = s and 1 ≤ si ≤ ki − 1 for each i. Let ti = tr Ai + αsi for 1 ≤ i ≤ q and ti = tr Ai for q + 1 ≤ i ≤ ℓ + m, and let c be a scalar with c > Σ_{i=1}^{ℓ+m} |ti| + |α|. Then, for each i, the numbers −α, c − Σ_{j=1}^{i−1} tj and Σ_{j=1}^{i} tj − c are nonzero and mutually different. Hence, for 1 ≤ i ≤ q, by Lemma 6 there is a square-zero matrix Ni such that Ai + Ni is similar to Ci ⊕ Di, where Ci is an si × si matrix with (Ci + αI)² = 0 and
\[
D_i = \operatorname{diag}\Bigl( c - \sum_{j=1}^{i-1} t_j, \; \sum_{j=1}^{i} t_j - c, \; 0, \ldots, 0 \Bigr) \in C^{(2(k_i-1)-s_i)\times(2(k_i-1)-s_i)}.
\]
Also, for q + 1 ≤ i ≤ ℓ + m, we have a square-zero matrix Ni such that Ai + Ni is similar to
\[
D_i = \operatorname{diag}\Bigl( c - \sum_{j=1}^{i-1} t_j, \; \sum_{j=1}^{i} t_j - c, \; 0, \ldots, 0 \Bigr)
\]
(see [1] or Lemma 5). By [1, Theorem 2.11], ⊕_{i=1}^{ℓ+m} Di is a sum of two square-zero matrices, and since σ(Ci) = {−α}, the matrices Ci ⊕ (Ci + 2αI_{si}), i = 1, . . . , q, are sums of two square-zero matrices too. Now, let
\[
N = \Bigl( \bigoplus_{i=1}^{\ell+m} N_i \Bigr) \oplus \Bigl( \bigoplus_{i=1}^{q} (C_i + \alpha I) \Bigr).
\]
Then N² = 0 and Ã + N is similar to
\[
\Bigl( \bigoplus_{i=1}^{\ell+m} D_i \Bigr) \oplus \Bigl( \bigoplus_{i=1}^{q} \bigl( C_i \oplus (C_i + 2\alpha I) \bigr) \Bigr),
\]
which is a sum of two square-zero matrices. Thus it follows that A is a sum of three square-zero matrices.

Corollary 1. Let A be an n × n matrix with tr A = 0. If µA ≤ n/2 + 1, then A is a sum of three square-zero matrices.
Proof. Let ℓ and r be the numbers of the invariant polynomials of A of degree at least 3 and of degree 1, respectively. The condition µA ≤ n/2 + 1 is equivalent to r ≤ Σ_{i=1}^{ℓ}(ki − 2) + 2, where k1, . . . , k_ℓ are the degrees of the invariant polynomials of A with degree ≥ 3. So, if ℓ = 0, then r ≤ 2 and it follows from Lemma 2 that A is a sum of three square-zero matrices. If ℓ ≠ 0, then the inequality r ≤ Σ_{i=1}^{ℓ}(ki − 2) + 2 implies r ≤ Σ_{i=1}^{ℓ}(2ki − 3), so the assertion follows from Proposition 3.

Corollary 2. Let A be an n × n matrix with tr A = 0 and suppose that µA = dim ker(A − αI) for some α ≠ 0.
(1) When n = 6, A is a sum of three square-zero matrices if and only if µA ≤ 4.
(2) When n = 7, A is a sum of three square-zero matrices if and only if (i) µA ≤ 4 or (ii) µA = 5 and A is similar to (−3αI_1) ⊕ (−2αI_1) ⊕ αI_5.
(3) When n = 8, A is a sum of three square-zero matrices if and only if (i) µA ≤ 5 or (ii) µA = 6 and A is similar to (−3αI_2) ⊕ αI_6.

Proof. The “only if” parts of (1)–(3) follow from [1, Theorem 3.1] and Proposition 2. On the other hand, the “if” parts follow from [1, Corollary 3.4] and Corollary 1.
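For n = 6, Corollary 2(1) together with the remark preceding Lemma 4 (the case in which µA is attained at the eigenvalue 0) gives a simple test: A is a sum of three square-zero matrices if and only if tr A = 0 and either µA ≤ 4 or µA = dim ker A. The following function is a numerical sketch of that test; the function name and the tolerances are our own, and geometric multiplicities are estimated by a rank computation.

```python
import numpy as np

def is_sum_of_three_square_zero_6x6(A, tol=1e-8):
    """Numerical sketch of the n = 6 test described above (hypothetical helper,
    not from the paper): tr A = 0 and either mu_A <= 4 or mu_A = dim ker A."""
    A = np.asarray(A, dtype=complex)
    assert A.shape == (6, 6)
    n = 6
    if abs(np.trace(A)) > tol:
        return False
    def geom_mult(lam):
        # Geometric multiplicity of lam, estimated via a rank computation.
        return n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=1e-6)
    mu = max(geom_mult(lam) for lam in np.linalg.eigvals(A))
    return mu <= 4 or geom_mult(0.0) == mu

# diag(-5, 1, 1, 1, 1, 1): trace zero, but mu_A = 5 is attained only at alpha = 1,
# so by Corollary 2(1) it is not a sum of three square-zero matrices.
print(is_sum_of_three_square_zero_6x6(np.diag([-5, 1, 1, 1, 1, 1])))   # False
# diag(-2, -2, 1, 1, 1, 1): trace zero and mu_A = 4, so it is such a sum.
print(is_sum_of_three_square_zero_6x6(np.diag([-2, -2, 1, 1, 1, 1])))  # True
```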
References

[1] J.-H. Wang, P.Y. Wu, Sums of square-zero operators, Studia Math. 99 (2) (1991) 115–127.
[2] K. Takahashi, Eigenvalues of matrices with given block upper triangular part, Linear Algebra Appl. 239 (1996) 175–184.