Linear Algebra and its Applications 521 (2017) 263–282
Some sharp bounds for the commutator of real matrices ✩

Che-Man Cheng *, Yaru Liang

Department of Mathematics, University of Macau, Macao, China
Article history: Received 17 May 2016; Accepted 24 January 2017; Available online 3 February 2017; Submitted by A. Böttcher.
MSC: 15A45.
Keywords: Schatten p-norm; Commutator; Norm inequality.
Abstract. We determine, for some values of p, q, r, the smallest constant C^R_{p,q,r} such that

‖XY − YX‖_p ≤ C^R_{p,q,r} ‖X‖_q ‖Y‖_r

for all real square matrices X and Y, where ‖·‖_p denotes the Schatten p-norm.
✩ This research is supported by research grants MYRG065(Y1-L2)-FST13-CCM and MYRG2016-00121-FST from the University of Macau.
* Corresponding author.
E-mail addresses: [email protected] (C.-M. Cheng), [email protected] (Y. Liang).
http://dx.doi.org/10.1016/j.laa.2017.01.031

1. Introduction

Let M_d(F) denote the set of d × d matrices with real or complex entries according to F = R or F = C. The singular values of X ∈ M_d(F) are arranged in non-increasing order s_1(X) ≥ ··· ≥ s_d(X). The Schatten p-norm of X, ‖X‖_p, is simply the l_p norm of s(X) = (s_1(X), . . . , s_d(X))^T, i.e., ‖X‖_p = (∑_{i=1}^d s_i^p(X))^{1/p} for 1 ≤ p < ∞, and ‖X‖_∞ = s_1(X).
For d ≥ 2 and 1 ≤ p, q, r ≤ ∞, we consider the problem of finding the smallest positive constant C^F_{p,q,r} such that

‖XY − YX‖_p ≤ C^F_{p,q,r} ‖X‖_q ‖Y‖_r

for all X, Y ∈ M_d(F).
The story began in [2]. It was conjectured there and proved in [8,7] that C^R_{2,2,2} = √2. The result was generalized to complex matrices in [3,1], and we have C^C_{2,2,2} = C^R_{2,2,2} = √2. The general C^C_{p,q,r} problem is broadly solved in [10,4] by determining these values for many norm configurations, but there is still one region of (p, q, r) for which the problem is open. For more details and some related problems, one may refer to the survey [5].

For all the known results about C^C_{p,q,r} obtained in [10], pairs of real matrices are used to achieve the commutator bounds C^C_{p,q,r}, and thus we know that C^R_{p,q,r} = C^C_{p,q,r} there. In [4], it is proved that for odd d ≥ 3,

C^C_{p,q,r} = 2(d − 1)^{1/p − 1/q − 1/r}              if 1/p_d ≤ 1/p − 1/q − 1/r,
C^C_{p,q,r} = 2 d^{1/p − 1/q − 1/r} cos(π/(2d))       if 0 < 1/p − 1/q − 1/r ≤ 1/p_d,

where p_d = ln((d − 1)/d) / ln(cos(π/(2d))). Moreover, the first constant is achieved by a pair of real matrices, while the second constant is achieved by a pair of complex matrices. We first show below that when restricted to real matrices, this region no longer splits:
Theorem 1.1. Suppose d ≥ 3 is odd. For M_d(R), with 1 ≤ p, q, r ≤ ∞ satisfying 1/p > 1/q + 1/r,

C^R_{p,q,r} = 2(d − 1)^{1/p − 1/q − 1/r}.
Proof. By means of a pair of real matrices, it was shown in [10] and [4] that C^C_{p,q,r} ≥ 2(d − 1)^{1/p − 1/q − 1/r} for 1 ≤ p, q, r ≤ ∞. Thus, it remains valid that

C^R_{p,q,r} ≥ 2(d − 1)^{1/p − 1/q − 1/r}.    (1.1)

We first show that

C^R_{p,∞,∞} = 2(d − 1)^{1/p},  1 ≤ p ≤ ∞.    (1.2)
It is obvious that C^R_{∞,∞,∞} = 2. Suppose now that 1 ≤ p < ∞. It is proved in [4, Theorem 2.2] that

C^C_{p,∞,∞} ≤ max{‖I_d − Z‖_p : Z ∈ M_d(C) is unitary, det Z = 1}.
The same idea of using the dual representation and the extreme points also works for the real case when 'unitary' is replaced by 'orthogonal', and we have

C^R_{p,∞,∞} ≤ max{‖I_d − Z‖_p : Z ∈ M_d(R) is orthogonal, det Z = 1}.
Let Z ∈ M_d(R) be orthogonal with det Z = 1. We claim that 1 is an eigenvalue of Z. Otherwise, as d is odd, the eigenvalues of Z would be −1 (with odd multiplicity) and complex conjugate pairs. Then, their product would be det Z = −1, a contradiction. Thus, we know that Z is orthogonally similar to a matrix of the form

[1] ⊕ Z′,  where Z′ ∈ M_{d−1}(R) is orthogonal and det Z′ = 1.
Then, as ‖·‖_p is an orthogonally invariant norm,

‖I_d − Z‖_p = ‖I_{d−1} − Z′‖_p ≤ ‖I_{d−1}‖_p + ‖Z′‖_p = 2(d − 1)^{1/p}.

Consequently, we get C^R_{p,∞,∞} ≤ 2(d − 1)^{1/p}. Together with (1.1), we get (1.2). To prove the result in general, one just needs to use the Riesz–Thorin theorem and the symmetries in Lemma 2.4 below, and repeat the arguments used in the proof of [4, Theorem 4.1]. Our (1.2) here can be used to replace [4, (4.2)]. The arguments following [4, (4.2)] (until the end of the proof) give the result as required. We note that there are additional assumptions on the indices p, q, r in the Riesz–Thorin theorem for the real case (see [10, Theorem 1]). In our case, these assumptions are satisfied. □
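The key step above is the inequality ‖I_d − Z‖_p ≤ 2(d − 1)^{1/p} for every real orthogonal Z with det Z = 1 and odd d. As a sanity check (not part of the paper's argument), the following Python sketch samples random special orthogonal matrices and confirms the bound numerically; all helper names here are ours.

```python
import numpy as np

def random_special_orthogonal(d, rng):
    """Random orthogonal matrix with det = +1 (QR of a Gaussian matrix)."""
    Q, R = np.linalg.qr(rng.standard_normal((d, d)))
    Q = Q * np.sign(np.diag(R))      # fix the sign convention of the QR factors
    if np.linalg.det(Q) < 0:         # flip one column if necessary to get det = +1
        Q[:, 0] = -Q[:, 0]
    return Q

def schatten(A, p):
    s = np.linalg.svd(A, compute_uv=False)
    return s.max() if np.isinf(p) else (s ** p).sum() ** (1.0 / p)

rng = np.random.default_rng(0)
d, p = 5, 3.0                        # odd dimension, any 1 <= p < infinity
bound = 2 * (d - 1) ** (1.0 / p)
sampled = max(schatten(np.eye(d) - random_special_orthogonal(d, rng), p)
              for _ in range(2000))
print(sampled, "<=", bound)          # the sampled values never exceed 2(d-1)^(1/p)
```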
Up to now, with the exception that C^F_{∞,1,1} = √27/4 is known from [10], the value of C^F_{p,q,r} is open when 2 < p ≤ ∞, 1 ≤ q < 2 and 1 ≤ r < 2. The main purpose of this paper is to tackle the corresponding bordering configurations for the real case.

The set of extreme points of the unit ball {X ∈ M_d(F) : ‖X‖_1 ≤ 1} is exactly the set of rank one matrices with s_1(X) = 1. Thus, using the arguments in the proof of [10, Theorem 5(b)] for the consideration of C^C_{∞,1,1}, we get

C^F_{p,1,1} = max{‖XY − YX‖_p : X, Y ∈ M_d(F), rank X = rank Y = 1, s_1(X) = s_1(Y) = 1}
           = max{(s_1^p(XY − YX) + s_2^p(XY − YX))^{1/p} : X, Y ∈ M_d(F), rank X = rank Y = 1, s_1(X) = s_1(Y) = 1}.

For k = 1, . . . , d, let

S_k^F = {s(XY − YX) : X, Y ∈ M_d(F), s(X) = s(Y) = (1, . . . , 1, 0, . . . , 0)^T with k ones} ⊂ R^d.
Note that when X and Y are of rank one, the rank of XY − YX is at most two and thus we may regard S_1^F ⊂ R². Recently, it was proved in [9] that:

Theorem 1.2. For d ≥ 2, S_1^R is the region R bounded by the segment joining (0, 0) and (1, 1), the segment joining (0, 0) and (1, 0), the segment joining (1, 0) and ((√2 + 1)/2, (√2 − 1)/2), and the curve

{ [4√(cos φ sin φ) / (1 + 2 cos φ sin φ)] (cos φ, sin φ) : φ ∈ [Φ, π/4] },

where Φ = tan^{−1}((√2 − 1)/(√2 + 1)).

Fig. 1.1. The region R.
Fig. 1.1 shows the region R (green) defined in Theorem 1.2. The curve in Theorem 1.2 first appeared in [3]. It is used in [10] to give numerical lower bounds for C^C_{p,1,1} by choosing points on the curve with their l_p norms as large as possible. Moreover, it is also conjectured there that the resulting value is equal to C^C_{p,1,1}, p > 2. When restricted to real matrices, it follows from Theorem 1.2 that the conjecture is true and

C^R_{p,1,1} = max{‖x‖_p : x ∈ R} = max{ h(φ) : φ ∈ [Φ, π/4] },

where

h(φ) = [4√(cos φ sin φ) / (1 + 2 cos φ sin φ)] (cos^p φ + sin^p φ)^{1/p}.
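The maximum of h over [Φ, π/4] is easy to approximate numerically even without a closed form for the maximizer. The sketch below (ours, not from the paper) simply evaluates h on a fine grid; the values approach √2 as p decreases to 2 and √27/4 ≈ 1.299 as p grows.

```python
import numpy as np

def h(phi, p):
    """h(phi) = 4*sqrt(cos*sin)/(1 + 2*cos*sin) * (cos^p + sin^p)^(1/p)."""
    c, s = np.cos(phi), np.sin(phi)
    return 4 * np.sqrt(c * s) / (1 + 2 * c * s) * (c ** p + s ** p) ** (1.0 / p)

Phi = np.arctan((np.sqrt(2) - 1) / (np.sqrt(2) + 1))
phi = np.linspace(Phi, np.pi / 4, 200001)
for p in (2.5, 3.0, 5.0, 20.0):
    print(p, h(phi, p).max())   # approximates C^R_{p,1,1}; about 1.318 for p = 3
```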
The derivative h′ of the function h is quite complicated and we tried in vain to find the maximum explicitly by using h′. Nevertheless, Theorem 1.2 tells us the important fact that C^R_{p,1,1} is independent of d, the order of the matrices X and Y. For the complex case, the determination of the constants C^C_{p,1,1}, 2 < p < ∞, remains open.

From Lemma 2.4 below, we know that C^R_{p,1,1} = C^R_{∞,q,1} when 1/p + 1/q = 1. This allows us to consider C^R_{∞,q,1} in order to find C^R_{p,1,1}. In Section 2, we show that the determination of C^R_{∞,q,1} can be reduced to solving a simpler problem (2.3), and that the maximum can be found by solving a simple polynomial-like equation (2.2). For some rational numbers q, the equation is solved explicitly and hence C^R_{∞,q,1} is found.

As mentioned in [9], based on the dimension independence of the bounds in this region, it is likely that S_1^C = S_1^R = R holds. If this is true, then C^C_{p,1,1} = C^R_{p,1,1} follows. However, the problem is still open. In contrast to the numerically suggested
result that S_1^R = S_1^C, we know from Theorem 1.1, which covers the dimension-dependent region, that when d ≥ 3 is odd the two sets S_d^R and S_d^C are not the same.

2. Main results

We are going to determine C^R_{p,q,r} for several configurations by transforming the original maximization into another one that can be solved implicitly. In this manner, we obtain the constants for the boundary edges of the parameter space. For 1 < q < 2, define
M(q) = 4[w_q^{1/2}(1 + w_q^q)^{1/q}]^{q−1} / (1 + w_q^{q−1})²,    (2.1)

where w_q is the unique solution in (0, 1) of the equation

w^{2q−1} − 3w^q + 3w^{q−1} − 1 = 0.    (2.2)
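For a given q, the root w_q and hence M(q) are easy to compute: by the shape of the left-hand side of (2.2) established in Lemma 2.3 below, it is negative on (0, w_q) and positive on (w_q, 1), so bisection applies. A minimal numerical sketch (ours; the function names are invented here):

```python
import numpy as np

def f(w, q):
    """Left-hand side of equation (2.2)."""
    return w ** (2 * q - 1) - 3 * w ** q + 3 * w ** (q - 1) - 1

def w_root(q, lo=1e-12, hi=1 - 1e-6, iters=200):
    """Bisection for the root of (2.2) in (0, 1); f < 0 left of it, > 0 right of it."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid, q) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

def M(q):
    """The constant defined in (2.1)."""
    w = w_root(q)
    return 4 * (np.sqrt(w) * (1 + w ** q) ** (1 / q)) ** (q - 1) / (1 + w ** (q - 1)) ** 2

print(w_root(1.5), M(1.5))   # about 0.1459 and 1.3180 for q = 3/2 (cf. Theorem 2.2(a))
```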
The existence and uniqueness of w_q will be proved in Lemma 2.3 below. Our main results are the following two theorems.

Theorem 2.1. Let M be the function defined by (2.1) and suppose d ≥ 2.
(a) For 1 < q < 2, C^R_{∞,q,1} = M(q).
(b) For 1 < r < 2, C^R_{∞,1,r} = M(r).
(c) For 2 < p < ∞, C^R_{p,1,1} = M(p/(p − 1)).
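Parts (a) and (c) can be cross-checked numerically against the curve description of S_1^R from the introduction: for 2 < p < ∞ and q = p/(p − 1), the value M(q) should coincide with max_φ h(φ). A small consistency check along those lines (ours):

```python
import numpy as np

def M(q, iters=200):
    lo, hi = 1e-12, 1 - 1e-6                      # bracket for the root of (2.2)
    f = lambda w: w**(2*q - 1) - 3*w**q + 3*w**(q - 1) - 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    w = 0.5 * (lo + hi)
    return 4 * (np.sqrt(w) * (1 + w**q)**(1/q))**(q - 1) / (1 + w**(q - 1))**2

def max_h(p, n=400001):
    phi = np.linspace(np.arctan((np.sqrt(2) - 1) / (np.sqrt(2) + 1)), np.pi/4, n)
    c, s = np.cos(phi), np.sin(phi)
    return (4*np.sqrt(c*s) / (1 + 2*c*s) * (c**p + s**p)**(1/p)).max()

for p in (2.5, 3.0, 4.0, 10.0):
    print(p, M(p/(p - 1)), max_h(p))   # the two columns should agree to several decimal places
```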
The value of w_q can be determined explicitly in some cases and we have

Theorem 2.2. Let w_q be the unique root in (0, 1) of the equation (2.2). Then
(a) w_{3/2} = ((3 − √5)/2)²,
(b) w_{4/3} = ((1 + √5 − √(2(1 + √5)))/2)³,
(c) w_{5/3} = ((A_1 − √(A_1² − 4))/2)³, where A_1 = 2 Re[(61/54 + (1/6)√(469/3) i)^{1/3}] − 1/3,
(d) w_{5/4} = ((3 + √13 − √(6(1 + √13)))/4)⁴,
(e) w_{6/5} = ((A_1 − √(A_1² − 4))/2)⁵, where A_1 = 2 Re[(26/27 + (1/3)√(229/3) i)^{1/3}] + 2/3,
(f) w_{7/6} = ((A_1 − √(A_1² − 4))/2)⁶, where A_1 = 2 Re[(1/2 + (1/6)√(473/3) i)^{1/3}] + 1.
(In (c), (e) and (f), the cube roots are taken so that the arguments lie between 0 and π/2.)

For the proofs, we need some lemmas. The first one deals with w_q.

Lemma 2.3. Let 1 < q < 2. The equation (2.2) has a unique solution w_q in the interval (0, 1).

Proof. To consider the number of solutions of the equation, let f(w) = w^{2q−1} − 3w^q + 3w^{q−1} − 1. Then

f′(w) = (2q − 1)w^{2q−2} − 3q w^{q−1} + 3(q − 1)w^{q−2},

and for 0 < w < 1,

f″(w) = (2q − 1)(2q − 2)w^{2q−3} − 3q(q − 1)w^{q−2} + 3(q − 1)(q − 2)w^{q−3}
      = (q − 1)w^{q−3}[2(2q − 1)w^q − 3q w + 3(q − 2)]
      < (q − 1)w^{q−3}[2(2q − 1)w − 3q w + 3(q − 2)]
      = (q − 1)w^{q−3}(w + 3)(q − 2) < 0.

Note that
(i) f(0) = −1 < 0,
(ii) f(1) = 0,
(iii) f′(1) = 2(q − 2) < 0, and
(iv) f″(w) < 0 on (0, 1)
all together imply there is a unique 0 < w_q < 1 such that f(w_q) = 0. □

The following lemma helps to prove (b) and (c) in Theorem 2.1 from the knowledge of (a) via duality. It is proved in [10, Proposition 4] for F = C and the proof there also works for F = R. For 1 ≤ p ≤ ∞, let p′ be given by 1/p + 1/p′ = 1.
Lemma 2.4. Suppose 1 ≤ p, q, r ≤ ∞. Then, for F = C, R,
(a) C^F_{p,q,r} = C^F_{p,r,q},
(b) C^F_{p,q,r} = C^F_{r′,q,p′},
(c) C^F_{p,q,r} = C^F_{q′,p′,r}.
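The proof in [10] goes through the duality of Schatten norms; the cyclic trace identity tr((XY − YX)Z) = tr((YZ − ZY)X) = tr((ZX − XZ)Y), which holds for all square matrices, is the kind of symmetry one would expect behind such permutation/duality statements. A quick numerical illustration of the identity itself (ours, not an excerpt from [10]):

```python
import numpy as np

rng = np.random.default_rng(1)
X, Y, Z = (rng.standard_normal((4, 4)) for _ in range(3))

comm = lambda A, B: A @ B - B @ A
vals = [np.trace(comm(X, Y) @ Z), np.trace(comm(Y, Z) @ X), np.trace(comm(Z, X) @ Y)]
print(vals)   # the three traces coincide (up to rounding)
```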
The following proposition gives the crucial reformulated maximum.

Proposition 2.5. For 1 < q < 2,

max{2 sin 2x (s_1 cos²x + s_2 sin²x) : s_1 ≥ s_2 ≥ 0, s_1^q + s_2^q = 1, x ∈ R} = max{√27/4, M(q), 2^{1−1/q}},    (2.3)

where the function M is defined in (2.1).

Proof. We use the method of Lagrange multipliers. When s_1 = 1 and s_2 = 0, the problem reduces to max{2 sin 2x cos²x : x ∈ R} and it is straightforward to show that the maximum is √27/4. When s_1 = s_2 = 2^{−1/q}, the problem reduces to max{2^{1−1/q} sin 2x : x ∈ R} and the maximum is 2^{1−1/q} trivially. Suppose 0 < s_2 < 2^{−1/q} < s_1 < 1 and s_1^q + s_2^q = 1. Let

F(x, s_1, s_2, λ) = 2 sin 2x (s_1 cos²x + s_2 sin²x) + λ(s_1^q + s_2^q − 1).

Then, the conditions
∂F/∂x = 0, ∂F/∂s_1 = 0 and ∂F/∂s_2 = 0 give, respectively,

4 cos 2x (s_1 cos²x + s_2 sin²x) − 2(s_1 − s_2) sin²2x = 0,    (2.4)
2 sin 2x cos²x + λq s_1^{q−1} = 0,    (2.5)

and

2 sin 2x sin²x + λq s_2^{q−1} = 0.    (2.6)

From (2.4), with cos 2x = 2 cos²x − 1 and sin²2x = 1 − cos²2x = 4 cos²x − 4 cos⁴x, we get

4(s_1 − s_2) cos⁴x − (3s_1 − 5s_2) cos²x − s_2 = 0.    (2.7)

The sum of (2.5) and (2.6) gives

−λq = 2 sin 2x / (s_1^{q−1} + s_2^{q−1}).

Substituting this back into (2.5), we get

2 sin 2x cos²x − 2 sin 2x · s_1^{q−1} / (s_1^{q−1} + s_2^{q−1}) = 0.
For finding the maximum value, we may assume sin 2x ≠ 0. Thus we get

cos²x = s_1^{q−1} / (s_1^{q−1} + s_2^{q−1}).    (2.8)

Letting w = s_2/s_1, so that 0 < w < 1, we get from (2.7) and (2.8) that

4(1 − w) cos⁴x − (3 − 5w) cos²x − w = 0    (2.9)

and

cos²x = 1 / (1 + w^{q−1}).    (2.10)
Eliminating cos x in (2.9) using (2.10), we get equation (2.2). By Lemma 2.3, let w_q be the unique solution in (0, 1) of (2.2). We get s_1^q(1 + w_q^q) = 1 and hence

s_1 = (1 + w_q^q)^{−1/q}  and  s_2 = w_q (1 + w_q^q)^{−1/q}.
Using (2.8), which also gives

sin²x = s_2^{q−1} / (s_1^{q−1} + s_2^{q−1}),

we get

2 sin 2x (s_1 cos²x + s_2 sin²x) = ±4(sin²x cos²x)^{1/2}(s_1 cos²x + s_2 sin²x)
= ± 4(s_1 s_2)^{(q−1)/2} / (s_1^{q−1} + s_2^{q−1})²
= ± 4[w_q^{1/2}(1 + w_q^q)^{1/q}]^{q−1} / (1 + w_q^{q−1})².
For the maximum value, we take the positive value and the result follows. □

We want to understand the relations between the three quantities appearing in the maximum just obtained. Fig. 2.1 below shows the graphs of the curves y = √27/4 (blue), y = M(q) (red) and y = 2^{1−1/q} (green) for 1 < q < 2. It strongly suggests that max{√27/4, M(q), 2^{1−1/q}} = M(q) for all 1 < q < 2, and that the numbers √27/4 and 2^{1−1/q}|_{q=2} = √2 are the one-sided limits of M(q). We defer the proof of the first fact to the end of the proof of Theorem 2.1, at the point where we have proved C^R_{∞,q,1} = max{√27/4, M(q), 2^{1−1/q}}. There, we are able to use a pair of 2 × 2 matrices and can easily show that C^R_{∞,q,1} > √27/4. For the two one-sided limits, we have
Fig. 2.1. The three curves y = √27/4 (blue), y = M(q) (red) and y = 2^{1−1/q} (green) for 1 < q < 2. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
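The comparison shown in Fig. 2.1 can be reproduced numerically. The sketch below (ours) evaluates the three quantities on a grid of q, staying away from q = 1 where computing w_q is ill-conditioned (see the remark after Proposition 2.6); the minimum margin over the grid comes out positive, i.e., M(q) is the largest of the three there.

```python
import numpy as np

def M(q, iters=200):
    f = lambda w: w**(2*q - 1) - 3*w**q + 3*w**(q - 1) - 1
    lo, hi = 1e-300, 1 - 1e-6          # very small lower bracket: w_q is tiny for q near 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    w = 0.5 * (lo + hi)
    return 4 * (np.sqrt(w) * (1 + w**q)**(1/q))**(q - 1) / (1 + w**(q - 1))**2

qs = np.linspace(1.05, 1.99, 95)
margin = min(M(q) - max(np.sqrt(27) / 4, 2**(1 - 1/q)) for q in qs)
print(margin)   # positive: M(q) dominates sqrt(27)/4 and 2^(1-1/q) on this grid
```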
Proposition 2.6.
(a) lim_{q→1+} M(q) = √27/4, and
(b) lim_{q→2−} M(q) = √2.
Proof. (a) When we replace w^{q−1} by z in (2.2), we obtain

z^{2 + 1/(q−1)} − 3z^{1 + 1/(q−1)} + 3z − 1 = 0.    (2.11)

It is clear that when q is close to 1, the unique root z_q ∈ (0, 1) is close to 1/3. From (2.2), we obtain w_q^q = (3w_q^{q−1} − 1)/(3 − w_q^{q−1}) = (3z_q − 1)/(3 − z_q). Thus, the function M can be expressed in terms of z_q as

M̃(q) = 2^{3−1/q} z_q^{1/2} (1 + z_q)^{1−1/q} / [(1 + z_q)² (3 − z_q)^{1−1/q}],

where z_q is the unique solution in (0, 1) of the equation (2.11). When q tends to 1, z_q tends to 1/3 and thus we get lim_{q→1+} M̃(q) = √27/4, as expected.
(b) When q tends to 2, we easily deduce that w_q tends to 1 and the result follows. □

We remark here that numerical computation of w_q is not advisable when q is close to 1. In the proof of (a) of the above proposition, z_q (= w_q^{q−1}) is close to 1/3 when q is close to 1. This implies that w_q is close to (1/3)^{1/(q−1)}, which is practically 0 if, say, q = 1.01. Though M(q) and w_q can hardly be evaluated numerically in that regime, they play an important role in our study. We now prove our main theorems.
Proof of Theorem 2.1. We are going to prove (a) only. Parts (b) and (c) follow immediately with the help of Lemma 2.4. For 1/p + 1/q = 1, we have C^R_{∞,q,1} = C^R_{p,1,1} by Lemma 2.4(c). Theorem 1.2 implies that C^R_{p,1,1} is independent of d (for d ≥ 2). Thus, we just need to prove the result for d = 2.

We again rely on the dual representation, which was introduced in [10] for even d and worked well in the adaptation to odd d in the proof of Theorem 1.1. The set of extreme points of the convex unit ball {Y ∈ M_2(R) : ‖Y‖_1 ≤ 1} is exactly the set of rank one matrices Y ∈ M_2(R) with s_1(Y) = 1. As a norm is a convex function, for 1 ≤ q < ∞, we have

C^R_{∞,q,1} = max{‖XY − YX‖_∞ : X, Y ∈ M_2(R), ‖X‖_q ≤ 1, ‖Y‖_1 ≤ 1}
            = max{‖XY − YX‖_∞ : X, Y ∈ M_2(R), ‖X‖_q = 1, rank Y = 1, s_1(Y) = 1}
            = max{a^T(XY − YX)b : X, Y ∈ M_2(R), ‖X‖_q = 1, rank Y = 1, s_1(Y) = 1, a, b ∈ R², ‖a‖ = ‖b‖ = 1}.

To determine the maximum, depending on the sign of det X, we have two cases. Consider det X ≤ 0. By the singular value decomposition, we may let
X = [cos u  −sin u; sin u  cos u] [s_1  0; 0  −s_2] [cos v  −sin v; sin v  cos v]
  = s_1 [cos u; sin u][cos v  −sin v] − s_2 [−sin u; cos u][sin v  cos v].

For rank one Y with s_1(Y) = 1, let

Y = [cos h; sin h][cos k  sin k].

For unit vectors a, b ∈ R², let a = (cos a, sin a)^T and b = (cos b, sin b)^T. We assume a, b, h, k, u, v ∈ R. Then

a^T(XY − YX)b
= s_1 cos(a − u) cos(v + h) cos(k − b) − s_2 sin(a − u) sin(v + h) cos(k − b) − s_1 cos(a − h) cos(k − u) cos(v + b) + s_2 cos(a − h) sin(k − u) sin(v + b)
= s_1 cos(u − a) cos(v + h) cos(b − k) + s_2 sin(u − a) sin(v + h) cos(b − k) + s_1 cos(a − h) cos(k − u) cos(π − v − b) + s_2 cos(a − h) sin(k − u) sin(π − v − b)
= s_1 cos x_2 cos x_3 cos x_1 + s_2 sin x_2 sin x_3 cos x_1 + s_1 cos x_4 cos x_5 cos x_6 + s_2 cos x_4 sin x_5 sin x_6    (2.12)
where

x_1 = b − k, x_2 = u − a, x_3 = v + h, x_4 = a − h, x_5 = k − u, x_6 = π − v − b.    (2.13)

Note that

x_1 + ··· + x_6 = π,    (2.14)

and that, conversely, for any x_1, . . . , x_6 ∈ R satisfying (2.14), we can find a, b, h, k, u, v ∈ R such that (2.13) is true. For example, choose a = 0, b = x_1 + x_2 + x_5, h = −x_4, k = x_2 + x_5, u = x_2 and v = x_3 + x_4. Thus, equivalently, we need to find

max{f(x_1, . . . , x_6, s_1, s_2) : x_1 + ··· + x_6 = π, s_1 ≥ s_2 ≥ 0, s_1^q + s_2^q = 1}    (2.15)
where

f(x_1, . . . , x_6, s_1, s_2) = cos x_1 (s_1 cos x_2 cos x_3 + s_2 sin x_2 sin x_3) + cos x_4 (s_1 cos x_5 cos x_6 + s_2 sin x_5 sin x_6)
= cos x_1 [ ((s_1 + s_2)/2) cos(x_2 − x_3) + ((s_1 − s_2)/2) cos(x_2 + x_3) ] + cos x_4 [ ((s_1 + s_2)/2) cos(x_5 − x_6) + ((s_1 − s_2)/2) cos(x_5 + x_6) ].    (2.16)

Since f(x_1 + π, x_2 + π, x_3 − 2π, x_4, x_5, x_6, s_1, s_2) = f(x_1, . . . , x_6, s_1, s_2), we may assume cos x_1 ≥ 0 at the maximum. Then, since f(x_1, . . . , x_6, s_1, s_2) ≤ f(x_1, (x_2 + x_3)/2, (x_2 + x_3)/2, x_4, x_5, x_6, s_1, s_2), we may assume x_2 = x_3 at the maximum. Similarly, we may assume x_5 = x_6 at the maximum. The problem now reduces to finding the maximum of

f̂(x_1, x_2, x_4, x_5) = cos x_1 (s_1 cos²x_2 + s_2 sin²x_2) + cos x_4 (s_1 cos²x_5 + s_2 sin²x_5)

subject to x_1 + 2x_2 + x_4 + 2x_5 = π, 0 ≤ s_2 ≤ s_1 ≤ 1 and s_1^q + s_2^q = 1. Let x_1 + 2x_2 = π/2 − c and thus x_4 + 2x_5 = π/2 + c. So,

x_1 = π/2 − c − 2x_2  and  x_4 = π/2 + c − 2x_5.

Then, instead of f̂, we have to maximize, subject to free c, x_2, x_5 and the restrictions 0 ≤ s_2 ≤ s_1 ≤ 1 and s_1^q + s_2^q = 1,

g(c, x_2, x_5, s_1, s_2) = sin(2x_2 + c)(s_1 cos²x_2 + s_2 sin²x_2) + sin(2x_5 − c)(s_1 cos²x_5 + s_2 sin²x_5).    (2.17)
We now treat s_1 and s_2 as fixed parameters and consider the maximum of g_{s_1,s_2}(c, x_2, x_5) = g(c, x_2, x_5, s_1, s_2). If s_1 = s_2 = 2^{−1/q}, it is trivial that the maximum is 2^{1−1/q}. Suppose s_1 > s_2. From ∂g_{s_1,s_2}/∂c = 0, ∂g_{s_1,s_2}/∂x_2 = 0 and ∂g_{s_1,s_2}/∂x_5 = 0 we get, respectively,

cos(2x_2 + c)(s_1 cos²x_2 + s_2 sin²x_2) − cos(2x_5 − c)(s_1 cos²x_5 + s_2 sin²x_5) = 0,    (2.18)
2 cos(2x_2 + c)(s_1 cos²x_2 + s_2 sin²x_2) − 2(s_1 − s_2) sin x_2 cos x_2 sin(2x_2 + c) = 0,    (2.19)

and

2 cos(2x_5 − c)(s_1 cos²x_5 + s_2 sin²x_5) − 2(s_1 − s_2) sin x_5 cos x_5 sin(2x_5 − c) = 0.    (2.20)

Using (2.18), the difference of (2.19) and (2.20) gives, with s_1 − s_2 ≠ 0,

sin 2x_5 sin(2x_5 − c) = sin 2x_2 sin(2x_2 + c),

which gives

(1/2)[cos c − cos(4x_5 − c)] = (1/2)[cos(−c) − cos(4x_2 + c)],

and hence cos(4x_5 − c) = cos(4x_2 + c). Thus, for some integer n,

4x_5 − c = 2nπ ± (4x_2 + c).    (2.21)

We may restrict our attention to n = 0, 1. The other cases can be reduced to one of these 4 cases. This can be seen as follows: Suppose n = 2m + r where r = 0 or r = 1. Then,

4x_5 − c = 2nπ + (4x_2 + c) ⟺ 4x_5 − c = 2rπ + [4(x_2 + mπ) + c],
4x_5 − c = 2nπ − (4x_2 + c) ⟺ 4x_5 − c = 2rπ − [4(x_2 − mπ) + c],

and we see that g_{s_1,s_2}(c, x_2 ± mπ, x_5) = g_{s_1,s_2}(c, x_2, x_5).
When n = 0 we have two cases:

Case 1. 4x_5 − c = 4x_2 + c. Then, 2x_5 = 2x_2 + c and (2.18) can be rewritten as

cos 2x_5 (s_1 cos²x_2 + s_2 sin²x_2) − cos 2x_2 (s_1 cos²x_5 + s_2 sin²x_5) = 0,

that is,

(cos²x_5 − sin²x_5)(s_1 cos²x_2 + s_2 sin²x_2) − (cos²x_2 − sin²x_2)(s_1 cos²x_5 + s_2 sin²x_5) = 0,

from which we get (s_1 + s_2)(sin²x_2 cos²x_5 − sin²x_5 cos²x_2) = 0. So, as s_1 + s_2 ≠ 0, we obtain

sin(x_2 + x_5) sin(x_2 − x_5) = (sin x_2 cos x_5 + sin x_5 cos x_2)(sin x_2 cos x_5 − sin x_5 cos x_2) = 0.

Hence, for some integer m, x_2 ± x_5 = mπ. When x_2 + x_5 = mπ, we get x_5 = mπ − x_2 and hence

g_{s_1,s_2}(c, x_2, mπ − x_2) = [sin(2x_2 + c) − sin(2x_2 + c)](s_1 cos²x_2 + s_2 sin²x_2) = 0.

When x_2 − x_5 = mπ, we get x_5 = x_2 − mπ, c = −2mπ and

g_{s_1,s_2}(−2mπ, x_2, x_2 − mπ) = 2 sin 2x_2 (s_1 cos²x_2 + s_2 sin²x_2).    (2.22)

Note that the maximum of the right-hand side in (2.22) is 2^{1−1/q} when s_1 = s_2 = 2^{−1/q}. Hence, when we change the parameters s_1 and s_2 back to variables and find the maximum of g in (2.17), we come to the problem in Proposition 2.5 and consequently we know that the maximum of the values of the stationary points here is max{√27/4, M(q), 2^{1−1/q}}.

Case 2. 4x_5 − c = −4x_2 − c. Then x_5 = −x_2 and easily we get g_{s_1,s_2}(c, x_2, −x_2) = 0.

When n = 1 we have two cases:
Case 3. 4x_5 − c = 2π + 4x_2 + c. We have 4x_5 − (c + π) = 4x_2 + (c + π) and we are back to Case 1. With the c in the proof of Case 1 replaced by c + π, we finally come to, parallel to (2.22),

g_{s_1,s_2}(−(2m + 1)π, x_2, x_2 − mπ) = −2 sin 2x_2 (s_1 cos²x_2 + s_2 sin²x_2) = 2 sin 2(−x_2)[s_1 cos²(−x_2) + s_2 sin²(−x_2)].

In this case, again, the maximum of the values of the stationary points is max{√27/4, M(q), 2^{1−1/q}}.

Case 4. 4x_5 − c = 2π − 4x_2 − c. So x_5 = π/2 − x_2. Then

g_{s_1,s_2}(c, x_2, π/2 − x_2) = sin(2x_2 + c)(s_1 cos²x_2 + s_2 sin²x_2) + sin(2x_2 + c)(s_1 sin²x_2 + s_2 cos²x_2) = (s_1 + s_2) sin(2x_2 + c) ≤ 2^{1−1/q}.

The last inequality follows from the condition s_1^q + s_2^q = 1. From Cases 1–4, we conclude that the maximum of the values of the stationary points is max{√27/4, M(q), 2^{1−1/q}} when det X ≤ 0.

In the situation det X ≥ 0, by replacing s_2 by −s_2, we may go through the same steps as before and, parallel to (2.15), arrive at

max{f̃(x_1, . . . , x_6, s_1, s_2) : x_1 + ··· + x_6 = π, s_1 ≥ s_2 ≥ 0, s_1^q + s_2^q = 1},

where

f̃(x_1, . . . , x_6, s_1, s_2) = cos x_1 (s_1 cos x_2 cos x_3 − s_2 sin x_2 sin x_3) + cos x_4 (s_1 cos x_5 cos x_6 − s_2 sin x_5 sin x_6).

Comparing f in (2.15) and f̃ here, we see that one can obtain f̃ by replacing s_2 in f by −s_2. Thus, by replacing s_2 by −s_2, the proof of the case det X ≤ 0 can be adapted here and only a few amendments are needed. The minus sign in front of s_2 is carried through unchanged up to (2.17). With basically the same argument used afterwards, we get exactly (2.21) and we have the same 4 cases. Parallel to Case 1, which leads to the consideration of Proposition 2.5, we eventually have to maximize (2.22), but with an altered sign of s_2. The determination of this maximum is much simpler. At the maximum, we may assume sin 2x_2 ≥ 0 (otherwise, just replace x_2 by −x_2). Then we easily see that s_1 = 1, s_2 = 0
and calculate that the maximum value is √27/4. The proofs of the other cases are similar to those of Cases 2–4, and there are no new stationary points. Up to now, we have proved that

C^R_{∞,q,1} = max{√27/4, M(q), 2^{1−1/q}}.

Proposition 2.6 implies that these newly obtained constants continuously match the known values for the end-points of the interval [1, 2], i.e., C^R_{∞,1,1} = √27/4 and C^R_{∞,2,1} = √2. We now show that, for 1 < q < 2, M(q) is strictly bigger than the other two values. Take

X = [1/2; √3/2][√3/2  −1/2] = (1/4)[√3  −1; 3  −√3]  and  Y = [0  1; 0  0],

so that

Z = XY − YX = (1/4)[−3  2√3; 0  3].

It is routine to check that s_1(Z) = √27/4 and s_2(Z) = √3/4. As rank(X) = rank(Y) = 1, we use Lemma 2.4(c) and get, for 1 < q < 2 and 1/p + 1/q = 1,

C^R_{∞,q,1} = C^R_{p,1,1} ≥ ((√27/4)^p + (√3/4)^p)^{1/p} > √27/4.
4(xy)(q−1)/2 (xq−1 + y q−1 )2
and we consider finding the maximum of z(x, y) subject to xq + y q = 1, x ≥ y ≥ 0. We have xq + y q = 1 ⇒ qxq−1 + qy q−1
dy xq−1 dy =0⇒ = − q−1 . dx dx y
With y = (1 − xq )1/q , we may regard z as a function in x. By direct calculation, q−3
2(q − 1)(xy) 2 dz = q−1 [y q (y q−1 − 3xq−1 ) − xq (xq−1 − 3y q−1 )]. dx (x + y q−1 )3 y q−1 Thus, for y > 0,
dz = 0 implies dx y q (y q−1 − 3xq−1 ) = xq (xq−1 − 3y q−1 ).
(2.23)
C.-M. Cheng, Y. Liang / Linear Algebra and its Applications 521 (2017) 263–282
278
y (0 ≤ y ≤ x ⇒ 0 ≤ w ≤ 1), we get exactly equation (2.2). With the unique x solution wq in (0, 1), let wq = xq /yq . Since xqq + yqq = 1, we have With w =
xqq =
1 1 + wqq
and yqq = 1 − xqq =
wqq 1 + wqq
and we can have z(xq , yq ) in terms of wq : (q−1)/2
z(xq , yq ) =
4wq
(1 + wqq )(q−1)/q
(1 + wqq−1 )2
= M (q).
That is, M (q) is also the value of a stationary point of z(x, y). Note that z(2−1/q , 2−1/q ) = 21−1/q . To show that M (q) > 21−1/q , it suffices to show that M (q) is the maximum value of z(x, y) subject to xq + y q = 1. To this, as there is only one stationary point, we just need to consider z at (1, 0) and at (2−1/q , 2−1/q ). When (x, y) = (1, 0), z(1, 0) = 0. When (x, y) = (2−1/q , 2−1/q ), we use the following: dz (I) From (2.23), we see that = 0. dx x=y=2−1/q (II) By direct calculation, d q q−1 [y (y − 3xq−1 ) − xq (xq−1 − 3y q−1 )] dx = [(2q − 1)y 2q−2 − 3qxq−1 y q−1 + 3(q − 1)xq y q−2 ]
dy dx
− [(2q − 1)x2q−2 − 3qxq−1 y q−1 + 3(q − 1)xq−2 y q ] and so
d q q−1 q−1 q q−1 q−1 [y (y − 3x ) − x (x − 3y )] = (8 − 4q)2(2−2q)/q > 0. dx x=y=2−1/q
Consequently, d2 z > 0, dx2 x=y=2−1/q Using (I) and (II), we know z(2−1/q , 2−1/q ) = 21−1/q is a local minimum value. Moreover, M (q) is the maximum value of z and M (q) > 21−1/q . The result follows and the proof is now complete. 2
Proof of Theorem 2.2. Suppose q = m/n is rational, where m and n are relatively prime positive integers satisfying n < m < 2n. With x = w^{1/n}, equation (2.2) becomes f_{m,n}(x) = 0, where f_{m,n}(x) = x^{2m−n} − 3x^m + 3x^{m−n} − 1. When n is even, ±1 are zeros of f_{m,n}, whereas when n is odd, 1 is a zero of f_{m,n}. Let

r_n(x) = x² − 1 if n is even,  and  r_n(x) = x − 1 if n is odd.

Then f_{m,n}(x) can be factorized as f_{m,n}(x) = r_n(x) P_{m,n}(x), where, if n is even,

P_{m,n}(x) = x^{2m−n−2} + x^{2m−n−4} + ··· + x² + 1 − 3(x^{m−2} + x^{m−4} + ··· + x^{m−n}),    (2.24)

and, if n is odd,

P_{m,n}(x) = x^{2m−n−1} + x^{2m−n−2} + ··· + x + 1 − 3(x^{m−1} + x^{m−2} + ··· + x^{m−n}).    (2.25)

Note that deg(P_{m,n}) = 2m − n − [(3 + (−1)^n)/2], that w_{m/n}^{1/n} is the unique zero of P_{m,n}(x) in (0, 1), and that P_{m,n}(1) ≠ 0. On the other hand, we see that α is a zero of f_{m,n} if and only if 1/α is a zero of f_{m,n}. So, P_{m,n} can be factorized as
P_{m,n}(x) = ∏_{k=1}^{l} (x² − A_k x + 1),    (2.26)

with l = deg(P_{m,n})/2 and A_i = α_i + 1/α_i, i = 1, . . . , l, where α_1, 1/α_1, . . . , α_l, 1/α_l are the zeros of P_{m,n}. Moreover, since (2.2) has only one root in (0, 1), we see that P_{m,n} has exactly one root w_{m/n}^{1/n} in (0, 1) and exactly one root w_{m/n}^{−1/n} in (1, ∞). Thus, there is only one A_i such that A_i > 2. The others are less than 2 or non-real. Let us suppose w_{m/n}^{1/n} + w_{m/n}^{−1/n} = A_1 > 2.

(a) (m, n) = (3, 2). Referring to (2.24), P_{3,2}(x) = x² − 3x + 1. Easily, the unique zero in (0, 1) is x = (3 − √5)/2. Consequently, w_{3/2} = ((3 − √5)/2)².

(b) (m, n) = (4, 3). Referring to (2.25), we are to find the zero in (0, 1) of P_{4,3}(x) = x⁴ − 2x³ − 2x² − 2x + 1. Using (2.26), we consider

x⁴ − 2x³ − 2x² − 2x + 1 = (x² − A_1 x + 1)(x² − A_2 x + 1).
Comparing the coefficients of x³ and x², we get −A_1 − A_2 = −2 and A_1 A_2 + 2 = −2. Thus, A_1 is a zero of Q_{4,3}(x) = x² − 2x − 4 and so, as A_1 > 2, A_1 = 1 + √5. Then, we solve x + 1/x = 1 + √5 and get

x_± = (1 + √5 ± √((1 + √5)² − 4)) / 2.

As 0 < w_{4/3} < 1, we get w_{4/3} = x_−³ = ((1 + √5 − √(2(1 + √5)))/2)³.

(c) (m, n) = (5, 3). Similarly to (b), using (2.25) and (2.26), we consider

P_{5,3}(x) = x⁶ + x⁵ − 2x⁴ − 2x³ − 2x² + x + 1 = (x² − A_1 x + 1)(x² − A_2 x + 1)(x² − A_3 x + 1).

Comparing the coefficients of x⁵, x⁴ and x³, we have

A_1 + A_2 + A_3 = −1,
A_1 A_2 + A_2 A_3 + A_1 A_3 = −5,
A_1 A_2 A_3 = 4.

So, A_1 > 2 is a zero of Q_{5,3}(x) = x³ + x² − 5x − 4. By Cardano's formula for the roots of a cubic equation (e.g., see [6, p. 310, Exercise 5]), we get the expression for A_1. By solving x + 1/x = A_1, we can deduce w_{5/3}, as required.

The proofs of (d), (e) and (f) are similar, using Q_{5,4}(x) = x² − 3x − 1, Q_{6,5}(x) = x³ − 2x² − 5x + 2 and Q_{7,6}(x) = x³ − 3x² − 2x + 3, respectively.
□
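The closed forms in Theorem 2.2 can be checked directly against equation (2.2). The sketch below (ours) does this for parts (a), (b) and (c); the principal cube root used by Python has argument in (0, π/2), as stipulated in the theorem.

```python
import numpy as np

def f(w, q):
    """Left-hand side of equation (2.2)."""
    return w**(2*q - 1) - 3*w**q + 3*w**(q - 1) - 1

w32 = ((3 - np.sqrt(5)) / 2) ** 2                                    # part (a)
w43 = ((1 + np.sqrt(5) - np.sqrt(2 * (1 + np.sqrt(5)))) / 2) ** 3    # part (b)

u = (61/54 + 1j * np.sqrt(469/3) / 6) ** (1/3)                       # principal cube root
A1 = 2 * u.real - 1/3
w53 = ((A1 - np.sqrt(A1**2 - 4)) / 2) ** 3                           # part (c)

for q, w in [(3/2, w32), (4/3, w43), (5/3, w53)]:
    print(q, w, f(w, q))    # each residual is ~ 0 (machine precision)
```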
Readers may be aware that one can go a step further. In the above proof, instead of solving P_{m,n}(x) = 0 directly to obtain w_{m/n}, quadratic factors x² − A_i x + 1 are used to reduce the problem to finding the zeros of a polynomial with degree half of that of P_{m,n}
(deg(P_{m,n}) is always even). It is well known that any polynomial of degree four or less can be solved by radicals. Thus, if deg(P_{m,n}) ≤ 8, one can explicitly write down w_{m/n}. Theorem 2.2 does the job for deg(P_{m,n}) = 2, 4, 6. For deg(P_{m,n}) = 8, there are four cases: (m, n) = (7, 4), (7, 5), (8, 7), (9, 8). As the formulas for the roots of a quartic polynomial are quite involved, we leave these for interested readers.

3. Some remarks

The maximal pairs (i.e., those pairs that attain C^R_{∞,q,1}) for Theorem 2.1(a) can be found by tracing the proof of the theorem. They can be expressed in terms of the w_q. We leave this for interested readers. On the other hand, we see that the maximal pairs vary (necessarily) with q. This is in contrast to the known results in [10] and [4], where a specific pair of matrices can serve as a maximal pair for a whole region of (p, q, r).

For an upper bound to C^R_{p,q,r}, using the Riesz–Thorin theorem (real version, see [10, Theorem 1]), Lemma 2.4, and the arguments used in the proof of [4, Theorem 4.1], it follows from Theorem 2.1 that we have the following, presumably not best-possible, estimate:
Theorem 3.1. For d ≥ 2, 2 < p ≤ ∞, 1 ≤ q < 2, and 1 ≤ r < 2, with 1/2 < 1/q + 1/r − 1/p − 1 < 1,

C^R_{p,q,r} ≤ M( (1/q + 1/r − 1/p − 1)^{−1} ),

where the function M is defined in (2.1).
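For concreteness, here is a small sketch (ours) that evaluates this upper bound for one admissible triple (p, q, r); the bisection for w_q follows the sketch given after (2.2).

```python
import numpy as np

def M(q, iters=200):
    f = lambda w: w**(2*q - 1) - 3*w**q + 3*w**(q - 1) - 1
    lo, hi = 1e-300, 1 - 1e-6
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    w = 0.5 * (lo + hi)
    return 4 * (np.sqrt(w) * (1 + w**q)**(1/q))**(q - 1) / (1 + w**(q - 1))**2

p, q, r = 10.0, 1.1, 1.2
t = 1/q + 1/r - 1/p - 1                 # must lie strictly between 1/2 and 1
print(t, M(1/t))                        # upper bound for C^R_{p,q,r} from Theorem 3.1
```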
Acknowledgement

The authors thank the referee for his/her many valuable and constructive suggestions, which greatly improved the presentation of the paper. In particular, the derivation of f̂ from f in the proof of Theorem 2.1 is much simplified using the current form in (2.16) suggested by the referee.

References

[1] K. Audenaert, Variance bounds, with an application to norm bounds for commutators, Linear Algebra Appl. 432 (2010) 1126–1143.
[2] A. Böttcher, D. Wenzel, How big can the commutator of two matrices be and how big is it typically?, Linear Algebra Appl. 403 (2005) 216–228.
[3] A. Böttcher, D. Wenzel, The Frobenius norm and the commutator, Linear Algebra Appl. 429 (2008) 1864–1885.
[4] C.-M. Cheng, C. Lei, On Schatten p-norms of commutators, Linear Algebra Appl. 484 (2015) 409–434.
[5] C.-M. Cheng, X. Jin, S. Vong, A survey on the Böttcher–Wenzel conjecture and related problems, Oper. Matrices 9 (2015) 659–673.
[6] T.W. Hungerford, Algebra, Grad. Texts in Math., vol. 73, Springer, New York, 1974.
[7] Z. Lu, Normal scalar curvature conjecture and its applications, J. Funct. Anal. 261 (2011) 1284–1308.
[8] S. Vong, X. Jin, Proof of Böttcher and Wenzel's conjecture, Oper. Matrices 2 (2008) 435–442.
[9] D. Wenzel, A strange phenomenon for the singular values of commutators with rank one matrices, Electron. J. Linear Algebra 30 (2015) 605–625.
[10] D. Wenzel, K. Audenaert, Impressions of convexity: an illustration for commutator bounds, Linear Algebra Appl. 433 (2010) 1726–1759.