The Hessian matrix of Lagrange function

Linear Algebra and its Applications 531 (2017) 537–546


Linear Algebra and its Applications www.elsevier.com/locate/laa

Pingge Chen, Yuejian Peng¹, Suijie Wang*,²

Institute of Mathematics, Hunan University, Changsha, China

Article info

Article history: Received 6 January 2016; accepted 8 June 2017; available online 13 June 2017. Submitted by R. Brualdi.

MSC: 05C50, 15A18

Keywords: Lagrange function, Hessian matrix, Dense graph, Regular graph

Abstract

We study the Hessian matrix of the Lagrange function of a dense 3-uniform hypergraph and show that the Lagrange function of a dense 3-uniform hypergraph has a unique optimal weight. We also characterize the optimal weight of a dense 3-uniform hypergraph in terms of the Hessian matrix of the Lagrange function, and give a simple application to regular 3-uniform hypergraphs. © 2017 Elsevier Inc. All rights reserved.

* Corresponding author. E-mail addresses: [email protected] (P. Chen), [email protected] (Y. Peng), [email protected] (S. Wang).
¹ Supported in part by National Natural Science Foundation of China (No. 11271116).
² Supported in part by National Natural Science Foundation of China (No. 11401196, 11571097).
http://dx.doi.org/10.1016/j.laa.2017.06.012
0024-3795/© 2017 Elsevier Inc. All rights reserved.

1. Introduction

We start with some definitions and notation. For a positive integer $r$, an $r$-uniform hypergraph, or $r$-graph, $G$ consists of a vertex set $V(G)$ and an edge set $E(G) \subseteq \binom{V(G)}{r}$, where $\binom{V(G)}{r}$ denotes the family of all $r$-subsets of $V(G)$. A subgraph $H$ of $G$ is an $r$-graph with $V(H) \subseteq V(G)$ and $E(H) \subseteq E(G)$, denoted $H \subseteq G$. Let $K_t^{(r)}$ denote the complete $r$-graph on $t$ vertices, that is, the $r$-graph on $t$ vertices containing all $r$-subsets of the vertex set as edges. Unless otherwise specified, we assume throughout the paper that all hypergraphs have the vertex set $[n] = \{1, 2, \ldots, n\}$.

For an $r$-graph $G$ with edge set $E \subseteq \binom{[n]}{r}$, define the Lagrange function of $G$ to be
\[
L_G(x) = \sum_{e \in E} \prod_{i \in e} x_i, \quad \text{where } x = (x_1, \ldots, x_n) \in \mathbb{R}^n.
\]
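The definition of the Lagrange function can be evaluated directly. The following is a minimal sketch, assuming edges are stored as tuples of 1-based vertex labels; the function name `lagrange` and the choice of $K_4^{(3)}$ as test data are illustrative and not part of the paper.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def lagrange(edges, x):
    """Evaluate L_G(x) = sum over edges e of prod_{i in e} x_i.

    `edges` holds tuples of 1-based vertex labels; `x` is the weight vector.
    """
    return sum(prod(x[i - 1] for i in e) for e in edges)


# Illustrative data: the complete 3-graph on 4 vertices, K_4^(3).
n = 4
k43_edges = list(combinations(range(1, n + 1), 3))
uniform = [Fraction(1, n)] * n

# Four edges, each contributing (1/4)^3.
value = lagrange(k43_edges, uniform)
```

Exact rational arithmetic is used so that later identities can be compared with `==` rather than with a floating-point tolerance.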

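The Hessian matrix $H_G(x)$ of $L_G(x)$ is the $n \times n$ square matrix defined as follows,
\[
H_G(x) = \left[ \frac{\partial^2 L_G(x)}{\partial x_i \partial x_j} \right]_{n \times n}
= \begin{bmatrix}
\dfrac{\partial^2 L_G(x)}{\partial x_1^2} & \dfrac{\partial^2 L_G(x)}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 L_G(x)}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 L_G(x)}{\partial x_2 \partial x_1} & \dfrac{\partial^2 L_G(x)}{\partial x_2^2} & \cdots & \dfrac{\partial^2 L_G(x)}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 L_G(x)}{\partial x_n \partial x_1} & \dfrac{\partial^2 L_G(x)}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 L_G(x)}{\partial x_n^2}
\end{bmatrix}.
\]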

Let $\Delta_n$ be the standard $(n-1)$-dimensional closed simplex, i.e.,
\[
\Delta_n = \{ x = (x_1, \ldots, x_n) \in \mathbb{R}^n \mid x_i \ge 0, \ x e^t = 1 \},
\]
where $e$ denotes the all-ones vector and $e^t$ its transpose. The Lagrangian $\lambda(G)$ of an $r$-graph $G$ is the supremum of the Lagrange function $L_G(x)$ over $\Delta_n$, i.e.,
\[
\lambda(G) = \sup_{x \in \Delta_n} L_G(x).
\]

Since $\Delta_n$ is compact, the supremum is attained at some vector in $\Delta_n$. A vector $y \in \Delta_n$ is called an optimal weight of $G$ if $\lambda(G) = L_G(y) = \max_{x \in \Delta_n} L_G(x)$. An $r$-graph $G$ is said to be dense if $\lambda(H) < \lambda(G)$ for every proper subgraph $H$ of $G$. This is equivalent to all optimal weights of $G$ lying in the interior of $\Delta_n$ (denoted $\mathrm{Int}(\Delta_n)$), i.e., no optimal weight has a zero coordinate.

In 1965, Motzkin and Straus [4] showed that any graph has the same Lagrangian as its maximum cliques, since only complete graphs are dense when $r = 2$, and gave a new proof of Turán's classical result on Turán densities of complete graphs. Sidorenko [5] showed that the Turán density of an $r$-uniform hypergraph $F$ is equal to the supremum of $r!\,\lambda(G)$ over all dense $F$-hom-free $r$-uniform hypergraphs $G$. More studies on developing the Lagrange method for hypergraph Turán problems can be found in [1,3,5,6].

In this paper, we study the Hessian matrix $H_G(x)$ of the Lagrange function $L_G(x)$ and show that the optimal weight $y$ of a dense 3-graph is characterized by $y H_G(y) = 6 L_G(y) e$ together with the negativity of the second largest eigenvalue of $H_G(y)$. Finally, we give a simple application to the structure of regular dense 3-graphs.
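As a small illustration of the Motzkin-Straus result cited above for $r = 2$, the sketch below maximizes the Lagrange function of two small graphs over a rational grid in the simplex. It relies on the standard form of that theorem, $\lambda(G) = \frac{1}{2}(1 - 1/\omega)$ with $\omega$ the clique number, which is not spelled out in this paper. The helper names, the grid resolution, and the two test graphs are assumptions made for illustration. A grid search only yields a lower bound in general; it is exact here because an optimal weight lies on the grid.

```python
from fractions import Fraction


def lagrange_graph(edges, x):
    """L_G(x) for an ordinary graph (r = 2): sum of x_i * x_j over edges."""
    return sum(x[i - 1] * x[j - 1] for i, j in edges)


def grid_max(edges, n, steps):
    """Maximum of L_G over simplex points with coordinates in (1/steps) * Z.

    This is a lower bound for the Lagrangian; it is exact whenever some
    optimal weight happens to lie on the grid.
    """
    best = Fraction(0)

    def rec(prefix, remaining):
        nonlocal best
        if len(prefix) == n - 1:
            # The last coordinate is forced by the constraint x e^t = 1.
            x = prefix + [Fraction(remaining, steps)]
            best = max(best, lagrange_graph(edges, x))
            return
        for k in range(remaining + 1):
            rec(prefix + [Fraction(k, steps)], remaining - k)

    rec([], steps)
    return best


path_edges = [(1, 2), (2, 3)]              # path on 3 vertices, clique number 2
triangle_edges = [(1, 2), (1, 3), (2, 3)]  # K_3, clique number 3
path_max = grid_max(path_edges, 3, 12)
triangle_max = grid_max(triangle_edges, 3, 12)
```

The path attains $\frac{1}{2}(1 - \frac{1}{2}) = \frac{1}{4}$, the same value as a single edge, while the triangle attains $\frac{1}{2}(1 - \frac{1}{3}) = \frac{1}{3}$.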


2. Hessian matrix

In this section, our arguments and proofs are carried out for 3-graphs without multiple edges. When multiple edges are allowed, all results still hold after some simple changes of notation.

Let $G$ be a 3-graph with edge set $E \subseteq \binom{[n]}{3}$. An edge $\{i, j, k\}$ will be denoted simply by $(ijk)$ throughout the paper. For $i \in [n]$, denote by $E_i(G)$ the link of vertex $i$, i.e.,
\[
E_i(G) = \left\{ (jk) \in \binom{[n]}{2} \ \middle|\ (ijk) \in E \right\}.
\]

Let $A_i(G)$ be the $n \times n$ matrix whose $(j, k)$-entry is 1 if $(jk) \in E_i(G)$, and 0 otherwise. Note that all entries of the $i$-th column and $i$-th row of $A_i(G)$ are zero. Denote by $A(G_i)$ the $(n-1) \times (n-1)$ matrix obtained by deleting the $i$-th column and $i$-th row of $A_i(G)$; it is the adjacency matrix of the graph $G_i$ with vertex set $[n] \setminus \{i\}$ and edge set $E_i(G)$. For any $x, y \in \mathbb{R}^n$, we have $H_G(x) = \sum_{i=1}^n x_i A_i(G)$ and
\[
y H_G(x) = (y A_1(G) x^t, \ldots, y A_n(G) x^t) = (x A_1(G) y^t, \ldots, x A_n(G) y^t) = x H_G(y).
\]
In addition, we have
\[
x H_G(x) x^t = \sum_{i \ne j} x_i x_j \sum_{k:\,(ijk) \in E} x_k = 6 \sum_{(ijk) \in E} x_i x_j x_k = 6 L_G(x),
\]
and
\begin{align}
L_G(y + \alpha x) &= \frac{1}{6} (y + \alpha x) H_G(y + \alpha x) (y + \alpha x)^t \notag \\
&= L_G(y) + \frac{1}{2} \alpha\, x H_G(y) y^t + \frac{1}{2} \alpha^2 x H_G(y) x^t + \alpha^3 L_G(x) \tag{1} \\
&= L_G(y) + \frac{1}{2} \alpha\, y H_G(x) y^t + \frac{1}{2} \alpha^2 x H_G(y) x^t + \alpha^3 L_G(x). \tag{2}
\end{align}
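The identities above are polynomial identities, so they can be checked exactly on any rational input. The sketch below builds $H_G(x) = \sum_i x_i A_i(G)$ edge by edge and verifies the symmetry relation, the cubic-form relation, and expansion (1). The helper names, the sample 3-graph, and the vectors `x`, `y`, and `alpha` are hypothetical test data, not taken from the paper.

```python
from fractions import Fraction


def lagrange3(edges, x):
    """L_G(x) for a 3-graph with 1-based edges (i, j, k)."""
    return sum(x[i - 1] * x[j - 1] * x[k - 1] for i, j, k in edges)


def hessian(edges, x, n):
    """H_G(x) = sum_i x_i A_i(G): entry (j, k) sums x_i over edges (ijk)."""
    h = [[Fraction(0)] * n for _ in range(n)]
    for e in edges:
        for i in e:
            j, k = (v for v in e if v != i)
            h[j - 1][k - 1] += x[i - 1]
            h[k - 1][j - 1] += x[i - 1]
    return h


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[r] * m[r][c] for r in range(len(v))) for c in range(len(m[0]))]


def dot(u, v):
    """Inner product u v^t."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical test data: K_4^(3) minus one edge, arbitrary rational vectors.
n = 4
edges = [(1, 2, 3), (1, 2, 4), (1, 3, 4)]
x = [Fraction(1), Fraction(-2), Fraction(3), Fraction(1, 2)]
y = [Fraction(1, 3), Fraction(1, 6), Fraction(1, 4), Fraction(1, 4)]
alpha = Fraction(2, 5)

# y H_G(x) = x H_G(y)
symmetric_ok = vec_mat(y, hessian(edges, x, n)) == vec_mat(x, hessian(edges, y, n))

# x H_G(x) x^t = 6 L_G(x)
cubic_ok = dot(vec_mat(x, hessian(edges, x, n)), x) == 6 * lagrange3(edges, x)

# Expansion (1) of L_G(y + alpha x)
hy = hessian(edges, y, n)
shifted = [yi + alpha * xi for xi, yi in zip(x, y)]
rhs = (lagrange3(edges, y)
       + alpha * dot(vec_mat(x, hy), y) / 2
       + alpha ** 2 * dot(vec_mat(x, hy), x) / 2
       + alpha ** 3 * lagrange3(edges, x))
expansion_ok = lagrange3(edges, shifted) == rhs
```

Because the checks use `Fraction`, equality is exact; with floating-point inputs a tolerance would be needed instead.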

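Lemma 2.1. Let $G$ be a 3-graph and $y \in \Delta_n$. The following three statements are equivalent:
(i) $y H_G(y) = 6 L_G(y) e$;
(ii) $\dfrac{\partial L_G}{\partial y_i}(y) = \dfrac{\partial L_G}{\partial y_j}(y)$ for all $i, j \in [n]$;
(iii) $x H_G(y) y^t = y H_G(x) y^t = 0$ for all $x \in \mathbb{R}^n$ satisfying $x e^t = 0$.

Proof. Note that
\[
y H_G(y) = 2 \left( \frac{\partial L_G}{\partial y_1}(y), \frac{\partial L_G}{\partial y_2}(y), \ldots, \frac{\partial L_G}{\partial y_n}(y) \right)
\quad \text{and} \quad y H_G(y) y^t = 6 L_G(y),
\]
so (i) and (ii) are equivalent. Indeed, if all partial derivatives equal a common value $c'$, then $y H_G(y) = 2c' e$, and multiplying by $y^t$ gives $6 L_G(y) = 2c'$ since $y e^t = 1$. It is obvious that (i) implies (iii). Recall that the orthogonal complement of the subspace $\{x \in \mathbb{R}^n \mid x e^t = 0\}$ is the line $\mathbb{R} e$. If $x H_G(y) y^t = y H_G(x) y^t = 0$ for all $x \in \mathbb{R}^n$ with $x e^t = 0$, then $y H_G(y) \in \mathbb{R} e$, i.e., $y H_G(y) = ce$ for some constant $c \in \mathbb{R}$. Thus $c = 6 L_G(y)$, since $y H_G(y) y^t = 6 L_G(y)$ and $y e^t = 1$. □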
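Conditions (i) and (ii) of Lemma 2.1 are easy to test for a given weight. The following sketch, with hypothetical helper names and $K_4^{(3)}$ as illustrative data, uses the relation $y H_G(y) = 2 \nabla L_G(y)$ from the proof. It checks the uniform weight on $K_4^{(3)}$ and on $K_4^{(3)}$ with one edge removed.

```python
from fractions import Fraction
from itertools import combinations


def lagrange3(edges, x):
    """L_G(x) for a 3-graph with 1-based edges (i, j, k)."""
    return sum(x[i - 1] * x[j - 1] * x[k - 1] for i, j, k in edges)


def gradient(edges, y, n):
    """Partial derivatives of L_G at y: dL/dy_k = sum of y_i * y_j over edges (ijk)."""
    g = [Fraction(0)] * n
    for e in edges:
        for k in e:
            i, j = (v for v in e if v != k)
            g[k - 1] += y[i - 1] * y[j - 1]
    return g


def satisfies_condition_i(edges, y, n):
    """Check y H_G(y) = 6 L_G(y) e, using y H_G(y) = 2 * grad L_G(y)."""
    target = 6 * lagrange3(edges, y)
    return all(2 * g == target for g in gradient(edges, y, n))


def satisfies_condition_ii(edges, y, n):
    """Check that all partial derivatives of L_G at y coincide."""
    g = gradient(edges, y, n)
    return all(gi == g[0] for gi in g)


n = 4
k43 = list(combinations(range(1, n + 1), 3))
k43_minus_edge = k43[:-1]          # drop the edge (2, 3, 4)
uniform = [Fraction(1, n)] * n

full_i = satisfies_condition_i(k43, uniform, n)
full_ii = satisfies_condition_ii(k43, uniform, n)
minus_i = satisfies_condition_i(k43_minus_edge, uniform, n)
minus_ii = satisfies_condition_ii(k43_minus_edge, uniform, n)
```

On $K_4^{(3)}$ both conditions hold at the uniform weight. After removing an edge, vertex 1 lies in three edges and the others in two, so the partial derivatives differ and both conditions fail together, as the lemma predicts.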

Next we prove the following result for dense 3-graphs, which is in fact a partial result of [2, Theorem 2.1].

Lemma 2.2. If $y \in \mathrm{Int}(\Delta_n)$ is an optimal weight of a dense 3-graph $G$, then for any $x$ satisfying $x e^t = 0$, we have $x H_G(y) y^t = y H_G(x) y^t = 0$, and $y H_G(y) = 6 L_G(y) e$.

Proof. Suppose that some vector $x \in \mathbb{R}^n$ satisfies $x e^t = 0$ and $x H_G(y) y^t \ne 0$. Then we can choose $\alpha$, of suitable sign and with $|\alpha|$ small enough, such that $y + \alpha x \in \mathrm{Int}(\Delta_n)$ and
\[
\frac{1}{2} \alpha\, x H_G(y) y^t + \frac{1}{2} \alpha^2 x H_G(y) x^t + \alpha^3 L_G(x) > 0.
\]
From (1), we have $L_G(y + \alpha x) > L_G(y)$, which contradicts the optimality of $y$. Thus $x H_G(y) y^t = y H_G(x) y^t = 0$ for all $x$ in the hyperplane $x e^t = 0$. Hence $y H_G(y) = 6 L_G(y) e$ by Lemma 2.1. □

Theorem 2.3. The optimal weight of a dense 3-graph is unique.

Proof. Suppose that $y, y' \in \mathrm{Int}(\Delta_n)$ are two optimal weights of a dense 3-graph $G$ and set $z = y' - y$. Then we have $L_G(y) = L_G(y')$ and $z e^t = 0$. Applying (1) and Lemma 2.2, we have
\[
L_G(y + \alpha z) = L_G(y) + \frac{1}{2} \alpha^2 z H_G(y) z^t + \alpha^3 L_G(z).
\]
Taking $\alpha = 1$, we obtain $\frac{1}{2} z H_G(y) z^t + L_G(z) = 0$. Thus
\[
L_G(y + \alpha z) = L_G(y) + \frac{1}{2} \alpha^2 (1 - \alpha)\, z H_G(y) z^t.
\]
Note that for $0 \le \alpha \le 1$, we have $y + \alpha z \in \mathrm{Int}(\Delta_n)$, since $\Delta_n$ is convex and $y, y' \in \mathrm{Int}(\Delta_n)$. It follows that $z H_G(y) z^t \le 0$; otherwise $L_G(y + \alpha z) > L_G(y)$ for $0 < \alpha < 1$, which is a contradiction. Suppose $z \ne 0$. It is easily seen that there exists an $\alpha_0 \in \mathbb{R}$ with $\alpha_0 > 1$ such that $y + \alpha_0 z \in \partial \Delta_n$. Then
\[
L_G(y + \alpha_0 z) = L_G(y) + \frac{1}{2} \alpha_0^2 (1 - \alpha_0)\, z H_G(y) z^t \ge L_G(y),
\]
so $y + \alpha_0 z$ would be an optimal weight on the boundary of $\Delta_n$, which is impossible since $G$ is dense. Hence $z = y' - y = 0$. □

Theorem 2.4. Let $G$ be a dense 3-graph. Then $y$ is the optimal weight of $G$ if and only if $x H_G(y) x^t < 0$ and $y H_G(x) y^t = 0$ for all $x \ne 0$ with $x e^t = 0$.

Proof. To prove the necessity, it is enough by Lemma 2.2 to show that $H_G(y)$ is negative definite on the hyperplane $x e^t = 0$, i.e., $x H_G(y) x^t < 0$ for $x \ne 0$ with $x e^t = 0$. For any $x \ne 0$ with $x e^t = 0$, by (2) and Lemma 2.2, we have
\[
L_G(y + \alpha x) = L_G(y) + \frac{1}{2} \alpha^2 x H_G(y) x^t + \alpha^3 L_G(x).
\]
Suppose $x H_G(y) x^t > 0$. We can choose $\alpha$ small enough such that $y + \alpha x \in \mathrm{Int}(\Delta_n)$ and $\frac{1}{2} \alpha^2 x H_G(y) x^t + \alpha^3 L_G(x) > 0$, which implies $L_G(y + \alpha x) > L_G(y)$, contradicting the assumption that $y$ is optimal. Thus $x H_G(y) x^t \le 0$. We now prove that in fact $x H_G(y) x^t < 0$. Suppose $x H_G(y) x^t = 0$. Then we have $L_G(y + \alpha x) = L_G(y) + \alpha^3 L_G(x)$. It follows that $L_G(x) = 0$; otherwise, we can choose $\alpha$ small enough such that $\alpha^3 L_G(x) > 0$ and $y + \alpha x \in \mathrm{Int}(\Delta_n)$, which implies $L_G(y + \alpha x) > L_G(y)$, a contradiction. However, when $x H_G(y) x^t = 0$ and $L_G(x) = 0$, we have $L_G(y + \alpha x) = L_G(y)$, which contradicts the uniqueness of the optimal weight $y$ given by Theorem 2.3. Hence $x H_G(y) x^t < 0$ for all $x \ne 0$ with $x e^t = 0$.

To prove the sufficiency, suppose $y' \in \mathrm{Int}(\Delta_n)$ is the unique optimal weight of $G$ and $y' \ne y$. Let $z = y' - y$. Then $z e^t = 0$ and
\[
L_G(y + \alpha z) = L_G(y) + \frac{1}{2} \alpha^2 z H_G(y) z^t + \alpha^3 L_G(z).
\]
Denote $a = \frac{1}{2} z H_G(y) z^t < 0$, $b = L_G(z)$, and $f(t) = a t^2 + b t^3$, so that $f(1) = L_G(y') - L_G(y) > 0$. Since $a < 0$ and $f(1) > 0$, we have $f'(1) = 2a + 3b = 3(a + b) - a = 3 f(1) - a > 0$, which implies that there exists $\alpha > 1$ satisfying $f(\alpha) > f(1)$ and $y' + (\alpha - 1) z = y + \alpha z \in \mathrm{Int}(\Delta_n)$. Then we have $L_G(y + \alpha z) = L_G(y) + f(\alpha) > L_G(y) + f(1) = L_G(y')$, which contradicts the optimality of $y'$. □
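The quadratic condition in Theorem 2.4 can be tested exactly for a concrete weight. The sketch below restricts $H_G(y)$ to the hyperplane $x e^t = 0$ using the basis $e_k - e_n$ ($k < n$) and applies Sylvester's criterion to the negated restricted form. This technique is swapped in for illustration and is not the argument used in the paper. It checks only $x H_G(y) x^t < 0$; the linear condition $y H_G(x) y^t = 0$ is the one handled by Lemma 2.1. All helper names and the two sample 3-graphs are assumptions.

```python
from fractions import Fraction
from itertools import combinations


def hessian(edges, x, n):
    """H_G(x) = sum_i x_i A_i(G) for a 3-graph with 1-based edges."""
    h = [[Fraction(0)] * n for _ in range(n)]
    for e in edges:
        for i in e:
            j, k = (v for v in e if v != i)
            h[j - 1][k - 1] += x[i - 1]
            h[k - 1][j - 1] += x[i - 1]
    return h


def det(m):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [row[:] for row in m]
    size, sign, result = len(m), 1, Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, size):
            factor = m[r][c] / m[c][c]
            for k in range(c, size):
                m[r][k] -= factor * m[c][k]
    return sign * result


def negative_definite_on_hyperplane(h):
    """True iff x h x^t < 0 for every nonzero x whose coordinates sum to 0."""
    n = len(h)
    # Negated Gram matrix of the form in the basis e_k - e_n, k = 1, ..., n-1.
    m = [[-(h[k][l] - h[k][n - 1] - h[n - 1][l] + h[n - 1][n - 1])
          for l in range(n - 1)] for k in range(n - 1)]
    # Sylvester's criterion: all leading principal minors must be positive.
    return all(det([row[:t] for row in m[:t]]) > 0 for t in range(1, n))


n = 4
uniform = [Fraction(1, n)] * n
k43 = list(combinations(range(1, n + 1), 3))
two_edges = [(1, 2, 3), (1, 2, 4)]   # the pair {3, 4} lies in no edge

k43_ok = negative_definite_on_hyperplane(hessian(k43, uniform, n))
two_edges_ok = negative_definite_on_hyperplane(hessian(two_edges, uniform, n))
```

For the second 3-graph the vector $e_3 - e_4$ gives $x H_G(y) x^t = 0$, so the test fails; this is the same mechanism that drives the covering-pair argument for dense 3-graphs.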

Corollary 2.5. [2] If $G$ is a dense 3-graph, then for any pair $\{i, j\} \in \binom{V(G)}{2}$, there is an edge $e \in E$ such that $\{i, j\} \subseteq e$.

Proof. Let $y$ be the optimal weight of $G$. Suppose that no edge of $G$ contains $\{i, j\}$. Then the $(i, j)$ entry of $H_G(y)$ is 0. Let $x$ be the vector whose $i$-th coordinate is 1, whose $j$-th coordinate is $-1$, and whose other coordinates are 0. Then $x \ne 0$ and $x e^t = 0$. Since the diagonal entries of $H_G(y)$ are zero, we obtain $x H_G(y) x^t = 0$, contradicting Theorem 2.4. □

Corollary 2.6. If $y$ is the optimal weight of a dense 3-graph $G$, then $H_G(y)$ is invertible.

Proof. For a square matrix $M$, the equation $xM = 0$ has a nonzero solution if and only if $M$ is singular. Suppose $x \ne 0$ is a solution of the equation $x H_G(y) = 0$. By Lemma 2.2, we have $y H_G(y) = ce$ with $c = 6 L_G(y)$. Then $0 = x H_G(y) y^t = c\, x e^t$, so $x e^t = 0$ because $c > 0$. From (2), we have $L_G(y + \alpha x) = L_G(y) + \alpha^3 L_G(x)$, which leads to a contradiction. In fact, if $L_G(x) \ne 0$, we can choose $\alpha$ small enough such that $y + \alpha x \in \mathrm{Int}(\Delta_n)$ and $\alpha^3 L_G(x) > 0$; then $L_G(y + \alpha x) > L_G(y)$, contradicting the optimality of $y$. If $L_G(x) = 0$, we have $L_G(y + \alpha x) = L_G(y)$, contradicting the uniqueness of the optimal weight of $G$. □

3. The second largest eigenvalue

Let $G$ be a dense 3-graph. In the following, we characterize the optimal weight of $G$ via the second largest eigenvalue of its Hessian matrix. Given any real vector $y \in \Delta_n$, the Hessian matrix $H_G(y)$ is a symmetric matrix whose entries are nonnegative and whose diagonal entries are all zero. Hence all eigenvalues of $H_G(y)$ are real and can be ordered as
\[
\lambda_1(y) \ge \lambda_2(y) \ge \cdots \ge \lambda_n(y).
\]
Let $\Lambda_G(y) = \mathrm{diag}(\lambda_1(y), \ldots, \lambda_n(y))$ be the diagonal matrix with diagonal entries $\lambda_i(y)$ ($i \in [n]$). Then there is an orthogonal matrix $P_G(y)$, i.e., $P_G(y) P_G^t(y) = I$, such that
\[
H_G(y) = P_G(y) \Lambda_G(y) P_G^t(y).
\]
If we set
\[
v_G(y) = y P_G(y) = (v_1(y), \ldots, v_n(y)), \qquad a_G(y) = e P_G(y) = (a_1(y), \ldots, a_n(y)),
\]
then $x \ne 0$ and $x e^t = 0$ are equivalent to $u \ne 0$ and $u a_G^t(y) = 0$, where $u = x P_G(y) = (u_1, \ldots, u_n)$.

Remark 3.1. Theorem 2.4 can be restated as follows: $y$ is the optimal weight of a dense 3-graph $G$ if and only if for all $u \ne 0$ satisfying $u a_G^t(y) = 0$, we have
\[
u \Lambda_G(y) v_G^t(y) = 0, \qquad u \Lambda_G(y) u^t < 0.
\]
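The ordered spectrum $\lambda_1(y) \ge \cdots \ge \lambda_n(y)$ introduced above can be computed numerically for small examples. The sketch below uses a plain cyclic Jacobi rotation scheme written with the standard library only; this numerical method, the helper names, the tolerances, and the test 3-graph $K_5^{(3)}$ are all assumptions chosen for illustration and do not come from the paper.

```python
import math
from fractions import Fraction
from itertools import combinations


def hessian(edges, x, n):
    """H_G(x) = sum_i x_i A_i(G) for a 3-graph with 1-based edges."""
    h = [[Fraction(0)] * n for _ in range(n)]
    for e in edges:
        for i in e:
            j, k = (v for v in e if v != i)
            h[j - 1][k - 1] += x[i - 1]
            h[k - 1][j - 1] += x[i - 1]
    return h


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-14):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations.

    Returns the eigenvalues as floats, sorted in decreasing order.
    """
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol ** 2:
            break
        for p in range(n):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                # Rotation angle that annihilates the (p, q) entry.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(t, 1.0)
                s = t * c
                for k in range(n):      # update columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):      # update rows p and q: A <- J^t A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Illustrative data: K_5^(3) with the uniform weight, where H_G(y) = (3/5)(J - I).
n = 5
k53 = list(combinations(range(1, n + 1), 3))
y = [Fraction(1, n)] * n
eigs = symmetric_eigenvalues(hessian(k53, y, n))
```

For this input the matrix is $\frac{3}{5}(J - I)$, so the computed spectrum should be close to $\frac{12}{5}$ followed by four copies of $-\frac{3}{5}$, and the eigenvalues sum to the trace, which is zero.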


Lemma 3.2. If $y$ is the optimal weight of a dense 3-graph $G$, then the largest eigenvalue $\lambda_1(y)$ of $H_G(y)$ is positive.

Proof. Since $\lambda_1(y), \ldots, \lambda_n(y)$ are the eigenvalues of $H_G(y)$, the sum $\sum_{i=1}^n \lambda_i(y)$ is the trace of $H_G(y)$, which is obviously zero, i.e., $\sum_{i=1}^n \lambda_i(y) = 0$. It follows that $\lambda_1(y) \ge 0$, since $\lambda_1(y)$ is the maximal eigenvalue of $H_G(y)$. However, if $\lambda_1(y) = 0$, then $\lambda_i(y) = 0$ for all $i \in [n]$, namely $H_G(y) = 0$, a contradiction. Hence $\lambda_1(y) > 0$. □

Lemma 3.3. If $y$ is the optimal weight of a dense 3-graph $G$, then $a_1(y) \ne 0$.

Proof. Suppose $a_1(y) = 0$. Let $u = (1, 0, \ldots, 0)$. Then $u a_G^t(y) = 0$ and $u \Lambda_G(y) u^t = \lambda_1(y) > 0$, contradicting Remark 3.1. □

Lemma 3.4. If $y$ is the optimal weight of a dense 3-graph $G$, then the second largest eigenvalue $\lambda_2(y)$ of $H_G(y)$ is negative.

Proof. Suppose $\lambda_2(y) \ge 0$. Let $u = (a_2(y), -a_1(y), 0, \ldots, 0) \ne 0$. Then $u a_G^t(y) = 0$ and
\[
u \Lambda_G(y) u^t = \lambda_1(y) a_2^2(y) + \lambda_2(y) a_1^2(y) \ge 0,
\]
contradicting Remark 3.1. □

Now we are ready to prove our main result in this section.

Theorem 3.5. A dense 3-graph $G$ has the optimal weight $y$ if and only if the second largest eigenvalue $\lambda_2(y)$ of $H_G(y)$ is negative and $y H_G(y) = 6 L_G(y) e$.

Proof. The necessity is given by Lemma 2.2 and Lemma 3.4. To prove the sufficiency, by Remark 3.1 it is enough to show that $u \Lambda_G(y) v_G^t(y) = 0$ and $u \Lambda_G(y) u^t < 0$ for all $u \ne 0$ with $u a_G^t(y) = 0$. If $u \ne 0$ and $u a_G^t(y) = 0$, then $x \ne 0$ and $x e^t = 0$. By Lemma 2.1, we have
\[
u \Lambda_G(y) v_G^t(y) = x P_G(y) \Lambda_G(y) P_G^t(y) y^t = x H_G(y) y^t = 0.
\]
It remains to prove $u \Lambda_G(y) u^t < 0$ for all $u \ne 0$ with $u a_G^t(y) = 0$. Since
\[
6 L_G(y) e = y H_G(y) = y P_G(y) \Lambda_G(y) P_G^t(y) = v_G(y) \Lambda_G(y) P_G^t(y),
\]
multiplying by $P_G(y)$ on the right of both sides, we obtain
\[
6 L_G(y) a_G(y) = v_G(y) \Lambda_G(y).
\]
On the other hand, we have $v_G(y) a_G^t(y) = y P_G(y) P_G^t(y) e^t = y e^t = 1$. It follows that
\[
1 = 6 L_G(y)\, a_G(y) \Lambda_G^{-1}(y) a_G^t(y) = 6 L_G(y) \sum_{i=1}^n \frac{a_i^2(y)}{\lambda_i(y)}. \tag{3}
\]


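Note that $\sum_{i=1}^n \lambda_i(y) = 0$. Since the second largest eigenvalue of $H_G(y)$ is negative by assumption, it is easily seen that
\[
\lambda_1(y) > 0 > \lambda_2(y) \ge \cdots \ge \lambda_n(y)
\]
and $a_1(y) \ne 0$. For $2 \le i \le n$, let $b_i(y) = \dfrac{a_i(y)}{a_1(y)}$; then $u a_G^t(y) = 0$ implies $u_1 = -\sum_{i=2}^n b_i(y) u_i$. Let $d_n(y) = \lambda_1(y)$. We have
\begin{align}
u \Lambda_G(y) u^t &= \sum_{i=1}^n \lambda_i(y) u_i^2 = \sum_{i=2}^n \lambda_i(y) u_i^2 + d_n(y) \Big( \sum_{i=2}^n b_i(y) u_i \Big)^2 \tag{4} \\
&= \big( \lambda_n(y) + d_n(y) b_n^2(y) \big) \Big( u_n + \frac{d_n(y) b_n(y)}{\lambda_n(y) + d_n(y) b_n^2(y)} \sum_{i=2}^{n-1} b_i(y) u_i \Big)^2 \notag \\
&\quad + \sum_{i=2}^{n-1} \lambda_i(y) u_i^2 + \Big( d_n(y) - \frac{d_n^2(y) b_n^2(y)}{\lambda_n(y) + d_n(y) b_n^2(y)} \Big) \Big( \sum_{i=2}^{n-1} b_i(y) u_i \Big)^2. \notag
\end{align}
Let
\[
d_{n-1}(y) = d_n(y) - \frac{d_n^2(y) b_n^2(y)}{\lambda_n(y) + d_n(y) b_n^2(y)} = \frac{\lambda_n(y) d_n(y)}{\lambda_n(y) + d_n(y) b_n^2(y)}.
\]
Then we have
\begin{align*}
u \Lambda_G(y) u^t &= \big( \lambda_n(y) + d_n(y) b_n^2(y) \big) \Big( u_n + \frac{d_n(y) b_n(y)}{\lambda_n(y) + d_n(y) b_n^2(y)} \sum_{i=2}^{n-1} b_i(y) u_i \Big)^2 \\
&\quad + \sum_{i=2}^{n-1} \lambda_i(y) u_i^2 + d_{n-1}(y) \Big( \sum_{i=2}^{n-1} b_i(y) u_i \Big)^2.
\end{align*}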

Applying similar operations as in (4) to the expression
\[
\sum_{i=2}^{n-1} \lambda_i(y) u_i^2 + d_{n-1}(y) \Big( \sum_{i=2}^{n-1} b_i(y) u_i \Big)^2,
\]
we obtain
\[
u \Lambda_G(y) u^t = \sum_{i=2}^n \big( \lambda_i(y) + d_i(y) b_i^2(y) \big) \Big( u_i + \frac{d_i(y) b_i(y)}{\lambda_i(y) + d_i(y) b_i^2(y)} \sum_{j=2}^{i-1} b_j(y) u_j \Big)^2, \tag{5}
\]
where the sequence $d_i(y)$ is defined recursively from $i = n$ down to $i = 1$ as follows:
\[
d_n(y) = \lambda_1(y), \qquad d_{i-1}(y) = \frac{d_i(y) \lambda_i(y)}{\lambda_i(y) + d_i(y) b_i^2(y)} \quad \text{for } i = 2, \ldots, n.
\]

It is easy to check by induction on $i$ that
\[
d_{i-1}(y) = \frac{\lambda_1(y)}{1 + \sum_{j=i}^n \dfrac{\lambda_1(y)}{\lambda_j(y)} b_j^2(y)}.
\]
To prove $u \Lambda_G(y) u^t < 0$ for all $u \ne 0$ with $u a_G^t(y) = 0$, it suffices by (5) to prove $\lambda_i(y) + d_i(y) b_i^2(y) < 0$ for $i = 2, \ldots, n$. Since the $\lambda_i(y)$ are negative for $i \ge 2$ and
\[
\lambda_i(y) + d_i(y) b_i^2(y) = \frac{d_i(y) \lambda_i(y)}{d_{i-1}(y)},
\]
it is enough to show that $\dfrac{d_i(y)}{d_{i-1}(y)} > 0$ for $i = 2, \ldots, n$. Recall that $d_n(y) = \lambda_1(y) = -\sum_{i=2}^n \lambda_i(y) > 0$, so we only need to prove $d_{i-1}(y) > 0$ for $i = 2, \ldots, n$. Since $\lambda_1(y) > 0 > \lambda_2(y) \ge \cdots \ge \lambda_n(y)$, we have
\[
1 + \sum_{j=i}^n \frac{\lambda_1(y)}{\lambda_j(y)} b_j^2(y) \ge 1 + \sum_{i=2}^n \frac{\lambda_1(y)}{\lambda_i(y)} b_i^2(y).
\]
Since
\[
d_{i-1}(y) = \frac{\lambda_1(y)}{1 + \sum_{j=i}^n \dfrac{\lambda_1(y)}{\lambda_j(y)} b_j^2(y)},
\]
it is enough to prove
\[
1 + \sum_{i=2}^n \frac{\lambda_1(y)}{\lambda_i(y)} b_i^2(y) = 1 + \frac{\lambda_1(y)}{a_1^2(y)} \sum_{i=2}^n \frac{a_i^2(y)}{\lambda_i(y)} > 0.
\]

Applying (3), it remains to prove
\[
\frac{\lambda_1(y)}{6 L_G(y) a_1^2(y)} > 0,
\]
which is obvious since $\lambda_1(y) > 0$, $L_G(y) > 0$, and $a_1(y) \ne 0$. □

4. Application on regular 3-graphs

Given a 3-graph $G$ with edge set $E \subseteq \binom{[n]}{3}$, we denote by $E_{ij}(G)$ the link of $\{i, j\} \subseteq [n]$, i.e.,
\[
E_{ij}(G) = \{ k \in [n] \mid (ijk) \in E \}.
\]
A 3-graph $G$ is called $s$-regular if $|E_{ij}(G)| = s$ for all $\{i, j\} \subseteq [n]$.
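The regularity condition can be checked by tabulating the pair links $E_{ij}(G)$. The following is a minimal sketch; the helper names and the sample 3-graphs (complete 3-graphs and the Fano plane with a standard labelling of its seven lines) are illustrative choices, not examples taken from the paper.

```python
from itertools import combinations


def pair_links(edges, n):
    """Map each pair {i, j}, stored as a sorted tuple, to its link E_ij(G)."""
    links = {pair: set() for pair in combinations(range(1, n + 1), 2)}
    for e in edges:
        for k in e:
            i, j = sorted(v for v in e if v != k)
            links[(i, j)].add(k)
    return links


def regularity(edges, n):
    """Return s if every pair link has size s (s-regular), else None."""
    sizes = {len(link) for link in pair_links(edges, n).values()}
    return sizes.pop() if len(sizes) == 1 else None


k43 = list(combinations(range(1, 5), 3))
k53 = list(combinations(range(1, 6), 3))
fano = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
        (2, 5, 7), (3, 4, 7), (3, 5, 6)]

s_k43 = regularity(k43, 4)            # every pair lies in 2 edges
s_k53 = regularity(k53, 5)            # every pair lies in 3 edges
s_fano = regularity(fano, 7)          # every pair lies in exactly 1 edge
s_broken = regularity(k43[:-1], 4)    # links have different sizes
```

In general $K_n^{(3)}$ is $(n-2)$-regular, and any 3-graph in which each pair lies in exactly one edge is 1-regular.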


Theorem 4.1. If $G$ is an $s$-regular dense 3-graph, then $y = \left( \frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n} \right)$ is the unique optimal weight.

Proof. We have
\[
H_G(y) = \frac{1}{n} H_G(e) = \frac{s}{n} (J - I)
\quad \text{and} \quad
y H_G(y) = \frac{s}{n^2}\, e (J - I) = \frac{s(n-1)}{n^2}\, e,
\]
where $J$ is the matrix with all entries 1 and $I$ is the identity matrix. Meanwhile,
\[
L_G(y) = \frac{|E(G)|}{n^3} = \frac{1}{n^3} \cdot \frac{1}{3} \binom{n}{2} s = \frac{s(n-1)}{6 n^2}.
\]
Then we have $y H_G(y) = 6 L_G(y) e$. By Theorem 3.5, it remains to show that the second largest eigenvalue of $H_G(y) = \frac{s}{n}(J - I)$ is negative. Note that $J - I$ has eigenvalues $n - 1, -1, \ldots, -1$, which completes the proof. □

Corollary 4.2. If $G$ is an $s$-regular dense 3-graph, then $s > \frac{2}{9} n$.

Proof. Let $G$ be an $s$-regular dense 3-graph. By Theorem 4.1, we have
\[
L_G(y) = \frac{s(n-1)}{6 n^2} \ge L_G(x) \quad \text{for all } x \in \Delta_n.
\]
Relabeling the vertices if necessary so that $(123) \in E$, and taking $x = \left( \frac{1}{3}, \frac{1}{3}, \frac{1}{3}, 0, \ldots, 0 \right)$, we have
\[
\frac{s(n-1)}{6 n^2} \ge \frac{1}{27}.
\]
A simple calculation then yields
\[
s \ge \frac{2 n^2}{9(n-1)} > \frac{2}{9} n. \qquad □
\]
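The closed form $L_G(y) = \frac{s(n-1)}{6n^2}$ from Theorem 4.1 and the inequality in the proof of Corollary 4.2 are simple to evaluate exactly. The sketch below cross-checks the closed form against a direct evaluation and tests the necessary condition $\frac{s(n-1)}{6n^2} \ge \frac{1}{27}$. The function names and the sample data (the Fano plane, which is 1-regular on 7 vertices, and $K_4^{(3)}$) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def lagrange3(edges, x):
    """L_G(x) for a 3-graph with 1-based edges (i, j, k)."""
    return sum(x[i - 1] * x[j - 1] * x[k - 1] for i, j, k in edges)


def uniform_lagrange_value(s, n):
    """L_G at the uniform weight of an s-regular 3-graph: s(n-1)/(6 n^2)."""
    return Fraction(s * (n - 1), 6 * n * n)


def passes_density_bound(s, n):
    """Necessary condition from Corollary 4.2: s(n-1)/(6 n^2) >= 1/27."""
    return uniform_lagrange_value(s, n) >= Fraction(1, 27)


fano = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
        (2, 5, 7), (3, 4, 7), (3, 5, 6)]
k43 = list(combinations(range(1, 5), 3))

# Direct evaluation at the uniform weight agrees with the closed form.
fano_direct = lagrange3(fano, [Fraction(1, 7)] * 7)
k43_direct = lagrange3(k43, [Fraction(1, 4)] * 4)

fano_passes = passes_density_bound(1, 7)
k43_passes = passes_density_bound(2, 4)
```

For the Fano plane the uniform value is $\frac{1}{49} < \frac{1}{27}$, so a single edge with weights $\left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right)$ already does better and, by Corollary 4.2, the Fano plane cannot be dense. $K_4^{(3)}$ passes the test with value $\frac{1}{16}$; passing is only a necessary condition and does not by itself establish density.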

References

[1] P. Frankl, Z. Füredi, Extremal problems whose solutions are the blow-ups of the small Witt-designs, J. Combin. Theory Ser. A 52 (1989) 129–147.
[2] P. Frankl, V. Rödl, Hypergraphs do not jump, Combinatorica 4 (2–3) (1984) 149–159.
[3] P. Keevash, Hypergraph Turán problems, in: Surveys in Combinatorics, Cambridge University Press, 2011, pp. 83–140.
[4] T. Motzkin, E. Straus, Maxima for graphs and a new proof of a theorem of Turán, Canad. J. Math. 17 (1965) 533–540.
[5] A.F. Sidorenko, The maximal number of edges in a homogeneous hypergraph containing no prohibited subgraphs, Math. Notes 41 (1987) 247–259, translated from Mat. Zametki.
[6] J. Talbot, Lagrangians of hypergraphs, Combin. Probab. Comput. 11 (2002) 199–216.