Linear Algebra and its Applications 559 (2018) 95–113
Layout of random circulant graphs
Sebastian Richter a,1, Israel Rocha b,∗,2
a Fakultät für Mathematik, Technische Universität Chemnitz, D-09107 Chemnitz, Germany
b The Czech Academy of Sciences, Institute of Computer Science, Pod Vodárenskou věží 2, 182 07 Prague, Czech Republic 3
Article info
Article history: Received 14 July 2017; Accepted 4 September 2018; Available online 7 September 2018. Submitted by R. Brualdi.
MSC: 05C50, 05C85, 15A52, 15A18
Abstract
A circulant graph G is a graph on n vertices that can be numbered from 0 to n − 1 in such a way that, if two vertices x and (x + d) mod n are adjacent, then every two vertices z and (z + d) mod n are adjacent. We call any numbering that witnesses this definition a layout of the circulant graph. A random circulant graph results from deleting each edge of G independently with probability 1 − p. We address the problem of finding the layout of a random circulant graph. We provide a polynomial time algorithm that approximates the solution and we bound the error of the approximation with high probability. © 2018 Elsevier Inc. All rights reserved.
Keywords: Random graphs; Geometric graphs; Circulant matrices; Random matrices; Rank correlation coefficient
* Corresponding author. E-mail addresses: [email protected] (S. Richter), [email protected] (I. Rocha).
1 Richter thanks the Institute of Computer Science of The Czech Academy of Sciences, and the Czech Science Foundation, under the grant number GJ16-07822Y, for travel support.
2 Rocha was supported by the Czech Science Foundation, grant number GJ16-07822Y.
3 With institutional support RVO:67985807.
https://doi.org/10.1016/j.laa.2018.09.003 0024-3795/© 2018 Elsevier Inc. All rights reserved.
Fig. 1.1. Circulant graphs.
1. Introduction
A layout of the graph G = (V, E) is a bijective function f : V → {1, . . . , |V|}. Layout problems can be used to formulate several well-known optimization problems on graphs. Also known as linear ordering problems or linear arrangement problems, they consist of the minimization of specific metrics. Such metrics provide the solution to problems such as linear arrangement, bandwidth, modified cut, cut width, sum cut, vertex separation and edge separation. All these problems are NP-hard in the general case.
A circulant graph H is defined on the set of vertices V = {1, . . . , n} with edges $E = \{(i, j) : |i - j| \equiv s \pmod{n},\ s \in N\}$, where $N \subseteq \{1, \dots, \frac{n-1}{2}\}$. There are a few equivalent definitions of a circulant graph: these graphs have a circulant adjacency matrix; a circulant graph H is a graph on n vertices that can be numbered from 0 to n − 1 in such a way that, if two vertices x and (x + d) mod n are adjacent, then every two vertices z and (z + d) mod n are adjacent. We call any numbering that witnesses the latter definition a layout of the circulant graph. A random circulant graph results from deleting each edge of H independently with probability 1 − p. Finally, a layout of the random circulant graph is a layout of the deterministic graph H, which we refer to as the graph model.
Notably, circulant graphs carry a nice shape (see Fig. 1.1). From the picture, it is easy to see that, starting from zero, any sequential numbering given to the vertices around the circle is a layout. In fact, any layout arises this way if we correctly place all vertices around a circle. That means a layout encodes the geometric structure of a circulant graph. Naturally, the structure is expected to be reflected in the random graph as well, which gives rise to the main question of this paper. Precisely, the problem we address is the following: given a random circulant graph, find its layout. In this paper, we present a solution to this problem by means of eigenvectors.
This type of problem is related to the Minimum Linear Arrangement problem (MinLA), that is, to find a function f that minimizes the sum $\sum_{uv \in E} |f(u) - f(v)|$. MinLA is one of the most important graph layout problems and was introduced in 1964 by Harper to develop error-correcting codes with minimal average absolute errors. In fact, MinLA appears in a vast domain of problems: VLSI circuit design, network reliability, topology awareness of overlay networks, single machine job scheduling, numerical analysis, computational biology, information retrieval, automatic graph drawing, etc. For instance, layout problems appear in the reconstruction of DNA sequences [6],
using overlaps of genes between fragments. Also, MinLA has been used in brain cortex modelling [7]. A good survey on graph layout problems and their applications is presented in [5].
We want to call the attention of the reader to the fact that, even though closely related, the layout of a circulant graph is not a solution of MinLA for all instances of circulant graphs. The MinLA problem for some classes of circulant graphs is solved in [8], where the authors address the problem of finding an embedding of G into a graph H. In that case, G is a circulant graph and H is a cycle graph.
Certain circulant graphs are of particular interest. In [10], a polynomial time algorithm is presented that solves MinLA for Chord graphs, which are a particular case of circulant graphs. The main motivation of [10] is an application to topology awareness of peer-to-peer overlay networks. The solution of [10] assumes that the Chord graph is complete. However, in real overlay networks nodes can disconnect at any moment, so the remaining network can be regarded as a random circulant graph. Thus, the random circulant graph model and the reconstruction of its layout suit such applications well.
Nevertheless, layout problems for random graphs are significantly more complicated than their deterministic counterparts, and usually the solution is an approximation of the solution of the model graph. The paper [5] is concerned with the approximability of several layout problems on families of random geometric graphs. It is proven that some of these problems are still NP-complete even for deterministic geometric graphs. The authors present heuristics that turn out to be constant approximation algorithms for layout problems on random geometric graphs, almost surely. The authors of [5] remark that their algorithms use the node coordinates in order to build a layout. That is another feature we do not require in our problem.
Even though the random graph follows a geometric graph model (the circulant structure), we do not have the coordinates of the random graph in advance. The input random graph consists of a set of vertices and edges only, and we have to retrieve the circulant layout from that.
Eigenvectors of random matrices are the main tool we use to reconstruct the layout in our problem. We introduced this idea in [9], where one eigenvector suffices to recover the structure of a random linear graph. Here, as we will see, one eigenvector alone is not enough to encode the whole layout. Fortunately, we can combine two special eigenvectors to find the linear arrangement. Even though the use of eigenvectors in the same fashion is a common feature of both methods, here we require some additional technical details that were not present in [9]. Due to the use of angles between subspaces and the singular value decomposition, the technique we use here differs significantly from [9]. There, we pointed out the generality of such a method, and here it turns out we need a more careful analysis. Nevertheless, we have evidence that these methods can be used to implement a general framework in which layout problems can be solved for a broader class of random geometric graphs.
The rest of the paper is organized as follows. In Section 2 we define the model matrix and state the algorithm and the main theorems. In Section 3 we describe basic properties of angles between subspaces. Finally, in Section 4 we provide the proofs of the results.
2. Main results
A circulant matrix A is a matrix that can be completely specified by only one vector a, that appears in the first column of A. The remaining columns are cyclic permutations of a with offset equal to the column index, i.e., the matrix A is of the following form
\[
A = \begin{bmatrix}
a_1 & a_2 & a_3 & \dots & a_n \\
a_n & a_1 & a_2 & \dots & a_{n-1} \\
\vdots & & \ddots & & \vdots \\
a_2 & a_3 & a_4 & \dots & a_1
\end{bmatrix}.
\]
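The construction above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `circulant_adjacency` and its argument `offsets` are hypothetical, indices are 0-based (so `a[s]` plays the role of $a_{s+1}$), and each offset s is mirrored to n − s so that the matrix is symmetric, as an adjacency matrix must be.

```python
import numpy as np

def circulant_adjacency(n, offsets):
    """Symmetric circulant 0/1 matrix on n vertices (hypothetical helper).

    Vertices i and j are adjacent iff (i - j) mod n or (j - i) mod n is in offsets.
    """
    a = np.zeros(n, dtype=int)      # generating vector, used as the first row
    for s in offsets:
        a[s % n] = 1
        a[(-s) % n] = 1             # mirrored offset keeps the matrix symmetric
    # Row i is the generating vector cyclically shifted i positions to the right.
    return np.stack([np.roll(a, i) for i in range(n)])

A = circulant_adjacency(8, {1, 3})
print(A)
```

Because the generating vector is mirrored, the first row and the first column of the resulting matrix coincide.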
A circulant graph is a graph with circulant adjacency matrix. Let H = (V, E) be a circulant graph with vertex set $V = \{v_1, \dots, v_n\}$ and adjacency matrix A, where $[a_1, \dots, a_n]$ corresponds to the first row of A. We define the set of indices of non-zero elements in the first half of the row of A as
\[ N := \{k : a_k = 1,\ 1 \le k \le \tfrac{n-1}{2}\}. \]
We call N the adjacency set of the circulant graph. Equivalently, a circulant graph can be defined as the Cayley graph of a finite cyclic group. In this paper, H is referred to as the model graph. The random graph we consider is denoted by G = (V, E), which results from deleting edges of H with probability 1 − p. The model matrix M is a circulant matrix that describes the structure of H, where M = pA (see Fig. 2.1).
Fig. 2.1. Model graph and its model matrix.
Furthermore, let $\hat M$ be the adjacency matrix of the random graph G. The entries of $\hat M$ correspond to independent Bernoulli variables, where $P(\hat m_{ij} = 1) = m_{ij}$ (see Fig. 2.2). By construction, the labels of vertices in the random graph G correspond to the same labels as the graph model in the first figure. However, in the real world we do not have the labels in advance. We are dealing with a large amount of disorganized data with additional noise. We only know that this data encodes a circulant structure which is hidden from us. In such situations, finding the labels for the random graph can be rather challenging (see Fig. 2.3).
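To make the setting concrete, the following sketch samples a random graph from a small circulant model and then hides the labels with a random permutation. The helper names `sample_random_graph` and `shuffle_labels`, the size n = 12, the offsets {1, 2} and p = 0.7 are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def sample_random_graph(A, p, rng):
    """Keep each edge of the symmetric 0/1 matrix A independently with probability p."""
    n = A.shape[0]
    keep = rng.random((n, n)) < p        # one Bernoulli(p) draw per ordered pair
    upper = np.triu(A * keep, k=1)       # decide each undirected edge exactly once
    return upper + upper.T               # symmetrize; the diagonal stays zero

def shuffle_labels(M_hat, rng):
    """Relabel the vertices with a random permutation, hiding the layout."""
    perm = rng.permutation(M_hat.shape[0])
    return M_hat[np.ix_(perm, perm)], perm

rng = np.random.default_rng(0)
n = 12
a = np.zeros(n, dtype=int)
a[[1, 2, n - 2, n - 1]] = 1              # offsets {1, 2} and their mirror images
A = np.stack([np.roll(a, i) for i in range(n)])   # model graph H

M_hat = sample_random_graph(A, 0.7, rng)          # random graph with known labels
M_shuffled, perm = shuffle_labels(M_hat, rng)     # what one observes in practice
```

The matrix `M_shuffled` corresponds to the situation of Fig. 2.3 before the correct permutation is applied.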
Fig. 2.2. Random graph and its random matrix.
Fig. 2.3. Data before and after the correct permutation. The first matrix represents the adjacency matrix of a random circulant graph using arbitrary labels for the vertices. The second matrix is the adjacency matrix of the same graph using the correct labels.
That is precisely the problem we address in this paper: given a graph that follows a circulant model, find its circular embedding, or rather, retrieve the correct order of the vertices. We present an algorithm that solves this problem by using eigenvectors corresponding to the second and third largest eigenvalues of $\hat M$. The algorithm can be described as follows.
Algorithm 1
Require: Random matrix $\hat M$
1: Compute $\hat x$ and $\hat y$, the eigenvectors for $\lambda_2(\hat M)$ and $\lambda_3(\hat M)$
2: Compute the angular coordinate $\hat\varphi_i$ for the point of coordinates $(\hat x_i, \hat y_i)$
3: Define a permutation σ such that σ(i) > σ(j) iff $\hat\varphi_i \ge \hat\varphi_j$
4: return σ
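A minimal NumPy sketch of Algorithm 1 follows. The function name `algorithm1` is hypothetical, the angular coordinate is taken with `arctan2` in [0, 2π), and ties between equal angles are broken by vertex index; these are implementation choices, not prescriptions of the paper. The usage example runs the sketch on a noiseless model graph (p = 1) with offsets {1, 2}, an illustrative choice for which the second and third largest eigenvalues form the cosine/sine pair.

```python
import numpy as np

def algorithm1(M_hat):
    """Return sigma, where sigma[i] is the rank of vertex i when sorted by angle."""
    M_hat = np.asarray(M_hat, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(M_hat)       # eigenvalues in ascending order
    x_hat = eigvecs[:, -2]                         # eigenvector for the 2nd largest eigenvalue
    y_hat = eigvecs[:, -3]                         # eigenvector for the 3rd largest eigenvalue
    phi = np.arctan2(y_hat, x_hat) % (2 * np.pi)   # angular coordinate of (x_i, y_i)
    order = np.argsort(phi, kind="stable")         # vertices listed by increasing angle
    sigma = np.empty_like(order)
    sigma[order] = np.arange(len(order))           # sigma(i) = position of vertex i
    return sigma

# Usage on a noiseless circulant graph: the recovered order is the circular one.
n = 12
a = np.zeros(n, dtype=int)
a[[1, 2, n - 2, n - 1]] = 1
A = np.stack([np.roll(a, i) for i in range(n)])
sigma = algorithm1(A)
print(sigma)
```

Since the two eigenvalues coincide in the model, the eigenvectors are only determined up to a rotation or reflection of the plane, so the output is the circular order up to a cyclic shift and a change of orientation.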
This simple algorithm is shown to return the correct labels with a bounded error. We quantify the error in terms of a rank correlation coefficient that we introduce below. First, let us plot the points described in Algorithm 1 (see Fig. 2.4).
Fig. 2.4. Points whose coordinates are entries of the eigenvectors. Probabilities: 0.3, 0.5, 0.9. Blue and red points are associated to the eigenvector of the model graph and the random graph, respectively. (For interpretation of the colours in the figure(s), the reader is referred to the web version of this article.)
The circle-like shape indeed follows the layout we are looking to reconstruct. This phenomenon can be explained in terms of angles between subspaces, which appear in our proofs. Also, notice that the points in the wrong position make up only a minor part of all points. That can be explained in terms of rank correlation coefficients. A rank correlation coefficient measures the degree of similarity between two lists, and can be used to assess the significance of the relation between them. One can see the rank of one list as a permutation of the rank of the other. Statisticians have used a number of different measures of closeness for permutations. Some popular rank correlation statistics are Kendall's τ, Kendall distance, and Spearman's footrule. There are several other metrics, and for different situations some metrics are preferable. For a deeper discussion on metrics on permutations we recommend [4]. To count the total number of inversions in σ one can use
\[ D(\sigma) = \sum_{i<j} \mathbf{1}_{\sigma(i)>\sigma(j)} \qquad \text{(Kendall distance)} \]
First we describe a refined version of the Kendall distance introduced in [9] that is used for a random linear graph model. This version counts inverted pairs whose indices are at least k positions apart. First note that, for a permutation σ, we can rewrite D(σ) as $D(\sigma) = |\{(i, j) : \sigma(j) < \sigma(i) \text{ and } i < j\}|$. Given a permutation σ and an index k ≥ 1, let $D_k(\sigma) = |\{(i, j) : \sigma(j) < \sigma(i) \text{ and } i + k \le j\}|$. Thus, $D_k$ counts the number of inverted pairs where the vertices have jumped at least k positions from their original order. In particular, $D_1(\sigma) = D(\sigma)$. Notice that the way we quantify the error for a linear model differs from the way we do it for circulant graphs. The inversions must somehow account for the circular nature of the ordering, which is one main difference between the model we used in [9] and the circulant model in this paper.
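The linear version of $D_k$ defined above translates directly into code. The sketch below is a plain quadratic-time count with 0-based indices; the function name `kendall_distance` is hypothetical, and the circular variant used for circulant graphs is not covered here.

```python
def kendall_distance(sigma, k=1):
    """D_k(sigma): number of pairs (i, j) with i + k <= j and sigma[j] < sigma[i]."""
    n = len(sigma)
    return sum(
        1
        for i in range(n)
        for j in range(i + k, n)
        if sigma[j] < sigma[i]
    )

sigma = [2, 0, 1, 3]
print(kendall_distance(sigma, 1))  # 2: the inverted pairs are (0, 1) and (0, 2)
print(kendall_distance(sigma, 2))  # 1: only (0, 2) has indices at least 2 apart
```

With k = 1 this is the usual Kendall distance D(σ), and increasing k can only decrease the count.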
From now on, we want to explain $D_k(\sigma)$ for the permutation returned by Algorithm 1, which gives good insight into the mechanism behind that algorithm. Consider the eigenvectors x and y for $\lambda_2(M)$ and $\lambda_3(M)$, respectively. As we will see later, the points $z_i = (x_i, y_i)$ are equally distributed on a circle in the plane. So let us assume it for a moment. Let $\varphi(x)$ be the angular coordinate of $x \in \mathbb{R}^2$. A crucial observation is that $\{\varphi(z_i)\}_{i=1}^n$ is an increasing sequence. That means that the order of $\varphi(z_i)$ provides one possible correct placement for the vertices in the model graph. Similarly, we can consider the eigenvectors $\hat x$ and $\hat y$ for $\lambda_2(\hat M)$ and $\lambda_3(\hat M)$, respectively, and the points $\hat z_i = (\hat x_i, \hat y_i)$. Here, $\{\varphi(\hat z_i)\}_{i=1}^n$ does not necessarily form an increasing sequence. Thus, we want to construct a permutation σ of indices that places all angles $\varphi(\hat z_i)$ in their correct position. What Algorithm 1 does is simply sort the angles. What we can show is that this simple procedure places almost all vertices in their correct position.
In order to quantify how many points are wrong, a criterion to identify them is needed. To precisely describe what we mean by correct position, let us look back at the points $z_i$ coming from the deterministic matrix M. Notice that for i < j ≤ i + n/2, it trivially holds that $(\varphi(z_j) - \varphi(z_i)) \bmod 2\pi < \pi$, and for i + n/2 < j it holds that $(\varphi(z_i) - \varphi(z_j)) \bmod 2\pi \le \pi$. That is to say, the first half of the points $z_j$ that are positioned after $z_i$ on the circle in the clockwise direction form an angle smaller than π; for the second half of the points the angle is larger than π. That is a property that we would also like for the points coming from the random matrix, that is, for all pairs of points $\hat z_i$ and $\hat z_j$ after applying the permutation σ. This way, we will say that $\varphi(\hat z_i)$ and $\varphi(\hat z_j)$ are in the wrong position with respect to each other if either i < j < i + n/2 and $(\varphi(\hat z_i) - \varphi(\hat z_j)) \bmod 2\pi \le \pi$, or i + n/2 < j and $(\varphi(\hat z_j) - \varphi(\hat z_i)) \bmod 2\pi < \pi$.
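That gives
\[
\begin{aligned}
D_1(\sigma) &= |\{(i, j) : \sigma(j) < \sigma(i) \text{ and } i < j\}| \\
&= |\{(i, j) : (\varphi(\hat z_i) - \varphi(\hat z_j)) \bmod 2\pi \le \pi \text{ and } i < j \le i + n/2, \\
&\qquad\quad \text{or } (\varphi(\hat z_j) - \varphi(\hat z_i)) \bmod 2\pi < \pi \text{ and } i + n/2 < j \le n\}|.
\end{aligned}
\]
Similarly, for any 1 ≤ k < n/2 we can write
\[
\begin{aligned}
D_k(\sigma) &= |\{(i, j) : \sigma(j) < \sigma(i) \text{ and } i + k \le j\}| \\
&= |\{(i, j) : (\varphi(\hat z_i) - \varphi(\hat z_j)) \bmod 2\pi \le \pi \text{ and } i + k \le j \le i + n/2, \\
&\qquad\quad \text{or } (\varphi(\hat z_j) - \varphi(\hat z_i)) \bmod 2\pi < \pi \text{ and } i + n/2 < j \le L(i)\}|,
\end{aligned}
\tag{2.1}
\]
where
\[
L(i) = \begin{cases} n & \text{if } k \le i, \\ i - k \bmod n & \text{otherwise.} \end{cases}
\]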
The first part of the definition of $D_k$ accounts for the first half of the indices j that are at least k positions behind i. For the second half, the function L(i) is an upper limit on the indices
we account for, and it is meant to capture the circular nature of the model. That is, for each point i, L(i) ensures that we are only considering indices j that are at least k positions ahead of i. In view of the last observations, the permutation σ returned by Algorithm 1 has a neat interpretation in terms of $D_k$: $D_k(\sigma)$ counts the pairs in $\hat z$ that disagree with the order induced by z by at least k positions. That is, the permutation σ of Algorithm 1 has $D_k(\sigma)$ wrong pairs. Fortunately, the next theorem bounds the number of such pairs.
Theorem 1. Let σ be the permutation returned by Algorithm 1 for a random circulant graph. Let $k \in \Omega(n^{\beta})$ and $|N| = cn$, for a constant c > 0. Then it holds that $D_k(\sigma) \in O(n^{3-2\beta})$ with probability $1 - n^{-3}$.
In fact, we prove a more general version of Theorem 1 where we allow the edge density to be variable.
Theorem 2. Let σ be the permutation returned by Algorithm 1 for a random circulant graph with model satisfying $|N| = cn^{\gamma}$, for a constant c > 0. Let $k \in \Omega(n^{\beta})$. Then we have $D_k(\sigma) \in O(n^{9-6\gamma-2\beta})$ with probability $1 - n^{-3}$.
Furthermore, depending on the parameters γ and β we can improve the bounds of the last theorems, as shown in the next result.
Theorem 3. Let σ be the permutation returned by Algorithm 1 for a random circulant graph with model satisfying $|N| = cn^{\gamma}$, for a constant c > 0. Let $k \in \Omega(n^{\beta})$. Then we have $D_k(\sigma) \in O(n^{\frac{10-6\gamma-\beta}{2}})$ with probability $1 - n^{-3}$.
Notice that the last result shows that there is a trade-off between how far vertices can jump and the total number of such incorrectly placed vertices. That is useful for our purpose of establishing metrics on the correctness of the rank. For example, consider the worst case of Theorem 3 when all pairs are incorrect. Assuming γ = 1, the number of pairs that drift less than k positions apart is $n^2 - D_k$. If we fix $0 < \varepsilon < 2/3$ and take $\beta = \varepsilon$ in Theorem 3, we obtain that $n^2 - D_k$ is asymptotically equivalent to $n^2$ as n → ∞.
That means almost no vertex will drift more than $n^{\varepsilon}$ slots from its correct position, which gives a nearly constant drift for almost all vertices. From the perspective of Theorem 2, fixing a number 1/2 < β ≤ 1, again $n^2 - D_k$ is asymptotically equivalent to $n^2$ as n → ∞. This is the same as saying that almost no vertex will drift more than $n^{1/2}$ slots from its correct position. Finally, the next theorem shows that the permutation returned by Algorithm 1 is well behaved in terms of the usual Kendall distance.
Theorem 4. Let σ be the permutation returned by Algorithm 1 for a random circulant graph with model satisfying $|N| = cn^{\gamma}$ and 1 ≥ γ > 0. Then $D(\sigma) \in O(n^{(11-6\gamma)/3})$ with probability $1 - n^{-3}$.
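To compare the bounds for concrete parameters, one can tabulate the exponents of n appearing in Theorems 2, 3 and 4. The helper below is only a bookkeeping sketch (the name `bound_exponents` is hypothetical); it evaluates the exponents stated in the theorems and proves nothing by itself.

```python
def bound_exponents(gamma, beta):
    """Exponents e such that the stated bounds read O(n**e)."""
    return {
        "theorem2": 9 - 6 * gamma - 2 * beta,      # bound on D_k(sigma) in Theorem 2
        "theorem3": (10 - 6 * gamma - beta) / 2,   # bound on D_k(sigma) in Theorem 3
        "theorem4": (11 - 6 * gamma) / 3,          # bound on D(sigma) in Theorem 4
    }

print(bound_exponents(gamma=1, beta=0.5))
# {'theorem2': 2.0, 'theorem3': 1.75, 'theorem4': 1.6666666666666667}
```

For γ = 1 the exponents of Theorems 2 and 3 are 3 − 2β and 2 − β/2, which coincide at β = 2/3; Theorem 3 gives the smaller exponent for β < 2/3 and Theorem 2 for β > 2/3.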
To prove the results, our technique uses singular value decomposition and angles between subspaces, which require expressions for the eigenvalues and eigenvectors of the model matrix. Fortunately, circulant matrices have known spectrum and, as we will see, there is a specific pair of eigenvectors carrying the desired information about the structure of the graph, providing the correct label of vertices. Moreover, consecutive entries of the eigenvectors differ significantly enough so that a small perturbation will have limited effect on the labels. Further, in Section 3 we show that those eigenvectors are close to the eigenvectors of the random graph. In Section 4 we perform the qualitative analysis of the problem, proving the main results.
3. SVD and angles between subspaces
The definition of an angle between two vectors can be extended to angles between subspaces.
Definition 5. Let $\mathcal{X} \subset \mathbb{R}^n$ and $\mathcal{Y} \subset \mathbb{R}^n$ be subspaces with $\dim(\mathcal{X}) = p$ and $\dim(\mathcal{Y}) = q$. Let m = min(p, q). The principal angles $\Theta = [\theta_1, \dots, \theta_m]$, where $\theta_k \in [0, \pi/2]$, k = 1, . . . , m, between $\mathcal{X}$ and $\mathcal{Y}$ are recursively defined by their cosine values. Namely, given $s_1, \dots, s_{k-1}$, we define $s_k$ by the value of the optimization problem
\[ s_k = \cos(\theta_k) = \max_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} x^T y = x_k^T y_k, \]
subject to $\|x\| = \|y\| = 1$, $x^T x_i = 0$, $y^T y_i = 0$, for i = 1, . . . , k − 1. The vectors $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_m\}$ are called the principal vectors for $\mathcal{X}$ and $\mathcal{Y}$.
The principal angles and principal vectors can be characterized in terms of a singular value decomposition. That provides a constructive form for the principal vectors, which is what we use in the proofs. That is the subject of the next theorem, proved in [1].
Theorem 6. Let the columns of the matrices $X \in \mathbb{R}^{n \times p}$ and $Y \in \mathbb{R}^{n \times q}$ form orthonormal bases for the subspaces $\mathcal{X}$ and $\mathcal{Y}$, respectively. Consider the singular value decomposition $X^T Y = U \Sigma V^T$, where U and V are unitary matrices and Σ is a p × q diagonal matrix with real diagonal entries $s_1, \dots, s_m$ in nonincreasing order with m = min(p, q). Then
\[ \cos \Theta = [s_1, \dots, s_m], \]
where Θ denotes the vector of principal angles between $\mathcal{X}$ and $\mathcal{Y}$. Furthermore, the principal vectors for $\mathcal{X}$ and $\mathcal{Y}$ are given by the first m columns of XU and YV.
In [12], the authors prove a variant of the Davis–Kahan Theorem, which gives an upper bound for the sine of the principal angles between subspaces in terms of eigenvalues of the matrices whose columns are bases for the subspaces. The original version of Davis–Kahan [3] relies on an eigenvalue separation condition for those matrices. However, these conditions are not necessarily met by the eigenvalues of a random matrix. That is the reason we use a different version of the Davis–Kahan Theorem. We recast the result here for the eigenvalues of interest in our problem. Here $\|\cdot\|_F$ denotes the Frobenius norm.
Theorem 7. Let $M, \hat M \in \mathbb{R}^{n \times n}$ be symmetric matrices, with eigenvalues $\lambda_1 \ge \dots \ge \lambda_n$ and $\hat\lambda_1 \ge \dots \ge \hat\lambda_n$, respectively. Let $\lambda_i$ and $\hat\lambda_i$ have corresponding unitary eigenvectors $v_i$ and $\hat v_i$. Let $\min(\lambda_1 - \lambda_2, \lambda_3 - \lambda_4) > 0$, and define $V = [v_2\ v_3]$ and $\hat V = [\hat v_2\ \hat v_3]$. Let T be a 2 × 2 diagonal matrix whose diagonal contains the principal angles between the subspaces spanned by the columns of V and $\hat V$. Then
\[
\|\sin T\|_F \le \frac{2 \min\left(\sqrt{2}\, \|\hat M - M\|,\ \|\hat M - M\|_F\right)}{\min(\lambda_1 - \lambda_2, \lambda_3 - \lambda_4)}.
\]
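The SVD characterization of Theorem 6 can be sketched numerically as follows. The function name `principal_angles` is hypothetical; the inputs are assumed to have orthonormal columns, and the clipping step is a numerical safeguard added here, not part of the theorem.

```python
import numpy as np

def principal_angles(X, Y):
    """Principal angles and principal vectors between the column spaces of X and Y.

    X (n x p) and Y (n x q) are assumed to have orthonormal columns.
    """
    U, s, Vt = np.linalg.svd(X.T @ Y)    # singular values in nonincreasing order
    s = np.clip(s, -1.0, 1.0)            # guard against rounding slightly above 1
    theta = np.arccos(s)                 # principal angles, in nondecreasing order
    m = len(s)
    return theta, (X @ U)[:, :m], (Y @ Vt.T)[:, :m]

# Two planes in R^3 sharing the first axis and tilted by t = 0.3 about it.
t = 0.3
X = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
Y = np.array([[1.0, 0.0], [0.0, np.cos(t)], [0.0, np.sin(t)]])
theta, Xp, Yp = principal_angles(X, Y)
print(theta)   # approximately [0, 0.3]
```

The columns of `Xp` and `Yp` are the principal vectors, so their columnwise inner products reproduce the cosines of the principal angles.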
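4. Bounds and proofs of the main results
To prove the main theorems, we need to bound the differences $\lambda_1 - \lambda_2$ and $\lambda_3 - \lambda_4$. Fortunately, the spectrum of circulant graphs is well known, see for example [2], so we do not need to compute it. The four largest eigenvalues of H can be expressed as follows:
\[
\lambda_1 = \sum_{k \in N} 2, \qquad
\lambda_2 = \lambda_3 = \sum_{k \in N} 2 \cos\left(\frac{2k\pi}{n}\right), \quad \text{and} \quad
\lambda_4 = \sum_{k \in N} 2 \cos\left(\frac{4k\pi}{n}\right).
\]
Their corresponding unitary eigenvectors are
\[ v_1 = \frac{1}{\sqrt{n}} (1, 1, \dots, 1)^T, \]
\[ v_2 = \frac{2}{\sqrt{2n}} \left(1, \cos\left(\frac{2\pi}{n}\right), \cos\left(2\,\frac{2\pi}{n}\right), \dots, \cos\left((n-1)\frac{2\pi}{n}\right)\right)^T, \]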
\[ v_3 = \frac{2}{\sqrt{2n}} \left(0, \sin\left(\frac{2\pi}{n}\right), \sin\left(2\,\frac{2\pi}{n}\right), \dots, \sin\left((n-1)\frac{2\pi}{n}\right)\right)^T, \text{ and} \]
\[ v_4 = \frac{2}{\sqrt{2n}} \left(1, \cos\left(\frac{4\pi}{n}\right), \cos\left(2\,\frac{4\pi}{n}\right), \dots, \cos\left((n-1)\frac{4\pi}{n}\right)\right)^T. \]
Denote by $v^i$ the i-th entry of the vector v. An important observation is that the points with coordinates $(v_2^i, v_3^i)$ lie on a circle in $\mathbb{R}^2$. Thus, these points describe the correct structure of the graph, providing the correct label of vertices. Throughout the paper n is assumed to be large.
Lemma 8. Let H be a circulant graph defined by an adjacency set N, with n vertices, and with eigenvalues $\lambda_1 \ge \lambda_2 = \lambda_3 \ge \lambda_4$. If $|N| = cn^{\gamma}$ for a constant c > 0 and 1 ≥ γ > 0, there are constants $C_1 > 0$ and $C_2 > 0$ such that $\lambda_1 - \lambda_2 \ge C_1 n^{3\gamma-2}$ and $\lambda_3 - \lambda_4 \ge C_2 n^{3\gamma-2}$.
Proof. We will show the lower bound for $\lambda_1 - \lambda_2$ first. Using the expression for the eigenvalues as above we have $\lambda_1 - \lambda_2 = \sum_{k \in N} 2\left(1 - \cos\frac{2k\pi}{n}\right)$. Note that cos(θ) is a decreasing function in θ for θ ∈ [0, π] and $\frac{2k\pi}{n} \le \pi$ for k ∈ N, and therefore $\lambda_1 - \lambda_2 \ge \sum_{k=1}^{|N|} 2\left(1 - \cos\frac{2k\pi}{n}\right)$. Using the Taylor series of cos(θ) at θ = 0 we get $\cos(\theta) \le 1 - \frac{\theta^2}{2} + \frac{\theta^4}{24}$, thus
\[
\begin{aligned}
\lambda_1 - \lambda_2 &\ge 2 \sum_{k=1}^{|N|} \left( \frac{2k^2\pi^2}{n^2} - \frac{2k^4\pi^4}{3n^4} \right) \\
&= 2 \left( \frac{2\pi^2}{n^2} \cdot \frac{2|N|^3 + 3|N|^2 + |N|}{6} - \frac{2\pi^4}{3n^4} \cdot \frac{6|N|^5 + 15|N|^4 + 10|N|^3 - |N|}{30} \right) \\
&\ge K_1 \left( \frac{|N|^3}{n^2} - \frac{|N|^5}{n^4} \right),
\end{aligned}
\]
for a constant $K_1 > 0$. Therefore, there is a constant $C_1 > 0$ such that $\lambda_1 - \lambda_2 \ge C_1 n^{3\gamma-2}$.
For the other bound, $\lambda_3 - \lambda_4 = \sum_{k \in N} 2\left(\cos\frac{2k\pi}{n} - \cos\frac{4k\pi}{n}\right)$. Note that f(θ) = cos(θ) − cos(2θ) has a unique maximum x ∈ [0, π] and $\frac{2k\pi}{n} \le \pi$ for k ∈ N. For this reason we will split the sum above using the following partition of N: $N_L := \{k \in N : \frac{2k\pi}{n} \le x\}$ and $N_U := N - N_L$. Notice that $f\left(\frac{2k\pi}{n}\right)$ is increasing for $k \in N_L$ and decreasing for $k \in N_U$. Set $\hat k = \max\{k \in N\}$. The Taylor series for f(θ) at θ = 0 gives $f(\theta) \ge (3\theta^2)/2 - (5\theta^4)/8$, which leaves us with
\[
\begin{aligned}
\lambda_3 - \lambda_4 &= \sum_{k \in N_L} 2\left(\cos\frac{2k\pi}{n} - \cos\frac{4k\pi}{n}\right) + \sum_{k \in N_U} 2\left(\cos\frac{2k\pi}{n} - \cos\frac{4k\pi}{n}\right) \\
&\ge \sum_{k=1}^{|N_L|} 2\left(\cos\frac{2k\pi}{n} - \cos\frac{4k\pi}{n}\right) + \sum_{k=\hat k - |N_U| + 1}^{\hat k} 2\left(\cos\frac{2k\pi}{n} - \cos\frac{4k\pi}{n}\right) \\
&\ge K_1 \sum_{k=1}^{|N_L|} \left(\frac{k^2}{n^2} - \frac{k^4}{n^4}\right) + K_1 \sum_{k=\hat k - |N_U| + 1}^{\hat k} \left(\frac{k^2}{n^2} - \frac{k^4}{n^4}\right) \\
&= K_2 \left(\frac{|N_L|^3}{n^2} - \frac{|N_L|^5}{n^4}\right) + K_3 \left(\frac{\hat k^3}{n^2} - \frac{\hat k^5}{n^4}\right) - K_4 \left(\frac{(\hat k - |N_U|)^3}{n^2} - \frac{(\hat k - |N_U|)^5}{n^4}\right),
\end{aligned}
\]
for nonnegative constants $K_1$, $K_2$, $K_3$ and $K_4$. Furthermore, $\hat k \ge |N|$ and $\hat k \ge |N_L|$. The first inequality implies that there is a constant $K_5$ such that $\hat k \ge K_5 n^{\gamma}$. Therefore, there is a constant $C_2 > 0$ with $\lambda_3 - \lambda_4 \ge C_2 n^{3\gamma-2}$. □
Using Lemma 8 we are able to prove an upper bound for the deviations of the eigenvectors corresponding to the second and third eigenvalues of the model matrix and the random matrix, respectively. We will also need the following concentration inequality from [11].
Lemma 9 (Norm of a random matrix). There is a constant C > 0 such that the following holds. Let E be a symmetric matrix whose upper diagonal entries $e_{ij}$ are independent random variables where $e_{ij} = 1 - p_{ij}$ or $-p_{ij}$ with probabilities $p_{ij}$ and $1 - p_{ij}$, respectively, where $0 \le p_{ij} \le 1$. Let $\sigma^2 = \max_{ij} p_{ij}(1 - p_{ij})$. If $\sigma^2 \ge C \log n / n$, then
\[ P(\|E\| \ge C \sigma n^{1/2}) \le n^{-3}. \]
Now we are able to prove the following theorem.
Theorem 10. Let M be the circulant graph model matrix, defined by an adjacency set N, with constant probability p, variance $\sigma^2$, and $|N| = cn^{\gamma}$ for a constant 1 ≥ γ > 0. Let $\hat M$ be the random matrix following the model matrix. Let $v_2, v_3$ be unitary eigenvectors for $\lambda_2(M), \lambda_3(M)$ and $\hat v_2, \hat v_3$ be unitary eigenvectors for $\lambda_2(\hat M), \lambda_3(\hat M)$. Let $x, y \in \mathrm{Span}\{v_2, v_3\}$ and $\hat x, \hat y \in \mathrm{Span}\{\hat v_2, \hat v_3\}$ be the principal vectors for the principal angles between the spaces $\mathrm{Span}\{v_2, v_3\}$ and $\mathrm{Span}\{\hat v_2, \hat v_3\}$. Define the matrices z = (x, y) and $\hat z = (\hat x, \hat y)$. Then there is an absolute constant $C_0 > 0$ such that
\[ \|z - \hat z\|_F^2 \le C_0 \sigma n^{5-6\gamma} \]
with probability at least $1 - n^{-3}$.
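As a numerical sanity check of the spectral facts used in this section, the closed-form eigenvalues can be compared with a direct eigenvalue computation. The sketch below assumes a small illustrative instance (n = 30 and adjacency set N = {1, 2, 3}), chosen so that the listed values are indeed the four largest eigenvalues; the function name `top_eigenvalues` is hypothetical.

```python
import numpy as np

def top_eigenvalues(n, N):
    """lambda_1, lambda_2 = lambda_3 and lambda_4 from the closed-form expressions."""
    ks = np.array(sorted(N))
    lam1 = 2.0 * len(ks)
    lam2 = float(np.sum(2 * np.cos(2 * ks * np.pi / n)))
    lam4 = float(np.sum(2 * np.cos(4 * ks * np.pi / n)))
    return lam1, lam2, lam4

n, N = 30, {1, 2, 3}
a = np.zeros(n)
for s in N:
    a[s] = 1
    a[n - s] = 1                                   # mirrored offset, symmetric matrix
A = np.stack([np.roll(a, i) for i in range(n)])    # adjacency matrix of H

numeric = np.sort(np.linalg.eigvalsh(A))[::-1][:4] # four largest eigenvalues
lam1, lam2, lam4 = top_eigenvalues(n, N)
print(numeric, (lam1, lam2, lam2, lam4))
print("gaps:", lam1 - lam2, lam2 - lam4)           # both are positive, as in Lemma 8
```

For other adjacency sets the ordering of the eigenvalues may differ, so the comparison should be read as a check of this instance only.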
Proof. In view of Theorem 6, consider the singular value decomposition $[v_2, v_3]^T [\hat v_2, \hat v_3] = U \Sigma W^T$. Let $\theta_2$ and $\theta_3$ denote the principal angles between the spaces spanned by $\{v_2, v_3\}$ and $\{\hat v_2, \hat v_3\}$. Let T be a 2 × 2 diagonal matrix whose diagonal contains the principal angles. Note that $\min(\lambda_1 - \lambda_2, \lambda_3 - \lambda_4) > 0$, thus we can apply Theorem 7. We have
\[
\begin{aligned}
\|z - \hat z\|_F^2 &= \|(x, y) - (\hat x, \hat y)\|_F^2 = \|x - \hat x\|^2 + \|y - \hat y\|^2 \\
&\le 2\left(\sin^2(\theta_2) + \sin^2(\theta_3)\right) = 2 \|\sin T\|_F^2 \\
&\le 2 \left( \frac{2 \min\left(\sqrt{2}\, \|\hat M - M\|,\ \|\hat M - M\|_F\right)}{\min(\lambda_1 - \lambda_2, \lambda_3 - \lambda_4)} \right)^2.
\end{aligned}
\]
Now, we can view the adjacency matrix $\hat M$ as a perturbation of the matrix M. That is, $\hat M = M + E$, where the entries of E are $e_{ij} = 1 - p$ with probability p and $-p$ with probability 1 − p. Now, E is as in Lemma 9 and with probability at least $1 - n^{-3}$ we have $\|E\| \le C\sigma\sqrt{n}$. Furthermore, $\min(\sqrt{2}\, \|\hat M - M\|, \|\hat M - M\|_F) \le \sqrt{2}\, \|\hat M - M\|_{op} = \sqrt{2}\, \|E\|$ and $\lambda_i(M) = p\lambda_i(H)$. Together with Lemma 8 we get for some absolute constant $C_0 > 0$
\[ \|z - \hat z\|^2 \le C_0 \sigma \left( \frac{\sqrt{n}}{n^{3\gamma-2}} \right)^2 = C_0 \sigma n^{5-6\gamma}. \]
That finishes the proof. □
Now we will provide a lower bound for $\|z - \hat z\|_F$ in terms of $D_k(\sigma)$ and eventually prove Theorem 4.
Lemma 11. Let v and w be the unitary eigenvectors for $\lambda_2(M)$ and $\lambda_3(M)$, respectively. If 1 ≤ k ≤ n/2, then
\[ (v_i - v_{i+k})^2 + (w_i - w_{i+k})^2 = \frac{8}{n} \left( \sin\frac{\pi k}{n} \right)^2 \ge 2\pi^2 k^2 n^{-3}. \]
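The identity and the lower bound of Lemma 11 are easy to check numerically. The sketch below uses the cosine and sine eigenvectors with entries indexed as in the proof (entry i proportional to cos(2πi/n) and sin(2πi/n)), extended periodically so that index i + k is always defined; the function name `lemma11_sides` is hypothetical.

```python
import numpy as np

def lemma11_sides(n, i, k):
    """Left-hand side, closed form and lower bound of Lemma 11 for given n, i, k."""
    idx = np.arange(n + k + 1)                    # periodic extension reaching i + k
    scale = 2 / np.sqrt(2 * n)
    v = scale * np.cos(2 * np.pi * idx / n)       # entries of the cosine eigenvector
    w = scale * np.sin(2 * np.pi * idx / n)       # entries of the sine eigenvector
    lhs = (v[i] - v[i + k]) ** 2 + (w[i] - w[i + k]) ** 2
    closed = 8 / n * np.sin(np.pi * k / n) ** 2
    lower = 2 * np.pi ** 2 * k ** 2 / n ** 3
    return lhs, closed, lower

lhs, closed, lower = lemma11_sides(n=20, i=3, k=5)
print(lhs, closed, lower)
```

The left-hand side does not depend on i, which is what allows the lemma to be applied uniformly over all pairs in the later proofs.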
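Proof. Notice that
\[ v_i - v_{i+k} = \frac{2}{\sqrt{2n}} \left( \cos\frac{2\pi i}{n} - \cos\frac{2\pi(i+k)}{n} \right) \quad \text{and} \quad w_i - w_{i+k} = \frac{2}{\sqrt{2n}} \left( \sin\frac{2\pi i}{n} - \sin\frac{2\pi(i+k)}{n} \right). \]
First, using that $\cos\alpha - \cos\beta = -2 \sin\frac{\alpha+\beta}{2} \sin\frac{\alpha-\beta}{2}$ and that $\sin\alpha - \sin\beta = 2 \cos\frac{\alpha+\beta}{2} \sin\frac{\alpha-\beta}{2}$, we can write
\[ (v_i - v_{i+k})^2 = \frac{8}{n} \left( \sin\frac{\pi k}{n} \sin\frac{(2i+k)\pi}{n} \right)^2 \quad \text{and} \quad (w_i - w_{i+k})^2 = \frac{8}{n} \left( \sin\frac{\pi k}{n} \cos\frac{(2i+k)\pi}{n} \right)^2. \]
Furthermore, we have
\[
\begin{aligned}
(v_i - v_{i+k})^2 + (w_i - w_{i+k})^2 &= \frac{8}{n} \left( \sin\frac{\pi k}{n} \sin\frac{(2i+k)\pi}{n} \right)^2 + \frac{8}{n} \left( \sin\frac{\pi k}{n} \cos\frac{(2i+k)\pi}{n} \right)^2 \\
&= \frac{8}{n} \left( \sin\frac{\pi k}{n} \right)^2 \left( \sin^2\frac{(2i+k)\pi}{n} + \cos^2\frac{(2i+k)\pi}{n} \right) \\
&= \frac{8}{n} \left( \sin\frac{\pi k}{n} \right)^2,
\end{aligned}
\]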
and using that sin x ≥ x/2 for x ∈ [0, π/2], we arrive at the lower bound from the lemma. □
Theorem 12. Let M be the circulant graph model matrix with constant probability p and variance $\sigma^2$, and $\hat M$ the random matrix following the model matrix. Let $v_2, v_3$ be unitary eigenvectors for $\lambda_2(M)$ and $\lambda_3(M)$, and $\hat v_2$ and $\hat v_3$ be unitary eigenvectors for $\lambda_2(\hat M)$ and $\lambda_3(\hat M)$. Let $x, y \in \mathrm{Span}\{v_2, v_3\}$ and $\hat x, \hat y \in \mathrm{Span}\{\hat v_2, \hat v_3\}$ be the principal vectors for the principal angles between the spaces $\mathrm{Span}\{v_2, v_3\}$ and $\mathrm{Span}\{\hat v_2, \hat v_3\}$. Define the matrices z = (x, y) and $\hat z = (\hat x, \hat y)$. If $n/2 \ge k \ge cn^{\beta}$, where c > 0, there is a constant $C_0 > 0$ such that
\[ \|z - \hat z\|^2 > C_0 |R| \frac{n^{2\beta}}{n^4}, \]
where
\[
\begin{aligned}
R = \{(i, j) :\ & (\varphi(\hat z_i) - \varphi(\hat z_j)) \bmod 2\pi \le \pi \text{ and } i + k \le j \le i + n/2, \\
& \text{or } (\varphi(\hat z_j) - \varphi(\hat z_i)) \bmod 2\pi < \pi \text{ and } i + n/2 < j \le L(i)\},
\end{aligned}
\]
and
\[
L(i) = \begin{cases} n & \text{if } k \le i, \\ i - k \bmod n & \text{otherwise.} \end{cases}
\]
Remark. According to the discussion in Section 2, R accounts for the pairs of points with coordinates given by the rows of $\hat z$ that disagree with the order induced by z by at least k positions. Also, we notice that $\hat z_i$ is not defined by the entries of the eigenvectors directly, but by the principal vectors instead.
Proof. As in Theorem 6, let $U \Sigma W^T$ be the singular value decomposition for the matrix $[v_2, v_3]^T [\hat v_2, \hat v_3]$. Thus, $z = (v_2, v_3)U$ and $\hat z = (\hat v_2, \hat v_3)W$. Let $\varphi(z_i)$ be the angular coordinate of the point $z_i = (x_i, y_i)$. Thus, $\{\varphi(z_i)\}_{i=1}^n$ is an increasing sequence. Now we can write
\[
2n \|z - \hat z\|^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \|z_i - \hat z_i\|^2 + \|z_j - \hat z_j\|^2 \right) \ge \sum_{(i,j) \in R} \left( \|z_i - \hat z_i\|^2 + \|z_j - \hat z_j\|^2 \right). \tag{4.1}
\]
Notice that for each pair (i, j) ∈ R, if i + k ≤ j ≤ i + n/2 the angle between $\hat z_j$ and $\hat z_i$ (in the clockwise direction) is smaller than π. Furthermore, the angle between $z_i$ and $z_j$ (in the clockwise direction) is smaller than π. Thus, the minimum in $\|z_i - \hat z_i\|^2 + \|z_j - \hat z_j\|^2$ happens whenever
\[ \hat z_i = \hat z_j = \frac{z_i + z_j}{2}. \]
With a similar argument, the same statement holds true for i + n/2 < j ≤ L(i). Thus, from (4.1), we have
\[
2n \|z - \hat z\|^2 \ge \sum_{(i,j) \in R} \left( \left\| z_i - \frac{z_i + z_j}{2} \right\|^2 + \left\| z_j - \frac{z_i + z_j}{2} \right\|^2 \right) = \sum_{(i,j) \in R} \frac{\|z_i - z_j\|^2}{2}.
\]
Now set $v = v_2$ and $w = v_3$ and notice that $(z_i - z_j)U^T = (v_i - v_j, w_i - w_j)$. For (i, j) ∈ R with i + k ≤ j ≤ i + n/2 we have that $\|z_i - z_j\|$ is minimum for j = i + k. Similarly, for i + n/2 < j ≤ L(i) the minimum of $\|z_i - z_j\|$ happens for j = L(i). Notice that for k > i it holds that $\|z_i - z_{i+k}\| = \|z_i - z_{L(i)}\|$, and $\|z_i - z_{i+k}\| \le \|z_i - z_{L(i)}\|$ otherwise. Thus, we can lower bound
\[
2n \|z - \hat z\|^2 \ge \sum_{(i,j) \in R} \frac{\|z_i - z_{i+k}\|^2}{2} = \sum_{(i,j) \in R} \frac{\|(z_i - z_{i+k})U^T\|^2}{2} = \sum_{(i,j) \in R} \frac{(v_i - v_{i+k})^2 + (w_i - w_{i+k})^2}{2}.
\]
By Lemma 11, for n large enough we obtain
\[ 2n \|z - \hat z\|^2 > C \sum_{(i,j) \in R} \frac{k^2}{n^3}, \]
for some absolute constant C > 0. Now using that $k \ge cn^{\beta}$, we can bound
\[ \|z - \hat z\|^2 > C_0 |R| \frac{n^{2\beta}}{n^4}, \]
for some absolute constant $C_0 > 0$. □
Now, the proof of Theorem 2 easily follows from Theorem 12 and Theorem 10. First, we make an observation about the order given by Algorithm 1. Let $v_2, v_3$ be unitary eigenvectors for $\lambda_2(M)$ and $\lambda_3(M)$, and $\hat v_2$ and $\hat v_3$ be unitary eigenvectors for $\lambda_2(\hat M)$ and
T
ˆ ). Let U ΣW T be the singular value decomposition for the matrix [v2 , v3 ] [ˆ λ3 (M v2 , vˆ3 ]. Let x = vˆ2 and y = vˆ3 as in Algorithm 1, and let ϕ ((x i , y i )) be the angular coordinate of the point (x i , y i ). Define the matrices z = (v2 v3 )U and zˆ = (ˆ v2 vˆ3 )W . Finally, let R = {(i, j) : (ϕ(z i ) − ϕ(z j )) or (ϕ(z j ) − ϕ(z i ))
mod 2π ≤ π and i + k ≤ j ≤ i + n/2
mod 2π < π and i + n/2 < j ≤ L(i)}.
Notice that since $W$ is a rotation matrix, it holds that

$$(\varphi((x_i, y_i)) - \varphi((x_j, y_j))) \bmod 2\pi = (\varphi(\hat z_i) - \varphi(\hat z_j)) \bmod 2\pi.$$
Thus the order induced by the row vectors of $\hat z$ is the same as the order induced by the row vectors of $[\hat v_2, \hat v_3]$. That implies $D_k(\sigma) = |R|$, where $\sigma$ is the permutation returned by Algorithm 1. Therefore, we proceed by bounding $|R|$.

Proof. (Theorem 2) By Theorems 12 and 10 we have

$$C_0 |R| \frac{n^{2\beta}}{n^4} < \|z - \hat z\|^2 \leq \bar C_0 n^{5 - 6\gamma},$$

where $C_0$ and $\bar C_0$ are positive constants and the upper bound holds with probability at least $1 - n^{-3}$. Therefore, there is a constant $C > 0$ such that $|R| < C n^{9 - 6\gamma - 2\beta}$, with probability at least $1 - n^{-3}$. □
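The observation preceding the proof of Theorem 2, that multiplying the rows by a rotation matrix leaves angular differences unchanged modulo $2\pi$, can be illustrated numerically. This is a minimal sketch on random points; the rotation angle `theta` and the helpers `angle`, `rotate_row` and `circular_gap` are hypothetical names chosen for the example, not objects from Algorithm 1.

```python
import math
import random

def angle(p):
    """Angular coordinate of a point, reduced to [0, 2*pi)."""
    return math.atan2(p[1], p[0]) % (2 * math.pi)

def rotate_row(p, theta):
    """Multiply the row vector p by the 2x2 rotation matrix [[c, -s], [s, c]]."""
    c, s = math.cos(theta), math.sin(theta)
    return (p[0] * c + p[1] * s, -p[0] * s + p[1] * c)

def circular_gap(a, b):
    """Distance between two angles measured on the circle."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

random.seed(1)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(20)]
theta = 0.7  # hypothetical rotation angle
rotated = [rotate_row(p, theta) for p in points]

# Compare every angular difference before and after the rotation.
max_error = 0.0
for i in range(len(points)):
    for j in range(len(points)):
        before = (angle(points[i]) - angle(points[j])) % (2 * math.pi)
        after = (angle(rotated[i]) - angle(rotated[j])) % (2 * math.pi)
        max_error = max(max_error, circular_gap(before, after))
```

Since every angular coordinate is shifted by the same amount, the cyclic order of the points is the same before and after the rotation, up to floating-point error in `max_error`.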
The proof of Theorem 3 is similar to the last one but uses a different trick to get a different lower bound.

Proof. (Theorem 3) As in Theorem 6, let $U \Sigma W^T$ be the singular value decomposition of the matrix $[v_2, v_3]^T [\hat v_2, \hat v_3]$. Thus, $z = (v_2, v_3)U$ and $\hat z = (\hat v_2, \hat v_3)W$. Let $\varphi(z_i)$ be the angular coordinate of the point $z_i = (x_i, y_i)$. Thus, $\{\varphi(z_i)\}_{i=1}^{n}$ is an increasing sequence. Fix $k = k(n) = C(n^{\beta})$ and let

$$R = \{(i, j) : (\varphi(z_i) - \varphi(z_j)) \bmod 2\pi \leq \pi \text{ and } i + k \leq j \leq i + n/2, \text{ or } (\varphi(z_j) - \varphi(z_i)) \bmod 2\pi < \pi \text{ and } i + n/2 < j \leq L(i)\},$$

where

$$L(i) = \begin{cases} n & \text{if } k \leq i, \\ (i - k) \bmod n & \text{otherwise.} \end{cases}$$

Now we can write

$$2n \|z - \hat z\|^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \|z_i - \hat z_i\|^2 + \|z_j - \hat z_j\|^2 \right) \geq \sum_{(i,j) \in R} \left( \|z_i - \hat z_i\|^2 + \|z_j - \hat z_j\|^2 \right).$$
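As in the proof of Theorem 12, the minimum contribution of each term in the sum is attained at the midpoint

$$\hat z_i = \hat z_j = \frac{z_i + z_j}{2}.$$

Thus, we have

$$2n \|z - \hat z\|^2 > \sum_{(i,j) \in R} \left( \left\| z_i - \frac{z_i + z_j}{2} \right\|^2 + \left\| z_j - \frac{z_i + z_j}{2} \right\|^2 \right) = \sum_{(i,j) \in R} \frac{\|z_i - z_j\|^2}{2}. \tag{4.2}$$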
Now, let $n_i$ denote the number of pairs $(i, j) \in R$ and label them $(i, j_{i1}), \ldots, (i, j_{in_i})$ for $i = 1, \ldots, n$. Therefore, $\sum_{i=1}^{n} n_i = |R|$. Now, for a fixed $i$ we want to find a set of indices $j_{it}$ that lower bounds

$$\sum_{t=1}^{n_i} \|z_i - z_{j_{it}}\|^2.$$

The minimizer consists of consecutive indices $j_{it}$ as close as possible to $i$. Thus we can split these indices into two ranges $[i, i + n/2]$ and $[i + n/2, L(i)]$ as follows: for the first subset $i < j_{i1}, \ldots, j_{ia} \leq i + n/2$, where $1 \leq a \leq n_i$, set $j_{i1} = i + k$, $j_{i2} = i + k + 1, \ldots, j_{ia} = i + k + a - 1$, and for $i + n/2 < j_{i,a+1}, \ldots, j_{in_i} \leq L(i)$ set $j_{i,a+1} = L(i)$, $j_{i,a+2} = L(i) - 1, \ldots, j_{in_i} = L(i) - n_i + 1 + a$. Notice that for $k > i$ it holds that $\|z_i - z_{i+k+t}\| = \|z_i - z_{L(i)-t}\|$, and $\|z_i - z_{i+k+t}\| \leq \|z_i - z_{L(i)-t}\|$ otherwise. Thus, $\|z_i - z_{i+k+t-a}\| \leq \|z_i - z_{L(i)-t+a}\|$, which implies that the minimizer must have at least half of the indices in the first range, meaning that $a \geq n_i/2$. That gives the lower bound

$$\sum_{t=1}^{n_i} \|z_i - z_{j_{it}}\|^2 \geq \sum_{t=0}^{a-1} \|z_i - z_{i+k+t}\|^2 + \sum_{t=a}^{n_i-1} \|z_i - z_{L(i)-t+a}\|^2 \geq \sum_{t=0}^{a-1} \|z_i - z_{i+k+t}\|^2 \geq \sum_{t=0}^{n_i/2-1} \|z_i - z_{i+k+t}\|^2. \tag{4.3}$$
Therefore, setting $v = v_2$ and $w = v_3$, inequality (4.2) together with inequality (4.3) gives us

$$2n \|z - \hat z\|^2 > \sum_{i=1}^{n} \sum_{t=0}^{n_i/2-1} \frac{\|z_i - z_{i+k+t}\|^2}{2} = \frac{1}{2} \sum_{i=1}^{n} \sum_{t=0}^{n_i/2-1} \left( (v_i - v_{i+k+t})^2 + (w_i - w_{i+k+t})^2 \right).$$
By Lemma 11, for $n$ large enough,

$$2n \|z - \hat z\|^2 > C_1 \sum_{i=1}^{n} \sum_{t=0}^{n_i/2-1} \frac{(k + t)^2}{n^3} > C_1 \sum_{i=1}^{n} \sum_{t=0}^{n_i/2-1} \frac{k t}{n^3}$$

for a constant $C_1 > 0$. Therefore, there is a constant $C_2 > 0$ such that $2n \|z - \hat z\|^2 > C_2 \frac{k}{n^3} \sum_{i=1}^{n} n_i^2$. Now, recall that for $1 \leq p \leq q$ two $p$-norms are related by $\|x\|_p \leq n^{\frac{1}{p} - \frac{1}{q}} \|x\|_q$. Taking $p = 1$ and $q = 2$, we obtain

$$\sum_{i=1}^{n} n_i \leq n^{\frac{1}{2}} \Big( \sum_{i=1}^{n} n_i^2 \Big)^{\frac{1}{2}},$$

which allows us to rewrite inequality (4.2) as $2n \|z - \hat z\|^2 > C_0 k |R|^2 n^{-4}$, for some constant $C_0 > 0$. Combining this inequality with the upper bound of Theorem 10 and using $k = k(n) = \Omega(n^{\beta})$, we can collect all constants and obtain the inequality

$$|R| < C n^{\frac{10 - 6\gamma - \beta}{2}}$$

for an absolute constant $C > 0$ with probability at least $1 - n^{-3}$, and therefore $D_k(\sigma) \in O\big(n^{\frac{10 - 6\gamma - \beta}{2}}\big)$. □

Finally, we give a proof of Theorem 4.

Proof. (Theorem 4) Fix $k = n^{(8 - 6\gamma)/3}$ and define

$$R = \{(i, j) : (\varphi(z_i) - \varphi(z_j)) \bmod 2\pi \leq \pi \text{ and } i + k \leq j \leq i + n/2, \text{ or } (\varphi(z_j) - \varphi(z_i)) \bmod 2\pi < \pi \text{ and } i + n/2 < j \leq L(i)\},$$

and

$$R^C = \{(i, j) : (\varphi(z_i) - \varphi(z_j)) \bmod 2\pi \leq \pi \text{ and } i + 1 \leq j < i + k, \text{ or } (\varphi(z_j) - \varphi(z_i)) \bmod 2\pi < \pi \text{ and } L(i) < j < n\}.$$
That allows us to write $D_1(\sigma) = |R| + |R^C|$. By Theorem 2, taking $\beta = \frac{8 - 6\gamma}{3}$, there is a constant $C > 0$ so that, for large enough $n$,

$$|R| = D_k(\sigma) \leq C n^{9 - 6\gamma - 2\beta} = C n^{(11 - 6\gamma)/3}.$$

Furthermore, for each index $i$ there are at most $2k$ pairs $(i, j)$ in $R^C$, thus

$$|R^C| \leq 2kn = 2n^{\beta + 1} = 2n^{(11 - 6\gamma)/3},$$

and therefore $D(\sigma) = |R| + |R^C| \leq (C + 2) n^{(11 - 6\gamma)/3}$, as required. □
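The choice $\beta = (8 - 6\gamma)/3$ in the proof of Theorem 4 is exactly the value that equalizes the exponents $9 - 6\gamma - 2\beta$ and $\beta + 1$, since $9 - 6\gamma - 2\beta = \beta + 1$ is equivalent to $3\beta = 8 - 6\gamma$. The sketch below checks this arithmetic with exact fractions. The sample values of `gamma` are hypothetical and are not claimed to lie in the admissible range of the theorem; the function name `exponents` is likewise an illustration.

```python
from fractions import Fraction

def exponents(gamma):
    """Return beta and the exponents of n in the bounds on |R| and |R^C|."""
    beta = (8 - 6 * gamma) / 3          # choice k = n^beta from the proof
    exp_R = 9 - 6 * gamma - 2 * beta    # exponent from the Theorem 2 bound on |R|
    exp_RC = beta + 1                   # exponent from the bound |R^C| <= 2kn
    return beta, exp_R, exp_RC

# Hypothetical sample values of gamma, chosen only for illustration.
samples = [Fraction(1), Fraction(5, 4), Fraction(4, 3)]
results = {g: exponents(g) for g in samples}
```

For every sample, both exponents coincide with $(11 - 6\gamma)/3$, which is the exponent appearing in the final bound.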