A Parallel Randomized Algorithm for Finding a Maximal Independent Set in a Linear Hypergraph

JOURNAL OF ALGORITHMS 25, 311–320 (1997)
ARTICLE NO. AL970884

Tomasz Łuczak and Edyta Szymańska
Department of Discrete Mathematics, Adam Mickiewicz University, Poznań, Poland
Received October 9, 1996

We present a randomized parallel algorithm with polylogarithmic expected running time for finding a maximal independent set in a linear hypergraph. © 1997 Academic Press

1. INTRODUCTION

A hypergraph $H$ is a pair $(V, E)$, where $V$ is a finite set of vertices and $E$ is a collection of subsets of $V$, each with at least two elements. Members of $E$ are called hyperedges, or simply edges, of $H$. Thus, a graph is a hypergraph in which all edges consist of precisely two vertices. A set $I \subseteq V$ is independent if it contains no edges of $H$, and by a maximal independent set we mean an independent set not properly contained in any other such set.

In this article we are concerned with the problem of describing an efficient parallel algorithm for finding a maximal independent set in a hypergraph. This question was posed by Karp and Ramachandran [5] and seems to be relevant for many applications; nonetheless, it has been settled only for special classes of hypergraphs. The first fast parallel randomized algorithm for finding maximal independent sets in graphs was given by Karp and Wigderson [6]. Subsequently, Luby [8] and Alon, Babai, and Itai [1] presented very simple probabilistic algorithms for the MIS-graph problem and a general technique for converting probabilistic algorithms into deterministic ones. Beame and Luby [2] generalized Luby's random permutation algorithm from [8] to hypergraphs and conjectured that it can find a maximal independent set in each hypergraph in polylogarithmic time. Kelsen [7] verified this claim for all hypergraphs whose hyperedges are bounded in size. Finally, let us remark that the fastest known deterministic MIS-graph algorithm was given by Goldberg and Spencer [4]; their idea was further developed by Dahlhaus, Karpiński, and Kelsen [3], who constructed an NC algorithm for finding a maximal independent set in hypergraphs of dimension 3.

Our aim is to present a parallel randomized algorithm for finding a maximal independent set in linear hypergraphs, that is, in hypergraphs in which each pair of edges has at most one vertex in common.

THEOREM. There exists a parallel randomized algorithm with polylogarithmic expected running time which finds a maximal independent set in each linear hypergraph using a polynomial number of processors.

Supported by KBN Grant 2-P03A-023-09.

2. EQUITABLE GRAPHS

The algorithm we use is based on the permutation algorithm proposed by Luby [8] and Beame and Luby [2]. However, in each step of our procedure, we first "preprocess" the hypergraph, finding in it a large subhypergraph which, roughly speaking, contains no vertices of large degree. Thus, for a linear hypergraph $H = (V, E)$ on $n$ vertices, we set

$$U_i = U_i(H) = \{e \in E : |e| = i\}, \qquad u_i = u_i(H) = |U_i(H)|,$$

and, for each $v \in V$, we put

$$d_i(v, H) = d_i(v) = |\{e \in U_i : v \in e\}|.$$

One can easily observe that $\sum_{v \in V} d_i(v) = i u_i$; that is, the average value of $d_i(v)$ is $i u_i / n$. We say that a hypergraph $H = (V, E)$ on $n$ vertices is equitable if either $n \le 100$ or for every $v \in V$ and $i \le \log n$ we have

$$d_i(v) \le \frac{i u_i}{n} \log^5 n.$$
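The definition can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function names `degree_profile` and `is_equitable`, the parameter `log_exp`, and the use of the natural logarithm are all assumptions made for illustration (the paper does not fix the base of the logarithm).

```python
import math
from collections import defaultdict

def degree_profile(vertices, edges):
    """Return (u, d) with u[i] = number of edges of size i and
    d[v][i] = d_i(v) = number of size-i edges containing v."""
    u = defaultdict(int)
    d = {v: defaultdict(int) for v in vertices}
    for e in edges:
        i = len(e)
        u[i] += 1
        for v in e:
            d[v][i] += 1
    return u, d

def is_equitable(vertices, edges, log_exp=5):
    """Check d_i(v) <= (i*u_i/n) * log^log_exp(n) for all v and all i <= log n.
    Assumption: natural logarithm; log_exp=5 matches the definition above."""
    n = len(vertices)
    if n <= 100:                      # small hypergraphs are equitable by definition
        return True
    log_n = math.log(n)
    u, d = degree_profile(vertices, edges)
    for v in vertices:
        for i, deg in d[v].items():
            if i <= log_n and deg > (i * u[i] / n) * log_n ** log_exp:
                return False
    return True
```

For moderate $n$ the factor $\log^5 n$ is very generous, so lowering the hypothetical `log_exp` parameter is a convenient way to experiment with the condition on small inputs.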

A subhypergraph $H' = (V', E')$ of $H = (V, E)$ is induced if $V' \subseteq V$ and $E' = \{e \in E : e \subseteq V'\}$. Clearly, each independent set of an induced subhypergraph of $H$ is independent in $H$ as well. Thus, it is convenient, instead of dealing with the input graph, to look for an independent set in its large induced equitable subhypergraph, whose structure is much easier to study. The following result states that such an induced equitable subhypergraph always exists and can be found in polylogarithmic time.

CLAIM 1. There exists a procedure Eq which, for each linear hypergraph $H$ on $n$ vertices, finds in parallel an equitable induced subhypergraph $H' = \mathrm{Eq}(H)$ of $H$ with at least $n/2$ vertices in polylogarithmic time.

Proof. Let $H$ be a linear hypergraph on $n$ vertices. Delete from $H$ all vertices $v$ such that $d_i(v) > (i u_i / n) \log^4 n$ for some $i \le \log n$, together with all edges containing them. Note that the only reason preventing the hypergraph obtained in this way from being equitable is that, together with the vertices of large degree, we removed a lot of edges, so that for some $i$ the number of edges of size $i$ decreased at least by a factor of about $\log n$. But then we may apply the removal procedure once again, and, since each linear hypergraph on $n$ vertices has at most $\binom{n}{2}$ edges, after at most $\log^2 n$ repetitions we shall arrive at a hypergraph which is equitable. Now it is enough to observe that in one step we delete at most $n / \log^3 n$ vertices of large degree (for each $i$ the degrees $d_i(v)$ sum to $i u_i$, so at most $n / \log^4 n$ vertices exceed the threshold, and there are at most $\log n$ values of $i$). Hence during the algorithm we remove not more than $n / \log n \le n/2$ vertices of the initial graph.
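The removal procedure from the proof can be sketched sequentially as follows. This is an illustration under stated assumptions rather than the authors' implementation: the name `eq`, the parameter `exp`, the natural logarithm, and the choice to keep $n$ fixed at the input size while recomputing $u_i$ in every round are all ours; a parallel version would compute the degrees and the set of heavy vertices in parallel within each round.

```python
import math
from collections import Counter

def eq(vertices, edges, exp=4):
    """Repeatedly delete every vertex v with d_i(v) > (i*u_i/n) * log^exp(n)
    for some i <= log n, together with all edges containing it.
    Returns (V', E'), an induced subhypergraph of the input."""
    vertices = set(vertices)
    edges = [frozenset(e) for e in edges]
    n = len(vertices)                 # assumption: thresholds use the input size n
    if n <= 100:                      # equitable by definition
        return vertices, edges
    log_n = math.log(n)
    while True:
        u = Counter(len(e) for e in edges)                  # u[i] = u_i
        d = Counter((v, len(e)) for e in edges for v in e)  # d[(v, i)] = d_i(v)
        heavy = {v for (v, i), deg in d.items()
                 if i <= log_n and deg > (i * u[i] / n) * log_n ** exp}
        if not heavy:
            return vertices, edges
        vertices -= heavy
        edges = [e for e in edges if not (e & heavy)]
```

Because only edges that contain a deleted vertex are dropped, every surviving edge lies inside the surviving vertex set, so the result is an induced subhypergraph.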

3. THE ALGORITHM

As we have already mentioned, our algorithm is similar to that described by Luby [8] and Beame and Luby [2]: we greedily enlarge the independent set $I$, at the same time eliminating vertices which, if appended to $I$, would violate its independence. The main difference between our approach and the permutation algorithm from [2] is that in each step we choose an independent set contained in an equitable induced subhypergraph of the input, which considerably simplifies the probabilistic analysis of the procedure.

More precisely, each step of the algorithm consists of three parts. First, we use the procedure Eq defined in Claim 1 to find a large equitable induced subhypergraph $H'$ of the input hypergraph $H$. Then we apply another procedure, IndSet, which, roughly, generates a random subset $W$ of vertices of $H'$ and then removes from $W$ all vertices which belong to edges of $H'$ contained in $W$. In this way we get an independent set $I'$ of $H'$. The last procedure, Update, appends the vertices of $I'$ to the independent set $I$ and removes them from $H$. Then it performs the most important operation: it removes from $H$ all vertices $v \notin W$ for which there exists a hyperedge $e$ such that $e \subseteq \{v\} \cup I'$. (Note that such vertices cannot be added to $I$ in the further steps of the procedure.) Formally, the algorithm can be described as follows.

The Algorithm
Input: A linear hypergraph $H_{in} = (V_{in}, E_{in})$
Output: A maximal independent set $I$ in $H_{in}$
begin
  $I \leftarrow \emptyset$, $W \leftarrow \emptyset$, $H = (V, E) \leftarrow H_{in} = (V_{in}, E_{in})$;
  while $V \ne \emptyset$ do
  begin
    $H' = (V', E') \leftarrow \mathrm{Eq}(H)$;
    IndSet($I'$, $H'$, $W$);
    Update($I$, $I'$, $H$, $W$);
  end
end

Procedure IndSet($I'$, $H'$, $W$)
begin
  $n' \leftarrow |V'|$;
  if $E' = \emptyset$ then $I' \leftarrow V'$
  else
  begin
    Find $a$ such that
      $n' / \log^8 n' \le \sum_{i \ge 2} i u_i(H') a^{i-1} \le 2n' / \log^8 n'$;
    $p \leftarrow \min\{a, e^{-6}\}$;
    if $p \le \log^8 n' / n'$ then
      $W \leftarrow \{v\}$, where $v$ is a vertex of $H'$ which maximizes $d_2(v, H')$
    else
      parallel choose a random set $W$, including each vertex of $H'$ in $W$ with probability $p$, independently for each $v \in V'$;
    $I' \leftarrow W \setminus \bigcup_{e \in E',\, e \subseteq W} e$
  end
end

Procedure Update($I$, $I'$, $H$, $W$)
begin
  $I \leftarrow I \cup I'$;
  for each $v \in I'$ parallel do
  begin
    $V \leftarrow V \setminus \{v\}$
    remove $v$ from any edge $e \in E$ containing $v$
  end
  $S \leftarrow \{v \in V \setminus W : \exists e \in E : e \subseteq \{v\} \cup I'\}$,
  $V \leftarrow V \setminus S$
  $E \leftarrow E \setminus \{e \in E : \exists v \in S : v \in e\}$
end
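The three procedures can be simulated sequentially as in the sketch below, which is meant only to make the control flow concrete and is not the authors' implementation. Several points are assumptions of ours: natural logarithms; finding $a$ by bisection; the hypothetical parameter `log_exp` (8 in the paper), exposed because with exponent 8 the random branch is reached only for very large $n'$; setting $W = V'$ when $E' = \emptyset$ (the pseudocode leaves $W$ unchanged there); and vertices being sortable, which keeps runs reproducible for a fixed seed.

```python
import math
import random
from collections import Counter

def eq(V, E, exp=4):
    """Induced subhypergraph without heavy vertices (cf. Claim 1)."""
    V, n = set(V), len(V)
    E = list(E)
    if n <= 100:
        return V, E
    L = math.log(n)
    while True:
        u = Counter(len(e) for e in E)
        d = Counter((v, len(e)) for e in E for v in e)
        heavy = {v for (v, i), k in d.items()
                 if i <= L and k > (i * u[i] / n) * L ** exp}
        if not heavy:
            return V, E
        V -= heavy
        E = [e for e in E if not (e & heavy)]

def ind_set(V1, E1, rng, log_exp=8):
    """Return (I', W) for the hypergraph H' = (V1, E1)."""
    n1 = len(V1)
    if not E1:
        return set(V1), set(V1)       # assumption: W := V' when E' is empty
    bound = math.log(n1) ** log_exp
    u = Counter(len(e) for e in E1)

    def f(a):                         # f(a) = sum_i i * u_i * a^(i-1)
        return sum(i * k * a ** (i - 1) for i, k in u.items())

    target = 1.5 * n1 / bound         # midpoint of the admissible window for f(a)
    lo, hi = 0.0, 1.0
    while f(hi) < target:
        hi *= 2
    for _ in range(100):              # bisection; f is continuous and increasing
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    p = min(hi, math.exp(-6))
    if p <= bound / n1:
        d2 = Counter(v for e in E1 if len(e) == 2 for v in e)
        W = {max(sorted(V1), key=lambda v: d2[v])}
    else:
        W = {v for v in sorted(V1) if rng.random() < p}
    covered = set().union(*[e for e in E1 if e <= W])
    return W - covered, W

def update(I, I1, V, E, W):
    """Add I' to I, shrink the edges, and drop vertices that can no longer join I."""
    I |= I1
    V -= I1
    E = [e - I1 for e in E]
    # after shrinking, e is contained in {v} together with I' exactly when e == {v}
    S = {v for e in E if len(e) == 1 for v in e} - W
    V -= S
    return V, [e for e in E if not (e & S)]

def maximal_independent_set(vertices, edges, seed=0, log_exp=8):
    rng = random.Random(seed)
    V, E, I = set(vertices), [frozenset(e) for e in edges], set()
    while V:
        V1, E1 = eq(V, E)
        I1, W = ind_set(V1, E1, rng, log_exp)
        V, E = update(I, I1, V, E, W)
    return I
```

A call such as `maximal_independent_set(range(7), fano_lines)`, where `fano_lines` is a list of the seven 3-element lines of the Fano plane (a linear hypergraph), exercises the deterministic branch; passing `log_exp=0` on an input with more than about 400 vertices exercises the random branch.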

4. THE ANALYSIS OF THE ALGORITHM

It is not hard to see that the algorithm, if it terminates, finds an independent set $I$ which is maximal in $H$. Furthermore, Eq, IndSet, and Update are polylogarithmic subroutines, so it is enough to show that the loop of the algorithm is invoked only a polylogarithmic number of times. We accomplish this by proving that during an execution of the loop, with large probability, the size of $V$ drops significantly.

LEMMA. With probability at least $1/2$, during one execution of the loop of the algorithm the number of vertices of $H = (V, E)$ decreases by at least $n / \log^{17} n$, provided $n = |V|$ is large enough.

Proof. We split the proof into three cases, corresponding to different values of the parameter $a$.

Case 1. $a \le \log^8 n' / n'$. Then

$$\sum_{i \ge 3} i u_i(H') a^{i-1} \le a^2 \sum_{i \ge 3} i u_i(H') \le a^2 \binom{n'}{2} \le \log^{16} n',$$

and so

$$u_2(H') \ge \frac{1}{2a} \Big( \sum_{i \ge 2} i u_i(H') a^{i-1} - \log^{16} n' \Big) \ge \frac{1}{2a} \Big( \frac{n'}{\log^8 n'} - \log^{16} n' \Big) \ge \frac{(n')^2}{3 \log^{16} n'}.$$

Thus for the only element $v$ of $W$ we get

$$d_2(v, H) \ge d_2(v, H') \ge \frac{2n'}{3 \log^{16} n'} \ge \frac{n}{\log^{17} n}.$$

Since the procedure Update($I$, $I'$, $H$, $W$) removes from $V$ all $d_2(v, H)$ vertices joined to $v$ by edges of size two, the assertion follows.

Case 2. $\log^8 n' / n' \le a \le e^{-6}$. Let us recall that the algorithm picks at random a subset $W$ of $V'$, choosing elements for $W$ with probability $p = a$, then constructs an independent set $I' \subseteq W$ by removing from $W$ the vertices of edges $e \subseteq W$, and finally modifies $H$ accordingly. Since this is by far the most interesting and complicated case, we state the two main steps of the analysis as separate claims: first we show that with large probability the hypergraph $H'$ contains a lot of edges $e$ such that $|e \cap W| = |e| - 1$ (Claim 2), then we prove that the vast majority of such edges are vertex-disjoint and do not intersect the edges entirely contained in $W$ (Claim 3).

Let $X$ be a random variable which counts edges $e \in E'$ such that all but one of their elements belong to $W$. We first estimate the order of $X$.

CLAIM 2. With probability tending to 1 as $n' \to \infty$ we have $X = \Theta(n' / \log^8 n')$.

Proof. It is convenient to represent $X$ as $X = \sum_{e \in E'} X_e$, where

$$X_e = \begin{cases} 1 & \text{if } |e \cap W| = |e| - 1, \\ 0 & \text{otherwise.} \end{cases}$$

Since $\mathrm{E} X_e = |e| (1 - p) p^{|e| - 1}$, we have

$$\mathrm{E} X = \sum_{e \in E'} \mathrm{E} X_e = (1 - p) \sum_{i \ge 2} i u_i(H') p^{i-1} = \Theta(n' / \log^8 n').$$

Note that, because of the upper bound for $p$, the contribution to $\mathrm{E} X$ coming from edges of size larger than $\log n'$ is smaller than

$$\sum_{i \ge \log n'} i u_i(H') p^{i-1} \le \exp(-6 \log n' + 6) \sum_{i \ge 2} i u_i(H') \le \Big( \frac{e}{n'} \Big)^6 n' \binom{n'}{2} = O\big( (n')^{-3} \big),$$

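and thus is negligible. The same is true for most of the following sums and we shall often use this fact without mentioning it explicitly.

In order to show that the value of $X$ does not deviate very much from its expectation, we need to estimate the variance $\mathrm{Var}\, X$ of $X$. Random variables $X_e$ and $X_{e'}$ are independent whenever $e \cap e' = \emptyset$, so

$$\mathrm{Var}\, X = \mathrm{Var} \sum_{e \in E'} X_e = \sum_{e \in E'} \mathrm{Var}\, X_e + \sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} \mathrm{Cov}(X_e, X_{e'}).$$

Since

$$\mathrm{Var}\, X_e = \mathrm{E} X_e^2 - (\mathrm{E} X_e)^2 \le \mathrm{E} X_e^2 = \mathrm{E} X_e,$$

we have

$$\sum_{e \in E'} \mathrm{Var}\, X_e \le \sum_{e \in E'} \mathrm{E} X_e = \mathrm{E} X.$$

Furthermore,

$$\begin{aligned}
\sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} \mathrm{Cov}(X_e, X_{e'})
&= \sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} \big( \mathrm{E} X_e X_{e'} - \mathrm{E} X_e \, \mathrm{E} X_{e'} \big)
\le \sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} \mathrm{E} X_e X_{e'} \\
&= \sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} \Big( (|e| - 1)(|e'| - 1)(1 - p)^2 p^{|e| + |e'| - 3} + (1 - p) p^{|e| + |e'| - 2} \Big) \\
&\le \sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} |e| \, |e'| \, p^{|e| + |e'| - 3}
\le 2 \sum_{e \in E'} |e| p^{|e| - 1} \sum_{\substack{e' \in E' \\ e \cap e' \ne \emptyset}} |e'| p^{|e'| - 2}.
\end{aligned}$$

Let

$$D_i = D_i(H') = \max_{v \in V'} d_i(v, H').$$

Since $H'$ is equitable, $D_i \le (i u_i(H') / n') \log^5 n'$ for $i = 2, \dots, \log n'$. Thus,

$$\begin{aligned}
\sum_{\substack{e, e' \in E' \\ e \cap e' \ne \emptyset}} \mathrm{Cov}(X_e, X_{e'})
&\le 2 \sum_{e \in E'} |e| p^{|e| - 1} \sum_{\substack{e' \in E' \\ e \cap e' \ne \emptyset}} |e'| p^{|e'| - 2} \\
&\le 2 \sum_{\substack{e \in E' \\ |e| \le \log n'}} |e| p^{|e| - 1} \Big( |e| \sum_{i=2}^{\log n'} i D_i p^{i-2} \Big) + o(1) \\
&\le \frac{2 \log^7 n'}{n' p} \, \mathrm{E} X \sum_{i \ge 2} i u_i(H') p^{i-1} + o(1) \\
&\le \frac{3 \log^7 n'}{n' p} (\mathrm{E} X)^2 \le \frac{3 (\mathrm{E} X)^2}{\log n'},
\end{aligned}$$

and

$$\mathrm{Var}\, X \le \mathrm{E} X + \frac{3 (\mathrm{E} X)^2}{\log n'} \le \frac{4 (\mathrm{E} X)^2}{\log n'}.$$

Now the assertion follows from Chebyshev's inequality.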
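For convenience, the final step can be spelled out as follows. This short derivation is an addition, not part of the original text; the deviation threshold $\mathrm{E} X / 2$ is our choice (any fixed fraction of $\mathrm{E} X$ works), and it uses only the bound $\mathrm{Var}\, X \le 4 (\mathrm{E} X)^2 / \log n'$ obtained at the end of the proof.

```latex
\Pr\Big( |X - \mathrm{E} X| \ge \tfrac{1}{2} \mathrm{E} X \Big)
  \le \frac{\mathrm{Var}\, X}{\big( \tfrac{1}{2} \mathrm{E} X \big)^{2}}
  \le \frac{4}{(\mathrm{E} X)^{2}} \cdot \frac{4 (\mathrm{E} X)^{2}}{\log n'}
  = \frac{16}{\log n'} \longrightarrow 0
  \quad \text{as } n' \to \infty .
```

Hence with probability $1 - o(1)$ we have $\tfrac{1}{2} \mathrm{E} X \le X \le \tfrac{3}{2} \mathrm{E} X$, and since $\mathrm{E} X = \Theta(n' / \log^8 n')$, the same order holds for $X$.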

We shall also use the fact that there are only few pairs of intersecting edges of $H'$ which share a large number of vertices with $W$. More precisely, the following holds.

CLAIM 3. With probability $1 - o(1)$ the hypergraph $H'$ contains at most $n' / \log^9 n'$ pairs of edges $e, e'$ for which $e \cap e' \ne \emptyset$, $|e \cap W| \ge |e| - 1$, and $e' \setminus e \subseteq W$.

Proof. Let $Y$ count the number of such pairs $e, e'$ in $H'$. Then for the expectation of $Y$ we have

$$\begin{aligned}
\mathrm{E} Y &\le \sum_{e \in E'} |e| p^{|e| - 1} \sum_{\substack{e' \in E' \\ e \cap e' \ne \emptyset}} p^{|e'| - 1} \\
&\le \sum_{i=2}^{\log n'} i u_i(H') p^{i-1} \, \log n' \sum_{j=2}^{\log n'} D_j p^{j-1} + o(1) \\
&\le \frac{2n'}{\log^7 n'} \cdot \frac{\log^5 n'}{n'} \sum_{j=2}^{\log n'} j u_j(H') p^{j-1} + o(1) \le \frac{5n'}{\log^{10} n'}.
\end{aligned}$$

Thus, the assertion follows from Markov's inequality.

Now we may complete the study of Case 2. From Claims 2 and 3 it follows that with probability $1 - o(1)$ the hypergraph $H'$, and thus also $H$, contains a family $\tilde{E}$ of at least $\Theta(n' / \log^8 n') = \Theta(n / \log^8 n)$ hyperedges $e$ such that

(i) for $e \in \tilde{E}$ we have $e \setminus W = \{v_e\}$;
(ii) no $e$ from $\tilde{E}$ shares a vertex with an edge of $H'$ entirely contained in $W$;
(iii) $v_e \ne v_{e'}$ whenever $e \ne e'$.

Then (i) and (ii) imply that each edge from $\tilde{E}$ adds one vertex to the set $S$ removed from $V$ by the procedure Update($I$, $I'$, $H$, $W$), and (iii) guarantees that all these vertices are different.

Case 3. $a \ge e^{-6}$. Let us recall that in this case each element of the random subset $W$ is picked from $V'$ with probability $p = e^{-6}$. Since the number of elements of $W$ is binomially distributed, with probability $1 - o(1)$ we have $|W| \ge p n' / 2 \ge n' e^{-7}$. Furthermore, the expected number of vertices which belong to edges entirely contained in $W$ is bounded from above by

$$\sum_{i \ge 2} i u_i(H') p^i \le e^{-6} \sum_{i \ge 2} i u_i(H') a^{i-1} \le \frac{n'}{\log^7 n'}.$$

Therefore, by Markov's inequality, with probability $1 - o(1)$ the number of such vertices is bounded from above by $n' / \log^6 n'$. Thus, the independent set $I'$ generated by IndSet($I'$, $H'$, $W$) contains with large probability at least $n' e^{-8} \ge n / \log^{17} n$ elements, and so the size of $V$ must decrease by at least this amount.

From the preceding lemma one can easily get the main result of this article.

Proof of Theorem. Let us call an execution of the main loop of the algorithm successful if it results in removing at least $n / \log^{17} n$ vertices from $H$, where $n = |V|$. Thus, the lemma states that there exists an absolute constant $n_0$ such that whenever $H$ has at least $n_0$ vertices, the probability that the execution will be successful is not smaller than $1/2$. Since at the beginning of the algorithm $H$ has $n_{in}$ vertices, after, say, $\log^{19} n_{in}$ successful executions of the loop, its order will drop to less than $n_0$. The probability that among $k \ge \log^{20} n_{in}$ tries we shall have fewer than $\log^{19} n_{in}$ successes is not larger than

$$\binom{k}{\log^{19} n_{in}} 2^{-k} \le 2^{-k/2},$$

so the expected number of steps needed to reduce the number of vertices of $H$ to less than $n_0$ is bounded from above by

$$\log^{20} n_{in} + \sum_{k \ge \log^{20} n_{in}} k \, 2^{-k/2} \le 2 \log^{20} n_{in}.$$

A quick look at the algorithm reveals that if $|V| = n \le n_0$, then with probability at least $e^{-6n} / n$ we shall have $|W| = |I'| = 1$ and the order of $H$ will be reduced by at least one during the procedure Update($I$, $I'$, $H$, $W$). Thus, the expected number of times this subroutine is invoked until the algorithm finds a maximal independent set in $H$ with at most $n_0$ vertices is bounded from above by

$$2 n_0 + \sum_{k \ge 2 n_0 + 1} k \binom{k}{n_0} \Big( 1 - \frac{e^{-6 n_0}}{n_0} \Big)^{k} = O(1).$$

Consequently, the expected number of executions of the loop of the algorithm needed for finding a maximal independent set is polylogarithmic. Since, as we have already observed, all procedures used by the algorithm are polylogarithmic as well, the algorithm has polylogarithmic expected running time and the assertion follows.

REFERENCES

1. N. Alon, L. Babai, and A. Itai, A fast and simple randomized parallel algorithm for the maximal independent set problem, J. Algorithms 7 (1986), 567–583.
2. P. Beame and M. Luby, Parallel search for maximal independence given minimal dependence, in "Proceedings of the First SODA Conference, 1990," pp. 212–218.
3. E. Dahlhaus, M. Karpiński, and P. Kelsen, An efficient parallel algorithm for computing a maximal independent set in a hypergraph of dimension 3, Inform. Process. Lett. 42 (1992), 309–313.
4. M. Goldberg and T. Spencer, Constructing a maximal independent set in parallel, SIAM J. Discrete Math. 1 (1989), 322–328.
5. R. Karp and V. Ramachandran, in "Handbook of Theoretical Computer Science" (J. van Leeuwen, Ed.), pp. 869–941, North-Holland, Amsterdam, 1990.
6. R. Karp and A. Wigderson, A fast parallel algorithm for the maximal independent set problem, in "Proceedings of the 16th ACM Symposium on Theory of Computing, 1984," pp. 266–272.
7. P. Kelsen, On the parallel complexity of computing a maximal independent set in a hypergraph, in "Proceedings of the 24th ACM Symposium on Theory of Computing, 1992."
8. M. Luby, A simple parallel algorithm for the maximal independent set problem, SIAM J. Comput. 15 (1986), 1036–1053.