On the hardness of finding subsets with equal average

Information Processing Letters 113 (2013) 477–480 Contents lists available at SciVerse ScienceDirect Information Processing Letters www.elsevier.com...



Edith Elkind a,*, James B. Orlin b

a Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore
b MIT Sloan School of Management, USA

Article info

Article history: Received 26 August 2011. Received in revised form 28 March 2013. Accepted 6 April 2013. Available online 8 April 2013. Communicated by R. Uehara.

Keywords: Computational complexity; Expected utility theory; Subset ranking

Abstract

We show that, given a set of positive integers, it is NP-complete to decide whether it contains two subsets with the same average. Our interest in this problem is motivated by questions in decision theory that are related to defining preferences on sets of objects given preferences over individual objects. © 2013 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +65 6513 2028; fax: +65 6515 8213. E-mail addresses: [email protected] (E. Elkind), [email protected] (J.B. Orlin). 0020-0190/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.ipl.2013.04.001

1. Introduction

In decision-making scenarios, an agent often has to compare two objects from a given, fixed set of objects $Q$, and choose the one that she prefers. An agent is said to be rational if her preferences over the elements of $Q$ are transitive, i.e., for any triple of elements $a, b, c \in Q$, if she prefers $a$ over $b$ and $b$ over $c$, she also prefers $a$ over $c$. A fundamental result in decision theory is that transitive preferences can be encoded by a utility function $u : Q \to \mathbb{R}$, so that an agent prefers $a$ to $b$ if and only if $u(a) > u(b)$ [5]. Thus, to describe the agent's behavior, it suffices to list the values of $u(x)$ for all $x \in Q$. As long as $u(x) \neq u(y)$ for all distinct $x, y \in Q$, the function $u(\cdot)$ uniquely determines which of the two given objects will be chosen by the agent.

The situation is more complicated when the agent has to choose between sets of objects, i.e., subsets of $Q$. There are multiple ways of extending preferences from objects to sets of objects: for instance, one can order sets according to their total utility, their average utility, or the utility of their best/worst element [1]. The choice of a preference extension method depends on what it means to select a set of objects rather than a single object: would the agent be able to enjoy all objects in the set, or just one of them, and if so, which one?

Sometimes, selecting a set simply means that the object will eventually be chosen from this set uniformly at random. This is the case, for instance, when a group of agents votes in a single-winner election, an agent's vote determines which candidates will have the top score, and the winner is chosen among the top-scoring candidates by tossing a fair coin; this setting is discussed in, e.g., [2,3,4]. In such settings, it is natural to compare subsets according to their expected utility with respect to this eventual probabilistic choice; if this choice can be assumed to be uniform, this implies that the agent should always choose a set with the highest average utility.

However, it may happen that two subsets of $Q$ have the same average utility with respect to $u(\cdot)$, even if the values of $u(\cdot)$ on elements of $Q$ are all distinct. If this is the case, to fully specify the agent's behavior, we will need to provide an additional tie-breaking rule, in order to describe how she makes her choice when faced with two subsets that have the same average utility. Therefore, given a utility function $u(\cdot)$ on a set $Q$, it is natural to ask whether this function is sufficient to fully describe the decision-making process, i.e., whether it is the case that


for every pair of subsets $S, T \subseteq Q$ with $S \neq T$, we have

\[ \frac{1}{|S|} \sum_{q \in S} u(q) \neq \frac{1}{|T|} \sum_{q \in T} u(q). \]

In this note, we will show that this problem is computationally hard. More specifically, we will show that the complement of this problem, i.e., deciding whether a set of numbers contains two subsets with the same average, is NP-complete. In what follows, to simplify notation, instead of considering sets of objects and utility functions defined on these sets, we simply consider sets of natural numbers; these numbers can be thought of as the utilities of the objects in the set. As one would expect, our proof follows by a reduction from Subset Sum; however, the reduction is surprisingly complicated.

2. Main result

In this section, we state and prove our main result. We first provide a formal definition of our problem.

Definition 1. An instance of the Subset Average problem is given by a set of positive integers $Q = \{q_1, \dots, q_n\}$. It is a "yes"-instance if there are two (possibly overlapping) subsets $S$ and $T$ of $Q$ such that $S \neq T$ and

\[ \frac{\sum_{q_i \in S} q_i}{|S|} = \frac{\sum_{q_i \in T} q_i}{|T|}, \]

and a "no"-instance otherwise. We will now prove that Subset Average is computationally hard.

Theorem 1. Subset Average is NP-complete.
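To make Definition 1 concrete, the following minimal Python sketch decides Subset Average by exhaustive search. It is an illustration only and not part of the paper: the function name `equal_average_pair` is hypothetical, subsets are assumed to be non-empty so that averages are defined, and the running time is exponential in the size of the input set.

```python
from fractions import Fraction
from itertools import combinations


def equal_average_pair(q):
    """Return two distinct non-empty subsets of q with equal average, or None.

    Hypothetical helper: a brute-force check of Definition 1.
    """
    items = sorted(set(q))          # an instance is a set of positive integers
    seen = {}                       # exact average -> first subset with that average
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            avg = Fraction(sum(subset), size)   # exact arithmetic, no rounding
            if avg in seen:
                return seen[avg], subset
            seen[avg] = subset
    return None


print(equal_average_pair([1, 2, 3]))   # {2} and {1, 3} both average 2
print(equal_average_pair([1, 2, 4]))   # no two subsets share an average
```

Exact rational arithmetic via `Fraction` avoids false ties or false misses that floating-point averages could introduce.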

Proof. It is not hard to see that this problem is in NP: we can guess two sets $S$ and $T$ and compute the averages of their elements. To show that the problem is NP-hard, we give a reduction from Subset Sum. Recall that an instance of Subset Sum is given by a set of positive integers $A = \{a_1, \dots, a_m\}$ and another positive integer $b$. It is a "yes"-instance if there exists an $A' \subseteq A$ such that $\sum_{a_i \in A'} a_i = b$, and a "no"-instance otherwise. We can assume without loss of generality that $m > 3$ and $\max\{a_i \mid a_i \in A\} \ge 2$, as otherwise the problem is easily solvable.

Given an instance $I$ of Subset Sum, we construct an instance of Subset Average as follows. Set $a^* = \max\{a_i \mid a_i \in A\}$, let $B = 50a^*m$, and, for $i = 1, \dots, m$, set $M_i = B^i$, $N_i = B^{m+i+4}$. Now define

\[ x_i = M_i + N_i, \qquad y_i = 4M_i + N_i + a_i, \qquad z_i = 7M_i + N_i + a_i, \]

and let $Q_i = \{x_i, y_i, z_i\}$. Also, set $K = B^{2m+7}$, and let $u = K$, $v = K + 3\sum_{i=1}^{m} M_i + b$. Observe that for any $j \le m$ we have

\[ a_j < M_j, \tag{1} \]
\[ M_j < \frac{N_j}{150m}, \tag{2} \]
\[ x_j + y_j + z_j < N_j \left( 3 + \frac{1}{10m} \right), \tag{3} \]
\[ \sum_{i=1}^{j-1} (x_i + y_i + z_i) < \frac{N_j}{10m}, \tag{4} \]
\[ 16m^2 N_m < K. \tag{5} \]

Indeed, inequalities (1) and (2) are immediate from the definitions of $M_j$ and $N_j$, inequality (3) follows from (1) and (2), inequality (4) follows from (3) and the observations that $\sum_{i=1,\dots,j-1} N_i < N_j/(B-1)$ and $B - 1 > 30m + 1$, and inequality (5) follows from the definition of $K$ and the fact that $B^3 > 16m^2$.

Finally, define $Q = \bigcup_{i=1}^{m} Q_i \cup \{u, v\}$. For any subset $Q'$ of $Q$, we will denote by $\Sigma(Q')$ the sum of all elements of $Q'$. Note that inequality (3) implies that $\Sigma(Q') < 4mN_m$ for every $Q' \subseteq Q \setminus \{u, v\}$.

Throughout the proof, we will consider $\Sigma(S)$ and $\Sigma(T)$ written in base $B = 50a^*m$. For $i = 0, \dots, 2m+7$, let $s_i$ (respectively, $t_i$) denote the $(i+1)$-st least significant digit of $\Sigma(S)$ (respectively, $\Sigma(T)$) in base $B$. Observe that when we add the elements of any $Q' \subseteq Q$ in base $B$, there is no carry. Therefore, for $i = 1, \dots, m$, $s_i$ and $s_{m+i+4}$ are fully determined by the set $S \cap (Q_i \cup \{v\})$, while $t_i$ and $t_{m+i+4}$ are fully determined by $T \cap (Q_i \cup \{v\})$. Specifically, $s_{m+i+4} = |S \cap Q_i|$, $t_{m+i+4} = |T \cap Q_i|$, and, moreover, if $v \notin S$, we have $s_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$ and if $v \in S$, we have $s_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$; the same holds for $t_i$. In fact, given the value of $s_i$ (respectively, $t_i$), we can reconstruct $S \cap Q_i$ (respectively, $T \cap Q_i$) as long as we know whether $v \in S$ (respectively, $v \in T$).

Suppose first that $I$ is a "yes"-instance of Subset Sum, i.e., for some set $A' \subseteq A$ we have $\sum_{a_i \in A'} a_i = b$. Then we can set

\[ S = \{y_i \mid a_i \in A'\} \cup \{z_i \mid a_i \notin A'\} \cup \{u\}, \]
\[ T = \{x_i \mid a_i \in A'\} \cup \{y_i \mid a_i \notin A'\} \cup \{v\}. \]

Indeed, we have $|S| = |T| = m + 1$, and

\[ \Sigma(S) = \sum_{y_i \in S} y_i + \sum_{z_i \in S} z_i + u = \sum_{a_i \in A'} (4M_i + N_i + a_i) + \sum_{a_i \notin A'} (7M_i + N_i + a_i) + K \]
\[ = \sum_{i=1}^{m} (4M_i + N_i + a_i) + 3 \sum_{a_i \notin A'} M_i + K, \]

\[ \Sigma(T) = \sum_{x_i \in T} x_i + \sum_{y_i \in T} y_i + v = \sum_{a_i \in A'} (M_i + N_i) + \sum_{a_i \notin A'} (4M_i + N_i + a_i) + K + 3 \sum_{i=1}^{m} M_i + \sum_{a_i \in A'} a_i \]
\[ = \sum_{i=1}^{m} (M_i + N_i + a_i) + 3 \sum_{a_i \notin A'} M_i + K + 3 \sum_{i=1}^{m} M_i, \]

i.e., $\Sigma(S) = \Sigma(T)$.

For the converse direction, suppose that there exist two sets $S$ and $T$, $S \neq T$, with $\frac{\Sigma(S)}{|S|} = \frac{\Sigma(T)}{|T|}$. Pick $S$ and $T$ so that they form a minimal pair with this property, i.e., so that there do not exist $S' \subset S$ and $T' \subset T$ such that $\frac{\Sigma(S')}{|S'|} = \frac{\Sigma(T')}{|T'|}$. Note that for this choice of $S$ and $T$, it cannot be the case that $S \cap T \neq \emptyset$ and $|S| = |T|$: indeed, we can set $S' = S \setminus \{q\}$, $T' = T \setminus \{q\}$, where $q \in S \cap T$, and obtain

\[ \frac{\Sigma(S')}{|S'|} = \frac{\Sigma(S) - q}{|S| - 1} = \frac{\Sigma(T) - q}{|T| - 1} = \frac{\Sigma(T')}{|T'|}. \]

We will now show how to construct the set $A'$ given $S$ and $T$. Suppose first that $S \cap \{u, v\} \neq \emptyset$, $T \cap \{u, v\} = \emptyset$. Using (5) and (3), we obtain

\[ \frac{\Sigma(S)}{|S|} \ge \frac{K}{3m+2} > 4mN_m \ge \frac{\Sigma(T)}{|T|}, \]

a contradiction. Thus, either $S \cap \{u, v\} = T \cap \{u, v\} = \emptyset$, or $S \cap \{u, v\} \neq \emptyset$, $T \cap \{u, v\} \neq \emptyset$. To further simplify our analysis, we need the following lemma.

Lemma 1. Suppose that $S \cap \{u, v\} \neq \emptyset$, $T \cap \{u, v\} \neq \emptyset$, and set $\alpha = |S \cap \{u, v\}|$, $\beta = |T \cap \{u, v\}|$. We have

\[ \frac{|S|}{|T|} = \frac{\alpha}{\beta}. \]

Proof. We have $\alpha, \beta \in \{1, 2\}$, and

\[ \alpha K \le \Sigma(S) < \alpha K + 4mN_m, \qquad \beta K \le \Sigma(T) < \beta K + 4mN_m. \]

Hence, if $\beta|S| > \alpha|T|$, we have $|S| \ge \frac{\alpha|T| + 1}{\beta}$, and

\[ \frac{\Sigma(S)}{|S|} \le \frac{\beta(\alpha K + 4mN_m)}{\alpha|T| + 1} < \frac{\beta K}{|T|} \le \frac{\Sigma(T)}{|T|}, \]

where we use the fact that $4mN_m|T| < K$. Similarly, if $\beta|S| < \alpha|T|$, we have $|T| \ge \frac{\beta|S| + 1}{\alpha}$, and

\[ \frac{\Sigma(S)}{|S|} \ge \frac{\alpha K}{|S|} > \frac{\alpha(\beta K + 4mN_m)}{\beta|S| + 1} \ge \frac{\Sigma(T)}{|T|}, \]

where we use the fact that $4mN_m|S| < K$. Thus, we have $\beta|S| = \alpha|T|$, i.e., the lemma is proven. □

In particular, Lemma 1 implies that if $S \cap \{u, v\} \neq \emptyset$, we cannot have $S \cap \{u, v\} = T \cap \{u, v\}$, as this will mean $|S| = |T|$, $S \cap T \neq \emptyset$, and hence contradict our choice of $S$, $T$. Further, if $S \cap \{u, v\} \neq \emptyset$, we have $\frac{\Sigma(S)}{\Sigma(T)} = \frac{|S|}{|T|} = \frac{\alpha}{\beta}$. Consequently, since $\beta s_i, \alpha t_i < B$ for all $i = 0, \dots, m$, for each $i = 0, \dots, m$ we have either $s_i = t_i = 0$ or $\frac{s_i}{t_i} = \frac{\alpha}{\beta}$.

We will now consider all the remaining possibilities for $S \cap \{u, v\}$ and $T \cap \{u, v\}$, namely,

(1) $S \cap \{u, v\} = \emptyset$, $T \cap \{u, v\} = \emptyset$;
(2) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{u, v\}$ (or, symmetrically, $T \cap \{u, v\} = \{u\}$, $S \cap \{u, v\} = \{u, v\}$);
(3) $S \cap \{u, v\} = \{v\}$, $T \cap \{u, v\} = \{u, v\}$ (or, symmetrically, $T \cap \{u, v\} = \{v\}$, $S \cap \{u, v\} = \{u, v\}$);
(4) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{v\}$ (or, symmetrically, $T \cap \{u, v\} = \{u\}$, $S \cap \{u, v\} = \{v\}$).

We show that case (1) leads to a contradiction; in all the remaining cases, we will construct a set $A'$ with $\sum_{a \in A'} a = b$. In cases (2)–(4), we will only consider the first of the two symmetric scenarios listed above; the other scenario can be handled in a similar manner. We will analyze these four possibilities one by one.

(1) $u, v \notin S \cup T$. Set $k = \max\{i \mid Q_i \cap (S \cup T) \neq \emptyset\}$. Suppose that $S \cap Q_k \neq \emptyset$, but $T \cap Q_k = \emptyset$. We have

\[ \frac{\Sigma(T)}{|T|} < \frac{N_k}{10m} < \frac{\Sigma(S)}{|S|}, \]

a contradiction. Similarly, $T \cap Q_k \neq \emptyset$, $S \cap Q_k = \emptyset$ leads to a contradiction, too. Hence, we have $S \cap Q_k \neq \emptyset$, $T \cap Q_k \neq \emptyset$. In fact, a stronger statement holds.

Lemma 2. Let $\gamma = |S \cap Q_k|$, $\delta = |T \cap Q_k|$. Then we have $\frac{|S|}{|T|} = \frac{\gamma}{\delta}$.

Proof. The proof is similar to that of Lemma 1. By the argument above, we have $\gamma, \delta \in \{1, 2, 3\}$. Furthermore,

\[ \gamma N_k \le \Sigma(S) \le \gamma N_k + 14M_k + \sum_{i=1}^{k-1} (x_i + y_i + z_i) \le \gamma N_k + 15 \frac{N_k}{150m} + \frac{N_k}{10m} \le N_k \left( \gamma + \frac{1}{5m} \right). \]

Similarly,

\[ \delta N_k \le \Sigma(T) \le N_k \left( \delta + \frac{1}{5m} \right). \]

Hence, if $\frac{|S|}{|T|} > \frac{\gamma}{\delta}$, we have $|S| \ge \frac{\gamma|T| + 1}{\delta}$, and

\[ \frac{\Sigma(S)}{|S|} \le \frac{\delta \left( \gamma + \frac{1}{5m} \right) N_k}{\gamma|T| + 1} < \frac{\delta N_k}{|T|} \le \frac{\Sigma(T)}{|T|}, \]

where the strict inequality follows from the fact that $|T| < 5m$. Similarly, if $\frac{|S|}{|T|} < \frac{\gamma}{\delta}$, we have $|T| \ge \frac{\delta|S| + 1}{\gamma}$, and

\[ \frac{\Sigma(S)}{|S|} \ge \frac{\gamma N_k}{|S|} > \frac{\gamma \left( \delta + \frac{1}{5m} \right) N_k}{\delta|S| + 1} \ge \frac{\Sigma(T)}{|T|}, \]

where the strict inequality follows from the fact that $|S| < 5m$. In both cases, we get a contradiction with $\frac{\Sigma(S)}{|S|} = \frac{\Sigma(T)}{|T|}$. Hence, $\frac{|S|}{|T|} = \frac{\gamma}{\delta}$. □

Now, for $i = 1, \dots, m$, we have $s_i, t_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$, and, moreover,

\[ s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i, \qquad t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i. \]

Observe that $\frac{\Sigma(S)}{\Sigma(T)} = \frac{|S|}{|T|} = \frac{\gamma}{\delta}$, and, furthermore, $\delta s_i, \gamma t_i < B$ for all $i = 0, \dots, m$, $\gamma, \delta \in \{1, 2, 3\}$. Hence, for $i = 0, \dots, m$ we have either $s_i = t_i = 0$ or $\frac{s_i}{t_i} = \frac{\gamma}{\delta}$. Assuming without loss of generality that $\gamma \le \delta$, we have the following possibilities:

− $\gamma = \delta$. By Lemma 2, we have $|S| = |T|$, and, by the argument above, $s_k = t_k$. This is only possible if $S \cap Q_k = T \cap Q_k$. Further, $S \cap Q_k \neq \emptyset$ by our choice of $k$. Together with $|S| = |T|$, this contradicts our choice of $S$ and $T$.

− $\gamma = 1$, $\delta = 2$. For all $i = 1, \dots, k$, we have either $s_i = t_i = 0$ or $s_i = 4$, $t_i = 8$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i, z_i\}$. Set $A' = \{a_i \mid y_i \in S\}$; the set $A'$ is non-empty since $a_k \in A'$. We have

\[ s_0 = \sum_{a_i \in A'} a_i, \qquad t_0 = \sum_{a_i \in A'} a_i, \]

i.e., $s_0 = t_0$. On the other hand, we have $2s_0 = t_0$ and hence $s_0 = 0$, a contradiction with $A' \neq \emptyset$.

− $\gamma = 1$, $\delta = 3$. For all $i = 1, \dots, k$, we have either $s_i = t_i = 0$ or $s_i = 4$, $t_i = 12$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i, y_i, z_i\}$. Set $A' = \{a_i \mid y_i \in S\}$; the set $A'$ is non-empty since $a_k \in A'$. We have

\[ s_0 = \sum_{a_i \in A'} a_i, \qquad t_0 = 2 \sum_{a_i \in A'} a_i, \]

i.e., $t_0 = 2s_0$. On the other hand, we have $3s_0 = t_0$ and hence $s_0 = 0$, a contradiction with $A' \neq \emptyset$.

− $\gamma = 2$, $\delta = 3$. For all $i = 1, \dots, k$, we have either $s_i = t_i = 0$ or $s_i = 8$, $t_i = 12$, i.e., $S \cap Q_i = \{x_i, z_i\}$, $T \cap Q_i = \{x_i, y_i, z_i\}$. Set $A' = \{a_i \mid z_i \in S\}$; the set $A'$ is non-empty since $a_k \in A'$. We have

\[ s_0 = \sum_{a_i \in A'} a_i, \qquad t_0 = 2 \sum_{a_i \in A'} a_i, \]

i.e., $t_0 = 2s_0$. On the other hand, we have $3s_0 = 2t_0$ and hence $s_0 = 0$, a contradiction with $A' \neq \emptyset$.

In all cases, we obtain a contradiction. Thus, we can conclude that the case $u, v \notin S \cup T$ is impossible.

(2) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{u, v\}$. We have $\alpha = 1$, $\beta = 2$. Further, for $i = 1, \dots, m$, we have $s_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$ and $t_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$. Thus, $2s_i = t_i$ implies that one of the following three cases holds:

(i) $s_i = 4$, $t_i = 3 + 1 + 4$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i, y_i\}$;
(ii) $s_i = 7$, $t_i = 3 + 4 + 7$, i.e., $S \cap Q_i = \{z_i\}$, $T \cap Q_i = \{y_i, z_i\}$;
(iii) $s_i = 1 + 4$, $t_i = 3 + 7$, i.e., $S \cap Q_i = \{x_i, y_i\}$, $T \cap Q_i = \{z_i\}$.

However, case (iii) is impossible, as it implies $s_{m+i+4} = 2$, $t_{m+i+4} = 1$, which is incompatible with $2\Sigma(S) = \Sigma(T)$ because there is no carry in base $B$. Thus, each $i = 1, \dots, m$ satisfies either (i) or (ii). Set $A' = \{a_i \mid y_i \in S\}$. We have

\[ s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i = \sum_{a_i \in A'} a_i + \sum_{a_i \notin A'} a_i, \]
\[ t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i + b = \sum_{a_i \in A'} a_i + 2 \sum_{a_i \notin A'} a_i + b. \]

Since $2s_0 = t_0$, we have $\sum_{a_i \in A'} a_i = b$, as required.

(3) $S \cap \{u, v\} = \{v\}$, $T \cap \{u, v\} = \{u, v\}$. We have $\alpha = 1$, $\beta = 2$. Further, for $i = 1, \dots, m$, we have $s_i, t_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$. Thus, $2s_i = t_i$ implies that one of the following two cases holds:

(i) $s_i = 3 + 1$, $t_i = 3 + 1 + 4$, i.e., $S \cap Q_i = \{x_i\}$, $T \cap Q_i = \{x_i, y_i\}$, or
(ii) $s_i = 3 + 4$, $t_i = 3 + 4 + 7$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{y_i, z_i\}$.

Set $A' = \{a_i \mid x_i \in S\}$. We have

\[ s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i + b = \sum_{a_i \notin A'} a_i + b, \]
\[ t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i + b = \sum_{a_i \in A'} a_i + 2 \sum_{a_i \notin A'} a_i + b. \]

Since $2s_0 = t_0$, we have $\sum_{a_i \in A'} a_i = b$, as required.

(4) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{v\}$. We have $\alpha = 1$, $\beta = 1$. Further, for $i = 1, \dots, m$, we have $s_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$ and $t_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$. Thus, $s_i = t_i$ implies that one of the following four cases holds:

(i) $s_i = 4$, $t_i = 3 + 1$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i\}$;
(ii) $s_i = 7$, $t_i = 3 + 4$, i.e., $S \cap Q_i = \{z_i\}$, $T \cap Q_i = \{y_i\}$;
(iii) $s_i = 7 + 1$, $t_i = 3 + 4 + 1$, i.e., $S \cap Q_i = \{x_i, z_i\}$, $T \cap Q_i = \{x_i, y_i\}$;
(iv) $s_i = 7 + 4$, $t_i = 3 + 1 + 7$, i.e., $S \cap Q_i = \{y_i, z_i\}$, $T \cap Q_i = \{x_i, z_i\}$.

Further, cases (iii) and (iv) are not possible since our analysis implies that $|S| = |T|$ and therefore $S \cap T = \emptyset$. Set $A' = \{a_i \mid y_i \in S\}$, $A'' = \{a_i \mid z_i \in S\}$. We have

\[ s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i = \sum_{a_i \in A'} a_i + \sum_{a_i \in A''} a_i, \]
\[ t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i + b = \sum_{a_i \in A''} a_i + b. \]

Since $s_0 = t_0$, we have $\sum_{a_i \in A'} a_i = b$, as required. □
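The construction in the proof can be checked on small inputs. The Python sketch below is a minimal illustration, not the authors' code: the function names `build_instance`, `witness_sets` and `digits`, the 0-based indexing, and the sample Subset Sum instance are all assumptions made here. It builds $x_i, y_i, z_i, u, v$ as defined in the proof, forms $S$ and $T$ from a Subset Sum witness $A'$, and confirms that the two sets have the same size and sum, and therefore the same average.

```python
def build_instance(a, b):
    """Build the Subset Average instance from a Subset Sum instance (a, b).

    Follows the definitions in the proof; lists are 0-based, so x[i]
    stands for x_{i+1} in the paper's notation.
    """
    m = len(a)
    B = 50 * max(a) * m                              # B = 50 a* m
    M = [B ** i for i in range(1, m + 1)]            # M_i = B^i
    N = [B ** (m + i + 4) for i in range(1, m + 1)]  # N_i = B^(m+i+4)
    x = [M[i] + N[i] for i in range(m)]
    y = [4 * M[i] + N[i] + a[i] for i in range(m)]
    z = [7 * M[i] + N[i] + a[i] for i in range(m)]
    K = B ** (2 * m + 7)
    u = K
    v = K + 3 * sum(M) + b
    return B, x, y, z, u, v


def witness_sets(a, b, chosen):
    """Return (S, T) for a candidate solution; `chosen` holds the indices of A'."""
    B, x, y, z, u, v = build_instance(a, b)
    S = [y[i] if i in chosen else z[i] for i in range(len(a))] + [u]
    T = [x[i] if i in chosen else y[i] for i in range(len(a))] + [v]
    return S, T


def digits(n, base, count):
    """Return the `count` least significant digits of n in the given base."""
    return [(n // base ** i) % base for i in range(count)]


# Sample instance (an assumption for illustration): 5 + 6 = 11.
a, b = [3, 5, 6, 9], 11
S, T = witness_sets(a, b, chosen={1, 2})
print(len(S) == len(T), sum(S) == sum(T))

# With a non-solution, the sums differ by (sum of A') - b.
S_bad, T_bad = witness_sets(a, b, chosen={0})
print(sum(S_bad) - sum(T_bad))
```

Calling `digits(sum(S), B, m + 1)` with `B` taken from `build_instance` exposes $s_0, \dots, s_m$, which makes the digit-by-digit case analysis easy to inspect on examples.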

Acknowledgements

Edith Elkind was supported by Singapore NRF Research Fellowship 2009-08 and by NTU SUG grant. James B. Orlin was partially supported by Office of Naval Research grant N000141110056. The authors would like to thank the anonymous IPL referees for their very useful feedback. This work was initiated during Dagstuhl seminar 10171, and the authors would like to thank the Dagstuhl staff for providing a great research environment.

References

[1] S. Barbera, W. Bossert, P.K. Pattanaik, Ranking sets of objects, in: S. Barbera, P.J. Hammond, C. Seidl (Eds.), Handbook of Utility Theory, Kluwer, Boston, 1998, pp. 895–979 (Chapter 17).
[2] Y. Desmedt, E. Elkind, Equilibria of plurality voting with abstentions, in: ACM EC'10, 2010, pp. 347–356.
[3] S. Obraztsova, E. Elkind, On the complexity of voting manipulation under randomized tie-breaking, in: IJCAI'11, 2011, pp. 319–324.
[4] S. Obraztsova, E. Elkind, N. Hazon, Ties matter: Complexity of voting manipulation revisited, in: AAMAS'11, 2011, pp. 71–78.
[5] J. von Neumann, O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, 1944.
