On the hardness of finding subsets with equal average


Information Processing Letters 113 (2013) 477–480
Contents lists available at SciVerse ScienceDirect
www.elsevier.com/locate/ipl



Edith Elkind a,*, James B. Orlin b

a Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore
b MIT Sloan School of Management, USA

Article info

Article history: Received 26 August 2011. Received in revised form 28 March 2013. Accepted 6 April 2013. Available online 8 April 2013. Communicated by R. Uehara.

Keywords: Computational complexity; Expected utility theory; Subset ranking

Abstract

We show that, given a set of positive integers, it is NP-complete to decide whether it contains two subsets with the same average. Our interest in this problem is motivated by questions in decision theory that are related to defining preferences on sets of objects given preferences over individual objects. © 2013 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +65 6513 2028; fax: +65 6515 8213.
E-mail addresses: [email protected] (E. Elkind), [email protected] (J.B. Orlin).
0020-0190/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.ipl.2013.04.001

1. Introduction

In decision-making scenarios, an agent often has to compare two objects from a given, fixed set of objects $Q$, and choose the one that she prefers. An agent is said to be rational if her preferences over the elements of $Q$ are transitive, i.e., for any triple of elements $a, b, c \in Q$, if she prefers $a$ over $b$ and $b$ over $c$, she also prefers $a$ over $c$. A fundamental result in decision theory is that transitive preferences can be encoded by a utility function $u : Q \to \mathbb{R}$, so that an agent prefers $a$ to $b$ if and only if $u(a) > u(b)$ [5]. Thus, to describe the agent's behavior, it suffices to list the values of $u(x)$ for all $x \in Q$. As long as $u(x) \neq u(y)$ for all distinct $x, y \in Q$, the function $u(\cdot)$ uniquely determines which of the two given objects will be chosen by the agent.

The situation is more complicated when the agent has to choose between sets of objects, i.e., subsets of $Q$. There are multiple ways of extending preferences from objects to sets of objects: for instance, one can order sets according to their total utility, their average utility, or the utility of their best/worst element [1]. The choice of a preference extension method depends on what it means to select a set of objects rather than a single object: would the agent be able to enjoy all objects in the set, or just one of them, and if so, which one? Sometimes, selecting a set simply means that the object will eventually be chosen from this set uniformly at random. This is the case, for instance, when a group of agents votes in a single-winner election, an agent's vote determines which candidates will have the top score, and the winner is chosen among the top-scoring candidates by tossing a fair coin; this setting is discussed in, e.g., [2,3,4]. In such settings, it is natural to compare subsets according to their expected utility with respect to this eventual probabilistic choice; if this choice can be assumed to be uniform, this implies that the agent should always choose a set with the highest average utility.

However, it may happen that two subsets of $Q$ have the same average utility with respect to $u(\cdot)$, even if the values of $u(\cdot)$ on elements of $Q$ are all distinct. If this is the case, to fully specify the agent's behavior, we will need to provide an additional tie-breaking rule, in order to describe how she makes her choice when faced with two subsets that have the same average utility. Therefore, given a utility function $u(\cdot)$ on a set $Q$, it is natural to ask whether this function is sufficient to fully describe the decision-making process, i.e., whether it is the case that


for every pair of subsets $S, T \subseteq Q$ with $S \neq T$, we have $\frac{1}{|S|}\sum_{q \in S} u(q) \neq \frac{1}{|T|}\sum_{q \in T} u(q)$.

In this note, we will show that this problem is computationally hard. More specifically, we will show that the complement of this problem, i.e., deciding whether a set of numbers contains two subsets with the same average, is NP-complete. In what follows, to simplify notation, instead of considering sets of objects and utility functions defined on these sets, we simply consider sets of natural numbers; these numbers can be thought of as the utilities of the objects in the set. As one would expect, our proof follows by a reduction from Subset Sum; however, the reduction is surprisingly complicated.

2. Main result

In this section, we state and prove our main result. We first provide a formal definition of our problem.

Definition 1. An instance of the Subset Average problem is given by a set of positive integers $Q = \{q_1, \dots, q_n\}$. It is a "yes"-instance if there are two (possibly overlapping) subsets $S$ and $T$ of $Q$ such that $S \neq T$ and

$$\frac{\sum_{q_i \in S} q_i}{|S|} = \frac{\sum_{q_i \in T} q_i}{|T|},$$

and a "no"-instance otherwise.

We will now prove that Subset Average is computationally hard.

Theorem 1. Subset Average is NP-complete.
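To make Definition 1 concrete, the following Python sketch decides small instances by brute force. It is only an illustration under stated assumptions: the function name `find_equal_average_pair` is hypothetical, only non-empty subsets are considered, and the running time is exponential in the size of the input, which is consistent with (but of course does not prove) the hardness claimed in Theorem 1.

```python
from fractions import Fraction
from itertools import combinations


def find_equal_average_pair(q):
    """Return two distinct non-empty subsets of q with equal average, or None.

    Brute force over all non-empty subsets, so only usable for tiny inputs.
    Averages are compared exactly using Fraction.
    """
    items = sorted(set(q))
    seen = {}  # average -> first subset found with that average
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            avg = Fraction(sum(subset), size)
            if avg in seen:
                return seen[avg], subset
            seen[avg] = subset
    return None
```

For instance, `find_equal_average_pair([1, 2, 3])` returns `((2,), (1, 3))`, since both subsets have average 2, while `find_equal_average_pair([1, 2, 4])` returns `None`.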

Proof. It is not hard to see that this problem is in NP: we can guess two sets $S$ and $T$ and compute the averages of their elements. To show that the problem is NP-hard, we give a reduction from Subset Sum. Recall that an instance of Subset Sum is given by a set of positive integers $A = \{a_1, \dots, a_m\}$ and another positive integer $b$. It is a "yes"-instance if there exists an $A' \subseteq A$ such that $\sum_{a_i \in A'} a_i = b$, and a "no"-instance otherwise. We can assume without loss of generality that $m > 3$ and $\max\{a_i \mid a_i \in A\} \ge 2$, as otherwise the problem is easily solvable.

Given an instance $I$ of Subset Sum, we construct an instance of Subset Average as follows. Set $a^* = \max\{a_i \mid a_i \in A\}$, let $B = 50a^*m$, and, for $i = 1, \dots, m$, set $M_i = B^i$, $N_i = B^{m+i+4}$. Now define

$$x_i = M_i + N_i, \qquad y_i = 4M_i + N_i + a_i, \qquad z_i = 7M_i + N_i + a_i,$$

and let $Q_i = \{x_i, y_i, z_i\}$. Also, set $K = B^{2m+7}$, and let $u = K$, $v = K + 3\sum_{i=1}^{m} M_i + b$. Observe that for any $j \le m$ we have

$$a_j < M_j, \tag{1}$$

$$M_j < \frac{N_j}{150m}, \tag{2}$$

$$x_j + y_j + z_j < N_j\left(3 + \frac{1}{10m}\right), \tag{3}$$

$$\sum_{i=1}^{j-1} (x_i + y_i + z_i) < \frac{N_j}{10m}, \tag{4}$$

$$16m^2 N_m < K. \tag{5}$$

Indeed, inequalities (1) and (2) are immediate from the definitions of $M_j$ and $N_j$, inequality (3) follows from (1) and (2), inequality (4) follows from (3) and the observations that $\sum_{i=1,\dots,j-1} N_i < N_j/(B-1)$ and $B - 1 > 30m + 1$, and inequality (5) follows from the definition of $K$ and the fact that $B^3 > 16m^2$.

Finally, define $Q = \bigcup_{i=1}^{m} Q_i \cup \{u, v\}$. For any subset $Q'$ of $Q$, we will denote by $\Sigma(Q')$ the sum of all elements of $Q'$. Note that inequality (3) implies that $\Sigma(Q') < 4mN_m$ for every $Q' \subseteq Q \setminus \{u, v\}$.

Throughout the proof, we will consider $\Sigma(S)$ and $\Sigma(T)$ written in base $B = 50a^*m$. For $i = 0, \dots, 2m+7$, let $s_i$ (respectively, $t_i$) denote the $(i+1)$-st least significant digit of $\Sigma(S)$ (respectively, $\Sigma(T)$) in base $B$. Observe that when we add the elements of any $Q' \subseteq Q$ in base $B$, there is no carry. Therefore, for $i = 1, \dots, m$, $s_i$ and $s_{m+i+4}$ are fully determined by the set $S \cap (Q_i \cup \{v\})$, while $t_i$ and $t_{m+i+4}$ are fully determined by $T \cap (Q_i \cup \{v\})$. Specifically, $s_{m+i+4} = |S \cap Q_i|$, $t_{m+i+4} = |T \cap Q_i|$, and, moreover, if $v \notin S$, we have $s_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$ and if $v \in S$, we have $s_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$; the same holds for $t_i$. In fact, given the value of $s_i$ (respectively, $t_i$), we can reconstruct $S \cap Q_i$ (respectively, $T \cap Q_i$) as long as we know whether $v \in S$ (respectively, $v \in T$).

Suppose first that $I$ is a "yes"-instance of Subset Sum, i.e., for some set $A' \subseteq A$ we have $\sum_{a_i \in A'} a_i = b$. Then we can set

$$S = \{y_i \mid a_i \in A'\} \cup \{z_i \mid a_i \notin A'\} \cup \{u\},$$

$$T = \{x_i \mid a_i \in A'\} \cup \{y_i \mid a_i \notin A'\} \cup \{v\}.$$

Indeed, we have $|S| = |T| = m + 1$, and

$$\Sigma(S) = \sum_{y_i \in S} y_i + \sum_{z_i \in S} z_i + u = \sum_{a_i \in A'} (4M_i + N_i + a_i) + \sum_{a_i \notin A'} (7M_i + N_i + a_i) + K = \sum_{i=1}^{m} (4M_i + N_i + a_i) + 3\sum_{a_i \notin A'} M_i + K,$$

$$\Sigma(T) = \sum_{x_i \in T} x_i + \sum_{y_i \in T} y_i + v = \sum_{a_i \in A'} (M_i + N_i) + \sum_{a_i \notin A'} (4M_i + N_i + a_i) + K + 3\sum_{i=1}^{m} M_i + \sum_{a_i \in A'} a_i = \sum_{i=1}^{m} (M_i + N_i + a_i) + 3\sum_{a_i \notin A'} M_i + K + 3\sum_{i=1}^{m} M_i,$$

i.e., $\Sigma(S) = \Sigma(T)$.

For the converse direction, suppose that there exist two sets $S$ and $T$, $S \neq T$, with $\frac{\Sigma(S)}{|S|} = \frac{\Sigma(T)}{|T|}$. Pick $S$ and $T$ so that they form a minimal pair with this property, i.e., so that there do not exist $S' \subset S$ and $T' \subset T$ such that

$$\frac{\Sigma(S')}{|S'|} = \frac{\Sigma(T')}{|T'|}.$$

Note that for this choice of $S$ and $T$, it cannot be the case that $S \cap T \neq \emptyset$ and $|S| = |T|$: indeed, we can set $S' = S \setminus \{q\}$, $T' = T \setminus \{q\}$, where $q \in S \cap T$, and obtain

$$\frac{\Sigma(S')}{|S'|} = \frac{\Sigma(S) - q}{|S| - 1} = \frac{\Sigma(T) - q}{|T| - 1} = \frac{\Sigma(T')}{|T'|}.$$

We will now show how to construct the set $A'$ given $S$ and $T$. Suppose first that $S \cap \{u, v\} = \emptyset$, $T \cap \{u, v\} \neq \emptyset$. Using (5) and (3), we obtain

$$\frac{\Sigma(T)}{|T|} \ge \frac{K}{3m+2} > 4mN_m \ge \frac{\Sigma(S)}{|S|},$$

a contradiction. Thus, either $S \cap \{u, v\} = T \cap \{u, v\} = \emptyset$, or $S \cap \{u, v\} \neq \emptyset$, $T \cap \{u, v\} \neq \emptyset$. To further simplify our analysis, we need the following lemma.

Lemma 1. Suppose that $S \cap \{u, v\} \neq \emptyset$, $T \cap \{u, v\} \neq \emptyset$, and set $\alpha = |S \cap \{u, v\}|$, $\beta = |T \cap \{u, v\}|$. We have

$$\frac{|S|}{|T|} = \frac{\alpha}{\beta}.$$

Proof. We have $\alpha, \beta \in \{1, 2\}$, and

$$\alpha K \le \Sigma(S) < \alpha K + 4mN_m, \qquad \beta K \le \Sigma(T) < \beta K + 4mN_m.$$

Hence, if $\beta|S| > \alpha|T|$, we have $|S| \ge \frac{\alpha|T| + 1}{\beta}$, and

$$\frac{\Sigma(S)}{|S|} \le \frac{\beta(\alpha K + 4mN_m)}{\alpha|T| + 1} < \frac{\beta K}{|T|} \le \frac{\Sigma(T)}{|T|},$$

where we use the fact that $4mN_m|T| < K$. Similarly, if $\beta|S| < \alpha|T|$, we have $|T| \ge \frac{\beta|S| + 1}{\alpha}$, and

$$\frac{\Sigma(S)}{|S|} \ge \frac{\alpha K}{|S|} > \frac{\alpha(\beta K + 4mN_m)}{\beta|S| + 1} \ge \frac{\Sigma(T)}{|T|},$$

where we use the fact that $4mN_m|S| < K$. Thus, we have $\beta|S| = \alpha|T|$, i.e., the lemma is proven. □

In particular, Lemma 1 implies that if $S \cap \{u, v\} \neq \emptyset$, we cannot have $S \cap \{u, v\} = T \cap \{u, v\}$, as this will mean $|S| = |T|$, $S \cap T \neq \emptyset$, and hence contradict our choice of $S$, $T$. Further, if $S \cap \{u, v\} \neq \emptyset$, we have $\frac{\Sigma(S)}{\Sigma(T)} = \frac{|S|}{|T|} = \frac{\alpha}{\beta}$. Consequently, since $\beta s_i, \alpha t_i < B$ for all $i = 0, \dots, m$, for each $i = 0, \dots, m$ we have either $s_i = t_i = 0$ or $\frac{s_i}{t_i} = \frac{\alpha}{\beta}$.

We will now consider all the remaining possibilities for $S \cap \{u, v\}$ and $T \cap \{u, v\}$, namely,

(1) $S \cap \{u, v\} = \emptyset$, $T \cap \{u, v\} = \emptyset$;
(2) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{u, v\}$ (or, symmetrically, $T \cap \{u, v\} = \{u\}$, $S \cap \{u, v\} = \{u, v\}$);
(3) $S \cap \{u, v\} = \{v\}$, $T \cap \{u, v\} = \{u, v\}$ (or, symmetrically, $T \cap \{u, v\} = \{v\}$, $S \cap \{u, v\} = \{u, v\}$);
(4) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{v\}$ (or, symmetrically, $T \cap \{u, v\} = \{u\}$, $S \cap \{u, v\} = \{v\}$).

We show that case (1) leads to a contradiction; in all the remaining cases, we will construct a set $A'$ with $\sum_{a \in A'} a = b$. In cases (2)–(4), we will only consider the first of the two symmetric scenarios listed above; the other scenario can be handled in a similar manner. We will analyze these four possibilities one by one.

(1) $u, v \notin S \cup T$. Set $k = \max\{i \mid Q_i \cap (S \cup T) \neq \emptyset\}$. Suppose that $S \cap Q_k = \emptyset$, but $T \cap Q_k \neq \emptyset$. We have

$$\frac{\Sigma(S)}{|S|} < \frac{N_k}{10m} \le \frac{\Sigma(T)}{|T|},$$

a contradiction. Similarly, $T \cap Q_k = \emptyset$, $S \cap Q_k \neq \emptyset$ leads to a contradiction, too. Hence, we have $S \cap Q_k \neq \emptyset$, $T \cap Q_k \neq \emptyset$. In fact, a stronger statement holds.

Lemma 2. Let $\gamma = |S \cap Q_k|$, $\delta = |T \cap Q_k|$. Then we have $\frac{|S|}{|T|} = \frac{\gamma}{\delta}$.

Proof. The proof is similar to that of Lemma 1. By the argument above, we have $\gamma, \delta \in \{1, 2, 3\}$. Furthermore,

$$\gamma N_k \le \Sigma(S) \le \gamma N_k + 14M_k + \sum_{i=1}^{k-1} (x_i + y_i + z_i) \le \gamma N_k + 15\,\frac{N_k}{150m} + \frac{N_k}{10m} \le N_k\left(\gamma + \frac{1}{5m}\right).$$

Similarly,

$$\delta N_k \le \Sigma(T) \le N_k\left(\delta + \frac{1}{5m}\right).$$

Hence, if $\frac{|S|}{|T|} > \frac{\gamma}{\delta}$, we have $|S| \ge \frac{\gamma|T| + 1}{\delta}$, and

$$\frac{\Sigma(S)}{|S|} \le \frac{\delta\left(\gamma + \frac{1}{5m}\right)N_k}{\gamma|T| + 1} < \frac{\delta N_k}{|T|} \le \frac{\Sigma(T)}{|T|},$$

where the strict inequality follows from the fact that $|T| < 5m$. Similarly, if $\frac{|S|}{|T|} < \frac{\gamma}{\delta}$, we have $|T| \ge \frac{\delta|S| + 1}{\gamma}$, and

$$\frac{\Sigma(S)}{|S|} \ge \frac{\gamma N_k}{|S|} > \frac{\gamma\left(\delta + \frac{1}{5m}\right)N_k}{\delta|S| + 1} \ge \frac{\Sigma(T)}{|T|},$$

where the strict inequality follows from the fact that $|S| < 5m$. In both cases, we get a contradiction with $\frac{\Sigma(S)}{|S|} = \frac{\Sigma(T)}{|T|}$. Hence, $\frac{|S|}{|T|} = \frac{\gamma}{\delta}$. □
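The digit sets $\{0, 1, 4, 7, 5, 8, 11, 12\}$ and $\{3, 4, 7, 10, 8, 11, 14, 15\}$ that the case analysis relies on can be recomputed mechanically from the coefficients of $M_i$ in $x_i$, $y_i$, $z_i$ (namely 1, 4, 7) and in $v$ (namely 3). The Python sketch below does this; the names `COEFFS` and `digit_values` are illustrative and not part of the paper.

```python
from itertools import combinations

COEFFS = (1, 4, 7)  # coefficients of M_i in x_i, y_i, z_i respectively


def digit_values(with_v):
    """Possible values of the digit s_i (1 <= i <= m) over all subsets of Q_i.

    If with_v is True, the element v is assumed to be in the set, which
    adds 3 to every digit in positions 1..m.
    """
    offset = 3 if with_v else 0
    return {
        offset + sum(chosen)
        for r in range(len(COEFFS) + 1)
        for chosen in combinations(COEFFS, r)
    }


# Digit pairs (s_i, t_i) with 2 * s_i == t_i when v is in T but not in S,
# i.e. the situation of case (2) in the proof.
case2_pairs = sorted(
    (s, t)
    for s in digit_values(False)
    for t in digit_values(True)
    if 2 * s == t
)
```

Each of the eight subsets of $Q_i$ yields a different digit, which is why the digit determines $S \cap Q_i$ once membership of $v$ is known. Here `case2_pairs` evaluates to `[(4, 8), (5, 10), (7, 14)]`, matching the three sub-cases listed under case (2).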

Now, for $i = 1, \dots, m$, we have $s_i, t_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$, and, moreover,

$$s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i, \qquad t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i.$$

Observe that $\frac{\Sigma(S)}{\Sigma(T)} = \frac{|S|}{|T|} = \frac{\gamma}{\delta}$, and, furthermore, $\delta s_i, \gamma t_i < B$ for all $i = 0, \dots, m$, $\gamma, \delta \in \{1, 2, 3\}$. Hence, for $i = 0, \dots, m$ we have either $s_i = t_i = 0$ or $\frac{s_i}{t_i} = \frac{\gamma}{\delta}$. Assuming without loss of generality that $\gamma \le \delta$, we have the following possibilities:

- $\gamma = \delta$. By Lemma 2, we have $|S| = |T|$, and, by the argument above, $s_k = t_k$. This is only possible if $S \cap Q_k = T \cap Q_k$. Further, $S \cap Q_k \neq \emptyset$ by our choice of $k$. Together with $|S| = |T|$, this contradicts our choice of $S$ and $T$.

- $\gamma = 1$, $\delta = 2$. For all $i = 1, \dots, k$, we have either $s_i = t_i = 0$ or $s_i = 4$, $t_i = 8$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i, z_i\}$. Set $A' = \{a_i \mid y_i \in S\}$; the set $A'$ is non-empty since $a_k \in A'$. We have

$$s_0 = \sum_{a_i \in A'} a_i, \qquad t_0 = \sum_{a_i \in A'} a_i,$$

i.e., $s_0 = t_0$. On the other hand, we have $2s_0 = t_0$ and hence $s_0 = 0$, a contradiction with $A' \neq \emptyset$.

- $\gamma = 1$, $\delta = 3$. For all $i = 1, \dots, k$, we have either $s_i = t_i = 0$ or $s_i = 4$, $t_i = 12$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i, y_i, z_i\}$. Set $A' = \{a_i \mid y_i \in S\}$; the set $A'$ is non-empty since $a_k \in A'$. We have

$$s_0 = \sum_{a_i \in A'} a_i, \qquad t_0 = 2\sum_{a_i \in A'} a_i,$$

i.e., $t_0 = 2s_0$. On the other hand, we have $3s_0 = t_0$ and hence $s_0 = 0$, a contradiction with $A' \neq \emptyset$.

- $\gamma = 2$, $\delta = 3$. For all $i = 1, \dots, k$, we have either $s_i = t_i = 0$ or $s_i = 8$, $t_i = 12$, i.e., $S \cap Q_i = \{x_i, z_i\}$, $T \cap Q_i = \{x_i, y_i, z_i\}$. Set $A' = \{a_i \mid z_i \in S\}$; the set $A'$ is non-empty since $a_k \in A'$. We have

$$s_0 = \sum_{a_i \in A'} a_i, \qquad t_0 = 2\sum_{a_i \in A'} a_i,$$

i.e., $t_0 = 2s_0$. On the other hand, we have $3s_0 = 2t_0$ and hence $s_0 = 0$, a contradiction with $A' \neq \emptyset$.

In all cases, we obtain a contradiction. Thus, we can conclude that the case $u, v \notin S \cup T$ is impossible.

(2) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{u, v\}$. We have $\alpha = 1$, $\beta = 2$. Further, for $i = 1, \dots, m$, we have $s_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$ and $t_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$. Thus, $2s_i = t_i$ implies that one of the following three cases holds:

(i) $s_i = 4$, $t_i = 3 + 1 + 4$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i, y_i\}$;
(ii) $s_i = 7$, $t_i = 3 + 4 + 7$, i.e., $S \cap Q_i = \{z_i\}$, $T \cap Q_i = \{y_i, z_i\}$;
(iii) $s_i = 1 + 4$, $t_i = 3 + 7$, i.e., $S \cap Q_i = \{x_i, y_i\}$, $T \cap Q_i = \{z_i\}$.

However, case (iii) is impossible, as it implies $s_{i+m+4} = 2$, $t_{i+m+4} = 1$, a contradiction with $2s_i = t_i$. Thus, each $i = 1, \dots, m$ satisfies either (i) or (ii). Set $A' = \{a_i \mid y_i \in S\}$. We have

$$s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i = \sum_{a_i \in A'} a_i + \sum_{a_i \notin A'} a_i,$$

$$t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i + b = \sum_{a_i \in A'} a_i + 2\sum_{a_i \notin A'} a_i + b.$$

Since $2s_0 = t_0$, we have $\sum_{a_i \in A'} a_i = b$, as required.

(3) $S \cap \{u, v\} = \{v\}$, $T \cap \{u, v\} = \{u, v\}$. We have $\alpha = 1$, $\beta = 2$. Further, for $i = 1, \dots, m$, we have $s_i, t_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$. Thus, $2s_i = t_i$ implies that one of the following two cases holds: (i) $s_i = 3 + 1$, $t_i = 3 + 1 + 4$, i.e., $S \cap Q_i = \{x_i\}$, $T \cap Q_i = \{x_i, y_i\}$, or (ii) $s_i = 3 + 4$, $t_i = 3 + 4 + 7$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{y_i, z_i\}$. Set $A' = \{a_i \mid x_i \in S\}$. We have

$$s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i + b = \sum_{a_i \notin A'} a_i + b,$$

$$t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i + b = 2\sum_{a_i \notin A'} a_i + \sum_{a_i \in A'} a_i + b.$$

Since $2s_0 = t_0$, we have $\sum_{a_i \in A'} a_i = b$, as required.

(4) $S \cap \{u, v\} = \{u\}$, $T \cap \{u, v\} = \{v\}$. We have $\alpha = 1$, $\beta = 1$. Further, for $i = 1, \dots, m$, we have $s_i \in \{0, 1, 4, 7, 5, 8, 11, 12\}$ and $t_i \in \{3, 4, 7, 10, 8, 11, 14, 15\}$. Thus, $s_i = t_i$ implies that one of the following four cases holds:

(i) $s_i = 4$, $t_i = 3 + 1$, i.e., $S \cap Q_i = \{y_i\}$, $T \cap Q_i = \{x_i\}$;
(ii) $s_i = 7$, $t_i = 3 + 4$, i.e., $S \cap Q_i = \{z_i\}$, $T \cap Q_i = \{y_i\}$;
(iii) $s_i = 7 + 1$, $t_i = 3 + 4 + 1$, i.e., $S \cap Q_i = \{x_i, z_i\}$, $T \cap Q_i = \{x_i, y_i\}$;
(iv) $s_i = 7 + 4$, $t_i = 3 + 1 + 7$, i.e., $S \cap Q_i = \{y_i, z_i\}$, $T \cap Q_i = \{x_i, z_i\}$.

Further, cases (iii) and (iv) are not possible since our analysis implies that $|S| = |T|$ and therefore $S \cap T = \emptyset$. Set $A' = \{a_i \mid y_i \in S\}$, $A'' = \{a_i \mid z_i \in S\}$. We have

$$s_0 = \sum_{y_i \in S} a_i + \sum_{z_i \in S} a_i = \sum_{a_i \in A'} a_i + \sum_{a_i \in A''} a_i,$$

$$t_0 = \sum_{y_i \in T} a_i + \sum_{z_i \in T} a_i + b = \sum_{a_i \in A''} a_i + b.$$

Since $s_0 = t_0$, we have $\sum_{a_i \in A'} a_i = b$, as required. □
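As a sanity check on the construction used in the proof of Theorem 1, the following Python sketch builds the numbers $x_i$, $y_i$, $z_i$, $u$, $v$ from a Subset Sum instance and forms the sets $S$ and $T$ of the "yes"-direction. The function names `build_instance` and `witness_pair`, the list-based representation, and the 0-based indexing are illustrative assumptions; the sketch only exercises the easy direction of the reduction and says nothing about the converse.

```python
def build_instance(a, b):
    """Map a Subset Sum instance (a, b) to the numbers defining Q in the proof.

    a is a list of positive integers a_1..a_m (stored 0-based), b the target.
    Returns (x, y, z, u, v); Q is the union of the three lists plus {u, v}.
    """
    m = len(a)
    a_star = max(a)
    B = 50 * a_star * m
    M = [B ** i for i in range(1, m + 1)]            # M_i = B^i
    N = [B ** (m + i + 4) for i in range(1, m + 1)]  # N_i = B^(m+i+4)
    K = B ** (2 * m + 7)
    x = [M[i] + N[i] for i in range(m)]
    y = [4 * M[i] + N[i] + a[i] for i in range(m)]
    z = [7 * M[i] + N[i] + a[i] for i in range(m)]
    u = K
    v = K + 3 * sum(M) + b
    return x, y, z, u, v


def witness_pair(a, b, chosen):
    """Build S and T from the index set `chosen` playing the role of A'.

    S takes y_i for chosen indices and z_i otherwise, plus u;
    T takes x_i for chosen indices and y_i otherwise, plus v.
    """
    x, y, z, u, v = build_instance(a, b)
    m = len(a)
    S = [y[i] if i in chosen else z[i] for i in range(m)] + [u]
    T = [x[i] if i in chosen else y[i] for i in range(m)] + [v]
    return S, T
```

With `a = [1, 2, 3, 4]`, `b = 5` and `chosen = {0, 3}` (so that the chosen elements sum to `b`), the two lists returned by `witness_pair` have the same length $m + 1$ and the same sum, hence the same average. If `chosen` does not sum to `b`, the sums differ by exactly the shortfall.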

Acknowledgements

Edith Elkind was supported by Singapore NRF Research Fellowship 2009-08 and by NTU SUG grant. James B. Orlin was partially supported by Office of Naval Research grant N000141110056. The authors would like to thank the anonymous IPL referees for their very useful feedback. This work was initiated during Dagstuhl seminar 10171, and the authors would like to thank the Dagstuhl staff for providing a great research environment.

References

[1] S. Barbera, W. Bossert, P.K. Pattanaik, Ranking sets of objects, in: S. Barbera, P.J. Hammond, C. Seidl (Eds.), Handbook of Utility Theory, Kluwer, Boston, 1998, pp. 895–979 (Chapter 17).
[2] Y. Desmedt, E. Elkind, Equilibria of plurality voting with abstentions, in: ACM EC'10, 2010, pp. 347–356.
[3] S. Obraztsova, E. Elkind, On the complexity of voting manipulation under randomized tie-breaking, in: IJCAI'11, 2011, pp. 319–324.
[4] S. Obraztsova, E. Elkind, N. Hazon, Ties matter: Complexity of voting manipulation revisited, in: AAMAS'11, 2011, pp. 71–78.
[5] J. von Neumann, O. Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, 1944.