New results on the coarseness of bicolored point sets

Information Processing Letters 123 (2017) 1–7 Contents lists available at ScienceDirect Information Processing Letters www.elsevier.com/locate/ipl ...



New results on the coarseness of bicolored point sets ✩

J.M. Díaz-Báñez a, R. Fabila-Monroy b, P. Pérez-Lantero c, I. Ventura a,∗

a Departamento de Matemática Aplicada II, Universidad de Sevilla, Spain
b Departamento de Matemáticas, Cinvestav, Mexico
c Departamento de Matemática y Ciencia de la Computación, Universidad de Santiago, Santiago, Chile

Article info

Article history: Received 28 October 2015. Received in revised form 20 October 2016. Accepted 25 February 2017. Available online 2 March 2017. Communicated by R. Uehara.

Keywords: Combinatorial problems; Discrepancy; Red–blue separability

Abstract

Let S be a set of n points in general position in the plane. A 2-coloring of S is an assignment of one of two colors (red or blue) to each point of S. Recently, a parameter called coarseness was introduced by Bereg et al. (2013) [7]; it is a measure of how well blended a given 2-coloring of S is. Informally, the coarseness of a 2-coloring is high when S can be split into blocks, each with a large difference in its number of red and blue points. In this paper, we study two questions related to this parameter: Given a 2-coloring of S, can its coarseness be approximated to a constant factor in polynomial time? What is the minimum coarseness over all 2-colorings of S? For the first problem, we show that there exists a polynomial-time algorithm that, for a given 2-coloring of S, approximates its coarseness to a constant ratio. For the second problem, we prove that every set of n points can be 2-colored such that its coarseness is at most $O(n^{1/4} \log n)$. We show that this bound is almost tight, since there exist sets of n points such that every 2-coloring has coarseness at least $\Omega(n^{1/4})$.

© 2017 Elsevier B.V. All rights reserved.

✩ Research of J.M. Díaz-Báñez and I. Ventura partially supported by project FEDER MEC MTM2009-08652, and the ESF EUROCORES programme EuroGIGA – ComPoSe IP04-MICINN Project EUI-EURC-2011-4306. Research of R. Fabila-Monroy partially supported by Conacyt of Mexico, grant 253261. Research of P. Pérez-Lantero partially supported by projects CONICYT FONDECYT/Iniciación 11110069 (Chile), Millennium Nucleus Information and Coordination in Networks ICM/FIC RC130003 (Chile), and CONICYT FONDECYT/Regular 1160543 (Chile).

∗ Corresponding author at: Escuela Politécnica Superior, Departamento de Matemática Aplicada II, Universidad de Sevilla, C/Virgen de África, 7, 41011, Sevilla, Spain.

http://dx.doi.org/10.1016/j.ipl.2017.02.007
0020-0190/© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Let S be a 2-colored set of n points in general position in the plane (no three of them collinear). Assume that each point of S is either red or blue. Suppose we want to decide whether its distribution is well-separated by color, that is, whether there exists some partition of the plane into convex sets so that every element of the partition can be seen as being mostly blue or mostly red; or whether, on the contrary, we have a uniform distribution in which the colors are well-blended. In either of these two extreme cases, we need a formal definition of well-separated or well-blended point sets. This immediately calls for the use of geometric discrepancy and uniform distribution theories. Both of them have important applications in areas such as machine learning, data mining, and computer graphics. See [1–6] for problems and results in these fields. Recently, a measure of this property was proposed in [7].

As pointed out in [7], we must be careful how we define well-blended bicolored point sets. When one attempts to generalize to the plane the notion of uniform distributions of 2-colored points in one dimension, numerous inconsistencies may appear. For example, let us consider the following definition for points on a line. We say that a bicolored point set on the real line is well-blended if in any interval the difference between the number of red


Fig. 1. a) Convex position, alternating coloring. b) Linear separator. (Red points are disks, blue points are circles.)

and the number of blue points of S is bounded by a constant. When the constant is one, this definition implies that the colors of the points alternate, and this is exactly the extreme case for well-blended configurations. A natural generalization of intervals to the plane would be to consider convex sets or islands¹ [8]. Unfortunately, as can be seen in Fig. 1a), such a generalization does not work in two dimensions.

In order to get a formal definition of well-blended point sets, we first recall the well-known parameter of combinatorial discrepancy. Let R be the subset of red points of S and B the subset of its blue points. Let $\mathcal{Y}$ be a family of subsets of S. For a set $Y \in \mathcal{Y}$, define $\nabla(Y) = \big|\,|R \cap Y| - |B \cap Y|\,\big|$. The combinatorial discrepancy of S with respect to $\mathcal{Y}$ is defined as

$$\mathrm{CombDisc}(S, \mathcal{Y}) = \max_{Y \in \mathcal{Y}} \nabla(Y).$$

The set S has high discrepancy for a given family $\mathcal{Y}$ if $\mathrm{CombDisc}(S, \mathcal{Y})$ is high with respect to |S|. We could say that a bicolored set S is well-blended if the combinatorial discrepancy with respect to islands is bounded by a constant. However, for point sets in general position this never happens, since any bicolored point set contains islands with logarithmic discrepancy [7]. Moreover, as already exemplified, suppose that the points of S lie on a circle in such a way that the colors alternate when traversing the circle clockwise. Intuition tempts us to consider this configuration as well-blended. However, the combinatorial discrepancy using islands does not work for this case, since we can easily find a monochromatic island whose discrepancy is $\Omega(n)$; refer to Fig. 1a).

Bereg et al. [7] introduced a parameter, which they called coarseness, to get around this problem. Intuitively speaking, if the set S is not well-blended, we should be able to split it into islands, each with large discrepancy. As a consequence, in a well-blended configuration every partition into islands should contain an island with low combinatorial discrepancy. We now recall the definition of coarseness. A convex partition of S is a partition of S into islands with pairwise

1 A subset I of S is called an island if there is a convex set C on the plane such that I = C ∩ S.

Fig. 2. A linear separator is not good for clustering but a convex partition is. (Red points are disks, blue points are circles.)

disjoint convex hulls. The discrepancy of a convex partition $\Pi = \{S_1, S_2, \ldots, S_k\}$ of S, denoted $\mathrm{disc}(\Pi)$, is the minimum of $\nabla(S_i)$ over all $i = 1, \ldots, k$.

Definition 1. The coarseness of S, denoted by C(S), is the maximum of $\mathrm{disc}(\Pi)$ over all the convex partitions $\Pi$ of S.

A remarkable difference between the coarseness and the combinatorial discrepancy is that the latter looks for a local block, whereas the former considers the overall distribution of points.

The concept of coarseness is related to a central problem in supervised classification and machine learning. Suppose we are given a training set represented by a point set, each point labeled as a positive or a negative example, and we want to find a hypothesis (for example, a partition or clustering) that can be used as a predictor for other query points [9]. One of the state-of-the-art methods for supervised learning is the Support Vector Machine (SVM) [10,11]. For the 2-class case, the set of hypotheses is the set of hyperplanes, and SVM aims to separate both classes by means of a hyperplane. Such linear separators are widely used in practice because they are relatively fast to compute; however, their simplicity often limits their accuracy. SVM assumes that the two classes appear well separated by a hyperplane and that the outlier examples lie around the separator; see Fig. 1b). However, there exists empirical evidence that more complex models can work better in practice, for example, when the hypothesis set is restricted to boxes [12,13], circles [14] or convex sets [15,8]. In Fig. 2, a clustering using convex sets seems adequate. Therefore, geometric models different from linear separators are needed. With the coarseness parameter we may consider a convex partition as the hypothesis, and it can be seen as a generalization of other geometric models such as boxes and circles.

It is important to observe that in the definition of coarseness the cardinality of the convex partition is not fixed.
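To make the two definitions concrete, here is a minimal Python sketch, under the assumption that colors are encoded as +1 (red) and −1 (blue) and that a partition is given as lists of point indices. The function names `nabla` and `partition_disc` are illustrative and not from the paper, and the sketch does not check that the parts are islands with pairwise disjoint convex hulls; that is left to the caller.

```python
def nabla(colors, subset):
    """| #red - #blue | inside `subset`; colors[i] is +1 (red) or -1 (blue)."""
    return abs(sum(colors[i] for i in subset))


def partition_disc(colors, partition):
    """disc of a partition: the minimum of nabla over its parts.

    Assumption: the caller passes a valid convex partition (not verified here).
    """
    return min(nabla(colors, part) for part in partition)


# Eight points in convex position, indexed in circular order, colors alternating
# (the configuration of Fig. 1a).
colors = [+1, -1, +1, -1, +1, -1, +1, -1]

trivial = [list(range(8))]                 # the partition {S}
two_arcs = [[0, 1, 2], [3, 4, 5, 6, 7]]    # two arcs of consecutive points
all_red = [0, 2, 4, 6]                     # a monochromatic island

print(partition_disc(colors, trivial))
print(partition_disc(colors, two_arcs))
print(nabla(colors, all_red))
```

In this example the monochromatic island has discrepancy n/2, while the two partitions shown have discrepancy 0 and 1, which illustrates why the coarseness looks at whole partitions rather than at a single island.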
Because the number of islands in the partition is unbounded, computing the coarseness is a difficult problem. It is conjectured in [7] that the problem of finding the coarseness of bicolored sets is NP-hard, and finding efficient algorithms to approximate the coarseness is posed as an open problem. We arrive at the first problem studied in this paper:


• Coarseness approximation: Does there exist a polynomial-time algorithm to approximate the coarseness of a 2-coloring of S within a constant ratio?

In this paper we show that the coarseness of a 2-colored point set S can be approximated in polynomial time with an approximation ratio between 1/128 and 1/64. This ratio depends on $\nabla(S) = \big|\,|R| - |B|\,\big|$ (Theorem 8); the approximate value for the coarseness that we provide is at least $\max\left\{\frac{C(S)}{128}, \frac{C(S)}{64} - \nabla(S)\right\}$ and at most $C(S)$.

From a theoretical point of view, it is also desirable to know the smallest possible coarseness achievable by a 2-colored set of n points in general position in the plane. We could then say that S is well blended if its coarseness is within a constant factor of that value. We arrive at the second problem studied in this paper:

• Coarseness bounding: What is the smallest coarseness achievable by a 2-coloring of S?

Notice that the problem of coloring an n-point set in the plane such that the resulting 2-colored point set has high coarseness is trivial (color all points with the same color). We show that for every n-point set in general position in the plane there exists a 2-coloring such that its coarseness is at most $O(n^{1/4} \log n)$ (Theorem 17). We also show that there exist point sets such that all 2-colorings have coarseness at least $\Omega(n^{1/4})$ (Theorem 13). To prove our bounds we make use of geometric discrepancy theory. The main tool we use to obtain the results in this paper is the family of k-separable islands of S, defined as follows.

Definition 2. An island I of S is k-separable if it can be separated from $S \setminus I$ with at most k halfplanes; that is, if there exist halfplanes $H_1, H_2, \ldots, H_t$ ($1 \le t \le k$) such that $I = S \cap (H_1 \cap H_2 \cap \cdots \cap H_t)$. We denote the family of all the k-separable islands of S by $\mathcal{I}_k$.

2. Coarseness approximation

Let $r = |R|$ and $b = |B|$ be the number of red and blue points of S, respectively. Before proceeding we introduce some notation and conventions in order to simplify our proofs. We consider an island I as a convex partition of itself with a single part. Let $\mathrm{disc}(I) = \nabla(I)$ and $D_k = \max_{I \in \mathcal{I}_k} \mathrm{disc}(I)$. For a given set $X \subseteq \mathbb{R}^2$, let $\overline{X}$ denote the complement of X, that is, $\overline{X} = \mathbb{R}^2 \setminus X$. In this section, we show that the value of $D_2$ is a constant approximation for the coarseness of S. We start with the following lemmas.

Lemma 3. Let t be a nonnegative integer. If there exists an island $I \in \mathcal{I}_1$ of S such that $\mathrm{disc}(I) \ge t$, then there exists a convex partition $\Pi$ of S such that

$$\mathrm{disc}(\Pi) \ge \max\{t/2,\ t - |r - b|\}.$$

Proof. We have that $\mathrm{disc}(S \setminus I) \ge t - |r - b|$. Indeed,

$$\mathrm{disc}(S \setminus I) = \big|(r - |I \cap R|) - (b - |I \cap B|)\big| \ge \big|\,|I \cap R| - |I \cap B|\,\big| - |r - b| = \mathrm{disc}(I) - |r - b| \ge t - |r - b|.$$

If $t - |r - b| \ge t/2$, then for the convex partition $\Pi = \{I, S \setminus I\}$ we have that $\mathrm{disc}(\Pi) \ge t - |r - b|$. Otherwise, if $t - |r - b| < t/2$, then $\mathrm{disc}(S) = |r - b| > t/2$, which implies that $\mathrm{disc}(\Pi) > t/2$ for the trivial convex partition $\Pi = \{S\}$. The result follows. □

Lemma 4. Let t be a nonnegative integer. If there exists an island $I \in \mathcal{I}_2$ of S such that $\mathrm{disc}(I) \ge t$, then there exists a convex partition $\Pi$ of S such that

$$\mathrm{disc}(\Pi) \ge \max\{t/8,\ t/4 - |r - b|\}.$$

Fig. 3. If I belongs to $\mathcal{I}_2 \setminus \mathcal{I}_1$, then it is the intersection of S with two halfplanes $H_1$ and $H_2$. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)

Proof. If $I \in \mathcal{I}_1$, then the result follows from Lemma 3. Assume that $I \in \mathcal{I}_2 \setminus \mathcal{I}_1$. Let $H_1$ and $H_2$ be two halfplanes such that $I = S \cap (H_1 \cap H_2)$. Let $I' = S \cap (H_1 \cap \overline{H_2})$, $I'' = S \cap (\overline{H_1} \cap H_2)$, and $I''' = S \cap (\overline{H_1} \cap \overline{H_2})$. Refer to Fig. 3. If $\mathrm{disc}(I') \le t/2$, then the island $I \cup I' \in \mathcal{I}_1$ satisfies

$$\mathrm{disc}(I \cup I') \ge \mathrm{disc}(I) - \mathrm{disc}(I') \ge t - t/2 = t/2,$$

and by Lemma 3, there exists a convex partition $\Pi_1$ such that

$$\mathrm{disc}(\Pi_1) \ge \max\{t/4,\ t/2 - |r - b|\}. \qquad (1)$$

The same happens if $\mathrm{disc}(I'') \le t/2$. Otherwise, if $\mathrm{disc}(I') > t/2$ and $\mathrm{disc}(I'') > t/2$, then we proceed as follows. If $\mathrm{disc}(I''') \ge t/4$, then the convex partition $\Pi_2 = \{I, I', I'', I'''\}$ satisfies

$$\mathrm{disc}(\Pi_2) \ge t/4. \qquad (2)$$

Otherwise, we have that the island $I' \cup I''' \in \mathcal{I}_1$ satisfies $\mathrm{disc}(I' \cup I''') > t/4$ and then, by Lemma 3, there exists a convex partition $\Pi_3$ such that

$$\mathrm{disc}(\Pi_3) \ge \max\{t/8,\ t/4 - |r - b|\}. \qquad (3)$$

Combining equations (1)–(3), the result follows. □

Lemma 5. $D_3 \le 4D_2$, and $D_{k+1} \le 2D_k$ for $k \ge 3$.


Fig. 4. $H_1$, $H_2$, $H_3$, $I$, $I'$ depending on whether $H_1 \cap H_2 \cap H_3$ is bounded or unbounded. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)

Proof. First we prove that $D_3 \le 4D_2$. Let $I \in \mathcal{I}_3$ be an island such that $D_3 = \mathrm{disc}(I)$. From Definition 2 it follows that $\mathcal{I}_1 \subseteq \mathcal{I}_2 \subseteq \mathcal{I}_3$, thus $D_1 \le D_2 \le D_3$. If $I \in \mathcal{I}_2$, then $D_3 = D_2 \le 4D_2$. Thus, assume $I \in \mathcal{I}_3 \setminus \mathcal{I}_2$, and let $H_1$, $H_2$, and $H_3$ be three halfplanes such that $I = S \cap (H_1 \cap H_2 \cap H_3)$. Note that $H_1 \cap H_2 \cap H_3$ may be bounded or unbounded. We label $H_1$, $H_2$, $H_3$ as in Fig. 4 depending on these two cases. Let $I' = S \cap (H_1 \cap H_2 \cap \overline{H_3})$, and note that $I \cup I' \in \mathcal{I}_2$. See Fig. 4. If $\mathrm{disc}(I') \le D_3/2$, then

$$D_2 \ge \mathrm{disc}(I \cup I') \ge \mathrm{disc}(I) - \mathrm{disc}(I') \ge D_3 - D_3/2 = D_3/2.$$

This implies $D_3 \le 2D_2$. Thus, assume $\mathrm{disc}(I') > D_3/2$. If $H_1 \cap H_2 \cap H_3$ is unbounded, then $I' \in \mathcal{I}_2$, which implies $\mathrm{disc}(I') \le D_2$ and then $D_3 < 2D_2$. Thus, assume that $H_1 \cap H_2 \cap H_3$ is bounded and let $I'' = S \cap (\overline{H_1} \cap \overline{H_3})$. Note that both $I''$ and $I' \cup I''$ belong to $\mathcal{I}_2$. If $\mathrm{disc}(I'') \ge D_3/4$, then $D_3 \le 4 \cdot \mathrm{disc}(I'') \le 4D_2$. Otherwise, if $\mathrm{disc}(I'') < D_3/4$, then

$$D_2 \ge \mathrm{disc}(I' \cup I'') \ge \mathrm{disc}(I') - \mathrm{disc}(I'') > D_3/2 - D_3/4 = D_3/4,$$

which implies $D_3 < 4D_2$. We have then proved that $D_3 \le 4D_2$.

We now prove that $D_{k+1} \le 2D_k$ for $k \ge 3$. Let $I \in \mathcal{I}_{k+1}$ be an island such that $\mathrm{disc}(I) = D_{k+1}$. If $I \in \mathcal{I}_k$, then $D_{k+1} = D_k$ since $\mathcal{I}_k \subseteq \mathcal{I}_{k+1}$. Thus, assume that $I \in \mathcal{I}_{k+1} \setminus \mathcal{I}_k$, and let $H_1, H_2, \ldots, H_{k+1}$ be $k + 1$ halfplanes such that $I = S \cap \left(\bigcap_{i=1}^{k+1} H_i\right)$. Label these halfplanes clockwise around the boundary of $\bigcap_{i=1}^{k+1} H_i$ so that $H_k \cap \overline{H_{k+1}} \cap H_1$ is non-empty and shares an edge with $\bigcap_{i=1}^{k+1} H_i$. Refer to Fig. 5. Let $I' = S \cap (H_k \cap \overline{H_{k+1}} \cap H_1)$; note that $I' \in \mathcal{I}_3$ and $I \cup I' \in \mathcal{I}_k$. If $\mathrm{disc}(I') \ge D_{k+1}/2$, then $D_{k+1} \le 2\,\mathrm{disc}(I') \le 2D_3 \le 2D_k$. Otherwise, we have

$$D_k \ge \mathrm{disc}(I \cup I') \ge \mathrm{disc}(I) - \mathrm{disc}(I') > D_{k+1} - D_{k+1}/2 = D_{k+1}/2,$$

which implies $D_{k+1} < 2D_k$. □
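As a small illustration of the quantities used in this section, the following Python sketch computes $D_1$ by brute force and builds the partition chosen in the proof of Lemma 3. It is not the algorithm of [17]: it simply enumerates halfplane islands through lines spanned by pairs of points, which takes cubic time. All function names are hypothetical, colors are assumed to be +1 (red) and −1 (blue), and the input is assumed to be in general position.

```python
from itertools import permutations


def cross(o, a, b):
    """> 0 exactly when b lies strictly to the left of the directed line o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def halfplane_islands(points):
    """Yield the non-empty 1-separable islands of `points` as frozensets of indices.

    Assumes general position (no three collinear points).
    """
    n = len(points)
    yield frozenset(range(n))  # a halfplane containing all of S
    for p, q in permutations(range(n), 2):
        left = {i for i in range(n)
                if i not in (p, q) and cross(points[p], points[q], points[i]) > 0}
        # Slightly translating or rotating the line pq adds any subset of {p, q}.
        for extra in ((), (p,), (q,), (p, q)):
            island = frozenset(left | set(extra))
            if island:
                yield island


def disc(colors, island):
    return abs(sum(colors[i] for i in island))


def d1(points, colors):
    """D_1: the maximum discrepancy of a 1-separable island."""
    return max(disc(colors, isl) for isl in halfplane_islands(points))


def lemma3_partition(colors, island):
    """Partition picked in the proof of Lemma 3; `island` must be 1-separable,
    so that its complement in S is an island as well."""
    everything = set(range(len(colors)))
    t = disc(colors, island)
    imbalance = abs(sum(colors))  # |r - b|
    if t - imbalance >= t / 2:
        parts = [set(island), everything - set(island)]
        return [part for part in parts if part]  # drop an empty complement
    return [everything]


# Three red points on the left, three blue points on the right.
points = [(-6, 0), (-4, 2), (-2, -2), (2, 1), (4, -2), (6, 2)]
colors = [1, 1, 1, -1, -1, -1]
best = max(halfplane_islands(points), key=lambda isl: disc(colors, isl))
print(d1(points, colors), lemma3_partition(colors, best))
```

The enumeration relies on the fact that, in general position, every halfplane island can be obtained from a line through two points of S by a small perturbation; the separation test uses exact integer arithmetic when the coordinates are integers.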

We now show that every convex partition must contain a 5-separable island; we use the following known result.

Fig. 5. $H_1$, $H_k$, $H_{k+1}$ in the proof of the second case of Lemma 5. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)

Lemma 6 (Theorem 2 in [16]). A collection of $n \ge 3$ compact, convex, and pairwise disjoint sets in the plane can be covered with n non-overlapping convex polygons with a total of no more than $6n - 9$ sides.

Lemma 7. Every convex partition of S has a 5-separable island.

Proof. Let $\Pi := \{S_1, S_2, \ldots, S_m\}$. By Lemma 6, there exist non-overlapping convex polygons $C_1, C_2, \ldots, C_m$ with a total of no more than $6m - 9$ sides, such that for each $i = 1, \ldots, m$ the convex hull of $S_i$ is contained in $C_i$. Thus, one of these convex polygons has at most 5 sides and the enclosed island is a 5-separable island. □
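The counting step in the proof of Lemma 7 can be spelled out as follows. Here $s_i$ denotes the number of sides of $C_i$, a symbol introduced only for this illustration, and $m \ge 3$ is assumed so that Lemma 6 applies.

```latex
\[
  \sum_{i=1}^{m} s_i \;\le\; 6m - 9 \;<\; 6m
  \quad\Longrightarrow\quad
  \min_{1 \le i \le m} s_i \;\le\; \frac{1}{m}\sum_{i=1}^{m} s_i \;<\; 6
  \quad\Longrightarrow\quad
  \min_{1 \le i \le m} s_i \;\le\; 5 .
\]
```

In other words, if every polygon had at least 6 sides, the total would be at least $6m$, contradicting the bound of Lemma 6.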

Theorem 8.

$$\max\left\{\frac{C(S)}{128},\ \frac{C(S)}{64} - |r - b|\right\} \le \max\left\{\frac{D_2}{8},\ \frac{D_2}{4} - |r - b|\right\} \le C(S).$$

Proof. Observe that $\max\left\{\frac{D_2}{8}, \frac{D_2}{4} - |r - b|\right\} \le C(S)$ follows from Lemma 4 applied with $t = D_2$. Since any convex partition of S has a 5-separable island (Lemma 7), by using Lemma 5 we have that $C(S) \le D_5 \le 2D_4 \le 4D_3 \le 16D_2$. Hence $D_2 \ge C(S)/16$, which gives the first inequality; the result follows. □

Theorem 9. C(S) can be approximated within constant ratio in polynomial time.

Proof. The value of $D_2$ is equal to the discrepancy of a 2-separable island of maximum discrepancy; it can be

computed in $O(n^3 \log n)$ time, see [17]. The result then follows from Theorem 8. □

3. Coarseness bounding

3.1. Visiting the discrepancy theory

In this section, we recall some definitions and results from discrepancy theory. Let S now be a finite set of n elements, and let $\mathcal{Y} \subseteq 2^S$ be a family of subsets of S. The tuple $(S, \mathcal{Y})$ is called a range space. If the range space arises from point sets and geometric objects, then $(S, \mathcal{Y})$ is called a geometric range space. A coloring of S is a mapping $\chi : S \to \{-1, +1\}$. We think of the elements of S mapped to −1 as being colored blue and the elements of S mapped to +1 as being colored red. For $Y \subseteq S$, let $\chi(Y) = \sum_{y \in Y} \chi(y)$. The discrepancy of Y is defined as $\mathrm{disc}(Y) = |\chi(Y)|$; that is, it is the absolute value of the number of red minus the number of blue points that Y contains. The discrepancy of the family $\mathcal{Y}$ is defined as $\mathrm{disc}(\mathcal{Y}) = \min_{\chi} \max_{Y \in \mathcal{Y}} \mathrm{disc}(Y)$.

The primal shatter function $\pi_{\mathcal{Y}}(m)$ of $(S, \mathcal{Y})$ is a function of m defined as follows. It is the maximum number of subsets into which a subset of S, of at most m elements, can be split (or "shattered") by all the elements of $\mathcal{Y}$. Formally:

$$\pi_{\mathcal{Y}}(m) = \max_{A \subset S,\ |A| \le m} \big|\{Y \cap A : Y \in \mathcal{Y}\}\big|.$$
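The definition of the primal shatter function can be checked on tiny range spaces with a brute-force sketch such as the one below. The function name `primal_shatter` and the example range space (non-empty intervals of five points on a line) are assumptions made for illustration; the enumeration is exponential and only meant for very small inputs.

```python
from itertools import combinations


def primal_shatter(ground, ranges, m):
    """Brute-force pi_Y(m): the largest number of distinct traces Y ∩ A
    over all subsets A of the ground set with |A| <= m."""
    best = 0
    for size in range(m + 1):
        for subset in combinations(ground, size):
            a = frozenset(subset)
            traces = {a & y for y in ranges}
            best = max(best, len(traces))
    return best


# Range space: five points 0..4 on a line, ranges are the non-empty intervals.
ground = range(5)
intervals = [frozenset(range(i, j)) for i in range(5) for j in range(i + 1, 6)]

for m in (1, 2, 3):
    print(m, primal_shatter(ground, intervals, m))
```

For intervals, a set A of three points is cut into the empty set, three singletons, two pairs of neighbors and A itself, so the function grows quadratically in m in this example.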

The dual shatter function $\pi^*_{\mathcal{Y}}(m)$ is obtained by exchanging the roles of the points in S with the sets in $\mathcal{Y}$. That is, $\pi^*_{\mathcal{Y}}(m)$ is defined as the maximum number of equivalence classes on S defined by an m-element subfamily $\mathcal{Z} \subset \mathcal{Y}$, where two elements x and y of S are equivalent if they belong to the same sets of $\mathcal{Z}$. The primal and dual shatter functions have been used to give tight and almost tight upper bounds on the discrepancy of range spaces, via the following theorems (see Chapter 5 of [4]).

Theorem 10 (Primal shatter function bound). Let $d > 1$ and C be constants such that $\pi_{\mathcal{Y}}(m) \le Cm^d$ for all $m \le n$. Then $\mathrm{disc}(\mathcal{Y})$ is at most $O(n^{1/2 - 1/(2d)})$, where $n = |S|$.

Theorem 11 (Dual shatter function bound). Let $d > 1$ and C be constants such that $\pi^*_{\mathcal{Y}}(m) \le Cm^d$ for all $m \le |\mathcal{Y}|$. Then $\mathrm{disc}(\mathcal{Y})$ is at most $O(n^{1/2 - 1/(2d)} \log n)$, where $n = |S|$.

For example, if $\mathcal{H}$ is the family of halfplanes, it is easy to see that $\pi_{\mathcal{H}}(m) = O(m^2)$. Thus the discrepancy of halfplanes is $O(n^{1/4})$. It is known that this bound is tight.

Lemma 12 ([18,19]). For arbitrarily large values of n, there exist sets S of n points in general position in the plane such that, given any 2-coloring of S, a halfplane exists within which one color outnumbers the other by at least $Cn^{1/4}$, for some positive constant C.

From Lemma 12, we prove the following theorem.

Theorem 13. For arbitrarily large values of n, there exist sets of n points in general position such that the following holds. Every 2-coloring of this set has coarseness at least $Cn^{1/4}$, for some positive constant C.

Proof. Assume that S is the set of points given by Lemma 12, and consider any 2-coloring of S. Thus, there exists a halfplane H such that $\mathrm{disc}(S \cap H) \ge C' n^{1/4}$ for some positive constant $C'$. Suppose that the trivial convex partition $\{S\}$ has discrepancy at most $(C'/2)n^{1/4}$, as otherwise we are done with $C := C'/2$. Then, we have that $\mathrm{disc}(S \setminus H) \ge (C'/2)n^{1/4}$ and the convex partition $\Pi = \{S \cap H, S \setminus H\}$ of S has discrepancy $\mathrm{disc}(\Pi) \ge (C'/2)n^{1/4}$. Thus, the coarseness of S is at least $Cn^{1/4}$, with $C := C'/2$. □

3.2. k-Separable islands and convex partitions

We prove the upper bound by showing that the discrepancy of a convex partition is closely related to the discrepancy of k-separable islands. For constant k, we upper bound the discrepancy of the family $\mathcal{I}_k$ of k-separable islands by using its dual shatter function. Namely, we show that $\pi^*_{\mathcal{I}_k}(m) = O(m^2)$. We point out that Dobkin and Gunopulos [17] proved the same asymptotic upper bound, but our proof provides more details and explicitly gives the constant hidden in the big-O notation.

Lemma 14. If k is a positive integer and S a set of n points in convex position in the plane, then $\pi^*_{\mathcal{I}_k}(m) \le 4km$.

Proof. Assume that S is sorted clockwise around its convex hull. Note that any k-separable island must consist of at most k intervals of consecutive points of S in this order. Consider a family of m k-separable islands. There are at most 2km points of S that are endpoints of such intervals. The remaining points (which are not endpoints of any interval) lie in at most 2km regions delimited by these endpoints. Thus, in total, there are at most 4km equivalence classes. □

Lemma 15. If k is a positive integer and S a set of n points in general position in the plane, then $\pi^*_{\mathcal{I}_k}(m) \le (k^2 + 4k)m^2$.

Proof. Let $\mathcal{F}_m \subset \mathcal{I}_k$ be a family of m k-separable islands of S. We first consider the points lying on the boundary of the convex hull of some island I of $\mathcal{F}_m$. Note that the points of I on the boundary of its convex hull are in convex position. By Lemma 14, these points are in at most $4k(m - 1)$ different equivalence classes (when considering the other $m - 1$ islands in $\mathcal{F}_m$). Thus, in total, there are at most $4km^2$ equivalence classes for points on the boundary of some island in $\mathcal{F}_m$. We now bound the number of equivalence classes for points not lying on the boundary of any island. Each such equivalence class is contained in a cell of the line arrangement defined by the following set of lines $\mathcal{L}$. For each island $I \in \mathcal{F}_m$, let $\mathcal{L}_I$ be the set of at most k lines that separate I from $S \setminus I$. Set $\mathcal{L} = \bigcup_{I \in \mathcal{F}_m} \mathcal{L}_I$. The line arrangement defined by $\mathcal{L}$ has at most $|\mathcal{L}|^2 \le k^2 m^2$ cells. The result thus follows. □
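The equivalence classes counted by the dual shatter function can be made concrete with the following Python sketch. It evaluates a single family of islands, not the maximum over all m-element families, so it only gives a lower bound on $\pi^*(m)$ for that instance. The helper names `dual_classes` and `cyclic_interval` are hypothetical, and the example assumes points in convex position indexed in circular order, where 1-separable islands are intervals of consecutive points.

```python
def dual_classes(n, family):
    """Number of equivalence classes on {0, ..., n-1}, where two points are
    equivalent if they belong to exactly the same sets of `family`."""
    signatures = {tuple(i in island for island in family) for i in range(n)}
    return len(signatures)


def cyclic_interval(n, start, length):
    """`length` consecutive indices of an n-gon, starting at `start`."""
    return {(start + j) % n for j in range(length)}


# Twelve points in convex position and m = 3 islands with k = 1.
n, k = 12, 1
family = [
    cyclic_interval(n, 0, 6),    # points 0..5
    cyclic_interval(n, 3, 6),    # points 3..8
    cyclic_interval(n, 10, 4),   # points 10, 11, 0, 1
]

classes = dual_classes(n, family)
print(classes, 4 * k * len(family))
```

For this family the number of classes stays below the bound 4km of Lemma 14, as expected for points in convex position.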


Using the dual shatter function bound and Lemma 15, we obtain the following theorem:

Theorem 16. Let k be a positive constant and S a set of n points in general position in the plane. The discrepancy of the family of k-separable islands of S is at most $O(n^{1/4} \log n)$.
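The exponent in Theorem 16 can be traced with a short worked substitution, assuming the dual shatter function bound is applied with the constants supplied by Lemma 15, namely $C = k^2 + 4k$ and $d = 2$:

```latex
\[
  \pi^{*}_{\mathcal{I}_k}(m) \le (k^2 + 4k)\, m^{2}
  \quad\Longrightarrow\quad
  \operatorname{disc}(\mathcal{I}_k)
  = O\!\left(n^{\frac{1}{2} - \frac{1}{2 \cdot 2}} \log n\right)
  = O\!\left(n^{1/4} \log n\right).
\]
```

The constant $k^2 + 4k$ is absorbed by the big-O notation because k is fixed.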

Note that although k-separable islands have small discrepancy, this is not the case for islands in general. For example, for any coloring of a set of n points in convex position in the plane, there always exists an island with discrepancy at least n/2. It can be shown that in this case the primal and dual shatter functions are equal to $2^m$. Finally, we arrive at the main result of this section by combining Lemma 7 and Theorem 16.

Theorem 17. For every set S of n points in general position in the plane there exists a 2-coloring such that the coarseness of S is at most $O(n^{1/4} \log n)$.

4. Conclusions

In this paper, the concept of the coarseness of a 2-colored point set is revisited and new results are given from both the algorithmic and the combinatorial point of view. Our techniques are based on the concept of k-separable islands. From the algorithmic side, we provided a constant-ratio, polynomial-time approximation algorithm for computing the coarseness of a 2-colored point set. Proving the NP-hardness of computing the coarseness of a 2-colored point set remains an open problem.

From the combinatorial side, we proved that the discrepancy of the family of k-separable islands of a set of n points in the plane is at most $O(n^{1/4} \log n)$, by showing that its dual shatter function $\pi^*_{\mathcal{I}_k}(m)$ is $O(m^2)$. It is known that the dual shatter function bound can be tight for some range spaces (see [4]). It is not hard to see that in the case of point sets in convex position the primal shatter function of their k-separable islands is $\Omega(m^k)$. So the primal shatter function bound can be arbitrarily worse than the dual shatter function bound in this case. It is also interesting to note that the discrepancy of 1-separable islands (halfplanes) is at most $O(n^{1/4})$. We leave the exact (asymptotic) computation of the discrepancy of k-separable islands as an open problem.

Using the fact that every convex partition of a point set S has an island (in this case a 5-separable island) of low discrepancy, we showed that every n-point set in general position in the plane can be two-colored so that the coarseness is at most $O(n^{1/4} \log n)$. Unfortunately, one can find convex partitions of a point set S with no 4-separable island, as illustrated in the example of Fig. 6. Thus, the same technique does not improve the bound. However, Lemma 6 provides more information: for any positive constant $c < 1$ there exists a positive integer $k_c$ (depending only on c), so that in every convex partition of S into m islands at least cm of them are $k_c$-separable (and thus have small discrepancy). We think that computing the exact asymptotic value of the above bound on the

Fig. 6. An example with no 4-separable island.

coarseness of point sets is an interesting (and hard) open problem.

Acknowledgement

The problems studied here were introduced and partially solved during a visit to University of Valparaiso funded by project FONDECYT 11110069 (Chile).

References

[1] J.R. Alexander, J. Beck, W.W.L. Chen, Geometric discrepancy theory and uniform distribution, in: Handbook of Discrete and Computational Geometry, CRC Press, 1997, pp. 185–207.
[2] B. Chazelle, The discrepancy method in computational geometry, in: Handbook of Discrete and Computational Geometry, CRC Press, 2004, pp. 983–996.
[3] M. Drmota, R.F. Tichy, Sequences, Discrepancies and Applications, Lecture Notes in Mathematics, vol. 1651, Springer, 1997, pp. 983–996.
[4] J. Matoušek, Geometric Discrepancy: An Illustrated Guide, Springer-Verlag, 1999.
[5] J. Pach, P.K. Agarwal, Combinatorial Geometry, Wiley-Interscience Series in Discrete Mathematics and Optimization, Wiley, 1995.
[6] S. Majumder, B.B. Bhattacharya, On the density and discrepancy of a 2D point set with applications to thermal analysis of VLSI chips, Inf. Process. Lett. 107 (5) (2008) 177–182.
[7] S. Bereg, J.M. Díaz-Báñez, D. Lara, P. Pérez-Lantero, C. Seara, J. Urrutia, On the coarseness of bicolored point sets, Comput. Geom. 46 (1) (2013) 65–77.
[8] C. Bautista-Santiago, J.M. Díaz-Báñez, D. Lara, P. Pérez-Lantero, J. Urrutia, I. Ventura, Computing optimal islands, Oper. Res. Lett. 39 (4) (2011) 246–251.
[9] S.M. Weiss, C.A. Kulikowski, Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1991.
[10] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Discov. 2 (2) (1998) 121–167.
[11] L.H. Hamel, Knowledge Discovery with Support Vector Machines, vol. 3, John Wiley & Sons, 2011.
[12] D.P. Dobkin, D. Gunopulos, W. Maass, Computing the maximum bichromatic discrepancy, with applications to computer graphics and machine learning, J. Comput. Syst. Sci. 52 (3) (1996) 453–470.
[13] J. Barbay, T.M. Chan, G. Navarro, P. Pérez-Lantero, Maximum-weight planar boxes in $O(n^2)$ time (and better), Inf. Process. Lett. 114 (8) (2014) 437–445.
[14] A. Cannon, J.M. Ettinger, D. Hush, C. Scovel, Machine learning with data dependent hypothesis classes, J. Mach. Learn. Res. 2 (2002) 335–358.
[15] P. Fischer, Sequential and parallel algorithms for finding a maximum convex polygon, Comput. Geom. 7 (3) (1997) 187–200.
[16] H. Edelsbrunner, A.D. Robison, X. Shen, Covering convex sets with non-overlapping polygons, Discrete Math. 81 (2) (1990) 153–164.
[17] D.P. Dobkin, D. Gunopulos, Concept learning with geometric hypotheses, in: Proceedings of the Eighth Annual Conference on Computational Learning Theory, COLT'95, ACM, New York, NY, USA, 1995, pp. 329–336.
[18] J.R. Alexander, Geometric methods in the study of irregularities of distribution, Combinatorica 10 (2) (1990) 115–136.
[19] B. Chazelle, J. Matoušek, M. Sharir, An elementary approach to lower bounds in geometric discrepancy, Discrete Comput. Geom. 13 (1995) 363–381.
