Journal Pre-proof A Fully Polynomial Time Approximation Scheme for the Smallest Diameter of Imprecise Points
Vahideh Keikha, Maarten Löffler, Ali Mohades
PII: S0304-3975(20)30090-6
DOI: https://doi.org/10.1016/j.tcs.2020.02.006
Reference: TCS 12371
To appear in: Theoretical Computer Science
Received date: 27 January 2019
Revised date: 14 January 2020
Accepted date: 6 February 2020
Please cite this article as: V. Keikha et al., A Fully Polynomial Time Approximation Scheme for the Smallest Diameter of Imprecise Points, Theoret. Comput. Sci. (2020), doi: https://doi.org/10.1016/j.tcs.2020.02.006.
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2020 Published by Elsevier.
A Fully Polynomial Time Approximation Scheme for the Smallest Diameter of Imprecise Points
Vahideh Keikhaᵃ, Maarten Löfflerᵇ, Ali Mohadesᵃ
ᵃ Department of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran
ᵇ Department of Information and Computing Sciences, Utrecht University, Utrecht, The Netherlands
Abstract
Given a set $D = \{d_1, \dots, d_n\}$ of imprecise points modeled as disks, the minimum diameter problem is to locate a set $P = \{p_1, \dots, p_n\}$ of fixed points, where $p_i \in d_i$, such that the furthest distance between any pair of points in $P$ is as small as possible. This yields a tight lower bound on the size of the diameter of any instance $P$. In this paper, we present a fully polynomial time approximation scheme (FPTAS) for computing the minimum diameter of a set of disjoint disks that runs in $O(n^2 \varepsilon^{-1})$ time. Then we relax the disjointness assumption and provide an $O(n^2 \varepsilon^{-2})$ randomized algorithm. We also show that our results can be generalized to $\mathbb{R}^d$ when the dimension $d$ is an arbitrary fixed constant.
Keywords: Computational Geometry; Approximation algorithms; Imprecise points; Minimum diameter.
1. Introduction
Over the past 15 years, there has been a flurry of activity in the design of algorithms for geometric problems with uncertain input. In this paper we study the computation of tight lower bounds on a basic geometric measure of a set of imprecise points, the diameter, where the uncertainty of the input is modeled by a set $D = \{d_1, \dots, d_n\}$ of disks and each disk represents an imprecise point. The lower bound is the smallest possible diameter when one is allowed to translate each point within its corresponding disk. See Figure 1. Such lower bounds indicate, in a certain sense, how large the extent of any set of fixed points from distinct disks must be. Indeed, in geometric problems that compute some measure on a given input, when the input is precise, the output is an exact number $\alpha$. But when the input is uncertain, it is natural to ask what the exact range of $\alpha$ is, and how fast we can compute it.
Although the stated uncertainty model is widely applicable and quite simple, finding exact solutions appears to be notoriously hard. As evidence, even with a set of single points instead of each disk, computing a subset with minimum possible diameter has been shown to be NP-hard [9].
Preprint submitted to TCS, February 10, 2020
Related Work. Löffler and van Kreveld [21, 22] introduced a framework for computing tight lower and upper bounds on geometric measures such as the diameter, the area or the perimeter of the convex hull, and the radius of the smallest enclosing circle. They modeled the uncertainty of the data by convex regions, including segments, squares and disks. During the last fifteen years, their paradigm has been successfully applied to various problems [4, 14, 15, 18, 19, 24, 26].
In general, there are many methodologies for dealing with uncertain data in geometric problems: Epsilon-geometry, the Probabilistic model, the Region-based model, etc. In the following, we give a short description of the related studies.
The Probabilistic model assigns probability distributions to uncertain points. There are two common ways to define an uncertain point in this model, sometimes referred to as the locational model and the existential model. In the locational model, each uncertain point always exists but its exact location is unknown; a probability distribution determines the position of each uncertain point. In the existential model, each uncertain point has a fixed exact location but its existence is unknown, and again a probability distribution determines the existence of each uncertain point. We can then define $\alpha$ as a random variable that is distributed over exponentially many choices for the point set. For $\alpha$ we can choose, for instance, the most likely sets with a specific value of the diameter, the width, or the most likely convex hull. Another possibility is to let $\alpha$ be a shape defined by the point set, and compute the probability of a query point lying inside $\alpha$, etc. There has been a lot of recent work using this model as well [1, 3, 10, 13, 16, 28, 29].
The Region-based model is another model where the input is a set $R = \{r_1, \dots, r_n\}$ of regions and each region represents an uncertain point.
In this model, we do not necessarily assume any probability distribution in the regions. Again, we may define $\alpha$ as a measure on $R$. Several problems have been studied in this model, for instance computing the optimum value of $\alpha$, or bounds on the possible values of $\alpha$. Another line of research focuses on preprocessing $R$ and computing the exact value of $\alpha$ later, when a query set $P = \{p_1, \dots, p_n\}$ with $p_i \in r_i$ is given. There has also been a lot of recent work in this model [8, 12, 20, 21, 27].
In the Region-based model, the minimum diameter problem on a set of imprecise points was studied by Löffler and van Kreveld [22], who presented a PTAS that runs in $O(n^{3\pi/\sqrt{\varepsilon}})$ time, where the uncertainty of the input was modeled by arbitrary disks. In the same paper, the authors also presented an exact algorithm that runs in $O(n \log n)$ time and computes an upper bound on the size of the diameter (the maximum diameter problem). Also, a new approximation algorithm was presented for this problem, where the uncertainty of the input is modeled by convex objects in $d$-dimensional space [10]. The presented algorithm runs in $O(2^{\varepsilon^{-d}} \varepsilon^{-2d} n^3)$ time.
Recently, Kazemi et al. [17] studied a related problem, the minimum diameter color spanning set (MDCSS) problem in $\mathbb{R}^d$: the input is a set $P$ of $n$ colored points with $k < n$ colors, and the question is to find a subset $Q$ of $P$ in which each color is included and whose diameter is minimized. The authors presented a PTAS in high dimensional space for this problem, and they proved that, assuming the Exponential Time Hypothesis (ETH), there is no $2^{o(\varepsilon^{(1-d)/2})}\,\mathrm{poly}(n)$ time $(1+\varepsilon)$-approximation algorithm for the problem. In our problem, if we approximate each disk with a set of discrete points and assign them a unique color, the problem reduces to the MDCSS problem. Also, we show that if we find a linear ordering of the important points of the same color (that lie on a disk), we can design an FPTAS for the MDCSS, which is a significant positive result in this context. The minimum diameter problem has also been studied in other models of uncertainty [2, 7, 9, 14, 30].
Finally, it is worth mentioning that the minimum diameter problem can be modeled as an optimization problem: let $d^*$ denote the minimum diameter; the objective is to minimize $d^*$, subject to the distance between any pair of points being at most $d^*$ and each point being restricted to lie in its disk. Obviously, the optimization model is minmax and is a bilevel optimization problem that does not admit a straightforward solution. Furthermore, the theoretically provable polynomial optimization techniques for such problems still suffer from issues such as unstable performance on different data sets, sensitivity to the dimension of the input, and poor performance in practice [5]. In Appendix A we discuss in detail that for large values of $n$, our method is always more efficient. Our method is also always more efficient in terms of space complexity.
Figure 1: Problem definitions and optimal solutions. (a) The diameter of a set of precise points is the largest pairwise distance. (b) The smallest possible diameter on a given set of imprecise points, where each uncertain point is modeled by a disk. Here the minimum diameter occurs at three pairs at the same time; moving any of the four points involved would increase the distance between at least one pair of them. (c) The largest possible diameter on a given set of imprecise points.
Contribution. In this paper, we present an FPTAS for approximating some bounds on the diameter of a set of uncertain points, where previously only PTASs were known [10, 22]. Specifically, we study the following problem, illustrated in Figure 1:
Problem 1. Given a set $D = \{d_1, \dots, d_n\}$ of disjoint disks, choose a set $P = \{p_1, \dots, p_n\}$ of points, where $p_i \in d_i$, such that the diameter of $P$ is as small as possible.
Results.
• We present an FPTAS for Problem 1, which runs in $O(n^2 \varepsilon^{-1})$ time (Section 3).
• Then we relax the disjointness assumption of the disks and show that adjusting the presented FPTAS costs $O(n^2 \varepsilon^{-2})$ expected time (Section 4).
• We then discuss how our results can be extended to any fixed dimension (Section 5).
Recall that the maximization version of Problem 1 is much easier [22].
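To make the objective of Problem 1 concrete, the following minimal sketch evaluates a candidate placement: it checks that each chosen point lies in its disk and computes the diameter of the chosen points by brute force. The function names and the `(cx, cy, r)` disk representation are assumptions made for illustration; they do not come from the paper.

```python
import math
from itertools import combinations

def diameter(points):
    """Largest pairwise Euclidean distance (quadratic-time brute force)."""
    return max((math.dist(p, q) for p, q in combinations(points, 2)), default=0.0)

def is_feasible(points, disks, tol=1e-9):
    """True if the i-th point lies in the i-th disk, given as (cx, cy, r)."""
    return len(points) == len(disks) and all(
        math.dist(p, (cx, cy)) <= r + tol for p, (cx, cy, r) in zip(points, disks)
    )

# Three disjoint unit disks (hypothetical data) and one feasible placement.
disks = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (2.0, 3.0, 1.0)]
placement = [(1.0, 0.0), (3.0, 0.0), (2.0, 2.0)]
centers = [(cx, cy) for cx, cy, _ in disks]
```

Any feasible placement gives an upper bound on the minimum diameter; here the placement above has a smaller diameter than the disk centers, but nothing in this sketch certifies optimality.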
Figure 2: (a) Extreme disks (shown in gray) on the critical sequence. The extreme arc $\alpha_i$ is determined by the neighbors of $d_i$ on $\Delta_D$. (b) The polygon delimited by the common tangent lines (gray polygon) is not necessarily a simple polygon.
2. Preliminaries
2.1. Preliminary Concepts from [22]
In terms of definitions and terminology, we follow Löffler and van Kreveld [22], and we use results from [22] throughout this section. For a given set $D = \{d_1, \dots, d_n\}$ of imprecise points modeled as disks, an extreme disk $d_i \in D$ has a line $\ell$ tangent to some point on the boundary of $d_i$ such that no other disk has its interior completely on the same side of $\ell$ as $d_i$, unless it is tangent to $\ell$, as illustrated in Figure 2(a). The critical sequence $\Delta_D$ is the set of all extreme disks of $D$. Without loss of generality, suppose the elements of $\Delta_D$ are ordered clockwise. This ordering can be found in $O(n \log n)$ time [25].
In the minimum diameter problem, it is possible that the diameter occurs at several pairs at the same time, and many points are involved in the diameter, such that moving any of them would increase the distance between at least one pair of them. This property makes the problem difficult (see Figure 1(b) and Figure 5(b)). In this situation, the pairs realizing the diameter form a graph, the star graph. A star graph $G$ is defined as $G = (P^*, E)$, where $P^* = \{b_1, \dots, b_m\}$ ($m \le n$) is a collection of points such that $b_i \in d_i$ for some $i$, and all the elements of $E$ have equal length, which is the optimal diameter, denoted by $d^*$. Observe that no two points of $P^*$ can be more than $d^*$ away from each other. It follows that all the elements of $E$ must intersect each other, and the path makes an angle of at most $60^\circ$ at each vertex. Thus the degree of each vertex of $G$ can be at most two, and each element $b_i \in d_i$ of $P^*$ (with degree two) is in balance between its neighbors (which are the cherry disks of $d_i$, defined later); that is, it could move closer to one neighbor but only by moving further from the other neighbor.
Note that we can remove any vertex in this graph that can come closer to some element without moving further from another element (see [22] for more details). Also notice that all $n$ points could be involved in such a construction, where none of the points can be moved without increasing the diameter. The elements of $P^*$ with degree two are called bends. Computing the exact positions of the bends is a difficult problem.
Figure 3: Illustration of Observation 1. The (not necessarily unique) edge of length $d^*$ is shown in green. (a) For $n \ge 3$, $d^* > 0.28$. (b) For $n \ge 7$, $d^* > 2$.
Any two adjacent disks on $\Delta_D$ admit a common tangent line (see Figure 2). The extreme arc $\alpha_i$ of an extreme disk $d_i$ is defined by the two disks $d_{i-1}$ and $d_{i+1}$ that are neighbors of $d_i$ on $\Delta_D$, such that the endpoints of $\alpha_i$ are the touching points of the common tangent lines, as illustrated in Figure 2(a). Note that $\alpha_i$ is the part of the boundary of $d_i$ that must contain $b_i$. Let $r_i$ and $c_i$ denote the radius and the center of $d_i$, respectively. The distance from a point $x$ on disk $d_i$ to a disk $d_j$ is the minimum distance between $x$ and any point on $d_j$. The distance between two arcs $\alpha_i$ and $\alpha_j$ is the minimum distance from any point on $\alpha_i$ to any point on $\alpha_j$. Also, for any two disks $d_i$ and $d_j$ in $\Delta_D$, the shortest distance between $d_i$ and $d_j$ is determined by $\alpha_i$ and $\alpha_j$. We denote the length of a line segment $\ell$ by $|\ell|$.
2.2. New Concepts
In this section, we provide some observations, notation and lemmas which are used in the main results. The anchored diameter at a vertex $x$ means that $x$ is a vertex of the diameter. Also, the smallest axis-aligned bounding box of a set of objects is the smallest axis-aligned rectangle containing all the objects.
Let $A$ be a set of points. For any $A' \subset A$ and $0 \le \varepsilon \le 1$, and a geometric shape $r$ from the range space $R$, if the inequality $|r \cap A| \ge \varepsilon |A|$ implies that $r$ contains at least one point of $A'$, then $A'$ is an $\varepsilon$-net [11]. For a range space of VC-dimension $d$, the number of independent samples for constructing an $\varepsilon$-net is $O\left(\frac{d}{\varepsilon} \log \frac{d}{\varepsilon\delta}\right)$, where $1 - \delta$ is the probability that the sampled set is an $\varepsilon$-net, and $\varepsilon > 0$ is a given constant [11]. Also, the VC-dimension of disks equals three [11].
Observation 1. For a given set $D$ of $n$ disjoint unit disks with $n \ge 3$, the smallest diameter $d^*$ is at least 0.28, and for $n \ge 7$, $d^* > 2$.
The observation can be easily proved by considering Figure 3 and the similarity between the triangles. The above observation gives us a simple constant factor approximation.
Let $\gamma$ be any upper bound on the possible diameters of $D$, and let $d$ be the diameter of any feasible solution; then obviously $d \le \frac{\gamma}{2} d^*$.
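The inequality can be spelled out in one line. The derivation below is a sketch under the assumption that $D$ consists of $n \ge 7$ disjoint unit disks, so that Observation 1 gives $d^* > 2$; it uses only the fact that every feasible diameter $d$ is at most the upper bound $\gamma$.

```latex
\[
  d \;\le\; \gamma \;=\; \frac{\gamma}{2}\cdot 2 \;<\; \frac{\gamma}{2}\, d^{*}
  \qquad \text{since } d^{*} > 2 \text{ for } n \ge 7 .
\]
```

Under the same assumption, any feasible solution is therefore within a factor $\gamma/2$ of the optimum.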
Figure 4: (a) The base of induction. (b) The inductive step.
Lemma 1. Let $D = \{d_1, \dots, d_n\}$ be a set of disjoint disks, and let $\hat{\alpha}_i$ denote the size of the angle formed at the center of the extreme disk $d_i$ by the two lines through the endpoints of its extreme arc $\alpha_i$. Then $\sum_{i=1}^{n} \hat{\alpha}_i = 2\pi$.
Proof. First note that the polygon which is delimited by the common tangent lines of the critical sequence is not necessarily convex or even a simple polygon, as illustrated in Figure 2(b). We prove the lemma by induction. Let $d_j$ be a disk on the critical sequence of $D$ and let $\beta_j$ be the front angle of $\alpha_j$, which is determined by the intersection of the common tangent lines of $d_j$ with its two neighbors on $\Delta_D$, as illustrated in Figure 4(a). Let $\hat{\beta}_j$ denote the size of $\beta_j$. In each step $(j)$, let $A_j$ denote the total sum of the angles of the extreme arcs, where $j$ is also the number of disks that we have considered until step $(j)$. We first compute $\Delta_D$. Note that if $d_i$ is not an extreme disk, $\hat{\alpha}_i = 0$. Let $\Delta_D = \{e_1, e_2, \dots, e_m\}$ be the extreme disks in $D$ (so each $e_j = d_i$ for some $i$). We prove that by incrementally inserting the extreme disks, the value $A_j$ for $j = 1, \dots, m$ is invariant.
The base case consists of three disks; see Figure 4(a). Thus we have $\hat{\alpha}_j + \hat{\beta}_j = \pi$ for all $j \in \{1, 2, 3\}$ and $\sum_{j=1}^{3} \hat{\beta}_j = \pi$, which implies $A_3 = \sum_{j=1}^{3} \hat{\alpha}_j = 2\pi$. In each step, we insert one disk, and since the inserted disks belong to $\Delta_D$, the inserted disk can only affect the extreme arcs of two other disks.
In the inductive step, we show that if $A_{m-1} = \sum_{j=1}^{m-1} \hat{\alpha}_j = 2\pi$, the statement also holds for $m$, so $A_m = \sum_{j=1}^{m} \hat{\alpha}_j = 2\pi$. From the induction hypothesis, we know that before inserting the last disk, the sum of the angles of all the extreme arcs on $\Delta_D$ is $2\pi$. For brevity of explanation, our consideration is according to Figure 4(b).
Figure 5: (a) Extreme disk $d_i \in D$ has two cherry disks $d_j$ and $d_k$. (b) All the cherry disks of $D$; the constructed star graph on them is shown in green. Bend $b_i$ is located in the interval $[e_{ij}, e_{ik}]$.
After inserting $e_m$, it will have two neighbors on $\Delta_D$. Let $e_1$ and $e_{m-1}$ denote these disks (w.l.o.g.). We consider the changes to the extreme arcs of $e_1$ and $e_{m-1}$. First we consider the change to the extreme arc of $e_1$. The extreme arc $\alpha_1$ is decomposed into two new parts, and respectively two new angles, $\alpha_1'$ and $\alpha_1''$, where $\alpha_1'$ denotes the updated extreme arc of $e_1$ and $\alpha_1''$ is the part of $\alpha_1$ that is omitted when inserting $e_m$. Similarly, we decompose the extreme arc $\alpha_{m-1}$ of disk $e_{m-1}$ into two parts, $\alpha_{m-1}'$ and $\alpha_{m-1}''$. We will show that $\hat{\alpha}_m = \hat{\alpha}_1'' + \hat{\alpha}_{m-1}''$. We have $\hat{\alpha}_m + \hat{\beta}_m = \pi$ and $\hat{\alpha}_{m-1}'' + \hat{\alpha}_1'' + \hat{\beta}_m = \pi$, hence $\hat{\alpha}_m = \hat{\alpha}_1'' + \hat{\alpha}_{m-1}''$. Consequently, $A_m = A_{m-1} - \hat{\alpha}_1'' - \hat{\alpha}_{m-1}'' + \hat{\alpha}_m = A_{m-1} = 2\pi$. It should be noted that $e_m$ can be larger or can be located somewhere else, up to the point where it is no longer tangent to the common tangent of $e_1$ and $e_{m-1}$, since otherwise $e_1$ and $e_{m-1}$ will no longer be extreme disks.
Cherry Disks. For any disk $d_i$ that shares bend $b_i$ on the star graph, there always exist two other disks $d_j, d_k \in \Delta_D$ with $j \ne k$, such that bend $b_i$ is in balance between them; that is, $b_i$ could move closer to one of them but only by moving further from the other one. We call $d_j$ and $d_k$ the cherry disks of $d_i$. An example is illustrated in Figure 5. If a disk $d_i \in \Delta_D$ does not have two cherry disks, $d_i$ cannot introduce a bend on the star graph [22].
For any disk $d_i \in \Delta_D$ with cherry disks $d_j$ and $d_k$, let $e_{ij}$ and $e_{ik}$ denote the intersection points of the boundary of $d_i$ with the line segments $c_i c_j$ and $c_i c_k$, respectively (see Figure 5(b)). The points $e_{ij}$ and $e_{ik}$ are the start point and the endpoint of the arc on which bend $b_i$ on $\alpha_i$ is in balance between the cherry disks of $d_i$. We denote this interval by $[e_{ij}, e_{ik}]$.
Lemma 2. Let $d_j$ and $d_k$ with $j < k$ denote the cherry disks of $d_i$. Then $b_i$ is located in the interval $[e_{ij}, e_{ik}]$.
Proof. Suppose this is false. Then $b_i \notin [e_{ij}, e_{ik}]$.
We assume the possible locations of $b_i$ on the boundary of $d_i$ are ordered clockwise about the center $c_i$. Consider the case where $b_i$ is located strictly before the position $e_{ij}$ (the other case is similar). In this case, at least $b_j$ is located strictly after $e_{ji}$ (in the clockwise ordering of the boundary of $d_j$). See Figure 5(b) as an illustration of the bends. But then, if we go through $\Delta_D$ in clockwise direction, the two disks (or even one) which determine the position of $b_j$ on $d_j$ must be located clockwise after $d_i$ until we reach the disk $d_j$. Let $d_p$ and $d_q$ denote these disks. Since $b_j$ is located strictly after $e_{ji}$, at least one of $d_p$ and $d_q$ is different from $d_i$. Let $d_p$ be such a disk. Clearly, $b_i b_k$, $b_j b_p$ and $b_j b_q$ are edges of the star graph. But then $b_j b_p$ never intersects $b_i b_k$. This contradicts the fact that all the edges of the star graph are pairwise intersecting.
3. Minimum Diameter Problem
3.1. Unit Disks
First, suppose $D$ consists of disjoint unit disks. From Lemma 1 we know that if $D$ consists of disjoint disks, the total sum of the angles of the extreme arcs equals $2\pi$. We proceed by covering the boundary of a unit disk $U \notin D$ by all the extreme arcs of the disks in $\Delta_D$, according to the clockwise ordering of $\Delta_D$, such that the extreme arcs intersect only at their endpoints, as illustrated in Figure 6(a). This covering is indeed a translation transformation. We decompose the boundary of $U$ into smaller, equal-length sub-arcs by regularly inserting $2\pi/\varepsilon$ points. Then, for any disk $d_i \in \Delta_D$, the added points on the part of the boundary of $U$ which is covered by $\alpha_i$ are transferred to the boundary of $d_i$, as illustrated in Figure 6(a). Consequently, the extreme arcs get divided into sub-arcs of length at most $\varepsilon$.
Figure 6: (a) Subdivisions of the extreme arcs into sub-arcs of length (at most) $\varepsilon$. (b,c) The configurations of the approximated diameter $d$ (shown in gray) and the optimal diameter $d^*$ (shown in green).
Recall that $P^*$ denotes the optimal point set. Let $P' = \{p_1, \dots, p_m\}$ denote the optimal point set restricted to the endpoints of the sub-arcs. The set $P'$ minimizes the furthest distance between any pair of its points among all possible choices for $P'$ for a given $\varepsilon > 0$.¹ Let $d$ denote the diameter of $P'$. As said before, $d^*$ denotes the optimal diameter of $P^*$. We will show that $d$ approximates $d^*$ within a factor $(1 + \varepsilon)$. For any disk $d_i \in \Delta_D$, we define the optimal sub-arc $\alpha_i^*$ as the sub-arc that includes (if any) the optimal bend $b_i$.
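The subdivision step and the definition of the discretized optimum can be illustrated with a small sketch. It assumes the extreme arcs are already known and given as hypothetical tuples `(cx, cy, r, start, extent)` with angles measured counterclockwise in radians; computing the arcs themselves is not shown. The second function is an exponential-time brute-force reference for the best choice among sub-arc endpoints on tiny inputs, and is not the algorithm developed in this paper.

```python
import math
from itertools import combinations, product

def subdivide_arcs(arcs, eps):
    """Return, per arc, the endpoints of its sub-arcs.

    The arcs are laid one after another around a circle U, U is marked at
    ceil(2*pi/eps) equally spaced angles, and each mark is transferred to the
    arc covering it. Each arc also keeps its own two endpoints.
    """
    k = math.ceil(2 * math.pi / eps)   # number of marks on U
    step = 2 * math.pi / k             # angular spacing, at most eps
    candidates = []
    offset = 0.0                       # angle on U where the current arc starts
    for cx, cy, r, start, extent in arcs:
        local = [0.0]                  # first endpoint of the arc
        s = math.floor(offset / step) + 1
        while s * step < offset + extent:
            local.append(s * step - offset)
            s += 1
        local.append(extent)           # last endpoint of the arc
        candidates.append([(cx + r * math.cos(start + a),
                            cy + r * math.sin(start + a)) for a in local])
        offset += extent
    return candidates

def brute_force_min_diameter(candidates):
    """Try every combination of one candidate per arc (tiny inputs only)."""
    best, best_choice = math.inf, None
    for choice in product(*candidates):
        d = max((math.dist(p, q) for p, q in combinations(choice, 2)), default=0.0)
        if d < best:
            best, best_choice = d, choice
    return best, best_choice

# Made-up arcs on two unit disks; the extents sum to 2*pi.
arcs = [(0.0, 0.0, 1.0, 0.0, math.pi), (4.0, 0.0, 1.0, math.pi, math.pi)]
cands = subdivide_arcs(arcs, 0.5)
best, choice = brute_force_min_diameter(cands)
```

For unit disks the angular spacing equals the arc length, so consecutive candidates on one arc are at most `eps` apart; for a disk of radius `r` the spacing along the boundary scales with `r`.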
As we show in Lemma 7, $\alpha_i^*$ minimizes the difference of distances of the endpoints of $\alpha_i^*$ to the approximated cherry disks of $d_i$, where the diameter realized by such a selection of each $\alpha_i^*$ is as small as possible. Note that the length of $\alpha_i^*$ for any $d_i \in \Delta_D$ depends on $\varepsilon$, and in the limit as $\varepsilon \to 0$ the length of $\alpha_i^*$ also goes to 0, and the solution reaches the optimal solution. We postpone the discussion of computing the optimal sub-arcs, and first consider how the set $P'$ approximates the minimum diameter. The optimal diameter $d^*$ selects its vertices on two optimal sub-arcs.
¹ We use the name $P$ for the final set approximating $d^*$ whose points are selected from the set $D$ rather than $\Delta_D$, where $P' \subseteq P$. In Lemma 9 we will discuss that the diameter of $P'$ (denoted by $d'$) equals the diameter of $P$ (denoted by $d$).
Figure 7: Execution of Algorithm 1 on a set of five disks. The solid points are the endpoints of the sub-arcs. (Panels: initial position, second position, third position.)
Lemma 3. For any two optimal sub-arcs which may include the vertices of the potential minimum diameter, the ratio of their smallest distance to their furthest distance is bounded by $1 + O(\varepsilon)$.
Proof. There are only two configurations to consider for the ratio of the smallest distance to the furthest distance of a pair of optimal sub-arcs: the case where the edges realizing $d$ and $d^*$ intersect, and the case where they do not intersect; see Figure 6(b,c). In both cases, the points can move by a distance of at most $2\varepsilon$ along a sub-arc. Given that $d^* \ge 2$ and by the triangle inequality, we have $d \le d^*(1 + O(\varepsilon))$.
3.2. Algorithm: Computing the Optimal Sub-arcs
We show that for any disk $d_i \in \Delta_D$, we can find $\alpha_i^*$ efficiently. We initialize $P'$ with arbitrary distinct points: for each disk $d_i \in \Delta_D$ we select a point $p_i$ among the endpoints of an arbitrary sub-arc of $\alpha_i$. Then, during the algorithm, we try to move each element of $P'$ to its best position, so that the final set $P'$ minimizes the diameter among all possible choices for $P'$. Indeed, for any disk $d_i$ we look for the optimal sub-arc $\alpha_i^*$, where one of the endpoints of $\alpha_i^*$ will be determined by an element of $P'$ in the last step of the algorithm.
In each step of the algorithm we start by reducing the size of the current diameter of $P'$, and we continue by reducing the size of the other anchored diameters. Let $\delta_1$ denote the current diameter of $P'$ with $p_i$ and $p_j$ as the vertices.
If $p_i$ (or $p_j$) is not yet in balance, we move it forward among the endpoints of the sub-arcs on $\alpha_i$ in the direction in which the size of $\delta_1$ decreases, and after each movement we update $\delta_1$. We stop moving $p_i$ when, in the next movement, the distance of $p_i$ to some other point, say $p_k \in P'$, would be greater than the current size of $\delta_1$. So far, we have updated one element of $P'$.
In the next stage, let $\delta_2$ denote the diameter with a vertex at $p_k$. Then we move $p_k$ in the direction in which the size of $\delta_2$ decreases (again updating the size of $\delta_2$ after each movement), until in the next movement the distance of $p_k$ to some other point, say $p_l$, would be greater than the current size of $\delta_2$. We repeat this procedure for $p_l$. Then, if the updated points are in balance but a point $p_x \in P'$ remains that has not been checked yet (i.e. $p_x$ was not the furthest point of any other point), we continue the above procedure with $p_x$ and its furthest neighbor. We stop this step once we have updated or checked all the elements of $P'$.
In the next step, we again start by computing the diameter of $P'$. We repeat the above procedure until, in the last step of the algorithm, we can only check the position of all the elements of $P'$, while no other movement is possible. Then we have approximated the cherry disks of every disk $d_i \in \Delta_D$, and one endpoint of the optimal sub-arc $\alpha_i^*$ is determined. The other endpoint is the one which is closer to both approximated cherry disks of $d_i$.² This procedure is illustrated in Figure 7 and is outlined in Algorithm 1.
Lemma 4. During the consecutive steps of the algorithm, the size of the diameter anchored at each point $p_i$ is always non-increasing.
Proof. Suppose the lemma is false. Then in some step, say $(\eta)$, the size of the diameter anchored at a given vertex $p_i$ is increased. Suppose $p_j$ is the other vertex of the anchored diameter at $p_i$ in step $(\eta - 1)$. During step $(\eta - 1)$ of the algorithm, we have moved $p_i$ until it gets in balance between $p_j$ and another element of $P'$. Then in step $(\eta)$, there is another point $p_x$ such that $|p_i p_x| > |p_i p_j|$, which also means $p_x$ is the furthest neighbor of $p_i$. Let $p_b$ denote the other vertex of the anchored diameter at $p_x$, that is, $|p_x p_b| > |p_x p_i|$. It implies that in step $(\eta)$, $p_x$ has been moved further from $p_i$ with respect to its previous position.
It follows that $p_x$ has moved in (w.l.o.g.) cw direction to have a further distance from $p_i$, which is because $p_b$ has been moved toward another point, say $p_l$, that is, $|p_b p_l| > |p_b p_i|$. (Note that $p_x$ can coincide with $p_j$, but $p_b \ne p_i$ due to the movement of $p_x$.) Because of the direction of movement of $p_x$, $p_b$ has to be moved in ccw direction along the boundary of its disk. Also $|p_x p_b| > |p_x p_i|$. But point $p_l$ must be located to the left of the directed line through $\overrightarrow{p_b p_x}$, since $p_x$ is moved in cw direction to get closer to $p_b$, and $p_b$ is moved in ccw direction to get closer to $p_l$, as illustrated in Figure 8. From the above discussion about the position of $p_l$, and by considering the intersection region of the circle centered at $p_i$ with radius $|p_i p_x|$ with the circle centered at $p_b$ with radius $|p_b p_x|$, together with $|p_b p_l| > |p_b p_i|$ and $|p_b p_l| > |p_b p_x|$, we have $|p_i p_l| > |p_i p_x|$. This contradicts the fact that in step $(\eta)$, $p_x$ is the furthest neighbor of $p_i$. The case where $p_x$ is moved in ccw direction is symmetric.
² If the optimal sub-arc is the leftmost or the rightmost sub-arc on a disk, then there is only one endpoint for the optimal sub-arc.
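The elementary move used in this procedure can be sketched as follows. This is a simplified greedy step, not Algorithm 1 itself: it slides a single point along the ordered sub-arc endpoints of its disk for as long as its anchored diameter strictly decreases, and it does not track balance between cherry disks. The names `anchored_diameter` and `slide_point`, and the representation of the sub-arc endpoints as ordered lists with one chosen index per disk, are assumptions made for illustration.

```python
import math

def anchored_diameter(points, i):
    """Largest distance from points[i] to any other chosen point."""
    return max(math.dist(points[i], q) for j, q in enumerate(points) if j != i)

def slide_point(candidates, index, i, tol=1e-12):
    """Move the choice on disk i to a neighboring candidate while that
    strictly decreases the diameter anchored at p_i.

    candidates[k] is the ordered list of sub-arc endpoints of disk k and
    index[k] is the currently chosen one; index is updated in place.
    Returns the number of moves performed.
    """
    moves = 0
    improved = True
    while improved:
        improved = False
        current = [candidates[k][index[k]] for k in range(len(candidates))]
        best = anchored_diameter(current, i)
        for direction in (-1, 1):          # try both neighbors on the arc
            nxt = index[i] + direction
            if 0 <= nxt < len(candidates[i]):
                trial = list(current)
                trial[i] = candidates[i][nxt]
                if anchored_diameter(trial, i) < best - tol:
                    index[i] = nxt
                    moves += 1
                    improved = True
                    break
    return moves

# Hypothetical candidates: disk 0 has three choices, the others one each.
candidates = [[(0.0, 2.0), (0.0, 1.0), (0.0, 0.0)], [(3.0, 0.0)], [(-3.0, 0.0)]]
index = [0, 0, 0]
moves = slide_point(candidates, index, 0)
```

Each move strictly decreases the anchored diameter over a finite set of candidates, so the loop terminates; this mirrors, in a much weaker form, the monotonicity stated in Lemma 4.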
Algorithm 1: Approximation algorithm
Input: $D$, a set of disks; $\varepsilon$, a small positive constant
Output: $P'$, a set of points which approximate $d^*$
  $\Delta_D$ = extreme disks of $D$
  $m$ = cardinality of $\Delta_D$
  Initialize $P'$ by selecting an arbitrary point from each disk of $\Delta_D$, so that the points are chosen at the endpoints of the computed sub-arcs of length $\varepsilon$.
  while there is a point in $P'$ which is not in balance yet do
    flag = 1
    $p_i, p_j$ = diameter of $P'$
    while flag = 1 do
      for $p_k$, $k = 1, \dots, m$, $k \ne i, j$ do
        while $|p_i p_j| > |p_i p_k|$ do
          move $p_i$ toward $p_j$ among the endpoints of the sub-arcs of $d_i$  (* make $p_i$ in balance *)
          update the anchored diameter at $p_i$
        end
      end
      $p_i = p_k$  (* the point for which, by further moving $p_i$, $|p_i p_k|$ > anchored diameter at $p_i$ *)
      $p_j$ = the furthest neighbor of $p_i$  (* to make $p_i$ in balance in the above loop *)
      if all the checked points are in balance but a point $p_i$ remains which has not been checked yet then
        $p_i$ = an unprocessed point in $P'$
        $p_j$ = the furthest neighbor of $p_i$
      end
      if all the points are in balance then
        flag = 0
      end
    end
  end
  Return $P'$
The importance of the reduction in each anchored diameter lies in the convergence of the iterative process. The above lemma gives the intuition for why the whole procedure should terminate within $O(\varepsilon^{-1})$ steps. In the following we prove this formally.
Figure 8: Illustration of step $(\eta)$.
Lemma 5. The direction of movement of each point $p_i$ is changed at most twice during the algorithm.
Proof. Suppose, for contradiction, that the lemma is false. Then there exists a point $p_i \in d_i$ whose direction of movement is changed more than twice. Suppose $p_j$ is the other vertex of the anchored diameter at $p_i$ in step $(\eta)$ of the algorithm. Then during step $(\eta)$, we move $p_i$ until it gets in balance between $p_j$ and another element of $P'$, say $p_k$ (w.l.o.g. $j < k$; the other case is symmetric). Then $p_j$ and $p_k$ are the first and second furthest neighbors of $p_i$. W.l.o.g., suppose $p_i$ has to be moved in ccw direction until it gets in balance between $p_j$ and $p_k$ (the other case is symmetric), as illustrated in Figure 7. Let $p_i^{1-}$ denote the new position of $p_i$, where $-$ indicates that $p_i$ moves in ccw direction. (We suppose $p_i^0$ denotes the initial position of $p_i$.) As can be seen in Figure 7, at the end of step $(\eta)$, we have checked or corrected the position of all the elements of $P'$. Now suppose the direction of movement of $p_i$ is changed in a step $(\eta + c)$ with $c > 0$, that is, $p_i$ should get in balance between two other points, say $p_x$ and $p_y$. Then $p_i$ is moved in cw direction. Let $p_i^{2+}$ denote the new position of $p_i$, where $+$ indicates that $p_i$ moves in cw direction. Two cases can happen: either some other point $p_x$, $x \ne j, k$, is the reason for the movement of $p_i$, or the movements of $p_j$ and/or $p_k$ are the reason for the movement of $p_i$.
First consider the case where $p_i$ is moved in cw direction because some other point $p_x$ is moved further away from $p_i$, that is, either $|p_i p_x| \ge |p_i p_j|$ or $|p_i p_x| \ge |p_i p_k|$. In the case where $|p_i p_x| \ge |p_i p_j|$, either in step $(\eta + c - 1)$ the size of the anchored diameter at $p_i$ is increased (which is a contradiction), or in step $(\eta + c)$ there is another point $p_b$ which is moved further away from $p_x$, where $p_b$ is the furthest neighbor of $p_x$ and the movement of $p_b$ has encouraged $p_x$ to move further away from $p_i$. But then, with the same argument as in Lemma 4, there exists another point $p_l$ that is the furthest neighbor of $p_i$ (see Figure 8). This is a contradiction. In the case where $|p_i p_x| \ge |p_i p_k|$, since $p_i$ needs to be moved in cw direction to get in balance, $d_x$ is located before $d_k$ in the cw ordering of $\Delta_D$. Again there is another point $p_b$ which is moved further away from $p_x$, where $p_b$ is the furthest neighbor of $p_x$ and the movement of $p_b$ has encouraged $p_x$ to move further away from $p_i$.
Again, with the same argument as in Lemma 4, there exists another point $p_l$ that is outside the circle centered at $p_i$ whose radius is $|p_i p_x|$, and outside the circle centered at $p_b$ whose radius is $|p_b p_x|$. Thus $p_l$ is the second furthest neighbor of $p_i$ instead of $p_x$, a contradiction.
Second, consider the case where $p_i$ is moved in cw direction while the size of the anchored diameter at $p_i$ is decreased. This happens because one or both of $p_j$ and $p_k$ moved closer to $p_i$ (along the boundary of their disks). This is an admissible movement. Now again suppose that in a step $(\eta + c')$ with $c' > c$, $p_i$ needs to be moved in ccw direction. But then either $p_j$ or another point $p_x$ is moved further away from $p_i$, which implies that either the size of the anchored diameter at $p_i$ is increased in step $(\eta + c' - 1)$ (which is a contradiction), or $p_j$ or
another point p_x has moved further away from p_i in step (η + c′), and again there exists another point p_b which was moved before p_i and has forced p_j (or p_x) to move in a direction further away from p_i. Then, by the same argument as in Lemma 4, there always exists another point p_l such that |p_l p_i| > |p_j p_i| or |p_l p_i| > |p_x p_i|, a contradiction.

Lemma 6. Algorithm 1 stops after O(ε⁻¹) steps.

Proof. Since the sizes of the anchored diameters are always non-increasing and the direction of movement of each point changes at most twice, and since there are at most O(ε⁻¹) points on all the disks, each endpoint of a sub-arc is visited at most three times. The lemma follows.

3.3. Running Time Analysis

Now we consider the running time of each step. Let m denote the cardinality of Δ_D. Since we always know the cw order of the elements of P′, the diameter of P′ can be computed in linear time in each step. In each movement, for any p_i ∈ P′, we take care not to increase the size of the diameter with a vertex at p_i. Thus, for each element in one step, this costs O(m + m·e_{p_i}) time, where e_{p_i} is the number of times that p_i is moved, and Σ_{i=1,...,m} e_{p_i} ∈ O(ε⁻¹).
The latter m is the time cost of checking whether the corresponding element is in balance. Thus the algorithm takes O(ε⁻¹ · m) · O(m) time. Notice that in the worst case m equals n.

3.4. Correctness of the Method

First notice that for any disk d_i ∈ Δ_D, an optimal sub-arc α_i* always exists. It is easy to observe that there exist two optimal sub-arcs on two different disks that realize the diameter, such that two endpoints of these sub-arcs have the furthest distance among all (although we have minimized this distance). These two sub-arcs approximate the minimum diameter.

Lemma 7. The endpoints of each α_i* minimize the difference between the distances from d_i to the approximated cherry disks of d_i, among all endpoints on d_i.

Proof. Let d_j and d_k denote the approximated cherry disks of d_i. Then d_j and d_k are the furthest and second furthest neighbors of d_i. Since d_i is in balance between d_j and d_k on α_i*, at each endpoint of α_i* it is closer to one of d_j and d_k, whereas at every endpoint of any other sub-arc on d_i, d_i is further from both of them. The lemma follows.

Lemma 8. For any disk d_i ∈ Δ_D, the sub-arc α_i* computed by the algorithm contains the possible bend b_i.

Proof. Suppose the lemma is false. Then b_i is located on a sub-arc α_i′ which is distinct from α_i*. Then either we have passed over α_i* during the algorithm, or we did not find it and we have stopped. Let d_j and d_k denote the cherry disks of d_i approximated by the algorithm. In the first case, at both endpoints of α_i′, the computed distance from d_i to both d_j and d_k must be greater than the current diameter anchored at p_i, while we previously have
Figure 9: The maximum possible length of an extreme arc occurs between two disks d_i and d_j with r_i = cd′/2 and r_j = 0. Note that the subtended angle of an extreme arc cannot equal π; this is assumed only in order to compute the upper bound.
found another diameter which involves α_i* and has strictly smaller size.³ This contradicts the optimality of the minimum diameter. In the second case, since α_i′ and α_i* are distinct, at least one computed cherry disk for one endpoint of α_i′ has to be distinct from d_j or d_k (if not, α_i′ = α_i* and we are done). But then we could move p_i to reduce the distance from d_i to at least one of d_j and d_k. This contradicts the stopping criterion of the algorithm.

Lemma 9. Let D be a given set of n disjoint unit disks. The diameter d′ of the set P′ approximates the smallest diameter of D within a factor (1 + ε) and can be computed in O(n² ε⁻¹) time.

Proof. The candidate point p_i of any disk d_i ∈ Δ_D which does not have cherry disks is located at one of the endpoints of α_i*, such that the distance from p_i to the other elements of P′ is always smaller than the current size of d′. If d_i ∉ Δ_D, p_i is chosen to be as close as possible to the elements of P′. Obviously, the distance from such points to the elements of P′ cannot be greater than d′. Thus we have a valid instance of points, and d′ approximates the smallest diameter d* within a factor (1 + ε).

3.5. Disks with Different Sizes

In Observation 1 we saw that for a given set D of n disjoint unit disks with n ≥ 7, the smallest possible diameter is always at least 2. If the disks are not unit but still disjoint, Observation 1 still holds provided the smallest disk is a unit disk. In this case, the total sum of the angles of the extreme arcs again equals 2π, but if we apply the idea we used for unit disks, the optimal sub-arcs will not necessarily have the same length. We therefore first apply the constant-factor approximation algorithm of [22] for the minimum diameter problem on a set of disks. That algorithm approximates the smallest diameter within a constant factor c = (2/3)√3 in linear time. Let d′ denote the approximated smallest diameter of D within factor c.
We define h = ε · c · d′ as the new length of the sub-arcs. Note that if h is greater than the length of the extreme arc of a disk d_i, we consider α_i as the only sub-arc of d_i.

Lemma 10. By setting h = ε · c · d′ as the new length of the sub-arcs, the total sum of the lengths of the sub-arcs is bounded by 2π · cd′/2.

³ Notice that the sizes of all anchored diameters are always non-increasing.
Proof. We consider the case where a disk has the maximum possible sub-arc length. Suppose there are two disks d_i and d_j on Δ_D. Let d_j be the smaller disk, whose radius is 0, and let r_i = cd′/2; see Figure 9. Then obviously the total sum of the lengths of the sub-arcs of d_i and d_j is smaller than π · cd′/2, since for n ≥ 3 the subtended angle of an extreme arc cannot equal π. Since the total sum of the angles of the extreme arcs equals 2π, and since cd′ is an upper bound on the size of the smallest possible diameter of D, any circle whose radius is greater than cd′/2 will have an extreme arc with less curvature than that of d_i (and thus with smaller arc length), and any circle whose radius is less than cd′/2 will have an extreme arc with a shorter arc length. Thus the remaining disks on Δ_D can increase the total sum to at most 2π · cd′/2. The lemma follows.

Thus the maximum number of points that approximate the extreme arcs is bounded by (2π · cd′/2)/h = π/ε, which is in O(ε⁻¹). Since the computed sub-arc lengths are at most h (and almost equal), the considered ratio of the furthest distance to the smallest distance between the optimal sub-arcs (in Section 3.1) still holds, and the presented algorithm still works in the mentioned time.

Theorem 1. Given a set of n disjoint disks, the solution to the problem of choosing a point on each disk such that the diameter of the resulting point set is as small as possible can be approximated within a factor (1 + ε) in O(n² ε⁻¹) time.

4. Relaxing the Disjointness Assumption

In the case where all the disks have a common intersection, the smallest diameter equals 0. This case can be detected in at most O(n²) time. From now on, suppose this is not the case. We first snap the points to an ε-net [23] G_R and then scale the distances to achieve a lower bound on the size of the diameter. Then we show that the discretized points on G_R approximate the diameter within a factor 1 + O(ε).
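The degenerate case mentioned at the start of this section (all disks sharing a common point) can be illustrated with a minimal brute-force sketch. It is not the O(n²) procedure referred to in the text, which is not spelled out here; it relies on the standard observation that the lowest point of a non-empty intersection of disks is either the lowest point of one disk or an intersection point of two boundary circles. The function names and the tolerance TOL are assumptions made for this illustration.

```python
import math
from itertools import combinations

TOL = 1e-9  # numerical tolerance (an assumption of this sketch)

def circle_intersections(c1, c2):
    """Intersection points of two circles, each given as (x, y, r)."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d < TOL or d > r1 + r2 + TOL or d < abs(r1 - r2) - TOL:
        return []  # concentric, too far apart, or one circle inside the other
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)  # distance from c1 to the chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))   # half of the chord length
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ux, uy = (y2 - y1) / d, -(x2 - x1) / d     # unit vector along the chord
    return [(mx + h * ux, my + h * uy), (mx - h * ux, my - h * uy)]

def have_common_point(disks):
    """True if all closed disks (x, y, r) share at least one point."""
    # Candidate 1: the lowest point of every disk.
    candidates = [(x, y - r) for x, y, r in disks]
    # Candidate 2: every pairwise intersection point of the boundary circles.
    for c1, c2 in combinations(disks, 2):
        candidates.extend(circle_intersections(c1, c2))
    # The disks intersect iff some candidate lies in all of them.
    return any(
        all(math.hypot(px - x, py - y) <= r + TOL for x, y, r in disks)
        for px, py in candidates
    )
```

This sketch tests O(n²) candidates against n disks, so it takes O(n³) time; it is meant only to make the degenerate case concrete.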
Suppose we have constructed G_R on the bounding box of the disks, so that each grid cell has side length 1/4ε⁻¹, and we have approximated each disk by the set of grid points inside the disk; see Figure 10 for an illustration. It is easy to observe that for each disk we only need to consider the interior points that lie in convex position.

Lemma 11. By approximating the disks with a set of grid points, the diameter changes by a factor of at most 1 + O(ε).

Proof. Let d* denote the optimal diameter. The probability that the two points p and q determining d* snap to different grid cells on a randomly shifted grid is pr ≤ 2‖p − q‖/Δ′, where Δ′ is the cell length of G_R. If they lie in the same cell, their distance is not changed. Let p′ and q′ denote the snapped points of p and q on G_R; then we have ‖p′ − q′‖ ≤ ‖p − q‖ + 2‖p − q‖/Δ′ × 4/4ε⁻¹ Δ′ ≤ ‖p − q‖(1 + ε). Then we scale the distances on G_R by a factor 1/8ε⁻¹, so that the minimum distance between any two points on distinct disks is at least 2 (matching the lower bound on the size of the minimum diameter). Note that the approximation factor still equals 1 + O(ε).
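The discretization step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it lists the grid points inside one disk and keeps those in convex position, interpreted here as the vertices of their convex hull. The function names, the `shift` parameter (standing in for the random grid shift) and the floating-point tolerance are assumptions; the cell length is passed in as `cell` rather than derived from ε.

```python
import math

def grid_points_in_disk(cx, cy, r, cell, shift=(0.0, 0.0)):
    """Grid points (sx + i*cell, sy + j*cell) lying in the closed disk."""
    sx, sy = shift
    i_lo, i_hi = math.ceil((cx - r - sx) / cell), math.floor((cx + r - sx) / cell)
    j_lo, j_hi = math.ceil((cy - r - sy) / cell), math.floor((cy + r - sy) / cell)
    pts = []
    for i in range(i_lo, i_hi + 1):
        for j in range(j_lo, j_hi + 1):
            x, y = sx + i * cell, sy + j * cell
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r + 1e-12:
                pts.append((x, y))
    return pts

def convex_position(points):
    """Convex hull vertices (Andrew's monotone chain), collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        hull = []
        for p in seq:
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower, upper = half_hull(pts), half_hull(reversed(pts))
    return lower[:-1] + upper[:-1]
```

For a unit disk centered at the origin and `cell = 0.5`, the first function returns 13 grid points, of which four are in convex position. A disk that fits strictly inside one cell yields an empty list, which corresponds to the special case that the text handles via the nearest grid point of the cell.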
Figure 10: For each disk, the interior grid points in convex position (shown by cross) approximate the boundary of the disk. Also, the common tangent lines of disks determine the points approximating the extreme arcs (shown in red). The nearest neighbor (within the disks) of the touching points of the tangent lines are shown larger.
To be able to use Algorithm 1, we also need to compute the set of points approximating the extreme arcs. By computing the touching points of the tangent lines of the disks on Δ_D, we can compute the nearest neighbor of each touching point among the points approximating the disk. These nearest neighbors are the points approximating the extreme arcs on the grid; see Figure 10. Note that if a disk d_i is located entirely inside a grid cell, the grid point of that cell nearest to d_i determines the extreme arc of d_i. Since the size of the discrete set of points that approximates the disks is in O(n + ε⁻²), we can apply Algorithm 1 to the points approximating the extreme arcs in O(n² ε⁻²) time.

Theorem 2. Given a set of n disks, the solution to the problem of choosing a point on each disk such that the diameter of the resulting point set is as small as possible can be approximated within a factor 1 + O(ε) in expected O(n² ε⁻²) time.

5. Extension to Higher Dimensions

Finally, we note that we can obtain the same results for any fixed dimension d with a slight modification of our algorithm. For instance, in ℝ³, each ball d_i will have four neighbors (2d for general values of k) on Δ_D. Then we connect the points determined by the tangent planes of d_i and its neighbors (on Δ_D) to form a bounded face α_i with four vertices on d_i, which is a candidate for containing the optimal point of d_i. If D consists of unit spheres, obviously Lemma 1 generalizes to higher dimensions, since the total sum of the α_i's for i = 1, . . . , n equals the surface area of a unit sphere. Then we make a grid on each α_i, and the grid points approximate α_i. To do so, we consider a unit sphere U and the circumscribing hypercube C of side length twice the radius of U, centered at the center of U, where each of its sides is decomposed into grid cells of length 2ε/π. Then the intersection of U with the rays through the grid vertices of C and the center of U induces such a decomposition on U. Then again we transfer the grid points to the corresponding spheres.
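The cube-to-sphere construction just described can be sketched in ℝ³ as follows. This is a minimal illustration under stated assumptions: the function name `sphere_grid` is hypothetical, the cell length is passed in as the parameter `cell` instead of being derived from ε, and 2/cell is assumed to be close to an integer so that the cube edges split evenly.

```python
import math
from itertools import product

def sphere_grid(cell):
    """Project the grid vertices on the faces of the cube [-1, 1]^3
    radially onto the unit sphere centered at the origin."""
    k = max(1, round(2 / cell))               # number of cells per cube edge
    ticks = [-1 + 2 * i / k for i in range(k + 1)]
    pts = set()                               # a set removes vertices shared by faces
    for axis in range(3):                     # coordinate fixed on this pair of faces
        for sign in (-1.0, 1.0):
            for u, v in product(ticks, repeat=2):
                p = [u, v]
                p.insert(axis, sign)          # grid vertex on the cube surface
                norm = math.sqrt(sum(c * c for c in p))
                # The ray from the center through p meets the sphere at p / |p|.
                pts.add(tuple(round(c / norm, 12) for c in p))
    return sorted(pts)
```

With k cells per edge the cube surface carries 6k² + 2 distinct grid vertices, so `sphere_grid(1.0)` returns 26 points and `sphere_grid(0.5)` returns 98. Radial projection onto the inscribed sphere never lengthens a segment, which is why the cell size on U is bounded by the cell size on C.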
Then Algorithm 1 also generalizes: each point now has four possible directions in which to move among the points of the grid. Also, notice that the side length of each grid cell on U is at most ε (thus the diameter of each cell is also a function of ε). The extension of the other required lemmas is straightforward. For a set of arbitrary disks in higher dimensions, we can again generalize Lemma 10, i.e., we can find an upper bound on the smallest diameter from the solution of the smallest enclosing ball in linear time (since this problem is still an LP-type problem [6]); the extension of the algorithm is then the same as before.

6. Concluding Remarks

In this paper, we have given an FPTAS for approximating some bounds on a basic geometric measure, the diameter, with uncertain input, where previously only some PTASs were known. One of the most important open problems is to improve the running time of our algorithm so that it depends logarithmically on the accuracy, since the presented method would then always be more efficient than the other methods from convex optimization theory.

References
[1] P. K. Agarwal, S. Har-Peled, S. Suri, H. Yıldız, and W. Zhang. Convex hulls under uncertainty. In European Symposium on Algorithms, pages 37–48. Springer, 2014.
[2] P. K. Agarwal, S. Har-Peled, and K. R. Varadarajan. Approximating extent measures of points. Journal of the ACM, 51(4):606–635, 2004.
[3] P. K. Agarwal, N. Kumar, S. Sintos, and S. Suri. Range-max queries on uncertain data. In 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 465–476, 2016.
[4] H.-K. Ahn, S.-S. Kim, C. Knauer, L. Schlipf, C.-S. Shin, and A. Vigneron. Covering and piercing disks with two centers. Computational Geometry, 46(3):253–262, 2013.
[5] S. Bubeck. Convex optimization: Algorithms and complexity. Foundations and Trends in Machine Learning, 8(3-4):231–357, 2015.
[6] B. Chazelle and J. Matoušek. On linear-time deterministic algorithms for optimization problems in fixed dimension. Journal of Algorithms, 21(3):579–597, 1996.
[7] M. E. Consuegra and G. Narasimhan. Geometric avatar problems. In LIPIcs-Leibniz International Proceedings in Informatics, volume 24, 2013.
[8] E. Ezra and W. Mulzer. Convex hull of points lying on lines in o(n log n) time after preprocessing. Computational Geometry, 46(4):417–434, 2013.
[9] R. Fleischer and X. Xu. Computing minimum diameter color-spanning sets is hard. Information Processing Letters, 111(21-22):1054–1056, 2011.
[10] M. Ghodsi, H. Homapour, and M. Seddighin. Approximate minimum diameter. In 23rd International Conference on Computing and Combinatorics, pages 237–249, 2017.
[11] S. Har-Peled. Geometric approximation algorithms. Number 173. American Mathematical Soc., 2011.
[12] M. Held and J. S. Mitchell. Triangulating input-constrained planar point sets. Information Processing Letters, 109(1):54–56, 2008.
[13] L. Huang, J. Li, J. M. Phillips, and H. Wang. epsilon-kernel coresets for stochastic points. In LIPIcs-Leibniz International Proceedings in Informatics, volume 57, 2016.
[14] W. Ju, C. Fan, J. Luo, B. Zhu, and O. Daescu. On some geometric problems of color-spanning sets. Journal of Combinatorial Optimization, 26(2):266–283, 2013.
[15] W. Ju, J. Luo, B. Zhu, and O. Daescu. Largest area convex hull of imprecise data based on axis-aligned squares. Journal of Combinatorial Optimization, 26(4):832–859, 2013.
[16] P. Kamousi, T. M. Chan, and S. Suri. Stochastic minimum spanning trees in Euclidean spaces. In 27th Annual Symposium on Computational Geometry, pages 65–74. ACM, 2011.
[17] M. R. Kazemi, A. Mohades, and P. Khanteimouri. Approximation algorithms for color spanning diameter. Information Processing Letters, 135:53–56, 2018.
[18] C. Knauer, M. Löffler, M. Scherfenberg, and T. Wolle. The directed Hausdorff distance between imprecise point sets. Theoretical Computer Science, 412(32):4173–4186, 2011.
[19] H. Kruger and M. Löffler. Geometric measures on imprecise points in higher dimensions. In 25th European Workshop on Computational Geometry, pages 121–124, 2009.
[20] M. Löffler. Data imprecision in computational geometry. PhD thesis, Utrecht University, 2009.
[21] M. Löffler and M. van Kreveld. Largest and smallest convex hulls for imprecise points. Algorithmica, 56(2):235–269, 2010.
[22] M. Löffler and M. van Kreveld. Largest bounding box, smallest diameter, and related problems on imprecise points. Computational Geometry, 43(4):419–433, 2010.
[23] N. H. Mustafa and K. R. Varadarajan. Epsilon-approximations and epsilon-nets. arXiv preprint arXiv:1702.03676, 2017.
[24] Y. Myers and L. Joskowicz. Uncertain geometry with dependencies. In 14th ACM Symposium on Solid and Physical Modeling, pages 159–164, 2010.
[25] T. Nagai, S. Yasutome, and N. Tokura. Convex hull problem with imprecise input and its solution. Systems and Computers in Japan, 30(3):31–42, 1999.
[26] J. Sember and W. S. Evans. Guaranteed Voronoi diagrams of uncertain sites. In 20th Canadian Conference on Computational Geometry, 2008.
[27] F. Sheikhi, A. Mohades, M. de Berg, and A. D. Mehrabi. Separability of imprecise points. Computational Geometry, 61:24–37, 2017.
[28] S. Suri, K. Verbeek, and H. Yıldız. On the most likely convex hull of uncertain points. In European Symposium on Algorithms, pages 791–802. Springer, 2013.
[29] J. Xue, Y. Li, and R. Janardan. On the expected diameter, width, and complexity of a stochastic convex hull. Computational Geometry, 82:16–31, 2019.
[30] D. Zhang, Y. M. Chee, A. Mondal, A. K. Tung, and M. Kitsuregawa. Keyword search in spatial databases: Towards searching by document. In Data Engineering, pages 688–699, 2009.
Appendix A. Our algorithm vs. the algorithms based on the standard optimization techniques

We first compare our algorithm with the ellipsoid method, which has the best running-time guarantee among the standard optimization techniques. In this method, an ellipsoid E contains the feasible solutions, and over a finite number of iterations E is shrunk, so that in the last iteration it contains just the optimal solution. The method needs to check all the constraints in each iteration in order to shrink the ellipsoid containing the feasible solutions, and it is only proved that its running time is polynomial in the dimension of the input (however, its dependency on the required accuracy is logarithmic in ε⁻¹). Consequently, in our problem, the ellipsoid method has an asymptotically quartic dependence on the input size, since O(n² log ε⁻¹) is an estimate for the number of iterations of the ellipsoid algorithm [5]. Moreover, this method is good only in theory, since its efficiency depends heavily on the stability of the input. Knowing that the running time of our algorithm is quadratic in the input size and linear in the required accuracy ε⁻¹, solving n⁴ log ε⁻¹ − n² ε⁻¹ = 0 determines for which values of n and ε each method is more efficient (theoretically); e.g., for n > 20 and ε = 10⁻³ our algorithm is significantly faster than the ellipsoid method. Furthermore, the performance of our algorithm can always be guaranteed in practice.

Notice that there are similar difficulties in applying other optimization methods; e.g., the simplex method has good practical performance but can exhibit exponential behavior in theory. The barrier method comes without quantitative estimates; it is only proved that the algorithm has polynomial convergence in the dimension of the problem and in the required accuracy. Moreover, this method has the problem of computing a good penalty function, etc.
Furthermore, if we scrutinize the space complexity, we see that our method uses linear space, whereas in our problem all the possible optimization methods need at least quadratic space.
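The break-even comparison in this appendix can be reproduced with a short sketch. It evaluates the two cost expressions n⁴ log ε⁻¹ and n² ε⁻¹ with all hidden constants set to 1, which is an assumption of this illustration, as are the function names and the choice of logarithm base (base 10 by default).

```python
import math

def ellipsoid_cost(n, eps, log=math.log10):
    """Cost model n^4 * log(1/eps) for the ellipsoid method (constants set to 1)."""
    return n ** 4 * log(1 / eps)

def fptas_cost(n, eps):
    """Cost model n^2 / eps for the presented algorithm (constants set to 1)."""
    return n ** 2 / eps

def break_even_n(eps, log=math.log10):
    """Smallest n for which the n^2 / eps model is strictly cheaper."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    n = 1
    while ellipsoid_cost(n, eps, log) <= fptas_cost(n, eps):
        n += 1
    return n
```

Under these assumptions, `break_even_n(1e-3)` gives 19 with the base-10 logarithm and 13 with the natural logarithm (`log=math.log`), which is in line with the n > 20 figure quoted above for ε = 10⁻³.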
Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.