JOURNAL
OF ALGORITHMS
7, @l-h&
(1986)
Point Retrieval for Polygons MICHAEL S. PATERSON* Department
of Computer
Science,
University
of Warwick,
Coventry,
England
AND
F. FRANCES YAO Xerox
Palo Alto Research
Center,
Palo Alto, California
Received June I1985
We present a solution to the following polygon retrieval problem: given a set of n points on the plane, build a data structure so that for any query polygon P the Set of points lying in P can be retrieved efficiently. Our solution uses space O( n*) and reports the s points lying in a k-sided polygon P in time O(k log n + s). 0 1986 Academic
Press. Inc.
1. INTRODUCTION We consider the following polygon retrieval problem: given a set X of n points on the plane, build a data structure so that for any query polygon P, the set of s points lying in P can be retrieved efficiently. D. Willard was the first to consider this problem, and he gave a solution which required O(n) space, and O(n’.” + s) time [WI. Edelsbrunner and Welzl refined Willard’s scheme to reduce the query time to O(n0.69 + S) [EW]. Methods which achieve faster query time at the expense of more storage space have also been studied. An O(n’) space and O(log n + S) time solution was given by Edelsbrunner, Kirkpatrick, and Mauer [EKM]. More recently, Cole and Yap ICY] obtained algorithms with O(n2+‘/log n) space and O(log n log(l/<)) time for any c > 0. In particular, with O(n’/log n) space O(log n log log n + S) time is achievable. *Support was provided in part by an Information Technology Senior Fellowship from the Science and Engineering Research Council. This work was done while the first author was visiting Xerox PARC.
441
01%-6774/86 $3.00 Copyright 6 19R6 by Academic Press. Inc. All rights of reproduction in any form reserved.
442
PATERSON AND YAO
In this paper we give a solution to the polygon retrieval problem with O(log n) query time, O(n*) space and preprocessing time. In Section 2, we give a brief discussion of the point-line duality on the plane since our method employs the dual view of the point set as a set of lines (cf. [C, CGL]). The description of our solution consists of two parts: first we show how to partition a query polygon into simple “quads” in Section 3; we then give a data structure for efficient retrieval of the points lying in a quad in Sections 4 and 5. The overall space and time requirements of our algorithm are analyzed in Section 6.
2. DUALITY We shall use the correspondence of duality between points and lines in the projective plane. In homogeneous coordinates, the point (a, b, c) and the line (ax + by + cz = 0) are dual to each other for all (a, b, c) # (0, 0,O). The following pairs are then dual concepts: points on a line line segment set of lines intersecting a line segment
* t) c)
lines through a point “double wedge” of lines (Fig. 1) set of points in a double wedge
One can represent the projective plane as the ordinary Euclidean plane augmented with “points at infinity” (dual to lines through the origin) and a “line at infinity” (dual to the origin point) as follows:
Point Point Line Line
Projective plane (a, b, c), c + 0 (a, b, 0) ax + by + cz = 0, (a, 6) # (0,O) z= 0
FIG. 1.
Double
Augmented Euclidean plane Point (a/c, b/c) Point at infinity (a, b) Line ax + by + c = 0 Line at infinity
wedge.
POINT
FIG.
3.
RETRIEVAL
FOR
POLYGONS
443
2. Decomposition of polygon by sectors
DECOMPOSITION
OF
QUERY
REGION
We will describe our search procedure in the case that the query region P is a convex polygon, not necessarily bounded. For arbitrary regions with a piecewise linear boundary our procedure is still effective after a suitable decomposition is carried out. In such a general case, however, the form of presentation of the query region becomes of more concern and the decomposition procedure is less straightforward. Let the query region P be an intersection of half-planes and be neither empty nor the entire plane. We suppose that P is represented as a minimal sequence of half-planes (Ht, H,, . . . , Hk) given in cyclic order by their equations. The boundary of P consists of a sequence of edges, where each edge Ei is part of the bounding line of some half-plane Hi. The vertices of P are the (finite) points where two adjacent edges meet. We shall separate P into simpler regions by cutting up the plane into sectors. The sectors are obtained by drawing semi-infinite rays from the origin 0 through each vertex of P (Fig. 2). When a bounding edge of P is infinite in some direction, we introduce a ray parallel to the edge in that direction. The resulting partition of the plane into sectors decomposes P into subregions called quua!r (Fig. 3). Each quad has a particularly simple form which permits an efficient search algorithm using the duality correspondence described above.
FIG. 3.
Quad.
444
PATERSON AND YAO
(i) Any quad is the intersection of the sector containing it and at most two half-planes Hi and HI. (ii) The bounding lines of Hi and Hi do not meet in the interior of the sector. (iii) No sector containing a quad subten& a reflex (i.e. concave) angle. LEMMA 1.
Proofi To prove (i), let E be an edge of the quad which is not a ray of the sector. Since by construction there can be no vertex in the interior of the sector, E either ends on a ray or extends to infinity. In the latter case the direction must be parallel to a bounding ray and we can regard E as ending on this ray at a point at infinity. In this sense E extends from one bounding ray to the other. Therefore any ray r in the interior of the sector must meet E at an interior point of E. Since any line can intersect the interior of at most two of the edges of the convex polygon P, property (i) is proved. Properties (ii) and (iii) are immediate since any nonray edge E of the quad extends from one bounding ray to the other. 0 Our retrieval algorithm makes use of the representation described below for the query polygon, A double wedge is the region which is in either both or none of two half-planes. THEOREM 1. Any region which is the intersection of k half-planes is the disjoint union of at most k + 1 subregions, each of which is the intersection of a sector and a double wedge.
Proof: At most k + 1 sectors are created by rays from 0. Consider one of the quads generated. If it has two nonray edges then the double wedge required is given by the two corresponding half-planes, and we have only to show that the second component of this double wedge does not also intersect the sector. Suppose c were an interior point of this second component also within the sector. Take a, b as points on the two edges contained in the sector. Since the sector is convex (Lemma l(G)) the interior of the triangle defined by a, b, c is contained within the sector. The center of the double wedge lies in this triangle but is not in the sector (Lemma l(ii)). The contradiction proves this case. If the quad has only one nonray edge or none, then we can use either or both respectively of the half-planes defining the sector to form the required double wedge. The proof continues as in the previous case. q
4. DATA REPRESENTATION We are given a set of n points in the plane. To set up the data structure for our retrieval algorithm, the points are first sorted according to their
POINT
RETRIEVAL
FOR
POLYGONS
445
polar angles at the origin. We will label the points so that the ith point has polar angle ei and 0 5 8, s 8, I . . . I 6, < 27r. We set up a balanced binary search tree of depth [ logn] with the n points as leaves in this order. At the root of each subtree is stored the interval of angles [ei, 0,] corresponding to its leaves. A retrieval restricted to points with polar angles in the range [a, /I] is accomplished as follows. Firstly, binary searches with ct and p are performed in the tree to find p, q such that p = min{ i/a I 8,} and q = max( j]$ < /I }. The indices p and q determine a set of at most 2 log n disjoint subtrees whose leaves represent exactly those points in the range [(Y, B]. The required retrieval can then be made in the sets corresponding to each of these subtrees in turn. In summary, the data points are ordered by polar angle, and separate data structures are set up for the whole set, the first half, the second half, each of the four quarters, and so on. Any sector is a disjoint union of at most 2 log n of these basic sectors. Given a polygonal query region P with k sides, it is decomposed into O(k) quads for separate retrievals. Each quad Qi is the intersection of a sector Si and a double wedge Di. Decomposition of each sector Si into basic sectors { Sij} yields a total of 0( k log n) disjoint subregions for P, each of which is the intersection of a basic sector Sij and a double wedge Di. The remaining problem is thus one of reporting all points of a basic sector which lie within a query double wedge. If we represent the points of a basic sector dually as a set L of lines, then by the correspondence mentioned in Section 2 the problem becomes that of finding all lines of L which intersect a query line segment. We consider the latter problem in the next section.
5. LINE
GRAPHS
AND
SEGMENT
QUERIES
A set L of n infinite lines on the plane gives rise to a planar graph G, called the line graph of L. It was shown in Chazelle et al. [CGL] that G can be set up in O(n*) time given L as input. Chazelle [C] gave an algorithm which for any query line segment AB can report in O(log n + s) time the set of s edges in G that intersect AB. He also noted the correspondence of a line segment with a double wedge. To make our presentation self-contained, we include below a description of Chazelle’s algorithm. A proof of its correctness is given in Chazelle [Cl. We assume that the line graph G is represented in a suitable data structure so that in O(log n) time one can perform point Zocation, i.e. find the face of G that contains any query point (see, e.g., Edelsbrunner et al. [EGS]). We also assume that associated with each face f is stored its rightmost vertex u( f ), and that the vertices on the boundary of f are stored
446
PATERSON
AND
YAO
in cyclic order to permit both clockwise and counterclockwise traversals starting from v(f). To report all edges of G that intersect a segment AB, we first perform a point location to identify the face f in G which contains the point A. We traverse the boundary of f by going initially to its rightmost vertex v(f) and then walking either clockwise or counterclockwise depending on whether the points (A, v( f ), B) are in clockwise or counterclockwise cyclic order. Eventually one of two cases must happen after we visit v1 = u(f), 9, - - *, vi, vi+1 on f’s boundary: (a) the edge u~v~+~ intersects AB, or (b) the line supporting the edge vivj+ 1 does not intersect AB. In case (a), we report the intersection of oivi+ i with AB, move to the face f’ on the other side of edge vivi+ t, and repeat the traversal for f ‘. When case (b) happens (i must be l), the current face is called the failure face for A, denoted by faif( A). After faU( A) is encountered, we switch to the vertex B and find edges intersecting BA in the same manner as before. We first locate the face containing B and then traverse the boundary of successive faces f in the orientation of (B, u( f ), A). When the face faU( A) is reached we are finished. It is shown in Chazelle [C] that fad(A) is always reached before case (b) occurs in this direction. Chazelle also proves that the total number of edges visited during the search is O(s). Hence the running time is O(log n + 3).
6. TIME
AND
SPACE
ANALYSES
Since the time required for each double wedge query in a basic sector is O(log n + s) where s is the number of points reported, a simple estimate for the total time for a polygonal query is O(k(log n)2 + s). An improvement to O(k log n + s) can be made by augmenting the tree of line graph structures in a simple way. We introduce a pointer from each face f of each line graph G to the face containing f in the line graphs associated with the two immediate subtrees of G. Now any point location problem is solved in O(log n) time in the line graph at the root of the tree and the solution is propagated by the pointers to each of the O(log n) subtrees at which it is required. The search time for each quad is reduced to O(log n + s) thereby. The space and preprocessing time for setting up the tree of line graph structures is
o(n2+2(J2+4($)‘+ . ..) = O(d). The additional time required to preprocess the line graph at the root of the tree to allow efficient point location is also O(n*) (see Edelsbrunner et al. [EGS]).
POINT RETRIEVAL
FOR POLYGONS
447
THEOREM 2. Giuen n points in the plane, a data structure can be set up in time and space O(n’) so that the set of s points within a query polygon (bounded or unbounded) with k sidesmay be reported in time 0( k log n + s). REFERENCES
[Cl
WLI WI
WY WMI
WJI WI
B. CHAEELEE, “Intersecting is Easier than Sorting.” Proceedings, 16th ACM Annual Symposium on Theory of Computing (1984), pp. 125-134. B. CHAEELEE, L. GUIBAS, AND D. T. LEE, “The Power of Duality,” Proceedings, 24th IEEE Ammal Symposium on Foundations of Computer Science (1983) pp. 217-225. R. COLE, C. K. YAP, “Geometric Retrieval Problems,” Proceedings, 24th IEEE Annual Symposium on Foundations of Computer Science (1983), pp. 112-121. To appear in Inform. and Control. H. EDELSBRUNNER,L. GUIBAS, AND J. STOLFI, “Optimal Point Location in a Monotone Sub-division,” Tech. Rep. 2, DEC Systems Research Center, 1984. H. EDELSBRUNNER,D. KIRKPATRICK, AND H. MAUER, Polygonal intersection searching, Inform. Process. Left. 14 (1982), 74-79. H. EDELSBRUNNERAND E. WELZL, “HaIfplanar Range Search in Linear Space and 0( n”.695) Query Time,” Tech. Rep. Fill, University of Gras. D. WILLARD, Polygon retrieval, SIAM J. Comput. 14 (1982). 149-165.