Distributed boundary representation and Boolean operations on a massively parallel computer





Computer-Aided Design, Vol. 26, No. 8, pp. 631-649, 1996. Copyright © 1996 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.


K C Hui and Y M Kan

Boolean operations are usually the most expensive task in solid modelling. A union, difference or intersection operation requires a considerable amount of time, especially when an object grows in complexity after a series of operations. Parallel processing is a promising technique for speeding up the manipulation of complex solid objects. This paper presents a technique for performing Boolean operations in a distributed environment. In this method, a distributed boundary representation scheme is adopted for representing objects in the processor array of a SIMD computer. Solid objects with non-manifold boundaries are assumed and are modelled with a cell-complex based boundary representation. The concept of the half-wedge is employed to allow easy retrieval of neighbourhood (topological) information in a distributed processing environment. Parallel algorithms for obtaining the complement of solids are presented. Algorithms for evaluating the intersecting regions, and hence the result of Boolean operations between objects, are also discussed. The proposed method is implemented on a DECmpp 12000/Sx machine with 8K processors. Test results on the experimental system show that the method is efficient for a test piece with 1300 or more entities. The actual performance of the system may also depend on the relative sizes of the entities involved. Copyright © 1996 Elsevier Science Ltd.

Keywords: Booleans, non-manifold models, array processing

Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
Paper received: 27 October 1994. Revised: 25 April 1995

INTRODUCTION

Complex solid models are usually built up from simple geometric entities (edges, faces) or constructed through Boolean operations on simple primitive solids. While current sequential machines are capable of handling objects with moderately complex geometry, they are unlikely to give the required performance in handling objects with complex shapes. Manipulating solid objects on a parallel machine is a promising technique for improving performance.

Yamaguchi and Tokieda [1,2] introduced special-purpose 4 × 4 Determinant Processors for handling Boolean operations on solids. Ellis et al. [3] implemented the Ray Casting Engine, a parallel machine built specifically for handling solids in ray representation. Holliman et al. [4] proposed Mistral-3, a parallel solid modeller implemented on transputers for creating, ray-tracing and incrementally updating octree-based spatial data structures.

Another approach is to use general-purpose parallel computers to speed up Boolean operations. Akman et al. [5] and Narayanaswami [6] proposed parallel algorithms that make use of the uniform grid technique for Boolean operations on polyhedral objects. Strip and Karasick [7] introduced the parallel directed edge (D-edge) to represent solid objects on a Connection Machine. The D-edge is an extension of the Star-edge, which was proposed by Karasick [8] for representing solids with non-manifold boundaries. In a D-edge representation, a series of D-edges is associated with an edge, and each D-edge is associated with a face incident to the edge. The direction of a D-edge is positive if the interior of the face is to its right, and negative if the interior of the face is to its left. D-edges on an edge are ordered radially on a plane perpendicular to the edge (Figure 1), so neighbourhood information can be inferred from their radial order. In the parallel D-edge representation, D-edges are distributed among different processors, and a label associated with each D-edge indicates the processor in which it resides. Processors containing the D-edges required for a certain operation can thus be selectively activated.

Figure 1 D-edge representation: (a) a non-manifold edge; (b) the corresponding D-edges

Recently, Bajaj and Dey [9] applied a Multiple-Instruction-Multiple-Data (MIMD) machine to the evaluation of Boolean operations on two polyhedral objects with manifold boundaries. One approach to handling objects with non-manifold boundaries is to use a cell-complex [10] based boundary representation scheme. This scheme is extended here for implementation on a massively parallel computer, as discussed in the following sections.

CELL-COMPLEX BASED BOUNDARY REPRESENTATION

A solid is usually defined as a regular point set (r-set) in 3D Euclidean space. Early solid modelling systems were usually restricted to the modelling of r-sets. In particular, systems with boundary representation were generally confined to the modelling of solids with manifold boundaries. Recent developments in solid modelling tend to relax this constraint to allow the modelling of non-manifold objects. Rossignac and Requicha [11] proposed the use of Constructive Non-Regular Geometry (CNRG) to manipulate decompositions of point sets. CNRG represents objects in a tree-like structure similar to a CSG tree. This makes CNRG a good candidate for implementation on a MIMD parallel machine, where different processors may concurrently perform the different operations corresponding to different nodes of the CNRG tree; this may not be possible on a Single-Instruction-Multiple-Data (SIMD) machine.

Another approach to representing non-manifold objects is based on the assumption that an object is a cell complex [10]. A cell complex is a subset of $E^3$ consisting of a union of disjoint n-cells which are also subsets of $E^3$. An n-cell is homeomorphic to an n-dimensional open sphere. A solid object in $E^3$ is thus a union of disjoint 0-cells, 1-cells, 2-cells and 3-cells, corresponding to vertices, interiors of edges, interiors of faces and interiors of solids respectively. For example, an edge is the union of two 0-cells and a 1-cell, denoting the two endpoints and the interior of the edge respectively.
There are two major tasks in evaluating a Boolean operation between two objects. Firstly, intersections between the 0-cells, 1-cells, 2-cells and 3-cells of the two objects are located. Secondly, the results of the intersection are grouped to form the cell complex of the final object. These processes can be performed in parallel since the intersection between two cells is independent of other cells, and are thus suitable for implementation on a SIMD parallel machine.

In a cell-complex based B-rep, a solid is a group (or collection) of cells. Each of these cells is a volume (3-cell) bounded by shells composed of vertices, edges and faces (a collection of 0-cells, 1-cells and 2-cells). Intuitively, a cell is an r-set and a solid is a collection of disjoint r-sets which may touch at their vertices or edges. A face is in turn bounded by loops of edges, whereas an edge is bounded by its two vertices. To denote neighbourhood (or topological) information around an edge, the concept of the wedge is employed. A wedge is defined as the simplices incident to an edge (Figure 2b).

In the current implementation, a solid is represented as a collection of cells. Each cell is represented as a collection of shells. A shell is represented as a collection of faces. A face is represented by its geometric information as well as one or more loops of wedges bounding the face. A wedge is associated with an edge. The vertices bounding the edge, together with the two faces incident to the edge, give a description of the wedge (Figure 2b). In order to represent loops of wedges bounding a face, a wedge is represented as two half-wedges (Figure 2c). Each half-wedge is associated with an edge. A half-wedge is specified by a face and a direction denoting the orientation of the edge in a loop lying on the face. Each half-wedge is also associated with the other half-wedge of the wedge, so that the two half-wedges form a mating pair. The half-wedges of a shell thus form a graph, and all entities of a shell can be extracted by traversing this graph. In addition, the mating information between half-wedges enables easy retrieval of neighbourhood information around an edge. This allows different wedges to be processed concurrently in a distributed processing environment. To represent neighbourhood information around a vertex, the concept of the bundle is employed.
A bundle is associated with a vertex and denotes the collection of edges incident to the vertex (Figures 2d and e). A cell-complex based boundary representation is thus composed of a hierarchy of cells, shells, faces, wedges and bundles, in addition to the geometric information of vertices, edges and faces. In a polyhedral object, all edges are straight lines defined by their bounding vertices (or bundles), and all faces are planar surfaces defined by their loops of bounding edges (or half-wedges). Hence, half-wedges, bundles and vertex coordinates are the basic elements in representing the topological and geometric information of a polyhedral object.
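The half-wedge description above can be sketched as a small data structure. This is only an illustration under stated assumptions: the field names (`edge`, `face`, `direction`, `next`, `mate`), the helper `build_shell` and the tetrahedron test shell are hypothetical, and the pairing logic handles manifold edges only (one mating pair per edge), whereas the paper also covers non-manifold edges.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple


@dataclass
class HalfWedge:
    edge: Tuple[int, int]       # edge the half-wedge lies on (sorted vertex ids)
    face: int                   # face whose loop contains this half-wedge
    direction: int              # +1 / -1: orientation of the edge within the loop
    next: int                   # index of the following half-wedge in the loop
    mate: Optional[int] = None  # index of the other half-wedge of the wedge


def build_shell(face_loops: Sequence[Sequence[int]]) -> List[HalfWedge]:
    """Create half-wedges for a shell given one vertex loop per face."""
    hws: List[HalfWedge] = []
    by_direction = {}
    for face, loop in enumerate(face_loops):
        base, n = len(hws), len(loop)
        for j in range(n):
            a, b = loop[j], loop[(j + 1) % n]
            hws.append(HalfWedge((min(a, b), max(a, b)), face,
                                 1 if a < b else -1, base + (j + 1) % n))
            by_direction[(a, b)] = base + j
    # Manifold assumption: the mate is the oppositely directed half-wedge.
    for (a, b), i in by_direction.items():
        hws[i].mate = by_direction.get((b, a))
    return hws


def shell_faces(hws: List[HalfWedge], start: int = 0) -> Set[int]:
    """Traverse the half-wedge graph via next/mate links and collect faces."""
    seen, stack, faces = set(), [start], set()
    while stack:
        i = stack.pop()
        if i is None or i in seen:
            continue
        seen.add(i)
        faces.add(hws[i].face)
        stack.extend((hws[i].next, hws[i].mate))
    return faces


TETRA = [(0, 2, 1), (0, 1, 3), (1, 2, 3), (0, 3, 2)]  # consistently oriented faces
shell = build_shell(TETRA)
```

Traversing from any half-wedge of `shell` reaches all four faces, which mirrors the statement that all entities of a shell can be extracted from the half-wedge graph.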

Figure 2 Wedges of a non-manifold edge: (a) object with a non-manifold edge; (b) wedges incident to the non-manifold edge; (c) representing wedges with half-wedges; (d) a bundle associated with a vertex; (e) representing a bundle with a list of half-wedges incident to a vertex

DISTRIBUTED BOUNDARY REPRESENTATION

As discussed in the previous section, an object is represented as a hierarchy of cells, shells, faces, wedges and bundles. In the proposed distributed boundary representation, this information is distributed among the memories of different processors so that it can be manipulated concurrently. In the following sections, a distributed-memory SIMD machine is assumed; such a machine usually allows a group of processors containing the required data to be selectively activated. Hence, not all of the entities in the hierarchy have to be stored explicitly. Only half-wedges and bundles are stored, in addition to the geometric information of vertices, edges and faces. Identification numbers (labels) are associated with each half-wedge, indicating the shell and cell to which the half-wedge belongs. This allows processors containing half-wedges with the same shell or cell identification to be activated selectively.

Figure 3 shows the distribution of the half-wedges of an object in an array of processors. In this example, the object (Figure 3a) is composed of two cells (Cell$_1$ and Cell$_2$) that touch at an edge. Cell$_1$ is bounded by two shells (Shell$_{11}$ and Shell$_{12}$), whereas Cell$_2$ is bounded by one shell (Shell$_{21}$) only. Loops of half-wedges of Shell$_{11}$ are shown in Figure 3b. The distribution in an array of processors of the half-wedges $w_{i,j}$ (where $i$ denotes the loop number and $j$ denotes the number of the half-wedge within the loop) of two of the loops of Shell$_{11}$, one of which is Loop$_3$, is illustrated in Figure 3c. Each of the squares in Figure 3c represents the memory of one processor in the array. Half-wedges in a loop bounding a face are stored in consecutive processors, forming a loop cluster of processors, as indicated by the contiguous squares containing the half-wedges of the two loops. Loop clusters of the same shell are also stored in consecutive processors. Shells of the same cell are in turn stored in consecutive clusters of processors. In each half-wedge, the mating relation between the half-wedges of the same wedge is stored as a processor identification number indicating the processor containing the mating half-wedge. An edge entry is also used to specify whether an edge is a manifold edge or not, as discussed in the section on representing edges and bundles.
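The storage layout and selective activation described above can be imitated on a sequential machine. In the following minimal sketch each list index stands for one processor; the record fields, the `layout` and `active_mask` helpers and the small example object are assumptions made for illustration, and the mating processor number is not modelled.

```python
def layout(cells):
    """Place half-wedges in consecutive 'processors' (list slots).

    cells: {cell_id: {shell_id: [loop, ...]}}, each loop a list of half-wedge
    names. Loops of a shell, and shells of a cell, end up contiguous.
    """
    pes = []
    for cell_id, shells in cells.items():
        for shell_id, loops in shells.items():
            for loop in loops:
                for j, name in enumerate(loop):
                    pes.append({
                        "hw": name,
                        "cell": cell_id,      # label used for selective activation
                        "shell": shell_id,    # label used for selective activation
                        "loop_end": j == len(loop) - 1,  # flag marking the loop end
                    })
    return pes


def active_mask(pes, **labels):
    """One boolean per processor: True where all given labels match."""
    return [all(pe[key] == value for key, value in labels.items()) for pe in pes]


# Hypothetical object: Cell 1 with shells 11 and 12, Cell 2 with shell 21.
obj = {
    1: {11: [["w0,0", "w0,1", "w0,2"], ["w1,0", "w1,1", "w1,2"]],
        12: [["w2,0", "w2,1", "w2,2"]]},
    2: {21: [["w3,0", "w3,1", "w3,2"]]},
}
pes = layout(obj)
shell_11 = active_mask(pes, shell=11)
cell_1 = active_mask(pes, cell=1)
```

Because clusters are contiguous, each mask is a single run of active processors, which is what makes label-based activation cheap on a SIMD array.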

Representing faces

In Cell$_1$ of the previous example (Figure 3b), a face is bounded by a single loop of half-wedges. However, a face may be bounded by a number of loops, as shown by the top face of Cell$_1$ in Figure 3a. In order to cope with multiple loops bounding a face, all loops bounding a face are stored as attributes of the face. Faces are stored in a different cluster of processors. An array (Loop($i$), where $i = 0, \ldots, n-1$ and $n$ is the number of loops) of loop attributes is associated with each face. Each element of the array contains the identification of the loop (Loop($i$).id) and the address of the processor (Loop($i$).processor) containing the starting half-wedge of the loop. Figure 4 depicts the association of loops with different faces (denoted as $f_i$). Since half-wedges are stored in consecutive processors, all half-wedges of a loop, and hence all loops bounding a face, can be easily identified.

Figure 3 Distribution of half-wedges in an array of processors: (a) an example object with non-manifold boundary; (b) half-wedges of Shell$_{11}$; (c) distribution of the half-wedges of the example object in an array of processors (panel (c) marks the processors containing half-wedges of each cell and the inactive processors)
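The face-to-loop association can be sketched as follows. The starting processors 31 and 37 for Loop 4 and Loop 5 follow the labels that survive from Figure 4; the array size, the length of Loop 5 and the helper names are assumptions for illustration only.

```python
def make_array(size, loops):
    """Build a processor array; loops maps loop_id -> (start_processor, length)."""
    pes = [None] * size  # None marks a processor holding no half-wedge
    for loop_id, (start, length) in loops.items():
        for j in range(length):
            pes[start + j] = {"hw": f"w{loop_id},{j}", "loop_end": j == length - 1}
    return pes


def loop_half_wedges(pes, start):
    """Read consecutive processors from `start` until the loop-end flag."""
    out, p = [], start
    while True:
        out.append(pes[p]["hw"])
        if pes[p]["loop_end"]:
            return out
        p += 1


# Loop 4 occupies processors 31..36; Loop 5 starts at 37 (length 4 is assumed).
pes = make_array(41, {4: (31, 6), 5: (37, 4)})

# Face record with its array of loop attributes: Loop(i).id and Loop(i).processor.
face = {"id": "f", "loops": [{"id": 4, "processor": 31},
                             {"id": 5, "processor": 37}]}

boundary = {lp["id"]: loop_half_wedges(pes, lp["processor"]) for lp in face["loops"]}
```

Only the starting processor of each loop needs to be stored with the face, since the rest of the loop is found by walking consecutive processors.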

Representing edges and bundles

In general, a wedge provides no information on the geometry of an edge (e.g. whether an edge is a curve or a straight line). For polyhedral objects, the geometry of an edge is defined by its bounding vertices, so no explicit geometric information on an edge is required. When dealing with non-manifold solid objects, a series of wedges may be associated with an edge. Hence, it is essential to keep track of the list of wedges incident to an edge so that neighbourhood information around the edge can be easily retrieved. This is implemented as an array associated with each edge, denoting the radial order of the half-wedges incident to the edge.

A bundle is represented as a list of half-wedges incident to a vertex. Bundles are manipulated through extended Euler operations, which affect only one or two entities (e.g. vertices, edges) at a time. In the proposed algorithms for Boolean operations, extended Euler operations are not used in constructing the final solid. It follows that parallel operations on bundles are not required. Bundles are thus handled with sequential procedures and are not distributed to different processors. In the current implementation, bundles are stored in a sequential machine, which is also the front-end workstation connected to the massively parallel computer.

Figure 4 Storing face information in an array of processors: (a) faces in an array of processors; (b) the corresponding loops of half-wedges in the loop clusters of processors (in the example shown, Loop(0).processor = 31 and Loop(1).processor = 37 give the processors holding the starting half-wedges of Loop 4 and Loop 5 respectively)

BOOLEAN OPERATIONS

A Boolean operation of union, intersection or difference can always be performed by using complement and intersection only. The complement of an object is required only if it is to be subtracted from another. However, the Boolean intersection of two objects has to be computed for all types of Boolean operation. In a union operation, the intersection of the two objects is discarded, whereas in a difference or intersection operation, the intersecting region is retained.
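The claim that complement and intersection suffice can be checked on plain point sets. The sketch below uses the standard set identities (difference as intersection with a complement, union via De Morgan's law) on a small finite universe; it illustrates the identities only and is not the boundary-based procedure of the paper, which handles union by discarding the intersecting region.

```python
UNIVERSE = frozenset(range(10))  # assumed finite stand-in for space


def complement(s):
    return UNIVERSE - s


def intersection(a, b):
    return a & b


def difference(a, b):
    # A - B = A ∩ complement(B)
    return intersection(a, complement(b))


def union(a, b):
    # De Morgan: A ∪ B = complement(complement(A) ∩ complement(B))
    return complement(intersection(complement(a), complement(b)))


A = frozenset({1, 2, 3, 4})
B = frozenset({3, 4, 5})
```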

Evaluating the complement of a solid

To obtain the complement of an object, the orientation of all faces and the order of the half-wedges in all loops of the faces have to be reversed. Reversing the orientation of all faces is straightforward, as it only requires reversing the direction of the face normals. The major task involved is thus to reverse the order of the half-wedges within the same loop cluster of processors. A parallel algorithm is developed to reverse the order of half-wedges in different loops concurrently. Assuming the half-wedges $w_{i,j}$ ($j = 0, \ldots, k_i$) in a loop (Loop$_i$) are distributed in consecutive processors $P_{i,j}$, and there are $n$ loops in the object (i.e. $i = 0, \ldots, n$), the algorithm for reversing the order of half-wedges is listed in Algorithm 1.

Algorithm 1: Reverse loop order
    for i = 0 to n do in parallel
        m = k_i / 2;
        /* Transmit half-wedges in the first half of the loop to their target processors */
        for j = 0 to m do in parallel
            n_p = k_i - 2j;
            Transmit w_{i,j} through n_p processors;
        end for
        /* Transmit half-wedges in the second half of the loop to their target processors */
        for j = m + 1 to k_i do in parallel
            n_p = k_i - 2j;
            Transmit w_{i,j} through n_p processors;
        end for
    end for

A prerequisite for the algorithm is that the numbers of half-wedges $k_i$ in the loops have to be obtained before the algorithm is invoked. The number of half-wedges in a loop is not stored explicitly. Instead, the end of a loop is marked with a special flag associated with the half-wedge at the loop end. A special procedure therefore has to be invoked to determine $k_i$. Starting with the processor containing the first half-wedge of a loop, a counter $k_i$ (initialized to zero) is increased and transmitted to the processor containing the next half-wedge in the loop. This is repeated until the processor containing the last half-wedge in the loop is visited. The counter $k_i$ then gives the number of half-wedges in the loop traversed (i.e. number of half-wedges = $k_i + 1$).

In the algorithm for reversing loop order, the counter obtained in the above procedure is used for locating the processor at the middle of each loop cluster (i.e. $P_{i,m}$, $m = k_i/2$). Half-wedges in the first half of the loop cluster are then swapped with those in the second half of the loop cluster. Assuming $k = k_i$, if $k$ is odd, half-wedges in $(P_{i,0}, \ldots, P_{i,m})$ are swapped with those in $(P_{i,m+1}, \ldots, P_{i,k})$. If $k$ is even, half-wedges in $(P_{i,0}, \ldots, P_{i,m-1})$ are swapped with those in $(P_{i,m+1}, \ldots, P_{i,k})$. The net effect is that $w_{i,j}$ in $P_{i,j}$ is swapped with $w_{i,k-j}$ in $P_{i,k-j}$.

This is performed in two stages. Firstly, half-wedges in the first half of the loop cluster, together with a counter, are transmitted to their neighbouring processors. The direction of transmission depends on the sign of the distance $n_p$ of the target processor from the current processor, measured in number of processors. For instance, if $n_p > 0$, $w_{i,j}$ and the counter in $P_{i,j}$ are transmitted to $P_{i,j+1}$. If $n_p < 0$, $w_{i,j}$ and the counter in $P_{i,j}$ are transmitted to $P_{i,j-1}$. No transmission is required when $n_p = 0$. The counter is increased in each transmission and the process is repeated until the counter equals the estimated step (or distance) $n_p$ between the source and the destination processor. On completion of this process, half-wedges in the first half of all loop clusters have been transmitted to their target processors in the second half of the loops (i.e. all $w_{i,j}$ in $P_{i,j}$ are transmitted to $P_{i,k-j}$). Secondly, the same process is performed again but with $j$ in the range $m + 1$ to $k$. The step $n_p = k - 2j$ is then negative, indicating that a backward transmission is performed. Hence, half-wedges in the second half of all loop clusters are transmitted to their target processors in the first half of the loops.
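The two-stage transmission can be simulated sequentially for a single loop cluster. In this sketch, a list slot stands for a processor, each packet carries its half-wedge together with the step counter, and one pass of the `while` loop stands for one lock-step transmission; the function names and the returned step count are illustrative assumptions.

```python
def transmit(cluster, js, k):
    """Move w_j (j in js) through n_p = k - 2j processors, one hop per step."""
    packets = [{"w": cluster[j], "at": j, "np": k - 2 * j, "count": 0} for j in js]
    steps = 0
    while any(p["count"] != abs(p["np"]) for p in packets):
        for p in packets:  # all processors act in lock step
            if p["count"] != abs(p["np"]):
                p["at"] += 1 if p["np"] > 0 else -1  # forward or backward hop
                p["count"] += 1                      # counter travels with w_j
        steps += 1
    return {p["at"]: p["w"] for p in packets}, steps


def reverse_loop(cluster):
    """Reverse the half-wedges of one loop cluster in two stages."""
    k = len(cluster) - 1  # k + 1 half-wedges in the loop
    m = k // 2
    first, s1 = transmit(cluster, range(0, m + 1), k)       # stage 1: j = 0..m
    second, s2 = transmit(cluster, range(m + 1, k + 1), k)  # stage 2: j = m+1..k
    arrived = {**first, **second}
    return [arrived[p] for p in range(k + 1)], s1 + s2


reversed_loop, steps = reverse_loop(list("abcde"))
```

For a loop with $k + 1$ half-wedges the simulation takes $k$ steps per stage, since the longest transfer in each stage spans the whole cluster; all loops of an object would run these stages concurrently.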

BOOLEAN INTERSECTION BETWEEN TWO OBJECTS

Computing the intersecting regions between the two objects is the major task involved in a Boolean operation. This in turn requires performing vertex-vertex, vertex-edge, edge-edge, vertex-face, edge-face and face-face intersection [12]. Vertex-vertex, vertex-edge, edge-edge and vertex-face intersection locate singular cases in which parts of the objects touch at their vertices or edges. Edge-face and face-face intersection consider the situations in which one (or both) of the objects is intersected in the interior of its faces. Based on the half-wedges created in the intersection procedures and the face information of the intersecting objects, loops of half-wedges, and hence faces of the new objects, are determined. These loops of half-wedges, together with the faces, edges and vertices of the final objects, are then grouped so that they conform to the distributed boundary representation discussed previously.

In order to maximize the usage of each individual processor, the two objects are partitioned with a modified Uniform Grid Technique [5] before the intersection procedures are invoked. In this process, each processor is mapped to a rectangular block in space. All entities are classified with respect to these rectangular blocks. Entities intersecting or residing in a rectangular block are directed to the corresponding processor. This ensures that possibly intersecting entities are contained in the same processor. Vertex-vertex intersection can then be carried out concurrently in all processors containing possibly intersecting vertices. The intersecting vertices are directed to the sequential machine, where they are stored for later use.
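The grid partitioning step can be illustrated with a plain uniform grid. The sketch below is not the modified technique used in the paper: it assumes axis-aligned bounding boxes for entities, cubic blocks of a fixed size, and a dictionary keyed by block index in place of the processor array.

```python
from collections import defaultdict
from itertools import product


def cells_of_box(lo, hi, size):
    """Indices of all grid blocks overlapped by the box [lo, hi]."""
    ranges = [range(int(l // size), int(h // size) + 1) for l, h in zip(lo, hi)]
    return product(*ranges)


def candidate_pairs(entities, size=1.0):
    """Pairs (entity of A, entity of B) that share at least one grid block.

    entities: {name: (owner, lo, hi)} with owner "A" or "B".
    """
    grid = defaultdict(lambda: {"A": set(), "B": set()})
    for name, (owner, lo, hi) in entities.items():
        for cell in cells_of_box(lo, hi, size):
            grid[cell][owner].add(name)  # entity is directed to this block
    pairs = set()
    for block in grid.values():  # each block could be tested independently
        pairs |= {(a, b) for a in block["A"] for b in block["B"]}
    return pairs


entities = {
    "A.v0": ("A", (0.2, 0.2, 0.2), (0.2, 0.2, 0.2)),  # a vertex of object A
    "A.v1": ("A", (3.5, 3.5, 3.5), (3.5, 3.5, 3.5)),  # a distant vertex of A
    "B.e0": ("B", (0.1, 0.1, 0.1), (1.5, 0.3, 0.3)),  # bounding box of an edge of B
}
pairs = candidate_pairs(entities)
```

Only entities landing in the same block are ever compared, so the distant vertex `A.v1` is never tested against `B.e0`.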

Vertex-edge and edge-edge intersection

Vertex-edge intersection and edge-edge intersection are carried out with a process similar to vertex-vertex intersection, since all possibly intersecting vertices and edges have been distributed to the same processor. In a vertex-edge intersection, a vertex is tested for intersection with the interior of an edge. Coincidence of a vertex with the end-points of an edge is handled in vertex-vertex intersection. Similarly, edge-edge intersection only tests for intersection in the interior of the edges; coincidence of the edge end-points is handled in vertex-vertex intersection. If an intersection exists in a vertex-edge or edge-edge intersection, the intersecting edges and the corresponding half-wedges are subdivided at the intersection points. These edges and half-wedges are also directed to the sequential machine for storage.
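A vertex-edge test restricted to the interior of the edge might look as follows. The tolerance `eps`, the parametric formulation and the function names are assumptions for illustration; the edge is assumed to have distinct end-points, and only the geometric subdivision (not the half-wedge update) is shown.

```python
def vertex_on_edge_interior(p, a, b, eps=1e-9):
    """Return t in (0, 1) if p lies in the interior of segment ab, else None."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    t = sum(x * y for x, y in zip(ap, ab)) / sum(c * c for c in ab)
    # Squared distance from p to its projection on the line through a and b.
    d2 = sum((x - t * y) ** 2 for x, y in zip(ap, ab))
    if d2 > eps * eps or t <= eps or t >= 1 - eps:
        return None  # off the line, or at/beyond an end-point
    return t


def split_edge(a, b, p):
    """Subdivide edge ab at the intersection point p."""
    return [(a, p), (p, b)]


t_mid = vertex_on_edge_interior((1, 0, 0), (0, 0, 0), (2, 0, 0))
t_end = vertex_on_edge_interior((0, 0, 0), (0, 0, 0), (2, 0, 0))
t_off = vertex_on_edge_interior((1, 1, 0), (0, 0, 0), (2, 0, 0))
```

End-point coincidence deliberately returns `None` here, matching the division of labour in which such cases are left to vertex-vertex intersection.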

Vertex-face and edge-face intersection

In a vertex-face intersection, besides testing whether the vertex lies on the plane of a face, it is also necessary to test whether the vertex lies within the boundary of the face. This is performed in two stages. Firstly, the vertex is tested to see if it lies on the plane of the face. If it does not, no further test is required. Otherwise, the vertex coordinates and face information are directed to the processors containing the first half-wedges of the loops bounding the face under consideration. These processors then broadcast the vertex coordinates and face information to all processors in the same loop cluster. A semi-infinite ray through the vertex is then intersected with the edge of the half-wedge stored in each of the processors concurrently. Results of the intersection test, in the form of intersection counts (either 0 or 1), are directed to the processors containing the first half-wedges of the loops. Using the previous notation, if $n_f$ is the number of loops bounding a face, the algorithm for estimating the number of intersections between the loops of a face and a semi-infinite ray through the possibly intersecting vertex is listed below.

Algorithm 2: Ray-loop intersection
    for i = 0 to n_f do in parallel
        for j = 0 to k_i do in parallel
            Determine intersection count c_{i,j} between w_{i,j} in P_{i,j} and a semi-infinite ray through the vertex in P_{i,j};
            Transmit c_{i,j} to P_{i,0};
        end for
    end for

On exiting the above algorithm, $P_{i,0}$ contains the array of intersection counts $c_{i,j}$ between the semi-infinite ray through the vertex and the half-wedges in the same loop cluster. The sum of the $c_{i,j}$ is then evaluated in $P_{i,0}$ to obtain the intersection count $S_i$ of each loop. The sum of the $S_i$ is in turn evaluated to give the intersection count between the boundary of the face and the semi-infinite ray through the vertex under consideration. The parity-count method is then used to decide whether the vertex is inside or outside the face. Vertices lying within the face boundary, and the intersecting faces, are directed to the sequential machine for storage.

In an edge-face intersection, a test is performed to determine if the interior of an edge intersects the plane of a face. If an intersection exists, a procedure similar to the vertex-face intersection is invoked to decide whether the intersection point lies within the boundary of the face. Valid intersecting vertices and the intersecting faces are again directed to the sequential machine for storage.

Figure 5 Distribution of possibly intersecting faces in an array of processors: (a) distribution of faces of object A; (b) distribution of faces of object B (each panel indicates the direction of broadcast across the processor array)
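The ray-loop intersection and parity count can be sketched in the plane of the face. The following assumes 2D coordinates within that plane, a ray in the +x direction, and the usual half-open rule for edges touching the ray; these choices and the function names are illustrative assumptions. Each call to `crossing` corresponds to the count $c_{i,j}$ one processor would compute.

```python
def crossing(p, a, b):
    """Count c (0 or 1): does the +x ray from p cross the edge from a to b?"""
    (px, py), (ax, ay), (bx, by) = p, a, b
    if (ay > py) == (by > py):
        return 0  # both end-points on the same side of the ray's line
    x = ax + (py - ay) * (bx - ax) / (by - ay)  # where the edge meets y = py
    return 1 if x > px else 0


def loop_counts(p, loops):
    """S_i for every loop: the sum of c_{i,j} over the edges of loop i."""
    return [sum(crossing(p, loop[j], loop[(j + 1) % len(loop)])
                for j in range(len(loop)))
            for loop in loops]


def inside_face(p, loops):
    """Parity-count test on the total over all loops bounding the face."""
    return sum(loop_counts(p, loops)) % 2 == 1


# A square face with a square hole: one outer loop and one inner loop.
outer = [(0, 0), (4, 0), (4, 4), (0, 4)]
hole = [(1, 1), (3, 1), (3, 3), (1, 3)]
face_loops = [outer, hole]
```

A point between the outer loop and the hole yields an odd total, while a point inside the hole or outside the square yields an even total.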

Face-face intersection

Consider two intersecting objects A and B. Before evaluating face-face intersection, all possibly intersecting faces of one of the models (e.g. $f_i$ of A) are distributed to the first row of processors, as shown in Figure 5a. These are broadcast down the columns of processors. Similarly, all possibly intersecting faces of the other model (e.g. $g_i$ of B) are distributed to the first column of processors and are then broadcast along each row of processors. Each processor then contains a pair of possibly intersecting faces. Face-face intersection can then be performed in parallel in each of the processors.

In a face-face intersection, the result of the intersection (an edge) usually starts and ends with vertices which are the intersections between one of the two faces and the edges bounding the other face. Hence, results of the vertex-face and edge-face intersection are used for locating the intersection. Intersecting vertices obtained in vertex-face and edge-face intersection are distributed to the array of processors, where they are grouped to form edges. Assume that Vertex-on-A is the list of intersecting vertices between edges of object B and faces of object A, and Vertex-on-B is the list of intersecting vertices between edges of object A and faces of object B. Each element of a list consists of a vertex and the corresponding intersecting face. Elements of Vertex-on-A are directed to the first row of processors such that only those vertices intersecting the face contained in a processor are stored within the same processor. This is effected by circulating the elements of the list among the processors. The circulating vertex (an element of the list) is stored in a processor if the face associated with the circulating vertex is the same as that stored in the processor. This vertex information is then distributed down the columns of processors. The same is performed with the Vertex-on-B list. Elements of Vertex-on-B are directed to the first column of processors. Those vertices intersecting the face contained in a processor are stored within the same processor. This vertex information is then distributed across the rows of processors. In addition, the intersecting vertices obtained in vertex-vertex, vertex-edge and edge-edge intersection are also distributed to each processor of the processor array. This ensures that non-manifold edges can also be detected in subsequent processes. On completion of this process, each processor in the processor array contains a possibly intersecting face pair and the intersecting vertices (intersection points) lying on these faces or obtained in the vertex-vertex, vertex-edge and edge-edge intersection.

The intersecting vertices in a processor as obtained in the previous procedure may lie on one or both of the faces contained in the same processor. However, the line of intersection must lie on both intersecting faces. Hence, only those vertices lying on both faces are extracted. These are then sorted along the line of intersection, so that successive vertices define a possible edge of the object created in the Boolean operation. If the edge already exists, the edge must be associated with one of the intersecting faces. A half-wedge is thus created and is associated with the other intersecting face. If the edge does not exist, a further test is necessary to decide if the edge is an edge of the final solid. The mid-point of the edge is classified with respect to both intersecting faces with a procedure similar to the vertex-face intersection. If the mid-point lies within the boundary of both faces, the edge is an edge of the final solid. A new edge entry is then created, together with four half-wedges specifying the wedges created in the process. The mating relation between half-wedges is also modified or established at the same time.
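The last step, grouping the vertices on the line of intersection into edges, can be sketched for one face pair. Only the branch for edges that do not yet exist is shown; the point-in-face predicates are passed in as hypothetical callables (`in_face_a`, `in_face_b`) standing for the vertex-face classification, and the creation of half-wedges is omitted.

```python
def candidate_edges(points, direction):
    """Sort points along the line of intersection and pair successive ones."""
    def along(p):
        return sum(pi * di for pi, di in zip(p, direction))
    ordered = sorted(set(points), key=along)
    return list(zip(ordered, ordered[1:]))


def midpoint(a, b):
    return tuple((x + y) / 2 for x, y in zip(a, b))


def new_edges(points, direction, in_face_a, in_face_b):
    """Keep a candidate edge only if its mid-point lies within both faces."""
    return [(a, b) for a, b in candidate_edges(points, direction)
            if in_face_a(midpoint(a, b)) and in_face_b(midpoint(a, b))]


# Hypothetical face pair whose line of intersection is the x-axis.
points = [(2, 0, 0), (0, 0, 0), (3, 0, 0), (1, 0, 0)]
in_a = lambda p: 0 <= p[0] <= 3  # face A covers 0 <= x <= 3 on the line
in_b = lambda p: 1 <= p[0] <= 2  # face B covers 1 <= x <= 2 on the line
edges = new_edges(points, (1, 0, 0), in_a, in_b)
```

Of the three candidate segments, only the one whose mid-point lies within both faces survives.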

Constructing faces

In the previous processes, all intersecting vertices, edges and half-wedges between the two objects are located. However, faces of the final object still have to be constructed. A face is defined by the loops of half-wedges lying on the plane of the face. It is thus necessary to trace the half-wedges created previously to form loops bounding faces of the final object.

In order to perform loop tracing in parallel, all half-wedges lying on a face and their bounding vertices are grouped and distributed such that they reside in consecutive processors. This includes the half-wedges of the intersecting objects and those created in previous intersection processes. Denoting by $w_{i,j}$ the half-wedges lying on a face $f_i$, $w_{i,j}$ and $f_i$ are distributed in the array of processors as shown in Figure 6a. Within each group of processors, the starting and ending vertices ($u_{i,j}$, $v_{i,j}$) of the half-wedges, and the addresses (or identifications) of the processors containing the half-wedges, are passed around (Figure 6b). The objective of this process is to locate the succeeding half-wedge of the half-wedge stored in a processor, so that they form part of a loop of the final object.

Figure 6 Tracing loops of half-wedges: (a) grouping of half-wedges and faces; (b) circulation of vertices of face $f_0$ (panels show the initial state and the states after the 1st and 2nd circulations, with the direction of circulation indicated)

In general, the loops of a face may be split into two or more loops after an intersection. It is thus necessary to decide if the angle of deviation of the succeeding half-wedge from the stored half-wedge is the minimum. This ensures that the succeeding half-wedge lies on the same loop as the stored half-wedge. Hence, if the circulating starting vertex coincides with the terminating vertex of the stored half-wedge, the angle of deviation between the line joining the circulating vertices and the half-wedge in the processor is evaluated. If the angle is less than a previously stored value (or an initial value), the vertices and the address of the processor containing the corresponding half-wedge are stored. On completion of this process, each of the processors contains the address of the processor containing the succeeding half-wedge of its stored half-wedge. Loops of half-wedges are then constructed by following the processor addresses stored in each processor of the same group. Finally, loops lying on the same face are grouped, and face entries are created or modified to denote the loops lying on these faces.
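The circulation and minimum-angle selection can be simulated for one group of processors. The text does not define the angle of deviation precisely, so this sketch makes an explicit assumption: half-wedges are given in 2D face coordinates with the face interior on their left, and the deviation is taken as the clockwise angle from the reversed stored half-wedge to the candidate. All names are hypothetical.

```python
import math


def deviation(stored, cand):
    """Clockwise angle from the reversed stored half-wedge to the candidate."""
    (u0, v0), (u1, v1) = stored, cand
    r = (u0[0] - v0[0], u0[1] - v0[1])  # stored direction, reversed
    c = (v1[0] - u1[0], v1[1] - u1[1])  # candidate direction
    ang = math.atan2(c[0] * r[1] - c[1] * r[0], c[0] * r[0] + c[1] * r[1]) % (2 * math.pi)
    return ang if ang > 1e-12 else 2 * math.pi  # doubling back ranks last


def trace_successors(hw):
    """hw[p] = (u, v) held by processor p; returns the successor address per p."""
    n = len(hw)
    best = [(math.inf, None)] * n
    for shift in range(1, n):      # one circulation step per iteration
        for p in range(n):         # all processors compare in lock step
            q = (p + shift) % n    # address of the half-wedge now visiting p
            if hw[q][0] == hw[p][1]:  # circulating start meets stored end
                angle = deviation(hw[p], hw[q])
                if angle < best[p][0]:
                    best[p] = (angle, q)
    return [address for _, address in best]


def loops_from(succ):
    """Follow stored successor addresses to assemble the loops."""
    seen, loops = set(), []
    for start in range(len(succ)):
        loop, p = [], start
        while p is not None and p not in seen:
            seen.add(p)
            loop.append(p)
            p = succ[p]
        if loop:
            loops.append(loop)
    return loops


# Two triangles of a split face meeting at vertices (0,0) and (2,2).
hw = [((0, 0), (2, 0)), ((2, 0), (2, 2)), ((2, 2), (0, 0)),
      ((0, 0), (2, 2)), ((2, 2), (0, 2)), ((0, 2), (0, 0))]
succ = trace_successors(hw)
loops = loops_from(succ)
```

At the shared vertices each half-wedge has two candidate successors, and the angle criterion is what separates the two loops.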

Construction of the final solid

The last stage in the process is to construct the final objects based on the loops of half-wedges obtained in the previous stage. Since the loops and the mating information between half-wedges have already partitioned the intersecting objects into two different regions, it is only required to identify the solid denoting the Boolean intersection between the objects. This solid is retained or discarded depending on the type of Boolean operation being performed. Identifying the entities of a solid amounts to traversing the graphs of half-wedges denoting the object. Hence, locating the intersection solid amounts to traversing the graphs of half-wedges describing the intersecting regions. Traversal starts with a half-wedge lying on a loop created in the previous process, because a loop lying on the intersecting regions must be a loop created in the intersection process. However, this loop may be associated with the intersection solid or with the other part of the intersecting solids (Figures 7a and b). A test is thus required to decide whether the loop of half-wedges lies on the intersection solid. This is done by checking whether the loop is BELOW or ABOVE the intersecting faces. As shown in Figure 7c, a loop is classified as BELOW if the dot product between a vector r and the face normal n of the intersecting face is negative. The vector r is constructed from a point lying within the loop and a point on the intersecting edge such that r is normal to the intersecting edge. The intersecting edge is an edge of the loop lying on the intersecting face under consideration. Conversely, the loop is ABOVE the intersecting face if the dot product is positive (Figure 7d). A loop thus lies on the intersection solid if it is BELOW its intersecting faces. This technique may fail when a loop lies on both the intersection solid and the other part of the intersecting solids (Figure 8). In this case, the loop is BELOW its intersecting faces although it may not lie on the intersection solid. This situation is avoided by taking the face information into consideration so that only loops forming the outer boundary of a face are selected for the test.

Figure 7 Classifying loops: (a) a loop lying on A ∩ B; (b) a loop lying on A ∪ B or A - B; (c) a loop BELOW its intersecting faces; (d) a loop ABOVE its intersecting faces

Figure 8 Classifying loops: (a) a loop lying on A ∩ B; (b) a loop lying on A ∪ B or A - B

Traversal of half-wedges is performed concurrently. In a union or intersection operation, the processors containing the half-wedges of those loops that lie in the intersecting regions and are BELOW the intersecting faces are activated. The processors containing the mating half-wedges of these half-wedges are then activated. Processors containing the half-wedges in the same loops as these half-wedges are in turn activated. This process repeats until all processors containing the half-wedges of the intersection solid are activated. In a union operation, the half-wedges of the final solid reside in those processors that are not activated. In a difference operation, the processors containing the half-wedges of those loops in the intersecting regions and BELOW the intersecting faces are activated. This is because the complement of the object to be subtracted is used, so that evaluating a difference operation is the same as evaluating an intersection operation. However, if multiple loops exist on a face, processors containing the loops of the same face have to be activated as well. Subsequent processes for activating processors containing the half-wedges of the final solid are similar to those of an intersection or union operation.

Figure 9 Basic components of test piece: (a) object B; (b) object A; (c) A ∪ B; (d) A - B

There may be cases when no loop lying on the intersecting region is detected. This corresponds to the situation in which there is no intersection between the objects or one of the objects lies totally inside the other. In this case, two vertices, one from each object, are classified with respect to the other object. If one of the vertices lies inside the other object, the object containing the vertex must lie inside the other object. Otherwise, there is no intersection between the objects.
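The BELOW/ABOVE test and the repeated activation of processors can be sketched as follows. This is an illustrative sequential version, not the authors' SIMD code. It assumes that r points from the point on the intersecting edge to the point inside the loop, and that the caller has chosen those points so that r is normal to the edge. The dictionaries `loop_of`, `mate_of` and `members_of`, and both function names, are hypothetical stand-ins for the data held in the processors.

```python
from collections import deque

def classify_loop(interior_point, edge_point, face_normal):
    """Return 'BELOW' if r . n < 0 and 'ABOVE' if r . n > 0, where r runs
    from the edge point to the interior point (direction is an assumption)."""
    r = tuple(p - q for p, q in zip(interior_point, edge_point))
    d = sum(a * b for a, b in zip(r, face_normal))
    if d < 0:
        return "BELOW"
    if d > 0:
        return "ABOVE"
    return "ON"  # degenerate case, not covered by the description above

def activate(seed_loops, loop_of, mate_of, members_of):
    """Activate the half-wedges of the seed loops, then repeatedly activate
    their mates and the other half-wedges on the same loops."""
    active = set()
    queue = deque(h for loop in seed_loops for h in members_of[loop])
    while queue:
        h = queue.popleft()
        if h in active:
            continue
        active.add(h)
        queue.append(mate_of[h])              # mating half-wedge
        queue.extend(members_of[loop_of[h]])  # same loop
    return active

# Three loops: L0 and L1 are linked by mating half-wedges, L2 is separate.
members_of = {"L0": [0, 1], "L1": [2, 3], "L2": [4, 5]}
loop_of = {h: loop for loop, hs in members_of.items() for h in hs}
mate_of = {0: 2, 2: 0, 1: 3, 3: 1, 4: 5, 5: 4}

label = classify_loop((0.0, 0.0, -1.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
intersection_part = activate(["L0"], loop_of, mate_of, members_of)
remaining = set(loop_of) - intersection_part
```

With L0 as the only seed loop, half-wedges 0 to 3 are activated and 4 and 5 are left inactive, which mirrors how the half-wedges of a union are read from the processors that were not activated.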

IMPLEMENTATION AND DISCUSSION OF RESULTS

The proposed distributed boundary representation and the algorithms for evaluating Boolean operations are implemented on a DECmpp 12000/Sx/8K distributed memory SIMD computer. 48K of memory is available in each of the 8K processors, which are interconnected in a 128 x 64 2D mesh. The half-wedge based boundary representation and the procedures for evaluating Boolean operations are also implemented on a sequential machine, a DECstation 5000, which also acts as the front end of the DECmpp. Tests are performed on sets of objects with increasing complexity to compare the time required for Boolean operations on the massively parallel computer and on the sequential machine. The test pieces are constructed such that the number of entities involved can be handled with the total available memory of the processors. For objects that require more memory than is available, virtual memory may have to be used to extend the accessible memory of each processor. In that case, the performance of the system depends on the performance of the proposed algorithms as well as on the performance of the virtual memory system being used. Limiting the complexity of the test pieces to within the available memory thus gives a good measure of the performance of the algorithms. It is expected that the use of a virtual memory system will inevitably lower the overall performance of the system as a result of the memory swapping involved.

Three sets of tests are performed. In the first set, the test pieces are composed of basic components arranged in a regular fashion. In this case, the performance of the algorithms with increasing complexity of the test piece can be measured independently of the effect of variations in the geometry of the test piece. In the second set, the basic components of the test pieces are arranged in an irregular manner. The last set is a multi-cavities mould arrangement of a toy car. Results of the tests are detailed in the following sections.

Test set with regular arrangement

The basic components of the test pieces are shown in Figure 9. The union and difference of object A and object B are evaluated on both the DECmpp and the sequential machine, and the corresponding times are recorded. A different test piece is constructed by inserting one more copy each of A and B. Their union and difference are then evaluated (Figures 10a and b). This is repeated for objects with 4, 8, 16 and 32 copies of A and B. The test pieces with 4 and 32 copies of A and B are illustrated in Figures 10c and d and Figures 11a and b. Results of the tests are listed in Tables 1 and 2. In the tables, the number of entities refers to the total number of edges and faces involved. The results show that the time required for the union operations is roughly the same as that required for the difference operations.

Table 1 Time required for union operations (s)

No. of A and B      1      2      4      8      16      32
No. of entities     96     224    512    1088   2240    4544
Time (sequential)   0.29   1.5    7.71   43.77  246     1478.46
Time (parallel)     21.02  27.04  36.1   46.71  64.28   100.12

Table 2 Time required for difference operations (s)

No. of A and B      1      2      4      8      16      32
No. of entities     96     224    512    1088   2240    4544
Time (sequential)   0.31   1.53   1.76   44.23  241.3   1474.94
Time (parallel)     21.88  27.48  36.1   48     65.51   100.89

Figure 10 Test pieces with 2 and 4 copies of A and B: (a) a union with 2 copies of A and B; (b) a difference with 2 copies of A and B; (c) a union with 4 copies of A and B; (d) a difference with 4 copies of A and B

Figure 11 Test pieces with 32 copies of A and B

Similar tests are also performed on a set of objects with non-manifold boundary. In this test, an object C is constructed as shown in Figure 12a. Objects A and C are positioned such that they touch at an edge (Figure 12c). The union and difference of objects A and C are then evaluated (Figures 12d and e) on the DECmpp as well as on the sequential machine. This is repeated for objects with 2, 4, 8, 16 and 32 copies of A and C. The test pieces with 32 copies of A and C are illustrated in Figures 13a and b. Results of the tests are listed in Tables 3 and 4.

Table 3 Time required for union operations (s)

No. of A and C      1      2      4      8      16      32
No. of entities     96     224    512    1088   2240    4544
Time (sequential)   0.29   1.34   7.02   38.59  210.83  1297.65
Time (parallel)     23.95  30.08  38.25  50.07  64.4    91.68

Figure 12 Basic components of test piece with non-manifold edge: (a) object C; (b) object A; (c) relative positions of A and C; (d) A ∪ C; (e) A - C

These results also indicate that, for objects with non-manifold boundary, the time required for the union operations is roughly the same as that required for the difference operations. However, the processing time required for objects with non-manifold boundary is less than that required for objects with manifold boundary (Figure 14). This is a result of the reduced number of intersection edges generated in the operations, which in turn reduces the time required for splitting edges and faces. Tests on the Boolean intersection operation are not performed since the intersection is always computed in a union or difference operation, as discussed in previous sections. The average time required for Boolean operations is estimated by taking the average of all four test results. The overall performance (average time required) of the system is also shown in Figure 14. The time required for evaluating Boolean operations on the parallel machine is relatively constant compared with the time required on the sequential machine. The parallel algorithm for Boolean operations outperforms its sequential counterpart when the number of entities involved exceeds 1140. This occurs when the computing time required is around 50 s. For objects with over 1140 entities, the performance of the sequential algorithm drops drastically.

Figure 15 shows the relative time required in each of the stages in evaluating the union of objects A and B on the parallel machine. The stages studied are the vertex-vertex, vertex-edge, edge-edge, vertex-face, edge-face and face-face intersections, as well as the construction of faces and the construction of the final solid. The construction of faces and the construction of the final solid are the most time-consuming stages. The percentage time required for the construction of faces increases from 7 to 35% as the number of entities involved increases from 96 to 4544.

On the other hand, the percentage time required for the construction of the final solid decreases from 54 to 26%. This implies that the degree to which tasks are distributed for parallel processing is higher in the construction of the final solid than in the face construction stage. This is a result of the retrieval of half-wedges from the sequential machine, which is a sequential process. The retrieval of half-wedges is a prerequisite for the face construction stage since half-wedges created in vertex-edge and edge-edge intersection are also required for tracing loops in the construction of faces. Figure 16 compares the sequential and parallel implementations of the construction of faces and the construction of solids. The performance of the parallel algorithm for face construction is initially rather linear but gradually increases when the number of entities increases beyond 2240. In contrast, the performance of the sequential algorithm drops rapidly as the number of entities increases. The solid construction stage behaves differently: the performance of the parallel algorithm drops initially but gradually becomes constant for objects with a large number of entities.

Among the six intersection stages, edge-face intersection is the most time-consuming stage, covering roughly 26% of the total time in the process. However, its share tends to be constant and is independent of the number of entities involved. The other intersection stages, except face-face intersection, take up a relatively small amount of time, especially when the number of entities involved increases. Face-face intersection initially takes up 5.8% of the time, increasing slightly to 11.1% when the number of entities is 4544.

Table 4 Time required for difference operations (s)

No. of A and C      1      2      4      8      16      32
No. of entities     96     224    512    1088   2240    4544
Time (sequential)   0.27   1.35   7.05   38.39  207.53  1282.24
Time (parallel)     24.56  30.26  40.19  51.41  65.76   93.61

Figure 13 Test pieces with 32 copies of A and C: (a) result of a union; (b) result of a difference

Figure 14 Overall performance (horizontal axis: no. of entities; curves: manifold (sequential), manifold (parallel), non-manifold (sequential), non-manifold (parallel), overall (sequential), overall (parallel))

Figure 15 Relative time required for various stages in evaluating Boolean operations (horizontal axis: no. of entities; curves: vertex-vertex, vertex-edge, edge-edge, vertex-face, edge-face and face-face intersection, construction of faces, construction of solid)

Figure 16 Performance of the face and solid construction stages (horizontal axis: no. of entities; curves: construction of faces (sequential), construction of faces (parallel), construction of solid (sequential), construction of solid (parallel))

Figure 17 Percentage time for evaluating complement (horizontal axis: no. of entities)

Figure 18 Time required for evaluating complement (horizontal axis: no. of entities)

In a difference operation, the complement of the object to be subtracted has to be evaluated. Test results on the difference operations on objects A and C indicate that the percentage time required for the complement stage drops from 4 to 1.8% when the number of entities reaches 4544 (Figure 17). Comparison between the sequential and parallel implementations shows that the parallel algorithm for evaluating the complement of objects performs efficiently for objects with more than 3100 entities, as shown in Figure 18.
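The break-even point of about 1140 entities quoted for the regular test set can be checked from the tabulated times. The sketch below averages the 1088-entity and 2240-entity columns of Tables 1 to 4 and interpolates linearly between them. The linear interpolation and the function name `break_even` are assumptions made for illustration; the paper does not state how the figure was obtained.

```python
# Times in seconds copied from Tables 1, 2, 3 and 4 (in that order).
SEQUENTIAL = {1088: [43.77, 44.23, 38.59, 38.39],
              2240: [246.0, 241.3, 210.83, 207.53]}
PARALLEL = {1088: [46.71, 48.0, 50.07, 51.41],
            2240: [64.28, 65.51, 64.4, 65.76]}

def mean(values):
    return sum(values) / len(values)

def break_even(n_lo, n_hi):
    """Interpolate linearly between two entity counts to find where the
    averaged sequential time equals the averaged parallel time."""
    seq_lo, seq_hi = mean(SEQUENTIAL[n_lo]), mean(SEQUENTIAL[n_hi])
    par_lo, par_hi = mean(PARALLEL[n_lo]), mean(PARALLEL[n_hi])
    gap_lo = seq_lo - par_lo   # negative: sequential still faster
    gap_hi = seq_hi - par_hi   # positive: parallel faster
    t = -gap_lo / (gap_hi - gap_lo)
    entities = n_lo + t * (n_hi - n_lo)
    seconds = par_lo + t * (par_hi - par_lo)
    return entities, seconds

entities, seconds = break_even(1088, 2240)
print(f"break-even near {entities:.0f} entities at about {seconds:.0f} s")
```

Under these assumptions the estimate falls close to the reported 1140 entities and 50 s.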

Test set with irregular arrangement

The same basic components, object A and object B, as used in the first test set are used but are arranged in an irregular manner. Figure 19 shows the test pieces with 2, 3 and 5 basic components, respectively. Figure 20 illustrates the test pieces with 8, 16 and 32 basic components. The times required for the union operations in creating the test pieces are listed in Table 5. A different test set with irregular arrangement is also constructed. In this test, the basic components are the lower and upper parts of a toy car as shown in Figure 21. Copies of the lower and upper parts are created and interconnected such that their union constitutes a multi-cavities mould of the toy car (Figure 21d). The times required for the union operation for the moulds with 2, 3, 4, 5, 6 and 9 cavities are listed in Table 6.

Figure 19 Test pieces with irregular arrangement: (a) test piece with 2 basic components; (b) test piece with 3 basic components; (c) test piece with 5 basic components

Figure 20 Test pieces with irregular arrangement: (a) test piece with 8 basic components; (b) test piece with 16 basic components; (c) test piece with 32 basic components

Table 5 Time (s) required for the test set with irregular arrangement

No. of A and B      1      2      3      5      8      16      32
No. of entities     96     288    412    788    1292   2676    5324
Time (sequential)   0.29   1.97   4.01   15.25  49.14  286.72  1860.54
Time (parallel)     21.02  28.94  33.48  42.59  55.92  13.74   130.27

Results for the difference operations are more or less the same and are therefore not included in this paper. This is because the algorithms for union and difference are effectively the same except that a complement procedure is required for a difference operation. Since the complement procedure constitutes less than 4% of the total time (Figure 17), its effect on the total time is not significant. The parallel implementation attains better performance than the sequential implementation for the test set with irregular arrangement when the number of entities exceeds 1340. In addition, the times required by the parallel algorithms are more or less the same for both the regular and the irregular cases (Figure 22). This indicates that the performance of the proposed algorithm is insensitive to the regularity of the data involved.

Figure 21 Multi-cavities mould of a toy car: (a) upper part of the toy car; (b) lower part of the toy car; (c) the union of (a) and (b); (d) the mould piece with 9 cavities

Results of the tests on the toy car test set (or multi-cavities mould) also show that the performance of the parallel algorithm is rather constant compared with the sequential algorithm. However, the speed-up attained in this case is less than that attained in the other test sets. This is mainly a result of the speed-up of the intersection stages in the sequential algorithm. Although the number of entities involved is roughly the same for the toy car and the other test sets, the number of intersection calculations required in the toy car test set is much less than that required in the other test sets. This is because a large portion of the entities (e.g. edges and faces of the wheels) are relatively small compared with the overall dimension. An enclosing box test thus discards most of the non-intersecting entities so that the number of intersection calculations is greatly reduced.
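An enclosing box test of the kind mentioned above can be sketched with axis-aligned boxes. The paper does not describe its box test in detail, so the axis-aligned form, the representation of an entity as a list of 3D vertices, and the function names are all assumptions made for illustration.

```python
def bounding_box(points):
    """Axis-aligned box (min corner, max corner) of a list of 3D points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def boxes_overlap(box_a, box_b):
    """Two boxes overlap only if their extents overlap on every axis."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    return all(a_min[k] <= b_max[k] and b_min[k] <= a_max[k] for k in range(3))

def candidate_pairs(entities_a, entities_b):
    """Keep only the entity pairs whose boxes overlap; the exact (and more
    expensive) intersection calculation is needed for these pairs alone."""
    boxes_a = [bounding_box(e) for e in entities_a]
    boxes_b = [bounding_box(e) for e in entities_b]
    return [(i, j)
            for i, box_a in enumerate(boxes_a)
            for j, box_b in enumerate(boxes_b)
            if boxes_overlap(box_a, box_b)]

# Two edges of one object and one triangular face of another.
edges = [[(0, 0, 0), (1, 0, 0)],            # crosses the plane of the face
         [(10, 10, 10), (10.1, 10, 10)]]    # small and far away
faces = [[(0.5, -1, -1), (0.5, 1, -1), (0.5, 0, 1)]]
pairs = candidate_pairs(edges, faces)
```

Only the first edge survives the box test here. Small entities spread over a large object, such as the wheels of the toy car, are rejected in the same way, which is consistent with the smaller number of intersection calculations reported for that test set.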

Table 6 Time (s) required for the multi-cavities mould

No. of cavities     1      2      3      4      5      6       9
No. of entities     458    948    1438   1898   2418   2908    4514
Time (sequential)   2.16   9.91   23.24  43.72  71.34  107.73  297.19
Time (parallel)     20.90  31.97  35.96  40.32  43.25  46.92   60.92

Figure 22 The effect of irregular arrangement on the performance of the system (horizontal axis: no. of entities; curves: sequential (irregular arrangement), parallel (irregular arrangement), sequential (regular arrangement), parallel (regular arrangement), toy car (sequential), toy car (parallel))

CONCLUSIONS

Parallel processing is known to be a promising technique for improving the performance of processes that require intensive computation. Boolean operation in solid modelling is a computationally intensive task and is thus a good candidate for parallel processing. A distributed boundary representation is proposed which allows parallel evaluation of Boolean operations on objects involving a large number of entities. In this representation scheme, the concept of the half-wedge is adopted. This allows modelling of objects with non-manifold boundary. Half-wedges exist in pairs denoting wedges incident to an edge. A loop of half-wedges defines a boundary of a face. A solid is thus represented as a graph of half-wedges in addition to the faces, edges and vertices of the objects. Half-wedges are distributed in an array of processors such that they can be processed concurrently. Parallel algorithms are developed for evaluating the result of a Boolean operation between objects. These include the complement of an object, the evaluation of vertex-vertex, vertex-edge, vertex-face, edge-edge, edge-face and face-face intersections, as well as the construction of faces and solids.

The proposed distributed boundary representation and the parallel algorithms are developed for a distributed memory SIMD computer. However, the method can be generalized to shared memory parallel machines. In this case, parallel sorting may be required to group the data for processing in individual processors. An experimental system was implemented on a DECmpp with 8K processors. Test results on evaluating Boolean operations reveal that the construction of the final solid is the most time-consuming process when the objects involved consist of a small number of entities. However, its effect drops when the number of entities increases. For a large number of entities, face construction becomes the dominant stage. The overall performance of the system is not sensitive to the spatial arrangement of the objects involved but may be affected by the relative sizes of the entities involved. The system performs well for objects with more than 1300 entities in the tests conducted. For objects with fewer entities, Boolean operation is more efficient on a sequential machine owing to the possible overhead in data partitioning and the necessary communication between processors.

ACKNOWLEDGEMENTS

We would like to thank the Engineering Faculty of the Chinese University of Hong Kong for providing access to the massively parallel computer. Thanks are also due to the Department of Systems Engineering and Engineering Management for providing the financial support for Mr Y M Kan in conducting this research.

REFERENCES

1 Yamaguchi, F and Tokieda, T ‘A unified algorithm for Boolean shape operations’ IEEE Comput. Graph. & Applicat. Vol 4 No 6 (1984) pp 24-37
2 Yamaguchi, F and Tokieda, T ‘A solid modeler with a 4 x 4 determinant processor’ IEEE Comput. Graph. & Applicat. Vol 5 No 4 (1985) pp 51-59
3 Ellis, J, Kedem, G, Marisa, R, Menon, J and Voelcker, H ‘Breaking barriers in solid modelling’ Mech. Engng Vol 113 No 2 (1991) pp 28-34
4 Holliman, N S, Wang, C M and Dew, P M ‘Mistral-3: parallel solid modelling’ Visual Comput. Vol 9 (1993) pp 356-370
5 Akman, V, Franklin, W R, Kankanhalli, M and Narayanaswami, C ‘Geometric computing and the uniform grid data structure’ Comput. Aided Des. Vol 21 (1989) pp 410-420
6 Narayanaswami, C ‘Parallel processing for geometric applications’ PhD Dissertation Rensselaer Polytechnic Institute, Troy, New York (Dec. 1990)
7 Strip, D and Karasick, M ‘Solid modeling on a massively parallel processor’ Int. J. Supercomput. Applicat. Vol 6 No 2 (1992) pp 175-192
8 Karasick, M ‘On the representation and manipulation of rigid solid’ PhD Dissertation McGill University, Montreal, Quebec (1988)
9 Bajaj, C L and Dey, T K ‘Constructive solid geometry on a MIMD distributed memory machine’ Proc. CSG94 Set-theoretic Solid Modelling: Techniques and Applications (1994) pp 213-223
10 Masuda, H, Shimada, K, Numao, M and Kawabe, S ‘A mathematical theory and applications of non-manifold geometric modeling’ in Krause, F L and Jansen, H (Eds.) Advanced Geometric Modelling for Engineering Applications Elsevier Science Publishers B.V., North-Holland, IFIP/GI (1990) pp 89-103
11 Rossignac, J R and Requicha, A A G ‘Constructive nonregularized geometry’ Comput. Aided Des. Vol 23 No 1 (1991) pp 21-32
12 Gursoz, E L, Choi, Y and Prinz, F B ‘Boolean set operations on non-manifold boundary representation objects’ Comput. Aided Des. Vol 23 No 1 (1991) pp 33-39

K C Hui is a lecturer in the Mechanical and Automation Engineering Department of the Chinese University of Hong Kong. He received his BSc and PhD in mechanical engineering from the University of Hong Kong in 1979 and 1990, respectively. Before joining the Chinese University of Hong Kong, he was a consultant in the CAD Services Centre of the Hong Kong Productivity Council. He is a Chartered Engineer and a Member of the British Computer Society. His research interests include graphics, geometric and solid modelling, and their applications in design and manufacturing.

Y M Kan is a lecturer in the Department of Manufacturing Engineering of the Hong Kong Technical College. He obtained his BEng (Hons) from the Hong Kong Polytechnic in 1988. He has worked as an engineering trainee at Hong Kong CAD-CAM Services Ltd. and as a CAD/CAM engineer at ASM Automation Ltd. He is now also an MPhil candidate at the Chinese University of Hong Kong. His research interests include solid modelling, parallel processing and engineering design methodology.