Time and space efficient net extractor
Surendra Nahar and Sartaj Sahni
An efficient algorithm is developed for net extraction. This algorithm is able to handle very large layouts efficiently even when memory is limited. This is done by using disk storage effectively. The algorithm has been programmed in Fortran and is superior to other existing net extractors.
net extraction, time and space complexity
Software systems for VLSI artwork analysis have been described by several authors 1-8. Asymptotically efficient algorithms with potential application to VLSI artwork analysis have been reported 9-14. The net extractor described in this paper was developed to meet a need at Sperry Corporation for a net extractor that was both faster than its current one and able to handle layouts containing many more polygons than the current net extractor could. In particular, there was a need for a net extractor that could work efficiently when the number of polygons in the layout greatly exceeds the capacity of the available memory. The development of this system required the authors to enhance some of the recently proposed algorithms for computational geometry 11, as well as to develop an overlap test for arbitrary polygons that would work well in practice. In addition, efficient external memory methods were developed for the memory intensive parts of the software system. In this paper, the algorithms used by the authors are described, as is the architecture of the system. Performance results on six layouts are also presented.
PROBLEM SPECIFICATION
The input to the net extraction problem is a set of polygons. There are three types of polygons that comprise the input: net, link, and text. The net polygons are the polygons that are used in layout and fabrication while the link and text polygons are used for documentation purposes. A text polygon is simply a point (x,y) on some layer. The text (macro names, chip name, output pins, fanout, fanin, etc.) associated with this point serves as documentation. Link polygons are used to associate text polygons with net polygons. As an example, consider the polygons of Figure 1. Polygons 1-4 are net polygons, 5-9 are text polygons and 10 is a link polygon. The text of polygons 5 and 9 is associated with each of the net polygons while the text of polygon 6 is associated with net polygon 2 alone.

Computer Science Department, 136 Lind Hall, University of Minnesota, Minneapolis, MN 55455, USA
volume 20 number 1 jan/feb 1988
[Figure 1 legend: N = net polygon; T = text polygon; L = link polygon; 1, 2, 3, etc. are polygon numbers]
Figure 1. Net, link and text polygons

Let the number of net, link, and text polygons be #net, #link, and #text respectively. The polygons are numbered from 1 through #net+#link+#text with the numbers 1 through #net used for the net polygons; the numbers #net+1 through #net+#link used for the link polygons; and the remaining numbers used for the text polygons. While no restriction is placed on the shape of a polygon, it is convenient to categorize polygons as: point polygon (i.e. a text polygon), orthogonal rectangle (cf. Figure 2(a)), general polygon (cf. Figure 2(b)). A point polygon is defined by the (x,y) coordinates of the point, an orthogonal rectangle by its Xmin, Xmax, Ymin and Ymax values and a general polygon by a sequence of vertex coordinates together with the Xmin, Xmax, Ymin and Ymax values of its smallest bounding rectangle. The distribution of point polygons, orthogonal rectangles, and general polygons for 6 sets of polygons generated by the Calma system for 6 different chips is provided in Table 1. The figures in each row sum to more than 100% as the net and link polygons are the same as the orthogonal and general polygons. As can be seen, about 80% of the polygons are either point polygons or orthogonal rectangles. The set of polygons created by the Calma system is partitioned over up to 64 layers. Layers are classified by type (i.e. net layer, link layer, text layer). A text layer can
0010-4485/88/010017-10 $03.00 © 1988 Butterworth & Co (Publishers) Ltd
Table 1. Distribution of polygons

Chip     Total number   % Net      % Link     % Text     % Orthogonal   % General
number   of polygons    polygons   polygons   polygons   polygons       polygons
1         35549         62.0       2.0        36.0       53.1           10.9
2        117303         78.6       1.8        19.6       57.1           23.3
3        110799         79.6       1.6        18.8       58.7           22.5
4        132101         76.8       2.0        21.2       59.2           19.6
5         99327         80.3       1.4        18.3       59.6           22.1
6        129280         78.6       1.9        19.5       60.0           20.5
Step 1 Partition the chip area into rectangles or buckets.
Step 2 Process the polygons file, placing each polygon into all buckets with which it overlaps.
Step 3 For each bucket, determine the pairs of polygons that overlap. This is done by examining all pairs of polygons in the bucket. For each pair, the bounding rectangles are tested for overlap. If an overlap is detected, the actual polygons are tested for overlap.
Step 4 Sort the file of generated pairs and eliminate duplicates (duplicates are possible as a polygon may be in several buckets).

Figure 4. Previous pair generator
Figure 2. (a) Orthogonal rectangle, and (b) general polygon

contain only text polygons; a link layer is restricted to containing link and text polygons; a net layer may contain net and text polygons only. This classification of layers is not relevant to the subsequent discussion. However, it is important to note that the specified layers are logical layers rather than physical ones and that a set of layer pairs, L = {(i1,j1), (i2,j2), ...}, determines whether or not an overlap of polygons on two different layers constitutes a real overlap (i.e. an electrical contact). Suppose that polygon A is on layer i and polygon B on layer j, i ≠ j, and that the two polygons overlap. There is a real overlap between A and B iff (i,j) ∈ L or (j,i) ∈ L. Note that all overlaps
between polygons on the same layer are to be treated as real overlaps. A net is a collection of net polygons that have real overlaps together with all text information associated with these polygons. Observe that each net polygon is in exactly one net. The net extraction problem is that of determining the nets from L and the given set of polygons. A two phase approach to the net extraction problem is adopted. In the first phase, the polygon and layer pairs information is used to construct pairs of polygons with real overlaps. In the second phase, this information is used to generate the nets. This is shown schematically in Figure 3. The computer used is a Sperry 1190.
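The real-overlap rule stated above can be written as a small predicate. The following is a minimal sketch only: the function name `is_real_overlap` and the representation of L as a Python set of layer-number tuples are assumptions made for illustration and are not taken from the authors' Fortran program.

```python
def is_real_overlap(layer_a, layer_b, layer_pairs):
    """Decide whether two geometrically overlapping polygons are in contact.

    layer_a, layer_b: logical layer numbers of the two polygons.
    layer_pairs: the set L of (i, j) layer pairs (hypothetical representation).
    """
    # Overlaps between polygons on the same layer are always real overlaps.
    if layer_a == layer_b:
        return True
    # Otherwise the pair must appear in L in either order.
    return (layer_a, layer_b) in layer_pairs or (layer_b, layer_a) in layer_pairs


# Example layer pair list, invented for illustration.
L = {(1, 2), (2, 5)}
```

Since L is small enough to reside in memory, a set lookup of this kind costs constant expected time per candidate pair.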
PAIR GENERATOR
The polygon file is typically too large to fit into memory. Hence the pair generator must work on it piecemeal. The approach taken in Sperry's former net extractor is given in Figure 4. This method is slow in practice. The following causes for
Figure 3. Net extractor schematic (pair generator followed by net generator)
computer-aided design
Step 1 Partition the x-axis into bins.
Step 2 Input the polygons file. Create a new file, called rectangles, that has two entries for each polygon: bottom edge of (bounding) rectangle, top edge.
Step 3 Sort the rectangles file into nondecreasing order of y. Within y, bottom edges precede top edges.
Step 4 Input the layer pair list L. This is sufficiently small that it may reside in memory.
Step 5 Process the bottom and top edges in the above sorted order producing a file of overlapping rectangle pairs.
Step 6 Sort the rectangle pairs by first coordinate and within first coordinate by second coordinate.
Step 7 Process the sorted pair list eliminating duplicate pairs as well as pairs (i,j) where one or both of i and j are general polygons that do not overlap (even though their bounding rectangles do).

Figure 5. Scan line pair generator
this can be identified:
• The preprocessing of steps 1 and 2 imposes an unnecessary overhead. The distribution of polygons into buckets requires a full pass over the polygons file.
• Since all pairs of polygons in a bucket are examined for overlap, the processing time per bucket (in step 3) is quadratic in the number of polygons in the bucket.
The overhead introduced by step 4 of Figure 4 is not listed above as the generation of duplicate pairs is unavoidable when insufficient memory is available. The two causes of inefficiency cited above may be eliminated by using the scan line method of Bentley 11. This method was developed originally to report all pairs of overlapping rectangles. To adapt the method to polygons, all point polygons are first viewed as rectangles of zero height and width, and all general polygons are replaced by their bounding rectangles. The x-axis is partitioned into bins. A horizontal scan line that begins at y = 0 and moves to the maximum y value is used. Each rectangle (or bounding rectangle) is considered twice: once when the scan line reaches its bottom edge and once again when its top edge is reached. When the scan line moves to y = y', all rectangle top and bottom edges at this y value are processed (bottom edges are processed before top edges). When a bottom edge is processed, all bins into which the edge falls are examined. The rectangles in these bins are examined for overlap with the newly encountered rectangle. All detected overlaps amongst rectangles in layers i and j such that (i,j) ∈ L are reported. The new rectangle is also added to all these bins. When a top edge is processed, the corresponding rectangle is removed from all the bins into which the top edge falls. Since the number of entries in a bin varies with time, the bins are maintained as linked lists. Figure 5 describes the scan line approach more formally.
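Step 5 of the scan line approach can be illustrated in memory as follows. This is a sketch under stated assumptions, not the authors' implementation: the name `scan_line_pairs`, the dictionary input format, the parameter `x_extent` (the horizontal extent of the layout, with all x-coordinates assumed to lie in [0, x_extent]) and the use of Python sets in place of linked lists and of the sort-based duplicate elimination of steps 6 and 7 are all choices made here for brevity. The layer pair test against L is omitted.

```python
import math
from collections import defaultdict


def scan_line_pairs(rects, x_extent, alpha=0.5):
    """Report pairs of overlapping rectangles with a bottom-to-top scan line.

    rects: dict mapping polygon number -> (xmin, ymin, xmax, ymax).
    x_extent: horizontal extent of the layout (assumed > 0).
    alpha: constant in the bin count alpha * sqrt(N).
    """
    n_bins = max(1, int(alpha * math.sqrt(len(rects))))
    width = x_extent / n_bins

    def bins_of(xmin, xmax):
        lo = min(n_bins - 1, int(xmin // width))
        hi = min(n_bins - 1, int(xmax // width))
        return range(lo, hi + 1)

    # Two events per rectangle. Sorting the tuples puts bottom edges
    # (kind 0) before top edges (kind 1) at the same y value.
    events = []
    for rid, (xmin, ymin, xmax, ymax) in rects.items():
        events.append((ymin, 0, rid))
        events.append((ymax, 1, rid))
    events.sort()

    bins = defaultdict(set)  # bin index -> rectangles currently cut by the scan line
    pairs = set()            # a set removes the duplicates found in several bins
    for _, kind, rid in events:
        xmin, _, xmax, _ = rects[rid]
        for b in bins_of(xmin, xmax):
            if kind == 0:
                # Rectangles in the bin already overlap in y; test x only.
                for other in bins[b]:
                    oxmin, _, oxmax, _ = rects[other]
                    if xmin <= oxmax and oxmin <= xmax:
                        pairs.add((min(rid, other), max(rid, other)))
                bins[b].add(rid)
            else:
                bins[b].discard(rid)
    return pairs


example = {1: (0, 0, 4, 4), 2: (3, 3, 6, 6), 3: (10, 10, 12, 12), 4: (5, 0, 7, 2)}
```

Because bottom edges are processed before top edges at equal y, two rectangles that merely touch along a horizontal edge are reported as overlapping in this sketch.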
In order to check for actual overlap when a general polygon is involved in an overlapping pair (i,j), it is necessary to use its vertex list. It is not practical to carry this vertex list along with the bottom edge of the bounding rectangle during steps 2 - 5 of Figure 5. Hence it is necessary to determine the vertex list during the processing of step 7. An efficient way to do this is to maintain the vertex lists for general polygons in order of polygon numbers. Let
Step 1 If both i and j are point polygons, then they overlap iff they are the same point. If one is a point polygon then overlap reduces to containment of the point in the other polygon, which can be easily tested for (see below). So, assume that neither i nor j is a point polygon.
Step 2 (containment check) Let Xmin(i) denote the least x-coordinate at which polygon i has a vertex. Without loss of generality, we may assume that Xmin(i) ≤ Xmin(j). So, if i and j overlap then j is contained in i or two edges of i and j intersect. Pick an arbitrary vertex of j and determine the number of edges of i that intersect an infinite horizontal line drawn from this vertex to one side of the vertex. If this number is odd, then i and j overlap.
Step 3 (intersection check) If step 2 fails to detect an overlap, then examine the edges of i and j pairwise for an intersection. If an intersecting pair is found, then the polygons overlap. Otherwise, they do not.

Figure 6. Determining overlap of polygons i and j

this be done in a file 'ordered vertex lists'. Step 7 is decomposed into:
Step 7(a) Process the sorted pair list eliminating duplicates. Pairs that do not involve general polygons are output to the final pair file. The remaining pairs are output to a file 'general pairs'.
Step 7(b) Sort the pairs (i,j) in 'general pairs' on i. Use a merge type operation to extract the vertex lists for the first polygon of the pairs (i,j) from the file 'ordered vertex lists'.
Step 7(c) Repeat step 7(b) for j.
Step 7(d) Now each general polygon has its vertex list associated with it. For each pair in 'general pairs' determine if there is an actual overlap. If so, output the pair to the final pairs file.
(In practice, steps 7(c) and 7(d) are combined. This saves one input and one output pass over the 'general pairs' file.)

Before we can implement the scan line approach described above, two issues need to be resolved. First, we need to know how to determine if two polygons i and j overlap when at least one is a general polygon (cf. step 7(d) above). Several asymptotically efficient algorithms for this have been proposed 14. Since the number of vertices in our general polygons is quite small (18 on the average), it is not expected that any of these methods would outperform the method described in Figure 6. We shall examine this issue in greater detail later. The second issue is the number of bins to be used. Bentley et al. 11 suggest the number αN^(1/2) where N is the number of polygons and α is some constant. Our experiments show that 0.5N^(1/2) is a good choice for the number of bins. Since equal width bins are assumed, the bin width is now determined from a knowledge of the horizontal extent of the layout. The performance of the scan line pair generator described above was compared to that of the partition oriented pair generator (Figure 4). The partition oriented pair generator was available as Fortran code. The authors coded
Table 2. Performance comparison of partition oriented and scan line pair generator

Chip     POPG time (s)            SLPG time (s)            Space for SLPG   Time SLPG/POPG
number   CPU     I/O     Total    CPU     I/O     Total    bins in words    in %
1        151.7    88.8   240.5     51.3    55.5   106.8    2100             44.4
2        290.9   665.6   956.5    100.3   252.5   352.8    5202             36.9
3        322.6   642.8   965.4    103.7   245.7   349.4    5513             36.2
4        338.8   639.5   978.3    111.4   256.4   367.8    5957             37.6
5        317.4   551.8   869.2     98.2   230.7   328.9    4828             37.8
6        346.0   639.6   985.6    110.5   252.1   362.6    5957             36.8

POPG = partition oriented pair generator; SLPG = scan line pair generator
Table 3. Performance comparison of original with enhanced scan line method

Chip     Original scan line method (OSLM)    Enhanced scan line method (ESLM)    Memory      Time
number   Time (s)                 Memory     Time (s)                 Memory     ESLM/OSLM   ESLM/OSLM
         CPU     I/O     Total    words      CPU     I/O     Total    words      in %        in %
1         6.6    12.7    19.3     2100        4.3     8.7    13.0     2325       110.7       67.4
2        26.3    58.0    84.3     5202       15.9    35.1    51.0     5202       100.0       60.5
3        25.2    54.9    80.1     5513       15.2    33.1    48.3     5513       100.0       60.3
4        30.8    67.5    99.3     5927       18.8    40.8    59.6     5927       100.0       60.0
5        22.5    48.7    71.2     4828       13.5    28.9    42.4     4828       100.0       59.5
6        30.5    66.4    96.9     5957       18.6    40.0    58.6     5957       100.0       60.5
the scan line generator in Fortran. The two programs were run on a Sperry 1190 system, and the 6 layouts of Table 1 were used as input. Table 2 gives the time taken by both programs to generate the overlapping pairs file. As can be seen, the scan line algorithm takes about one-third the time taken by the partitioning algorithm. Further, the space required to maintain the bins is quite reasonable.
ENHANCEMENTS
The basic scan line method described in Figure 5 can be enhanced in two significant ways. First, rather than create two entries for each polygon in step 2, just one is used. The enhanced algorithm takes the form given in Figure 7. Since the rectangles file created in step 2 is now about half its original size, the sort time for step 3 and the input time for step 5 are reduced. Table 3 compares the performance of the improved scan line method with the original scan line method. The times shown are for steps 1 through 5 of Figures 5 and 7. The memory requirements are about the same as before. The run time of the program is about 40% less than before. The second enhancement made by the authors occurs in step 7. This is concerned with determining whether or not two polygons whose bounding rectangles overlap actually overlap. The interesting part of this step is step 3 of Figure 6, which deals with the intersection of two polygons. This may be solved by using the scan line algorithms of Shamos 14 and Szymanski and Van Wyk 7. If the number of edges in the two polygons together is N, then these algorithms will report a polygon intersection in O(N log N) time. This is considerably better, asymptotically, than checking pairs of edges (as in step 3 of Figure 6). However, when N is small, the pairwise method is faster. So, one possibility is to use the pairwise method for small N and the O(N log N) method for large N. Another possibility is to make use of the following observation. Let A and B, respectively, be the bounding rectangles of the polygons i and j. Let C be the intersection of A and B. If i and j intersect, then they must do so in the rectangle C. This leads to the algorithm of Figure 8.
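The observation about the rectangle C can be sketched as follows. This is an illustrative sketch, not the authors' code: all function names are hypothetical, polygons are assumed to be lists of (x, y) vertex tuples, and the edge filter used here is a simplification that keeps every edge whose own bounding box meets C. That filter may retain a few more edges than the exact definition of edges(i) and edges(j), but it never discards an edge that takes part in an intersection. Only the edge intersection check is shown; containment of one polygon in the other is still handled by step 2 of Figure 6.

```python
def segments_intersect(p, q, r, s):
    """True if the closed segments pq and rs share at least one point."""
    def orient(a, b, c):
        v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        return (v > 0) - (v < 0)

    def within_box(a, b, c):
        # c is known to be collinear with ab; check it lies between a and b.
        return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))

    o1, o2 = orient(p, q, r), orient(p, q, s)
    o3, o4 = orient(r, s, p), orient(r, s, q)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and within_box(p, q, r)) or (o2 == 0 and within_box(p, q, s))
            or (o3 == 0 and within_box(r, s, p)) or (o4 == 0 and within_box(r, s, q)))


def bounding_box(poly):
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def edges_near(poly, c):
    """Edges of poly whose bounding box meets rectangle c (conservative filter)."""
    cx0, cy0, cx1, cy1 = c
    kept = []
    for k in range(len(poly)):
        a, b = poly[k], poly[(k + 1) % len(poly)]
        if (min(a[0], b[0]) <= cx1 and max(a[0], b[0]) >= cx0
                and min(a[1], b[1]) <= cy1 and max(a[1], b[1]) >= cy0):
            kept.append((a, b))
    return kept


def boundaries_intersect(poly_i, poly_j):
    """Pairwise edge check restricted to the intersection rectangle C."""
    ax0, ay0, ax1, ay1 = bounding_box(poly_i)
    bx0, by0, bx1, by1 = bounding_box(poly_j)
    c = (max(ax0, bx0), max(ay0, by0), min(ax1, bx1), min(ay1, by1))
    if c[0] > c[2] or c[1] > c[3]:
        return False  # bounding rectangles are disjoint
    edges_i, edges_j = edges_near(poly_i, c), edges_near(poly_j, c)
    return any(segments_intersect(a, b, r, s)
               for a, b in edges_i for r, s in edges_j)


triangle = [(0, 0), (4, 0), (0, 4)]
square_hit = [(1, 1), (5, 1), (5, 5), (1, 5)]
square_miss = [(3, 3), (5, 3), (5, 5), (3, 5)]
```

In the `square_miss` case the bounding rectangles overlap, yet only one edge of the triangle and two edges of the square survive the filter, so two edge pairs are tested instead of twelve.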
Example 1
Consider the general polygons i and j of Figure 9(a). The bounding rectangles are, respectively, A and B. The intersection rectangle C is given by the corners c1, c2, c3, c4, edges(i) = {1, 2, 3, 4, 5} and edges(j) = {6, 7, 8}. In the worst case 5 × 3 = 15 pairs of edges need to be checked for
Step 1 Partition the x-axis into bins.
Step 2 Input the polygons file. Create a new file, called rectangles, that has one entry for each polygon. This entry is: (Xmin, Ymin, Xmax, Ymax, polygon_number).
Step 3 Sort the rectangles file into nondecreasing order of y.
Step 4 Input the layer pair list L.
Step 5 Process the sorted rectangles producing a file of overlapping rectangle pairs. When a rectangle is input, a search is performed in the bins into which the edge from Xmin to Xmax falls. Rectangle overlaps are checked in these bins. Further, existing rectangles with Ymax less than the Ymin of the new rectangle are discarded. The new rectangle is inserted into the bins into which it falls.
Step 6 Same as in Figure 5.
Step 7 Same as in Figure 5.
Figure 7. Enhanced version of Figure 5

intersection. As soon as an intersecting pair is found, overlap is reported and the pairwise check terminated. Another example is shown in Figure 9(b). Here edges(i) = ∅ and edges(j) = {1, 2}. Since edges(i) = ∅, there is no overlap. It might be suspected that if there is significant overlap between the two bounding rectangles A and B, then the overhead of step 2 may not be justified. Further, if one of the two polygons i and j is a rectangle, then not much is to be gained by going through steps 1 and 2 of Figure 8. The algorithm of Figure 10 results when these factors are accounted for. One final consideration is that step 4 of Figure 10 (i.e. step 3 of Figure 8) may be better performed using the O(N log N) scan line method when the number of edges in edges(i) ∪ edges(j) is suitably large. The authors experimented with many pairs of randomly generated polygons as well as with all the pairs generated by the 6 layout test set and found no pair where it was advantageous to use the O(N log N) scheme. The performance of the algorithms of Figures 6 and 10 was compared for the implementation of step 7(d) of the scan line pair generator. The times for the 6 chip test set are reported in Table 4. The value of α used was 0.45. This

Step 1 Find the intersection rectangle C.
Step 2 Let edges(i) be the edges of polygon i that intersect an edge of C or are contained in C. Let edges(j) be similarly defined for polygon j.
Step 3 Consider pairs of edges (a,b), a ∈ edges(i) and b ∈ edges(j), for intersection. Polygons i and j intersect iff an intersecting pair is found.

Figure 8. Algorithm for detecting an overlap between polygons A and B

Table 4. Performance comparison of the algorithms of Figures 6 and 10 for polygon overlap detection

Chip     Figure 6 algorithm time (s)   Figure 10 algorithm time (s)   CPU time Figure 10/Figure 6
number   CPU     I/O     Total         CPU     I/O     Total          in %
1        32.8    10.3    43.1          23.0    10.2    33.2           70.0
2        31.1    41.6    72.7          24.9    39.6    64.5           80.1
3        33.6    39.6    73.2          27.6    39.2    66.8           82.1
4        34.6    38.4    73.2          26.5    38.4    64.9           76.6
5        33.2    37.5    70.7          26.8    36.4    63.2           80.7
6        35.1    38.0    73.1          26.0    38.1    64.1           74.1
[Figure 9 legend: i and j are polygons for overlap detection. A and B are the boundary polygons of i and j respectively. c1c2c3c4 is the intersection polygon.]
Figure 9. Polygon pairs for Example 1

value of α was obtained experimentally. In fact, with this α, the algorithm of Figure 10 always did as well as (or better than) the original intersection algorithm. As can be seen from Table 4, there is a reduction of 15% to 30% in CPU time. Table 5 compares the performance of all steps of the original scan line pair generator (Figure 5) with that of the enhanced scan line pair generator. The enhanced generator makes use of the enhancements discussed (i.e. the algorithms of Figures 7 and 10 are incorporated). Table 5 shows a reduction of 10% to 15% in the overall time requirements.
Step 1 If i or j is a rectangle, then define edges(i) and edges(j) to be all the edges in i and j, respectively, and go to step 4.
Step 2 Find the intersecting rectangle C. If the area of C is more than α × (area(A) + area(B)) then define edges(i) and edges(j) as in step 1 and go to step 4.
Step 3 Compute edges(i) and edges(j) as in step 2 of Figure 8.
Step 4 Same as in step 3 of Figure 8.

Figure 10. Improved algorithm for detecting an overlap between polygons A and B

NET GENERATOR
The net generator constructs the nets from the pairs file created by the pair generator of the preceding section.
There are four categories of pairs in this file: (net, net), (link,text), (net, text), and (net, link). The steps involved in the net generator are given in Figure 11.
Step 1: partitioning by pair type
This can be done during the pair generation phase. Rather than put all pairs into a single file as discussed in the previous section, pairs are placed into one of four different files depending on the pair type. Hence the cost of partitioning is zero.
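Because polygon numbers encode the polygon type (1 through #net for net polygons, then the link polygons, then the text polygons), the file to which a pair belongs can be decided from the two numbers alone. The sketch below illustrates this; the function names and the convention of returning None for a pair outside the four categories are assumptions made here for illustration.

```python
def polygon_type(p, n_net, n_link):
    """Classify polygon number p using the numbering scheme of the paper."""
    if p <= n_net:
        return "net"
    if p <= n_net + n_link:
        return "link"
    return "text"


# The four pair categories handled by the net generator.
PAIR_FILES = {("net", "net"), ("link", "text"), ("net", "text"), ("net", "link")}
_ORDER = {"net": 0, "link": 1, "text": 2}


def pair_file(i, j, n_net, n_link):
    """Return the pair type of (i, j), or None if it is not one of the four."""
    a, b = sorted((polygon_type(i, n_net, n_link), polygon_type(j, n_net, n_link)),
                  key=_ORDER.get)
    # (net, link) is written with net first, matching the text.
    key = (a, b)
    return key if key in PAIR_FILES else None
```

With 4 net polygons and 1 link polygon, for example, polygon 5 is the link polygon and polygons 6 onwards are text polygons, so `pair_file(7, 5, 4, 1)` is a (link, text) pair.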
Step 2: creating the net polygon-text information file
The number of (net, link) pairs is sufficiently small (only about 6.4% of the total number of pairs) that these can be

Table 5. Performance comparison of original with enhanced scan line pair generator

Chip     OSLPG time (s)           ESLPG time (s)           Total time ESLPG/OSLPG
number   CPU     I/O     Total    CPU     I/O     Total    in %
1         51.3    55.5   106.8    39.2    51.3     90.5    84.7
2        100.3   252.5   352.8    81.9   219.3    301.2    85.4
3        103.7   245.7   349.4    87.5   222.5    310.0    88.7
4        111.4   256.4   367.8    91.3   230.4    321.7    87.5
5         98.2   230.7   328.9    82.0   205.6    287.6    87.4
6        110.5   252.1   362.6    89.7   226.4    316.1    87.2

OSLPG = original scan line pair generator; ESLPG = enhanced scan line pair generator
Step 1 Partition the pairs file into four files by pair type.
Step 2 Process the (net, link), (link, text), and (net, text) pairs to obtain a file that contains all the text associated with each net polygon. This file is sorted in increasing order of net polygon number.
Step 3 Process the (net, net) pairs partitioning the net polygons into equivalence classes where each equivalence class defines a net. The output of this step is a file of pairs (net polygon number, net number). Two net polygons have the same net number iff they are in the same net. This file is in increasing order of net polygon number.
Step 4 Combine the files obtained from steps 2 and 3 to obtain a new file where the records are of the form: (net polygon number, text information, net number).
Step 5 Sort the output file into increasing order of net number.
Figure 11. Outline of net generator

processed in internal memory. An array LinkPolygon[1 .. NumberOfLinkPolygons] is used to point to linked lists of nodes. The linked list for LinkPolygon[i] contains all net polygon numbers j such that (j,i) is a (net, link) pair. This set of linked lists can be set up in time O(NumberOfLinkPolygons + NumberOfNetLinkPairs) without sorting the (net, link) pairs. Once the above linked lists have been set up, the (link, text) pairs are processed. The processing of each
(link, text) pair (i,j) requires us to create as many (NetPolygonNumber, j) pairs (i.e. (net, text) pairs) as there are nodes on the linked list LinkPolygon[i] (one pair is created per node). These (net, text) pairs are appended to the (net, text) file. The time required for this is linear in the number of (net, text) pairs generated. The (net, text) file is now sorted by NetPolygonNumber. Since the number of (net, text) pairs is quite large, an external sort is used 15.
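The processing of step 2 can be sketched in memory as follows. This is an illustration under assumptions rather than the authors' implementation: the function name `expand_link_text_pairs` is hypothetical, a dictionary of Python lists stands in for the LinkPolygon array of linked lists, and an in-memory sort stands in for the external sort.

```python
from collections import defaultdict


def expand_link_text_pairs(net_link_pairs, link_text_pairs, net_text_pairs):
    """Turn (link, text) pairs into (net, text) pairs via the link polygons.

    net_link_pairs: iterable of (net polygon, link polygon) pairs.
    link_text_pairs: iterable of (link polygon, text polygon) pairs.
    net_text_pairs: iterable of direct (net polygon, text polygon) pairs.
    """
    # link_polygon[i] lists every net polygon j such that (j, i) is a
    # (net, link) pair; it is built in one pass, without sorting.
    link_polygon = defaultdict(list)
    for net, link in net_link_pairs:
        link_polygon[link].append(net)

    result = list(net_text_pairs)
    for link, text in link_text_pairs:
        # One (net, text) pair is created per node on the list for this link.
        for net in link_polygon.get(link, ()):
            result.append((net, text))

    # Stand-in for the external sort by net polygon number.
    result.sort()
    return result
```

For instance, if link polygon 10 overlaps net polygons 1 and 2 and text polygon 20, the text of polygon 20 becomes associated with both net polygons.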
Step 3: creating the NetPolygonNumberNetNumberFile
Internal memory methods
Partitioning the net polygons into equivalence classes can be done in time proportional to NumberOfNetPolygons + NumberOfNetNetPairs using the equivalence algorithm discussed in chapter 4 of Horowitz and Sahni 15. The space requirements of this algorithm make it suitable only for smaller chip layouts. As pointed out in Horowitz and Sahni 15, the space requirement of this algorithm is also O(NumberOfNetPolygons + NumberOfNetNetPairs). Let us estimate the space required to use the equivalence algorithm of Horowitz and Sahni 15 to partition the net polygons from a 200 000 polygon layout. From the data provided in Table 1, it can be seen that the number of net polygons is typically 75% of the number of polygons (or approximately 150 000 for this example). The number of (net, net) pairs is comparable to the number of polygons. So, a 200 000 polygon layout is expected to contain about 150 000 net polygons and about 200 000 (net, net) pairs. The data structure used in Horowitz and Sahni 15 sets up linked lists that require a total of NumberOfNetPolygons + 2 x NumberOfNetNetPairs link fields and 2 x NumberOfNetNetPairs polygon number fields. Each link field
Table 6. Performance comparison of SPNG and IMNG

Chip     SPNG time (s)            IMNG time (s)            Time IMNG/SPNG
number   CPU     I/O     Total    CPU     I/O     Total    in %
1         15.7    32.3    48.0     7.8    32.4    40.2     83.7
2        145.8   135.8   281.6    28.7   131.8   160.5     57.0
3        173.3   120.8   294.1    26.8   120.7   148.5     50.5
4        325.1   148.8   473.9    31.5   135.9   167.4     35.3
5        145.8   103.7   249.5    23.7   103.6   127.3     51.1
6        307.3   148.7   456.0    31.4   135.7   167.1     36.6

SPNG = Sperry's previous net generator; IMNG = internal memory net generator
should be capable of pointing at any of 2 x NumberOfNetNetPairs nodes and hence should be at least ⌈log2 (2 x NumberOfNetNetPairs)⌉ bits long. If the net polygons are numbered 0 through NumberOfNetPolygons - 1, then the polygon number fields need to be at least ⌈log2 NumberOfNetPolygons⌉ bits long. For the 200 000 polygon example, the link fields should be at least ⌈log2 400 000⌉ = 19 bits and the polygon number field at least ⌈log2 150 000⌉ = 18 bits. The total space required is therefore 150 000 x 18 + 400 000 x 19 = 1.03 x 10^7 bits plus program and I/O buffer space. On a byte oriented computer, a whole number of bytes/field would be used. This would require 3 bytes (eight bits/byte)/field. The total space required for the fields is 550 000 x 3 = 1.65 x 10^6 bytes. Our target computer does not have this much memory available for this application. Further, considering that new layouts consist of a million or more polygons, the space requirement quickly exceeds 10 Mbyte. The memory requirements can be drastically reduced by using the union-find trees strategy to obtain equivalence classes 15. The space needed by this method is NumberOfNetPolygons x ⌈log2 NumberOfNetPolygons⌉ bits. For the 200 000 polygon example, this is 150 000 x 18 bits = 2.7 x 10^6 bits or 4.5 x 10^5 bytes (additional space is required for the program and I/O buffers). A layout with one million net polygons will require about 3 x 10^6 bytes for the union-find tree structures and some additional space for the program and I/O buffers. The space reduction afforded by the union-find trees solution is obtained at the expense of some computing time. The run time of this scheme is slightly more than linear. Despite this loss, the run times are still acceptable. Table 6 shows the measured run times of the resulting net generator. As can be seen, these times are significantly smaller than those of the method employed in Sperry's previous net extractor.
This net extractor had a run time proportional to NumberOfNetPolygons x NumberOfNetNetPairs.
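The union-find strategy for step 3 can be sketched as follows. This is a minimal illustration, not the authors' Fortran code: the names `find` and `net_numbers` are hypothetical, net polygons are assumed to be numbered 0 through NumberOfNetPolygons - 1, and union by size with path halving is one common variant chosen here. The separate `size` list used below is a convenience that costs a second array; the space figure quoted in the text corresponds to a single array of parent fields.

```python
def find(parent, x):
    """Return the root of the tree containing x, halving the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def net_numbers(n_polygons, net_net_pairs):
    """Assign a net number to every net polygon.

    Returns a list of (net polygon number, net number) pairs in increasing
    order of net polygon number, as required by step 3 of Figure 11.
    """
    parent = list(range(n_polygons))
    size = [1] * n_polygons
    for i, j in net_net_pairs:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            if size[ri] < size[rj]:
                ri, rj = rj, ri
            parent[rj] = ri          # attach the smaller tree to the larger
            size[ri] += size[rj]

    # Number the nets in order of first appearance.
    labels = {}
    out = []
    for p in range(n_polygons):
        root = find(parent, p)
        out.append((p, labels.setdefault(root, len(labels))))
    return out
```

Each (net, net) pair is read once and then discarded, which is why only per-polygon storage is needed and the pairs file never has to be held in memory.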
External memory methods
While the union-find scheme of the previous section significantly reduces the space requirements for step 3 of the net generator, we expect that this reduction will not be adequate in many environments. For example, a 3 Mbyte data space is not available on most microcomputers. In these environments, the CPU capabilities are adequate to perform net extraction for large layouts (millions of polygons) in an acceptable amount of time. In this section, two algorithms for step 3 are explored that can work even when memory is severely limited. The first algorithm is a hierarchical method based on the union-find scheme. This is not guaranteed to succeed on all layouts, but does so on typical ones. The second algorithm is guaranteed to complete its task in the given amount of memory.
Figure 12. Layout partitioning for hierarchical algorithm

Table 7. Breakdown of times (s) for example layout

Algorithm           CPU      I/O      Total
Internal             7.829   32.355   40.184
Hierarchical (16)   16.524   49.820   66.344
Hierarchical (7)    16.538   49.976   66.514
Hierarchical algorithm
In this algorithm, the layout is partitioned using horizontal and vertical cuts as in Figure 12(a). The partitioning is done in such a way that each partition has about the same number of polygons. This number should not exceed the number of polygons that can be handled in the available memory by the union-find method of step 2. It may be necessary to renumber the polygons in each partition, beginning at '0', so that the number of bits needed per polygon is reduced. The use of horizontal and vertical lines that span the layout (as in Figure 12(a)) is not essential. One could use a partitioning scheme such as the one in Figure 12(b). The hierarchical algorithm comprises the steps given in Figure 13.
Experimental results. Chip number 1 of Table 1 has 22 032 net polygons. When adequate memory to handle this many polygons using the union-find method is available, the net-net pairs can be processed in 40.184 s. Assume that enough memory to handle only 4000 polygons is available. If the layout is partitioned into exactly 16 equal area partitions, the number of net polygons in each partition is as shown in Figure 14(a). The union-find method may be used on each partition. The number of nets that result for each partition are shown in Figure 14(b). The total number of nets is 3765. So, even if the local nets are not eliminated, there is adequate space to process all the nets (steps 5 and 6). Of the 22 103 (net, net) pairs generated by the pair generator, only 2817 are mixed pairs (i.e. in partition p + 1). When these are processed the number of nets that remain is 1963. The total time taken by the hierarchical method is 66.344 s (see Table 7). So, time has increased by approximately 50% while the space required is about one-sixth. An examination of Figure 14(a) reveals that the task can be accomplished using the same amount of memory, (i.e. for 4000 polygons) and only 7 partitions as in Figure 15(a). If this is done, the number of nets/partition is as shown in Figure 15(b). The total number of nets created is 3469 and the number of mixed pairs is 2080. When these are processed the number of nets that remain is 1963. The total time taken is 66.514 s which is about the same as when sixteen partitions were used (c.f. Table 7). This is not surprising as the union-find algorithms run in near linear time. So, it is expected that the hierarchical method will have
Step 1 Assignment of polygons to partitions
Each polygon is assigned to a unique partition. This is the partition into which the polygon vertex with least x-coordinate falls (if several vertices have this x value, then pick, from these, the one with least y).
Step 2 Pair partitioning
Let p be the number of partitions of the layout area. Assume these partitions are numbered 1 through p. The pairs are partitioned into p + 1 partitions. Let (i,j) be a pair of overlapping net polygons. If i and j are in the same layout partition k, then (i,j) is in the pair partition k, 1 ≤ k ≤ p. If i and j are in different layout partitions, then (i,j) is in the pair partition p + 1. Pairs in this last partition are called mixed pairs.
Step 3 Renumbering
If a layout partition has more polygons than can be handled by the union-find method and the available memory, then renumber the polygons from '0' if this remedies the problem. Otherwise, this partition has to be partitioned further. If the polygons are renumbered, then renumber the pairs corresponding to this partition.
Step 4 Process each partition
Process all net polygons and (net, net) pairs in the same partition using the union-find method. This processing assigns each polygon in each partition to a unique net (or equivalence class).
Step 5 Net classification
Consider the nets created in step 4. A net N is local if the pair partition p + 1 contains no pair (i,j) such that i (or j) is in N. All local nets may be output.
Step 6 Process remaining nets
Let n be the number of polygons that can be handled by the union-find method. If the number, r, of remaining nets is no more than n, then regard these nets as polygons; transform the polygon pairs (i,j) in partition p + 1 into net pairs (o(i), o(j)) where o(i) is the net containing polygon i; and use the union-find algorithm to process these pairs. The resulting nets are output.
Step 7 Too many remaining nets
If r in step 6 exceeds n, then attempt to merge the nets of the partitions together in such a way that adequate memory is available at each step. This typically involves considering pairs of adjacent partitions. The net-net pairs (i,j) with i and j in these two partitions together with the nets for each of these partitions are processed as in step 6. The two partitions are replaced by a single one, and local nets are output. The pairwise processing of partitions is continued until either only one partition remains or there is no pair that can be processed in the available memory. In the latter case, the algorithm fails to construct all nets.
Figure 13. Hierarchical algorithm a run time almost independent of the number of partitions used.
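Steps 2 and 4 to 6 of Figure 13 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the Fortran implementation described in the paper: the names `UnionFind` and `hierarchical_nets` are hypothetical, the particular union-find variant (path halving with union by size) is our choice since the text only says "union-find method", the assignment of polygons to partitions (step 1) is taken as given, and the renumbering and fallback steps (3 and 7) are omitted.

```python
from collections import defaultdict


class UnionFind:
    """Union-find over hashable items (path halving, union by size)."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in self.parent}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def hierarchical_nets(partition_of, pairs):
    """partition_of maps polygon -> layout partition (step 1 assumed done).
    pairs is the list of overlapping (net, net) polygon pairs.
    Returns the final nets as a sorted list of sorted polygon lists."""
    # Step 2: pairs inside one partition stay local; the rest are mixed pairs.
    local_pairs, mixed = defaultdict(list), []
    for i, j in pairs:
        if partition_of[i] == partition_of[j]:
            local_pairs[partition_of[i]].append((i, j))
        else:
            mixed.append((i, j))

    # Step 4: run union-find on one partition at a time.
    members = defaultdict(list)
    for poly, part in partition_of.items():
        members[part].append(poly)
    net_of = {}  # polygon -> identifier of its partition-level net
    for part, polys in members.items():
        uf = UnionFind(polys)
        for i, j in local_pairs[part]:
            uf.union(i, j)
        for poly in polys:
            net_of[poly] = uf.find(poly)

    # Step 5: a net touched by no mixed pair is local and needs no more work.
    touched = {net_of[i] for i, _ in mixed} | {net_of[j] for _, j in mixed}

    # Step 6: regard the remaining nets as polygons and process mixed pairs.
    top = UnionFind(touched)
    for i, j in mixed:
        top.union(net_of[i], net_of[j])

    nets = defaultdict(list)
    for poly, net in net_of.items():
        root = top.find(net) if net in touched else net
        nets[root].append(poly)
    return sorted(sorted(v) for v in nets.values())
```

For instance, with polygons 1 to 3 in one partition, 4 to 6 in another, and pairs (1,2), (4,5) and (2,4), only the pair (2,4) is mixed, and the second-level union-find has to hold two nets rather than six polygons.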
Figure 14. Partitioning (16) for example layout: (a) polygons in each partition, and (b) nets in each partition

(a)
 498  1055  1076   457
1138  2791  2805  1142
1067  2763  2829  1071
 500  1092  1161   578

(b)
  99   177   187    76
 198   461   500   189
 184   473   476   188
 113   162   168   105

Figure 15. Partitioning (7) for example layout: (a) polygons in each partition, and (b) nets in each partition

(a) 3086, 2791, 2805, 2763, 2829, 3797, 3952
(b) 455, 461, 500, 473, 476, 552, 543

Nonhierarchical method
The hierarchical method just described fails to work when the layout contains a very large number of nets that span all the partitions. The method described in this section will succeed under all circumstances. This method is an adaptation of the internal memory equivalence algorithm described in Chapter 4 of Horowitz and Sahni 15. The available memory is used for the following data structures.

First, there is the process stack (Stack). Let the maximum number of polygons in a net be MaxNetSize. Ideally, a stack capable of holding MaxNetSize - 1 polygon numbers is needed. In a typical layout, most nets (about 90%) are quite small, i.e. no larger than some small bound. The performance of the algorithm is not materially affected if the stack is only large enough to hold one polygon number fewer than that bound. In this case, segments of the process stack need to be written to and retrieved from disc when the larger nets are processed.

Second, there is the polygon status table (PolygonOut). This is a one-dimensional array in which each entry is a bit (true/false). Initially, PolygonOut(i) = false for all i. PolygonOut(i) is set to true as soon as polygon i has been output in some net. If there are n net polygons, then n bits are needed for this table. When n is large, this table resides on disc. It is stored in fixed-length blocks so that the block for a particular i may be retrieved with a single disc access. For the implementation here, it was assumed that n bits of memory are available for this table. For good performance, it is necessary that this table reside in memory.

Third, there is the block table (Block). The first step in the net generation algorithm is to construct a file of polygon overlap lists. The polygon overlap list for net polygon i consists of all net polygons with which polygon i overlaps. The lists for several net polygons are packed into a single block that has the following structure:

| n | s2 | ... | sn | list1 | list2 | ... | listn |

where n = number of lists in the block, and si = start of list i in the block (note that list1 always begins at position n + 1).
If list1 is the overlap list for polygon i, then list2 is for polygon i+1, list3 for polygon i+2, etc. All blocks are of the same size, and a list is not split across blocks unless it does not fit into a single block. In that case, the list begins in a new block and uses as many blocks as necessary (leaving empty space in the last block if necessary). The block table, Block, is a one-dimensional array that has as many entries as there are blocks. Block(i) gives the polygon number corresponding to listn in block i. We assume that the number of blocks is sufficiently small that the block table may reside in memory at all times.

Fourth, there is the block page table (BlockPage). A certain amount of memory is set aside to hold some of the blocks. This memory is divided into pages, where each page is the size of one block. Let p be the number of pages available. BlockPage is a one-dimensional array that has as many entries as the table Block. BlockPage(i) = 0 iff block i is not in one of the memory pages. If BlockPage(i) ≠ 0, then block i is in page BlockPage(i). After the blocks have been created, the block page table has the values given below:

\[
\mathrm{BlockPage}(i) =
\begin{cases}
i, & 1 \le i < p \\
0, & p \le i < \mathrm{TotalBlocks} \\
p, & i = \mathrm{TotalBlocks}
\end{cases}
\]

When the number of blocks is large, the organization of this table may be changed so that it has only p entries. In this organization, BlockPage(i) gives the block number of the block in the ith page, and a search of the table is necessary to determine whether a block is currently in memory.

The nonhierarchical algorithm consists of the steps given in Figure 16.
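The block layout and the two tables can be illustrated with a short Python sketch. It is a minimal sketch under several assumptions that are not in the paper: the function names (`encode_block`, `pack_blocks`, `overlap_list`, `initial_block_page`) are hypothetical, polygon numbers start at 1 so that 0 can be used as padding to mark the end of the last list in a block, lists longer than one block are not handled, and at least p blocks are assumed to exist. The block table stores the polygon number of the last list in each block, following the text as printed.

```python
from bisect import bisect_left


def encode_block(lists, block_size):
    """Lay out lists as [n, s_2, ..., s_n, list_1, ..., list_n], padded with 0."""
    n = len(lists)
    words = [n]
    pos = n + 1                       # list 1 starts at 1-based position n + 1
    for lst in lists[:-1]:
        pos += len(lst)
        words.append(pos)             # 1-based starts s_2, ..., s_n
    for lst in lists:
        words.extend(lst)
    if len(words) > block_size:
        raise ValueError("lists spanning several blocks are not handled here")
    return words + [0] * (block_size - len(words))


def pack_blocks(overlap, block_size):
    """overlap[k] is the overlap list of polygon k + 1. Returns (blocks, Block)."""
    blocks, block_table, current = [], [], []
    for poly, lst in enumerate(overlap, start=1):
        used = len(current) + sum(map(len, current))   # words used so far
        if current and used + 1 + len(lst) > block_size:
            blocks.append(encode_block(current, block_size))
            block_table.append(poly - 1)   # polygon of the block's last list
            current = []
        current.append(lst)
    if current:
        blocks.append(encode_block(current, block_size))
        block_table.append(len(overlap))
    return blocks, block_table


def overlap_list(poly, blocks, block_table):
    """Locate polygon poly's list through the block table and the s_i entries."""
    b = bisect_left(block_table, poly)     # first block whose last polygon >= poly
    words = blocks[b]
    n = words[0]
    k = poly - (block_table[b] - n + 1)    # 0-based index of the list in the block
    starts = [n + 1] + words[1:n]          # 1-based starts s_1, ..., s_n
    begin = starts[k] - 1
    if k + 1 < n:
        end = starts[k + 1] - 1
    else:                                  # last list runs up to the padding
        end = begin
        while end < len(words) and words[end] != 0:
            end += 1
    return words[begin:end]


def initial_block_page(total_blocks, p):
    """BlockPage after block creation (index 0 unused; assumes total_blocks >= p)."""
    page = [0] * (total_blocks + 1)
    for i in range(1, p):
        page[i] = i                        # the first p - 1 blocks stay in memory
    page[total_blocks] = p                 # the last block sits in the buffer page
    return page
```

Because the block table is sorted by polygon number, finding the block that holds a given overlap list is a binary search in memory, so at most one disc access is needed per list.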
computer-aided design

Step 1 Create blocks of overlap lists
First, the file of (net, net) pairs is processed and a new file containing pairs (i,j) and (j,i) for every pair (i,j) in the (net, net) file is created. This new file is sorted into nondecreasing order of i and, within i, into increasing order of j. A single scan over this sorted file is performed, and the blocked polygon overlap lists together with the block table are created.

Step 2 Blocks in memory
Suppose that p pages of memory are available. So, p of the blocks created in step 1 can be retained in memory and the remainder written to disc. Since one page is used as a buffer in which the next block is created, blocks can be selected for the remaining p - 1 pages only. This is done by attempting to predict the first p - 1 blocks that will be required by the algorithm. The block page table is initialized in this step.

Step 3 Initialize
The polygon status table, PolygonOut, is initialized to 'false', the process stack is initialized to be empty, net polygon 1 is selected as the polygon from which processing will begin, and PolygonOut(1) is set to true.

Step 4 Process
To process a polygon, all polygons on its overlap list that have not yet been output are added to the process stack. Their status bits are set to true and they are output as part of the current net. If the overlap list for the polygon being processed is not in one of the memory pages, then the appropriate block is input. This overwrites a block currently in memory. A first-in-first-out discipline is used for page replacement.

Step 5 Iterate
Polygons are extracted from the process stack and processed as in step 4 until no polygons remain on this stack for processing. When this happens, an entire net has been output.

Step 6 Start next net
To start the next net, we search through PolygonOut for a polygon that has not been output. If all have been output, then no new nets are to be created. Otherwise, the polygon not yet output is the seed for the next net. It is output and its status set to true. Control is transferred to step 4.

Figure 16. Nonhierarchical algorithm

Optimization. There are several possibilities for optimizing the basic strategy outlined above. These are stated below.

First, since the elements of PolygonOut are accessed in a random order and since each element may be accessed several times (at least twice), it is highly recommended that this array be kept in memory in its entirety. If there are n net polygons, then n bits are needed for this array.

Second, if inadequate space for an in-memory process stack is available, then the stack may be buffered, with the lower buffers backed up to disc. Since elements are deleted from a stack in a very structured manner (last-in-first-out), this does not result in a substantial increase in disc accesses (unless each buffer can hold only a small number of stack elements).

Third, there is a trade-off between the number of pages and the size of a page. Since the available memory is fixed, the number of pages is determined once the page size is determined. The larger the page size, the smaller the number of pages. Since the page size is equal to the block size, the page size also affects the number of blocks and hence the size of the block page table. Ideally, it would be preferable that when a block is input into memory, all its overlap lists be processed in step 4 before the block is overwritten. When this is the case, each block is input at most once (the blocks that are in memory after step 2 are never written out and hence never read back). This ideal situation is difficult to ensure in practice. If a fraction f, 0 < f ≤ 1, of the lists in a block is actually used each time the block is input, then the block is to be input 1/f times in all. The total time spent entering blocks is
\[
(t_s + t_l + t_t)\,\frac{b}{f} \;-\; (t_s + t_l + t_t)\,p
\]
where t_s = seek time; t_l = latency time; t_t = block transmission time; b = total number of blocks; and p = number of pages in memory. The second term accounts for the fact that p blocks are in memory following step 2. If the block size is increased, f, b and p decrease; however, t_t increases. If the block size is decreased, f, b and p increase while t_t decreases. Since the internal processing time (i.e. the CPU time) is relatively
insensitive to the block size, the block size is to be chosen so that the input time is optimized.

Fourth, the fraction f above is affected by the way in which net polygons are numbered; f tends to be high when polygons that overlap have similar numbers. In this case, it is desirable to replace the process stack with a process heap so that overlapping polygons are processed in order of their numbers 15.

Fifth, by looking ahead, some of the blocks that will be required early in the processing can be determined. For example, the overlap list for polygon 1 can be put on the process stack as it is created and the polygon status for these polygons set to 'true'. An attempt is made to retain the blocks containing the lists for the polygons on polygon 1's overlap list in the available p pages.

Table 8. Effect of page size on the performance of external memory algorithm

Number of        Number of blocks   Block size   CPU time   I/O time   Total time
blocks on disc   of memory          (words)      (s)        (s)        (s)
 11               1                 3584          8.5       647.7      656.2
 21               2                 1792          8.5       408.6      417.3
 41               4                  896          8.7       270.0      278.7
 81               8                  448          9.0       153.6      162.6
162              16                  224          9.8       141.6      151.4
325              32                  112         11.3       149.6      160.9
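The input-time expression given under the third optimization is easy to evaluate for candidate block sizes. The function below is a small sketch, assuming that estimates of f and of the device times are available from measurement; the name `total_input_time` and the sample values are hypothetical and are not taken from the paper.

```python
def total_input_time(t_s, t_l, t_t, b, f, p):
    """Estimated time spent inputting blocks.

    t_s, t_l, t_t: seek, latency and block transmission times (same unit)
    b: total number of blocks; p: number of pages in memory
    f: fraction of a block's lists used per input, 0 < f <= 1
    """
    if not 0 < f <= 1:
        raise ValueError("f must satisfy 0 < f <= 1")
    per_block = t_s + t_l + t_t
    # Each block is input 1/f times; the p blocks already resident after
    # step 2 save one input each.
    return per_block * b / f - per_block * p


# Hypothetical comparison of two block sizes (times in milliseconds).
large_blocks = total_input_time(t_s=30, t_l=8, t_t=16, b=41, f=0.2, p=4)
small_blocks = total_input_time(t_s=30, t_l=8, t_t=2, b=325, f=0.6, p=32)
```

Since f depends on how the polygons are numbered and on the order in which lists are requested, it has to be measured or estimated for each block size; the expression only combines those estimates.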
Experimental results. In the implementation, the first p - 1 blocks and the last block were simply retained in memory at the end of step 2. Further, we assumed that enough internal memory is available to retain the polygon status table, the block page table, and the process stack. Table 8 shows the results of varying the block size under the assumption that a total of 3584 words of memory is available for pages. The layout used is layout 1 of Table 1. As expected, CPU time is quite insensitive to block size. However, the I/O time has the expected trough-like behaviour, achieving a minimum of 141.6 s when the block size is 224 words.
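Steps 3 to 6 of Figure 16 can be sketched as follows. This is an illustrative Python sketch rather than the authors' Fortran code: the names `PagedLists` and `extract_nets` are hypothetical, each block is modelled as a dictionary from polygon number to overlap list instead of the packed layout, the preloading of step 2 is omitted, and PolygonOut and the process stack are assumed to fit in memory.

```python
from collections import deque


class PagedLists:
    """Serves overlap lists from blocks, keeping at most p blocks resident
    with first-in-first-out replacement, and counts block inputs."""

    def __init__(self, blocks, p):
        self.blocks = blocks                 # stand-in for the disc file
        self.where = {poly: b for b, blk in enumerate(blocks) for poly in blk}
        self.pages = deque(maxlen=p)         # resident block numbers, oldest first
        self.inputs = 0

    def get(self, poly):
        b = self.where[poly]
        if b not in self.pages:
            self.pages.append(b)             # a full deque drops its oldest entry
            self.inputs += 1
        return self.blocks[b][poly]


def extract_nets(n, lists):
    """n net polygons numbered 1..n; lists.get(i) is polygon i's overlap list."""
    polygon_out = [False] * (n + 1)          # PolygonOut; index 0 unused
    nets = []
    for seed in range(1, n + 1):             # step 6: next polygon not yet output
        if polygon_out[seed]:
            continue
        polygon_out[seed] = True             # the seed is output
        net, stack = [seed], [seed]
        while stack:                         # step 5: until the stack is empty
            poly = stack.pop()
            for q in lists.get(poly):        # step 4: scan the overlap list
                if not polygon_out[q]:
                    polygon_out[q] = True
                    net.append(q)
                    stack.append(q)
        nets.append(net)
    return nets
```

The counter `inputs` gives the number of block inputs for a given page count, which is the quantity the block size experiments of Table 8 trade off against transmission time.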
Steps 4 and 5: associating text information
The output file from step 3 consists of pairs (polygon number, net number). This file is in nondecreasing order of net number. To introduce the text information, this file is first sorted into increasing order of polygon number. The resulting file is merged with the (polygon number, text) file created in step 2. Finally, the file is reordered by net number to get the desired output.
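The sort, merge and re-sort sequence can be sketched with in-memory lists standing in for the files. This is a minimal sketch, assuming that the (polygon number, text) file is keyed by net polygon number and is already in increasing order of polygon number; the function name `attach_text` and the record layout are hypothetical.

```python
def attach_text(net_file, text_file):
    """net_file: (polygon number, net number) pairs in any order.
    text_file: (polygon number, text) pairs sorted by polygon number.
    Returns (net number, polygon number, texts) records ordered by net number."""
    by_polygon = sorted(net_file)            # sort on polygon number
    merged, t = [], 0
    for poly, net in by_polygon:             # single merge pass over both files
        while t < len(text_file) and text_file[t][0] < poly:
            t += 1                           # skip text for polygons not listed
        texts = []
        while t < len(text_file) and text_file[t][0] == poly:
            texts.append(text_file[t][1])
            t += 1
        merged.append((net, poly, texts))
    merged.sort(key=lambda rec: (rec[0], rec[1]))   # reorder by net number
    return merged
```

With files on disc, the two sorts would be external sorts and the merge a sequential pass, so no step needs the whole layout in memory.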
CONCLUSIONS
A net extraction system has been developed that may be used even when memory is severely limited. The system makes effective use of good data structuring and algorithm design principles. The system compares very favourably with the previous net extraction system at Sperry. The new system runs significantly faster and utilizes much less space. On our six-chip test set, the new system took between a third and a half of the time required by the old software, and its memory requirements are less than half those of the old software. As a result, it is possible to handle much larger layouts using the available memory and less computer time. When layout size exceeds the capacity of the internal memory methods, one of the two external memory schemes discussed may be used. The hierarchical scheme is about 2.5 times faster than the optimized version of the nonhierarchical scheme. However, it does not guarantee success. The major contributions made here are summarized as follows:

• It has been shown how the rectangle overlap algorithm of Bentley et al 11 may be enhanced by using only one record per rectangle rather than the two suggested in Bentley et al 11.
• A fast and practical algorithm for overlap detection of arbitrary polygons has been developed. This involves working with the intersection rectangle of the bounding rectangles. This algorithm was found to be faster than the O(N log N) algorithm on all polygon pairs tested 14.
• Two external memory methods for net extraction have been developed. These are very practical in environments where the internal memory available is inadequate. In fact, the hierarchical method could be expected to require only about 50% more time than the internal memory method.
ACKNOWLEDGEMENTS
The work for this paper was supported in part by the National Science Foundation under grant DCR-8305567. Surendra Nahar was also supported in part by Sperry Corporation as a student intern.
REFERENCES
1 Baird, H S and Cho, Y E 'An artwork design verification system' Proc. 12th Des. Automation Conf. (1975) pp 414-420
2 Baird, H S 'Fast algorithms for LSI artwork analysis' J. Des. Automation and Fault-Tolerant Computing Vol 2 No 2 (1978) pp 179-209
3 Le Carpentier, Jacques 'Computer aided synthesis of an IC electrical diagram from mask data' IEEE Int. Solid-State Circuits Conf. (1975) pp 84-85
4 Chapman, P T and Clark Jr., K 'The scan line approach to design rules checking: computational experiences' Proc. 21st Des. Automation Conf. (1984) pp 235-241
5 Lauther, U 'An O(N log N) algorithm for Boolean mask operations' Proc. 18th Des. Automation Conf. (1981) pp 555-562
6 Preas, B T, Lindsay, B W and Gwyn, C W 'Automatic circuit analysis based on mask information' Proc. 13th Des. Automation Conf. (1976) pp 309-317
7 Szymanski, T G and Van Wyk, C J 'Space efficient algorithms for VLSI artwork analysis' Proc. 20th Des. Automation Conf. (1983) pp 734-739
8 Wilmore, J A 'Efficient Boolean mask operations on IC masks' Proc. 18th Des. Automation Conf. (1981) pp 571-579
9 Bentley, J L and Ottmann, T A 'Algorithms for reporting and counting geometric intersections' IEEE Trans. Comput. Vol C-28 No 9 (September 1979) pp 643-647
10 Bentley, J L and Wood, D 'An optimal worst case algorithm for reporting intersections of rectangles' IEEE Trans. Comput. Vol C-29 No 7 (July 1980) pp 571-577
11 Bentley, J L, Haken, D and Hon, R 'Fast geometric algorithms for VLSI tasks' IEEE Compcon. Spring (1980) pp 88-92
12 Lauther, U '4-dimensional binary search trees as a means to speed up associative searches in design rule verification of integrated circuits' J. Des. Automation and Fault-Tolerant Computing Vol 2 (1978) pp 241-247
13 Nievergelt, J and Preparata, F 'Plane-sweep algorithms for intersecting geometric figures' Comm. ACM Vol 25 No 10 (October 1982) pp 729-747
14 Shamos, M I 'Geometric intersection problems' Proc. 17th Annual Symp. Foundations of Computer Science (1976) pp 208-215
15 Horowitz, E and Sahni, S Fundamentals of data structures Computer Science Press Inc. (1975)