Comput. & Graphics Vol. 12, Nos. 3/4, pp. 557-564, 1988
0097-8493/88 $3.00 + .00 © 1988 Pergamon Press plc
Printed in Great Britain.
Technical Section
POINT DATA ANALYSIS*

ALLEN KLINGER
Computer Science Department, University of California, Los Angeles, California 90024-1596

and

WARREN K. FOX
Hughes Aircraft Company, El Segundo, California 90245

Abstract--This paper describes graphic modelling algorithms and simulation experiments for modeling solid objects. The source data is idealized point images containing the corner locations of the objects. The results suggest practical constructive procedures for syntactic primitive detection and primitive sequence recognition.

1. FROM DATA TO MODELS
a. Introduction
The process of characterizing shape from line drawings, such as the outline or set of edges derived from the image of a solid object, is well established in the research literature. There are many edge- and boundary-extraction algorithms. Such algorithms are often used to derive lines that are parts of basic constituents of patterns called primitives. (A primitive is an element or unit of a structure.) Combinations of primitives make up objects in images in the same way as letters form words. In particular, primitives are combined according to rules analogous to '"q" is followed by "u"' in spelling. For a basic reference on primitives see [1] or [2, pp. 426-433]. Primitives are the basis of models of objects in scenes in [3, 4]. For geometric solids there is a useful set of line-based primitives due to Guzman[5]. (These are "arrow," "tee," "fork," and several other multiline elements[2, pp. 442-449].)

However, a large number of situations and sensor modes give rise to data that are predominantly points or blobs (diffuse points). Many imaging devices use transform data (Fourier, Mellin). High spatial frequencies are detectable in the transform domain through filtering. This makes it possible to sense sharp corners, which may be particularly useful in detecting the presence of manmade objects from synthetic aperture radar imagery, or in other remote sensing situations. Finally, blobs in imagery (which of course may arise from noisy point data) have been the basis of a successful commercial blood-cell counting machine[6] that treated them as their "eroded" central points.

Practical use of point data in these situations is not matched by theoretical work on models that derive primitives from sets of points. This paper addresses that issue by using computer graphics and simulation models to ascertain the utility of point data in recognition of manmade solid objects. The basic concept of our research is as follows.
Figure 1 illustrates a solid object and the point data derived from the object. The task is to construct a model of the underlying object given only the point data. Our bottom-up approach begins with identifying primitives in the raw data. Further steps in the model-building process then operate on the primitives. We apply syntactic pattern recognition techniques to the set of discovered primitives. This paper concerns two systems of point data primitives and the object-characterization possibilities of the primitives.

b. Background
Many algorithms connect points to neighbors based on local properties; the connection criterion is called a neighborhood function. In a pioneering work on point interconnection, Zahn[7] used the minimal spanning tree neighborhood function. Ahuja[8] applied Voronoi diagrams as neighborhood functions, obtaining improvements in segmentation and pattern matching over earlier systems. Toussaint[9] presented the relative-neighborhood graph (RNG), a subgraph of the Delaunay triangulation (the dual of the Voronoi diagram), and concluded that RNGs produce useful point-adjacency graphs. Toussaint[10] gives an extensive treatment of neighborhood functions and point processing techniques.

Point processing has been applied by Lavine, Lambird and Kanal[11], Zucker and Hummel[12], and Klinger, Bassett and Fox[13] to match patterns and characterize points. Reference [11] aligned time-varying data by finding the best match between points taken from two images. Reference [12] labeled points as "interior," "edge" or "noise" and then used relaxation to find a consistent labeling.

c. Objectives
Our goal is to use computer graphics and simulation modelling methods to detect special groups of points in planar images. We further wish to (1) give symbolic labels to these point sets, (2) develop a graphic-based point data modelling method for pattern recognition, and (3) investigate algorithms to classify/identify objects giving rise to the point data. This paper extends work we initiated in [13], tests its algorithmic effectiveness, and adds both a new primitive and a computer graphics based software testing environment.

* This research was partially supported by the State of California and the Hughes Aircraft Company through the MICRO program.
Fig. 1. Visible edges and corner points.
2. DEFINITIONS
This section combines concepts from geometry and real analysis with graph-theoretic and computer-vision/scene-analysis notions. The underlying approach of our experiments, that is, the algorithm idea, and the categories we apply to computer graphics models of solid-object images result from these definitions. The categories are shown by pictorial examples in tables in the last two subsections of this part. The idea is a well-known network concept, presented briefly next to make this paper self-contained.
a. Minimal spanning tree
Let V be a collection of nodes. A spanning tree T on V is a set of edges joining the nodes such that the graph is connected and without cycles[14]. (A connected graph has a path between every pair of nodes; a cycle is a closed path that returns to its starting node.) The total cost of a spanning tree T is the sum of the weights of all of T's edges. The minimal spanning tree (MST) is the spanning tree with the lowest total cost among all spanning trees.

b. Neighborhood primitives
A set of straight-line intersections found in a two-dimensional image representing a solid composed of intersecting planes, described in [5], became a key line-analysis technique in computer vision. This paper contributes an innovation which extends that set to situations where points are the basic data elements. A new set of primitive subclasses (see Table 1) is based on the neighborhood, or region surrounding every data point ("centerpoint"). These classes are based on the gross geometry and number of neighboring data points to the centerpoint. The definitions here use either the minimal spanning tree information or a measure of angles about the centerpoint.

Table 1. Description of neighborhood primitives.

Primitive   No. Points in
Type        Neighborhood    Picture            Description
O           1               +                  Single point
I           2               +--+               Two points
L           3               (not reproduced)   Point not connected to both others
V           3               (not reproduced)   Point connected to both others
Y           4               (not reproduced)   Point connected to others and not in a half-plane
T           4               (not reproduced)   Point not connected to others
W           4               (not reproduced)   Point connected to others in a half-plane
X           5               no restrictions    5 points
*           6+              no restrictions    6 or more points

Table 2. Description of point-pair primitives.

Primitive     No. Point-pairs
Type          With Same Length   Description
S (single)    1                  Single pair with same orientation as another pair
D (double)    2                  Two pairs with same orientation and length
P (triple)    3                  Three pairs with same orientation and length
M (many)      4+                 Four or more pairs having same orientation and length
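As an illustration of the definitions in Section 2a and of the point counts in Table 1, the following is a minimal sketch, not the authors' C implementation. It assumes Euclidean edge weights, uses Prim's algorithm (one standard way to obtain an MST), and treats the neighborhood as a disc of a caller-chosen `radius` around each centerpoint. The function names and the `radius` parameter are hypothetical. The geometric tests that separate L from V, or Y from T and W, are not attempted; only the size class is reported.

```python
import math

def minimal_spanning_tree(points):
    """Return MST edges (i, j, weight) for 2-D points using Prim's algorithm."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the cheapest node that is not yet in the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def neighborhood_sizes(points, radius):
    """Count, for each centerpoint, the points within `radius` (itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]

# Size classes from Table 1; types that share a size are not separated here
# (assumption: the centerpoint itself is counted, as in rows O and I).
SIZE_CLASSES = {1: "O", 2: "I", 3: "L|V", 4: "Y|T|W", 5: "X"}

def size_class(count):
    """Map a neighborhood point count to its Table 1 size class."""
    return SIZE_CLASSES.get(count, "*")   # six or more points
```

For the four corners of a unit square, the MST has three unit-length edges, and a radius of 1.1 gives every corner a three-point neighborhood (size class L or V).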
c. Point-pair primitives
Since parallel lines are useful in analysis, we seek these by defining pairs of points as entities[15], as part of the process of locating candidate object edges. Table 2 describes how groups of points can indicate the presence of edges; multiple-edge primitives strengthen hypotheses regarding the presence of planar surfaces.

Table 2 lists four point-pair primitives. Each pair of points delimits a distance (the length of the segment that joins them) and an orientation (the angle the joining line makes with the horizontal axis). The "point-pair primitive-set" labels all point-pairs that have similar distance and orientation. The point-pair labeling process uses three threshold parameters: distance, angular resolution and length-similarity values. The distance threshold is an upper bound for the point-pair distances to be considered as instances of a primitive. The angular resolution threshold is the maximum angle difference for point-pairs to be labeled "same orientation." The length-similarity threshold is the tolerance within which two point-pairs are labeled "same length."

Fig. 2. Objects used in experiments (piped, pedestal, and cube).

3. EXPERIMENTAL ANALYSIS
a. Data
This section develops the computer graphics model we use. We assume that a digitized grey-level image has been processed (for example, by convolution with masks followed by high-pass spatial filtering and thresholding) to yield a binary array whose one-values represent object corners, edges/lines, and isolated points. These experiments simulated such data arrays by a short list of Cartesian coordinate values (x-y pairs) that correspond to object corners. In [13] a program to rotate the underlying solid object was used to generate many such lists. A different program for rotations[15] added perspective transformation before projecting the three-dimensional object coordinates onto a two-dimensional viewing surface. The basic dataset for our experiments consisted of 3,000 sets of coordinate values: 1,000 views of each of three different objects, "cube," "piped," and "pedestal" (see Fig. 2).

b. Experiments
The general idea of the computations is to find any instances of the several point primitives in the views, and to determine the correlation of patterns in primitive types with objects. The left columns of Tables 1 and 2 indicate how primitive type is denoted. The labels in Table 1 adapt the notation of [5] to point data. (For a visual presentation see column 3 of Table 1.) Specifically, the experimental computations detect and label the primitives in each object view, and output associated primitive-label strings. We treat each string of symbols as a concise description of the point image. We then analyze the set of strings using syntactic pattern recognition techniques[1].

c. Program description
We developed several programs to apply the point primitives to dot patterns. The programs[15] consist of approximately 600 executable statements in eight C-language source files. Together, they generate the images and determine their primitive labels. Figure 3 presents a pseudo-code description of the control program. The program begins by reading a data file containing the three-dimensional corner coordinates and edges of an object. The program generates each image by incrementing the viewing angle to the object, applying the perspective transformation, and obtaining the two-dimensional Cartesian-coordinate corner representations. The point analysis computes the distances and angles between each point and the rest of the point set. These data are processed to detect occurrences of both types of primitives.

start
  query user for parameters
  read object data
  loop 1,000 times
    generate view of object points
    increment viewing angle for next iteration
    compute interpoint distances and angles
    if point-pair primitives selected
      compute true edges
      compute primitive labels
    end if
    if neighborhood primitives selected
      compute MST
      compute primitive labels
    end if
    print results
  end loop
stop

Fig. 3. Control program pseudo-code.
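The view-generation and interpoint steps of the control program can be sketched as follows. This is a hedged illustration rather than the original C code: the rotation axis (vertical), the eye position on the negative z axis, the `viewer_distance` value, and all function names are assumptions made here, and hidden corners are not removed, so a cube yields eight points per view rather than only its visible corners.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3-D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point, viewer_distance):
    """Perspective-project onto the plane z = 0 with the eye at z = -viewer_distance."""
    x, y, z = point
    scale = viewer_distance / (viewer_distance + z)   # requires viewer_distance + z > 0
    return (x * scale, y * scale)

def generate_views(corners, n_views, viewer_distance=5.0):
    """Yield one list of 2-D corner points per viewing angle."""
    for k in range(n_views):
        angle = 2.0 * math.pi * k / n_views
        yield [project(rotate_y(p, angle), viewer_distance) for p in corners]

def interpoint_table(points):
    """Return (i, j, distance, orientation in degrees) for every point pair."""
    rows = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            # orientation of the undirected joining line, folded modulo 180 degrees
            angle = math.degrees(math.atan2(dy, dx)) % 180.0
            rows.append((i, j, math.hypot(dx, dy), angle))
    return rows

# Hypothetical test object: a cube with corners at (+/-1, +/-1, +/-1).
CUBE = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
```

Calling `list(generate_views(CUBE, 1000))` would produce one point list per viewing angle, and `interpoint_table` applied to an eight-point view returns 28 rows, one per point pair.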
d. Output format
The output consists of strings of primitive labels, either from {o i l v y t w *} for the neighborhood case, or from {s d p m} for the point-pair case. Since each view contained at most eight Cartesian-coordinate points, the output string size is bounded by eight for the neighborhood primitives, and by twenty-eight for the point-pair primitives (the number of combinations of eight things taken two at a time, 8 x 7 / 2 = 28, reached when every pair is labeled a "single"). Figure 4 shows some example primitive strings that describe different views of the cube.

oiiyttt  iillvtt  iillvtt  oiilvtw
oiilvyt  iillvtt  iillvtt  oiilvtw
oiilvyt  iillvtt  iillvtt  oiilvtw
oiilvyt  iiiivv   iillvtt  oiilvtw
oiiilvt  iiiivv   iillvtt  oiiilvt
iillvtt  iiiivv   oiiilvt  iillvtt
iillvtt  iillvtt  oiilvtw  iillvtt
iillvtt  iillvtt  oiilvtw  iillvtt

Fig. 4. Symbol strings of neighborhood primitives for cube object at thirty-two viewing angles.

e. Pattern analysis
The strings of primitive labels (words) obtained from single viewing angles become the input to the analysis routine. The analysis program scans the output words for patterns. Here a pattern is defined roughly as the occurrence of one or more primitives within a word. For example, the pattern OI is present in any word that has both an O and an I. A pattern can also require that any one of a set of primitives occurs, for example O, I or both in a word. We denote this by parentheses: the pattern (OI) matches any word that contains either an O or an I. Thus adjacency of primitives or of parenthesized groups implies logical "and," while the contents of any parenthesis are logically "or"-ed. The experiments defined approximately fifty distinct patterns and applied them to all three thousand views.
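The pattern notation of the pattern analysis step (adjacency for "and," parentheses for "or") can be sketched as a small matcher. This is an illustrative reading, not the authors' analysis program. In particular, treating a repeated symbol such as OO as "at least two O labels in the word" is an assumption, and the function names are hypothetical.

```python
from collections import Counter

def parse_pattern(pattern):
    """Split a pattern such as 'O(LV)' into literal symbols and 'or' groups."""
    literals, groups = Counter(), []
    i = 0
    while i < len(pattern):
        if pattern[i] == "(":
            close = pattern.index(")", i)
            groups.append(set(pattern[i + 1:close].lower()))
            i = close + 1
        else:
            literals[pattern[i].lower()] += 1
            i += 1
    return literals, groups

def matches(pattern, word):
    """True if `word` satisfies every literal and every parenthesized group."""
    literals, groups = parse_pattern(pattern)
    counts = Counter(word.lower())
    # "and": each literal must occur at least as often as it is written
    if any(counts[symbol] < k for symbol, k in literals.items()):
        return False
    # "or": each group needs at least one of its symbols in the word
    return all(any(counts[symbol] > 0 for symbol in group) for group in groups)

def count_matches(patterns, words):
    """Number of words matched by each pattern, as in a pattern-frequency table."""
    return {p: sum(matches(p, w) for w in words) for p in patterns}
```

Applied to the four words oiiyttt, iillvtt, oiilvtw and iiiivv, `count_matches(["OI", "(OI)", "OO"], words)` gives 2, 4 and 0 matches respectively under these assumptions.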
The pattern search process is illustrated in Fig. 5, where the Fig. 4 words are one input and the user-supplied patterns are the other.

Fig. 5. Pattern search process. [Diagram: user-supplied patterns (OO, OI, (OI), ...) and program-output symbol strings (OOILVT, OIIYTT, IIILLV) feed a matching algorithm, which reports the number of matches for each pattern.]

4. RESULTS
The just-described program generates strings of primitive labels which we analyzed for recurring patterns. We found confirmation of the hypothesis that patterns of primitive labels occur more often in views of one object than the others.

a. Neighborhood primitives
Tables 3 to 5 present the frequencies of a set of twenty-one primitive patterns over the three objects in Fig. 2. Notice the variation of the neighborhood sizes between the tables. These patterns were chosen to demonstrate the potential of this approach. [Detailed entries in [14] list the frequency of co-occurrence of a large set of test patterns.]

Table 3. Small-neighborhood primitive frequencies.
Pattern occurrences per 1000 views.

Pattern    Cube    Piped   Pedestal
O          1000    366     943
I          870     759     690
L          0       448     292
V          340     587     385
Y          0       4       19
T          0       393     253
W          0       8       14
X          0       0       13
*          0       0       0
OO         524     227     729
II         870     758     622
LL         0       396     222
VV         291     277     168
TT         0       337     163
OOO        293     197     516
(OI)       1000    845     1000
(WY)       0       12      33
O(LV)      340     63      552
O(TWY)     0       10      243
I(LV)      340     432     418
I(TWY)     0       340     155

Table 4. Medium-neighborhood primitive frequencies.
Pattern occurrences per 1000 views.

Pattern    Cube    Piped   Pedestal
O          640     97      722
I          932     438     836
L          552     676     377
V          673     825     436
Y          302     19      26
T          790     622     611
W          71      19      72
X          0       2       363
*          0       0       25
OO         191     34      262
II         871     420     450
LL         297     627     112
VV         157     350     266
TT         530     583     458
OOO        51      2       82
(OI)       1000    442     997
(WY)       373     38      98
O(LV)      337     55      437
O(TWY)     526     77      539
I(LV)      673     267     531
I(TWY)     793     225     561

Table 5. Large-neighborhood primitive frequencies.
Pattern occurrences per 1000 views.

Pattern    Cube    Piped   Pedestal
O          0       0       0
I          248     201     745
L          883     753     478
V          794     962     566
Y          0       1       132
T          852     831     776
W          14      11      76
X          584     669     523
*          143     56      195
OO         0       0       0
II         34      122     579
LL         227     300     207
VV         360     366     349
TT         791     750     614
OOO        0       0       0
(OI)       248     201     745
(WY)       14      12      208
O(LV)      0       0       0
O(TWY)     0       0       0
I(LV)      248     201     501
I(TWY)     215     101     660

The tabulated experimental data enable the following observations. Under the small neighborhood radius (Table 3), most of the image points in views of the cube have only one or two close-by points. In our experiments the cube showed only O, I, and V primitives. On the other hand, the piped geometry guarantees a long MST edge. Using a small percentage of the long axis length isolates the points from the two ends. But this threshold also allows most end points to fall within the radius of the nearby points (same end). The piped image set has the O, I, L, V, T, Y, and W primitives and a significantly larger average neighborhood size than the other two objects.

The medium neighborhood radius (Table 4) leads to another set of observations. This radius removes only the longest MST edge from the point interconnections. All of the objects have fewer O's and I's and more of the larger primitives that depend on several points. The Y primitive is detected in 302 cube views compared to only 19 for the piped and 26 for the pedestal. The center point of the seven visible corners of the cube forms the Y configuration with three other points. Similarly, the piped has a large number of T primitives because each point connects to all others at its end and none from the other end. Overall, the cube and pedestal average neighborhood sizes agree while the piped size is significantly larger.

In the large neighborhood radius case shown in Table 5, the relatively equally-spaced cube points have neighborhoods that include most or all of the other points. In contrast, the piped and pedestal point labelings change less; at this radius, the number of X and * primitives becomes significant. The piped has the fewest * labelings due to the separation of the ends by the long axis. None of the objects has an O primitive, while the pedestal has the most I labels, caused by the bottom edge being wider than the radius. Here the large neighborhood causes the Y primitive to become an indicator of the presence of the pedestal object.

Table 6. Edges found by point-pair primitives.

Point-pair parameter values    Number of labeled edges
Angular       Distance         Actual        Spurious      True edges
resolution    threshold        detections    detections    missed
9             .97              8188          8775          180
9             .94              8145          8454          223
9             .91              8103          8122          265
9             .88              8066          7720          302
9             .85              8040          7001          328
9             .82              8022          6175          364
9             .79              8008          5004          360
9             .76              8004          3592          364
9             .73              7988          2483          380
9             .70              7639          1694          729
9             .67              7344          1372          1024
9             .64              6879          1204          1489
9             .61              6091          1067          2277
9             .58              5096          864           3272
12            .97              8013          8295          355
12            .94              7978          7977          390
12            .91              7943          7651          425
12            .88              7096          7228          462
12            .85              7878          6433          490
12            .82              7859          5598          509
12            .79              7841          4452          527
12            .76              7822          3147          546
12            .73              7814          2100          554
12            .70              7459          1342          909
12            .67              7169          1054          1199
12            .64              6705          919           1663
12            .61              5906          803           2462
12            .58              4874          622           3494
15            .97              7832          7703          536
15            .94              7807          7445          561
15            .91              7780          7172          588
15            .88              7745          6729          623
15            .85              7716          5927          652
15            .82              7689          5072          679
15            .79              7671          4046          697
15            .76              7640          2838          728
15            .73              7628          1828          740
15            .70              7265          1128          1103
15            .67              6960          870           1408
15            .64              6484          746           1884
15            .61              5658          640           2710
15            .58              4643          490           3725

Fig. 6. Edge classification and distance threshold. [Plot titled "Labeled Edges vs. Distance Threshold": number of edges (0 to 8000) against distance threshold (.50 to 1.0, in units of the largest interpoint distance), with curves for correctly labeled edges and incorrectly labeled edges.]
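The point-pair labeling of Section 2c and the three counts reported in Table 6 (actual detections, spurious detections, true edges missed) can be sketched as below. This is a hedged reading of the text, not the original program. The greedy grouping strategy, the choice to express the length tolerance as a fraction of the largest interpoint distance, and the function and parameter names are all assumptions. The angular tolerance is taken as 360 degrees divided by the number of divisions, following the description of Table 6.

```python
import math
from itertools import combinations

LABELS = {1: "s", 2: "d", 3: "p"}            # four or more pairs -> "m"

def angle_gap(a, b):
    """Smallest difference between two undirected orientations, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)

def label_point_pairs(points, n_divisions, distance_fraction, length_tolerance):
    """Group pairs of similar orientation and length; label each pair s, d, p or m."""
    pairs = []
    for i, j in combinations(range(len(points)), 2):
        dx = points[j][0] - points[i][0]
        dy = points[j][1] - points[i][1]
        pairs.append(((i, j), math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))))
    if not pairs:
        return {}
    longest = max(length for _, length, _ in pairs)
    max_gap = 360.0 / n_divisions            # assumed angular tolerance
    groups = []                              # each group holds mutually similar pairs
    for pair in pairs:
        _, length, angle = pair
        if length > distance_fraction * longest:
            continue                         # above the distance threshold
        for group in groups:
            _, ref_length, ref_angle = group[0]
            if (angle_gap(angle, ref_angle) <= max_gap
                    and abs(length - ref_length) <= length_tolerance * longest):
                group.append(pair)
                break
        else:
            groups.append([pair])
    return {ij: LABELS.get(len(g), "m") for g in groups for ij, _, _ in g}

def tally(labeled_pairs, true_edges):
    """Count actual detections, spurious detections and true edges missed."""
    labeled = {frozenset(p) for p in labeled_pairs}
    truth = {frozenset(e) for e in true_edges}
    return {"actual": len(labeled & truth),
            "spurious": len(labeled - truth),
            "missed": len(truth - labeled)}
```

With this bookkeeping, actual detections plus true edges missed always equals the number of true edges. Table 6 is consistent with that: the two columns sum to 8368 in every row except two (angular resolution 9 at threshold .82, and 12 at threshold .88).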
b. Point-pair primitives
Table 6 shows the accuracy of the point-pair primitive labels in finding the true object edges. The table lists 42 parameter settings and, for each, the number of true edges found (actual detections), false edges labeled (spurious detections), and true edges missed. The table specifies the angular resolution as the number of divisions of 360 degrees and the distance threshold as a fraction of the largest interpoint distance. Distance thresholds in the lower end of the range cause the algorithm to retain most of the true edges but only a minimal number of the false edges. For example, with 9 angular divisions a distance threshold of .70 retains 7639 true edges with 1694 spurious detections, whereas a threshold of .97 yields 8188 true edges but 8775 spurious detections. Detailed analysis[14] confirms that a choice of parameters such as 12 angular divisions and a small radius maximizes good edge detections and minimizes false edge labels; see Fig. 6.

5. CONCLUSIONS
The point primitive sets presented here support syntactic pattern recognition. We found that the neighborhood primitive labels vary widely between the sample objects. Therefore careful examination of the data can be used to form syntactic rules to distinguish among objects. That Y labels strongly favor the cube at the medium neighborhood radius (Table 4) serves as an example of such a rule. The point-pair labels also support syntactic processing. We found that parameters of our algorithm could be chosen to minimize the number of spurious edges and maximize the actual edges found.

The computer graphics routines worked well to generate the data for our simulation experiments. The process showed that integrating the rotation of view into a simulation study is very manageable. The experiments themselves lead us to conclude that point data can be the basis of useful algorithms. For example, rules that compare the distinct orientations and total edges against a database can be used to recognize objects. These procedures may yield judgemental statements. For example, a symbol string describing an object with 15 edges and 5 orientations could have as a summary statement of the recognition achieved: "90 percent confidence for object type A." Finally, the appended photographs show that point data are significant in real contexts, so that these conclusions may be applied to recognition.

The use of computer graphics in combination with simulation stands as a valid method for testing vision concepts. The use of point data is, despite the remarkable success of that approach for cell counting[15], relatively unutilized in vision. We hope that this paper contributes a forward step in that direction.

REFERENCES
1. R. C. Gonzalez and M. G. Thomason, Syntactic Pattern Recognition, Addison-Wesley, Reading, MA (1978).
2. R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, John Wiley & Sons, New York (1973).
3. D. J. Braunegg and R. C. Gonzalez, An approach to industrial computer vision using syntactic/semantic learning techniques, in Proc. Seventh International Conf. on Pattern Recognition, Vol. 2, pp. 1366-1369 (1984).
4. H. S. Dond and K. S. Fu, A syntactic method for image segmentation and object recognition, ibid, pp. 1380-1382 (1984).
5. A. Guzman, Decomposition of visual scene into three-dimensional bodies. Proc. FJCC 33, 291-304 (1968).
6. C. T. Zahn, Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans. Comp. C-20, 68-86 (January 1971).
7. N. Ahuja, Dot pattern processing using Voronoi polygons as neighborhoods, in Proceedings 5th International Conference on Pattern Recognition, Miami, FL., pp. 1122-1128, IEEE Computer Society (catalog no. 80CH1499-3) (1980).
8. G. T. Toussaint, The relative neighborhood graph of a finite planar set. Pattern Recognition (1980).
9. G. T. Toussaint, Pattern recognition and geometrical complexity, in Proceedings 5th International Conference on Pattern Recognition, Miami, FL., pp. 1324-1347, IEEE Computer Society (catalog no. 80CH1499-3) (1980).
10. D. Lavine, B. A. Lambird and L. N. Kanal, Recognition of spatial point patterns. Pattern Recognition 16, 289-295 (1983).
11. S. Zucker and R. Hummel, Toward a low-level description of dot clusters: Labeling edge, interior and noise points. Comp. Graphics Image Processing 9, 213-233 (1979).
12. A. Klinger, E. Bassett and W. Fox, Models and primitives from point sets, in Intelligent Robots and Computer Vision, David P. Casasent, Ernest L. Hall, Eds., Proc. SPIE 531, 176-183 (1985).
13. R. Prather, Discrete Mathematical Structures for Computer Science. Houghton Mifflin Co., Boston, MA (1976).
14. W. K. Fox, Experiments in low-level pointing processing. M.S. thesis, University of California, Los Angeles (1986).
15. K. Preston, Jr., M. J. B. Duff, S. Levialdi, P. E. Norgren and J-i. Toriwaki, Basics of cellular logic with some applications in medical image processing. Proc. IEEE 67, 826-856 (1979).