Symbolic reasoning in object extraction

COMPUTER VISION, GRAPHICS, AND IMAGE PROCESSING 52, 447-459 (1990)

NOTE

Symbolic Reasoning in Object Extraction

AMNON MEISELS AND DORON MINTZ

Department of Mathematics and Computer Science, Ben-Gurion University of the Negev, Beer-Sheva 84-105, Israel

Received September 28, 1988; accepted June 18, 1990

A realization of the top-down use of knowledge in the process of extraction of simple man-made objects from aerial photographs is presented. The finding of objects is performed by a rule-based reasoning program written in Prolog. The program is purely symbolic and has no access to digital data, yet it produces the needed objects by controlling three other modules of the system through a purely symbolic interface. The program was run as a road finder on aerial images, and the experiment is described in detail. We demonstrate the simple “programmability architecture” of the program by presenting examples of simple additions that make it find different kinds of simple objects. Our paradigm makes it possible to understand the process of object extraction much better, via the reasoning mechanism. © 1990 Academic Press, Inc.

1. INTRODUCTION

Image understanding systems (IUS) are usually divided into a low-level stage (LLS), dealing mainly with pixel segmentation, and a high-level stage (HLS), which models a scene and interprets it in terms of human concepts [11, 1]. Most early expert systems for vision limited the “reasoning” process to the HLS [1, 6]. In these systems, the LLS was implemented as a set of inflexible image processing routines. That is, the LLS provided the HLS with a fixed choice of binarization and segmentation techniques, each with a relatively narrow parameter space [1, 6]. In the more flexible of the early systems, the HLS acted as the “chooser of parameters” for the LLS [11, 9].

In recent years this trend has been changing. An ever growing number of researchers maintain that an IUS must have knowledge about the context of the picture in order to be able to segment and interpret a scene [3, 10, 7]. Three-level architectures have been proposed for the implementation of knowledge-based control of image interpretation [10], and there seems to be general agreement as to the usefulness of an intermediate level of representation [10, 7]. However, these proposed systems and paradigms do not deal with the design of specific feedback mechanisms for high-level (HL) control over low-level (LL) processes.

We propose here a realization of just such a mechanism for feedback and control of the high-level program over the intermediate-level one. It is implemented in the form of a system for the extraction of simple objects from aerial photographs. Our system maintains a list of simple, geometrically defined objects that it can “look for” and “find” in a picture [8]. The design of our rule-based object finder (ROF) is based on the following main operational principles:

1. ROF is divided into rule-based modules,
2. all communication among the modules is symbolic,
3. each module has its own set of tasks, and
4. the rules of ROF are visible to the human operator.

0734-189X/90 $3.00 Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.

These principles of ROF are implemented by the use of four conceptually different modules. The HLS of the system is the symbolic reasoning module (SRM), which is familiar with the human interface and is able to translate it into rules and actions of the system. All functions that relate to the picture, such as filtering, edge finding, and region starting, are performed by the low-level module, the raw data module (RDM). Two modules perform intermediate-level tasks, operating between the high-level SRM and the low-level RDM. These are the aggregation module (AM), which aggregates pixels chosen by the RDM into tokens that are described symbolically, and the quantification module (QM), which provides quantification services to the other three modules (SRM, AM, RDM). Many of the self-consistent thresholding techniques of the quantification module are adapted from Nagao and Matsuyama [9].

The present paper discusses the design of the symbolic program that performs the task of the SRM in ROF. It receives symbolic directives from the user, in the form of descriptions of objects, and tries to find these objects in the image by rule-based reasoning. The SRM is written in Prolog (see Appendix), which makes it highly user-programmable.

We have chosen to implement the principles of ROF within a detailed experiment of road finding on aerial images. This is an intrinsically 2D implementation that takes roads to be curvilinear narrow strips that can be extracted on the basis of their shape, edges, brightness, and regionlike inside. The reason for this 2D choice is twofold. First, we have designed the experiment so that it can be compared to former similar experiments like [6, 11, 2], all of which dealt with 2D object extraction by one or another form of reasoning. Second, we used the same image (Fig. 1) as Canning et al. at CVL, University of Maryland [2], for our experiment, and so we can compare notes on curvilinear feature extraction.

FIG. 1. The aerial photograph that was analysed for road finding. The boxed lower window will appear in many examples below.

2. SELF-CONSISTENT AND ADAPTIVE REASONING

The reasoning procedure of the symbolic reasoning module takes the form of a hierarchy of rules. When the SRM is faced with the problem of constructing a line from two line segments, it reasons about the possibility that they are actually connected into one line. It does so by evaluating measures of nearness and other kinds of compatibility attributes in a hierarchical procedure that prefers high-level hypothesizing to low-level checking. This approach to hypothesizing is a generic feature of the reasoning module. The evaluations of the measures of nearness are not a part of the symbolic program. These evaluations, which use statistical data about the digital picture, are performed by a quantifying program, the intermediate-level QM [8, 4].

Line segments and uniform blobs serve as the building blocks of the segmentation program. Unlike the part of the system that grows line segments, the symbolic reasoning program is not acquainted with the actual lists of pixels that form line segments or blobs. It therefore needs a symbolic description for items such as elementary blobs. The SRM’s decisions about connecting blobs take into consideration the distance between the center points of the blobs in relation to their radii, the difference in their (representative) brightness, and other attributes. All of these considerations are based on an intermediate-level symbolic representation of uniform blobs. Similarly, decisions about joining line segments use information about directions, lengths (in relation to gaps), and the brightness to the sides of the edges of these lines and line segments.

Decisions about joining uniform blobs, made by higher level reasoning, need lower level backtracking for undecidable cases. We backtrack by a process that uses data about actual pixels and their measured parameters. The intermediate processes involved, such as the aggregation module, have mixed symbolic and numerical mechanisms that have access to the digital data.
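
The symbolic blob decision described above can be sketched in standard Prolog. This is a minimal illustration under stated assumptions, not code from the SRM: the blob/5 layout, the brightness tolerance, the factor of two, and the three decision labels are all hypothetical. In ROF the numeric evaluation belongs to the QM and the SRM sees only symbolic results; the arithmetic is inlined here only to keep the sketch self-contained.

```prolog
% Hypothetical symbolic blob descriptions:
% blob(Id, CenterX, CenterY, Radius, Brightness).
blob(b1, 10.0, 10.0, 4.0, 120).
blob(b2, 16.0, 10.0, 3.0, 125).
blob(b3, 26.0, 10.0, 3.0, 123).
blob(b4, 19.0, 10.0, 3.0, 200).

% Assumed tolerance on representative brightness (not from the paper).
brightness_tolerance(15).

% center_distance(+A, +B, -D): Euclidean distance between blob centers.
center_distance(A, B, D) :-
    blob(A, X1, Y1, _, _),
    blob(B, X2, Y2, _, _),
    DX is X1 - X2,
    DY is Y1 - Y2,
    D is sqrt(DX * DX + DY * DY).

% similar_brightness(+A, +B): brightness difference within the tolerance.
similar_brightness(A, B) :-
    blob(A, _, _, _, Br1),
    blob(B, _, _, _, Br2),
    brightness_tolerance(T),
    abs(Br1 - Br2) =< T.

% blob_decision(+A, +B, -Decision)
%   connect         : similar brightness, centers no farther apart than
%                     the sum of the radii
%   ask_lower_level : similar brightness, centers within twice that sum;
%                     treated as undecidable at the symbolic level
%   reject          : brightness too different, or centers too far apart
blob_decision(A, B, Decision) :-
    (   \+ similar_brightness(A, B)
    ->  Decision = reject
    ;   center_distance(A, B, D),
        blob(A, _, _, R1, _),
        blob(B, _, _, R2, _),
        Reach is R1 + R2,
        (   D =< Reach     -> Decision = connect
        ;   D =< 2 * Reach -> Decision = ask_lower_level
        ;   Decision = reject
        )
    ).
```

With these facts, the query blob_decision(b2, b3, D) gives D = ask_lower_level, which corresponds to the undecidable case that the text hands back to a pixel-level module.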
Unlike the SRM, these intermediate-level modules can check parameters like the average distance between edge pixels of the two blobs, estimate the percentage of “different” pixels in the area that contains both blobs, and calculate the space between them. Sometimes these modules perform their checks by trying to find a path of compatible pixels from one blob to the other (see connect( ) in the Appendix for details) [8, 5].

In reasoning about the construction (extraction) of objects in the picture, the symbolic reasoning module makes extensive use of the concept of nearness. Geometric nearness between any two objects is measured relative to their sizes. The self-consistent quantification procedure of ROF also provides symbolic measures for nearness [8, 4]. All sizes of objects are defined in our scheme in a relative manner and in relation to the specific picture. A specific example of a measure of nearness that is part of the quantifier (QM) of our road-finding routine is the following. Two objects are defined to be very near each other if the distance between them is less than the measure of the size of a small object of the same type. By the same reasoning, two objects are considered to be near if the distance
between them is less than the measure of their own size. This kind of definition of the concept of nearness is clearly adaptive.

A combination of edge and region evidence sources is used in the construction process. Assume that line segments were generated first in the process and that edges of segments were constructed from them. Smooth-region evidence can be used for the detection of edge delineation inconsistencies by calling on lower level routines. These calls verify the existence of a clear connection between blobs that form the inside of the sketch of the edges of the object that was constructed. To help the process in making decisions, a certainty factor is returned by all lower level routines, estimating the lower level confidence in all rechecks of connections.

3. A DETAILED EXAMPLE: ROAD FINDING
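
The road-finding example relies throughout on the relative nearness measure defined at the end of Section 2. A minimal Prolog sketch of that measure follows. The predicate names, the facts, and the reading of "their own size" as the smaller of the two sizes are assumptions made for illustration; in ROF these evaluations belong to the QM, and the SRM receives only the resulting label.

```prolog
% Hypothetical picture-specific statistic supplied by the quantifier:
% the size regarded as "small" for each object type in this image.
small_size(line_segment, 5).

% Hypothetical objects: object_size(Id, Type, Size).
object_size(l1, line_segment, 20).
object_size(l2, line_segment, 12).
object_size(l3, line_segment, 30).

% Hypothetical measured gaps between pairs of objects.
gap(l1, l2, 3).
gap(l2, l3, 9).
gap(l1, l3, 40).

% nearness(+A, +B, -Label): symbolic label for a pair of same-type objects.
%   very_near : gap smaller than the size of a small object of that type
%   near      : gap smaller than the objects' own size
%   far       : otherwise
nearness(A, B, Label) :-
    gap(A, B, D),
    object_size(A, Type, SA),
    object_size(B, Type, SB),
    small_size(Type, Small),
    Own is min(SA, SB),        % assumption: "own size" = the smaller one
    (   D < Small -> Label = very_near
    ;   D < Own   -> Label = near
    ;   Label = far
    ).
```

Because small_size/2 is a property of the picture rather than a fixed constant, the same gap of 3 would stop being very_near in an image whose small line segments are shorter than 3, which is the sense in which the definition is adaptive.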

The operation of finding roads on an aerial image such as Fig. 1 is the result of connecting strips, which are defined in terms of ROF’s primitives. These strips are then used to construct road segments. This way, a road forms an additional level in the hierarchy of the objects that the SRM can construct out of image primitives.

The process of constructing a strip starts at the low level by finding edge pixels. These are combined into line segments by the intermediate-level AM, out of which the SRM constructs strips. Line segments are combined by the SRM into longer lines. This operation is done for very near line segments without checking the actual pixel-to-pixel existence of connections with lower level programs. This procedure can be seen in the Appendix as the Prolog predicate grow_lines( ).

The formation of sketches of strips out of the lines that were constructed starts with the choosing of lines that are parallel to each other. Later on, these sketches will be made into full strips by checking that they contain a uniform area between the two antiparallel lines that form their edges. Lines are first chosen by their length and contrast, and the program opens a window around them in which it looks for a matching line whose direction differs by about 180°. This way, seeds of strips, or primary strips, are constructed. Seeds of strips, which consist of two line segments that were found to be parallel to each other, can be seen in Fig. 2.

After a strip is found, more line segments become possible candidates for being parts of it. In the example on the top of Fig. 3, one side of the constructed strip is made of strongly connected line segments and the other side is made out of fragments of line segments. The SRM reasons that the fragmented part of the strip is actually connected and asks the lower level programs to check this assumption. The typical gap between two line segments that appear on the top of Fig.
3 is sent back to the line segment constructor, the aggregation module (AM), to try to reconstruct a new line segment there. The relevant Prolog rule, which is used to elongate line segments by requests from the aggregation module, can be seen in the Appendix; it is called longer( ). The final result of joining candidate line segments to the primary strip by the reasoning program is on the bottom of Fig. 3. Note the remaining gap in the strip of the main highway; that is the intersection of the wide highway with another road. We will discuss an example of a higher level mechanism that can bridge this gap in the next section.

In order to complete the process of road segment construction, the SRM performs a check of the uniformity of the area that is enclosed by the edges of the strips. If the area is uniform, and of the requested brightness, the SRM assumes that it has constructed a road segment. Partial results of this repeated procedure are shown on the top of Fig. 4 and further results on the bottom of Fig. 4. The insides of these filled strips are regions that were constructed by the SRM from intermediate-level uniform blobs. Uniform blobs are grouped only if they are near enough and compatible in brightness. The difference between a symbolic representation of blobs and their actual pixel representation is apparent in Fig. 4, in the form of many small holes that remain in the pixel representation of filled strips. Regions inside the filled strips are made of grouped uniform blobs, which have to be very near and very similar in order to be grouped by the SRM, and this is not always the case. This is why repeated steps of grouping are used. First, the SRM groups intermediate-level tokens based on symbolic representation of distances, brightness, and uniformity. Second, the intermediate-level AM is invoked by the SRM, based on its HL knowledge of the road segment, so that the AM can use its knowledge of the actual list of pixels to connect nearby regions [5, 8]. Edges and blobs in Fig. 4 are colored in alternating white and black to enhance their separate visibility.

FIG. 2. Primary strips of large width (top) and narrow width (bottom). Note that the two sides of a primary strip do not have to be of equal length.

FIG. 3. Candidate line segments for addition to the primary wide strips on the top and the result of connecting primary strips to additional lines and strips by the SRM on the bottom.

FIG. 4. Filled strips of large width that were found in the image by the SRM (top). The results of connecting the inside regions of two of these filled strips are shown on the bottom.

4. DISCUSSION
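
The wider road definition discussed in this section can be pictured as a recursive rule over road segments. The sketch below is illustrative only and is not the authors' predicate: the segment/2 and connectible/3 facts, the brightness tolerance, and all predicate names are hypothetical, and the facts stand in for the symbolic answers that the lower level modules would return.

```prolog
% Hypothetical road segments with a representative brightness:
% segment(Id, Brightness).
segment(rs1, 118).
segment(rs2, 121).
segment(rs3, 60).

% Hypothetical confidence labels returned by a lower level check that
% the inside areas of two segments can be connected.
connectible(rs1, rs2, high).
connectible(rs2, rs3, low).

% Confidence labels accepted as sufficient for a connection.
high_enough(high).
high_enough(very_high).

% compatible(?A, ?B): brightness within an assumed tolerance of 10.
compatible(A, B) :-
    segment(A, BrA),
    segment(B, BrB),
    abs(BrA - BrB) =< 10.

% linked(?A, ?B): areas are compatible and connectible with confidence.
linked(A, B) :-
    compatible(A, B),
    connectible(A, B, Conf),
    high_enough(Conf).

% road(?Segments): a road is a chain of one or more linked road segments.
road([S]) :-
    segment(S, _).
road([A, B | Rest]) :-
    linked(A, B),
    road([B | Rest]).
```

Here road([rs1, rs2]) succeeds even if no edge evidence joins the two segments, while road([rs2, rs3]) fails on brightness. The point of the sketch is that the extension is a few added clauses placed on top of the existing segment-level rules.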

Results of the experiments detailed above and other experiments give us reasons to believe in our approach toward object extraction. It assists a high-level image understanding program in constructing a description, made of image primitives, of the contents of the image.

Our rule-based reasoning module can be easily changed, and enlarging its vocabulary is very simple. Take as an example the connection of two road segments. If there exists an intersection, no low-level procedure will find any evidence of edges to connect two road segments on two different sides of the intersection. This is exactly the case shown on the top of Fig. 5, where the insides of these road segments are not colored to improve visibility. However, if a wider definition of a road is added to the program, an intersection such as on the bottom of Fig. 5 can be bridged. In the run of our experiment we added to the symbolic reasoning program a definition of a road as an object made out of road segments which contain compatible and connectible areas. Using the above, wider definition of roads on top of the existing one produces a bridge over the intersection, which is apparent on the bottom of Fig. 5.

Our system, which currently works as a sequential program, can undergo minor changes and allow concurrency between the different modules and even within them. Some of the high-level processes can be executed in parallel. The transformation to a parallel scheme means making the necessary changes to translate the program from regular “sequential” Prolog to one of its concurrent versions, e.g., FCP (Flat Concurrent Prolog) [12]. Database inconsistencies can be avoided by the use of the retract and assert mechanisms for all database entries. A successful test of this possibility was run by us for the case of finding primary strips and rectangular forms in parallel. Figure 6 illustrates the result of a test of such a parallel action of our system. On the image on the top of Fig.
6, a small window was defined that includes both road segments and rectangles, which we interpret to be houses. Road segments and rectangular regions were constructed independently from line segments in a way similar to the above description for strips, but with no reference to the existence of one another. The test utilizes an example of an FCP predicate which makes it possible to construct different objects in parallel. The high-level program can be described as an object which handles incoming requests for the construction of objects in parallel. A tree of processes is built such that at the root we have the higher level predicates, which deal with the symbolic objects, and the leaves of the tree are many processes which run in parallel in an attempt to grow different line segments or blobs. The main predicate is

    sm(R,S) :- rectangle(R?), strip(S?),

where the predicate rectangle is of the form

    rectangle([R|T]) :- rectangle(R,B,S), rectangle(T?),

implying the construction of the same type of object in parallel. This way, strips and rectangular areas are constructed independently.

FIG. 5. On the top are two road segments that have what appears to be an intersection between them. The result of connecting the inside regions of the two road segments is shown on the bottom.

FIG. 6. An aerial image of urban area which includes both roads and houses (top), and road segments and rectangular regions produced independently on a small window of it (bottom).

APPENDIX: EXAMPLES OF PREDICATES OF THE SRM

All the examples given in the Appendix are real pieces of code taken from the Prolog program that is the SRM. First are two predicates that are used to connect line segments to construct longer lines. The symbolic connector grow_lines connects lines for which the returned measure of nearness is very_near. The predicate longer uses the certainty factor returned by the AM predicate connect_L( ) for its connection decision.

    grow_lines :-
        % Take the most promising line constructed so far
        % Match its relevant features with another line
        first(Line),
        line(Line,_,_,_,_,Shade,Side,_,0,_,_,_,_),
        line(Line1,_,_,_,_,Shade,Side,_,0,_,_,unused,_),
        % Are the two lines near each other? Then form a new line
        near(Line,Line1,connect,H),
        H = very_near,
        new_line(Line,Line1,_), !.

    longer(Line,Line1,NewLine) :-
        line(Line,_,_,_,_,Shade,Side,_,0,_,_,_,_),
        % Choose a small window for search
        % Find another line-segment in that area
        extended_line_area(Line,Area),
        line(Line1,_,_,_,_,Shade,Side,_,0,_,_,unused,_),
        in_area(Line1,Area),
        % Check new line for nearness
        near(Line,Line1,connect,H),
        % In case of need use the lower level, connecting routine
        (H = very_near;
         (H = near,
          connect_L(Line,Line1,Conf),
          (Conf = very_high; Conf = high))),
        new_line(Line,Line1,NewLine).

The next example shows predicates that deal with connecting strips. The blobs in each strip are connected to form a larger blob. Then the aggregation module is called, with the predicate connect_B( ), to check whether the two blobs are compatible.

    connect(Rs,S,N1,N2,Case) :-
        % Pick a road-segment and a strip
        road_segment(Rs,_,_,_,_,_,_,_,_,_,_,Strip,_,_,_),
        strip(Strip,_,_,_,_,_,_,_,_,Br,_,_,_,_,_,_,_),
        % Construct the blobs in both strips
        % Check the closeness of the blobs
        blob_in_strip(Strip,B1,Br),
        blob_in_strip(S,B2,Br),
        close(B1,B2),
        % Call AM to try to connect the blobs.
        connect_B(B1,B2,_,C12),
        (C12 = high; C12 = very_high),
        add_strip_to_road(Rs,S,Case).

Finally, we present a predicate that connects two strips whose edge-lines are not near each other, as in Fig. 5. The proof hypothesizes a new strip, which lies in the connecting area between the two strips. The sketch of this new strip is then filled with blobs.

    connect(Rs,S,N1,N2,Case) :-
        % Check amount of nearness with the quantifying module
        not_far(N1,N2),
        % Call one of the road-segments by a name
        road_segment(Rs,_,_,_,_,_,_,_,_,_,_,Strip,_,_,_),
        strip(Strip,_,_,_,_,_,_,_,_,Br,_,_,_,_,_,_,_),
        % Generate a (hypothesized) strip between the two
        new_strip(regular,Rs,S,Sid),
        % Find the blobs in all three strips.
        blob_in_strip(Sid,SB1,Br),
        blob_in_strip(Strip,SB2,Br),
        blob_in_strip(S,SB3,Br),
        % Check for compatibility.
        connect_B(SB1,SB2,_,C),
        connect_B(SB1,SB3,_,C1),
        % If confidence is high-enough perform connection
        (C = high; C = very_high),
        (C1 = high; C1 = very_high),
        add_strip_to_road(Rs,S,Case).

REFERENCES

1. R. A. Brooks, Symbolic Reasoning Among 3-D Models and 2-D Images, Ph.D. thesis, Stanford University, 1981.
2. J. Canning, J. J. Kim, and A. Rosenfeld, Symbolic Pixel Labeling for Curvilinear Feature Detection, TR-1761, Center for Automation Research, University of Maryland, 1987.
3. P. Fua and A. J. Hanson, Extracting generic shapes using model-driven optimization, in Proceedings of the Image Understanding Workshop, Los Angeles, California, February 1988, pp. 994-1004.
4. O. Hason and A. Meisels, Quantifying the Operations of Image Segmentation and Understanding, FC-TR-018, August 1988.
5. H. Hess and A. Meisels, Generating Line-Segments and Blobs and Correcting Them, FC-TR-020, Frankel Center for Computer Science, Ben-Gurion University, Beer-Sheva, August 1988.

6. S. S. V. Hwang, Evidence Accumulation for Spatial Reasoning in Aerial Image Understanding, Ph.D. thesis, University of Maryland, 1984.
7. D. M. McKeown and J. L. Denlinger, Cooperative methods for road tracking in aerial imagery, in Proceedings, CVPR88, Michigan, June 1988, pp. 662-672.
8. A. Meisels and S. Bergman, Finding objects on aerial photographs: A rule-based low level system, in Proceedings, CVPR88, Michigan, June 1988, pp. 118-123.
9. M. Nagao and T. Matsuyama, A Structured Analysis of Complex Aerial Photographs, Plenum, New York, 1980.
10. A. R. Hanson and E. M. Riseman, A methodology for the development of general knowledge-based vision systems, in Vision, Brain and Cooperative Computation (M. Arbib and E. Hanson, Eds.), MIT Press, Cambridge, MA, 1986.
11. P. G. Selfridge, Reasoning about success and failure in aerial image understanding, Ph.D. thesis, University of Rochester, Rochester, NY, 1982.
12. L. Sterling and E. Shapiro, The Art of Prolog, MIT Press, Cambridge, MA, 1984.