Polygonal shape recognition using string-matching techniques

0031-3203/91 $3.00 + .00 Pergamon Press plc © 1991 Pattern Recognition Society Pattern Recognition, Vol. 24, No. 5, pp. 433-440, 1991 Printed in Gre...


0031-3203/91 $3.00 + .00 Pergamon Press plc © 1991 Pattern Recognition Society

Pattern Recognition, Vol. 24, No. 5, pp. 433-440, 1991. Printed in Great Britain

POLYGONAL SHAPE RECOGNITION USING STRING-MATCHING TECHNIQUES

MAURICE MAES
Philips Research Laboratories, Room WY-256, P.O. Box 80.000, 5600 JA Eindhoven, The Netherlands

(Received 18 April 1990; in revised form 16 July 1990; received for publication 24 September 1990)

Abstract: In this paper we study several aspects of the use of string-matching techniques as an approach to the problem of recognizing and classifying polygons. Several authors have already proposed methods for polygon recognition that are based on string-matching. In many cases, however, linear strings are used to represent polygons, which makes it difficult to handle different orientations of an object efficiently. We can, however, easily extend the linear string-matching techniques to cyclic strings, at some small computational cost. We will propose a method to represent polygons as cyclic strings and we will show how cyclic string-matching techniques can be used for rotation-, translation- and scale-independent polygonal shape recognition. We will, however, also point out the limitations of such an approach.

Pattern recognition; Automatic inspection; Polygonal shapes; String-matching; Cyclic strings

1. INTRODUCTION

In this paper we study several aspects of the use of string-matching techniques as an approach to a basic problem in pattern recognition: the problem of recognizing and classifying polygons. In many applications the polygonal approximation of an object in a digital image adequately describes the shape of the object. Therefore, many polygonal approximation algorithms have been presented,(1-6) and the recognition of polygonal shapes has been the subject of considerable study.(7-13) Applications in which the restriction to polygons is useful include the automatic recognition of characters, maps or chromosomes.

Before we come to the precise description of the problem that we will be discussing here, we should mention one application that we have in mind: the automatic inspection of industrial parts. Suppose we have a set of reference models, consisting of polygonal representations of some industrial parts. Furthermore, we have a target object which has to be recognized, and which is inspected by a camera connected to an image acquisition system. We assume that the object and the lighting conditions are such that a polygonal approximation of the object can easily be extracted from the image. In general, the polygon that we obtain in this way will not be congruent to any polygon of the reference models. We have to deal with distortions due to manufacturing defects, lighting conditions, noise, digitization and the polygonal approximation. We do not intend to discuss the problem of how to determine a polygonal approximation of an object, which can be a difficult one in practice. We will always assume that the polygonal approximations are given.

The problem is now to define dissimilarity measures between two polygons, and to find algorithms that compute these measures fast enough. We will describe an approach to this problem which is independent of the scale, the translation and the rotation of the object. (Note that the digitization and the polygonal approximation of the object will in general depend on these transformations, but there is nothing we can do about that.) In particular, finding the proper orientation of an object is a major problem in shape recognition.

In this paper we will consider a dissimilarity measure based on string-matching techniques. For a general overview of string-matching or sequence-comparison techniques and applications, we refer to the book by Sankoff and Kruskal.(14) The use of string-matching techniques for shape recognition has already been proposed by several authors.(8, 15-17) In all these papers, however, polygons are encoded as linear strings, which leads to serious problems in some cases. The basic problem with linear strings is that the orientation of the object has to be found in order to determine the proper starting symbol of the corresponding string. This problem does not exist if one uses cyclic strings instead of linear strings; in fact, cyclic strings provide us with the tool with which to determine the orientation of the object! At some small computational cost ($O(nm \log m)$ instead of $O(nm)$ for two strings of length $n$ and $m$, with $n \ge m$), linear string-matching techniques can easily be extended to cyclic strings.(18)

In this paper, we propose a method of representing polygons as cyclic strings and we define a cost function for the edit operations that will be applied to them. The applicability of the method will be discussed, and it will be shown that there are still some problems with a dissimilarity measure that is purely based on the string-matching cost as it is defined here. We will see that the method is sensitive to different segmentations of an object, and that this problem can only be solved properly at great computational cost.

2. PRELIMINARIES

2.1. Definitions

Some of the basic notions and definitions of this section are given in Wagner and Fischer.(19) Let $\Sigma$ be a set whose elements are called symbols and let $\Sigma^*$ denote the set consisting of all finite strings over $\Sigma$. The length $|A|$ of a string $A \in \Sigma^*$ is the number of symbols in $A$. Let $\Lambda$ denote the null string, which has length 0. For a string $A = a_1 a_2 \cdots a_n \in \Sigma^*$, and for all $i, j \in \{1, 2, \ldots, n\}$, let $A\langle i, j\rangle$ denote the string $a_i a_{i+1} \cdots a_j$, where, by convention, $A\langle i, j\rangle = \Lambda$ if $i > j$.

A cyclic shift is a mapping $\sigma : \Sigma^* \to \Sigma^*$, defined by

$$\sigma(a_1 a_2 \cdots a_n) = a_2 a_3 \cdots a_n a_1.$$

For all $k \in \mathbb{N}$, let $\sigma^k$ denote the composition of $k$ cyclic shifts. Two strings $A$ and $\tilde{A}$ in $\Sigma^*$ will be called equivalent if $A = \sigma^k(\tilde{A})$ for some $k \in \mathbb{N}$. Clearly, this defines an equivalence relation on $\Sigma^*$. The equivalence class of a string $A$, which will be denoted by $[A]$, will be called a cyclic string.

An edit operation $s$ is an ordered pair $(a, b) \ne (\Lambda, \Lambda)$ of strings, each of length less than or equal to 1, denoted by $a \to b$. An edit operation $a \to b$ will be called an insert operation if $a = \Lambda$, a delete operation if $b = \Lambda$, and a change operation otherwise. We say that a string $B$ results from a string $A$ by the edit operation $s = (a \to b)$, denoted by $A \to B$ via $s$, if there are strings $C$ and $D$ such that $A = CaD$ and $B = CbD$. An edit sequence $S := s_1 s_2 \cdots s_k$ is a sequence of edit operations. We say that $S$ takes $A$ to $B$ if there are strings $A_0, A_1, \ldots, A_k$ such that $A_0 = A$, $A_k = B$ and $A_{i-1} \to A_i$ via $s_i$ for all $i \in \{1, 2, \ldots, k\}$.

Let $\gamma$ be a cost function that assigns a non-negative real number $\gamma(s)$ to each edit operation $s$. For an edit sequence $S$ as above, we define the cost $\gamma(S)$ by $\gamma(S) := \sum_{i=1}^{k} \gamma(s_i)$. The edit distance $\delta(A, B)$ from string $A$ to string $B$ is then defined by

$$\delta(A, B) := \min\{\gamma(S) \mid S \text{ is an edit sequence taking } A \text{ to } B\}, \qquad (1)$$

and the edit distance $\delta([A], [B])$ between two cyclic strings $[A]$ and $[B]$ is given by

$$\delta([A], [B]) := \min\{\delta(\sigma^k(A), \sigma^l(B)) \mid k, l \in \mathbb{N}\}. \qquad (2)$$

2.2. The linear string-to-string correction problem

Let $A$ and $B$ be two strings over $\Sigma$ of length $n$ and $m$, respectively. Given a cost function $\gamma$, the (linear) string-to-string correction problem is the problem of determining $\delta(A, B)$ and a minimum cost edit sequence taking $A$ to $B$. We will briefly describe the well-known algorithm of Wagner and Fischer.(19) This algorithm takes $O(nm)$ time, which in a sense is optimal.(20)

It can easily be proved that $\delta(A, B)$ can be found by determining a minimum weighted path in a weighted directed graph. Consider the weighted directed graph $G = (V, E)$, shown in Fig. 1, which we will call the graph associated with $A$ and $B$, with vertices $v(i, j)$ for all $i \in \{0, 1, \ldots, n\}$, $j \in \{0, 1, \ldots, m\}$, and with the following arcs:

(i) $(v(i, j), v(i, j+1))$ with weight $w_{0,j+1} := \gamma(\Lambda \to b_{j+1})$ for all $i \in \{0, 1, \ldots, n\}$, $j \in \{0, 1, \ldots, m-1\}$,
(ii) $(v(i, j), v(i+1, j))$ with weight $w_{i+1,0} := \gamma(a_{i+1} \to \Lambda)$ for all $i \in \{0, 1, \ldots, n-1\}$, $j \in \{0, 1, \ldots, m\}$,
(iii) $(v(i, j), v(i+1, j+1))$ with weight $w_{i+1,j+1} := \gamma(a_{i+1} \to b_{j+1})$ for all $i \in \{0, 1, \ldots, n-1\}$, $j \in \{0, 1, \ldots, m-1\}$.

The arcs of type (i), (ii) and (iii) correspond to insertions, deletions and changes, respectively. The problem of finding a minimum cost edit sequence taking $A$ to $B$ is now reduced to finding a minimum weighted path in $G$ from $v(0, 0)$ to $v(n, m)$. In Fig. 1, a path and its interpretation as an edit sequence are given.

The minimum weighted path can be computed as follows. Let $D(i, j)$ denote the cost of a minimum weighted path from $v(0, 0)$ to $v(i, j)$, which corresponds to a minimum cost edit sequence taking $A\langle 1, i\rangle$ to $B\langle 1, j\rangle$. Then $D(n, m) = \delta(A, B)$ and the following algorithm determines $D(n, m)$ in $O(nm)$ steps.

The linear string-to-string correction algorithm:

    begin
      D(0, 0) := 0;
      for i := 1 to n do D(i, 0) := D(i - 1, 0) + w_{i0};
      for j := 1 to m do D(0, j) := D(0, j - 1) + w_{0j};
      for i := 1 to n do
        for j := 1 to m do
        begin
          D(i, j) := min{ D(i - 1, j) + w_{i0},
                          D(i, j - 1) + w_{0j},
                          D(i - 1, j - 1) + w_{ij} }
        end
    end

Note that once we have found all the $D(i, j)$ values, the actual minimum weighted path, and thus the edit sequence that realizes the minimum cost $D(n, m)$, can easily be determined in $O(n + m)$ steps.
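The recurrence for $D(i, j)$ and the trace-back step can be sketched in Python as follows. This is a minimal illustration rather than the author's implementation: the names `edit_distance` and `unit_cost` are hypothetical, `None` stands in for the null string, and the unit cost function is only an example of a possible cost function.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# An edit operation (a, b); None plays the role of the null string.
EditOp = Tuple[Optional[object], Optional[object]]
CostFn = Callable[[Optional[object], Optional[object]], float]


def edit_distance(a: Sequence, b: Sequence, cost: CostFn) -> Tuple[float, List[EditOp]]:
    """Return the edit distance from a to b and one minimum cost edit sequence."""
    n, m = len(a), len(b)
    # D[i][j] is the cost of a cheapest edit sequence taking a[:i] to b[:j].
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = D[i - 1][0] + cost(a[i - 1], None)  # deletions only
    for j in range(1, m + 1):
        D[0][j] = D[0][j - 1] + cost(None, b[j - 1])  # insertions only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(
                D[i - 1][j] + cost(a[i - 1], None),          # delete a_i
                D[i][j - 1] + cost(None, b[j - 1]),          # insert b_j
                D[i - 1][j - 1] + cost(a[i - 1], b[j - 1]),  # change a_i into b_j
            )
    # Walk back from v(n, m) to v(0, 0) in at most n + m steps.
    ops: List[EditOp] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + cost(a[i - 1], b[j - 1]):
            ops.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and D[i][j] == D[i - 1][j] + cost(a[i - 1], None):
            ops.append((a[i - 1], None))
            i -= 1
        else:
            ops.append((None, b[j - 1]))
            j -= 1
    ops.reverse()
    return D[n][m], ops


def unit_cost(x: Optional[object], y: Optional[object]) -> float:
    """Example cost function (an assumption): 0 for identical symbols, 1 otherwise."""
    return 0.0 if x == y else 1.0
```

With `unit_cost`, `edit_distance("kitten", "sitting", unit_cost)` returns a distance of 3 together with an edit sequence in which exactly three operations are not identity changes.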

[Figure 1: edit graph with a highlighted path; the path corresponds to the edit sequence $s_1: (a_1 \to \Lambda)$, $s_2: (a_2 \to b_1)$, $s_3: (a_3 \to \Lambda)$, $s_4: (a_4 \to b_2)$, $s_5: (\Lambda \to b_3)$.]

Fig. 1. The edit graph $G$ for $|A| = 4$ and $|B| = 3$, and a path and its corresponding edit sequence and trace.

2.3. The cyclic string-to-string correction problem

Given two finite strings $A$ and $B$ as above, the cyclic string-to-string correction problem is the problem of determining $\delta([A], [B])$ and an edit sequence realizing this cost. Without loss of generality, we assume in this section that $m \le n$. It is easily seen that in order to compute the edit distance between two cyclic strings, one can simply choose a representative of one cyclic string and compute the edit distance between this linear string and the other cyclic string. So if we compute $\delta(A, \sigma^l(B))$ for all $l \in \{1, 2, \ldots, m\}$, as in reference (21), then we obtain the required result in $O(nm^2)$ time. However, we can do better than this, as shown in reference (18), where an $O(nm \log m)$ algorithm is presented. It is not known whether this is optimal.

Let $BB = b_1 b_2 \cdots b_m b_1 b_2 \cdots b_m$ be the concatenation of $B$ with itself. Then consider the graph $H$ associated with the strings $A$ and $BB$. For all $l \in \{1, 2, \ldots, m\}$, we can find a minimum cost edit sequence from $A$ to $\sigma^l(B)$ by determining a minimum weighted path within $H$ from $v(0, l)$ to $v(n, m + l)$ (see Fig. 2). Although the computation of only one such path takes $O(nm)$ time, the computation of all these paths can be done in $O(nm \log m)$ time. The crucial observation to achieve this is the fact that all paths can be chosen such that two different paths never cross. For more details we refer to reference (18).

Fig. 2. The graph $H$ associated with $A$ and $BB$.

3. STRING-MATCHING FOR POLYGON RECOGNITION

In this section we will discuss several aspects of the use of string-matching techniques for polygon recognition. Firstly, we will consider existing methods that already apply these techniques. It will be pointed out that there are some serious problems involved with these methods, which are mainly caused by the choice of primitives that are to be used as symbols, and by the representation of polygons by linear strings instead of cyclic ones. Next, we will propose an approach that overcomes most of these problems.

Several features of a polygon can be used to represent it as a string. In order to apply string-matching, we first have to decide which features to use and we have to choose a cost function for the edit operations that will be applied to them. In 1985, Tsai and Yu (8) pointed out that the use of a discrete set of symbols, as described in references (16) and (17), has its shortcomings, in the sense that if we want every polygon to be described sufficiently accurately, the strings will tend to be long, leading to an increased matching time. Moreover, by using a discrete set of primitives, one will not be able to deal with scaling and rotation efficiently. Tsai and Yu therefore use symbols with attributes (i.e. numerical values representing for instance the length or direction of line segments). Their method still suffers from some weaknesses, however, as we shall show in the following subsection.

We first have to define some notation. In this paper, a polygon $P$ will be given by a sequence $p_1, p_2, \ldots, p_n$ of points in the Euclidean plane which are the vertices of $P$. We assume that they are given in the correct order: for all $i \in \{1, 2, \ldots, n\}$, the line segment $s_i := [p_i, p_{i+1}]$ is an edge of the polygon, where $p_{n+1} := p_1$, and the sequence is chosen such that the boundary of the polygon is traversed in a fixed orientation, say, counterclockwise. The perimeter and area of a polygon $P$ will be denoted by $l(P)$ and $A(P)$, respectively. From now on, we consider two polygons $P$ and $Q$, represented by sequences of vertices $p_1, p_2, \ldots, p_n$ and $q_1, q_2, \ldots, q_m$, and with line segments denoted by $s_1, s_2, \ldots, s_n$ and $t_1, t_2, \ldots, t_m$.
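The notation above can be made concrete with a short Python sketch that computes the perimeter $l(P)$, the area $A(P)$ and a counterclockwise vertex order. The function names are hypothetical, and the shoelace formula used for the signed area is a standard technique that is assumed here; the paper does not prescribe how these quantities are computed.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def edges(vertices: Sequence[Point]) -> List[Tuple[Point, Point]]:
    """Line segments s_i = [p_i, p_{i+1}], with p_{n+1} := p_1."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]


def perimeter(vertices: Sequence[Point]) -> float:
    """l(P): the sum of the edge lengths."""
    return sum(math.dist(p, q) for p, q in edges(vertices))


def signed_area(vertices: Sequence[Point]) -> float:
    """Shoelace formula; positive when the vertices are listed counterclockwise."""
    return 0.5 * sum(p[0] * q[1] - q[0] * p[1] for p, q in edges(vertices))


def counterclockwise(vertices: Sequence[Point]) -> List[Point]:
    """Return the vertices in counterclockwise order, reversing them if necessary."""
    return list(vertices) if signed_area(vertices) > 0 else list(reversed(vertices))


rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(perimeter(rectangle), abs(signed_area(rectangle)))
```

Normalizing the traversal direction first matters for what follows, because the sign of every angle in a string representation depends on it.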

3.1. Tsai and Yu's method for attributed string matching

We will now summarize and discuss attributed string-matching, as described in reference (8). Starting with a polygonal approximation $P$, the first step towards attributed string-matching is the extraction of the primitives and the definition of a cost function. The primitives that are used are the line segments $s_i$, which are represented by a pair of numbers $(l_i, \varphi_i)$ for all $i \in \{1, 2, \ldots, n\}$. Here $l_i$ denotes the length of the line segment and $\varphi_i$ denotes the angle (given in degrees and measured counterclockwise) formed between $s_i$ and a reference line segment, which is chosen to be the first segment in the representation of the polygon. Let $P$ and $Q$ be two polygons with string representations $(l_1, \varphi_1) \cdots (l_n, \varphi_n)$ and $(k_1, \theta_1) \cdots (k_m, \theta_m)$, respectively. The costs of the three edit operations change, delete and insert are defined as follows:

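$$\gamma((l_i, \varphi_i) \to (k_j, \theta_j)) := \frac{H(\varphi_i, \theta_j)}{180} + \left|\frac{l_i}{l(P)} - \frac{k_j}{l(Q)}\right| \sqrt{nm}, \qquad (3)$$

$$\gamma((l_i, \varphi_i) \to \Lambda) := K_1 + \frac{l_i}{l(P)} \sqrt{nm}, \qquad (4)$$

$$\gamma(\Lambda \to (k_j, \theta_j)) := K_2 + \frac{k_j}{l(Q)} \sqrt{nm}, \qquad (5)$$

where $K_1$ and $K_2$ are constants whose values should be chosen between 0 and 1, and $H(\varphi, \theta)$ is given by

$$H(\varphi, \theta) := \begin{cases} |\varphi - \theta| & \text{if } |\varphi - \theta| \le 180, \\ 360 - |\varphi - \theta| & \text{if } |\varphi - \theta| > 180. \end{cases} \qquad (6)$$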
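As an illustration, the cost function of Equations (3)-(6) might be coded as below. This is a sketch under stated assumptions: the length term of the change cost is taken to be the absolute difference of the normalized lengths, the function and parameter names are hypothetical, and the default values $K_1 = K_2 = 0.5$ are arbitrary picks from the permitted interval rather than values from reference (8).

```python
import math


def angle_difference(phi: float, theta: float) -> float:
    """H of Equation (6): the difference between two directions, in [0, 180] degrees."""
    d = abs(phi - theta)
    return d if d <= 180.0 else 360.0 - d


def change_cost(l_i: float, phi_i: float, k_j: float, theta_j: float,
                perim_p: float, perim_q: float, n: int, m: int) -> float:
    """Equation (3): an angle cost plus a weighted, scale-normalized length cost."""
    angle_cost = angle_difference(phi_i, theta_j) / 180.0
    length_cost = abs(l_i / perim_p - k_j / perim_q) * math.sqrt(n * m)
    return angle_cost + length_cost


def delete_cost(l_i: float, perim_p: float, n: int, m: int, k1: float = 0.5) -> float:
    """Equation (4); k1 = 0.5 is an assumed, illustrative constant."""
    return k1 + l_i / perim_p * math.sqrt(n * m)


def insert_cost(k_j: float, perim_q: float, n: int, m: int, k2: float = 0.5) -> float:
    """Equation (5); k2 = 0.5 is an assumed, illustrative constant."""
    return k2 + k_j / perim_q * math.sqrt(n * m)
```

For example, `change_cost(1, 90, 2, 90, 4, 8, 4, 4)` is zero: a segment of length 1 in a polygon of perimeter 4 matches a segment of length 2 in a polygon of perimeter 8 exactly when their directions agree, which is the scale invariance that the division by $l(P)$ and $l(Q)$ provides.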

Note that the cost of each of these edit operations can be seen as the sum of an angle cost and a length cost. As far as the angle cost is concerned, the cost of a change operation is natural. For the other two edit operations, however, it is not clear how the constants $K_1$ and $K_2$ should be assigned. Considering the length cost, up to the factor $\sqrt{nm}$ this choice is also natural. Division by $l(P)$ and $l(Q)$ in fact makes the matching cost of the polygons scale invariant. The weighting factor $\sqrt{nm}$ is included in order to compensate for an undesirable cost bias for angle difference. Note that by the choice of this factor, the cost function $\gamma$ does not only depend on the symbols involved in the edit operations, but also on the strings in which they appear.

Fig. 3. Illustration of the impact of the choice of the reference line segment in Tsai and Yu's (8) method.

Fig. 4. Illustration of the impact of the choice of the starting symbol in Tsai and Yu's (8) method.

A serious problem that we encounter in this approach is the fact that the directions of the line segments are related to the direction of a fixed reference line segment, which is more or less arbitrarily chosen. As a result of this, the matching cost strongly depends on the choice of the reference line segment. Consider for example the two polygons in Fig. 3. There is an obvious one-to-one correspondence between the line segments of the two polygons, since they are similar up to the first two line segments. Now suppose $s_1$ and $t_1$ are chosen as reference segments. Then we have $|\varphi_i - \theta_i| \ne 0$ for all $i \ne 1$. So the costs of the change operations for all other line segments are affected by the distortion that involves the first two segments only. Obviously, if we had chosen $s_3$ and $t_3$ as reference line segments, then $|\varphi_i - \theta_i| = 0$ for all $i \in \{3, 4, \ldots, 13\}$, and the total edit cost decreases.

Not only the choice of the reference line segment, but also the choice of the first line segment affects the matching cost of two polygons. This can be seen in Fig. 4, in which two reference models, $P$ and $\tilde{P}$, are shown, as well as a polygon $Q$ that has to be classified. Now, if we choose $t_1$ as a starting symbol, then $Q$ is classified as $P$, since the matching cost of $Q$ and $P$ is less than that of $Q$ and $\tilde{P}$, which is easily verified. On the other hand, if we choose $t_4$ as a starting symbol, then the matching cost of $Q$ and $\tilde{P}$ equals 0, so $Q$ is recognized as $\tilde{P}$.

Consequently, a second major problem that we have to solve is that of determining the correct starting symbol in the string representation of the polygon. This symbol should be chosen such that the resulting matching cost is minimal. The problem is in a sense equivalent to the problem of finding the proper orientation of the polygon. Tsai and Yu propose the following heuristic to solve it (we quote from reference 8, p. 460):


"The way we solve this problem is to make use of sharp corners on the shape boundary. Sharp corners, with their large curvatures, usually are less affected by noise or distortion than dull ones, i.e., they are more likely to be kept in the given shape. Therefore, in forming a string representation, it is reasonable for us to select as the start primitive the one right at one side of the sharpest corner (with the maximum curvature). In this way, the choice for us to get two identical strings of a single shape in two distinct orientations will be largely increased. Of course, the shape might be severely distorted right at the sharpest corner, resulting in a lower curvature at that corner. In such a case, we may still try the next sharpest corner as the start point, and the chance to get two identical strings may still be high. To be safer, the third sharpest corner may still be tried and so on."

It will not be easy to design a general and robust method, based on these heuristics, that enables us to find the orientation of an object. For instance, the heuristic does not work if there are no sharp angles or if they are distorted. Moreover, if two polygons are not at all similar, then we will not be able to find the proper starting point of the strings, so string-matching cannot directly be applied to them. What happens here is that the difficult problem of finding the orientation of an object, and of determining the correspondence between significant features (such as sharp angles) of two polygons, has to be solved before string-matching can be applied! Conversely, however, string-matching can be used as the tool to deal with this difficult problem.
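The dependence on the reference line segment described in this subsection can be checked numerically. The following Python sketch is illustrative only: the two quadrilaterals are made-up examples, not the polygons of Fig. 3, the function names are hypothetical, and segments are indexed from 0, so index 0 corresponds to $s_1$. The polygons differ in one vertex only, so only their first two segments differ.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def edge_directions(vertices: Sequence[Point]) -> List[float]:
    """Absolute direction of each segment s_i, in degrees."""
    n = len(vertices)
    out = []
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        out.append(math.degrees(math.atan2(y1 - y0, x1 - x0)))
    return out


def relative_directions(vertices: Sequence[Point], ref: int) -> List[float]:
    """Direction of each segment measured against the reference segment `ref`."""
    d = edge_directions(vertices)
    return [(a - d[ref]) % 360.0 for a in d]


def angle_difference(phi: float, theta: float) -> float:
    """The function H: the difference between two directions, in [0, 180] degrees."""
    d = abs(phi - theta)
    return d if d <= 180.0 else 360.0 - d


def mismatched_segments(p: Sequence[Point], q: Sequence[Point], ref: int,
                        tol: float = 1e-9) -> List[int]:
    """Indices i at which the angle cost of changing s_i into t_i is non-zero."""
    rp, rq = relative_directions(p, ref), relative_directions(q, ref)
    return [i for i, (a, b) in enumerate(zip(rp, rq)) if angle_difference(a, b) > tol]


P = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
Q = [(0.0, 0.0), (5.0, 1.0), (4.0, 3.0), (0.0, 3.0)]  # second vertex displaced

print(mismatched_segments(P, Q, ref=0))  # reference segment s_1 is itself distorted
print(mismatched_segments(P, Q, ref=2))  # reference segment s_3 is undistorted
```

With the distorted first segment as reference, every other segment picks up an angle cost; with the undistorted third segment as reference, only the two segments that actually changed do.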

3.2. Representing polygons by cyclic strings

In this section we will show how most of the problems that we encountered in the previous one can be solved by the use of cyclic strings instead of linear strings, and by a more locally oriented way of representing the direction of a line segment. Because we want the recognition to be independent of the scale, the translation and the rotation of the object, we will choose the string representation accordingly. To this end, we will first describe how the angle at a vertex of a polygon will be represented. For all $i \in \{1, 2, \ldots, n\}$, let $\tilde{\alpha}_i$ be the angle at a vertex $p_i$, measured at the interior side of the polygon and given in degrees. So $\tilde{\alpha}_i \in (0, 360)$. Then the number $\alpha_i \in (-180, 180)$ is defined by

$$\alpha_i := 180 - \tilde{\alpha}_i. \qquad (7)$$

Figure 5 illustrates this way of representing angles, which leads to a simple and natural definition of the cost function later on, see Equations (8) and (9). Furthermore, the number $l_i / l(P)$ will represent the length of the line segment $s_i$. The alphabet $\Sigma$ that we are going to use consists of two types of symbols: angles, which are elements of $(-180, 180)$ and which will be denoted by $\alpha_i$ or $\beta_i$, and lengths (sometimes also called segments), which are elements of $(0, 1)$ and which will be denoted by $\lambda_i$ or $\mu_i$, for some $i \in \mathbb{N}$.

[Figure 5: three vertices with interior angles $\tilde{\alpha} = 150°$, $180°$ and $210°$, represented by $\alpha = 30$, $0$ and $-30$, respectively.]

Fig. 5. Examples of the symbolic representation of angles.

Now suppose that we have defined a proper cost function for the edit operations on these symbols. Then we are ready to define a dissimilarity measure between two polygons $P$ and $Q$ in the following way. For both these polygons, first construct the cyclic strings $[A_P]$ and $[B_Q]$, represented by the linear strings $A_P := \alpha_1 \lambda_1 \cdots \alpha_n \lambda_n$ and $B_Q := \beta_1 \mu_1 \cdots \beta_m \mu_m$. (Here again, the symbols are the angles and the lengths of the polygons in the correct order, as described above.) Then, given a cost function $\gamma$, the dissimilarity measure between $P$ and $Q$ is defined to be $\delta([A_P], [B_Q])$, see Equations (1) and (2).

The cost function $\gamma$ that we associate with the edit operations that can be applied to these symbols is defined as follows. For all $x, y \in \Sigma$, let

$$\gamma(x \to y) := \begin{cases} |x - y| & \text{if } x \text{ and } y \text{ are angles,} \\ w\,|x - y| & \text{if } x \text{ and } y \text{ are lengths,} \\ \infty & \text{otherwise,} \end{cases} \qquad (8)$$

$$\gamma(x \to \Lambda) = \gamma(\Lambda \to x) := \begin{cases} |x| & \text{if } x \text{ is an angle,} \\ w\,|x| & \text{if } x \text{ is a length,} \end{cases} \qquad (9)$$

where $w \in \mathbb{R}$ is a weighting factor, included to balance the costs associated with angles and lengths. The cost associated with lengths is essentially the same as in the previous section. As far as directions or angles are concerned, the main difference lies in the fact that they are measured locally. In this way we avoid the undesirable error propagation effect that we encountered in the previous section, which was caused by the choice of the reference line segment. Furthermore, we now have a natural choice for the cost of the deletion or the insertion of an angle, so the problem of choosing the constants $K_1$ and $K_2$ in Equations (4) and (5) is also solved.

Note that the representation of a polygon by a cyclic string as described above is unique, up to scaling, translation and rotation, and that $P$ and $Q$ are congruent up to these transformations if and only if $\delta([A_P], [B_Q]) = 0$. In order to determine $\delta([A_P], [B_Q])$, we can use the algorithm that is mentioned in Section 2.3. By using cyclic strings and the cyclic string-matching algorithm, the problem of finding the proper starting symbol simply does not exist. The choice of a proper weighting factor is still a problem; how (and if) the costs on lengths and angles should be balanced strongly depends on the specific application, for instance on the segmentation phenomena described in the next section.

4. DISCUSSION OF APPLICABILITY

The main problem that limits the general applicability of string-matching techniques to polygonal shape recognition is the influence of the polygonal approximation of an object on the total matching cost. In the following section we will discuss this problem.
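As a concrete reference point for this discussion, the matching cost of Section 3.2 can be sketched in Python as follows. Several choices here are assumptions made for illustration: the function names are hypothetical, the weighting factor `w = 100.0` is arbitrary, a change between an angle and a length is given infinite cost, and the cyclic distance is computed by the simple $O(nm^2)$ approach of trying every cyclic shift of one string, not by the $O(nm \log m)$ algorithm of reference (18).

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Symbol = Tuple[str, float]  # ("angle", degrees) or ("length", fraction of the perimeter)


def polygon_to_string(vertices: Sequence[Point]) -> List[Symbol]:
    """Encode a counterclockwise polygon as alpha_1 lambda_1 ... alpha_n lambda_n."""
    n = len(vertices)
    lengths = [math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n)]
    perim = sum(lengths)
    symbols: List[Symbol] = []
    for i in range(n):
        (px, py), (cx, cy), (nx, ny) = vertices[i - 1], vertices[i], vertices[(i + 1) % n]
        turn = math.degrees(math.atan2(ny - cy, nx - cx) - math.atan2(cy - py, cx - px))
        # 180 minus the interior angle equals the signed turn, wrapped to [-180, 180).
        symbols.append(("angle", (turn + 180.0) % 360.0 - 180.0))
        symbols.append(("length", lengths[i] / perim))
    return symbols


def cost(x: Optional[Symbol], y: Optional[Symbol], w: float) -> float:
    """Edit costs on angles and lengths; None stands for the null string."""
    if x is None or y is None:
        kind, value = x if x is not None else y
        return abs(value) if kind == "angle" else w * abs(value)
    if x[0] != y[0]:
        return math.inf  # assumed: an angle is never changed into a length
    return abs(x[1] - y[1]) if x[0] == "angle" else w * abs(x[1] - y[1])


def linear_distance(a: Sequence[Symbol], b: Sequence[Symbol], w: float) -> float:
    """Edit distance between two linear strings, keeping only two rows of D."""
    prev = [0.0] * (len(b) + 1)
    for j in range(1, len(b) + 1):
        prev[j] = prev[j - 1] + cost(None, b[j - 1], w)
    for i in range(1, len(a) + 1):
        cur = [prev[0] + cost(a[i - 1], None, w)] + [0.0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = min(prev[j] + cost(a[i - 1], None, w),
                         cur[j - 1] + cost(None, b[j - 1], w),
                         prev[j - 1] + cost(a[i - 1], b[j - 1], w))
        prev = cur
    return prev[len(b)]


def dissimilarity(p: Sequence[Point], q: Sequence[Point], w: float = 100.0) -> float:
    """Cyclic edit distance: fix a representative of P, try every shift of Q's string."""
    a, b = polygon_to_string(p), polygon_to_string(q)
    return min(linear_distance(a, b[k:] + b[:k], w) for k in range(len(b)))
```

Applied to a polygon and a rotated, scaled, translated copy whose vertex list starts at a different vertex, `dissimilarity` returns zero up to floating-point rounding, whereas two genuinely different polygons give a positive value.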

4.1. Sensitivity to segmentation phenomena

We will now investigate the effect of deviations in the polygonal approximation (or, for short, segmentation) of an object. Even if objects are not distorted, lighting and noise conditions are optimal, and scaling effects are not present, the rotation of the object will in general still influence the segmentation. And although the polygons representing two identical objects may still "look alike", the string representations can differ too much. Experimental results have indicated that there are two basic segmentation phenomena that we encounter, which are in a sense dual to each other. They can cause significant features, such as (relatively) long line segments and sharp angles, to be segmented inconsistently.

To illustrate this, consider Fig. 6. In Fig. 6(a), three different segmentations of a long edge of an object are given. Consider the first two segmentations. It is likely that in the matching of the strings of these polygons, (the symbol representing) the obtuse angle $\beta_2$, as well as the segment $\mu_2$, will be inserted, and $\lambda_1$ will be changed into $\mu_1$. The cost of the edit operations involving the angle is relatively small, since the angle is obtuse, whereas the length cost is relatively high. Similar observations can be made for any other combination of the three segmentations. In Fig. 6(b), we see the opposite effect occur at a sharp angle: three different segmentations of a sharp angle of an object are shown. The following edit operations realize the matching of the first two segmentations: the change of $\alpha_1$ into $\beta_1$, and the insertion of $\beta_2$ and $\mu_1$. In this situation, the cost associated with angles is relatively high compared with the cost of the insertion of the small segment $\mu_1$.

Fig. 6. Segmentation phenomena: (a) inconsistent segmentations of a long edge; (b) inconsistent segmentations of a sharp angle.

These segmentation inconsistencies can have a great influence on the matching cost, and this is a serious problem that restricts the applicability of the string-matching techniques as described above. If we do not adapt our method to this problem, we will need a polygonal approximation algorithm that does not suffer from these phenomena too much. Further research is required to determine which approximation algorithms are well suited for this purpose, for instance whether the approximations should be rough or fine. The problem can possibly be dealt with by choosing a polygonal approximation that consists of a fixed maximum number of segments, thus avoiding some approximations being too rough and others being too fine. Note that the weighting factor also plays a role here, since segmentation phenomena of the types shown in Fig. 6(a) and (b) lead to high length and angle costs, respectively. So if one phenomenon occurs more often, then the weighting factor should be chosen so that it compensates for the corresponding effect on the matching cost.

We will conclude this section by mentioning two methods to (partly) solve the problem caused by the segmentation phenomena. The first one is Tsai and Yu's method for string-matching with merging,(8) which uses more powerful edit operations, involving more than one symbol from each string. With this extension, string-matching techniques can become much more powerful for shape recognition, but there are two problems that we encounter. One is theoretical: how should we define the cost function associated with these merge operations? The other, more practical problem restricts the applicability of string-matching with merging to many pattern recognition tasks: it is easy to see that the complexity of the string-matching algorithm increases from $O(nm)$ to $O(n^2 m^2)$ when merging is introduced. (For cyclic strings, with $m \le n$, this increase would be from $O(nm \log m)$ to $O(n^2 m^3)$, using algorithms that are presently known.)

Another approach to deal with the segmentation problem is one that is based on a learning phase, used to create string representations of objects in such a way that the edit costs associated with symbols depend on the consistency of the segmentation of the corresponding angles or line segments. Further research is needed to investigate the feasibility of such an approach.
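The size of the length cost in the situation of Fig. 6(a) can be estimated with a short worked calculation. It is a sketch under two assumptions that go slightly beyond the text: the edit sequence used is the one described above as likely, and the extra vertex lies almost on the original edge, so that $\mu_1 + \mu_2 \approx \lambda_1$.

```latex
\begin{align*}
\text{cost}
  &= \gamma(\lambda_1 \to \mu_1) + \gamma(\Lambda \to \beta_2) + \gamma(\Lambda \to \mu_2) \\
  &= w\,|\lambda_1 - \mu_1| + |\beta_2| + w\,\mu_2 \\
  &\approx |\beta_2| + 2\,w\,\mu_2
  \qquad \text{since } \lambda_1 - \mu_1 \approx \mu_2 .
\end{align*}
```

For instance, an edge with $\lambda_1 = 0.3$ that is split into $\mu_1 = 0.2$ and $\mu_2 = 0.1$ with $\beta_2 = 2$ costs about $2 + 0.2\,w$, so the length term dominates as soon as $w$ is larger than about 10, even though the two outlines are nearly identical.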

4.2. Conclusions and suggestions for further research

We have presented a method for scaling-, translation- and rotation-independent polygonal shape matching, which is based on cyclic string-matching techniques. We have seen that our approach solves several problems that were encountered in previous methods. String-matching techniques are a powerful tool for polygon recognition in those situations where the polygonal approximation of objects can be done consistently. As we saw in Section 4.1, this will in general not be the case, and we have indicated two approaches to deal with the problem.

So far, however, we have only considered the influence of the segmentation phenomena on the total matching cost of two strings, on which the classification decision would depend. The optimal matching of two strings, however, not only gives us the cost of the matching, but also the edit sequence realizing it. From this edit sequence, a one-to-one correspondence between several vertices and line segments of the two polygons can be obtained: although the total matching cost can be unreliable as a classification criterion, the edit sequence often still provides a correct matching of the significant features of two approximately similar polygons. This property can be used if we combine string-matching with other techniques, such as finding a superposition of two polygons such that a global dissimilarity measure, e.g. the intersection area of the two polygons, is minimized. For instance, Belleau and Cohen,(10) as well as Kashyap and Oommen,(9) use initial correspondences between vertices of the two polygons in order to compute intersection area based dissimilarity measures. We have seen that string-matching techniques can provide such initial correspondences. This aspect of string-matching, which has received little or no attention so far, might broaden the applicability of the techniques we have presented.

SUMMARY

In this paper we study several aspects of the use of string-matching techniques as a structural approach to scaling, translation and rotation-independent polygonal shape recognition. First of all, we define the so-called string-to-string correction problem, which is the problem of determining a distance between two strings that is based on edit operations such as inserting, deleting and changing the symbols of the strings. We briefly describe the algorithms that solve the problem for linear strings 09~ and for cyclic strings. (18~ Next, we consider existing methods that already apply string-matching to polygon recognition. A method proposed by Tsai and Yu (8) will be investigated and it will be shown that there are some problems involved with this method, which are

440

MAURICEMAES

caused by the way in which the directions of line segments are represented, and by the use of linear strings to represent polygons instead of cyclic ones. In order to cope with these problems, we propose a new way of representing polygons as cyclic strings, and define a cost function for the edit operations that will be applied to them. The most important advantage of this new approach is the fact that one does not have to find the orientation of an object before string-matching can be applied; on the contrary, the use of cyclic strings enables us to determine the orientation. The main problem that we still encounter in applying the string-matching techniques is the fact that the polygonal approximation of an object in practice often depends on noise and lighting conditions, as well as on the scaling and the rotation of the object. These conditions are the cause of segmentation inconsistencies, to which the string-matching techniques are sensitive. If the conditions under which the polygonal approximation has to be obtained are such that these inconsistencies occur too often, then the strength of the techniques is limited. Tsai and Yu (8) therefore propose a method to solve this problem by extending string-matching with merging, which is a more powerful edit operation, involving several symbols from both strings. However, this extension increases the computation time considerably. In the final section of this paper we mention an interesting subject for further research, by considering an aspect of the use of string-matching techniques for shape recognition that has received little attention in the literature so far. It is the fact that not only can the matching cost for two strings be used as a classification criterion, but that the edit sequence that realizes the cost is also useful, because it provides a one-to-one correspondence between significant features of the two polygons. 
The one-to-one correspondence obtained from the edit sequence can be used as a heuristic approach to the calculation of global dissimilarity measures for polygons.

Acknowledgement--The author wishes to thank Ernst van der Plas for carrying out the programming and experiments that have influenced this work and for valuable discussions on the subject.

REFERENCES

1. T. Pavlidis, Structural Pattern Recognition. Springer, New York (1977).
2. J. Sklansky, Fast polygonal approximation of digitized curves, Pattern Recognition 12, 327-331 (1980).
3. A. Sirjani and G. R. Cross, An algorithm for polygonal approximation of a digital object, Pattern Recognition Lett. 7, 299-303 (1988).
4. K. Wall and P. Danielsson, A fast sequential method for polygonal approximation of digitized curves, Comput. Vision Graphics Image Process. 28, 220-227 (1984).
5. K. Suzuki, Y. Nishida and S. Hada, A fast polygonal approximation method for real-time shape recognition, IEEE Proc. Comput. Vision Pattern Recognition, pp. 388-394 (1986).
6. U. Montanari, A note on minimal length polygonal approximation to a digitized contour, Comm. ACM 13, 41-47 (1970).
7. P. Cox, H. Maitre, M. Minoux and C. Ribeiro, Optimal matching of convex polygons, Pattern Recognition Lett. 9, 327-334 (1989).
8. W. H. Tsai and S. S. Yu, Attributed string matching with merging for shape recognition, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-7, 453-462 (1985).
9. R. L. Kashyap and B. J. Oommen, A geometrical approach to polygonal dissimilarity and shape matching, Proc. 6th Int. Conf. Pattern Recognition, pp. 472-479 (1982).
10. J. Belleau and P. Cohen, Flexible matching algorithm for 2-D polygonal shapes, SPIE Proc. Advances in Image Process. 804, 28-37 (1987).
11. M. W. Koch and R. L. Kashyap, Using polygons to recognize and locate partially occluded objects, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 483-494 (1987).
12. N. Ayache and O. D. Faugeras, HYPER: a new approach for the recognition and positioning of two-dimensional objects, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8, 44-54 (1986).
13. L. S. Davis, Shape matching using relaxation techniques, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1, 60-72 (1979).
14. D. Sankoff and J. B. Kruskal (Eds), Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison. Addison Wesley, Reading, MA (1983).
15. T. I. Fan, Optimal matching of deformed patterns with positional influence, Inf. Sci. 41, 259-280 (1987).
16. S. Y. Lu and K. S. Fu, Stochastic error-correcting syntax analysis for recognition of noisy patterns, IEEE Trans. Comp. 26 (12), 1268-1276 (1977).
17. S. Y. Lu and K. S. Fu, A sentence-to-sentence clustering procedure for pattern analysis, IEEE Trans. Syst. Man Cybern. SMC-8, 381-389 (1978).
18. M. Maes, On a cyclic string-to-string correction problem, Inf. Proc. Lett. 35, 73-78 (1990).
19. R. A. Wagner and M. J. Fischer, The string-to-string correction problem, J. ACM 21 (1), 168-173 (1974).
20. C. K. Wong and A. K. Chandra, Bounds for the string-editing problem, J. ACM 23, 13-16 (1976).
21. H.-C. Liu and M. D. Srinath, Classification of partial shapes using string-to-string matching, SPIE Proc. Intelligent Robots and Computer Vision 1002, 92-98 (1989).

About the Author--MAURICE J. J. J. B. MAES was born in 1963 in Maastricht, The Netherlands. In 1981 he started his studies in Mathematics at the University of Nijmegen, The Netherlands, from which he graduated in 1987. Since 1987 he has been with the Philips Research Laboratories in Eindhoven, The Netherlands.
