Pattern Recognition Letters 1 (1983) 245-253 North-Holland
May 1983
Inexact graph matching for structural pattern recognition H. B u n k e Lehrstuhl fffr Informatik 5 (Mustererkennung), Universitiit Erlangen-Niirnberg, F.R. Germany
G. A l l e r m a n n Mannesmann Forschungsinstitut, Duisburg, F.R. Germany Received 25 January 1983; revised 10 March 1983
Abstract: This paper is concerned with the inexact matching of attributed, relational graphs for structural pattern recognition. The matching procedure is based on a state space search utilizing heuristic information. Some experimental results are reported.
Key words: Attributed graphs, inexact matching, heuristic search, structural pattern recognition, silhouette recognition.
1. Introduction In the structural approach, patterns are characterized as consisting of sub-patterns among which various relations exist (Fu (1982), Pavlidis (1977)). There are several ways for pattern description and representation in structural pattern recognition. A basic way is by means of strings. Due to the two-dimensionality of pictorial patterns, however, a two-dimensional description formalism seems to be more adequate for certain applications. Consequently, the use of trees for pattern representation has been proposed (Fu and Bhargava (1973)). A natural generalization of trees is graphs. Their use allows the description of complex structures in a straightforward way on a high, problem-oriented level. Besides representation of single patterns, the description of pattern classes is crucial. A method which has gained much importance over the last decade is the use of formal grammars (Fu (1982)). Grammars are favourable when a large number of different pattern classes is involved. At the other hand, there exist cases where the use of a grammar is not feasible, for example, when the set of sample
patterns is too small for the inference of a representative grammar. In this case, the patterns in the sample set can directly be used as prototypes, representing a class each. For the classification a n d / o r description of an unknown pattern, one can match the input pattern against the prototypes. Pattern matching is equivalent to graph matching when the patterns in question are represented by graphs. There are several papers in literature addressing the domain of matching by finding a graph isomorphism, e.g. Corneil and Gottlieb (1970). In practical applications, however, there is noise and distortions causing variation in both the input graph and the prototype. Therefore, our problem turns into a problem of inexact matching. This means given an input graph ~o and a set of prototypes ~ ..... ~ , all possibly different from co, we want to find out which one of the prototypes most closely resembles the input graph. As a consequence, we have to define some measure of similarity on graphs. Several approaches to inexact graph matching have been proposed in literature, e.g. Cheng and Huang (1981), Sanfeliu (1980), Shapiro (1981), Tsai and Fu (1979), Tsai and
0167-8655/83/53.00 ~ 1983, Elsevier Science Publishers B.V. (North-Holland)
245
Volume 1, Number 4
PATTERN RECOGNITION LETTERS
Chang (1982). The approach proposed in this paper is closey related to the work of Tsai and Fu (1979) and Tsai and Chang (1982), i.e. their definition of relational graphs is adopted and matching is based on a state space search. Following are differences: inexact matching is generalized to arbitrary graphs; an extension to attributed graphs is proposed; the cost functions are defined in such a way that the graph distance fulfils the properties of a metric under certain conditions.
2. Basic definitions Let Vs and Vs be a set of node and branch labels, respectively. Each symbol belonging to VN or VB is of the form (s,x) where s is the syntactic c o m p o n e n t and x = (xl . . . . . xn) is a semantic vector consisting of attribute values associated with s. The following definition is due to Tsai and Fu (1979). Definition 1. An a t t r i b u t e d r e l a t i o n a l g r a p h over V= VN U Vs is a 4-tupel ta = (N, B, a, e) where (1) N is a finite, n o n e m p t y set of nodes, (2) B c_N x N is the set of branches, (3) a : N - - , VN is a function for labeling the nodes, (4) c:B--. VB is a function for labeling the edges.
May 1983
DELNODE(0")(INSNODE(0")).The costs for substituting a node with syntactic label a by a node with syntactic label r are given by SUBNODE (e, r). Similarly, SUBBRANCH(a,r) are costs for branch substitution. Defining costs for branch deletion and insertion is slightly more difficult since these transformations are dependent on the bordering nodes. When a node is deleted then all its outgoing and incoming branches vanish, too. Conversely, when inserting a node, there are usually branches to be inserted, too, for properly embedding this node in the rest of the graph. As a consequence, we define DELBRANCH(O')and INSBRANCH(O') as costs for insertion and deletion of a branch with syntactic label (7 while none of the bordering nodes is inserted or deleted, respectively. In addition, DELBRANCH'(0")and INSBRANCH'(O')are corresponding costs for branch deletion and insertion when at least one of the bordering nodes has been deleted or inserted, respectively. While the above considerations are addressed to distortions on only syntactic symbols, one has to take into account also distortions on attributes. For deletion and insertion, the costs for attributes are assumed to be included in the costs concerning the syntactic symbol. In the case of node substitution, however, the total costs for the syntactic symbol and the semantic vector are given by SUBNODE ((7, r ) + ~ gilxi--Yil i=I
An element (x, y ) e B is interpreted as a directed edge from node x to node y. Nodes x and y are called the bordering nodes o f the edge. An unattributed relational graph results when the semantic vector is empty, i.e. n = 0 , for each syntactic symbol. Using attributes besides syntactic labels has been proposed by Tsai and Fu (1979). It is a generalization of the formal graph models used by Tsai and Chang (1982), resp. Cheng and Huang (1981). In this paper, we are concerned with distortions on graphs. Any such distortion can be described in terms of basic transformations, namely deletion, insertion, and substitution of nodes and edges. There are costs associated with each basic transformation. We denote the costs for deleting (inserting) a node with syntactic label a e VN by 2,16
where s = (a, (x, . . . . . x , ) )
and t = (r, (y~ . . . . . yn))
are the labels o f the nodes involved and gi are weighting factors. The costs for edge substitution are analogously defined. The costs as well as the weighting factors gi are some means for the user to express a priori knowledge about the problem domain. Low costs should indicate high likelihood for a particular transformation and vice versa. Remark that costs depend only on labels and attributes, not on particular nodes or branches. Costs have to range between zero and infinite where infinitely high costs
Volume 1, Number 4
PATTERN RECOGNITION LETTERS
are to be defined for impossible transformations (with respect to a particular problem domain). When an inexact match between graphs o~ and a~2 is established, three cases can occur: a node n ~ N ~ can be deleted, or a node n t ~N~ can be substituted by a node n_, e N 2, or a node n 2 e N 2 can be inserted. Following is a formal definition for inexact matching taking into account these cases. Formally, we augment both N l and N 2 by the symbol $ for indicating node deletions and insertions. A node in co L, resp. w 2, which is mapped to the symbol $ is deleted from ~ l , resp. inserted in a~2.
;7
T
7
×
Q
×
tM
Q
tJ 2 : 7
b)
o) Fig. 1. Two graphs.
tion and substitution of nodes and branches. Note that mapping nodes by means of f induces a mapping of branches, too. A branch (n, n')~B~ is substituted when there exists a branch
Definition 2. Let
091=(Ni, Bt,l~l,et) and ~2=(N2, B2,I.ta,e2) be attributed relational graphs. Furthermore, let NI=NIU{$},
May 1983
.N'2=N2U{$},
$ ~ N I U N 2.
An inexact match between co~ and co2 is a partial function f:/Vl -* N2 where (l) f ( $ ) ~ $. (2) n ~ n ' = f ( n ) ¢ f ( n ' ) for all n , n ' e N l and
f ( n ) , f ( n ' ) ~ N 2. (3) For each n2eN2, there exists an nl e~rl with f ( n i ) = n 2 and for each n l ~ N l , there exists an n 2 e ~ : 2 with f ( n l ) = n 2. A node nleNl with f ( n l ) = n z e N 2 is substituted by a node n 2 in co2. A node n~eN~ with f - t ( n j ) = = $ is deleted while a node n 2 e N 2 with f - 1(n2) = $ is inserted. The property that f is a partial function is only addressed to the element $ e N1, i.e. when there are no nodes to be inserted in w2 then f ( $ ) is undefined. Condition (3) means that each node in w~ (~o2) must be either substituted or deleted (inserted). Notice that Definition 2 covers inexact matches of any type. Particularly, it contains, as a special case, errorcorrecting isomorphisms introduced by Tsai and Fu (1979).
Definition 3. Let co I and tz)2 be attributed relational graphs and f:.K: l -'/if/2 an inexact match. The costs o f that match, c o s t ( f ) , are obtained by summing up all costs arising from deletion, inser-
(f(n), f ( n ' ) ) e B2; a branch (n, n ' ) e B l is deleted when there is no branch
( f ( n ) , f ( n ' ) ) e B2; a branch (n, n ' ) e B z is inserted when there is no branch
(.f - l ( n ) , f - l(n')) e Bl. Example 1. Two graphs co I and co2 are shown in Figure la,b. An inexact match between m~ and oo2 is given by f1:2~4,
3--5, 1--$.
Let DELNODE(CQ=2
for a e { a , r } ,
SUBNODE(a, r ) = SUBNODE(r, O') = 1, DELBRANCH(X) = 2 ,
DELBRANCH '(X) = l ,
XNSaRANCH(x) = 2, then c o s t ( f 1 ) = 2 + 1 + 1 + 1 = 5 by deleting a node with label a , deleting two edges (where the costs are DELBRANCH'(X) in this case), and substituting one node label. Another inexact match is given, for example, by f2:1--4, 2--5, 3--$.
Here, we have c o s t ( fz)= 6. There are no attributes in this example. 247
Volume 1, Number 4
PATTERN RECOGNITION LETTERS
Definition 4. Let fl . . . . . fn be all inexact matches which are possible between an attributed relational graph co! and co_,. The optimal inexact match f * between col and co2 is defined by c o s t ( f * ) = min {cost(f)}. fl ..... fn
Lemma 5. Let cost(cot, co2) denote the costs of the optimal inexact match between col and co,.; cost(wt, co2) is a metric t if the following holds: (1) SUBNODE (O', O') = SUBBRANCH(O', O') = 0 for all syntactic symbols a, and SUBNODE(a, r), SUBBRANCH(a, r ) > 0 for ere 1., and the costs for any deletion or insertion are greater than zero. (2) SUBNODE(O', 1.) = SUBNODE(1., O'), DELNODE(O') = INSNODE (O'),
DELBRANCH'(O') = INSBRANCH'(O')
for any symbols er, 1.; the analogous conditions hold true for branches. (3) Let c2 and c3 be the costs for two basic transformations yielding the same result as one basic transformation with costs cl, then always Cl~C2+C
3•
The proof is straightforward by independently concluding all metric properties from condition (1)-(3). An example for condition (3) is SUBNODE (O', l") __
Notice that none of the approaches to inexact graph matching cited in the introduction is a metric.
3. Finding the optimal inexact match Referring back to the problem stated in the introduction of finding that prototype col among ml . . . . . con which most closely resembles an input graph co, a straightforward solution is to deter-
1 This means (a) cost(w,, oaz) -> 0 and cost(oat, oa,) = 0 iff oat=w,; (b) cost(w, co,)=cost(oa2, w,. oat); (c) cost(oat, w3) -
248
May 1983
mine cost(co~, co) .... ,cost(con, co) and to pick up that prototype with minimum costs. However, since we have to take into consideration all inexact matches between co and each prototype for finding the optimal inexact match, high computational costs are involved. As a matter of fact, finding the optimal inexact match requires exponential time with respect to the number of nodes. In this section, we introduce heuristic search according to Nilsson (1981) for the purpose of effectively computing the optimal inexact match. The costs for deletion, insertion and substitution of nodes and edges are employed to guide the search. The method described here is closely related to the approach proposed in Tsai and Fu (1979), Tsai and Chang (1982), Matching co~ against co2, a state S in the search space is given by a partial inexact match fs between co~ and co2. Thus a state can be represented through a subset S c_,,91 × N2 with
( x , y ) ~ S ~* f s ( x ) = y , where the mapping fs fulfils condition (1) and (2) in Definition 2. A state in the search space gives that part of the desired inexact match which has been found until a certain moment. The initial state is defined by S0=0. A final state Sf is characterized by covering both N l and N 2. Note that the inexact match corresponding to a final state fulfils also condition (3) in Definition 2. By means of expansion of a state S, a new state S' is generated. Expansion of a state s={(x~,yl) .....
(xn, y.)}
is defined by adding a new pair (x, y) e.~l x -'V2 to S where X ~ {X 1. . . . . X n } - -
{S}, y ~ {Yl . . . . . Y n } - - {S},
(x,y)*($, $). Expansion is defined only for nonfinal states. The meaning of expanding a state S by the pair (x,y) is to augment the partial inexact m a t c h f s by one more element. As an example, the search space for matching co! against co2 in Example 1 is given in Figure 2. The initial state is the root o f the tree while tip nodes are final states. The costs of a state S are given by the costs of the partial match fs according to Definition 3. The costs of the initial state are defined as being equal
Volume 1, Number 4
PATTERN R E C O G N I T I O N LETTERS
to zero. Now, the problem of finding an optimal inexact match is equivalent to finding a final state with minimum costs among all other final states. A solution to this problem is provided by starting with the initial state and expanding iteratively that state with minimum costs. Expanding is repeated as long as the state with minimum costs is no goal state. According to Nilsson (1980), this procedure is guaranteed to yield the optimal inexact match. As an example, the costs corresponding to the states are given inside the circles in Figure 2. The optimal inexact match between co~ and co_, is given by ./'1 in Example 1. For speeding up the search, additional heuristic information is to be utilized. According to Nilsson (1980), we use an estimate
heuristic information, and the algorithm is still guaranteed to find the optimal inexact match. For the computation of cost2(S), suppose that subgraph al of coI has already been matched against subgraph cq of co2. Now the objective is to get an optimistic assessment of the future costs, cost2(S), when matching the remaining part ¢ o l - a ~ against co,-0t 2. For this purpose, we focus on subgraphs ,6'1 and ,82 each node of which is connected via at least one edge with st and et2, respectively. We divide cost2(S) into one part A arising from insertions and deletions, and into another part B arising from substitutions, i.e. cost2(S) = A + B. Part A is calculated by determining the difference between ,81 and ,g2 with respect to the number of nodes. The difference serves as a basic for an optimistic assessment of future costs arising from insertions a n d / o r deletions. Part B, i.e. future substitution costs, are optimistically assessed by matching in a tentative fashion independently in the cheapest way each node in ,81 against a node in ,sz. It can easily be seen that our procedure yields a consistent lower bound estimate for the real future costs. As the experiments described in the next section have shown, assessing the future costs results in a considerable speed up.
cost (S) = cos h (S) + cost2(S) for the costs at state S where costl(S) is the costs of the partial match fs as defined beforehand, and cost2(S) is a lower bound estimate of the costs which are expected when the rest of ~1 and oJ2 not yet taken into account at state S will be matched against each other. It can be formally proven that with such an estimation fewer nodes will be expanded, compared with a search utilizing no
{(1,4)}
May 1983
{(I,5)} /
/
{(I,$)} / \',\
//
g
{(1,4),(2,$)} {(i,5),(2,4)} {(1,5),(2,$)} {(I,$) ,(2,4)} {(I,$)
Q
{(1,4),(2,5),(3,$) }
@
{(1,5) ,(2,4),(3,$2)
{(1,4),(2,$ ,(3,5)}~6~
{(I,5),(2,$),(3,~
(2,5)}
{(I,$),(2,4),(3,~ {(I,$),(2,5),(3,4)}
Fig. 2. Search space for matching eJ I against w2 (see Figure 1). Costs are encircled.
249
V o l u m e 1, N u m b e r 4
PATTERN
RECOGNITION
LETTERS
M a y 1983
(c,i06)
Table I Syntactic s y m b o l
()
Meaning
a
a < 900
b
a=90 °
c
90 ° < g <
d
180 ° < a < 2 7 0 °
e
a = 270 °
f
270 ° < ~ < 360 °
180 °
(c,i06)
(E,5)
-() (E,9)
(E ,9)
(),
(E,~O)
(a,74)
(a,74)
4. Experimental results and conclusions
aJ
The method described in Section 2 and 3 has been implemented in PASCAt on a TR 440 Computer at the University of Erlangen. Geometric silhouettes have been used for testing the programs and gaining experience. The transformation of silhouettes into graphs has been done manually. Two different methods for silhouette representation by means of graphs have been used. In the first method, silhouette corners correspond to graph nodes while silhouette edges correspond to graph branches. Syntactic symbols 'a', 'b', .... 'f' have been used for labeling nodes corresponding to corners with an angle a. The meaning of these labels is shown in Table 1. There is only one node attribute, namely the size of the angle in degrees. The syntactic symbol E is used for all graph branches while the (only) branch attribute defines the length of the corresponding silhouette edge. For an example, look at Figure 3a where a graph representation of Figure 4c is given. In the second representational method, silhouette corners correspond to graph edges while silhouette edges are represented through graph nodes. All labels are identical to
1
(a,74) ~ (E,t0) b)
Fig. 3. G r a p h r e p r e s e n t a t i o n of F i g u r e 4¢ by m e a n s o f two methods.
method 1 up to the fact that there is an additional syntactic symbol 'g' indicating parallel silhouette edges having the distance of those edges as attribute. For an example, see Figure 3b which again represents Figure 4c.
3
3
bJ
c) Fig. 4. Five p r o t o t y p e s .
250
~ (a,74)
1
3
a)
(E ,5)
3
dJ
e)
Volume I0 Number 4
PATTERN R E C O G N I T I O N LETTERS
May 1983
Table 3
Table a
b
c
d
e
f
$
Input pattern
0 0.5 1.0 1.5 2.0 2.5 2.0
0.5 0 0.5 1.0 1.5 2.0 2.0
1.0 0.5 0 0.5 1.0 1.5 2.0
1.5 1.0 0.5 0 0.5 1.0 2.0
2.0 1.5 1.0 0.5 0 0.5 2.0
2.5 2.0 1.5 1.0 0.5 0 2.0
2.0 2.0 2.0 2.0 2.0 2.0 -
Best matching prototype
Costs Method 1
Costs Method 2
a b c d e
a d c d c
1.8 1.98 1.36 2.7 7.25
3.0 2.58 1.36 2.7 7.85
Five prototypes and five input patterns are shown in Figure 4 and 5, respectively. The results of matching the input patterns against the prototypes are given in Table 3. The numbers at the edges of the input patterns refer to the corresponding edges of the best matching prototype. Both representational methods yield the same results with respect to the best matching prototype and the edges, resp. corners, corresponding to each other. This isn't surprising since the role of nodes and edges is almost symmetric. Only the costs of the best match are different, in general. For a further example, look at Figure 6c where the prototypes in Figures 6a and b have been superimposed. Matching the graph corresponding to Figure 6c with the union of both prototype graphs, we get costs of 16.00 (representation method 1) and 8.89 (representation method 2) for the optimal inexact match. Corresponding edges are indicated in Figure 6 by numbers 1. . . . . 7, similiar to Figures 4 and 5. So far, we have run a total of about 100 matches on our computer using additional prototypes and input graphs to Figures 4 to 6. It has turned out that it is not an 'art' to determine suitable costs for basic transformations and the weightings gi (see Section 2). Instead, at least for our problem domain, the results of matching show stability with
The substitution cost matrix for matching graphs represented according to the first method is given in Table 2. The entry in column c and line d gives, for example, the costs for substituting node label d by c. The last column, resp. last line, gives deletion costs, resp. substitution costs. The weights gt . . . . . gn of Section 2 have been choosen as g~ = 10 -3, n = 1. Both the syntactic labels and attributes depend on the angle size. While the categorization into angle types according to Table 1 induces a coarse rating based on relatively high weights, a finer distinction within each category is obtained by means of the attributes. Remark that there are other ways feasible for selecting node labels (and costs), e.g. by choosing the same label for the case 0 < a < 90 ° and a = 90 °. Considering branches, only deletion and insertion costs, i.e. no substitution costs, are needed. We have used the following costs: DELBRANCH(E) = INSBRANCH(E) = 1.5
and I N S B R A N C H ' ( E ) = D E L B R A N C H ' ( E ) = 1.0.
The costs for representation method 2 are defined similarly. All costs have been heuristically determined.
r
1 a)
b)
c)
d)
e)
Fig. 5. Five input patterns.
251
Volume 1, Number 4
PATTERN RECOGNITION LETTERS
o)
b)
May 1983
c)
Fig. 6. Two prototypes and their superposition.
respect to modifications of the basic transformation costs. Computing time is linear in the number of states generated (see Figure 7). Utilizing heuristic information as described in Section 3 leads to the expansion of fewer states. As an example, when matching the input pattern e in Figure 5 against prototype e in Figure 4,43 expansions are needed utilizing heuristic information while 83 expansions are needed otherwise. The authors want to point out that their approach to graph matching should not be regarded as being primarily devoted to the matching of silhouettes, for which other methods are known from the literature, e.g. Kashyap and Oommen (1982). Instead, this problem domain has been used as just one example for testing and gaining general experience. Other applications of graph computing time (s)
25
fJ
20
J
/
~r
15
/
20
40
J 60
f
80
100
120
140
160
Fig. 7. Time versus number of states. 252
#of s t a t e s
matching in the field of medical image sequence analysis are currently under investigation (Bunke, Sagerer and Niemann (1982)).
Acknowledgement The authors want to express their gratitude to Prof. K.S. Fu of Purdue University, West Lafayette, IN for many fruitful discussions on the topic of this paper.
References Bunke, H., G. Sagerer and FI. Niemann (1982). Model based analysis of scintigraphic image sequence of the human heart, Proc. of the NA TO ASI on Image Sequence Processing and Dynamic Scene Analysis, Braunlage/Harz, F.R. Germany, June/July 1982. Cheng, J.K. and T.S. Huang (1981). Image recognition by matching relational structures. Proc. IEEE Comp. Soc. Conf. on PRIP, 542-547. Corneil, D.G. and C.C. Gotlieb (1970). An efficient algorithm for graph isomorphism. J. Assoc. Comput. Mach. 17 (1). Fu, K.S. (1982). Syntactic Pattern Recognition and Applications. Prentice Hall, Englewood Cliffs, NJ. Fu, K.S. and B.L. Bhargava (1973). Tree systems for syntactic pattern recognition. IEEE Trans. Compur 22, 1087-1099. Kashyap, R.L. and B.J. Oommen (1982). A geometrical approach to polygonal dissimilarity and shape matching, Proc. 6th ICPR, 472-479. Nilsson, N.J. (1980). Principles of Artificial Intelligence. Tioga, Palo Alto, CA. Pavlidis, T. (1977). Structural Pattern Recognition. Springer, Berlin-Heidelberg-New York. Sanfeliu, A. (1980). Distance between attributed relational
Volume 1, Number 4
PATTERN RECOGNITION LETTERS
graphs for pattern recognition, TR-EE 80-39, Purdue University, West Lafayette, IN. Shapiro, L.G. and R.M. Haralick (1981). Structural descriptions and inexact matching. IEEE Trans. Pattern Anal. Mach. Intell. 3(5), 504-529.
May 1983
Tsai, W.H. and K.S. Fu (1979). Error-correcting isomorphisms of attributed relational graphs for pattern analysis, IEEE Trans. Systems Man Cybernet. 9(12). Tsai, W.H. and F.Y. Chang (1982). General error-correcting matching of relational graphs for pattern analysis. To appear.
253