Pattern Recognition, Vol. 30, No. 9, pp. 1451-1462, 1997. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0031-3203/97 $17.00+.00
Pergamon
PII: S0031-3203(96)00181-1
MATCHING STRUCTURAL SHAPE DESCRIPTIONS USING GENETIC ALGORITHMS MONTEK SINGH, AMITABHA CHATTERJEE and SANTANU CHAUDHURY* Department of Electrical Engineering, Indian Institute of Technology, Hauz Khas, New Delhi 110016, India
(Received 13 January 1995; in revised form 24 May 1996)
Abstract--This paper presents a genetic algorithm for solving the problem of structural shape matching. Both sequential and parallel versions of the algorithm are presented. The genetic operators (reproduction, crossover and mutation) have been constructed for this specific problem. A new variation of the crossover operator, called the color crossover, is presented. This operator has resulted in significant improvement in runtime and algorithm efficiency. Parallelization has been achieved using an "island" model, with several subpopulations and occasional migration. A complete framework for an object recognition system using this genetic algorithm is presented. Encouraging experimental results have been obtained. © 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
Structural descriptions
Graph isomorphism
Genetic algorithms
Parallelization
* Author to whom correspondence should be addressed. E-mail: santanuc@ee.iitd.ernet.in.
1. INTRODUCTION
An object can be described in terms of its parts and spatial relations between these parts. Such descriptions explicitly encode spatial organization of the primitives forming the objects. Object descriptions formed in this way are referred to as structural object descriptions. (1) Structural descriptions are, in general, invariant to geometric transformations. This is true even for perspective and projective transformations when appropriate qualitative descriptors are used for characterizing spatial relations and invariants (2) are used for the parts. Structural descriptions are useful for part-based representation of object models and scene structures. Recognition schemes based on structural models are robust against occlusion, distortion and deformations. (3) Structural descriptions of images have also been used in a stereomatching procedure. (4) Techniques for incremental construction of the structural model of a scene are essential for acquisition of the cognitive map (5) of the environment by a robot. This integration problem is also important for recognition of objects in distributed sensor networks. (6) Integration requires identification of common portions in the structural description of two different scenes or views.
The structural model of a shape can be represented in terms of a directed or undirected graph in which sets of nodes represent the primitives and arcs represent the spatial relation between the primitives. These graphs are commonly referred to as attributed relational graphs. (7) Attributes associated with the nodes and the arcs provide the necessary descriptive power to the graph. The problem of matching structural shape descriptions, in this case, reduces to the problem of graph isomorphism. In the simplest formulation of the problem, given a graph with vertices $1, 2, \ldots, N$ and another with vertices $1', 2', \ldots, N'$, the task is to find a correspondence between the vertices such that the spatial constraints are not violated. This is required for determining whether two graphs representing two different entities are identical, except for the label differences, indicating that the two objects have the same structure. Formally, a graph $G = (V, E)$ is isomorphic with a graph $G' = (V', E')$ if there is a one-one, onto mapping $f$ from $V$ to $V'$, satisfying $(u, v) \in E \Leftrightarrow [f(u), f(v)] \in E'$. Mapping $f$ is called a graph isomorphism.
The problem becomes more complex when a mapping is to be found between the nodes of the graph which would maximize the match between common portions of the two structural descriptions. A special instance of such a problem is called the subgraph isomorphism problem. A graph $G = (V, E)$ is isomorphic to a subgraph of a graph $G' = (V', E')$ if there is a one-one mapping $f$ from $V$ to $V'$, satisfying $(u, v) \in E \Rightarrow [f(u), f(v)] \in E'$. Here, the objective is to find the largest subgraph of the larger graph that matches the smaller graph. Still more general is the problem of finding the largest subgraph of a graph that matches a subgraph of another. These graph isomorphism problems are NP-hard problems and require application of heuristic schemes like Neural Optimization or Genetic Algorithms for obtaining reasonable solutions (if not optimal) of practical problems.
Genetic Algorithms provide a useful paradigm for solving combinatorial optimization problems. (8) As opposed to other optimization techniques (such as simulated annealing) which work with a single point in the search space, the genetic algorithm maintains a large population of configurations and combs the search space
from a multitude of points. Besides, in GA each new solution string is constructed from two previous solutions, which means that in a few generations all the individuals in the population have a chance of contributing their desirable traits to the offspring. For these reasons GA has been found to be an efficient optimizer and genetic algorithms have been applied to a wide variety of problems: multiple fault diagnosis, (9) set covering, (10) controller design, (11) VLSI CAD, (12) etc. However, on the negative side, the memory requirement for GA is large.
This paper is concerned with a genetic algorithm for matching the structural shape descriptions. There has been no other work reported so far on application of genetic algorithms to this problem. In this paper, a new crossover and a mutation operator are proposed. These operators can exploit domain-dependent characteristics of the problem for increasing the efficiency of the search. A parallel formulation of the genetic algorithm is also presented in this paper. The performance of the parallel algorithm has been evaluated with an implementation on a transputer network. Finally, the paper provides a framework of an object recognition system using genetic algorithms.
2. GENETIC ALGORITHM FOR SHAPE MATCHING
In this section we consider the problem of finding a match between a candidate shape and its prototype so that the correct correspondence can be established between similar primitives of the two. Individual primitives or nodes in the corresponding graph are characterized by a set of geometric measurements constituting an attribute vector. Correspondence can be established between the primitives having similar attribute vectors. In addition, spatial constraints between the primitives in one shape must be preserved among the matched primitives in the other shape. Spatial constraints or edges in the graph can be characterized by an attribute vector. The genetic algorithm-based matching scheme proposed here has been formulated on the basis of these considerations. For the genetic algorithm, possible solutions are the genes which are manipulated to obtain the optimal solution. Genetic algorithms in their simplest form consist of the following operations:
1. Evaluation of fitness of genes (as measured by their survival probability),
2. Formation of a gene pool through the process of "survival of the fittest," and
3. Recombination by crossover and mutation.
These three operators are applied repeatedly to a population of solution strings to result in a new generation. This procedure may be iterated until a desired solution quality is achieved. The choice of these operators and the evaluation function is dependent on the representational scheme chosen for the genes for a given problem.
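The three operations listed above can be sketched as one generation step. This is only an illustrative skeleton, not the authors' implementation: the function name `next_generation`, the use of fitness-proportionate sampling for the gene pool, and the pairing of consecutive pool members are all assumptions made here, and `fitness`, `crossover` and `mutate` are placeholders supplied by the caller.

```python
import random

def next_generation(population, fitness, crossover, mutate, p_mut, rng=random):
    """One generation of a simple GA (hypothetical sketch).

    Assumes an even population size and strictly positive fitness values.
    """
    # 1. Evaluate the fitness of every solution string.
    scores = [fitness(s) for s in population]
    # 2. Form the gene pool: fitness-proportionate sampling with replacement.
    pool = rng.choices(population, weights=scores, k=len(population))
    # 3. Recombine consecutive pairs, then mutate each child with probability p_mut.
    children = []
    for a, b in zip(pool[0::2], pool[1::2]):
        for child in (crossover(a, b), crossover(b, a)):
            if rng.random() < p_mut:
                child = mutate(child)
            children.append(child)
    return children
```

Calling this function repeatedly, until the best cost in the population is acceptable, gives the iteration described in the text.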
2.1. Representational issues It has been found that the performance of the genetic algorithm to a great extent depends on the representation of the solution string. A carefully selected representation can greatly simplify the crossover operator. Conversely, an efficient crossover operator may require a particular representation. This interdependence is sometimes ordained by the constraints that have to be satisfied by each solution string. More specifically, for the shape matching problem since the mapping is one-one, every solution string should satisfy the constraint that no more than one node of graph G should map onto the same node of graph G'. This is the mapping constraint. This constraint severely restricts the choice of the crossover operator that can be used for that representation. Below we discuss possible representations and their suitability.
2.1.1. Bit matrix representation. Given two graphs A and B having p and q nodes, respectively, a solution can be represented as a p × q bit matrix X. A cell X(i, j) in the matrix is set to 1 if and only if the ith node of graph A is mapped onto the jth node of graph B. Because of the mapping constraint, no row or column can have more than one bit set to 1. A row or a column having no bits set to 1 indicates that the corresponding node in the graph has not been mapped to any node in the other graph. This possibility makes the representation versatile enough to handle the problem of inexact matching between the shapes where primitives of one object may not be present in the other and vice versa. But bit-matrix representation makes the design of crossover and mutation operators complicated.
2.1.2. Integer string representation. In this representation, a solution is encoded as an array of n integers $(a_i)$, $i = 1, \ldots, n$, where n = max(p, q) and p and q are the number of nodes in the graphs A and B, respectively; $(i, a_i)$ is the mapping between node i of the smaller graph and node $a_i$ of the bigger graph. Again, this scheme is quite similar to the bit-matrix representation with $(i, a_i)$ denoting the (row, col) values that contain a "1" in the sparse matrix of the earlier representation. A location in the array having value 0 indicates that the corresponding node in the bigger graph has not been mapped to any node of the smaller graph. However, null-mapping for the nodes of the graph having fewer nodes cannot be explicitly indicated in this representation. A crossover operator defined on an integer string obviously has to respect integer boundaries (that is, an integer can be swapped or moved around, but cannot be split).
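The relation between the two representations can be illustrated with a minimal sketch. The helper names `to_bit_matrix` and `satisfies_mapping_constraint` are hypothetical, and the convention that entry i of the integer string holds a 1-based column index (0 for a null mapping) is an assumption chosen to mirror the description above.

```python
def to_bit_matrix(mapping, q):
    """Expand an integer string into a len(mapping) x q bit matrix.

    mapping[i] = j (1-based) means row i maps to column j; 0 means "unmapped".
    """
    X = [[0] * q for _ in mapping]
    for i, j in enumerate(mapping):
        if j:
            X[i][j - 1] = 1
    return X

def satisfies_mapping_constraint(X):
    """True if no row and no column contains more than one bit set to 1."""
    rows_ok = all(sum(row) <= 1 for row in X)
    cols_ok = all(sum(col) <= 1 for col in zip(*X))
    return rows_ok and cols_ok
```

For example, the string `[2, 0, 1]` with q = 3 expands to a matrix with an empty second row, which is how a null mapping appears in the bit-matrix form.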
2.2. The fitness function Given a possible mapping between the nodes of the graphs as represented by an element of the population, the cost of the mapping can be calculated in terms of distance measures defined between attributed relational graphs. (7) Assume that the solution represents an operation necessary for transforming graph A into graph B. With this assumption the cost for the mapping has the following formulation:

$$\mathrm{Cost} = [\text{Node Substitution} + \text{Node Addition} + \text{Node Deletion}] + [\text{Edge Substitution} + \text{Edge Addition} + \text{Edge Deletion}].$$

The components of the cost are explained below:
Node substitution cost. This term contributes to the cost whenever there is a mismatch between attributes of nodes mapped onto each other. This can be defined as the Euclidean distance between attribute vectors of the nodes.
Node addition cost. This term represents the cost of adding a node to graph A for correct transformation. This term is applicable for the case of null mapping for a node in graph B. This cost, like all other insertion and deletion costs, is a constant penalty term.
Node deletion cost. This term refers to the cost incurred in deleting a node from graph A for correct mapping.
Particular mapping of the nodes implies corresponding transformation operations for the edges linking these nodes. These transformation operations contribute the following costs:
Edge substitution cost. This term is the contribution to the net cost from mismatches in edge attributes.
Edge addition cost. This term results from the edges that need to be added for correct mapping.
Edge deletion cost. This term results from edges that need to be deleted for correct mapping.
The fitness function should be a monotonically decreasing function of the cost. This can be achieved in one of the following ways:

$$(K - \mathrm{Cost})^n, \qquad (K/\mathrm{Cost})^n.$$

Here, K and n are constants to be determined experimentally, and Cost is the cost function. It is important that the fitness function be chosen very carefully as the reproduction operator is very sensitive to this. Care should be taken that the reproduction is neither overselective (greedy) nor under-selective. Therefore, it is not sufficient that the fitness function f(Cost) be varying in the interval over which it is defined. Since in a given generation, all the individuals have costs that lie in a band, it is important that the fitness function be discerning over the range of values the cost function is expected to take for any given generation. Thus, if we define f as follows: f = K - Cost, then K is a parameter which determines variation in f over the space of solution strings. We may wish to have the ratio of the fitnesses of the maximally fit and the minimally fit solution strings take a "nice" value, say between 2 and 5. This ratio we term selectivity. For instance, a selectivity of 2 would mean that the best solution string in any given population has twice as much probability of surviving the struggle for existence as the worst solution. Thus, for a particular value of selectivity, Sel, K is determined as follows:

$$\mathrm{Sel} = \frac{f_{\max}}{f_{\min}} = \frac{K - \mathrm{Cost}_{\min}}{K - \mathrm{Cost}_{\max}},$$

therefore,

$$K = \frac{\mathrm{Cost}_{\max} - \mathrm{Cost}_{\min}}{\mathrm{Sel} - 1} + \mathrm{Cost}_{\max}.$$

Thus, K is adaptive ($\mathrm{Cost}_{\max}$ and $\mathrm{Cost}_{\min}$ vary with generations), making sure this ratio remains constant. Usually, K is reevaluated every five generations. We have found a value of 3-5 for Sel to be optimal.
2.3. The crossover operator
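Before the operators themselves are introduced, the adaptive choice of K from Section 2.2 can be checked numerically. The sketch below assumes the linear fitness f = K - Cost; the function names `adaptive_K` and `fitness` are hypothetical, and the costs used in the usage note are made-up illustrative numbers, not results from the paper.

```python
def adaptive_K(costs, sel):
    """K such that f_max / f_min equals sel for the fitness f = K - Cost.

    Assumes sel > 1. If all costs are equal, K equals that cost and every
    fitness is zero, so the caller needs a fallback in that degenerate case.
    """
    c_min, c_max = min(costs), max(costs)
    return c_max + (c_max - c_min) / (sel - 1)

def fitness(cost, K):
    # Linear fitness, monotonically decreasing in the cost.
    return K - cost
```

With illustrative costs 10, 14 and 20 and a selectivity of 3, K works out to 25, so the best string has fitness 15 and the worst has fitness 5, a ratio of exactly 3.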
Genetic operators, namely crossover and mutation operators, are the essential components of any GA formulation since the process of adaptation is governed by these operations. Crossover is the operation by which an offspring solution is obtained from two candidate solutions of a population. Let $C_i$ and $C_j$ be two valid candidate solutions. Valid candidate solutions are those which satisfy the mapping constraint. Then the crossover operation represented as

$$\otimes(C_i, C_j) \to C_k$$

must be constrained to produce valid solutions $C_k$. Elimination of infeasible solutions will automatically reduce the search space. In this section, reformulations of classical crossover operators are considered so that only valid offspring are produced by them. Three crossover operators have been considered here, namely Order, PMX and Cycle. The operators are described with reference to the integer string representation.
2.3.1. Order crossover. In order crossover, a cut-point is selected at random in the integer array of the parents. The array segment to the left of the cut-point is copied from one parent to the offspring. The remaining portion of the offspring array is filled by going through that portion of the second parent which is to the right of the cut-point. In case there is a conflict, elements which do not violate the mapping constraint are selected in order from the remaining portion of the second parent.
2.3.2. PMX crossover. PMX stands for partially mapped crossover. It is implemented in the following way: choose a random cut-point in the integer array and consider the segments following the cut-point in both the parents as partial mapping of the cells to be combined to generate the offspring. In order to satisfy the mapping constraint, certain additional steps are essential. In the offspring, if a particular node a of graph
A is found to be mapped to two nodes b and c of the other graph B, consider mapping of the nodes b and c in one of the parents and replace occurrence of a as mapping of b with the mapping of c in the parent.
2.3.3. Cycle crossover. In the cycle crossover operation, location 1 of the integer array of, say, parent 1 is considered first. The corresponding entry is copied to the offspring. Now, the occupant of location 1 of the other parent is searched for in the integer array of parent 1. If it is found at location x then the occupant of location x is copied to the offspring. Now, the same steps are repeated for the occupant of location x in parent 2. This process continues until a cycle is formed, i.e. we reach a location which has already been copied to the offspring. If not all the cells of the offspring are filled, the same process is started with a location in parent 2.
These operators can also be applied with the bit-matrix representation of the solution state. In this case, we need to take care that no row or column has more than one nonzero entry. This involves additional computational complexity for the offspring generation task. An example of the application of these three operators for the integer string representation is presented in Fig. 1. These three operators are, however, generic in nature. Therefore, though widely applicable, they are not always very useful. They do not use any information about the problem domain that may be available. Combining these operators with domain-dependent strategies for local search space exploration can speed up convergence.
PARENT STRINGS (cut-point marked by |):
ABCDEF | GHIJ
ECIADH | JBFG
OFFSPRINGS:
ORDER CROSSOVER: ABCDEFJIHG
PMX CROSSOVER: ABCDEFJHIG
CYCLE CROSSOVER: ACIDEHGBFJ
Fig. 1. Example of crossover operators.
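The order and cycle offspring of Fig. 1 can be reproduced with a short sketch written from the verbal descriptions in Sections 2.3.1 and 2.3.3. The function names are hypothetical, the parents are assumed to be permutations of the same symbols with no null mappings, and PMX is left out here.

```python
def order_crossover(p1, p2, cut):
    """Keep p1 left of the cut; fill the rest from p2's right part, resolving
    conflicts with the unused elements of p2's left part, taken in order."""
    head = list(p1[:cut])
    spare = iter([g for g in p2[:cut] if g not in head])
    tail = [g if g not in head else next(spare) for g in p2[cut:]]
    return head + tail

def cycle_crossover(p1, p2):
    """Copy whole cycles of positions, alternating the source parent per cycle."""
    child = [None] * len(p1)
    src, other = p1, p2
    for start in range(len(p1)):
        if child[start] is not None:
            continue
        i = start
        while child[i] is None:
            child[i] = src[i]
            # Find where the other parent's occupant of slot i sits in src.
            i = src.index(other[i])
        src, other = other, src
    return child

p1, p2 = "ABCDEFGHIJ", "ECIADHJBFG"
print("".join(order_crossover(p1, p2, 6)))  # ABCDEFJIHG
print("".join(cycle_crossover(p1, p2)))     # ACIDEHGBFJ
```

Both operators return valid permutations by construction, so no separate repair step is needed for the mapping constraint under the stated assumptions.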
2.4. The mutation operator Mutation is effected simply by swapping the correspondences of two nodes in the integer array. Assume that before the mutation, $(a_1 a_2 \ldots a_j a_k \ldots a_n)$ is a solution string. Then, we select a pair of positions j and k at random and swap $a_j$ and $a_k$. Thus, $(a_1 a_2 \ldots a_k a_j \ldots a_n)$ is the mutant string. The same steps are implemented for bit-array representation. In this case a flipping is also accepted as legal if and only if it does not violate the mapping constraint.
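A minimal sketch of the swap mutation for the integer string representation follows. The function name `swap_mutation` and the optional `rng` argument are assumptions made for illustration; the string is assumed to have at least two positions.

```python
import random

def swap_mutation(solution, rng=random):
    """Return a copy of the solution with two randomly chosen positions swapped."""
    mutant = list(solution)
    j, k = rng.sample(range(len(mutant)), 2)  # two distinct positions
    mutant[j], mutant[k] = mutant[k], mutant[j]
    return mutant
```

Because the mutation only exchanges two entries, a string that satisfies the mapping constraint before the mutation still satisfies it afterwards.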
3. IMPROVED GENETIC OPERATOR
3.1. Motivation for the operator It is in general observed that as the problem size becomes larger, the traditional crossover operators become increasingly ineffective. This is because the survival of a schema is inversely related to its length. Since the population size is limited by constraints of space and time, the traditional crossover operators, while exploring newer areas in the search space, often eliminate some very good specimens by unwittingly destroying their genotype. The solution to this problem lies in reducing the schema length. For the present problem, each node in a graph has associated with it an attribute vector. We can define a function diff as follows: $\mathrm{diff} : A \times A \to \mathbb{R}$, where A is the set of attribute vectors. Here diff is the measure of the mismatch between the nodes. This mismatch can be attributed to the difference in shape of the corresponding subparts of the object. The node substitution cost is expected to be a function of diff. It is very likely that similar subparts will be in correspondence in the final solution of the shape-matching problem because the corresponding mapping would incur minimum substitution cost. This information can be effectively used for increasing the efficiency of the genetic algorithm by reducing the search space.
3.2. Pre-processing for efficient searching When two graphs are being considered for matching, before initiation of the genetic algorithm, nodes can be grouped into similarity classes using a simple hierarchical clustering algorithm. In this scheme, nodes are grouped into the same class as long as the maximum intra-class distance, i.e. diff, does not exceed a threshold. The maximum substitution cost that would be incurred by mapping nodes of the same similarity class to one another will be bounded by a function proportional to this threshold. Let the node substitution cost function be a monotonically nondecreasing function of diff. Then, if the threshold is less than the minimal inter-class separation, it is guaranteed that the substitution cost among the nodes belonging to the same class is less than that between nodes belonging to different classes. Under this circumstance substitution between nodes belonging to the same class is a preferred operation. In particular, if we consider the problem of finding an exact match of a subpart in another object, i.e. the exact subgraph isomorphism problem where possibilities of node/edge insertion and/or deletion are not taken into account, the optimal solution, under this condition, will contain only mappings between nodes belonging to the same class.
For the general shape matching problem, it is not essential that the optimal solution contain only mappings between nodes belonging to the same class, because substitution between nodes belonging to different classes can provide a better match between the remaining elements of the shape. But, if we consider only the possibility of substitution between nodes belonging to the same class, it is very likely that we shall obtain good, possibly suboptimal solutions in much less time because we are restricting the set of possible matches between the nodes and thereby reducing the search space. Therefore, consideration of only substitution between nodes belonging to the same class can help us to speed up the genetic algorithm. In the subsequent discussions, nodes belonging to the same class are said to have the same color.
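The grouping of nodes into similarity classes can be sketched as follows. The paper only says that a simple hierarchical clustering algorithm is used, so the complete-linkage agglomerative scheme below, the Euclidean `diff`, and the function name `color_nodes` are assumptions chosen because they keep the maximum intra-class distance below the threshold.

```python
import math
from itertools import combinations

def diff(a, b):
    # Assumed mismatch measure: Euclidean distance between attribute vectors.
    return math.dist(a, b)

def color_nodes(attrs, threshold):
    """Merge clusters while the merged class keeps its diameter <= threshold.

    Returns a list giving the color (class index) of each node.
    """
    clusters = [[i] for i in range(len(attrs))]

    def linkage(c1, c2):
        # Complete linkage: the largest pairwise mismatch across two clusters.
        return max(diff(attrs[i], attrs[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        clusters[x] += clusters[y]
        del clusters[y]  # safe: y > x

    colors = [0] * len(attrs)
    for color, members in enumerate(clusters):
        for i in members:
            colors[i] = color
    return colors
```

Nodes of both graphs would be pooled before clustering, so that a color means the same thing in each graph.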
3.3. Modified crossover operator Once nodes are grouped into classes by preprocessing, a node of one graph can be mapped onto a node of another graph only if both have the same color. Thus, the solution string can be represented by as many substrings as there are node colors in the graph. Within each substring or submatrix, all the nodes have the same color (isochromic nodes). The solution, then, is represented as a "hyperstring." Hyperstrings or hyper bit matrices can also have null mappings, if the same number of nodes with identical color are not present in both the graphs. During crossover, each of the substrings in one parent is crossed with its counterpart in the other parent. This avoids the potentially wasteful mating between strings of different colors. The resulting algorithm can be expected to be much faster, since the search is confined to rearrangements within each color class rather than over the whole string.
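A minimal sketch of the color crossover on hyperstrings is given below. It assumes a hyperstring is stored as a dictionary from color to substring, that matching substrings in the two parents are permutations of the same labels with no null mappings, and that a one-point order crossover is used within each color; all names are hypothetical.

```python
import random

def one_point_order(p1, p2, cut):
    """Order crossover on a single substring (see Section 2.3.1)."""
    head = list(p1[:cut])
    spare = iter([g for g in p2[:cut] if g not in head])
    tail = [g if g not in head else next(spare) for g in p2[cut:]]
    return head + tail

def color_crossover(h1, h2, rng=random):
    """Cross each color substring of h1 only with its counterpart in h2."""
    child1, child2 = {}, {}
    for color, s1 in h1.items():
        s2 = h2[color]
        cut = rng.randrange(len(s1) + 1)
        child1[color] = one_point_order(s1, s2, cut)
        child2[color] = one_point_order(s2, s1, cut)
    return child1, child2
```

Since genes never cross a color boundary, every offspring keeps the same set of labels in each color substring as its parents.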
3.4. The color mutation operator We randomly select a color out of all the node colors. We then pick out the substring of that color from the hyperstring. Mutation is then performed on it in a manner similar to that described for the simple mutation operator before.

Algorithm ColorMut (Solution1): Output: Solution2
1. Randomly choose one of the node colors.
2. Select the substring corresponding to that node color, $(a_1 a_2 \ldots a_j a_k \ldots a_n)$.
3. Select a pair of positions j and k at random and swap $a_j$ and $a_k$.
4. Return $(a_1 a_2 \ldots a_k a_j \ldots a_n)$ as the mutant string.
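The steps of ColorMut can be sketched as below, with a hyperstring again assumed to be a dictionary from color to substring. Restricting the random choice to colors with at least two nodes is an assumption added here so that a swap is always possible; the algorithm as stated simply picks any color.

```python
import random

def color_mutation(hyperstring, rng=random):
    """Swap two positions inside one randomly chosen color substring."""
    mutant = {color: list(sub) for color, sub in hyperstring.items()}
    # Assumption: only colors with two or more nodes can be mutated by a swap.
    eligible = [color for color, sub in mutant.items() if len(sub) >= 2]
    if not eligible:
        return mutant
    sub = mutant[rng.choice(eligible)]
    j, k = rng.sample(range(len(sub)), 2)
    sub[j], sub[k] = sub[k], sub[j]
    return mutant
```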
4. PARALLELIZATION
Distributed implementation of genetic algorithms is possible by creating multiple subpopulations and allowing evolution of each population in parallel. The basic steps involved in the parallel algorithm are as follows:
• A genetic representation of the optimization problem is defined.
• A number of subpopulations are created.
• Each subpopulation evolves independently of others. Mating, mutation and reproduction are carried out within every subpopulation.
• Migration of individuals across subpopulations occurs at regular intervals.
In the island model of parallel genetic algorithms, each subpopulation is isolated and each subpopulation accepts immigrants in accordance with a predefined policy. Distributed implementation of the genetic algorithm allows faster convergence to the solution since subpopulations are allowed to evolve concurrently in an asynchronous fashion. Larger population sizes can be explored because the amount of memory available for exploring the solution space increases with the availability of multiple processing units. Having several subpopulations with different mutation rates also helps in avoiding the problem of premature convergence to a local minimum. (13) The gene pool is able to preserve diversity much better in this distributed model.

Algorithm ColorXO_1 (Solution1, Solution2), the color crossover of Section 3.3: Output: Offspring1, Offspring2
1. For each of the color substrings of Solution1, select that substring of Solution2 that has the same color. Perform Step 2 for each such pair.
2. Mate the two substrings using any of the Order, PMX or Cycle crossover operators, yielding new substrings.
3. Steps 1 and 2 yield sets of color substrings. Each of these represents a hyperstring. The hyperstrings thus formed are the offspring which go into the next stage of the genetic cycle (struggle for existence).

The most important issue for the distributed algorithm is the design of the appropriate criterion for selection of solutions that are to be communicated to neighboring populations, and the rules for selection of solutions that these will replace. One could do with random selection, but then the migration will not be very effective as "good" and "bad" solutions will have equal probability of selection. Clearly, the selection criteria should be biased in favor of solutions with lower cost (or, higher fitness) values and the replacement strategy should pick poorer solutions with greater probability.
One way to implement this is to define a migration function as follows:

$$M_{Tx}(i) = \frac{\exp(-K/\mathrm{Fitness}_i)}{\sum_j \exp(-K/\mathrm{Fitness}_j)},$$

where K is a user-selected constant. The denominator is the normalizing summation which makes the value of the migration function bounded between 0 and 1. However, its value undergoes an exponential decay as the fitness of the solution decreases. Consequently, a solution with high fitness has a very high migration function value and it becomes the most likely candidate to migrate. For solutions with low fitness, the value of the migration function becomes almost the same. The replacement function is similarly defined:

$$M_{Rx}(i) = \frac{\exp(-\mathrm{Fitness}_i/K)}{\sum_j \exp(-\mathrm{Fitness}_j/K)}.$$

In this case, the value of the replacement function for a solution with a minimum fitness value is maximum because it is the most appropriate candidate for replacement. The distributed algorithm is essentially equivalent to a sequential algorithm running on each processor node, with migration at periodic intervals. During each migration, a number of solutions are chosen to be communicated to other processors. The frequency of migration, as also the number of migrants (measured as a fraction of the total population), are important parameters for the distributed algorithm. The probability of a solution string i being chosen for migration (nondestructive copy) is given by $M_{Tx}(i)$, the migration function. The probability that, on the destination processor, solution string j will be replaced by the incoming migrant is given by the replacement function, $M_{Rx}(j)$. Thus, the fitter ones are selected for migration, and the poorer ones are removed to make space for the former.
In general, the mutation probabilities may be kept different for different subpopulations. We kept the mutation probability substantially higher in a few subpopulations than in the rest. These subpopulations then act as vortices of gene production, thereby enriching the gene pool. Excessive random perturbations in the gene pool will, however, be held in check by the other subpopulations. Hence, the ability to learn from history will not be lost. This results in a much faster evolution process.
As an illustration, suppose we take two runs of the parallel genetic algorithm, each with four subpopulations. Assume in the first run the mutation probabilities of the four subpopulations are all equal to 0.01. In the second case, assume the respective values to be 0.005, 0.005, 0.005 and 0.025. Now, it may be noted that the average mutation probability in both cases is 0.01. Therefore it might be expected that the runs should yield similar results.
However, it was observed that the second case produced much better results. We offer the following explanation for this phenomenon.
A mutation most often results in a degradation of the phenotype. As such, in a large population of fit individuals, such a mutation is unlikely to be propagated through generations. This is responsible for the relatively poor performance of the first run. However, some of these mutations may be promising candidates for the future and should be given a chance. In the second run, one of the processes had a very high mutation rate of 0.025. This results in a large number of mutations being induced into the population. The average fitness of the subpopulation drops and the whole subpopulation becomes somewhat unfit. Therefore, relatively speaking, the chances of survival of a "bad" mutation somewhat increase. These mutations, albeit few in number, are responsible for enriching the gene pool by introducing diversity. This explains why an unequal distribution of mutations amongst the subpopulations yields a more efficient algorithm. The above observation for the parallel algorithm is consistent with the results obtained for the improved genetic algorithm proposed in reference (13).
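The migration and replacement functions defined earlier in this section can be sketched as follows. The function names, the single-migrant `migrate` helper and the `rng` argument are assumptions for illustration; fitness values are assumed to be strictly positive, and K here is the user-selected constant of the migration function, not the fitness scaling constant of Section 2.2.

```python
import math
import random

def migration_probs(fitness, K):
    """M_Tx(i): higher fitness gives a higher chance of being sent."""
    w = [math.exp(-K / f) for f in fitness]
    total = sum(w)
    return [x / total for x in w]

def replacement_probs(fitness, K):
    """M_Rx(i): lower fitness gives a higher chance of being overwritten."""
    w = [math.exp(-f / K) for f in fitness]
    total = sum(w)
    return [x / total for x in w]

def migrate(src, src_fit, dst, dst_fit, K, rng=random):
    """Copy one migrant from src into dst (nondestructive for src)."""
    i = rng.choices(range(len(src)), weights=migration_probs(src_fit, K))[0]
    j = rng.choices(range(len(dst)), weights=replacement_probs(dst_fit, K))[0]
    dst[j] = list(src[i])
    return i, j
```

In a full island model this step would run at the chosen migration interval between neighboring subpopulations, with the number of migrants per step as a further parameter.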
5. OBJECT RECOGNITION USING GENETIC ALGORITHM
In this section, we propose a framework for the development of an object recognition system using the genetic algorithm presented here. In the model based object recognition system we need to look for the best match of the candidate object with the members of the object library. Each element in the object library is expected to be represented by an attributed relational graph. The object library contains additional information about the types of nodes. Depending on the attributes and the nature of attributes, the nodes are grouped into classes. These classes can be considered as the set of generic parts of the shapes and they correspond to chromatic labels of the nodes. Given a candidate object, nodes of the object are mapped to the nearest class using the simple nearest neighbor classification rule. Genetic populations considered for this problem will contain solution strings/bit matrices corresponding to each element of the library. Consequently, each solution representation has an additional field indicating the identity of the corresponding object. Again elements in each solution are organized into isochromatic groups. For the purposes of successor generation, the crossover operator is applied between isochromatic substrings of the genes referring to the same object in the library.
For generation of the initial population the following strategy is applied. Each object in the library is associated with an accumulator. In the process of initial node classification, whenever a particular node of the given unknown object is mapped to a node class, all the objects which have contributed at least a node to that class are considered as promising candidates and the accumulator is incremented. After completion of the mapping process, the initial population is generated on the basis of the value of the accumulator associated with the object.
The proportion of solution strings belonging to a particular object is determined by this value. If the available memory can accommodate M members of the population and an object has accumulated x votes, then xM/y solution strings of the initial population will belong to this object, where y is the total number of votes accumulated by all the objects in the library. In this way we can ensure that the initial population contains a greater proportion of better candidates without losing variety in the population.
The above mentioned approach becomes particularly suitable for part based object recognition schemes. In the part based scheme, given a set of parts we assume that all the objects are composed of some or all of these parts and there could be multiple instances of these parts in an object. Objects are distinguished on the basis of the spatial configurations of these parts. These parts naturally correspond to the classes of nodes described earlier. For a very large collection of object models, typically in a pictorial or a graphical database, when an object is to be identified from partial information, this genetic algorithm provides an efficient solution strategy.
For the problem of recognition of objects in a complex multiple object image, this genetic algorithm based approach can be applied to the consistent merging of object hypotheses. Invariant based indexing techniques have been found to be a powerful method for the recognition of objects in a cluttered environment under affine and projective transformations. (2) This approach is particularly suitable when a large number of object models are involved. Invariants computed from the image are used to generate index vectors for the object models. On the basis of mapping formed through this indexing process, initial object hypotheses are generated. These invariants are computed from a collection of primitives found in the image which are mapped to the corresponding collections in the object models.
Since there can be multiple collections of such primitives in the same object as well as in different objects, more than one object hypothesis will be generated by this process. Once all the hypotheses are generated, it is essential that they be merged so that a consistent set of hypotheses can be selected for the final verification task. For merging hypotheses we need to consider all possible combinations of these primitive mappings for the individual objects. These combinations can be represented using the integer string based representation discussed in Section 2, because the indexing process yields possible mappings only for the primitives of model objects. The cost of the mappings is determined by the distance between the edges representing spatial relations between the primitives. The optimal combination provides the best hypotheses. This is a combinatorial optimization problem with exponential time complexity. Particularly for complex objects having local invariants, the overhead of this computation can be large. Hence, the present formulation of the genetic algorithm provides an attractive alternative. A straightforward application of the genetic algorithm using the standard crossover operator formulated in this paper is appropriate for this problem.
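The vote-proportional seeding of the initial population described earlier in this section (x * M / y strings per object) can be illustrated with a minimal Python sketch. The function name `allocate_population` and the largest-remainder rounding used to keep the total at exactly M are assumptions made for illustration; the paper does not say how fractional shares are rounded.

```python
def allocate_population(votes, capacity):
    """Split `capacity` population slots among objects in proportion to their votes.

    votes: dict mapping an object name to its accumulated votes (x).
    capacity: number of population members that fit in memory (M).
    """
    total = sum(votes.values())  # y: votes accumulated by all objects
    if total == 0:
        raise ValueError("no votes accumulated")
    # Ideal, possibly fractional, share of each object: x * M / y.
    ideal = {name: x * capacity / total for name, x in votes.items()}
    # Start from the integer part of each share.
    shares = {name: int(value) for name, value in ideal.items()}
    # Assumption: leftover slots go to the largest fractional remainders.
    leftover = capacity - sum(shares.values())
    by_remainder = sorted(ideal, key=lambda name: ideal[name] - shares[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical vote counts for three library objects.
example_shares = allocate_population({"hammer1": 6, "hammer2": 3, "hammer3": 1}, 100)
```

With these hypothetical votes the object that gathered 6 of the 10 votes receives 60 of the 100 strings, while the weaker candidates still keep a presence in the population.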
5.1. Recognition of planar shapes--a case study
In this section we consider the application of the genetic algorithm formulated in this paper to the recognition of planar shapes. We have considered the problem of finding the closest match for a given handtool among a set of similar handtools known a priori. As handtools, different types of hammers have been considered. Hammers are assumed to be composed of subparts, each of which is characterised by a feature vector describing its shape. Each subpart belongs to one of three clusters representing the following shapes: triangle, rectangle, trapezium. Various types of joins among the parts have been used as spatial relations between the parts. Using the basic nature of the join as a relational feature, instead of specific numeric parameters, has enabled us to obtain invariant shape descriptions despite minor variations in the shape. The types of joins considered for the present problem are the following:
(T1, X, Y): join of one end of primitive X with primitive Y at the mid-point of Y in an orthogonal fashion.
(T2, X, Y): join of an endpoint of primitive X with an endpoint of Y.
(T3, X, Y): placement of primitive X or Y at the middle of the other, between its two ends.
Corresponding to these types of joins, edges in the attributed relational graphs have been labeled R1, R2 and R3, respectively. Individual prototype handtools and the corresponding attributed relational graphs are shown in Fig. 2. In the figure, individual primitives are indicated as L_i. In case of multiple instances of the same primitive, L_j^i indicates the ith instance of the jth primitive. For experimentation we have considered the problem of matching the undistorted hammer 1 with the stored prototypes. We have used the color crossover and mutation operators. In all cases the correct match is found within 10 generations. We also considered the problem of finding the match for distorted hammers (in fact, hammers with missing parts).
These hammers and the corresponding attributed relational graphs are shown in Fig. 3. Even with these hammers, correct recognition results have been obtained. The genetic algorithm produced correct results within 15 generations. In all these experiments, the initial population contained five randomly chosen candidate solutions. These case studies establish the feasibility of the approach. In the next section we describe extensive experiments carried out with randomly generated graphs to establish the effectiveness of the proposed crossover operator, the parallelization scheme and the effect of different parameters on the performance of the algorithm.
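As an illustration of how the shape classes and join types of this case study could be stored, the following Python sketch builds a small attributed relational graph. The class `RelationalGraph`, its method names and the two-part example hammer are hypothetical; the actual graphs of Fig. 2 are not reproduced here.

```python
from dataclasses import dataclass, field

# Shape clusters and join types named in the case study.
SHAPE_CLASSES = {"triangle", "rectangle", "trapezium"}
JOIN_LABELS = {"T1": "R1", "T2": "R2", "T3": "R3"}


@dataclass
class RelationalGraph:
    """Attributed relational graph: primitives as nodes, joins as labeled edges."""
    nodes: dict = field(default_factory=dict)  # primitive name -> shape class
    edges: dict = field(default_factory=dict)  # (X, Y) -> relation label

    def add_primitive(self, name, shape):
        if shape not in SHAPE_CLASSES:
            raise ValueError(f"unknown shape class: {shape}")
        self.nodes[name] = shape

    def add_join(self, join_type, x, y):
        # Both primitives must exist before they can be related.
        if x not in self.nodes or y not in self.nodes:
            raise KeyError("both primitives must be added before joining them")
        self.edges[(x, y)] = JOIN_LABELS[join_type]


# Hypothetical two-part hammer: a handle meeting the head at its mid-point.
hammer = RelationalGraph()
hammer.add_primitive("L1", "rectangle")  # handle
hammer.add_primitive("L2", "rectangle")  # head
hammer.add_join("T1", "L1", "L2")
```

Edges are stored as ordered pairs because a T1 join distinguishes the primitive whose end is attached from the primitive that is met at its mid-point.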
6. RESULTS AND DISCUSSION
In this section we present the results regarding the performance of the genetic algorithm. The objective of this phase of experimentation was the evaluation of the genetic algorithm. For this purpose we have used synthesized attributed relational graphs of different sizes.
Fig. 2. Model handtools (panels: Hammer 1, Hammer 2, Hammer 3, Hammer 4).
Fig. 3. Distorted handtools (panels: Distorted Hammer 1, Distorted Hammer 3).
Fig. 4. (a) and (b): tuning population size (horizontal axis: population size).
6.1. Sequential algorithm
The following are the parameters that were varied in the implementation.
6.1.1. Population size. The population size determines how extensively the search is conducted over the solution space. A small population would mean less genetic memory. A large population, on the other hand, translates into more computation time per generation. Figure 4 records the variation in the performance of the genetic algorithm with population size for two sample problems involving graphs of 20 and 10 nodes, respectively. The performance is evaluated using two indices: (i) the number of generations before the first exact solution is found,¹ and (ii) the time taken for such a solution to be found. We have found that a population of 100 is sufficient for a problem size of 20 × 20 nodes (these pertain to the number of nodes in the Graph and the Template). A 100 × 100 nodes problem, however, needs a population size of 300-400.
Fig. 5. Tuning mutation probability (horizontal axis: mutation probability).
Fig. 6. Tuning mating probability (horizontal axis: mating probability).
6.1.2. Mutation probability. A mutation probability between 0.03 and 0.15 mutations per population works well. Higher values (>0.15) sometimes result in quicker solutions but poorer convergence. Therefore, we have used a value of 0.1 or 0.09 in most problems (see Fig. 5). However, for large problem sizes (greater than 30 nodes), increasing it slightly to 0.12 gives better results. Since the size of the search space increases exponentially with the number of nodes in the graphs, and the population size has to be kept within the constraints of computing memory, increasing the rate of mutation is one way of conducting a more extensive search.
6.1.3. Mating probability. This is the probability that a solution string will undergo the mating phase. Thus, if the mating probability is 0.75, a quarter of the solution strings will move into the next generation without undergoing crossover. A very low value of this parameter makes the genetic algorithm dependent upon mutation as the primary means of evolution, thereby resulting in longer execution times. Too high a value introduces a tendency to eliminate potential solutions, again increasing runtimes. We have found a value of 0.7 to be optimal (see Fig. 6).
6.1.4. Selectivity. This is a measure of the variation in the fitness function across the population. Assigning it a fixed value makes the fitness function adaptive to changes in the constitution of the population. A higher value implies greater discrimination. Too high a value squanders the fruits of mutation, since mutant offspring are, more often than not, temporary misfits but potential solutions. It also reduces diversity. Too low a value, however, implies a slower algorithm. We have found a value of 3-5 to be optimal (see Fig. 7).
Fig. 7. Tuning selectivity (horizontal axis: selectivity).
Tables 1 and 2 summarize the algorithm performance (as measured by the number of generations to convergence) for different problems. The problems were generated randomly such that the average number of neighbors of each node was around three to four for the smaller problems (10-30 nodes) and around eight to ten for the bigger problems. Ten different problem instances were tried for each problem size listed. Each problem instance was tried three times using different random number seeds. There was no noticeable difference in algorithm performance among the three traditional conflictless crossover operators--Order, PMX and Cycle. The results for these three operators are shown together in Table 1. Table 2 summarizes the results obtained for the algorithm that uses the color operators. We have used two equiprobable colors as node and edge attributes.
¹ An exact solution may not exist for some problems. For such problems, the statistics pertain to the number of generations or time taken to converge. The convergence criterion is as follows: if the maximum fitness in the population does not increase for 10 successive generations, the algorithm is halted.
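The convergence criterion stated in the footnote (halt when the maximum fitness has not increased for 10 successive generations) can be sketched in Python as follows. The function name `generation_of_halt` is hypothetical, and "does not increase" is read here as "does not exceed the best value seen so far", which is an assumption.

```python
def generation_of_halt(max_fitness_per_generation, patience=10):
    """Return the index of the generation at which the run would be halted.

    max_fitness_per_generation: maximum population fitness, one value per generation.
    patience: number of successive non-improving generations that triggers the halt.
    Returns None if the criterion is never met within the given trace.
    """
    best = float("-inf")
    stalled = 0  # successive generations without improvement
    for generation, fitness in enumerate(max_fitness_per_generation):
        if fitness > best:
            best = fitness
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                return generation
    return None


# Hypothetical trace: three improving generations followed by a plateau.
trace = [1.0, 2.0, 3.0] + [3.0] * 10
halt_at = generation_of_halt(trace)
```

In the hypothetical trace the last improvement happens at generation 2, so the tenth non-improving generation is generation 12, where the run stops.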
Table 1. Algorithm performance when Order, PMX and Cycle crossover operators are used

Problem size*   Search space   Mutation prob.   Population   Number of      Solution
(N × n)         (^N P_n)       (optimal)        size         generations†   quality§
10 × 10         3.6 × 10^6     0.05             100          30-50          Exact
30 × 30         2.7 × 10^32    0.09             100          Infinity‡      --¶
50 × 50         3.0 × 10^64    0.10             100          Infinity       --

* N and n are the number of nodes in the Graph and the Template, respectively.
† This is the number of generations before either an exact solution is obtained or the algorithm converges. Ten problem instances for each problem size were tried, and the number of generations is reported as a range.
‡ For problem sizes of 30 × 30 and larger, the genetic algorithm failed to yield acceptable solutions. Even when it was allowed to run for 200 generations or more, the solution quality (defined below) remained less than 0.5.
§ Solution quality is defined as the fraction of the edges of the Template (T) that are correctly "matched" with corresponding edges in the Graph (G). An edge E_T in T is said to be correctly matched with its counterpart E_G in G if their associated edge attribute vectors are equal (or the difference between them is below a threshold) and the bounding nodes of E_T match the bounding nodes of E_G in node attributes within a threshold.
¶ The solution quality was less than 0.5.
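The search-space column of Table 1 can be checked with a short Python sketch. It assumes that the garbled column header denotes the number of ordered assignments of the n Template nodes to distinct Graph nodes, N!/(N-n)!; this reading is an inference, supported by the fact that it reproduces the tabulated values. The function name `search_space_size` is hypothetical.

```python
import math


def search_space_size(N, n):
    """Number of ways to map n Template nodes onto distinct nodes of an N-node Graph."""
    return math.perm(N, n)  # N! / (N - n)!


# Reproduce the three rows of Table 1 in scientific notation.
for size in (10, 30, 50):
    print(f"{size} x {size}: {search_space_size(size, size):.1e}")
```

The growth from roughly 10^6 to 10^64 candidate mappings between the 10-node and 50-node problems is what makes exhaustive search impractical for the larger sizes.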
Table 2. Algorithm performance when the Color crossover operator is used (two colors)

Problem size   Mutation prob.   Population   Number of     Solution
(N × n)        (optimal)        size         generations   quality
10 × 10        0.05             100          5-15          Exact
30 × 30        0.09             100          50-100        Exact
50 × 50        0.10             100          100-200       0.8

Thus, the color crossover operator outperforms the traditional crossover operators--Order, PMX and Cycle. The results in Table 2 have been compiled for problems involving only two colors. If, instead, three or more colors are used, further improvement results: the solution quality in the third class (50 × 50) becomes "exact" and the runtimes in the first two size classes (10 × 10 and 30 × 30) are reduced.
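The solution quality reported in the tables, as defined in the footnote to Table 1, can be sketched in Python as follows. This is a simplified illustration: attributes are compared for exact equality instead of within a threshold, edges are treated as ordered node pairs, and the function and argument names are hypothetical.

```python
def solution_quality(template_edges, graph_edges, template_nodes, graph_nodes, mapping):
    """Fraction of Template edges whose image under `mapping` is a matching Graph edge.

    template_edges, graph_edges: dict mapping (u, v) node pairs to an edge attribute.
    template_nodes, graph_nodes: dict mapping a node to its attribute (e.g. a color).
    mapping: dict sending each Template node to a Graph node.
    """
    if not template_edges:
        return 1.0  # convention chosen here: an edgeless Template is trivially matched
    matched = 0
    for (u, v), attribute in template_edges.items():
        gu, gv = mapping[u], mapping[v]
        # The bounding nodes must agree in their attributes.
        nodes_agree = (template_nodes[u] == graph_nodes[gu]
                       and template_nodes[v] == graph_nodes[gv])
        # The corresponding Graph edge must exist and carry the same attribute.
        edge_agrees = (gu, gv) in graph_edges and graph_edges[(gu, gv)] == attribute
        if nodes_agree and edge_agrees:
            matched += 1
    return matched / len(template_edges)


# Hypothetical three-node Template and Graph with one mismatched edge label.
t_nodes = {0: "red", 1: "blue", 2: "red"}
g_nodes = {"a": "red", "b": "blue", "c": "red"}
t_edges = {(0, 1): "R1", (1, 2): "R2"}
g_edges = {("a", "b"): "R1", ("b", "c"): "R3"}
quality = solution_quality(t_edges, g_edges, t_nodes, g_nodes, {0: "a", 1: "b", 2: "c"})
```

An "Exact" entry in the tables corresponds to a quality of 1.0 under this definition.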
Table 3. Performance of the distributed algorithm with the color operators

Problem size   Mutation      Population size   Number of     Solution
(N × n)        probability   per node          generations   quality
30 × 30        *             100               10-30         Exact
50 × 50        *             100               70-100        Exact

* See text.
6.2. Parallel implementation
The following parameters are critical for the parallel implementation.
6.2.1. Mutation probability. A mutation probability of 0.05-0.10 has been used on most subpopulations, and a higher value of 0.3-0.4 has been used in one of them.
6.2.2. Time to migration. This is the number of generations after which each subpopulation allows migration. A very low value not only nullifies the effect of isolation (the basic philosophy behind the "island" model), but also leads to excessive processor communication overheads. On the other hand, with a high value of this parameter, the subpopulations evolve independently of each other, and the algorithm tends to lose the benefits of parallelization. We have used a value of 5 for this parameter.
6.2.3. Fraction to migrate. This is the ratio of the number of migrants from each subpopulation to the total size of that subpopulation. Too high a value implies a homogeneous population and defeats the very principle of isolation upon which the distributed model is based. Similarly, too low a value is unacceptable. It is meaningless, however, to talk of this parameter without regard to the number of generations between successive migration events. If migration is allowed every five generations, as mentioned above, a value of one-tenth to one-twentieth for this parameter results in a good evolutionary process.
Table 3 shows a few results of the parallel implementation. Ten instances of each problem size were generated, as for the sequential version. The parallel algorithm was run on four transputers.
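The migration parameters of Section 6.2 can be illustrated with a minimal single-process Python sketch of one migration event. The ring topology, the choice of the fittest members as migrants and the replacement of the weakest members of the receiving subpopulation are all assumptions; the text above specifies only the migration interval (5 generations) and the migrating fraction (one-tenth to one-twentieth). The names `migrate` and `should_migrate` are hypothetical.

```python
MIGRATION_INTERVAL = 5  # generations between migration events (value from the text)


def should_migrate(generation, interval=MIGRATION_INTERVAL):
    """True on every `interval`-th generation after the start."""
    return generation > 0 and generation % interval == 0


def migrate(islands, fitness, fraction=0.1):
    """Perform one migration event between subpopulations arranged in a ring.

    islands: list of subpopulations, each a list of solution strings.
    fitness: function mapping a solution string to a number (higher is better).
    fraction: share of each subpopulation that migrates.
    """
    # Select the emigrants of every island before any island is modified.
    emigrants = []
    for population in islands:
        count = max(1, int(len(population) * fraction))
        ranked = sorted(population, key=fitness, reverse=True)
        emigrants.append(ranked[:count])
    # Island i receives the emigrants of island i - 1 and drops its weakest members.
    new_islands = []
    for i, population in enumerate(islands):
        incoming = emigrants[i - 1]
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[:len(population) - len(incoming)]
        new_islands.append(survivors + incoming)
    return new_islands


# Four hypothetical subpopulations of ten integer "individuals" (fitness = value).
islands = [list(range(start, start + 10)) for start in (1, 11, 21, 31)]
after = migrate(islands, fitness=lambda individual: individual, fraction=0.1)
```

Each subpopulation keeps its size, so the memory budget per processor stays constant across migration events.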
7. CONCLUSIONS
This paper presents a genetic algorithm for matching the structural shape descriptions of objects. A framework for an object recognition system using genetic algorithms has been provided. Other contributions of this paper are the following:
• We have tuned the genetic operators, namely reproduction, crossover and mutation, to the specific problem of shape matching.
• We have designed a new variation of the crossover operator--the color crossover--which speeds up the evolution process by an order of magnitude.
• We have parallelized the algorithm using the island model.
We have suggested certain optimal values of the genetic algorithm parameters after extensive experimentation. This technique will be particularly suitable for retrieval and identification of records in large pictorial databases using shape descriptions of manufactured objects.
REFERENCES
1. L. G. Shapiro and R. M. Haralick, Organisation of relational models for scene analysis, IEEE Trans. Pattern Analysis Mach. Intell. PAMI-4(6), 595-602 (1982).
2. C. A. Rothwell, Recognition using projective invariance, D.Phil. Thesis, Department of Engineering Science, University of Oxford, Oxford (1993).
3. R. T. Chin and C. R. Dyer, Model based recognition in robot vision, ACM Comput. Surveys 18, 69-108 (1986).
4. K. L. Boyer, A. J. Voyda and A. C. Kak, Robotic manipulation experiments using structural stereopsis for 3d vision, IEEE Expert 2, 73-94 (1986).
5. B. J. Kuipers and T. S. Levitt, Navigation and mapping in large scale space, AI Magazine 9(2) (1988).
6. Y. C. Tang and C. S. George Lee, A geometric feature relation graph formulation for consistent sensor fusion, IEEE Trans. System Man Cybernet. 22(1), 115-129 (1992).
7. A. Sanfeliu and K. S. Fu, A distance measure between attributed relational graphs for pattern recognition, IEEE Trans. Systems Man Cybernet. SMC-13 (1983).
8. L. Davis, ed., Handbook of Genetic Algorithms. Van Nostrand Reinhold, New York (1991).
9. J. A. Miller, W. D. Potter, R. V. Gandham and C. N. Lapena, An evaluation of local improvement operators for genetic algorithms, IEEE Trans. System Man Cybernet. 23(5), 1340-1352 (1993).
10. G. E. Leipins, M. R. Hilliard, M. Palmer and M. Marrow, Genetic algorithm applications to set covering and travelling sales problems, OR/AI: Integration of Problem Solving Strategies (1990).
11. Alan Varsek, Tanja Urbancic and Bogdan Filipic, Genetic algorithms for controller design and tuning, IEEE Trans. Systems Man Cybernet. 23(5), 1320-1339 (1993).
12. P. Majumdar and K. Shahookar, A genetic approach to standard cell placement using meta-genetic parameter optimisation, IEEE Trans. CAD 9 (1990).
13. J. C. Potts, T. D. Giddens and S. B. Yadav, The development and evaluation of an improved genetic algorithm based upon migration and artificial selection, IEEE Trans. Systems Man Cybernet. 24(1) (1994).
About the Author--MONTEK SINGH is currently a Ph.D. student with the Department of Computer Science, Columbia University, New York. He received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, New Delhi, in 1993. His current research interests are in CAD, testing, asynchronous circuits and combinatorial optimization.
About the Author--AMITABHA CHATTERJEE is currently with Schlumberger Wireline and Testing. He received the B.Tech. degree in Electrical Engineering from the Indian Institute of Technology, New Delhi, in 1993.
About the Author--SANTANU CHAUDHURY is an Associate Professor with the Department of Electrical Engineering, Indian Institute of Technology, New Delhi. He received the B.Tech. degree in Electronics and Electrical Communication Engineering in 1984 and the Ph.D. degree in Computer Science and Engineering in 1989 from the Indian Institute of Technology, Kharagpur. He was awarded the Indian National Science Academy medal for Young Scientists in 1993. He has published more than 40 papers in journals and conference proceedings. His research interests are in computer vision, pattern recognition and multimedia systems.