Object representation based on contour features and recognition by a Hopfield-Amari network

Neurocomputing 16 (1997) 127-138

Alan M.N. Fu*, Hong Yan
Department of Electrical Engineering, University of Sydney, NSW 2006, Australia

Received 12 February 1996; accepted 10 March 1997

Abstract

In this paper, a type of Hopfield-Amari neural network is built based on a so-called curve bend function (CBF) for the recognition of planar shapes (contours). Two kinds of features, real-valued features and binary features, are defined by means of the CBF for given contours to characterize the shapes. The overlap between features is reduced effectively in the process of network construction. The experimental results demonstrate that the proposed system is powerful and reliable in solving shape recognition problems.

Keywords: Contour feature; Curve bend function; Optimized threshold; Contour recognition; Object representation

1. Introduction

The Hopfield network [8] is a powerful tool for solving pattern recognition problems and has been used in many applications [9,3,10]. However, for a particular problem it is not known in advance how long the network will take to reach a stable state, nor whether that stable state is a global one. Another Hopfield-type network is the Hopfield-Amari network, a synchronous-update recurrent associative memory network whose statistical dynamics have been studied by many researchers (for example, [1,2,11,4]). An essential dynamical feature of the Hopfield-Amari network is that it always converges to a stable state within about 40 iterations [6]. This property enables us to solve pattern recognition and shape analysis problems directly from its dynamic process [5]. In this paper, a Hopfield-Amari network designed using the properties of a function called the curve bend function (CBF) [7] is proposed for classifying planar shapes. In Section 2, we present the CBF and define the features of a planar shape. The structure of the network is given in Section 3. The experimental results and conclusion are presented in Sections 4 and 5, respectively.

* Corresponding author. Tel.: +61 2 9351-4824; fax: +61 2 9351-3847; e-mail: [email protected]

0925-2312/97/$17.00 Copyright © 1997 Elsevier Science B.V. All rights reserved. PII S0925-2312(97)00026-X

2. Properties of CBF and features of a planar shape

A contour is the 8-connected boundary of an object represented by a binary image:

I(x, y) = { 1 if (x, y) ∈ object; 0 otherwise }.   (1)

The points on a contour can be traced and labeled counterclockwise from an arbitrary starting point, and represented by the array Ω = {S_k, k = 0, 1, ..., M - 1}, where M is the total number of points. A contour is a closed planar curve, thus Ω can be considered as a periodic function with period M, i.e. we define S_{M+i} = S_i and S_{-i} = S_{M-i}, i = 0, 1, ..., M - 1. A portion of a contour, composed of the 2J + 1 points S_i, ..., S_{i+2J}, is denoted by D_i(J), where J is a positive number called the supported length or step length. Suppose a line segment S_{i+J}H_{i+J} is drawn perpendicular to S_iS_{i+2J}, and H_{i+J} is the intersection point of S_{i+J}H_{i+J} and S_iS_{i+2J}.
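The closed-contour indexing above (S_{M+i} = S_i, S_{-i} = S_{M-i}) amounts to wrapping indices modulo M. A minimal sketch, with an illustrative list-of-points contour representation of our own:

```python
def contour_point(contour, i):
    """Return S_i under periodic (closed-contour) indexing, so that
    S_{M+i} == S_i and S_{-i} == S_{M-i} as in the text."""
    M = len(contour)  # total number of contour points
    return contour[i % M]

# A square traced counterclockwise from an arbitrary starting point.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Python's `%` operator already returns a non-negative remainder for negative arguments, so `contour_point(square, -1)` yields S_{M-1} directly.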

Definition 1. If there exists a positive number k > 0 such that cos ∠S_{i-l}S_{i-l+J}H_{i-l+J} ≤ cos ∠S_iS_{i+J}H_{i+J} and cos ∠S_iS_{i+J}H_{i+J} ≥ cos ∠S_{i+l}S_{i+l+J}H_{i+l+J} for l = 0, 1, ..., k, then ∠S_iS_{i+J}H_{i+J} is referred to as a curve bend angle (CBA) of D_i(J). In practice it is satisfactory to set k = 5-10.

Definition 2. For a given J, if there is no more than one CBA in each portion D_i(J) of a contour, then D_i(J) is known as a simple curve segment (SCS), and J is referred to as a suitable step length.

Definition 3. For a given suitable step length J, the type coefficient of D_i(J) is defined as

r_i = 2I(c_{ix}, c_{iy}) - 1,   (2)

where I(x, y) is given by Eq. (1), and (c_{ix}, c_{iy}) is the coordinate of the centroid c_i of the line segment S_iS_{i+2J}. If r_i = 1, then the curve segment S_i...S_{i+2J} is a convex or line segment; otherwise it is a concave segment. A reliable description of a contour can be generated by the CBF, which is given below.

Definition 4. A CBF is defined on a contour Ω by

G(S_i) = r_i cos(∠S_iS_{i+J}H_{i+J}),  S_i ∈ Ω,   (3)

where r_i is the type coefficient of D_i(J) given by Eq. (2).
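The CBF value G(S_i) of Eq. (3) combines the type coefficient of Eq. (2) with the cosine of the curve bend angle. The sketch below is our own illustration, assuming H_{i+J} is the foot of the perpendicular from S_{i+J} onto the chord S_iS_{i+2J}; the `inside` predicate stands in for the binary image I(x, y) of Eq. (1), and all names are ours:

```python
import math

def cbf_value(contour, i, J, inside):
    """Sketch of G(S_i) = r_i * cos(angle S_i S_{i+J} H_{i+J}) (Eqs. (2)-(3)),
    with H_{i+J} taken as the foot of the perpendicular dropped from the
    midpoint S_{i+J} onto the chord S_i S_{i+2J}."""
    M = len(contour)
    p0 = contour[i % M]            # S_i
    pm = contour[(i + J) % M]      # S_{i+J}
    p1 = contour[(i + 2 * J) % M]  # S_{i+2J}
    # Foot H_{i+J} of the perpendicular from S_{i+J} onto the chord.
    cx, cy = p1[0] - p0[0], p1[1] - p0[1]
    t = ((pm[0] - p0[0]) * cx + (pm[1] - p0[1]) * cy) / (cx * cx + cy * cy)
    h = (p0[0] + t * cx, p0[1] + t * cy)
    # Cosine of the curve bend angle at the vertex S_{i+J}.
    ax, ay = p0[0] - pm[0], p0[1] - pm[1]
    bx, by = h[0] - pm[0], h[1] - pm[1]
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if nb == 0.0:                  # S_{i+J} lies on the chord: no bend
        return 0.0
    cos_cba = (ax * bx + ay * by) / (na * nb)
    # Type coefficient r_i of Eq. (2): 2 * I(centroid of chord) - 1.
    mx, my = (p0[0] + p1[0]) / 2.0, (p0[1] + p1[1]) / 2.0
    r = 2 * (1 if inside(mx, my) else 0) - 1
    return r * cos_cba
```

For a 90° corner the bend angle at S_{i+J} is 45°, so G is roughly ±cos 45° ≈ ±0.707, with the sign set by whether the chord's centroid falls inside the object.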

Fig. 1. Shape 1.

Fig. 2. The description of Shape 1.

A more detailed discussion of how to obtain a reliable CBF can be found in our related work (in press). It is found that J = (0.02-0.04)M is suitable for defining a CBF; in our experiments J = 0.02M. The plot of G(S_i) in Fig. 2 shows the description of Shape 1. The absolute value of G(S_i) measures the magnitude of a CBA, while its sign indicates whether the CBA is an inner or outer angle. Shape 2 in Fig. 3 is the capital character "F", whose description is shown in Fig. 4. The angle information of Shape 2 yielded by G(S_i) is presented in Table 1. It shows that the error in estimating the magnitude of an angle from G(S_i) is less than a few degrees if the supported length is selected suitably. Shape 3 in Fig. 5 is a circle. In theory, for a given J, the G(S_i) of a circle should be constant. However, due to noise, G(S_i) may vary slightly (see Fig. 6).

Fig. 3. Shape 2.

Fig. 4. The description of Shape 2.

Table 1
Types and degrees of angles of Shape 2

Angle   Type of angle   Angle (deg)
A       Inner angle      89.5
B       Inner angle      89.5
C       Outer angle     -95.9
D       Inner angle      89.5
E       Inner angle      89.5
F       Outer angle     -95.9
G       Outer angle     -95.9
H       Inner angle      89.5
I       Inner angle      89.5
J       Inner angle      84.5

Fig. 5. Shape 3.

Fig. 6. The description of Shape 3.

In order to obtain a set of reliable dominant points of Ω from G(S_i), the two pre-processing procedures described below are applied to G(S_i).
• Most angles having values close to 180° are spurious. We remove these angles in the pre-processing stage with the thresholding-filter

G(S_i) = { G(S_i) if |G(S_i)| > c_t; 0 otherwise },   (4)

where c_t is a threshold value, chosen to be 0.3 in our experiments.
• To ensure that no more than one angle exists in each SCS, supposing that m angles A_{i_1}, A_{i_2}, ..., A_{i_m} lie on D_i(J), we define

|G(S_{i_1})| = (1/m) Σ_{j=1}^{m} |G(S_{i_j})|   (5)

and

|G(S_{i_j})| = 0,  j = 2, ..., m.   (6)
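The two pre-processing steps of Eqs. (4)-(6) can be sketched as follows; the function names and the plain-list representation of G are our own illustrative choices:

```python
def threshold_filter(G, ct=0.3):
    """Eq. (4): angles close to 180 degrees give small |G(S_i)| and are
    treated as spurious; values at or below the threshold ct are zeroed."""
    return [g if abs(g) > ct else 0.0 for g in G]

def merge_angles(G, idxs):
    """Eqs. (5)-(6): when m candidate angles at positions idxs = [i1, ..., im]
    fall inside one portion D_i(J), the first keeps the mean magnitude (with
    its original sign) and the others are set to zero."""
    G = list(G)
    mean_mag = sum(abs(G[j]) for j in idxs) / len(idxs)
    sign = -1.0 if G[idxs[0]] < 0 else 1.0
    G[idxs[0]] = sign * mean_mag
    for j in idxs[1:]:
        G[j] = 0.0
    return G
```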


After the above pre-processing procedure we obtain an approximate representation of Ω, a curve-polygon (CPG)¹ having L corners at which each angle is smaller than 145°. Thus, the problem of contour recognition is reduced to the problem of classifying CPGs. A CPG can be characterized by the two groups of features given below. For simplicity, let D̂_k(J) denote the curve segment D_{k_i}(J) = {S_{k_i}, ..., S_{k_i+2J}}, and Ŝ_k denote S_{k_i}.

(1) Real-valued features.
(i) The kth corner cosine of a CPG is defined as

u_1(k) = |G(Ŝ_k)|.   (7)

(ii) The kth centroid-corner cosine of a CPG is defined as

u_2(k) = cos(∠OŜ_kŜ_{k+1}),   (8)

where O is the centroid of the shape.
(iii) The relative cosine between the kth and jth corners of a CPG is defined as

u_3(kj) = cos(∠Ŝ_kOŜ_j).   (9)

(2) Binary features.
(i) The kth corner coefficient of a CPG is defined as

v_1(k) = r̂_k,   (10)

where r̂_k is the type coefficient of D̂_k(J) given by Eq. (2).
(ii) The kth corner left coefficient of a CPG is defined as

v_2(k) = 2I(x̂_{k-1}, ŷ_{k-1}) - 1,   (11)

where (x̂_{k-1}, ŷ_{k-1}) is the coordinate of the centroid of the line segment Ŝ_{k-1}Ŝ_k.
(iii) The kth corner right coefficient of a CPG is defined as

v_3(k) = 2I(x̂_{k+1}, ŷ_{k+1}) - 1,   (12)

where (x̂_{k+1}, ŷ_{k+1}) is the coordinate of the centroid of the line segment Ŝ_kŜ_{k+1}.
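The real-valued features u_2(k) and u_3(kj) of Eqs. (8)-(9) both reduce to the cosine of an angle at a vertex; a minimal sketch (function names are our own):

```python
import math

def angle_cosine(p, q, r):
    """Cosine of the angle at vertex q formed by the rays q->p and q->r."""
    ax, ay = p[0] - q[0], p[1] - q[1]
    bx, by = r[0] - q[0], r[1] - q[1]
    return (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))

def centroid_corner_cosine(corners, k, O):
    """u2(k) = cos(angle O S_k S_{k+1}) of Eq. (8); O is the shape centroid."""
    n = len(corners)
    return angle_cosine(O, corners[k], corners[(k + 1) % n])

def relative_cosine(corners, k, j, O):
    """u3(kj) = cos(angle S_k O S_j) of Eq. (9)."""
    return angle_cosine(corners[k], O, corners[j])
```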

3. Neural network structure and matching schemes

Let n_s and n_m represent the numbers of elements (i.e. vertices) of the sample and model shapes' CPGs, respectively. For a given model shape and sample shape, a two-dimensional Hopfield-Amari network with n_s n_m neurons related to the two shapes can be constructed. In general, the columns of the network correspond to the elements of the model

¹ The two edges of a corner may both be line segments, or one or both of them may be SCSs. This is what distinguishes a CPG from an ordinary polygon.


shape, and the rows correspond to the elements of the sample shape. In order to determine the interconnection weights of the network, we consider its energy function, defined as

E(t) = -(1/2) Σ_{i=1}^{n_s} Σ_{k=1}^{n_m} Σ_{j=1}^{n_s} Σ_{l=1}^{n_m} W_{ikjl} V_{ik}(t) V_{jl}(t),   (13)

where V_{ik}(t) is a state variable at step t, which converges to 1 if the ith element of the sample shape matches the kth element of the model shape, and to -1 otherwise. W_{ikjl} is the interconnection weight, given by

W_{ikjl} = (1/(n_s n_m)) [ d_1 ( f_1(i, k) + f_1(j, l) + f_2(i, k) + f_2(j, l) ) + d_2 f_3(ij, kl) ],   (14)

where f_p(·) (p = 1, 2, 3) are goodness functions defined as

f_p(x, y) = { 1 if ρ(u_p^s(x), u_p^m(y)) < d ∧ (l_1(x, y) = 1); -1 otherwise },  p = 1, 2,   (15)

f_3(ij, kl) = { 1 if ρ(u_3^s(ij), u_3^m(kl)) < d ∧ (l_2(ij, kl) = 1); -1 otherwise },   (16)

where

l_1(x, y) = { 1 if ⋀_{q=1}^{3} (v_q^s(x) = v_q^m(y)); -1 otherwise },   (17)

l_2(ij, kl) = { 1 if ⋀_{q=1}^{3} (v_q^s(i) + v_q^s(j) = v_q^m(k) + v_q^m(l)); -1 otherwise },   (18)

where superscripts "m" and "s" denote features extracted from the model and sample shapes, respectively, ρ(x, y) denotes the distance between x and y, the operator "∧" in X ∧ Y means that both X and Y hold, ⋀_{q=1}^{3} X_q means that all X_q (q = 1, 2, 3) are true, d > 0 is a threshold value, and d_1 and d_2 are weight coefficients which satisfy

4d_1 + d_2 = 1.   (19)

The condition (19) ensures that W_{ikjl} = 1/(n_s n_m) when the ith element of the sample matches the kth element of the model, and the jth element of the sample matches the lth element of the model. Neuron activities are updated synchronously according to

V_{ik}(t + 1) = Sign( Σ_{j=1}^{n_s} Σ_{l=1}^{n_m} W_{ikjl} V_{jl}(t) ),   (20)

where

Sign(x) = { 1 if x > 0; -1 otherwise }.   (21)
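The synchronous update of Eqs. (20)-(21) can be sketched with the weights held in a 4-index array W[i, k, j, l]; this is our own illustration of the update dynamics, not the authors' implementation:

```python
import numpy as np

def run_network(W, V0, iterations=40):
    """Synchronous Hopfield-Amari update of Eq. (20).
    W has shape (ns, nm, ns, nm); V0 has shape (ns, nm) with entries +/-1."""
    V = V0.copy()
    for _ in range(iterations):
        # h_{ik} = sum_{j,l} W_{ikjl} V_{jl}(t)
        h = np.einsum('ikjl,jl->ik', W, V)
        V_next = np.where(h > 0, 1, -1)  # Sign() of Eq. (21)
        if np.array_equal(V_next, V):    # state is already stable
            break
        V = V_next
    return V
```

With a simple outer-product weight array built from one stored ±1 pattern, the update recovers the pattern from a corrupted start well within the 40-iteration budget cited in the text.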

Fig. 7. The optimization process of the Hopfield-Amari network. The flowchart comprises the following steps: (1) set all neurons to the excited state; (2) set d_1 = d_2 = 0.2; (3) set d; (4) perform steps 5-7 n_t = i_s + i_d times, using the network to compare the model with each one of the training shapes; (5) create a 2D Hopfield-Amari network based on Eq. (14); (6) synchronously update the neuron activities through 40 iterations; (7) calculate the matching rates R^m_i (i = 1, ..., i_s) and R^s_k (k = 1, ..., i_d) for the training shapes in the same class as the model and in different classes, respectively; (8) adjust d: (a) if all R^m_i ≈ 1 and all R^s_k ≈ 0, accept d; (b) if all R^m_i ≈ 1 and R^s_k ≈ 1, decrease d; (c) if most R^m_i ≈ 0 and R^s_k ≈ 0, increase d; (9) repeat steps 4-8 until case (a) holds.
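The threshold search of Fig. 7 can be sketched as a feedback loop over d; the adjustment step, stopping rule and function names below are our own assumptions, not the authors' exact procedure:

```python
def optimize_threshold(eval_rates, d=0.1, step=0.02, max_rounds=20):
    """Adjust the distance threshold d (cf. steps 3-8 of Fig. 7).
    eval_rates(d) returns (same_class_rates, diff_class_rates): the matching
    rates of the training shapes in the model's class and in other classes."""
    for _ in range(max_rounds):
        same, diff = eval_rates(d)
        if min(same) > 0.8 and max(diff) < 0.3:
            break              # classes are well separated: accept d
        if min(same) > 0.8 and min(diff) > 0.8:
            d -= step          # everything matches: tighten the threshold
        elif max(same) < 0.2 and max(diff) < 0.2:
            d += step          # nothing matches: loosen the threshold
        else:
            d -= step / 2      # mixed outcome: tighten gently
    return d
```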

The stable state of the network reflects the similarity between the model and sample shapes. The degree of similarity between two shapes, called the matching rate and denoted by R, can be defined as

R = ( Σ_{i=1}^{n_s} row_i + Σ_{j=1}^{n_m} col_j ) / (n_s + n_m),   (22)

where row_i = max{ (V_{ik} + 1)/2 : k = 1, ..., n_m } and col_j = max{ (V_{ij} + 1)/2 : i = 1, ..., n_s }. Obviously, 0 ≤ R ≤ 1. Achieving a reliable and accurate matching result, that is, a high matching rate when the sample and model are from the same class and a low matching rate otherwise, depends on identifying appropriate thresholds for transforming the element features


of the model and sample CPGs into the neural network weights. This problem is solved through an optimization process using a set of training shapes composed of i_s shapes from the same class as the model and i_d shapes from different classes. The optimization process consists of nine steps, illustrated in Fig. 7. After this process, a set of estimated thresholds (i.e. d_1, d_2 and d) is obtained for the model. A test shape is compared to a model by calculating the feature values of its elements, constructing a network using the optimized thresholds, and evaluating the matching rate. This corresponds to Steps 5-7 of the optimization process, and is carried out for each model in turn.
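Assuming row_i and col_j in Eq. (22) record whether a row or column of the stable state contains a matched (+1) neuron, the matching rate can be sketched as:

```python
import numpy as np

def matching_rate(V):
    """Matching rate R of Eq. (22) from a stable state V of shape (ns, nm)
    with entries +/-1: each row and each column contributes 1 when it holds
    at least one matched (+1) neuron, and R is normalized to [0, 1]."""
    B = (V + 1) // 2  # map {-1, +1} -> {0, 1}
    return (B.max(axis=1).sum() + B.max(axis=0).sum()) / (V.shape[0] + V.shape[1])
```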

4. Experimental results

In our experiments, for each given model the training shapes are composed of six shapes, three of which are in the same class as the model, while the rest are selected from different classes. With the weight coefficients d_1 and d_2 initialized to 0.2, the optimal threshold d is 0.08 after the optimization process. Note that the feature u_3(kj) given by Eq. (9) is a relative feature, related to two corners of the contour, and is more significant for describing a shape than the other features given in Section 2. Therefore, we usually set d_2 > d_1. Also note that for different selections of d_1 and d_2, the corresponding optimal threshold d may vary slightly.

Fig. 8. The shapes used in our experiment (Models A-D and Samples 1-25).

Table 2
Matching rates between models and samples (Models A-D versus Samples 1-25). Samples in the same class as a model attain rates of 0.86-1.00, while samples from other classes attain rates between 0.00 and 0.24.

The shapes shown in Fig. 8 are used to test the efficacy of the scheme, and the results are shown in Table 2. A test shape is considered to be in the same class as a model when their matching rate is larger than a certain critical value R̄. If R̄ = 0.8, then samples 1-5, 8-15, 18-20 and 21-25 are classified into the same classes as models D, B, C and A, respectively. The remaining samples do not belong to any of the four categories. This result is in agreement with the actual distribution of the shapes in Fig. 8.

5. Conclusion

The Hopfield-Amari network has the following three features: (i) the network always arrives at a stable state within 40 iterations; (ii) the initial state of the network can be set simply, with all its neurons excited; and (iii) the network is tolerant of small errors in the input.

Thus, the performance of the network in solving pattern recognition problems is assured if the set of features describing the considered object is reliable. Hence, the proposed network model can be used effectively, not only to classify the shapes shown


in this paper, but also to classify objects with more general shapes. In this paper, the CBF served as a tool to construct the Hopfield-Amari network for solving the shape recognition problem, and our focus here has been on how to optimize the network for the CBF. A more detailed discussion of corner detection techniques and the CBF can be found in [7]. Our system is effective and reliable in classifying planar shapes composed of several SCSs. Most contours can be considered planar shapes composed of several SCSs under the CBF when the supported length J is selected suitably; thus the proposed network model is effective for general object recognition. Training the system is also simple, because only three parameters (the thresholds d_1, d_2 and d) need to be determined in the training process. We have also defined a set of powerful features of a shape based on its G(S_i) to form the related Hopfield-Amari neural network. The overlap between the features is reduced effectively in the process of network construction, because the binary features have less overlap than the real-valued features.

Acknowledgements

The authors would like to thank the reviewers for their comments, which helped improve the paper.

References

[1] S. Amari, K. Maginu, Statistical neurodynamics of associative memory, Neural Networks 1 (1988) 63-73.
[2] D.J. Amit, Modeling Brain Function, Cambridge University Press, Cambridge, 1989.
[3] N. Ansari, K. Li, Landmark-based shape recognition by a modified Hopfield neural network, Pattern Recognition 26 (4) (1993) 531-542.
[4] A.C.C. Coolen, D. Sherrington, Dynamics of fully connected attractor neural networks near saturation, Phys. Rev. Lett. 71 (1993) 3886-3889.
[5] A.M.N. Fu, H. Yan, M.G. Suters, Flexible pattern matching using a Hopfield-Amari neural network, Opt. Eng. 34 (8) (1995) 2467-2474.
[6] A.M.N. Fu, H. Yan, The distributive properties of main overlap and noise terms in autoassociative memory network, Neural Networks 8 (3) (1995) 405-410.
[7] A.M.N. Fu, H. Yan, K. Huang, A curve bend function based method to characterize contour shapes, Pattern Recognition, in press.
[8] J.J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. USA 79 (1982) 2554-2558.
[9] J.J. Hopfield, D.W. Tank, Neural computation of decisions in optimization problems, Biol. Cybernet. 52 (1985) 141-152.
[10] W.C. Lin, H.Y. Liao, C.K. Tsao, T. Lingutla, A hierarchical multiple-view approach to three-dimensional object recognition, IEEE Trans. Neural Networks 2 (1) (1991) 84-92.
[11] H. Nishimori, T. Ozeki, Retrieval dynamics of associative memory of the Hopfield type, J. Phys. A: Math. Gen. 26 (1993) 859-871.


Alan Mingnan Fu obtained a B.Sc. degree from the Department of Mathematics and Mechanics, Zhongshan (Dr. Sun Yat-sen) University, China, in 1963, and became an Assistant Lecturer (1963-1979) in the same department. He was a Lecturer (1980-1987) and an Associate Professor (1988-1989) in the Department of Applied Mechanics and Engineering. He is currently a research assistant working toward a Ph.D. degree in the Department of Electrical Engineering, University of Sydney. His research interests include neural networks, relaxation, image processing and pattern recognition.

Hong Yan received his B.E. degree from Nanking Institute of Posts and Telecommunications in 1982, his M.S.E. degree from the University of Michigan in 1984, and his Ph.D. degree from Yale University in 1989, all in electrical engineering. From 1986 to 1989 he was a research scientist at General Network Corporation, New Haven, CT, USA, where he worked on developing a CAD system for optimizing telecommunication systems. Since 1989 he has been with the University of Sydney, where he is currently a Professor in Electrical Engineering. His research interests include medical imaging, signal and image processing, neural networks and pattern recognition. He is an author or co-author of one book and more than 150 technical papers in these areas. Dr. Yan is a fellow of the Institution of Engineers, Australia (IEAust), a senior member of the IEEE, and a member of the SPIE, the International Neural Network Society, the Pattern Recognition Society, and the International Society for Magnetic Resonance in Medicine.