Dynamic Pattern Recognition by Neuronic Networks


J.C. LEVY, J. GROLLEAU

(Study performed at the C.E.R.A., 78 - Velizy-Villacoublay, FRANCE)

ABSTRACT

This study deals with dynamic pattern recognition, one of the most important functions of perception. So, we may consider our system as a simulation of the nervous system.

Our experiments allow us to think that our system may lead to a concrete realization; moreover, it may help to explain some neurophysiological processes.

1. BASIC DEFINITIONS

1.1. Pattern space

A static image or "pattern" can be defined by a certain number N of parameters $S_i$, varying in the present case from 0 to $\infty$.

As a concrete example, assume the N parameters to be the degrees of illumination of the N points of a retina, or the amplitudes of N components of a spectrum, or the N characters of a given alphabet, etc.

This amounts to operating in an N-dimensional vector space called "pattern space", in which every pattern is represented by a point, i.e. by a vector S whose N components are greater than or equal to zero.

As an example of pattern, consider the choice of a character among those of a given alphabet.

Consider an alphabet of N characters, each corresponding to one dimension of the pattern space.

A sure recognition for a character (subscripted i) implies for the vector S:

$$S_i = 1, \qquad S_j = 0 \quad \forall j \neq i$$

If the recognition is not sure, the character to be recognized has a certain probability of corresponding to each character of the alphabet:

$S_i$ = probability for the unknown character to correspond to the i-subscript character.

A sure recognition is a special case, but generally a number of co-ordinates can be zero.

Such a pattern is represented by a probability distribution.

1.3. Dynamic pattern

Let us define a dynamic pattern as represented by a vector S being a continuous function of time, S(t).

We will consider that the system evolution is simulated on a computer; so, an iterative method is necessary, thus involving a sampling in time.

Thus, a dynamic pattern is defined as a sequence, i.e. a discrete series of patterns represented by the vectors:

$$S(t),\ S(t+\theta),\ S(t+2\theta),\ \ldots$$

Remarks

It is convenient, from a mathematical point of view, to consider $\theta$ as the unit of time.

In the topics of biological pattern recognition, $\theta$ may be considered as an optimal sampling period. Physiological studies allow us to define that period (see appendix).

The system consists in N elements. (See their characteristics in Appendix and References 1, 3, 4).

Each element will be defined by the following data:

A weighting coefficient $r_i$.

A positive input signal $X_i(t)$, called excitator, varying from 0 to $\infty$.

A negative input signal $Y(t)$, called inhibitor, varying from 0 to $\infty$, which is the same for every element.

An output signal $P_i(t)$ between 0 and 1,

$$P_i(t) = P[X_i(t-\theta),\ Y(t-\theta)] \qquad (1)$$

whose characteristics (see fig.1) can be tabulated for further use on computers.

The vector P(t) (components $P_i(t)$) is the state vector of our system.

Indeed, for a given P(0), the value of P(t) is known for any t, assuming the input signal S(t) (control vector) is known.

Let us define for every subscript i:

$S_i(t)$: component of the control vector (external signal).

$X^*_i(t)$: component of the internal reaction of the system.

$X_i(t) = S_i(t) + X^*_i(t)$: internal signal applied to element $E_i$.

$P_i(t)$: response of element $E_i$.

For every two elements $E_i$, $E_j$, we shall have a weighting coefficient $A_{ij}$ translating the action of $E_i$ on $E_j$; we can write:

$$X_j(t) = S_j(t) + \sum_{i=1}^{N} A_{ij}\,P_i(t)$$

or:

$$X(t) = S(t) + A\,P(t)$$

the elements of A being $A_{ij}$.

We define:

$$SP(t) = \sum_{i=1}^{N} r_i\,P_i(t)$$

or in vector notation

$$SP(t) = r \cdot P(t),$$

the co-ordinates of r being $r_i$. Then:

$$Y(t) = \frac{[SP(t)]^2}{SP_M} \qquad (4)$$
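The signal definitions of this section can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the index convention (`A[i][j]` is the action of element $E_i$ on element $E_j$), the function names and the numerical values are all hypothetical and are not taken from the study.

```python
# Sketch of the signals defined in this section, using plain Python lists.
# Assumption: A[i][j] is the weighting coefficient for the action of E_i on E_j.

def internal_signal(S, A, P):
    """X_j = S_j + sum over i of A[i][j] * P[i] (external signal + internal reaction)."""
    n = len(S)
    return [S[j] + sum(A[i][j] * P[i] for i in range(n)) for j in range(n)]

def weighted_activity(r, P):
    """SP = sum over i of r_i * P_i."""
    return sum(ri * pi for ri, pi in zip(r, P))

def inhibitor(r, P, sp_max):
    """Y = SP^2 / SP_M, the quadratic law of equation (4)."""
    sp = weighted_activity(r, P)
    return sp * sp / sp_max

# Hypothetical 3-element system: E_0 acts on E_1, E_1 acts on E_2.
S = [1.0, 0.0, 0.0]          # control vector (external signal)
P = [0.5, 0.0, 0.0]          # current state vector
A = [[0.0, 2.0, 0.0],
     [0.0, 0.0, 2.0],
     [0.0, 0.0, 0.0]]
r = [1.0, 1.0, 1.0]          # weighting coefficients of the elements

X = internal_signal(S, A, P)
Y = inhibitor(r, P, sp_max=1.0)
```

Here the response of the first element feeds the second one through `A[0][1]`, while the common inhibitor `Y` grows with the square of the total weighted activity.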

The first experiments showed that a quadratic law was necessary. This amounts to keeping the most significant elements by means of a control system which maintains constant the number of excited elements (see fig.2).

2.4. Iterative process

Knowing P(t) and S(t) at time t, the state-vector $P(t+\theta)$ can be determined.

The matrix A specifies the relation between two successive values of the signal; we may interpret this as an association of ideas in time. Such an association is materialized by a vector $X^*(t)$, called internal reaction:

$$X^*(t) = A\,P(t)$$

Actually, we will compute

$$X(t) = X^*(t) + S(t) = A\,P(t) + S(t) \qquad (5)$$

The internal signal is obtained by adding the external signal (control vector) and the internal reaction.

When all the components of S(t) are zero, we have an autonomous system.

3. RECOGNITION OF A DYNAMIC PATTERN

A typical sequence is defined by an initial pattern $S_p(t)$ and a connection matrix $A_p$.

The subscript p stands for the case when we have several typical sequences.

With these assumptions, let us define the recurrence equation

$$S_p(t+1) = A_p\,S_p(t) \qquad (6)$$

The matrix $A_p$ being stored, the whole sequence can be determined by the knowledge of the first elementary pattern $S_p(t)$.

As an example:

All the components of S(t) are zero except $S_i(t)$.

All the components of S(t+1) are zero except $S_j(t+1)$.

Then, the relation between S(t) and S(t+1) will be completely determined by the coefficient $A_{ij}$ of the matrix A.

If we have only one non-zero component for S(t) during each iteration, and if the same component is not found twice, all the coefficients $A_{ij}$ can be determined independently from one another, but the relation

$$S_p(t+1) = A_p\,S_p(t)$$

is still valid for any p.

In the general case we are bound to consider $A_p$ as a constant matrix; indeed, $A_p$ represents the connections between elements, and we assume those connections to be either constant, or slowly evolving in time. This necessary assumption implies a limit for the information the system is able to treat.

Consider a set of patterns $S_p(t), S_p(t+1), \ldots, S_p(t+L)$ composing a typical sequence.

Let $S(t), S(t+1), \ldots, S(t+L)$ be an unknown pattern of the same length.
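As an illustration of recurrence (6), the sketch below regenerates a whole typical sequence from its first pattern and a stored connection matrix. It assumes the single-non-zero-component case described above, and an index convention in which `A_p[i][j]` links component i at time t to component j at time t+1; the alphabet size, the names and the values are hypothetical.

```python
# Sketch of S_p(t+1) = A_p S_p(t) for a 4-character alphabet.
# Assumption: A_p[i][j] links component i at time t to component j at time t+1.

def next_pattern(A, S):
    """Apply the recurrence once: S_next[j] = sum over i of A[i][j] * S[i]."""
    n = len(S)
    return [sum(A[i][j] * S[i] for i in range(n)) for j in range(n)]

def typical_sequence(A, S0, length):
    """Return [S(t), S(t+1), ..., S(t+length)] starting from S0."""
    seq = [list(S0)]
    for _ in range(length):
        seq.append(next_pattern(A, seq[-1]))
    return seq

N = 4
A_p = [[0.0] * N for _ in range(N)]
for i in range(N - 1):
    A_p[i][i + 1] = 1.0       # character i is followed by character i+1

# Sure recognition of character 0 as the first elementary pattern.
sequence = typical_sequence(A_p, [1.0, 0.0, 0.0, 0.0], length=3)
```

Because each pattern has a single non-zero component and no component occurs twice, each coefficient `A_p[i][i + 1]` can be fixed independently of the others, as noted in the text.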

We will define a correspondence matrix $\rho_p(t)$ by the condition:

$$S(t) = \rho_p(t)\,S_p(t)$$

Note: In the general case, the correspondence matrix cannot be fully determined by the above condition, and this is not an objection.

If $S(t) = S_p(t)$, $\rho_p(t)$ is the unit matrix, and then is fully determined.

The coefficients in the main diagonal, which are between 0 and 1, characterize the quality of the correspondence between every component of S(t) and $S_p(t)$.

The other coefficients correspond to parasitic signals, but a parasitic signal cannot always be related to a component of the effective signal; this results in indetermination of the coefficients.

3.2.1. Case when S is a probability distribution

Let the typical pattern $S_p(t)$ be a well-determined character in the alphabet; let i be its subscript.

$$S_{pi}(t) = 1, \qquad S_{pj}(t) = 0 \quad \forall j \neq i$$

The unknown pattern is defined by the probabilities $S_i(t)$ or $S_j(t)$ of the characters i or j at the time t:

$$S_i(t) \le 1, \qquad S_j(t) \ge 0$$

The matrix $\rho_p(t)$ is undefined, except for the coefficients of the i-th column, which are

$$\rho_{ji} = S_j(t) \quad \forall j$$

3.3. Response of the system to an unknown input signal

At the first iteration, equation 5 leads to X(t) = S(t), P(t) being zero.

$$P(t+1) = P[X(t), Y(t)] = P[S(t), Y(t)] = P[\rho_p(t)\,S_p(t),\ Y(t)]$$

At the second iteration:

$$X(t+1) = A_p\,P(t+1) + S(t+1) = A_p\,P[\rho_p(t)\,S_p(t),\ Y(t)] + \rho_p(t+1)\,A_p\,S_p(t) \qquad (7)$$

1st case

$\forall t$, $\rho_p(t)$ is the unit matrix.

$$X(t+1) = A_p\,\{P[S_p(t), Y(t)] + S_p(t)\} \qquad (8)$$

The components of $P[S_p(t), Y(t)]$ are increasing functions of the corresponding components of $S_p(t)$. In addition, $P_i(t) = 0$ when the corresponding component of $S_p(t)$ is zero.

Vector P is nearly colinear with vector S. Hence, the signal received at the second iteration is reinforced (confirmation) by the internal reaction of the first iteration.

The reinforcement (confirmation) is continued in the following iterations, the components of vector X(t) being increased by such an accumulative effect. The size of the components is limited for two reasons:

i) saturation of P(t), whose components cannot be greater than 1;

ii) in the same way, Y(t) controls the system in such a way as to keep SP(t) near $SP_M$ ($SP_M$ being given).
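The reinforcement effect described for the 1st case can be reproduced with a small simulation. The sketch below is not the authors' program: the element characteristic is only given in tabulated form (fig.1), so the `response` function used here is a hypothetical stand-in (a threshold at Y followed by saturation at 1), and the matrix, weights and input values are assumptions chosen for illustration. The index convention `A[i][j]` = action of $E_i$ on $E_j$ is also an assumption.

```python
# Sketch of the iterative process: X(t) = S(t) + A P(t), Y(t) = SP(t)^2 / SP_M,
# then P(t+1) = P[X(t), Y(t)].

def response(x, y):
    """Hypothetical stand-in for the tabulated characteristic P[X, Y]:
    zero below the inhibitor level, saturating at 1."""
    return min(1.0, max(0.0, x - y))

def run(A, r, sp_max, inputs):
    """Feed a sequence of control vectors; return the internal signals X(t)."""
    n = len(r)
    P = [0.0] * n                      # the state vector starts at zero
    history = []
    for S in inputs:
        X = [S[j] + sum(A[i][j] * P[i] for i in range(n)) for j in range(n)]
        sp = sum(r[i] * P[i] for i in range(n))
        Y = sp * sp / sp_max           # common inhibitor
        history.append(X)
        P = [response(X[j], Y) for j in range(n)]
    return history

A = [[0.0, 0.5, 0.0],
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]                  # stored sequence 0 -> 1 -> 2
r = [1.0, 1.0, 1.0]
matching = [[0.6, 0.0, 0.0], [0.0, 0.6, 0.0], [0.0, 0.0, 0.6]]

history = run(A, r, 1.0, matching)
reversed_history = run(A, r, 1.0, matching[::-1])
```

With the matching input, the component expected at the second iteration receives 0.9 instead of the bare input 0.6, because the internal reaction of the first iteration is added to it. With the same patterns presented in reverse order, the second iteration receives only the bare input.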

2nd case

The matrix $\rho_p(t)$ is reduced to its main diagonal, whose elements vary between 0 and 1.

Equation 8 may then be written with a corrective term involving the matrix $1 - \rho_p$, where 1 is the unit matrix.

Elements of matrix $1 - \rho_p$, reduced to its main diagonal, are all between 0 and 1. They can only decrease certain components of $S_p(t)$ and $A_p S_p(t)$.

3rd case

$\rho_p(t)$ has non-zero elements not included in the main diagonal. Then the signal contains parasitic responses among undesirable components $S_i(t)$.

If the components $S_i(t) < Y(t)$, they induce no responses $X_i(t)$ and so they are eliminated.

If the amplitudes of the parasitic components are greater than Y(t), they induce parasitic responses $X_i(t)$ deforming the response of the system.

If the parasitic components are greater than the effective components, they induce an increase in Y(t), which can result in an inhibition on the effective response (fig.2).

So the confirmation of the computed signal may be decreased, made zero or inverted as the process goes on (as shown by the experiments).

We shall see that in the case of several typical signals, a response can be parasitic for the p-subscript signal and effective for the q-subscript signal.

3.4. The case of several stored signals

For each typical signal we have a matrix $A_p$.

Let us define a resultant matrix by the composition of the elementary matrices:

$$A = A_1 * A_2 * \ldots * A_p$$

The simplest law of composition is the additive law, but the self-organizing process described below implies more complex laws. Usually, the law of composition is not commutative because the first stored signals have a "weight" greater than the following stored signals (according to our experiments).

Summarizing: each element of the resultant matrix is equal to or greater than the corresponding elements of each matrix.

Thus, equation 7 may be written:

$$X(t+1) = A\,P[\rho_p(t)\,S_p(t),\ Y(t)] + \rho_p(t+1)\,A_p\,S_p(t) \qquad (7a)$$

This equation is valid for any p, i.e. we can treat all the typical signals, at the same time, by means of this equation.

In equation 7a, every term including an element of the matrix A increases the corresponding term of equation 7 written with the matrix $A_p$.

It is possible to have a greater increase for the effective terms. But responses will occur, which can be effective for the typical sequence p, but parasitic for the typical sequence q.

In fact, if there is a typical sequence for which the number of effective signals is the greatest and the number of parasitic

signals is the smallest, the increase of the internal signals $X_i(t)$ will be faster for that sequence [...] the inhibition acting in the inverse sense [...].

So, the recognition of the dynamic pattern is performed. An experimental verification has been done on some simple examples.

Remarks

i) The origin of time does not have to be given, the cumulative process beginning at the occurrence of the first matrix $\rho_p(t)$ having non-zero elements on its main diagonal.

ii) As the origin of time is not defined, a phase variation can be accepted; so, we can say that accepting a variation of phase amounts to shifting the origin of time [...].

iii) Hence, the system admits a certain variation between the speed of evolution of the input signal and the stored typical sequence. Experiments proved that a variation of 50 to 200 % may be accepted.

[...] we can see that replacing $S_p(t+1)$ by $S_p(t)$ amounts to considering $A_p$ as the correspondence matrix. If the elements in the first diagonal of $A_p$ are non-zero, the superposition is still done. We still have an increase of the effective signal, even if the unknown input signal has a phase lag of one iteration.

The same method, which uses matrix inversion and an iterative process, shows that we may accept greater lags [...]. For a variation of n phases, the correspondence matrix is $[A_p]^n$.

Summarizing: we can consider as equivalent all the sequences issued from a typical sequence by means of either a phase variation or a small variation in the evolution speed of the input signal.

[...] progressively eliminate the responses to the other typical sequences [...] an isolation of the most powerful [...].

[...] adjoin to them N' elements which do not correspond to components of the external signal [...] an N*-component vector [...]

freedom of the system, so we increase its storage capacity and its possibility of classifying different signals (we will see that they create an associative memory).

In fact, each of the N' elements $E_j$ will be connected with only a small number of the N elements $E_i$; i.e. for a given subscript j, we will just have a small number of subscripts i giving non-zero values for the elements $A_{ij}$ and $A_{ji}$ (fig.3); usually, these subscripts i correspond to effective components of $S_p(t)$ for times t which are consecutive or at least very close.

In the systems we have studied, such an element has a self-coupling coefficient $A_{jj}$ which gives a certain remanence to the system. So, it fastens the continuity of a partial sequence, thus materializing its synthesis.

It may correspond to a syntactic association if it represents a mere formal sequence.

It may correspond to a semantic association if it represents a concept corresponding to the considered sequence.

If two sequences are tangled, i.e. if they have components in common, it is impossible to maintain a signal $P_i$ on one of them and to make the same signal zero on the other one, because the common element will obviously always receive a signal.

But, if we assign a different element of association to each signal, there is no more objection to a complete discrimination between both elements.

Experiments proved that with such a method we could easily discriminate two signals having half their elements in common.

While the discrimination is always imperfect on the basic elements, it is perfect on the elements of integration.

5. SELF-ORGANIZING PROCESS

Matrix A may be given in advance, but it is interesting to let the system build it up by itself under the action of the input signal.

Here is given the simplest of the different algorithms we have used.

At every iteration, the element $A_{ij}$ is increased with

$$\Delta A_{ij} = r_i \cdot \alpha \cdot P_i(t) \cdot P_j(t+1) \cdot \left(1 - \frac{A_{ij}}{A_M\,r_i}\right)$$

So, if the response of element $E_i$ is followed by a response of element $E_j$, we increase the action of $E_i$ on $E_j$ so as to keep the memory of the association of ideas in time.

For statistical reasons connected with the definition of the element, the weight of element $E_i$ has to be considered.

$\alpha$ is a numerical parameter corresponding to a probability, and so is less than one.

Moreover, the amplitude of $A_{ij}$ is bounded by $A_M \cdot r_i$.

The algorithm has given excellent results, although connections occur which will perturb the evolution of the system and limit its possibility of discrimination.

Such parasitic connections are due to the components in common with two different typical signals. Two remedies can be used against them:

i) Using different elements of association for two tangled signals. Note: in this case, the signals corresponding to the elements of association must be indicated to the system. In a first phase, called learning phase, N*-component signals are injected.

After that, the system has to discriminate signals reduced to their N elementary components.

ii) Decreasing the weights $r_i$ of the elements in common with several signals, such elements being used more often than the others.

Note: that self-organizing process comes from what the neurophysiologists call Hebb's principle.

Such a process builds up a long-storage memory - secondary memory - created by the occurrence of connections between elements.

We may interpret the use of elements $A_{ij}$ as a dynamical short-storage memory - primary memory - corresponding to reverberating circuits.

CONCLUSION

We have given a mathematical definition of our system, which can be related to considerations on the nervous system.

A mathematical solution cannot yet be given for integrating the equations and computing the system's evolution, for it is a non-linear, non-boolean and multivariable system.

The only research way is an iterative process on computers.

Up to now, all these assumptions have been confirmed by experiments dealing with simple data.

The computing time and the storage capacity increase like $N^2 \times L$, L being the length of the sequence; this implies an objection for the computation.

When we have not a bound on the number of elements of the system, the system capacity can be infinite, theoretically. The human brain includes $14 \cdot 10^9$ neurones; then we have $10^{3 \cdot 10^{[\ldots]}}$ combinations possible (rough estimate).

But, in a man-built system, we may simulate on a computer but a hundred elements.

However, using a hardware system, we could deal with hundreds or thousands of elements.

APPENDIX

IDENTIFYING ONE ELEMENT BY MEANS OF NEUROPHYSIOLOGICAL CONSIDERATIONS

Theoretically, an element is defined as a "population of neurones" of VON NEUMANN, whose model has been slightly modified (ref.2).

The lag $\theta$, which MAC CULLOCH, PITTS and then VON NEUMANN held as a constant synaptic lag, is considered here as a random variable.

The probability law of $\theta$ is proportional to the law of evolution of the post-synaptic potential, which we consider as a physiological datum.

We can easily prove that [...] is an optimal sampling period.

Hence, we can compute the system's state at time t with the knowledge of its state at time $(t-\theta)$, using a correcting factor which is weak and easily computable.
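As a worked illustration of the self-organizing rule of section 5, the sketch below applies the increment $r_i \cdot \alpha \cdot P_i(t) \cdot P_j(t+1) \cdot (1 - A_{ij}/(A_M r_i))$ repeatedly. The function name, the two-element system and the parameter values are hypothetical; the index convention (`A[i][j]` is the action of $E_i$ on $E_j$) follows the wording of that section, and every weight `r[i]` is assumed to be strictly positive.

```python
# Sketch of the Hebb-like update of section 5.
# Assumptions: A[i][j] = action of E_i on E_j, r[i] > 0, 0 < alpha < 1.

def hebbian_update(A, P_now, P_next, r, alpha, a_max):
    """Increase A[i][j] when a response of E_i at time t is followed by a
    response of E_j at time t+1; the factor in brackets bounds A[i][j]
    by a_max * r[i]."""
    n = len(r)
    for i in range(n):
        for j in range(n):
            gain = r[i] * alpha * P_now[i] * P_next[j]
            A[i][j] += gain * (1.0 - A[i][j] / (a_max * r[i]))
    return A

r = [1.0, 1.0]
alpha, a_max = 0.5, 2.0

# One presentation: E_0 responds at t, E_1 responds at t+1.
first = hebbian_update([[0.0, 0.0], [0.0, 0.0]], [1.0, 0.0], [0.0, 1.0],
                       r, alpha, a_max)

# Fifty presentations of the same association.
A = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(50):
    hebbian_update(A, [1.0, 0.0], [0.0, 1.0], r, alpha, a_max)
```

Only the connection from the first element to the second one grows, and it approaches the bound `a_max * r[0]` without exceeding it, which matches the statement that the amplitude of $A_{ij}$ is bounded by $A_M \cdot r_i$.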

REFERENCES

(1) J.C. LEVY, "Systemes nerveux et Reconnaissance des Formes", L'Onde Electrique, February 1966.

(2) J.C. LEVY, "Un modele de neurone considere comme un ensemble statistique d'elements booleens", Compte rendu a l'Academie des Sciences, session of 29 January 1968.

(3) J.C. LEVY, D. BERTAUX, J. GROLLEAU, "Systeme Autoadaptatif de reconnaissance des formes spatiales et temporelles" (under contract D.R.M.E. 336/65).

(4) J.C. LEVY, D. BERTAUX, J. GROLLEAU, "La structuration par apprentissage d'une memoire a associations semantiques", Symposium sur l'informatique medicale et les intelligences artificielles, Toulouse, March 1967.

(5) W.K. TAYLOR, "A model of learning mechanism in the Brain", Elsevier Publishing Company, 1965.

(6) D. BERTAUX, "A Study of the logic of orderly-associative learning", to be presented at YEREVAN, IFAC Symposium, Sept. 1968.

(7) KISS, "Networks as Models of word storage", extract from "Machine Intelligence".

(8) J.C. LEVY, "Un modele de neurone considere comme un ensemble statistique d'elements booleens", CR Acad. Sciences Paris, 12.2.1968.

[Figure 1: characteristic of an element (axes labelled P, x and y)]

[Figure 2]

[Figure 3: elements $E_i$ and $E_j$ at iterations t and t+1]