Pergamon 0893-6080(95)00058--5
Neural Networks, Vol. 9, No. 1, pp. 53-66, 1996 Copyright © 1996 Elsevier Science Ltd. All rights reserved Printed in Great Britain 0893-6080/96 $15.00 + .00
CONTRIBUTED ARTICLE
Stability and Bifurcations in an Associative Memory Model
CHRISTOPHER M. THOMAS, WILLIAM G. GIBSON AND JOHN ROBINSON
The School of Mathematics and Statistics, University of Sydney
(Received 17 January 1994; revised and accepted 28 April 1995)
Abstract--The stability and bifurcation structure of a theoretical autoassociative network is studied. The network consists of randomly connected excitatory neurons, together with an inhibitory interneuron that sets their thresholds; both the degree of connectivity between the neurons and the level of firing in the stored memories can be set arbitrarily. The network dynamics are contained in a set of four coupled difference equations. Their equilibrium properties are investigated, analytically in certain limiting cases and numerically in the general case. The regions of parameter space corresponding to stable and unstable behaviour are mapped, and it is shown that for suitable parameter choices the network possesses stable fixed points which correspond to memory retrieval.
Keywords--Neural network, Associative memory, Equilibrium states, Stability, Bifurcations, Nonlinear dynamics, Statistical neurodynamics, Chaos.
1. INTRODUCTION
This paper is concerned with the analysis of the long-time behaviour of a particular autoassociative memory model (Gibson & Robinson, 1992). The model has its origins in the biologically motivated model of Marr for memory storage in the hippocampus (Marr, 1971; Gardner-Medwin, 1976; Willshaw & Buckingham, 1990; Treves & Rolls, 1990) and applies methods of conventional statistical analysis (Amari & Maginu, 1988; Amari, 1989; Faris & Maier, 1988). This is in contrast to the parallel line of investigation based on ideas and methods taken from statistical mechanics (Little, 1974; Little & Shaw, 1975; Hopfield, 1982; Amit, 1989); the closest point of connection between the two lines can be found in Golomb, Rubin, and Sompolinsky (1990).
Our basic network consists of randomly connected excitatory neurons, together with an inhibitory interneuron. Both the degree of connectivity between the neurons and the level of firing in the stored memories can be set arbitrarily. The memories are stored via a two-valued Hebbian rule, and evolution from an arbitrary initial state is by discrete, synchronous steps. The theory (Gibson & Robinson, 1992) takes into account both spatial correlations between the learned connection strengths and temporal correlations between the state of the system and these connection strengths. This is the so-called Level 2 theory; in addition there are sublevels: Level 1, which omits temporal correlations, and Level 0, which omits spatial correlations as well. The effect of the inhibitory interneuron is to set a threshold that is a linear function of the firing activity in the network. Our theory, with some extensions, has been applied to the CA3 region of the hippocampus, using parameter values based as far as possible on the known physiology of this region (Bennett, Gibson, & Robinson, 1994).
The theory leads to a set of four coupled difference equations that describe the full dynamics of the recall process. For a given set of network parameters it is easy to iterate these equations and hence follow the time-course of the system, and this is essentially what we did in our previous work (Gibson & Robinson, 1992; Bennett, Gibson, & Robinson, 1994). This established that, for appropriate choices of the parameters, the system was capable of acting as a memory store with adequate capacity and with good recall from partial or distorted memory patterns as the initial state. From this work, it was apparent that a quantity of central importance was the parameter $g_1$, which governs the strength of the signal from the inhibitory interneuron; the correct setting of $g_1$ is crucial for the optimum storage and recall of memories. As $g_1$ is increased there is an abrupt transition from non-recall to recall, and then an equally abrupt reversal of this behaviour at a larger
Support under ARC Grant AC9330365 is acknowledged. Correspondence should be addressed to: W. G. Gibson, School of Mathematics, The University of Sydney, N.S.W. 2006, Australia.
value. From the sharpness of these transitions it was clear that the system is undergoing some type of bifurcation between separate lines of stable fixed points, one line corresponding to recall of the memory under consideration, the other to non-recall. This paper presents a detailed investigation of these phenomena, using nonlinear dynamical systems analysis (Devaney, 1989; Wiggins, 1990) supplemented by numerical methods developed for the investigation of phase-plane behaviour (principally AUTO86: Doedel, 1986). We investigate the conditions under which the memories are stable fixed points of the dynamical system and analyse the bifurcations that can occur, using $g_1$ and various functions of $g_1$ as bifurcation parameters. Our treatment is closest in spirit to Amari's analysis of the bifurcation structure of a network of randomly connected neurons (Amari, 1971); some equations are formally similar, although their interpretation is different.
The stability properties of continuous-time networks have been comprehensively reviewed by Hirsch (1989). In the case of discrete-time networks the picture is less complete. For memory models based on statistical mechanics that have symmetric connections one can construct Lyapunov functions ("energy" functions) which guarantee convergence to stable fixed points (asynchronous updating) or fixed points and two-cycles (synchronous updating) (Hopfield, 1982; Amit, 1989; Goles & Vichniac, 1986; Marcus & Westervelt, 1989; Marcus, Waugh, & Westervelt, 1990). For asymmetrically connected networks, such as ours, there is no general theory. The early work of Amari (1971) on a randomly connected network (which does not store memories) has been mentioned above. There are other investigations on small networks of two or so neurons, or on networks with special connectivities (Wang, 1991; Blum & Wang, 1992; Chapeau-Blondeau & Chauvet, 1992).
The present work is significant in that it treats the case of an arbitrarily large memory-storage network with asymmetric connections. It is important to show that more biologically realistic networks have desirable properties similar to those of the artificially constrained ones. The plan of the paper is as follows: we start (Section 2) by summarizing the basic model, and give the four coupled equations that govern the evolution of the system. We then briefly analyse the case where the connection strengths are purely random, and do not represent the storage of memories (Section 3). The remainder of the paper is concerned with networks that do store memories. Although the most accurate theory is Level 2, it is difficult to proceed far analytically with the four coupled difference equations it involves. Thus we start with Level 1, and even then we further simplify the equations by treating the limit of a very large
network (when the number of neurons goes to infinity). We are now able to make considerable progress by purely analytical means, and prove a number of properties rigorously. The link between these results and the Level 2 finite network case is provided by numerical work, principally using either straightforward iteration or the computer package AUTO86 (Doedel, 1986). This establishes that Level 2 is qualitatively similar to Level 1, and justifies the detailed analytic treatment of the latter because of the insight it gives into the more accurate theory. We have written the paper in such a way that a general picture of the stability and bifurcation properties of the network can be gained from the figures and the discussion surrounding them. The mathematical details are often best understood by reference to the appropriate figure. Only the
[Figure 1 legend: excitatory neuron; inhibitory neuron; ineffective excitatory synapse; effective excitatory synapse; inhibitory synapse; direction of signal.]
FIGURE 1. The basic network, consisting of n excitatory neurons (open circles) and one inhibitory interneuron (filled circle). The excitatory neurons make random connections with each other, the probability of any one connection existing being c. Before learning, these connections are ineffective; after learning, a subset of them becomes effective and in the final state of the network there are excitatory synaptic connections whose strengths are taken to be unity (open triangles) and others whose strengths have remained at zero (open circles). The inhibitory interneuron receives input from all the active excitatory neurons, and in turn sends an inhibitory signal to each of them; no learning occurs here, and all synaptic strengths are fixed. The initial state of the system is set by a firing pattern coming onto the excitatory neurons from some external source, shown by the lines entering from the left. Once the initial state has been set, the external source is removed. The network then updates its internal state cyclically and synchronously.
statements of the mathematical results are given in the main text--all proofs have been placed in the Appendix.
2. THE ASSOCIATIVE MEMORY MODEL
Full details of the model were given in Gibson and Robinson (1992) (see also Bennett, Gibson, & Robinson, 1994); here we give a (self-contained) summary of the theory. The basic network (Figure 1) consists of $n$ excitatory neurons, each of which can exist in one of two states: firing (on) or silent (off). The state of the whole network at time $t$ is given by an $n$-dimensional vector $X(t)$ whose $i$th component is

$$X_i(t) = \begin{cases} 1 & \text{if neuron } i \text{ is on at time } t, \\ 0 & \text{if neuron } i \text{ is off at time } t. \end{cases}$$

Initially, before any learning or storage of memories takes place, these $n$ neurons are connected by a matrix $W$ where $W_{ij} = 1$ if neuron $j$ sends a collateral to neuron $i$ and $W_{ij} = 0$ otherwise. The diagonal elements are zero and the other elements are assigned at random according to $P(W_{ij} = 1) = c$, where $P(\cdot)$ denotes probability and $c \in [0, 1]$ is the connectivity. The network stores $m + 1$ patterns $\{Z^p : p = 0, 1, \ldots, m\}$, where $Z_i^p = 0$ or 1, according to the clipped Hebbian prescription

$$J_{ij} = \min\left\{1, \sum_{p=0}^{m} Z_i^p Z_j^p\right\}. \tag{1}$$

The memory elements are assigned at random according to $P(Z_i^p = 1) = a$, where $a \in [0, 1]$ is the activity of the network. The total connection strength from neuron $j$ to neuron $i$ is then given by $J_{ij}W_{ij}$; this means that in order for a non-zero connection to form between two neurons there must have been an original connection and it must also have been strengthened by the learning process.
The theory describes the recall of a typical stored pattern that, without loss of generality, can be taken to be the pattern $Z^0$. It is convenient to renumber the neurons so that

$$Z_i^0 = \begin{cases} 1 & \text{for } i = 1, \ldots, na, \\ 0 & \text{for } i = na + 1, \ldots, n. \end{cases}$$

That is, $Z^0$ is a vector consisting of $na$ ones followed by $n - na$ zeros. $Z^0$ will be referred to as the target pattern and is taken to be fixed. Define

$$x_t = E X_i(t), \quad i = 1, \ldots, na, \qquad y_t = E X_i(t), \quad i = na + 1, \ldots, n,$$

where $E$ denotes expectation. $x_t$ is thus the average activity, at time $t$, in those neurons that should fire when the target memory $Z^0$ is recalled; $y_t$ is the average activity in those neurons that should be silent when $Z^0$ is recalled. Thus $x_t$ and $y_t$ correspond to the proportions of "valid" and "spurious" firings, respectively, for the recall of $Z^0$; for perfect recall at time $t$, $x_t = 1$ and $y_t = 0$. The total average activity in the network is

$$z_t = a x_t + (1 - a) y_t, \tag{2}$$

so, on average, $n z_t$ neurons are firing at time $t$.
During recall the network starts from the input state $X(0)$ and updates synchronously at the discrete times $t = 1, 2, \ldots$, thus generating a sequence of patterns $X(t)$, $t = 0, 1, \ldots$. The excitatory input to the $i$th neuron at time $t$ is $n h_i(t)$ where

$$h_i(t) = \frac{1}{n} \sum_{j=1}^{n} W_{ij} J_{ij} X_j(t).$$

This neuron then fires if $h_i(t)$ exceeds a certain value $T(t)$, where $nT(t)$ is the threshold on an excitatory neuron. It is assumed that this threshold is set by an inhibitory interneuron that receives the entire pattern (or at least a large enough sample of it) as input and then provides an inhibitory input to each neuron. The strength of this inhibition is taken to be linearly dependent on the average activity $S(t)/n$ of the pattern, where $S(t) = \sum_{j=1}^{n} X_j(t)$ is the total activity. Thus

$$T(t) = g_0 + g_1 \frac{S(t)}{n},$$

where $g_0$ and $g_1$ are constants. Note that $T(t)$ is dependent on $t$; that is, the value of the threshold can vary from one recall step to the next. (We make two further remarks about inhibition. The first is that the formalism is easily extended to include many inhibitory interneurons; provided each is assumed to act in a linear fashion and to have inputs from a large enough sample of the excitatory neurons, this set of neurons is equivalent to a single inhibitory interneuron. The second is that there is physiological evidence that some inhibitory interneurons, and in particular those in the CA3 region of the hippocampus, do act in this linear fashion. Further discussion of both these points can be found in Bennett, Gibson, & Robinson, 1994.) The network thus updates according to

$$X_i(t + 1) = H(h_i(t) - T(t)),$$

where $H(x) = 1$ for $x > 0$ and $H(x) = 0$ otherwise.
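The storage and update rules of this section can be simulated directly. The following is a minimal sketch in plain Python, assuming small illustrative parameter values; the function names `build_network` and `step` are hypothetical (they do not come from the paper), and the nested loops are written for clarity rather than speed.

```python
import random

def build_network(n, m, a, c, rng):
    """Return (W, J, patterns) for n neurons and m + 1 stored patterns."""
    # Each pattern element is 1 with probability a (the activity).
    patterns = [[1 if rng.random() < a else 0 for _ in range(n)]
                for _ in range(m + 1)]
    # W[i][j] = 1 with probability c for i != j; the diagonal is zero.
    W = [[1 if i != j and rng.random() < c else 0 for j in range(n)]
         for i in range(n)]
    # Clipped Hebbian rule, eqn (1): J[i][j] = 1 as soon as any one
    # stored pattern has both neuron i and neuron j on.
    J = [[0] * n for _ in range(n)]
    for Z in patterns:
        on = [i for i in range(n) if Z[i]]
        for i in on:
            for j in on:
                J[i][j] = 1
    return W, J, patterns

def step(X, W, J, g0, g1):
    """One synchronous update X(t) -> X(t + 1)."""
    n = len(X)
    active = [j for j in range(n) if X[j]]
    T = g0 + g1 * len(active) / n              # T(t) = g0 + g1 * S(t) / n
    new_X = []
    for i in range(n):
        h = sum(W[i][j] * J[i][j] for j in active) / n   # input h_i(t)
        new_X.append(1 if h - T > 0 else 0)              # H(h_i(t) - T(t))
    return new_X

if __name__ == "__main__":
    rng = random.Random(0)                     # fixed seed, illustrative only
    W, J, patterns = build_network(n=200, m=5, a=0.1, c=0.5, rng=rng)
    X = patterns[0][:]                         # start exactly on the target Z^0
    for t in range(5):
        X = step(X, W, J, g0=0.0, g1=0.3)
        overlap = sum(x * z for x, z in zip(X, patterns[0]))
        print(t + 1, sum(X), overlap)          # step, total activity, valid firings
```

The printed overlap is the number of target neurons still firing, so dividing it by the number of ones in the target gives an empirical counterpart of $x_t$; the values of `g0` and `g1` used here are arbitrary and are not tuned for recall.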
The initial input pattern $X(0)$ is taken to be a random distortion of the target pattern $Z^0$. Specifically, we define $X(0)$ by

$$P(X_i(0) = 1) = \begin{cases} x_0, & i = 1, \ldots, na, \\ y_0, & i = na + 1, \ldots, n, \end{cases}$$

where $x_0$ and $y_0$ are parameters whose values are to be specified. For the particular case $x_0 = 1$, $y_0 = 0$, the input $X(0)$ is exactly the target memory $Z^0$.
The dynamic evolution of the system is described by a set of four coupled nonlinear difference equations:

$$x_{t+1} = \Phi\left(\frac{E_1(t)}{\sigma_1(t)}\right), \tag{3}$$

$$y_{t+1} = \Phi\left(\frac{E_n(t)}{\sigma_n(t)}\right), \tag{4}$$

$$x'_{t+1} = \Phi\left(\frac{E'_1(t)}{\sigma'_1(t)}\right), \tag{5}$$

$$y'_{t+1} = \Phi\left(\frac{E'_n(t)}{\sigma'_n(t)}\right), \tag{6}$$

where $\Phi(z)$ is the standard normal distribution function:

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\, dt.$$

In these equations $x_t$ and $y_t$ are the quantities of interest, being the average firing levels for correct and spurious cells respectively at time-step $t$; $x'_t$ and $y'_t$ are subsidiary variables that do not have a direct observational meaning. All the quantities on the right-hand sides can be expressed in terms of $x_t$, $y_t$, $x'_t$, $y'_t$. We introduce:

$$\delta_i = \left[1 - a\left(1 - (1 - a)^i\right)\right]^m,$$

$$p = 1 - \delta_1, \qquad p' = \frac{1 - 2\delta_1 + \delta_2}{1 - \delta_1},$$

$$\gamma = \delta_2 - \delta_1^2, \qquad \gamma' = \frac{1 - 3\delta_1 + 3\delta_2 - \delta_3}{1 - \delta_1} - p'^2.$$

The numerators of the normal distribution functions in eqns (3) and (4) are (conditional) expectations given by

$$E_1(t) = a c x'_t + (1 - a) c p y'_t - ET(t), \tag{7}$$

$$E_n(t) = a c p x'_t + (1 - a) c p y'_t - ET(t), \tag{8}$$

where $ET(t)$ is the expectation of the threshold:

$$ET(t) = g_0 + g_1 a x_t + g_1 (1 - a) y_t.$$

The quantities $E'_1(t)$ and $E'_n(t)$ in eqns (5) and (6) are given by the same equations, provided $p$ is replaced by $p'$. The denominators of the normal distribution functions in eqns (3) and (4) are standard deviations $\sigma_1(t)$ and $\sigma_n(t)$ given by

$$[n\sigma_1(t)]^2 = n a c x'_t \left(1 - \frac{c x'_t}{x_t}\right) + n(1 - a) c p y'_t \left(1 - \frac{c p y'_t}{y_t}\right) + n^2 (1 - a)^2 c^2 \gamma (y'_t)^2,$$

$$[n\sigma_n(t)]^2 = n a c p x'_t \left(1 - \frac{c p x'_t}{x_t}\right) + n(1 - a) c p y'_t \left(1 - \frac{c p y'_t}{y_t}\right) + n^2 a^2 c^2 \gamma (x'_t)^2 + 2 n^2 a (1 - a) c^2 \gamma x'_t y'_t + n^2 (1 - a)^2 c^2 \gamma (y'_t)^2.$$

The quantities $\sigma'_1(t)$ and $\sigma'_n(t)$ in eqns (5) and (6) are given by the same equations, provided $p$ is replaced by $p'$ and $\gamma$ by $\gamma'$.
The above are the so-called Level 2 equations; they include both the spatial correlations between the learned connection strengths $J_{ij}$ and the temporal correlations that develop between the states of the system and the $J_{ij}$'s. Setting $x'_t = x_t$, $y'_t = y_t$, $p' = p$ and $\gamma' = \gamma$ gives the Level 1 equations that neglect the temporal correlations. If in addition we set $\gamma = 0$ then we have the Level 0 equations, which neglect the spatial correlations as well. We note that the Level 2 equations define a map $\boldsymbol{\Phi}^{(2)} : [0, 1]^4 \to [0, 1]^4$ and the Level 1 and Level 0 equations define maps $\boldsymbol{\Phi}^{(i)} : [0, 1]^2 \to [0, 1]^2$ where $i = 1$ or 0.

3. A UNIFORM NETWORK WITH RANDOM CONNECTIVITY
As a first step in the analysis of the network described in Section 2 we consider briefly the case where all $J_{ij} = 1$. Although such a uniform network actually stores no memories, it may be thought of as the limiting case in which, in a network of fixed size $n$, we attempt to store $m + 1$ memories with $m \to \infty$. With the learning rule (1), $P(J_{ij} = 1) \to 1$ as $m \to \infty$ with $a$ fixed. However the neurons are randomly connected because of the matrix $W_{ij}$, and the analysis is not trivial. As $m \to \infty$ we have $p, p' \to 1$ and $\gamma, \gamma' \to 0$ and the Level 2 eqns (3)-(6) reduce to

$$x_{t+1} = x'_{t+1} = y_{t+1} = y'_{t+1} = \Phi\left(\frac{\sqrt{n}\,\left[(c - g_1) a x_t + (c - g_1)(1 - a) y_t - g_0\right]}{\sqrt{c(1 - c) a x_t + c(1 - c)(1 - a) y_t}}\right).$$

In terms of the total activity $z_t$, as given by eqn (2), this is

$$z_{t+1} = \varphi(z_t), \tag{9}$$

where

$$\varphi(z) = \Phi\left(\alpha\sqrt{z} + \frac{\beta}{\sqrt{z}}\right),$$

$$\alpha = \frac{\sqrt{n}\,(c - g_1)}{\sqrt{c(1 - c)}}, \tag{10}$$

$$\beta = -\frac{\sqrt{n}\, g_0}{\sqrt{c(1 - c)}}. \tag{11}$$
Equation (9) may also be derived directly, using the approach in Section 4.1 of Gibson and Robinson (1992). The above formalism contains the full dynamics of the system, the time evolution of the network being found by iterating the map $\varphi$. Here, we are only interested in the long-term behaviour of the network, which is given by the asymptotic behaviour of $\varphi$. This depends on the values of the parameters $\alpha$ and $\beta$, and numerical calculations reveal three main types of behaviour, corresponding to the bifurcation regions shown in Figure 2, which was produced with the aid of the computer program AUTO86. In one region of the $\alpha\beta$-plane (the extinction region) the activity of the network converges to zero,
[Figure 2 plot: regions UNSTABLE, STABLE and EXTINCTION in the $\alpha\beta$-plane; horizontal axis $\alpha$ from -10 to 2, vertical axis $\beta$ from -1.0 to 1.0.]
[Figure 3 plot: activity $z_t$ from 0.0 to 0.08 against $\alpha$ from -10.1 to -9.8.]
FIGURE 3. The period-doubling bifurcation for the activity $z$ as $\alpha$ varies from -9.8 to -10.1 with $\beta = -5\alpha^{-2}$. For each value of $\alpha$, the function $\varphi$ was iterated 2400 times and the final 400 iterates were plotted. As $\alpha$ decreases the dynamics become chaotic, with some small windows of periodic behaviour.
regardless of the initial activity. One expects this behaviour for large negative values of the parameter $\beta$, since $g_0$ is proportional to $-\beta$ and so the network is heavily damped. In a second region the network is stable: for small initial $z$ the activity may converge to zero, but for larger starting values the activity converges to a stable equilibrium value $z^* > 0$. In a third region the network has no stable equilibrium value of the activity and a range of behaviours is observed there. For some choices of $\alpha$ and $\beta$ the network has a stable cycle of length two, but numerical iteration reveals apparently chaotic behaviour for other values. Figure 3 shows the cascade of period-doubling bifurcations observed for a particular choice of path in the $\alpha\beta$-plane. This is reminiscent of the period-doubling route to chaos in unimodal maps (Devaney, 1989) and has been found in many theoretical neural networks (Aihara, Takabe, & Toyoda, 1990; Renals & Rohwer, 1990; Wang, 1991; Chapeau-Blondeau & Chauvet, 1992). Amari (1971) analysed a network with random connection weights and thresholds. He obtained an equation for the activity of the network of the form
FIGURE 2. The stability regions for a network with random connectivity. The quantities $\alpha$ and $\beta$ are defined in terms of the parameters of the model by eqns (10) and (11). There are three regions: EXTINCTION, where the only stable fixed point is $z = 0$ and so the activity of the network goes to zero, regardless of the initial activity; STABLE, where there is a stable fixed point $z^* > 0$ to which the activity will converge for a large enough initial value; UNSTABLE, where there is no stable equilibrium value of the activity and the network can have a variety of behaviours, ranging from stable orbits of period 2 to chaotic behaviour. The diagram was produced numerically, using AUTO86.
$$z_{t+1} = 2\Phi(W z_t - \Theta) - 1, \tag{12}$$

where $W$ and $\Theta$ are constants, and gave a complete analysis of this system. Our analysis of eqn (9) is related to Amari's analysis of eqn (12) but is not as complete because the nonlinear argument $\alpha\sqrt{z} + \beta/\sqrt{z}$ is more complicated than the linear expression $Wz - \Theta$. We have confined ourselves to the (biologically plausible) case $\beta < 0$ and using elementary real analysis we have been able to show rigorously
that the stability regions are qualitatively as given in Figure 2. In particular, we have shown that there exists a number $\alpha_c \approx -3.4822$ such that the following holds (Thomas, Gibson, & Robinson, 1994):
If $\alpha > \alpha_c$ then there is a number $\beta_1 < 0$ such that if $\beta < \beta_1$ then iterates of $\varphi$ converge to zero on $(0, 1]$, and if $\beta_1 < \beta \le 0$ then the system has one stable equilibrium point.
If $\alpha < \alpha_c$ then there are numbers $\beta_1 < \beta_2 < 0$ such that if $\beta < \beta_1$ then iterates of $\varphi$ converge to zero on $(0, 1]$, if $\beta_1 < \beta < \beta_2$ then the system has one stable equilibrium point, and if $\beta_2 < \beta \le 0$ the system is unstable.
In Figure 2 the lower curve corresponds to $\beta_1$ and the upper curve (for $\beta < 0$) to $\beta_2$. Clearly, the values $\beta_1$ and $\beta_2$ depend on $\alpha$, and it can be shown analytically that $\beta_1 \to 0$ (and hence $\beta_2 \to 0$) as $\alpha \to -\infty$.
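The three regimes described above can be explored by iterating the activity map numerically. The sketch below assumes the map has the form $\varphi(z) = \Phi(\alpha\sqrt{z} + \beta/\sqrt{z})$, follows the 2400-iterate, final-400 recipe quoted for Figure 3, and uses the path $\beta = -5\alpha^{-2}$ as our reading of that caption. The helper names and the sample $(\alpha, \beta)$ pairs are illustrative choices, not values taken from the paper.

```python
import math

def Phi(u):
    """Standard normal distribution function (erfc keeps the lower tail accurate)."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))

def phi(z, alpha, beta):
    """Activity map: Phi(alpha * sqrt(z) + beta / sqrt(z))."""
    if z <= 0.0:
        # Limit z -> 0+ is governed by the sign of beta / sqrt(z).
        if beta < 0.0:
            return 0.0
        return 0.5 if beta == 0.0 else 1.0
    r = math.sqrt(z)
    return Phi(alpha * r + beta / r)

def orbit_tail(z0, alpha, beta, n_iter=2400, keep=400):
    """Iterate the map n_iter times from z0 and return the last `keep` iterates."""
    z = z0
    tail = []
    for t in range(n_iter):
        z = phi(z, alpha, beta)
        if t >= n_iter - keep:
            tail.append(z)
    return tail

if __name__ == "__main__":
    cases = [
        ("strongly negative beta", 0.5, 1.0, -5.0),
        ("mildly negative beta", 0.5, 1.0, -0.1),
        ("alpha = -10 on the path beta = -5/alpha^2", 0.02, -10.0, -5.0 / 100.0),
    ]
    for label, z0, alpha, beta in cases:
        tail = orbit_tail(z0, alpha, beta)
        print(label, min(tail), max(tail))
```

A tail whose minimum and maximum coincide indicates a fixed point (zero in the extinction case), while a tail with a finite spread indicates a cycle or more irregular behaviour. Which of these occurs depends on the starting value $z_0$ as well as on $(\alpha, \beta)$.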
One may also examine the behaviour of the system eqn (9) in the limit $n \to \infty$. In this case $\varphi(z) \to \psi(z)$ where

$$\psi(z) = \begin{cases} 0 & \text{if } (c - g_1)z - g_0 < 0, \\ 1/2 & \text{if } (c - g_1)z - g_0 = 0, \\ 1 & \text{if } (c - g_1)z - g_0 > 0. \end{cases} \tag{13}$$

If $g_0 = 0$ then the system $\psi$ has a stable fixed point at $z = 1$ if $g_1 < c$ and a stable cycle $0 \leftrightarrow 1/2$ if $g_1 > c$. If $g_0 > 0$ then $\psi$ has a stable fixed point at $z = 0$ and also a stable fixed point at $z = 1$ if $g_0 < c - g_1$.
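The passage to the limit can be checked numerically. The sketch below assumes the finite-$n$ map for the uniform network is $\Phi\big(\sqrt{n}\,[(c - g_1)z - g_0]/\sqrt{c(1 - c)z}\big)$ and compares it with the step function of eqn (13); the function names and the parameter values are illustrative assumptions.

```python
import math

def Phi(u):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))

def phi_n(z, n, c, g0, g1):
    """Finite-n activity map for the uniform network (all J_ij = 1); needs z > 0."""
    numerator = math.sqrt(n) * ((c - g1) * z - g0)
    return Phi(numerator / math.sqrt(c * (1.0 - c) * z))

def psi(z, c, g0, g1):
    """Limiting step map of eqn (13)."""
    s = (c - g1) * z - g0
    if s < 0.0:
        return 0.0
    return 0.5 if s == 0.0 else 1.0

if __name__ == "__main__":
    c, g0, g1 = 0.5, 0.1, 0.2          # illustrative values with g0 < c - g1
    for n in (10 ** 2, 10 ** 4, 10 ** 6):
        row = [round(phi_n(z, n, c, g0, g1), 4) for z in (0.2, 0.3, 0.5, 1.0)]
        print(n, row)
    print("limit", [psi(z, c, g0, g1) for z in (0.2, 0.3, 0.5, 1.0)])
```

As $n$ grows the printed rows approach the limit row everywhere except near the switching point $z = g_0/(c - g_1)$, where the argument of $\Phi$ stays close to zero.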
4. THE BEHAVIOUR OF LARGE NETWORKS STORING MEMORIES
We turn now to the general case where the $J_{ij}$ are not all 1 but are given by the storage prescription eqn (1). As pointed out in Section 2, the equations for the various levels of the theory define maps $\boldsymbol{\Phi} : [0, 1]^2 \to [0, 1]^2$, or in the case of Level 2, $[0, 1]^4 \to [0, 1]^4$, and the successive states of the network are given by the iterates of these maps. For finite values of $n$ these systems are difficult to analyse. However, we are primarily interested in very large networks and the equations simplify in the limit as $n \to \infty$. In the following, superscripts of $\boldsymbol{\Phi}$ denote the level of the theory (0, 1 or 2) and $\boldsymbol{\Psi}$ denotes the limit of $\boldsymbol{\Phi}$ as $n \to \infty$, with $m$ and $a$ fixed. To avoid confusion we will write $\boldsymbol{\Phi}^{(2)}$ but omit the brackets for Levels 0 and 1 and simply write $\boldsymbol{\Phi}^0$ and $\boldsymbol{\Phi}^1$, respectively.
We are able, using mostly analytical methods, to give a fairly complete description of the behaviour of $\boldsymbol{\Psi}^1$. Because $\boldsymbol{\Phi}^1$ and its partial derivatives converge uniformly, the stability and fixed point behaviour for large finite $n$ corresponds to that of the limiting case. Even in this limit the Level 2 equations are hard to analyse; $\boldsymbol{\Psi}^{(2)}$ exhibits very complex dynamical behaviour for some parameter values. Numerical computations show that its fixed point behaviour is qualitatively similar to that of $\boldsymbol{\Psi}^1$ but that quantitative differences exist in certain parameter regions. Some of these differences are illustrated in Figure 4, where the steady state values of $x$ and $y$, as given by Levels 0, 1 and 2, are plotted as functions of the inhibition parameter $g_1$ for a particular choice of the other parameters. (The parameters have been chosen to highlight differences; for actual storage purposes one would use a much smaller value for $a$ and a much larger value for $m$; compare Bennett, Gibson, & Robinson, 1994.) The main differences are a reduced window of recall for the higher levels, and a more abrupt transition in the case of Level 2. The true behaviour of such a system would be best predicted by the Level 2 equations (see Gibson & Robinson, 1992, for comparison with computer simulations using a 3000-neuron network), but the Level 1 equations are mathematically simpler and are useful as a step on the way to the accurate theory. (In fact, it turns out that there are considerable qualitative similarities between Level 2 and Level 1.) Also, in the case of large sparse networks ($n$ large, $a$ small) there is less difference between the levels and even Level 0 can be a good approximation (Bennett, Gibson, & Robinson, 1994).
In the following, we consider the network in the limit as the number of neurons $n$ goes to infinity. This limit is to be taken with $m$ and $a$ fixed. However, we also wish to examine the behaviour as $m$ grows large, and in this case it is necessary to assume that $ma^2$ does not become too large. This follows because in the progressive recall process we are dealing with two normal distributions (the inputs to the memory and non-memory neurons, respectively), the separation of whose means depends on $1 - p$. Since, for $m$ large, $1 - p \approx e^{-ma^2}$, recall can only occur if $e^{-ma^2}$ is not too close to zero.
Thus below, when we look at the behaviour of $\boldsymbol{\Psi}^1$ for different values of $m$ or $a$, we are assuming that $ma^2$ is kept equal to a constant $k$. We do not explicitly discuss the Level 0 case since it can be considered as a sub-case of Level 1 in the limit $\gamma \to 0$.
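The role of the constraint $ma^2 = k$ can be seen by evaluating the correlation parameters numerically. The sketch below assumes the Section 2 definitions read $\delta_i = [1 - a(1 - (1 - a)^i)]^m$, $p = 1 - \delta_1$, $p' = (1 - 2\delta_1 + \delta_2)/(1 - \delta_1)$, $\gamma = \delta_2 - \delta_1^2$ and $\gamma' = (1 - 3\delta_1 + 3\delta_2 - \delta_3)/(1 - \delta_1) - p'^2$; the function names and the choice $k = 0.5$ are illustrative.

```python
import math

def delta(i, a, m):
    """delta_i = [1 - a(1 - (1 - a)^i)]^m."""
    return (1.0 - a * (1.0 - (1.0 - a) ** i)) ** m

def correlation_parameters(a, m):
    """Return (p, p_prime, gamma, gamma_prime) for activity a and m + 1 memories."""
    d1, d2, d3 = (delta(i, a, m) for i in (1, 2, 3))
    p = 1.0 - d1
    p_prime = (1.0 - 2.0 * d1 + d2) / (1.0 - d1)
    gamma = d2 - d1 * d1
    gamma_prime = (1.0 - 3.0 * d1 + 3.0 * d2 - d3) / (1.0 - d1) - p_prime ** 2
    return p, p_prime, gamma, gamma_prime

if __name__ == "__main__":
    k = 0.5                                   # assumed constant value of m * a^2
    for m in (100, 1000, 10000):
        a = math.sqrt(k / m)
        p, p_prime, gamma, gamma_prime = correlation_parameters(a, m)
        # Compare 1 - p with the large-m approximation exp(-m a^2).
        print(m, round(a, 4), 1.0 - p, math.exp(-k), gamma, gamma_prime)
```

With $ma^2$ held fixed, $1 - p$ stays close to $e^{-k}$ as $m$ grows, which is the property the argument above relies on.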
[Figure 4 plot: steady-state $x$ and $y$, from 0.0 to 1.0, against $g_1$ from 0.40 to 0.50; curves labelled Level 0, Level 1 and Level 2.]
FIGURE 4. Steady-state behaviour of the network, as predicted by the three levels of theory, as a function of the inhibition strength parameter $g_1$. The curves are found by iterating eqns (3)-(6) until a steady state is reached. Parameter values used are number of neurons $n = 10^5$, number of stored memories $m = 100$, connectivity $c = 0.5$, activity level $a = 0.11$, intrinsic threshold $g_0 = 0$, initial state parameters $x_0 = 0.9$, $y_0 = 0.0$. In each case, the upper curve gives $x$ (the fraction of valid firings) and the lower curve gives $y$ (the fraction of spurious firings). Level 0 (broken lines) indicates perfect recall ($x = 1$, $y = 0$) for almost the whole range $0.4 < g_1 < 0.5$. (In the range $g_1 \le 0.4$, $y$ makes a transition from 1 to 0 in $0.35 < g_1 < 0.36$ and $x$ remains at 1.) Level 1 (dot-dash lines) gives a reduced recall window and also shows less than perfect recall at each end of this window; however, there is a reasonably abrupt transition between the regions of (partial) recall and no recall. Level 2 (solid lines) further reduces this window and the transitions become sharper. Also the behaviour of $y$ is different in that there is now a pronounced drop in the number of spurious firings at the value of $g_1$ where recall commences. (Level 1 in the finite-$n$ case also gives a drop, though it is small for $n = 10^5$.)

4.1. Level 1 Theory for Large Network Storing Memories
From eqns (3)-(6), $\boldsymbol{\Phi}^1 = (\varphi_1, \varphi_2)$ where

$$\varphi_1(x, y) = \Phi\left(\frac{\sqrt{n}\,\left[(c - g_1) a x + (cp - g_1)(1 - a) y - g_0\right]}{\sqrt{c(1 - c) a x + cp(1 - cp)(1 - a) y + n\gamma c^2 (1 - a)^2 y^2}}\right),$$

$$\varphi_2(x, y) = \Phi\left(\frac{\sqrt{n}\,\left[(cp - g_1)(a x + (1 - a) y) - g_0\right]}{\sqrt{cp(1 - cp)(a x + (1 - a) y) + n\gamma c^2 (a x + (1 - a) y)^2}}\right).$$

As $n \to \infty$, $\boldsymbol{\Phi}^1 \to \boldsymbol{\Psi}^1$, where $\boldsymbol{\Psi}^1 = (\psi_1, \psi_2)$ and

$$\psi_1(x, y) = \Phi\left(\frac{(c - g_1) a x + (cp - g_1)(1 - a) y - g_0}{\sqrt{\gamma}\, c (1 - a) y}\right),$$

$$\psi_2(x, y) = \Phi\left(\frac{(cp - g_1)(a x + (1 - a) y) - g_0}{\sqrt{\gamma}\, c (a x + (1 - a) y)}\right).$$

In both cases the mapping and its first partial derivatives converge uniformly for $y$ bounded away from zero.

4.1.1. Level 1, with intrinsic threshold $g_0 = 0$. We are able to give a rather complete analysis of $\boldsymbol{\Psi}^1$ when $g_0 = 0$, since in this case $\psi_2(x, y) = \Phi\big((cp - g_1)/(\sqrt{\gamma}\, c)\big)$, which depends only on the parameters, and the behaviour of the system is determined by the single equation

$$x_{t+1} = \Phi(\lambda x_t + \mu), \tag{14}$$

where

$$\mu = \frac{cp - g_1}{\sqrt{\gamma}\, c}, \tag{15}$$

$$\lambda = \frac{(c - g_1) a}{\sqrt{\gamma}\, c (1 - a) \Phi(\mu)}. \tag{16}$$

Equation (14) is formally similar to eqn (12) which was studied by Amari (1971), so a parallel analysis can be applied. In fact, if $W = \lambda/2$ and $\Theta = -\lambda/2 - \mu$ the functions $\Phi(\lambda x + \mu)$ and $2\Phi(Wx - \Theta) - 1$ are topologically conjugate under the mapping $\tau \mapsto 2\tau - 1$ and therefore have the same dynamical properties when iterated (Wiggins, 1990). Although the equations are similar, they occur in different contexts and must be interpreted differently. Amari's network does not store memories, and his $W$ and $\Theta$ are free to vary, whereas in our analysis $\lambda$ and $\mu$ are complicated functions of the other parameters given by eqns (15) and (16). Following Amari (1971), we call the system monostable (bistable) if it has one (two) stable equilibrium value(s), and no other attractors. Lemmas 1 and 2 are minor modifications of his results.
Bifurcations occur when the fixed points given by $\Phi(\lambda x^* + \mu) = x^*$ also satisfy $|d\Phi(\lambda x^* + \mu)/dx| = 1$. This leads to the bifurcation lines

$$\mu_\pm = \pm\sqrt{\log(\lambda^2/2\pi)} - \lambda\,\Phi\left(\pm\sqrt{\log(\lambda^2/2\pi)}\right) \tag{17}$$

that are defined for $\lambda^2 \ge 2\pi$.
LEMMA 1
(i) If $|\lambda| < \sqrt{2\pi}$ then the system (14) is monostable regardless of the value of $\mu$.
(ii) If $\lambda > \sqrt{2\pi}$ then the system (14) is bistable if $\mu_+ < \mu < \mu_-$ and is monostable otherwise.
(iii) If $\lambda < -\sqrt{2\pi}$ then the system (14) has a stable cycle of length two (and no other attractors) if

$$\mu_- < \mu < \mu_+ \tag{18}$$

and is monostable otherwise.
Figure 5 shows the regions of the $(\lambda, \mu)$ plane described by Lemma 1 (compare Figure 4 of Amari,
[Figure 5 plot: the $(\lambda, \mu)$ plane with both axes running from -10 to 10; regions labelled MONOSTABLE and PERIODIC (period 2), with boundary curves $\mu_+$ and $\mu_-$.]
FIGURE 5. The stability regions for a network storing memories. The analysis is based on the Level 1 equations in the limit as the number of neurons $n$ becomes large. The intrinsic threshold $g_0$ is taken to be zero. The quantities $\lambda$ and $\mu$ are not free to vary, but are related to the remaining parameters in the model by eqns (15) and (16). The regions are given explicitly by eqn (17) and Lemma 1. They are: MONOSTABLE, in which $x$ (the average valid firing activity in a memory) has only one stable fixed point; BISTABLE, where $x$ has two stable fixed points; PERIODIC, where $x$ has a stable cycle of period 2 and there are no other attractors. The system undergoes a saddle-node bifurcation at the boundary of the bistable region and a period-doubling bifurcation at the boundary of the periodic region. Memory recall can only occur in the hatched region $\lambda > \sqrt{2\pi}$, $\mu_+ < \mu < 0$.
1971). As # crosses the curves #_ or #+ the system
undergoes a saddle-node bifurcation ifA > x / ~ , and a period-doubling
bifurcation
,k < - - ¢ ~
(Wiggins,
1990). Lemma 1 describes the stability properties of the system, hut we also need information about the location o f the fixed points and this is provided by the following lemma: LEMMA 2. Suppose )~ > x / ~ . I f # < #+ then there is a single stable equilibrium value x* o f the system (14) satisfying x* < ~_ = ~(-x/log(A2/27r)).
0.0 -3.(
I
-2.5 t #+
-2.0
1" -1.5 #-
-1.0
FIGURE 6. Bifurcation behaviour o4 the network according to the Level 1 equations in the large-n limit. The locus of fixed points of x is shown as a function of ix, for the case k = 4. Solid lines represent stable fixed points and broken lines unstable ones. The changes in stability occur at the bifurcation points
(ix, x) = (Ix+, ~ + ) ~ ( - 2 . 3 7 , 0 . 8 3 )
and
(it, x) =
(it_, ~ _ )
(-1.~,0.1r). For ~ < IX+ and IX> It_ there is only one stable tixed point for each value O4 Ix; for ix+ < ix < IX_ there are three fixed points for each value of IX, the middle one being unstable and the other two stable. This behaviour is lypical tor > v ~ ~ 2.51 (see Lemmas 1 and 2 and Figure 5).
If $\mu > \mu_-$ then the system has a single stable equilibrium value $x^{**} > \phi_+ = \Phi\left(\sqrt{\log(\lambda^2/2\pi)}\right)$. If $\mu_+ < \mu < \mu_-$ then the system has stable equilibrium values $x^*$ and $x^{**}$ satisfying $x^* < \phi_- < \phi_+ < x^{**}$.

Figure 6 shows, for the case $\lambda = 4$, the locus of fixed points of eqn (14), as $\mu$ varies (cf. Figure 2 of Amari (1971)). The behaviour of $y = \Phi(\mu)$ as a function of $g_1$ is simple, and Figure 7 is typical; $y$ decreases as $g_1$ increases, $y > 0.5$ for $g_1 < cp$, and $y < 0.5$ for $g_1 > cp$. If $ma^2$ is kept fixed then $y \to 1$ on $[0, cp)$ and $y \to 0$ on $(cp, \infty)$ as $a \to 0$.

The behaviour of $x$ is more complicated. We first establish that recall is not possible unless $cp < g_1 < c$. For suppose $g_1 < cp$; then $\mu > 0$, giving $y > 0.5$ and no chance of recall. (The actual behaviour of the network in this region is monostable, as follows from Lemma 1 since $\lambda > 0$ and $\mu > \mu_-$ whenever $\mu_-$ is defined.) On the other hand, if $g_1 > c$ then $\lambda < 0$ and $\mu < 0$, giving $x < 0.5$, and so recall cannot occur in this region either. (The actual behaviour of the network in this region is either monostable or periodic, as determined by Lemma 1. It is shown in the Appendix that if $g_1 > c$ is large enough, then the system has a stable cycle of length two and no other attractors.)

Therefore we consider the case $cp < g_1 < c$, giving $\lambda > 0$ and $\mu < 0$. It follows from Lemmas 1 and 2 that recall can only occur if $\lambda > \sqrt{2\pi}$ and $\mu > \mu_+$. For suppose $\lambda < \sqrt{2\pi}$; then since $\Phi(\lambda x + \mu)$ is an increasing function of $x$ the system has one stable equilibrium value $x^* < \Phi(\sqrt{2\pi} + \mu)$, which is not close to 1 when $y = \Phi(\mu)$ is small (that is, if $\mu$ is large and negative). Therefore we must have
FIGURE 7. The behaviour of $y$ (the average spurious firing rate) as a function of the inhibition strength parameter $g_1$ according to the Level 1 equations in the large-$n$ limit. The parameter values are $c = 0.5$, $g_0 = 0$, $ma^2 = 0.5$ and results are shown for three values of $m$, namely 100, 1000 and 10,000, corresponding to activities $a = 0.0707$, 0.0224 and 0.0071, respectively. The value of $cp$ is approximately constant, at 0.197; for $g_1 < cp$, $y > 0.5$ and for $g_1 > cp$, $y < 0.5$. As $m$ increases, the curves approach a step function.
$\lambda > \sqrt{2\pi}$ and (by Lemma 2) $\mu > \mu_+$ also, otherwise the only stable fixed point is $x^* < \phi_-$. These conditions are necessary for recall, but do not guarantee that it will occur. Clearly the best conditions for recall are when $\mu$ is as large negative as possible and $\lambda + \mu$ is as large positive as possible; that is, $\mu_+ < \mu \ll 0$ and $\lambda + \mu \gg 0$. It may happen that if $a$ is too large then these conditions are not met for any value of $g_1$. For example, numerical calculations show that $\lambda$ has a maximum value of about $1.3 < \sqrt{2\pi}$ for $m = 200$ and $a = 0.1$. However, we show in the Appendix that if $a$ is small enough then the conditions for recall are met for a range of values of $g_1 \in (cp, c)$.

To see the effect of the connectivity parameter on the behaviour of the network it is instructive to parameterize $g_1$ in the region of recall as $g_1 = (1 - \tau)cp + \tau c$, $\tau \in [0, 1]$, so that

$$\mu(\tau) = -\tau\,\frac{1-p}{\sqrt{\gamma}}, \tag{19}$$

$$\lambda(\tau) = (1-\tau)\,\frac{1-p}{\sqrt{\gamma}}\,\frac{a}{1-a}\,\frac{1}{\Phi(\mu(\tau))}. \tag{20}$$

This parameterization shows that the essential behaviour of the network does not depend on the value of the connectivity $c$, other than to alter the range of $g_1$ for which recall is possible.
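The independence from $c$ can be seen in a few lines of code. The following is a minimal sketch, assuming that eqns (19) and (20) have the form $\mu(\tau) = -\tau r$ and $\lambda(\tau) = (1-\tau)\, r\, \frac{a}{1-a} / \Phi(\mu(\tau))$ with $r = (1-p)/\sqrt{\gamma}$. Because $\gamma$ is defined in Section 2 and is not recomputed here, $r$ is passed in directly; the numerical values of $a$, $p$ and $r$ below are hypothetical and are not taken from any of the figures. The function names are likewise illustrative.

```python
import math


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def g1_from_tau(tau, c, p):
    """Inhibition strength at position tau in the window (cp, c)."""
    return (1.0 - tau) * c * p + tau * c


def tau_from_g1(g1, c, p):
    """Inverse of g1_from_tau."""
    return (g1 - c * p) / (c - c * p)


def mu_lambda(tau, a, r):
    """mu(tau) and lambda(tau), with r standing for (1 - p) / sqrt(gamma)."""
    mu = -tau * r
    lam = (1.0 - tau) * r * (a / (1.0 - a)) / Phi(mu)
    return mu, lam


# Hypothetical illustrative values (assumptions, not the paper's parameters).
a, p, r = 0.05, 0.4, 3.0
for c in (0.5, 0.2):
    g1 = g1_from_tau(0.3, c, p)
    mu, lam = mu_lambda(tau_from_g1(g1, c, p), a, r)
    print(f"c={c}: g1={g1:.3f}, mu={mu:.4f}, lambda={lam:.4f}")
```

The two values of $c$ give different $g_1$ for the same $\tau$ but identical $(\mu, \lambda)$: changing $c$ only rescales the window $(cp, c)$ in which $g_1$ must lie.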
FIGURE 8. The behaviour of the fixed points as a function of the inhibition strength parameter $g_1$ for the case $m = 100$, $a = 0.11$, $c = 0.5$, $g_0 = 0$, according to the Level 1 theory in the large-$n$ limit. The upper figure (a), produced using AUTO86, shows the locus of the fixed points of $x$, with the solid lines corresponding to stable fixed points and the broken line to unstable ones. The upper solid line represents memory retrieval. The lower figure, (b), shows $\mu$ on the same $g_1$-scale. Also shown are $\mu_-$ and $\mu_+$, as given by eqn (17). Comparison of (a) and (b) shows that a change from monostable to bistable behaviour occurs when $\mu$ enters the region between $\mu_-$ and $\mu_+$, in agreement with Lemma 1. These graphs should be compared with the upper Level 1 curve of Figure 4, which is for the same parameter values except that $n = 10^6$ rather than $\infty$.
The above analysis shows that for recall we must have $\mu > \mu_+$. If $\mu < \mu_-$ as well then, by Lemma 1, the system is bistable. Some of the types of behaviour that occur are illustrated in Figures 8 and 9, which show $\mu_-$, $\mu_+$ and $\mu$ for some values of the parameters, along with the corresponding behaviour of the fixed points.

4.1.2. Level 1, with intrinsic threshold $g_0 > 0$. When $g_0 > 0$ the limit $\hat\Phi^{(1)}$ does not reduce to a single equation and we are not able to give a complete analysis of the system. We prove in the Appendix that if $g_0 > 0$ then $(0, 0)$ is a stable fixed point of $\hat\Phi^{(1)}$ for all choices of the other parameters and if $g_1$ is large then all points in $[0, 1]^2$ are attracted to $(0, 0)$. Thus we understand the behaviour of $\hat\Phi^{(1)}$ for $(x, y) \approx (0, 0)$. On the other hand the convergence of $\hat\Phi^{(1)}(x, y; g_0)$ (and its derivatives) to $\hat\Phi^{(1)}(x, y; 0)$ as $g_0 \to 0$ is uniform for $(x, y)$ away from $(0, 0)$, so in this region, and for small values of $g_0$, the fixed point behaviour of $\hat\Phi^{(1)}(x, y; g_0)$ is similar to that of $\hat\Phi^{(1)}(x, y; 0)$. Small values of $g_0$ have little effect on recall. Numerical computations give results in agreement with this analysis, and also show that
FIGURE 9. The behaviour of the fixed points as a function of the inhibition strength parameter $g_1$ for the case $m = 40$, $a = 0.14$, $c = 0.5$, $g_0 = 0$, according to the Level 1 theory in the large-$n$ limit. The upper figure (a), produced using AUTO86, shows the locus of the fixed points of $x$, with the solid lines corresponding to stable fixed points and the broken line to unstable ones. The upper solid line represents memory retrieval. The lower figure, (b), shows $\mu$ on the same $g_1$-scale. Also shown are $\mu_-$ and $\mu_+$, as given by eqn (17). Comparison of (a) and (b) shows that a change from monostable to bistable behaviour occurs at the intersections of $\mu$ with $\mu_+$ and with $\mu_-$, in agreement with Lemma 1. The principal differences with Figure 8 are that there is now a monostable region in which retrieval occurs, and the transition to recall is more gradual in this region.

FIGURE 10. The behaviour of the fixed points as a function of the inhibition strength parameter $g_1$ for the case $n = 10^6$, $m = 100$, $a = 0.11$, $c = 0.5$, $g_0 = 0$, according to the Level 2 theory. The figures were produced using AUTO86. The upper figure, (a), shows the locus of the fixed points of $x$, with the solid lines corresponding to stable fixed points and the broken line to unstable ones. The upper solid line represents memory retrieval. The lower figure, (b), shows the corresponding picture for $y$. [Note the different scales on the ordinates of (a) and (b).] Figure (a) can be compared with Figure 8a for Level 1 (though note that Figure 8 is drawn for $n = \infty$); Figure 4 shows the behaviour of the iterated equations for the same parameter values.
tx, y,x ,,Y ,,) =~(cp'(ax' +(1-a)y')-gl(ax x/~c(ax' + (1 -
^(2),
~o4
taking $g_0$ positive and sufficiently large removes all unstable behaviour (for example, cycles of period 2).

4.2. Level 2 Theory for Large Network Storing Memories

From eqns (3)-(6), $\hat\Phi^{(2)} = (\hat\phi_1^{(2)}, \hat\phi_2^{(2)}, \hat\phi_3^{(2)}, \hat\phi_4^{(2)})$ where $\hat\phi_1^{(2)}(x, y, x', y')$
= ¢(cax+cp(1-a)y'-g,(ax+ xfic(l - a)y'
fc lax' + I' -al 'l
-+--6 c
These are difficult to treat analytically. Numerical investigation indicates that their behaviour is qualitatively similar to that of Level 1. In particular, there is a sudden transition to a region of recall as $g_1$ increases, corresponding to a transition from a region of monostability to one of bistability. Compare Figure 10a, which shows the fixed points of $x$ as a function of $g_1$ for Level 2, with Figure 8a, which is the corresponding graph for Level 1. (The parameters are the same, except that Figure 10 is for $n = 10^6$ whereas Figure 8 is for $n = \infty$; redrawing Figure 8 for $n = 10^6$ causes very little change.) The main difference is that Level 2 predicts a more sudden transition to recall and a smaller recall region than does Level 1, as was already apparent from Figure 4. Another difference, also evident in Figure 4, is the large downwards jump in $y$ at recall; Figure 10b shows the bifurcation responsible for this, and Figure 7 shows that no such phenomenon occurs for Level 1
(I-a)y)-go)
^(2), ~0 2 ~x,y,x t ,y t,)
+ I'
)
@)(x, y, x', y') = @(cax + cp'(1-- a)y' -- gl(ax + (1-- a)y) -- go)~
+(l-a)y)-go) a)y')
in the $n \to \infty$ limit. (A jump does occur in the Level 1 case for finite $n$, though it is much smaller.)

In some parameter regions iteration of the Level 1 equations results in all points of $[0, 1]^2$ being attracted to a fixed point near $(1, 0)$, corresponding to recall. This is predicted by the analysis of $\hat\Phi^{(1)}$ above (see Figure 9), and is also observed for finite $n$. For Level 2 we have never found a fixed point, corresponding to recall, to which all points in $[0, 1]^4$ are attracted. Any fixed point that corresponds to recall seems always to be accompanied by another possible mode of behaviour of the network, whether it be another stable fixed point, or non-periodic behaviour of some type. For example, although Figure 10 indicates that for $g_1 \approx 0.475$ there is only one stable fixed point of $\hat\Phi^{(2)}$, iterating with an initial point of $(0.5, 0.5, 0.5, 0.5)$ reveals what is either non-periodic behaviour or a cycle with a very long period.

The effect of taking $g_0 > 0$ in the Level 2 case is similar to that of Level 1. If $g_0 > 0$ then the Level 2 equations always have a stable fixed point at $(0, 0, 0, 0)$, small values of $g_0 > 0$ have little effect on recall other than to shift the region of possible recall to the left, and taking $g_0$ sufficiently large positive removes any instability of the network.

5. CONCLUSION

We have investigated the stability properties of a particular autoassociative neural network which incorporates a number of biologically realistic features. The first case treated was that of a network storing no memories, but with the neurons randomly connected with all connections being of the same strength. We found three regions of parameter space, corresponding to extinction, stable and unstable behaviour (Figure 2). In the unstable region, a variety of behaviours could occur, including a period-doubling route to chaos (Figure 3).
For networks storing memories, we commenced with the Level 1 theory, which includes the spatial correlations that occur because the connections between the neurons are formed from a common set of memories. (This does not correspond to a "real" network, in the sense that it is not possible to simulate a network in which spatial correlations are important, but temporal ones are not. However, Level 1 forms a useful theoretical intermediary on the way to the full Level 2 equations.) In the limit as $n \to \infty$ and with $g_0 = 0$ the Level 1 theory reduces to a single difference equation and it is possible to give an almost complete analytical treatment. There are now three regions of parameter space, corresponding to monostable, bistable and periodic behaviour (Figure 5). Memory recall occurs typically in the bistable region: one line of fixed points corresponds to recall; the other line corresponds to non-retrieval. These two lines are separate, so the transition from one to the other is abrupt. This is desirable in a memory system: the network should either recall a memory almost exactly, or else go to a state with low correlation with all the memories. Memory recall may also occur in a subregion of the monostable region (Figure 5). This may seem somewhat anomalous, as it means retrieval occurs from any initial state; however, we note that this behaviour is immediately removed if $g_0 > 0$ since then $(0, 0)$ becomes a stable attractor, and also that it does not occur for the more realistic Level 2 theory, even for $g_0 = 0$. The finite-$n$ behaviour is similar to the $n = \infty$ case, with less abrupt transitions but similar fixed points. Again, this is expected because of the uniform convergence of the relevant functions and their derivatives, and is confirmed by numerical work.

Finally, we investigated the Level 2 theory, which includes the temporal correlations that develop between the current state of the network and the connections as a result of the progressive recall process. Even with the simplifications $n \to \infty$ and $g_0 = 0$ there are still four coupled difference equations, and a rigorous analytical treatment is not possible. Numerical work indicates that the basic fixed point structure is similar to the Level 1 case (Figure 10) and shows why the transition to recall is more sudden for Level 2. However, there are significant differences, such as the observation (based on numerical work) that a monostable region does not seem to exist: a fixed point corresponding to recall is always accompanied by another possible mode of behaviour. Again, taking $g_0 > 0$ adds a stable attractor at $(0, 0)$ and for sufficiently large $g_0$ this removes oscillatory behaviour.

Some comments can be made regarding the parameters $c$ (connectivity), $a$ (activity) and $m$ (number of stored memories).
As noted in Section 4.1.1, the actual value of $c$ does not alter the qualitative behaviour of the system. It will have a quantitative effect; for example, since recall cannot occur if $g_1$ is outside the range $(cp, c)$ the setting of $g_1$ becomes more delicate as $c$ is reduced. The only requirement we have put on $m$ and $a$ is that $ma^2$ remains finite, so that the separation parameter $1 - p \approx e^{-ma^2}$ does not go to zero. More detailed investigation of the storage capacity of the network would require further specification, for example the functional relationship between $a$ and $n$. A partial treatment of memory capacity was given in Gibson and Robinson (1992), with some analytical results for absolute capacity and numerical results for relative capacity. Also, the capacity of the model in relation to the hippocampus was studied in Bennett, Gibson, and Robinson (1994). We have commenced a systematic study of both memory and information capacity in this model and the results will be the subject of a future paper.

The main conclusion of this work is that a memory storage network that incorporates a number of biologically realistic features (sparse coding, low connectivity, asymmetric connections, threshold setting by an inhibitory neuron) has been shown, with some degree of rigour, to have a number of desirable attributes. These include stable fixed points corresponding to memories, well separated from attractors that do not correspond to retrieval of the target memory. Further, in the biologically realistic case where the intrinsic threshold ($g_0$) is positive these attractors tend to become stable fixed points also. As the model's parameters (especially $g_1$) are varied there are sharp transitions between memory-storage and non-memory-storage behaviour and these transitions can be characterized as the bifurcations of a nonlinear dynamical system.
REFERENCES

Aihara, K., Takabe, T., & Toyoda, M. (1990). Chaotic neural networks. Physics Letters A, 144, 333-340.
Amari, S. (1971). Characteristics of randomly connected threshold-element networks and network systems. Proceedings of the IEEE, 59, 35-47.
Amari, S. (1989). Characteristics of sparsely encoded associative memory. Neural Networks, 2, 451-457.
Amari, S., & Maginu, K. (1988). Statistical neurodynamics of associative memory. Neural Networks, 1, 63-73.
Amit, D. J. (1989). Modeling brain function: the world of attractor neural networks. Cambridge: Cambridge University Press.
Bennett, M. R., Gibson, W. G., & Robinson, J. (1994). Dynamics of the CA3 pyramidal neuron autoassociative memory network in the hippocampus. Philosophical Transactions of the Royal Society of London B, 343, 167-187.
Blum, E. K., & Wang, X. (1992). Stability of fixed points and periodic orbits and bifurcations in analog neural networks. Neural Networks, 5, 577-587.
Chapeau-Blondeau, F., & Chauvet, G. (1992). Stable, oscillatory, and chaotic regimes in the dynamics of small neural networks with delay. Neural Networks, 5, 735-743.
Devaney, R. L. (1989). An introduction to chaotic dynamical systems. Redwood City, CA: Addison-Wesley.
Doedel, E. (1986). AUTO: Software for continuation and bifurcation problems in ordinary differential equations. Applied Mathematics Report, California Institute of Technology, Pasadena, CA.
Faris, W. G., & Maier, R. S. (1988). Probabilistic analysis of a learning matrix. Advances in Applied Probability, 20, 695-705.
Gardner-Medwin, A. R. (1976). The recall of events through the learning of associations between their parts. Proceedings of the Royal Society of London B, 194, 375-402.
Gibson, W. G., & Robinson, J. (1992). Statistical analysis of the dynamics of a sparse associative memory. Neural Networks, 5, 645-661.
Goles, E., & Vichniac, G. Y. (1986). Lyapunov functions for parallel neural networks. In J. S. Denker (Ed.), Neural networks for computing (pp. 165-181). New York: American Institute of Physics.
Golomb, D., Rubin, N., & Sompolinsky, H. (1990). Willshaw model: Associative memory with sparse coding and low firing rates. Physical Review A, 41, 1843-1854.
Hirsch, M. W. (1989). Convergent activation dynamics in continuous time networks. Neural Networks, 2, 331-349.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, U.S.A., 79, 2554-2558.
Little, W. A. (1974). The existence of persistent states in the brain. Mathematical Biosciences, 19, 101-120.
Little, W. A., & Shaw, G. L. (1975). A statistical theory of short and long term memory. Behavioural Biology, 14, 115-133.
Marcus, C. M., & Westervelt, R. M. (1989). Dynamics of iterated-map neural networks. Physical Review A, 40, 501-504.
Marcus, C. M., Waugh, F. R., & Westervelt, R. M. (1990). Associative memory in an analog iterated-map neural network. Physical Review A, 41, 3355-3364.
Marr, D. (1971). Simple memory: a theory for archicortex. Philosophical Transactions of the Royal Society of London B, 262, 23-81.
Renals, S., & Rohwer, R. (1990). A study of network dynamics. Journal of Statistical Physics, 58, 825-848.
Thomas, C. M., Gibson, W. G., & Robinson, J. (1994). Stability and bifurcations in an associative memory model. School of Mathematics and Statistics Report 94-29, University of Sydney, Sydney, Australia.
Treves, A., & Rolls, E. T. (1990). Neuronal networks in the hippocampus involved in memory. In L. Garrido (Ed.), Statistical mechanics of neural networks (Vol. 368, pp. 81-95). Lecture Notes in Physics. Berlin: Springer.
Wang, X. (1991). Period-doublings to chaos in a simple neural network: an analytical proof. Complex Systems, 5, 425-441.
Wiggins, S. (1990). Introduction to applied nonlinear dynamical systems and chaos. New York: Springer.
Willshaw, D. J., & Buckingham, J. T. (1990). An assessment of Marr's theory of the hippocampus as a temporary memory store. Philosophical Transactions of the Royal Society of London B, 329, 205-215.
NOMENCLATURE

$n$    the number of principal (excitatory) neurons in the network
$m$    the number of memories stored in the network. (Strictly, $m + 1$ memories are stored, since the numbering starts from 0.)
$c$    the connectivity, being the probability that two neurons have an intrinsic connection
$a$    the activity, being the probability that a randomly chosen neuron in a memory pattern is active
$W$    the connectivity matrix, whose elements $W_{ij}$ are unity if there is an intrinsic connection from neuron $j$ to neuron $i$, and zero otherwise
$Z^p$    one of the memory vectors, $p = 0, 1, \ldots, m$. It is of dimension $n$, and its entries are 0s and 1s
$J$    the matrix of learned connections: $J_{ij}$ is unity if the $i$th and $j$th neurons are simultaneously active for any one of the $m + 1$ memory patterns, and zero otherwise
$x(t)$    the vector giving the state of the system at the discrete time $t = 0, 1, 2, \ldots$. It is of dimension $n$, and its entries are 0s and 1s
$h(t)$    the vector of inputs to the neurons at time $t$
$T(t)$    the intrinsic threshold on a neuron at time $t$
$g_0, g_1$    parameters in the assumed linear form of the threshold function
(i) If $|W| < \sqrt{\pi/2}$ then the system is monostable regardless of the value of $\Theta$.
(ii) If $W > \sqrt{\pi/2}$ then the system is bistable when

and is monostable otherwise.
(iii) If $W < -\sqrt{\pi/2}$ then the system is monostable if

and otherwise has a stable orbit of period 2 with no other attractors.

If $W = \lambda/2$ and $\Theta = -\lambda/2 - \mu$ then the functions $\Phi(\lambda x + \mu)$ and $2\Phi(Wx - \Theta) - 1$ are topologically conjugate (under the map $\tau \mapsto 2\tau - 1$). When expressed in terms of $\lambda$ and $\mu$, the results of Amari are equivalent to Lemma 1. A direct proof of Lemma 1 may also be given using the techniques of Amari (1971).
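The conjugacy between the two maps is easy to confirm numerically. The sketch below is only a sanity check under the stated substitution $W = \lambda/2$, $\Theta = -\lambda/2 - \mu$; the function names `f`, `g` and `h` and the sample parameter values are chosen here for illustration and do not appear in the original. It verifies that $h(f(x)) = g(h(x))$ on a grid in $[0, 1]$, where $f(x) = \Phi(\lambda x + \mu)$, $g(u) = 2\Phi(Wu - \Theta) - 1$ and $h(x) = 2x - 1$.

```python
import math


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def f(x, lam, mu):
    """The map on [0, 1] written in terms of lambda and mu."""
    return Phi(lam * x + mu)


def g(u, W, Theta):
    """The map on [-1, 1] written in terms of W and Theta."""
    return 2.0 * Phi(W * u - Theta) - 1.0


def h(x):
    """Change of variable from [0, 1] to [-1, 1]."""
    return 2.0 * x - 1.0


lam, mu = 4.0, -1.8                  # illustrative values only
W, Theta = lam / 2.0, -lam / 2.0 - mu

# Largest violation of h(f(x)) = g(h(x)) over a grid of 101 points.
max_gap = max(
    abs(h(f(i / 100.0, lam, mu)) - g(h(i / 100.0), W, Theta))
    for i in range(101)
)
print(max_gap < 1e-12)
```

Because $W(2x - 1) - \Theta = \lambda x + \mu$ under this substitution, the two sides agree up to rounding, so fixed points, cycles and their stability correspond exactly between the two forms.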
T(t)
s(t) e (t) ai(t) It
Yt
zt
p, p~ 7, 7 / a,/3 #,A
(I'(2) (x, y, x', y )
x* , y* , z*
the total number of neurons firing at time t expectations o f (input-threshold) at time t standard deviations of ( i n p u t threshold) at time t the average valid firing rate; naxt is the expected number o f valid firings at time t the average spurious firing rate; n(1 - a ) y t is the expected number of spurious firings at time t the average total firing rate; nzt is the expected number of neurons firing at time t functions o f a and m, defined in Section 2 functions o f 81 and 82, defined in Section 2 functions o f ~1, 62 and ~3, defined in Section 2 functions of n, c, go,g~ defined in Section 3 functions o f a, c, p, 7, g~ defined in Section 4 the two-dimensional map defined by the progressive recall equations in the Level 0 (i = 0) or Level 1 (i : 1) approximations the four-dimensional map defined by the progressive recall equations in the Level 2 approximation the above maps in the limit n ~ ~x~ fixed points o f maps normal distribution function APPENDIX
Proof of Lemma 1 Amari (1971) proved the following results about the system (12):
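Proof of Lemma 2

Define $\xi(\mu)$ implicitly by $\xi(\mu) = \Phi(\lambda\xi + \mu)$. Then $\xi(\mu)$ has three branches on $(\mu_+, \mu_-)$ and is single valued elsewhere, and

$$\frac{d\xi}{d\mu} = \frac{\Phi'(\lambda\xi + \mu)}{1 - \lambda\Phi'(\lambda\xi + \mu)},$$

which is never zero and is undefined only if $\lambda\Phi'(\lambda\xi + \mu) = 1$. The branch defined for $\mu < \mu_+$ is small for $\mu$ large negative, is increasing, is defined on $(-\infty, \mu_-]$ and takes the value $\Phi\left(-\sqrt{\log(\lambda^2/2\pi)}\right)$ at $\mu_-$. This proves the first part of the lemma. The rest is similar.

Behaviour of $\lambda$ and $\mu$ for $cp < g_1 < c$. We show that $\lambda + \mu \gg 0$ and $\mu_+ < \mu \ll 0$ when $a$ is small. For convenience we use the expressions [eqns (19) and (20)] in which $\lambda$ and $\mu$ depend on $m$ and $a$ via the quantity $(1 - p)/\sqrt{\gamma}$. A simple calculation shows that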
¢-q for small ma 3 and some constant K which is independent o f a a n d m. If we fix any r ~ (0, 1) then #(r) is like - 1/mC-m--~and so ~(~-) << 0. Moreover the inequality (b(-~) < ~
1
1
~ e -(2/2,
~ > 0,
(A1)
shows that A(r) increases more rapidly than a e-'~/~*3/m~m~ ~ and hence m o r e rapidly than any power o f 1/a as a ~ 0./"This shows that A(~')+ p ( r ) > > 0 for small a. We have only to show that # > #_. N o w • ( ~ ~ 1 since A >> 0 and
for some constant M > 0 and for a small. But, as remarked above, A goes to oo more quickly than - # as a --~ 0, so for a small we have
~,(~) > ~,÷(~-).
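Inequality (A1) is the standard Gaussian tail bound $\Phi(-\zeta) < \frac{1}{\sqrt{2\pi}\,\zeta}\, e^{-\zeta^2/2}$ for $\zeta > 0$. As a quick illustration (a numerical spot check at a few sample points, not a proof), the sketch below compares the two sides; the function name `tail_bound` and the chosen values of $\zeta$ are hypothetical.

```python
import math


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tail_bound(zeta):
    """Right-hand side of (A1): exp(-zeta**2 / 2) / (sqrt(2*pi) * zeta)."""
    return math.exp(-0.5 * zeta * zeta) / (math.sqrt(2.0 * math.pi) * zeta)


for zeta in (0.1, 0.5, 1.0, 2.0, 5.0):
    lhs, rhs = Phi(-zeta), tail_bound(zeta)
    print(f"zeta={zeta}: Phi(-zeta)={lhs:.3e}  bound={rhs:.3e}  holds={lhs < rhs}")
```

Applied with $\zeta = -\mu$ for $\mu < 0$, the bound gives $1/\Phi(\mu) > \sqrt{2\pi}\,|\mu|\, e^{\mu^2/2}$, which is the step that makes $\lambda$ grow so quickly as $\mu$ becomes large and negative.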
Behaviour of A and p for gl > c . We prove that if g] > c is large e n o u g h a n d a is small then inequality (1) holds. Since/z < 0 and /~+ > 0, we have only to prove that # > #_. We continue to write gl = (1 - T ) c p + r c , but, now with T > 1. By eqn (21), p- <-~÷1/l~v/~-~ and, so it is enough to prove " / - - - /~-> t- h ~ a +t 1 / ~ , or /z2 < log(A:/27r) - 2 + l/log(A2/2~r). The last term on the right
hand side is positive and it is sufficient to 1,2 < log(A2/2~r) _ 2. By eqn (21),
show that
d (x, y) = ¢ (t~(x, y)
oc.)
so logO(~) < - log ~
- log(-#) - #2/2,
and so
o
log ~
-2=log
(
")
( l - r ) 2(1-p)2 7 ~(I
- 2log ##(p) - 2
where I,(x,y) = (c - g,)ax + (cp - g,)(1 - a)y - go and 12(x,y) = (cp --g,)(ax+ (1 - a)y) --go. Let B, = {(x,y) • [0, 11~: x,y < e}. Choose el > 0 small enough so that lj(x,y)<-go~2 whenever (x,y) • B~). There is a constant C > 0 such that if (x,y) • B~ and e < e~ then
> l ° g ( (1-'r)2(l-p)2'~ (l--~a) 2a2 ) + 2 1 o g ' ~ + 21og(-p) +/a x -- 2 > #2 <
for large $\tau$, proving the lemma.

Stability of the quiescent state for $g_0 > 0$, Level 1. Here we prove that if $g_0 > 0$ then $(0, 0)$ is a stable fixed point of $\Phi^{(1)}$ for all choices of the other parameters, and that if $g_1$ is large enough then all points in $[0, 1]^2$ are attracted to $(0, 0)$. Now
~I (x,y) = • (lt (x, y)
/¢(c('--c)ax+cp(l-cp)(l--a)y)/n+Tc2(l-a)2y2),
where we h ac~e made use of eqn (21) to get the second inequality. Since x/2 e- / ~ = o(e) as e ~ 0, there exists e2 ~ c then ~(x,y) < -go for all (x,y) E [0, 1]2. Provided only that gt > c, a choice o f e2 can be made independently ofgt such that all (x, y) in B~2 are attracted to (0, 0). But for g~ sufficiently large, • t(x,y) E B~ for all (x,y) q~ B~2, proving the lemma.