Neural Networks, Vol. 9, No. 1, pp. 25-40, 1996
Pergamon
Copyright © 1996 Elsevier Science Ltd. All rights reserved
0893-6080(95)00100-X
Printed in Great Britain 0893-6080/96 $15.00 + .00

CONTRIBUTED ARTICLE
A Network of Chaotic Elements for Information Processing

SHIN ISHII,¹ KENJI FUKUMIZU² AND SUMIO WATANABE²

¹ATR Human Information Processing Research Laboratories, and ²Ricoh Co. Ltd

(Received 4 January 1993; revised and accepted 11 July 1995)
Abstract--A globally coupled map (GCM) model is a network of chaotic elements that are globally coupled with each other. In this paper, first, a modified GCM model called the "globally coupled map using the symmetric map (S-GCM)" is proposed. The S-GCM is designed for information-processing applications. The S-GCM has attractors called "cluster frozen attractors", each of which is taken to represent information. This paper also describes the following characteristics of the S-GCM which are important to information-processing applications: (a) the S-GCM falls into one of the cluster frozen attractors over a wide range of parameters. This means that the information representation is stable over parameters; (b) represented information can be preserved or broken by controlling parameters; (c) the cluster partitioning is restricted, i.e. the representation of information has a limitation. Finally, our techniques for applying the S-GCM to information processing are shown, considering these characteristics. Two associative memory systems are proposed and their performance is compared with that of the Hopfield network.
Keywords--Chaos, Nonequilibrium dynamics, Globally coupled map, Spatiotemporal chaos, Chaotic neural network, Associative memory.

1. INTRODUCTION

Recently, there have been many studies on artificial neural network models with nonequilibrium dynamics. They have been encouraged by recent biological experimental results of mammalian brains. In the cat visual cortex, for example, stimulus-specific synchronized oscillations have been reported by Eckhorn et al. (1988) and Gray and Singer (1989). Skarda and Freeman (1987) reported that, in the rabbit olfactory bulb, limit cycle activities occur for perceptible specific odors but chaotic activities occur for novel odors. Based on the above-mentioned biological results, the spatiotemporal complexity in recent nonequilibrium neural network models has mainly been attributed to the network's asymmetric connections, i.e. excitatory and inhibitory connections (Li & Hopfield, 1989; Yao & Freeman, 1990; Grossberg & Somers, 1991). A nonequilibrium neural network model was also proposed by Aihara et al. (1990); it was deduced from experiments with squid giant axons. This model's spatiotemporal complexity is not generated by the network structure, rather by the dynamics of each single neuron. Neural network models with asymmetric connections have also been investigated theoretically (Amari, 1972; Sompolinsky & Kanter, 1986; Sompolinsky et al., 1988). The major motivation of the above-mentioned studies is to "mimic" biological neural networks; further breakthroughs are needed to properly achieve this. From a technical viewpoint, it is important to implement nonequilibrium information processing such as that in the human brain. For example, let us think about the associative memory (Kohonen, 1977), which is a key technology utilized in many technical fields such as pattern recognition and database retrieval. Hopfield (1984) proposed a neural network approach to the associative memory in which an association process corresponds to minimizing the network's Lyapunov function. In this sense, his network employs equilibrium dynamics. His model also employs the covariance learning rule (Hebb, 1949). Since his work, many associative memory systems based on nonequilibrium neural network models have been proposed, many of which employ the covariance learning rule. For

Acknowledgements: This work was done while the author was affiliated with the Information and Communication R&D Center, Ricoh Co., Ltd. The first author would like to thank Masaaki Sato of ATR Human Information Processing Research Laboratories for his valuable comments. Requests for reprints should be sent to Shin Ishii, ATR Human Information Processing Research Laboratories, 2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-02 Japan; E-mail: [email protected]
example, Nara et al. (1993) proposed an associative memory system based on an asymmetric neural network. Adachi et al. (1993) proposed a system based on the model proposed by Aihara et al. (1990). On the other hand, from a physical viewpoint, studies have been done on nonlinear coupled oscillators (Kuramoto, 1991). The main interest of these studies has been the complex phenomena of certain physical systems such as spin-glasses, although they also intend to analyze the spatiotemporal complexity of biological neural networks. Kaneko proposed several models which are based on coupled chaotic elements. Each element evolves in time according to a logistic map, and the couplings are of the nearest neighbor type (Kaneko, 1984), or of the global coupling type (Kaneko, 1990). Here, the latter model (Kaneko, 1990), which is called the "globally coupled map (GCM)", is briefly introduced. The GCM is defined as:
x_i(t+1) = (1 - ε) f(x_i(t)) + (ε/N) Σ_{j=1}^{N} f(x_j(t))   (1)

f(x) = 1 - αx²,  x ∈ [-1, 1],   (2)
where x_i(t) denotes the ith unit's value at time t, and N the number of units. Each unit's dynamics is almost entirely given by the logistic map; the portion described as a summation in eqn (1) is defined as feedback from the average of all of the units. This GCM model has many interesting characteristics, one of which is that attractors called "cluster frozen attractors" are observed. The GCM has two parameters, α and ε. When these parameters are set to specific values, all units split into a small number of clusters, and then the units belonging to the same cluster come to follow an identical orbit. The number of clusters sometimes becomes large depending on the parameter values. In this paper, we propose a modified GCM model which can be applied to information processing. Since Kaneko's original model is not well suited to information processing, we employ a cubic function called the S-MAP instead of the logistic map. Our main target in this paper is the application of a spatiotemporal chaotic system to information processing. However, information processing is primarily a process done by biological neural systems. Therefore, our motivation also lies in the role of chaos in biological neural networks, although the proposed model is possibly quite distant from actual neural systems.
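As a concrete aid (not from the paper), eqns (1) and (2) can be simulated in a few lines of NumPy. The parameter values, random seed, and the rounding tolerance used to count clusters below are our illustrative choices:

```python
import numpy as np

def gcm_step(x, alpha, eps):
    """One update of eqn (1) with the logistic map of eqn (2):
    x_i(t+1) = (1 - eps) f(x_i(t)) + (eps/N) sum_j f(x_j(t))."""
    fx = 1.0 - alpha * x**2
    return (1.0 - eps) * fx + eps * fx.mean()

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)   # random initial state
for _ in range(2000):                  # let transients die out
    x = gcm_step(x, alpha=1.8, eps=0.3)

# on a cluster frozen attractor, units in the same cluster share an orbit,
# so the number of distinct (rounded) values approximates the cluster count
n_clusters = len(np.unique(np.round(x, 8)))
print("distinct values at this instant:", n_clusters)
```

For strong coupling the system tends toward coherent or few-cluster states; for large α and weak coupling each unit follows its own chaotic orbit and the count approaches N.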
2. GCM MODEL USING A SYMMETRIC MAP

2.1. Model Description

In Kaneko's original GCM model (Kaneko, 1990), each unit's chaotic motion is produced by a logistic map. In our model, a "symmetric MAP (S-MAP)" is employed instead of the logistic map.

[Model S-GCM]
x_i(t+1) = (1 - ε) f(x_i(t)) + (ε/N) Σ_{j=1}^{N} f(x_j(t))   (3)

f(x) = αx³ - αx + x,  x ∈ [-1, 1].   (4)
Equation (3) is the same equation as in the GCM; x_i(t) denotes the ith unit's value at time t, and N the number of units. Each unit's dynamics is almost entirely given by the cubic function S-MAP described by eqn (4); the portion described as a summation in eqn (3) is defined as feedback from the average of all of the units. Figures 1a and 1b show the function shapes of the logistic map and the S-MAP, respectively. Figures 2a and 2b show their respective bifurcation diagrams over parameter α. The S-MAP has a symmetric function shape as shown in Figure 1b, resulting in the symmetric bifurcation diagram shown in Figure 2b. Our S-MAP is a cubic function and has two extrema in its range when α > 2. On the other hand, the logistic map is a quadratic function and has one extremum. Therefore, the S-MAP has at most two attracting periodic orbits, while the logistic map has at most one attracting periodic orbit (Devaney, 1989). When the S-MAP has two attracting periodic orbits, these orbits exist symmetrically as Figure 3 shows. Later we will show a technique to take an S-GCM
FIGURE 1a. Logistic map, f(x) = 1 - αx², α = 1.4 and α = 2.0. The logistic map is a quadratic function, and has one extremum in its range.
Chaotic Associative Memory
FIGURE 1b. S-MAP, f(x) = αx³ - αx + x, α = 3.0 and α = 4.0. The S-MAP is a cubic function, and has two extrema in its range when α > 2.
FIGURE 3. Two two-cycle periodic orbits of the S-MAP. When the S-MAP has two attracting periodic orbits, they exist symmetrically in its range.
attractor as a binary representation, where, roughly speaking, these two orbits are regarded as representing 1 and -1. The symmetry of the S-MAP allows this coding technique to work well.
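The symmetry can be checked numerically. The sketch below (our illustration, with an arbitrary α) verifies that the S-MAP of eqn (4) is an odd function, so every orbit has a mirror-image orbit:

```python
import numpy as np

ALPHA = 3.0  # illustrative value; Figure 1b plots alpha = 3.0 and 4.0

def s_map(x):
    # eqn (4): f(x) = alpha*x^3 - alpha*x + x
    return ALPHA * x**3 - ALPHA * x + x

# f is odd: f(-x) = -f(x), so orbits come in symmetric pairs
xs = np.linspace(-1.0, 1.0, 201)
assert np.allclose(s_map(-xs), -s_map(xs))

# iterate two mirrored initial points; the orbits remain mirror images
x_pos, x_neg = 0.4, -0.4
for _ in range(500):
    x_pos, x_neg = s_map(x_pos), s_map(x_neg)
print("mirrored orbit points:", round(x_pos, 6), round(x_neg, 6))
```

This mirror-pair structure is what lets an attractor and its reflection encode 1 and -1.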
"
0.6 i 0.4 I 0.2
2.2. Cluster Frozen Attractors
FIGURE 2a. Bifurcation diagram of the logistic map (1.0 ≤ α ≤ 2.0).
FIGURE 2b. Bifurcation diagram of the S-MAP (2.5 ≤ α ≤ 4.0).
Our S-GCM has two parameters, α and ε. When these parameters are set to specific values, the system falls into attractors called "cluster frozen attractors". When the system is attracted into one of the cluster frozen attractors, the units belonging to the same cluster come to take an identical orbit. Here, each "identical orbit" is typically a two-cycle periodic orbit like those shown in Figure 3, and sometimes four-cycle, eight-cycle, and so on,¹ depending on the parameter values and the initial state. It can also be a chaotic orbit. Figure 4a shows the time series of all the units when their initial values are randomly set within the range of -1 to 1, where α = 3.4 and ε = 0.1. Figure 4b shows the same time series for every two steps (t = 2, 4, 6, ...). Even if all the units initially vary in their values, they come to split into four clusters and the units belonging to the same cluster come to take an identical orbit, i.e. a two-cycle periodic orbit in this case. The attractors may vary even if the parameter values are fixed. For example, with α = 3.96 and ε = 0.25, the S-GCM takes five-cluster frozen attractors, six-cluster frozen attractors, and so on. This variation stems from the variation of initial states.
¹ Due to the symmetric shape of the S-MAP, fixed-point solutions can hardly exist in the S-GCM.
FIGURE 4. Four-cluster frozen attractor (α = 3.4, ε = 0.1). An example of a four-cluster frozen attractor. (a) The time series of all the units are plotted when their initial values are randomly set. (b) The same time series as in (a) are plotted for every two steps (t = 2, 4, 6, ...). Note that the number of units is 100, and there are 100 lines drawn in each figure. Even if all the units initially vary in their values, they come to split into four clusters and the units belonging to the same cluster come to take an identical orbit, i.e. a two-cycle periodic orbit.
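The freezing behaviour can be reproduced with a short script. This is our sketch using the Figure 4 parameters (α = 3.4, ε = 0.1); the seed and the tolerance for grouping units into clusters are arbitrary choices, and the resulting cluster count can vary with the initial state:

```python
import numpy as np

def s_gcm_step(x, alpha, eps):
    """Eqns (3)-(4): globally coupled S-MAP update."""
    fx = alpha * x**3 - alpha * x + x
    return (1.0 - eps) * fx + eps * fx.mean()

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)      # random initial values, as in Figure 4
for _ in range(1000):                     # discard the transient
    x = s_gcm_step(x, alpha=3.4, eps=0.1)

# group units whose values coincide; on a frozen attractor this recovers
# the cluster structure at one instant of the cycle
clusters = np.unique(np.round(x, 6))
print("instantaneous cluster count:", len(clusters))

# a two-cycle frozen state repeats after two steps
y = s_gcm_step(s_gcm_step(x, 3.4, 0.1), 3.4, 0.1)
print("repeats after two steps:", bool(np.allclose(x, y, atol=1e-6)))
```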
3. CHARACTERISTICS OF S-GCM

In this section, we describe some of the S-GCM characteristics which are important to information-processing applications. In order to apply the S-GCM to information processing, we determine its input/output method as follows.² Input to the S-GCM corresponds to its initial state, and output from the S-GCM corresponds to the state after some period of time. Information is thus "processed" by the dynamical system S-GCM.

² Further description will be given in Section 4.1.
3.1. Phase Diagram of S-GCM
Spatiotemporal features of the S-GCM attractors are mainly determined by its parameter values; α is the bifurcation parameter of the S-MAP and indicates the strength of each unit's chaos, and ε indicates the strength of the couplings. Therefore, roughly speaking, as α becomes large, the S-GCM becomes chaotic, and as ε becomes large, the S-GCM becomes coherent, or stable. Figure 5a shows a rough phase diagram of the S-GCM, where the attractors are classified according to their spatiotemporal features.
FIGURE 5a. Phase diagram of S-GCM. A rough phase diagram of the S-GCM over parameters. The parameters are changed from α = 3.0 to 4.0 by 0.02 and ε = 0.0 to 0.4 by 0.02. Parametric points are classified according to the spatiotemporal features of the attractors, which are obtained for 100 random initial states. N = 200.

FIGURE 5c. LLE contour map of S-GCM. A contour map of the LLE shown in Figure 5b. The thick lines indicate that the LLE is equal to 0. The LLE is smallest in the middle of the ordered(2) phase. In the turbulent phase and the partial ordered phase, the LLE is positive. We can clearly observe the correspondence between this figure and Figure 5a.
The definition of each phase is the same as in the GCM (Kaneko, 1990).

3.1.1. Coherent Phase. When α is small and ε is large, all the units fall into the same orbit. This phase is similar to that of the GCM, but the region is much smaller than that in the GCM.

3.1.2. Ordered Phase. In this phase, the system falls into cluster frozen attractors. This phase is divided into several areas by the dominant number of clusters. In Figure 5a, "ordered(2)" [or abbreviated to just (2)] means that, in that area, attractors tend to be frozen with two clusters. Cluster frozen attractors can also be found in the GCM. However, our S-GCM has a much larger ordered(2) area than the GCM.³ In addition, the S-GCM has a fairly large ordered(4) area and ordered(8) area, while the GCM seems to have neither of them. This is because the S-MAP can have two attracting periodic orbits, and every unit tends to follow one of them. The number of clusters primarily increases in the manner of 2, 4, 8, 16, ... as α increases or ε decreases. This is evident when ε is fixed at 0 and α increases. Hence, the ordered(2) area is followed by the ordered(4) area, which is followed by the ordered(8) area. However, as ε increases, the borders between these areas become more vague, thereby producing some mixed areas like (2, 3) or (3, 4). These mixed areas are not found in Figure 5a because their borders are too vague to describe.

3.1.3. Partial Ordered Phase. In the glassy phase and the intermittent phase, attractors fall into a large number of clusters in some cases, and fall into a small number of clusters in other cases. Namely, the S-GCM attractors vary depending on their initial states. Glassy(2), where two-cluster attractors are dominant, and glassy(4) areas clearly exist as shown in Figure 5a.
3.1.4. Turbulent Phase. When α is large and ε is small, "chaos" is stronger than "order", and each unit follows its own chaotic orbit. The turbulent phase can be classified with respect to the preservability of information, which will be discussed in the next subsection.

In the information-processing systems proposed in Section 4, we regard each of the ordered(4) attractors as an output from the system. Therefore, the fact that our S-GCM has a large ordered(4) area is important to output stability and output recurrence. In this sense, the S-GCM is more suitable to information-processing applications than the GCM.

Figure 5b shows the S-GCM's largest Lyapunov exponent (LLE) λ for various parameter values. Each LLE value is averaged over 100 sets of randomly taken initial states, and it is calculated after t = 2048 to ignore the initial transient period. Figure 5c shows the LLE contour map. The thick lines indicate that the LLE is equal to 0. By investigating Figures 5a, 5b, and 5c, we can observe the following facts.

• Roughly speaking, the LLE is large when α is large and ε is small.
• The LLE is smallest in the vicinity of the middle of the ordered(2) area. Due to the symmetric shape of the S-MAP, two-cluster frozen attractors are most stable.
• In the glassy area, the LLE is relatively large and its value is positive, which means the system complexity is high.
• In the turbulent area and the partial ordered area, the LLE is large and its value is positive, which means that the S-GCM in those areas is chaotic.

We can also observe the correspondence between Figure 5a and Figure 5c. For example, the turbulent phase and the partial ordered phase almost correspond to the area where the LLE is positive.

FIGURE 5b. Largest Lyapunov exponent of S-GCM. The largest Lyapunov exponent (LLE) of the S-GCM over parameters. The parameters are changed from α = 3.0 to 4.0 by 0.02 and ε = 0.0 to 0.4 by 0.02. When parameters are set to those values, the LLE is averaged over 100 randomly chosen initial states and after t = 2048. N = 200.

³ Actually, the GCM is in ordered(2) within 0.13 < ε < 0.21 at α = α∞ ≈ 1.4, but the S-GCM is in ordered(2) within 0.11 < ε < 0.34 at α = α′∞ ≈ 3.3. Here, α∞ denotes the accumulation point of the logistic map's period-doublings, and α′∞ the accumulation point of the S-MAP's period-doublings.
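The LLE itself is straightforward to estimate by propagating a tangent vector along the trajectory. The sketch below is our code, using a standard tangent-space method and two illustrative parameter points; the Jacobian follows directly from eqn (3):

```python
import numpy as np

def lle_s_gcm(alpha, eps, n=100, t_transient=1000, t_measure=1000, seed=0):
    """Estimate the largest Lyapunov exponent of the S-GCM.
    Jacobian of eqn (3): d x_i(t+1) / d x_j = (1 - eps) f'(x_i) delta_ij
    + (eps/n) f'(x_j), with f'(x) = 3*alpha*x^2 - alpha + 1."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, n)
    v = rng.normal(size=n)
    v /= np.linalg.norm(v)
    log_sum = 0.0
    for t in range(t_transient + t_measure):
        d = 3 * alpha * x**2 - alpha + 1          # f'(x_i) along the orbit
        jv = (1 - eps) * d * v + eps * np.mean(d * v)   # Jacobian-vector product
        fx = alpha * x**3 - alpha * x + x
        x = (1 - eps) * fx + eps * fx.mean()      # advance the state, eqn (3)
        norm = np.linalg.norm(jv)
        v = jv / norm                             # renormalize the tangent vector
        if t >= t_transient:
            log_sum += np.log(norm)
    return log_sum / t_measure

print("LLE near ordered(2):", round(lle_s_gcm(3.3, 0.2), 3))
print("LLE in the turbulent region:", round(lle_s_gcm(3.9, 0.05), 3))
```

A frozen periodic attractor should give a negative exponent and turbulent parameters a positive one, in line with the contour structure of Figure 5c.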
3.2. Information Preservability
An N-dimensional binary coding function C is defined as follows, which converts a state vector x ∈ [-1, 1]^N to a binary vector C(x) ∈ {-1, 1}^N:

C(x)_i = { 1 if x_i ≥ x*; -1 otherwise },   (5)

where x* denotes the stationary point of the S-MAP, which is equal to 0. Using this binary coding function C, an S-GCM state can be translated into an N-bit binary representation. Next, the average information preservation rate R is defined as:

R = lim_{t→∞} lim_{M→∞} R_M(t)   (6)

R_M(t) = (1/M) Σ_{h=1}^{M} (1/N) |C(x^h(0)) · C(x^h(t))|,   (7)
where h denotes the sample index, M the number of samples, N the number of units, x^h(0) the hth initial state vector, and x^h(t) the hth state vector at time t. The information preservation rate R indicates the system's long-term correlation and also corresponds to the expectation of the order parameter q of the
FIGURE 6a. Information preservation rate of GCM. α is changed from 1.2 to 2.0 by 0.04 in each case of ε = 0.0, 0.1, 0.2. In every case, at a certain value of α, R suddenly goes down to 0, which is due to the band merge of the logistic map.
FIGURE 6b. Information preservation rate of S-GCM. α is changed from 3.0 to 4.0 by 0.04 in each case of ε = 0.0, 0.1, 0.2. In the cases where ε = 0.0, 0.1, at a certain value of α, R suddenly goes down to 0, which is due to the band merge of the S-MAP.
EA spin glass model (Edwards & Anderson, 1975). When R is close to 1, the system preserves most of the initial information. On the other hand, when R is close to 0, the system loses the correlation with the initial state. Experimental results show that in most cases of parameter values, R_M(t) becomes constant after several steps, and does not change any more. This implies that the S-GCM output is determined within the first several steps. Rate R is calculated over parameters α and ε both in the GCM and in the S-GCM, where N = 200 and M = 100. Figures 6a and 6b show the results. In the GCM, R decreases gradually from about 0.6 as α increases. Then R suddenly goes down to 0 at a certain value of α, where the state has no correlation with the initial state. This rapid change in the preservation rate is due to the band merge of the logistic map. Actually, in the case of ε = 0.0, the turning point value, i.e. α ≈ 1.56, corresponds to the value where the band merge occurs.⁴ In the case of ε = 0.0, the temporary increase in R is caused by the influence of the three-period window of the logistic map. This rapid change is also observed in the S-GCM.⁵ Rate R increases gradually from about 0.65 as α increases, then R suddenly goes down to 0 at a certain value of α, where the state has no correlation with the initial state. This rapid change in the preservation rate is due to the fact that the band merge of the S-MAP occurs⁶ and the orbits representing each code are mixed with each other.

When R is large, the system preserves most of the initial information, even if the system is in the turbulent phase. When R is close to 0, the state has no correlation with the initial state and the system is chaotic with respect to its binary representation. As already mentioned, our S-GCM is applied to information processing by corresponding its initial state to the input and its attractor to the output. The existence of these two modes in the information preservation rate shows that our model has two processing modes, in which information is preserved or destroyed. Either of them can be specified merely by controlling the parameters. A high information preservation rate is important for information-processing applications. If it is low, it means that the system is "noisy". In general, R is higher in the S-GCM than in the GCM, which implies that our model is less noisy than the GCM and the initial input information is more meaningful in our model than in the GCM. This is because the S-MAP is likely to have two attracting orbits, and each unit is attracted to one of these two orbits at the early stage of transition. However, R ≈ 0.65 is apparently not so large. Therefore, we adopt the following initial coding technique. The N-dimensional function V converts a binary vector I ∈ {-1, 1}^N to a state vector V(I) ∈ [-1, 1]^N:

⁴ See Figure 2a. In Figure 5a, the dotted line in the turbulent phase denotes this changing point.
⁶ See Figure 2b.
V(I)_i = { x⁺ + rand if I_i = 1; x⁻ + rand if I_i = -1 },   (8)
where x⁺ and x⁻ denote the two two-cycle periodic solutions of the S-MAP, namely, f(x⁺) = x⁻ and f(x⁻) = x⁺; rand is a small random value. In order to input a binary vector I to the S-GCM, we set its initial state to be V(I). Since a unit having a value in the vicinity of x⁺ or x⁻ easily takes a two-cycle periodic orbit, our model can preserve almost 100% of the initial information⁷ with this initial coding technique. In the applications shown in Section 4, we actually use this coding technique.
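For concreteness, C and V can be written down directly. One assumption to flag: eqn (8) does not give x⁺ numerically, and here we take the antisymmetric two-cycle of the S-MAP, obtained from f(p) = -p, which gives p = sqrt((α - 2)/α); whether this is exactly the pair the authors used is our reading, not the paper's statement:

```python
import numpy as np

ALPHA = 3.4                      # ordered-phase value used in the applications

def s_map(x):
    return ALPHA * x**3 - ALPHA * x + x

def C(x):
    # eqn (5): threshold at the S-MAP's stationary point x* = 0
    return np.where(x >= 0.0, 1, -1)

# antisymmetric two-cycle: f(p) = -p  =>  alpha * p^3 = (alpha - 2) * p
X_PLUS = np.sqrt((ALPHA - 2.0) / ALPHA)
X_MINUS = -X_PLUS

def V(I, rng, width=0.05):
    # eqn (8): each unit starts near x+ (bit 1) or x- (bit -1)
    base = np.where(I == 1, X_PLUS, X_MINUS)
    return base + rng.uniform(-width, width, size=I.size)

# f swaps the two coding points, as required: f(x+) = x-, f(x-) = x+
print(round(s_map(X_PLUS), 6), round(X_MINUS, 6))

rng = np.random.default_rng(0)
I = rng.choice([-1, 1], size=100)
assert np.array_equal(C(V(I, rng)), I)   # coding then decoding is lossless
```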
3.3. Cluster Partitioning

In the S-GCM, unit partitioning among clusters is restricted as in the GCM (Kaneko, 1990); if one cluster incorporates a small number of units and the other a large number of units, the attractor can no longer exist. For example, when an S-GCM with 100 units is to be frozen with two clusters, it is highly probable that 50 units are incorporated into each cluster. It is less probable to get an attractor having biased unit partitioning, for example, where one cluster incorporates 30 units and the other 70 units. This feature is examined as for two-cluster frozen attractors. First, we define a parameter ν = N_1/N representing the partitioning bias of a two-cluster frozen attractor, where N_1 denotes the number of units in the first cluster⁸ and N the number of units. When ν is close to 0.5, the partitioning is balanced, and as ν becomes distant from 0.5, the partitioning becomes biased. Figure 7 shows the values of the first unit x_1 on attractors as ν changes, where N = 500, α = 3.65 and ε = 0.25.⁹ Here, the value of the parameter ν can be controlled using the coding technique mentioned in the previous subsection. When ν is close to 0.5, it is easy to set ν to that value. However, as ν becomes distant from 0.5, it is hard to get those attractors, since two-cluster attractors become unstable. Then the system is likely to be stable with three or more clusters. In Figure 8, the rate of initial states that are frozen with two clusters is plotted against ν. The system's complexity increases as the partitioning becomes biased. When the complexity becomes too large, the system increases the degree of freedom by increasing the number of clusters. Further argument is done by means of Lyapunov analysis.
FIGURE 7. Cluster bifurcation for two-cluster attractors. x_1(t) (t = 1000, 1001, ..., 1032) is plotted against ν at α = 3.65 and ε = 0.25; 200 sets of initial states are prepared to obtain this figure. If ν is not so distant from 0.5, we can make two-cluster frozen attractors with that bias value by the coding technique described in Section 3.2. However, as ν becomes distant from 0.5, we can hardly make such attractors. This is why no point is found on both sides of the figure. This figure implies that the S-GCM provides more complexity as ν becomes distant from 0.5. N = 500.
When the S-GCM is frozen with two clusters, the system dynamics can be represented as a two-dimensional dynamical system such as the following:

X_1(t+1) = (1 - εN_2/N) f(X_1(t)) + (εN_2/N) f(X_2(t))   (9)

X_2(t+1) = (εN_1/N) f(X_1(t)) + (1 - εN_1/N) f(X_2(t)),   (10)
⁷ This does not hold when the partitioning is biased, as will be discussed in the next part.
⁸ We define the first cluster as the cluster with the first unit.
⁹ The S-GCM with these parameter values is in the ordered(2) phase.
FIGURE 8. Rate of two-cluster attractors. The rate of initial states that are frozen with two clusters is plotted against ν at α = 3.65 and ε = 0.25. The coding technique described in Section 3.2 is used. As ν becomes distant from 0.5, the system is unlikely to be frozen with two clusters. N = 500.
where X_1(t) and X_2(t) denote the values of the first cluster and the second cluster at time t, respectively, and N_1 and N_2 denote the numbers of units belonging to the first cluster and the second cluster, respectively. We represent this dynamical system as G: X(t) → X(t+1), where X(t) ≡ (X_1(t), X_2(t)). The second-order Lyapunov numbers, which indicate the stability of G's two-cycle periodic orbits, are the eigenvalues of DG²(X̄), where DG²(X̄) = DG(G(X̄)) DG(X̄). Here, DG is the Jacobi matrix of the dynamical system G. In Figure 9, the second-order Lyapunov numbers are plotted against ν at α = 3.65 and ε = 0.25. When the partitioning parameter ν is close to 0.5, the largest absolute Lyapunov number μ_max is smaller than 1, which means two-cycle periodic orbits are stable. However, as ν becomes distant from 0.5, μ_max exceeds 1. This exceeding point, ν_2→4 ≈ 0.382, 0.618, corresponds to the bifurcation point from two branches to four branches in Figure 7. Namely, at ν = ν_2→4, a period-doubling bifurcation occurs. Thus, the system provides more complexity by period-doublings of the cluster orbits as ν becomes distant from 0.5 as shown in Figure 7, and when the period-doublings are accumulated, two-cluster attractors can no longer exist, and the system provides its complexity by increasing the degree of freedom, i.e. the number of clusters. Accordingly, the spatiotemporal complexity of the system is determined by the parameters α and ε, and by the cluster partitioning as well. In our information-processing applications proposed in Section 4, cluster frozen attractors are taken to represent information. The fact that biased cluster partitioning is hardly obtained implies that in the S-GCM the representation of information is restricted. An example is shown in Section 4.3.2.

FIGURE 9. Second-order Lyapunov numbers. The second-order Lyapunov numbers are plotted against ν at α = 3.65 and ε = 0.25. When ν is close to 0.5, the largest absolute Lyapunov number μ_max is smaller than 1, which means two-cycle periodic orbits are stable. As ν becomes distant from 0.5, μ_max increases, and at a certain value of ν, μ_max exceeds 1. This value of ν (≈ 0.382, 0.618) corresponds to the bifurcation point from two-branch to four-branch in Figure 7.

4. INFORMATION-PROCESSING APPLICATIONS

Given the above-mentioned characteristics, our S-GCM can be applied to information processing.

4.1. General Strategy
We regard the S-GCM as an information-processing system which processes an N-dimensional binary vector I ∈ {-1, 1}^N into an N-dimensional binary vector O ∈ {-1, 1}^N. Because our dynamical system has [-1, 1]^N as its range, conversion is needed:

I --V--> x(0) --S-GCM--> x(T) --C--> O.   (11)
C and V were defined in Section 3.2. Using this input/output conversion, our S-GCM at α = 3.4, ε = 0.1 is likely to be frozen with four clusters, if the input pattern I does not contain too many 1s nor too many -1s. When the parameters are set to those values, each output pattern O is almost equal to the input pattern I; the S-GCM process is an identical mapping. With some parameter values in the turbulent phase, say α = 3.9 and ε = 0.05, no correlation between I and O is found. In this case, each unit shows a chaotic motion; the S-GCM process is a conversion from an input pattern to a random output pattern. Accordingly, our S-GCM has two information-processing modes, i.e. identical mapping and randomizing, or preserving and destroying. The above-mentioned two processing modes are "global" ones. However, if we control the parameter values unit-wise, we can "locally" switch these two modes one by one. This is the main idea for applying our S-GCM to information processing. In what follows, we propose several techniques to control this local mode switching.
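The two global modes can be demonstrated end to end with the conversion of eqn (11). This is our illustration; the seeds, run length, and the choice of x⁺ for V are assumptions carried over from the Section 3.2 sketch:

```python
import numpy as np

def run(I, alpha, eps, T=200, seed=0):
    """I --V--> x(0) --S-GCM--> x(T) --C--> O, as in eqn (11)."""
    rng = np.random.default_rng(seed)
    x_plus = np.sqrt((alpha - 2.0) / alpha)             # assumed coding point
    x = I * x_plus + rng.uniform(-0.05, 0.05, I.size)   # V of eqn (8)
    for _ in range(T):
        fx = alpha * x**3 - alpha * x + x
        x = (1 - eps) * fx + eps * fx.mean()            # eqn (3)
    return np.where(x >= 0, 1, -1)                      # C of eqn (5)

rng = np.random.default_rng(42)
I = rng.choice([-1, 1], size=100)

preserve = abs(np.mean(run(I, 3.4, 0.10) * I))   # ordered: identical mapping
destroy = abs(np.mean(run(I, 3.9, 0.05) * I))    # turbulent: randomizing
print("overlap, preserving mode:", preserve)
print("overlap, destroying mode:", destroy)
```

The preserving run should return essentially the input pattern, while the turbulent run should return a pattern with near-zero overlap.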
4.2. Associative Memory Controlling α

Let {ξ¹, ξ², ..., ξ^m | ξ^k ∈ {1, -1}^N} be a set of memorized patterns, where ξ_i^k denotes the ith element value in the kth memorized pattern and m the number of memorized patterns. Associative memory systems must be able to output the target memorized pattern from an input that is relatively close to the target. A covariance matrix of the set of memorized patterns is defined by
σ_ij = (1/m) Σ_{k=1}^{m} ξ_i^k ξ_j^k.   (12)
Our associative memory systems employ this covariance matrix, namely, the learning method is of the conventional covariance type (Hebb, 1949). The following system (Ishii et al., 1993a) is an associative memory system obtained by modifying the S-GCM.

[System 1]

x_i(t+1) = (1 - ε) f_i(x_i(t)) + (ε/N) Σ_{j=1}^{N} f_j(x_j(t))   (13)

f_i(x) = f(x; α_i) = α_i x³ - α_i x + x.   (14)
This system is different from the original S-GCM in that the strength of each unit's chaos, α_i, is diverse. The evolution of α_i is defined as:

α_i = ᾱ + (ᾱ - α_min) tanh(βE_i), with ᾱ = (α_min + α_max)/2,   (15)

E_i = -x_i Σ_{j=1}^{N} σ_ij x_j,   (16)

where α_min, β, and ε are constant parameters. In the experiments below, they are set as: α_min = 3.4, β = 2.0, and ε = 0.1. In eqn (15), each α_i is controlled to be between α_min and α_max = 4.0. The evolution of the parameter α_i described in eqns (15) and (16) is done once every 16 time steps, i.e. t = 16, 32, 48, .... In this sense, we describe this evolution without using t. Although eqn (15) resembles the sigmoidal output function employed in the Hopfield network (Hopfield, 1984), its dynamics is quite different. In the Hopfield network, an output of the sigmoidal function directly affects the unit's value, as will be described by eqn (17). In our system, eqn (15) only controls the value of parameter α_i, which is the bifurcation parameter of the S-MAP. In this sense, eqn (15) affects the unit's value x_i in an indirect manner, i.e. through α_i.
FIGURE 10. Sample patterns. Each pattern is a binary pattern consisting of 100 units. The mark "o" corresponds to 1, and the mark "." corresponds to −1.
Let us show an example association process. When [System 1] memorizes the five 100-bit binary patterns shown in Figure 10, and the input binary vector I is a 35% reversed pattern of "A", it associates "A" after scores of transitions. Figures 11a and 11b show this association process. In Figure 11b, the abscissa denotes the association time t and the ordinate the time series of every unit's value. In this figure, highly chaotic motions are observed at the early association stage. As time elapses, these motions become quiet, and the association is completed when the system falls into a four-cluster frozen attractor. At this time, its binary representation O is successfully equivalent to "A", as Figure 11a shows.

Why does this system work well? An interpretation of the mechanism is introduced. Since E = Σ_i E_i is equivalent to the conventional energy function, we call E_i the ith partial energy. Our [System 1] searches for a local minimum of the energy function by making each partial energy small and negative, as follows. If E_i is high and positive, which means the ith unit value x_i does not suit the covariance matrix [σ_ij], α_i increases according to eqn (15), and the unit becomes disturbed. During the course of this disturbance, the unit-wise processing mode is changed from the preserving mode to the destroying mode. Before this mode change occurs, the unit is preserving its input, but once the mode is changed, the unit is disturbed enough to make a chaotic motion, which enables the unit to search for the proper state within the range. When the unit suits the covariance matrix, E_i becomes small and negative, and α_i becomes small. In this case, the unit-wise processing mode is changed from the destroying mode to the preserving mode, i.e. the disturbance fades away. When every unit's partial energy becomes small and negative, finally α_i = α_min for all i. At this time, [System 1] is equivalent to the original S-GCM, and the system is driven into an attractor, which will be a four-cluster frozen attractor. Thereby the output O is equal to the required memorized pattern.

Figure 11c shows the time series of every unit's partial energy E_i. This figure shows that, as time elapses, all of the partial energy values become negative, which makes the total energy Σ_i E_i a minimum. Figure 11d shows the time series of every unit's α value. As this figure shows, at the early stage of association, some of the α_i values are large, which makes those units search for a proper state. However, as time elapses, they become small, and finally all of them come to equal α_min, which makes the system equivalent to the S-GCM and frozen with four clusters.

FIGURE 11a. The input binary vector is a 35% reversed pattern of "A". As time elapses, the system state becomes close to "A", and finally the system retrieves "A".

FIGURE 11b. Time series of values. The time series of every unit's value is plotted every 16 time steps (t = 16, 32, 48, ...). Highly chaotic motions are observed at the early stage. As time elapses, these motions become quiet, and the system falls into a four-cluster frozen attractor.
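The energy interpretation can be checked directly: with the covariance matrix of eqn (12), the total energy is E = Σ_i E_i = −xᵀσx, and a memorized pattern sits at a deep minimum. A small numerical check (random stand-in patterns, fixed seed):

```python
import numpy as np

rng = np.random.default_rng(3)
m, N = 3, 100
patterns = rng.choice([-1, 1], size=(m, N))
sigma = patterns.T @ patterns / m        # covariance matrix, eqn (12)

def total_energy(x, sigma):
    """E = sum_i E_i = -x^T sigma x, with E_i from eqn (16)."""
    return -x @ sigma @ x

e_stored = total_energy(patterns[0].astype(float), sigma)
e_random = total_energy(rng.choice([-1, 1], size=N).astype(float), sigma)
# A stored pattern satisfies E <= -N^2/m, since its self-overlap alone
# contributes -(N^2)/m; a typical random pattern is far shallower.
```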
FIGURE 11c. Time series of partial energy. The time series of every unit's partial energy E_i is plotted every 16 time steps (t = 16, 32, 48, ...). As time elapses, all of the partial energy values become negative.
FIGURE 11d. Time series of α. The time series of every unit's α_i value is plotted every 16 time steps (t = 16, 32, 48, ...). At the early stage, some of the α_i values are large. As time elapses, they become small, and finally all of them come to equal α_min = 3.4.
In this experiment, parameter values are determined to be α_min = 3.4, ε = 0.1, so that the system can be driven into one of the four-cluster frozen attractors.^10 As mentioned in Section 3.3, the original S-GCM is unlikely to output biased patterns with parameter values in the ordered(2) phase. If we use parameter values in the ordered(4) phase, this defect is much improved, since the system's degree of freedom increases; as the degree of freedom increases, the system can provide more complexity.
4.3. Experiments
In the following experiments, our [System 1] is compared with the Hopfield network (Hopfield, 1984). Here, we use the asynchronous mean field theory equation (Peterson & Anderson, 1987) as the analog Hopfield network, which employs continuous activation values and asynchronous updating, such as the following:

x_i(t+1) = tanh( (1/T) Σ_{j=1}^{N} σ_ij x_j(t) ),    (17)

where the weight matrix [σ_ij] is given by eqn (12) and T is the temperature parameter.

10. As Figure 5a shows, the S-GCM with α = 3.4 and ε = 0.1 takes four-cluster frozen attractors.

4.3.1. Memory Capacity. It is well known that the memory capacity of the simplest Hopfield network can theoretically be estimated at 0.138N (Amit et al., 1987). Here, "memory capacity" means how many random patterns can be memorized by the network of N elements. Figure 12 shows the simulation results of the memory capacity P_max, both in the Hopfield network and in [System 1]. Each network succeeds in memorizing p random patterns if the probability of a bitwise flip after the transient period is less than 1.5% when the input I is one of the memorized patterns. Each experiment is done for 100 sets of randomly prepared patterns to be memorized. The memory capacity of the Hopfield network is estimated at 0.125N and agrees well with the theoretical result. On the other hand, our system has a much larger memory capacity, estimated at 0.186N. According to a theoretical analysis (Amit et al., 1987) of the simplest Hopfield network, if the probability of bitwise flips exceeds 1.5% before reaching an attractor, all memories become useless, i.e. a phase transition occurs. In our system, however, we cannot observe such a distinct phase transition; memories become useless much more slowly as the number of memories increases. Therefore, the above-mentioned criterion for estimating memory capacity is only for comparison. The "effective" memory capacity of our system is actually larger than the value estimated here; even when the number of memorized patterns exceeds the capacity, basins of attraction do exist, although they are relatively narrow. Further discussion will be presented in the next paper (Ishii, 1994).

4.3.2. Basin Volume. In this subsection, we compare our system with the Hopfield network in terms of "success rate", which indicates how successfully the
Chaotic Associative Memory
37
FIGURE 12. Memory capacity. Simulation results of the memory capacity both in the Hopfield network and in [System 1]. The memory capacity of the Hopfield network is estimated at 0.125N. The memory capacity of our system is estimated at 0.186N, which is 50% larger than that of the Hopfield network. Each network succeeds in memorizing p random patterns if the probability of a bitwise flip is less than 1.5% when the input is one of the memorized patterns. Each experiment is done for 100 sets of randomly prepared patterns to be memorized.
network can associate a target pattern when the distance of the input is known. Each network incorporates 100 units and memorizes the five alphabet patterns shown in Figure 10. The distance of the input pattern I from a target pattern ξ is measured by the initial overlap:

ol(I, ξ) = (1/N) Σ_{i=1}^{N} I_i ξ_i.    (18)

When the initial overlap is large (ol ≈ 1), the input I and the corresponding initial state x(0) are close to the target pattern and the association is easy. On the other hand, when the initial overlap is small (ol ≈ 0), the initial state is far from the target pattern and the association is difficult. Figures 13a, 13b, 13c, and 13d show the results obtained for the target patterns "A", "J", "P", and "T", respectively. As these figures show, our system has a higher success rate than the Hopfield network except in the case of target "J". Since "J" is a biased pattern, incorporating 25% 1s and 75% −1s, it is hard for our system, which is a modified S-GCM, to represent it. Accordingly, in most cases, our system has a higher success rate, i.e. a larger basin volume for each memorized pattern, than the Hopfield network.

4.4. Associative Memory Controlling ε

Now another associative memory system is shown (Ishii et al., 1993b).

[System 2]

x_i(t+1) = (1 − ε_i) f(x_i(t)) + (ε_i/N) Σ_{j=1}^{N} f(x_j(t))    (19)

f(x) = αx³ − αx + x    (20)

ε_i' = ε_i + (ε_i − ε_max) tanh(β E_i),    (21)

where α = 3.65 and β = 2.0. The definition of E_i is the same as in [System 1]. In eqn (21), each ε_i is controlled to lie between ε_min = 0.0 and ε_max = 0.2. As in [System 1], ε_i evolves once every 16 time steps. This system conducts the association process by controlling each ε_i, the strength of the interaction between the ith unit and all other units. When the partial energy is low, the interaction around the unit is activated and the unit-wise processing mode is set to the preserving mode. When the partial energy is high, the interaction is inactivated and the unit-wise processing mode is set to the destroying mode so as to search for the proper state. This system is almost equal to [System 1] in its ability as an associative memory system.

In order to understand the difference between [System 1] and [System 2], Figures 14a and 14b sketch
FIGURE 13. Success rate. The success rate is shown when the target pattern is (a) "A", (b) "J", (c) "P", and (d) "T". Each network incorporates 100 units and memorizes the five alphabet patterns shown in Figure 10. As these figures show, our system has a higher success rate than the Hopfield network in the cases of "A", "P", and "T". However, in the case of "J", the success rate of our system is lower.
out the mechanisms of [System 1] and [System 2], respectively. In [System 1], each parameter α_i is unit-wisely controlled so as to vary between the ordered(4) phase and the turbulent phase along the dotted arrow shown in Figure 14a.^11 As α_i goes up, it passes the point where the initial information is broken. In Figure 14a, this point is drawn as a thick dotted line in the turbulent phase, which corresponds to the band merge point, as mentioned in Section 3.2. When the unit's partial energy is low, the initial information provided to the unit is likely to be preserved. When the unit's partial energy is high, the initial information is destroyed and the proper code is searched for.

11. Note that α_min = 3.4 ≤ α_i ≤ α_max = 4.0 and ε = 0.1.
FIGURE 14a. [System 1] controlling α. In [System 1], α is controlled so as to vary between the ordered(4) phase and the turbulent phase. The dotted arrow schematically denotes our control method. The thick dotted line in the turbulent phase denotes the band merge points. On the left side of the dotted line, the system's mode is preserving. On the right side of it, the system's mode is destroying. When an association is successfully completed, [System 1] comes to be equivalent to the S-GCM with parameter values marked by "×", where the system is frozen with four clusters.
An important point here is that this control is done unit-wisely, although Figure 14a is a global phase diagram. In [System 2], each parameter ε_i is unit-wisely controlled so as to vary between the ordered(4) phase and the turbulent phase along the dotted arrow shown in Figure 14b.^12 As ε_i goes down, it passes the band merge point, where the initial information is broken. Accordingly, both [System 1] and [System 2] unit-wisely employ the two processing modes of the S-GCM. These mechanisms explain why our systems work well as associative memory systems.
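[System 2]'s coupling-strength control, eqn (21), can be sketched in the same spirit as the [System 1] sketch; the explicit clipping of ε_i to [ε_min, ε_max] is our assumption about how the stated bounds are enforced.

```python
import numpy as np

EPS_MIN, EPS_MAX, BETA = 0.0, 0.2, 2.0

def update_eps(eps, E):
    """Eqn (21): a unit with high positive partial energy E_i loses
    coupling strength (eps_i moves away from EPS_MAX toward 0, the
    turbulent 'destroying' side); a unit with negative E_i recovers
    coupling toward EPS_MAX (the ordered 'preserving' side)."""
    eps = eps + (eps - EPS_MAX) * np.tanh(BETA * E)
    return np.clip(eps, EPS_MIN, EPS_MAX)

eps = np.full(4, 0.1)
E = np.array([2.0, -2.0, 0.0, 10.0])   # illustrative partial energies
eps = update_eps(eps, E)
```

Note the fixed point of eqn (21) at ε_i = ε_max: once a unit fully suits the covariance matrix, its coupling saturates and the unit simply preserves its state.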
4.5. Other Models
In both of the above-mentioned associative memory systems, parameter values are set so that the system's equilibrium state is one of the four-cluster frozen attractors of the S-GCM. The main reason for this is to deal with biased patterns, as mentioned in Section 4.2. However, another advantage is as follows. If all the units are to be classified into four clusters, a two-bit code can be assigned to each unit, such as (1, 1), (1, −1), (−1, 1), and (−1, −1). This technique enables us to construct a double memory system. Since our S-GCM has a fairly large ordered(4) area and ordered(8) area, it is easy to assign a two- or three-bit code to each unit. Since these cluster divisions are based on the branch distinction in the period-doublings of the S-MAP, it is theoretically possible to make an infinite-bit memory system.

12. Note that ε_min = 0.0 ≤ ε_i ≤ ε_max = 0.2 and α = 3.65.
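The two-bit coding idea can be sketched directly: with four frozen clusters, each unit's cluster membership simultaneously stores one bit of each of two binary patterns. The cluster labels and code assignment below are illustrative.

```python
# Illustrative two-bit code per cluster: cluster index 0..3 encodes
# one bit of pattern A and one bit of pattern B at each unit.
CODES = {0: (1, 1), 1: (1, -1), 2: (-1, 1), 3: (-1, -1)}

def encode(pattern_a, pattern_b):
    """Map a pair of binary values to a cluster index 0..3."""
    inv = {code: c for c, code in CODES.items()}
    return [inv[(a, b)] for a, b in zip(pattern_a, pattern_b)]

def decode(clusters):
    """Recover the two binary patterns from the cluster indices."""
    pairs = [CODES[c] for c in clusters]
    return [a for a, _ in pairs], [b for _, b in pairs]

pa = [1, -1, 1, 1, -1]
pb = [-1, -1, 1, -1, 1]
clusters = encode(pa, pb)
ra, rb = decode(clusters)
```

An ordered(8) attractor would extend the same dictionary to eight three-bit codes, one per cluster.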
FIGURE 14b. [System 2] controlling e. In [System 2], ~ is controlled so as to vary between the ordered(4) phase and the turbulent phase. The dotted arrow schematically denotes our control method. When an association is successfully completed, [System 2] comes to be equivalent to the S-GCM with parameter values marked by " x " , where the system is frozen with four clusters.
Moreover, these branches are hierarchically structured. In this sense, such a multi-memory system is a kind of hierarchical memory system. Actually, we have already implemented a "double associative memory" system. The S-GCM model turned out to be useful in such new memory systems.
5. CONCLUSION

In this paper, we proposed a globally coupled map model for information-processing applications. Our S-GCM is obtained by modifying Kaneko's GCM model so that an S-MAP is employed instead of the logistic map. In the S-GCM, cluster frozen attractors are observed, each of which is taken to represent information. Next, we described the characteristics of the S-GCM which are important to information-processing applications. (a) The S-GCM falls into one of the cluster frozen attractors over a wide range of parameters. This is important to information-processing applications because of output stability and recurrence, and this feature makes it possible to apply our model to more intelligent memory systems, such as multi-memory systems. (b) The information can be preserved or broken by controlling parameters. By switching these two processing modes, our model can be applied to actual information-processing systems. (c) Cluster partitioning is restricted. This means that the representation of information has a limitation. This feature may be disadvantageous in some cases, but it can be avoided by converting the information representation. For example, in the case of associative memory systems, it is enough to
convert biased memorized patterns into ones that are not so biased.

We have also considered some actual information-processing applications, i.e. associative memory systems. Although these systems employ the conventional covariance learning rule, their association ability is better than that of the Hopfield network. This implies that our systems' dynamics is superior to that of the Hopfield network. In our systems, chaotic motions at the early stage of association allow the system to partially escape from spurious memories, i.e. local minima of the associative memory. Further discussion about this will be presented in the next paper (Ishii, 1994). Accordingly, in our systems, an efficient search is achieved with strong chaos at the early stage of association, while chaos is suppressed at the end of the association. This "annealing"-like mechanism seems to match our intuition concerning memory retrieval in biological neural networks. We now intend to apply our model to other information-processing problems, such as combinatorial optimization problems.

REFERENCES

Adachi, M., Aihara, K., & Kotani, M. (1993). An analysis of associative memory dynamics with a chaotic neural network. Proceedings of the International Symposium on Nonlinear Theory and its Applications, Hawaii, Dec., 1993 (pp. 1169-1172).
Aihara, K., Takabe, T., & Toyoda, M. (1990). Chaotic neural networks. Physics Letters A, 144, 333-340.
Amari, S. (1972). Characteristics of random nets of analog neuron-like elements. IEEE Transactions on Systems, Man and Cybernetics, SMC-2, 643-657.
Amit, D. J., Gutfreund, H., & Sompolinsky, H. (1987). Statistical mechanics of neural networks near saturation. Annals of Physics, 173, 30-67.
Devaney, R. L. (1989). An introduction to chaotic dynamical systems, 2nd ed. Reading, MA: Addison-Wesley.
Eckhorn, R., Bauer, R., Jordan, W., Brosch, M., Kruse, W., Munk, M., & Reitboeck, H. J. (1988). Coherent oscillations: A mechanism of feature linking in the visual cortex? Multiple electrode and correlation analysis in the cat. Biological Cybernetics, 60, 121-130.
Edwards, S. F., & Anderson, P. W. (1975). Theory of spin glasses. Journal of Physics F, 5, 965-974.
Gray, C. M., & Singer, W. (1989). Stimulus-specific neuronal oscillations in orientation columns of cat visual cortex. Proceedings of the National Academy of Sciences of the USA, 86, 1698-1702.
Grossberg, S., & Somers, D. (1991). Synchronized oscillations during cooperative feature linking in a cortical model of visual perception. Neural Networks, 4, 453-466.
Hebb, D. O. (1949). The organization of behavior. New York: John Wiley.
Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences of the USA, 81, 3088-3092.
Ishii, S. (1994). Eliminating spurious memories using a network of chaotic elements. ATR Technical Report, TR-H-106; also submitted to Journal of Intelligent & Fuzzy Systems.
Ishii, S., Fukumizu, K., & Watanabe, S. (1993a). Associative memory using spatiotemporal chaos. Proceedings of the International Joint Conference on Neural Networks, Nagoya, Oct., 1993 (pp. 2638-2641).
Ishii, S., Fukumizu, K., & Watanabe, S. (1993b). A globally coupled map model for information processing. Proceedings of the International Symposium on Nonlinear Theory and its Applications, Hawaii, Dec., 1993 (pp. 1157-1160).
Kaneko, K. (1984). Period-doubling of kink-antikink patterns, quasiperiodicity in antiferro-like structures and spatial intermittency in coupled logistic lattice. Progress of Theoretical Physics, 72, 480-486.
Kaneko, K. (1990). Clustering, coding, switching, hierarchical ordering, and control in a network of chaotic elements. Physica D, 41, 137-172.
Kohonen, T. (1977). Associative memory: A system-theoretical approach. Berlin, Heidelberg: Springer-Verlag.
Kuramoto, Y. (1991). Collective synchronization of pulse-coupled oscillators and excitable units. Physica D, 50, 15-30.
Li, Z., & Hopfield, J. J. (1989). Modeling the olfactory bulb and its neural oscillatory processing. Biological Cybernetics, 61, 379-392.
Nara, S., Davis, P., & Totsuji, H. (1993). Memory search using complex dynamics in a recurrent neural network model. Neural Networks, 6, 963-973.
Peterson, C., & Anderson, J. R. (1987). A mean field theory learning algorithm for neural networks. Complex Systems, 1, 995-1019.
Skarda, C. A., & Freeman, W. J. (1987). How brains make chaos in order to make sense of the world. Behavioral and Brain Sciences, 10, 161-195.
Sompolinsky, H., & Kanter, I. (1986). Temporal association in asymmetric neural networks. Physical Review Letters, 57, 2861-2864.
Sompolinsky, H., Crisanti, A., & Sommers, H. J. (1988). Chaos in random neural networks. Physical Review Letters, 61, 259-262.
Yao, Y., & Freeman, W. J. (1990). Model of biological pattern recognition with spatially chaotic dynamics. Neural Networks, 3, 153-170.