Episodic Associative Memories

Motonobu Hattori *, Masafumi Hagiwara

Department of Electrical Engineering, Faculty of Science and Technology, Keio University, Yokohama-shi, 223, Japan

* Corresponding author. Email: [email protected]

Neurocomputing 12 (1996) 1-18

Received 15 March 1994; accepted 26 January 1995

Abstract

In this paper, we propose Episodic Associative Memories (EAMs). They use Quick Learning for Bidirectional Associative Memory (QLBAM), which enables high memory capacity, and Pseudo-Noise (PN) sequences. In the learning of the proposed EAMs, PN sequences are used as one side of the training pairs. To store an episode, a scene of the episode is stored with a PN sequence, and the next scene is stored with the PN sequence shifted by one bit. Such a procedure enables episodic memory. The proposed EAMs can recall episodic associations by shifting the PN sequence one by one. The features of the proposed EAMs are: (1) they can memorize and recall episodic associations; (2) they can store plural episodes; (3) they have high memory capacity; (4) they are robust to noisy and incomplete inputs.

Keywords: Episodic Associative Memories (EAMs); Bidirectional Associative Memory (BAM); Quick Learning for BAM (QLBAM); Pseudo-Noise (PN) sequences

1. Introduction

A typical von Neumann type computer contains one central processing unit (CPU) and some memory units. A memory unit stores data in memory cells with one-dimensional addresses and accesses stored data by using those addresses. In such an architecture, however, access to memory has become a bottleneck which prevents high-speed processing. To break the limit of the von Neumann type, various computer architectures have been proposed.



The associative memory processor, which enables super parallel processing and access by memorized contents, is one of them [18-20]. On the other hand, in order to mimic the mechanism of human memory, much research on associative memories has been carried out in the field of artificial neural networks, and a lot of neural network models have been proposed [1-13,21-24]. Associative memories using neural networks have the following features to solve the bottleneck: (1) they use super parallel processing; (2) they are easy to self-organize; (3) their ability can be improved in proportion to the number of components such as neurons and synapses; (4) they are structurally stable because they store information in the synapses in a distributed form; (5) they can recall stored data even if the input is corrupted by noise or is incomplete.

However, there are three big problems when we want to mimic human memory with neural networks. The first problem is memory capacity. Most conventional associative memory models suffer from low memory capacity because they use Hebbian learning. Unless the training vectors are orthogonal, Hebbian learning does not guarantee the recall of all training pairs. To improve the memory capacity, Oh and Kothari have proposed the Pseudo-Relaxation Learning Algorithm for Bidirectional Associative Memory (PRLAB) [12,13]. Although it guarantees the recall of all training pairs, it requires many learning epochs because of its improper initial weights: random initial weights are used. For the purpose of reducing the learning epochs, we have already proposed the Quick Learning for Bidirectional Associative Memory (QLBAM) algorithm [23,24]. In the QLBAM algorithm, the Bidirectional Associative Memory (BAM) is first trained by Hebbian learning and then trained by the PRLAB. The QLBAM not only greatly reduces the number of learning epochs, but also improves the memory capacity more than ten-fold in comparison with Hebbian learning.

The second problem exists in the process of recall. The human brain can memorize plural matters simultaneously and recall them. However, it is very difficult to realize this function in associative memory models because of the interference of stored items. Thus, most of the conventional associative memory models enable only one-to-one associations. As one solution to this problem, we have already proposed the Improved Multidirectional Associative Memory [22]. It enables multiple associations even when training sets include common terms.

The third problem also exists in the process of recall. When one clue is given, the human brain recalls things related to the clue one after another. Hirai and Ma modeled a process of problem-solving by the Human Associative Processor (HASP) [5]. However, since the capacity of HASP is not large, it seems difficult to apply HASP to real world applications.

In this paper, we propose Episodic Associative Memories (EAMs) which enable episodic associations. The proposed EAMs use a Bidirectional Associative Memory (BAM) learned by the QLBAM algorithm, which has high memory capacity.


Fig. 1. Structure of BAM.

In addition, Pseudo-Noise (PN) sequences [16,17] are used as one side of the training pairs in learning. The proposed EAMs can recall episodic associations by shifting a PN sequence one by one. Moreover, plural episodes can be stored by using different PN sequences.

In Section 2, we briefly explain the Quick Learning for BAM (QLBAM) algorithm. In Section 3, the proposed Episodic Associative Memories (EAMs) are explained. Computer simulation results are shown in Section 4.

2. Quick Learning for Bidirectional Associative Memory

In this section, first the Bidirectional Associative Memory (BAM) [1] is briefly explained, and then the explanation of the Pseudo-Relaxation Learning Algorithm for BAM (PRLAB) [12,13] is followed by that of the Quick Learning for BAM (QLBAM) [23,24].

2.1. Bidirectional Associative Memory (BAM)

The BAM consists of two layers of neurons with feedback and symmetric synaptic connections between the layers (Fig. 1). The BAM behaves as a heteroassociative pattern matcher, storing and recalling pattern pairs. The pattern pairs are stored as bidirectionally stable states of the BAM. The BAM allows the retrieval of stored data associations from incomplete or noisy patterns. When presented with a noisy pattern, the recall process of the BAM can correctly reconstruct the pattern through a sequence of successive updates until it reaches a stable state. The feedback mechanism of the BAM helps to filter the noise as a pattern goes through successive updates, like reverberation in a human brain.


Information is stored in a BAM in the form of weights associated with the synaptic connections. In the original model, a correlation matrix was used to store a set of vector pairs $\{(X^{(k)}, Y^{(k)})\}_{k=1,\ldots,p}$:

$$W = \sum_{k=1}^{p} X^{(k)\mathrm{T}} Y^{(k)} \qquad (1)$$

where $\mathrm{T}$ represents the transpose operation. The use of such a correlation matrix for storing information in a neural network model is often called Hebbian learning. Suppose that the X-space is composed of $N$ neurons and the Y-space is composed of $M$ neurons; the recall rule is then represented as follows. At the $i$th neuron of the first layer, the next state $X_i'$ is determined from the current $Y$ by

$$X_i' = \phi\!\left( \sum_{j=1}^{M} W_{ij} Y_j \right), \quad i = 1, \ldots, N \qquad (2)$$

Similarly, at the $j$th neuron of the second layer, the next state $Y_j'$ is determined from the current $X$ by

$$Y_j' = \phi\!\left( \sum_{i=1}^{N} W_{ij} X_i \right), \quad j = 1, \ldots, M \qquad (3)$$

where $\phi(\cdot)$ represents the threshold logic function defined by

$$\phi(x) = \begin{cases} 1 & \text{for } x > 0 \\ -1 & \text{for } x < 0 \end{cases} \qquad (4)$$

If the input sum to a neuron equals 0, the neuron maintains its current state. A global state is called stable if $X_i' = X_i$ and $Y_j' = Y_j$ for $i = 1, \ldots, N$ and $j = 1, \ldots, M$.

The correlation matrix $W$ superimposes the information of several patterns on the same memory medium. However, unless the training vectors are orthogonal, the superimposition may introduce noise into the system and the recall of all training pairs is not guaranteed. Therefore, Hebbian learning based memory models generally have very low memory capacity.
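To make the recall dynamics concrete, the following minimal Python sketch (illustrative only, not taken from the paper; pattern sizes and names are assumptions) implements Hebbian storage (Eq. (1)) and the bidirectional recall rule of Eqs. (2)-(4).

```python
# Minimal sketch (illustrative only, not the authors' code) of Hebbian storage (Eq. (1))
# and the bidirectional recall rule of Eqs. (2)-(4); pattern sizes are assumptions.
import numpy as np

def threshold(u, prev):
    """Threshold logic of Eq. (4); a zero input sum keeps the previous state."""
    return np.where(u > 0, 1, np.where(u < 0, -1, prev))

def bam_recall(W, x, y, max_steps=20):
    """Alternate the updates of Eqs. (2) and (3) until a bidirectionally stable state."""
    for _ in range(max_steps):
        y_new = threshold(x @ W, y)      # Eq. (3): second layer from the first
        x_new = threshold(W @ y_new, x)  # Eq. (2): first layer from the second
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            return x_new, y_new
        x, y = x_new, y_new
    return x, y

# Store three random bipolar pairs with the correlation matrix of Eq. (1), then recall
# from a corrupted X pattern; recall of every pair is not guaranteed by Hebbian learning.
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(3, 16))    # three X patterns, N = 16
Y = rng.choice([-1, 1], size=(3, 8))     # three Y patterns, M = 8
W = X.T @ Y                              # Eq. (1): W = sum_k X^(k)T Y^(k)
x0 = X[0].copy()
x0[:2] *= -1                             # corrupt two bits
xr, yr = bam_recall(W, x0, np.ones(8, dtype=int))
print(np.array_equal(yr, Y[0]))          # True when the stored pair is recovered
```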

2.2. Pseudo-Relaxation Learning for BAM

Oh and Kothari have proposed the Pseudo-Relaxation Learning Algorithm for BAM (PRLAB), which guarantees the recall of all training pairs [12,13]. In this subsection, the mathematical background of the PRLAB is described first, followed by the explanation of the PRLAB itself.

2.2.1. Relaxation method

First, consider a consistent system of $m$ linear inequalities

$$x^i \cdot w + b^i \geq 0 \quad \text{for } i = 1, \ldots, m \qquad (5)$$

where $x^i \in \mathbb{R}^n$, $w \in \mathbb{R}^n$ and $b^i \in \mathbb{R}$. Each inequality defines a halfspace in $\mathbb{R}^n$:

$$H^i: \; x^i \cdot w + b^i \geq 0 \qquad (6)$$

Therefore, the feasible solution set of (5) is given by a convex polyhedron as follows:

$$C = \bigcap_{i=1}^{m} H^i \qquad (7)$$

Let $d(w, H^i)$ denote the Euclidean distance between $w$ and $H^i$ and define

$$d_{\max}(w) = \max\{ d(w, H^i) \mid i = 1, \ldots, m \} \qquad (8)$$

To find a solution of the system of inequalities, the relaxation procedure iteratively performs the following operation:

$$w^{q+1} = w^q - \lambda \, \frac{x^i \cdot w^q + b^i}{|x^i|^2} \, x^i \qquad (9)$$

by choosing a sequence of $H^i$ such that $d(w^q, H^i) > 0$. The sequence of points $\{w^q\}$ is called a relaxation sequence. The relaxation factor $\lambda$ is a constant between 0 and 2, and the starting point $w^0 \in \mathbb{R}^n$ is chosen arbitrarily. At each iteration, $w^q$ moves toward the halfspace along the direction of the normal to the plane of $H^i$. The method is often called under-relaxation if $\lambda \in (0, 1)$, over-relaxation if $\lambda \in (1, 2)$, or the projection method if $\lambda = 1$. The various relaxation procedures are illustrated in Fig. 2 for a two-dimensional case. If $\lambda \in (0, 2)$ and the relaxation sequence is chosen properly, the relaxation procedure is known to converge geometrically. The proof of the geometric convergence of the relaxation procedure is based on the following lemma.

Lemma 1 (Agmon [14]). Let $w^q \in \mathbb{R}^n$ and $w^q \notin H^i$. Let $w^{q+1}$ be a new point given by the relaxation procedure (9). Then, if $\lambda \in [0, 2]$, for all $w^* \in C$

$$|w^{q+1} - w^*|^2 \leq |w^q - w^*|^2 - \lambda(2 - \lambda)\, d^2(w^q, H^i)$$

where equality holds only for $\lambda = 0$ or for $w^*$ on the boundary of $H^i$.

Agmon [14], and Motzkin and Schoenberg [15], have proven the convergence of the maximal distance relaxation method, where the relaxation sequence, as defined in (9), is such that $d(w^q, H^i) = d_{\max}(w^q)$.

Fig. 2. Various relaxation procedures: (a) under-relaxation, (b) projection, (c) over-relaxation.


However, as seen in Fig. 2(a) or (b), this method may require infinitely many iterations to reach a solution. Even the over-relaxation method behaves similarly.

2.2.2. Pseudo-relaxation method

The maximal distance relaxation method requires infinitely many iterations, and it is not well suited to a neural network implementation, because each weight change operation requires global information in order to select the maximal distance hyperplane. Instead of choosing $w^{q+1}$ as the orthogonal projection of $w^q$ on the halfspace $H^i$ for which $d(w^q, H^i) = d_{\max}(w^q)$, the pseudo-relaxation method cycles through the sequence of halfspaces and performs the relaxation procedure (9) if $d(w^q, H^i) > \delta^i$ for some predetermined $\delta^i > 0$. Although this method does not necessarily give a solution of (5), when it terminates, $w^q$ is in the $\delta^i$-neighborhood of each $H^i$, i.e. $\forall i\; d(w^q, H^i) \leq \delta^i$.
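As an illustration, the following sketch (our own, not the authors' implementation; the halfspaces and tolerances are made-up examples) cycles through a small set of halfspaces and applies the relaxation step (9) only when the current point is farther than $\delta^i$ from a halfspace, terminating when every constraint is within its $\delta^i$-neighborhood.

```python
# Illustrative sketch (not the authors' implementation) of the pseudo-relaxation method:
# cycle through the halfspaces x^i . w + b^i >= 0 and apply the relaxation step (9) only
# when the current point is farther than delta^i from the halfspace.
import numpy as np

def pseudo_relaxation(xs, bs, deltas, lam=1.8, w0=None, max_sweeps=1000):
    xs = np.asarray(xs, dtype=float)         # shape (m, n): one row per inequality
    bs = np.asarray(bs, dtype=float)
    w = np.zeros(xs.shape[1]) if w0 is None else np.asarray(w0, dtype=float).copy()
    for _ in range(max_sweeps):
        changed = False
        for x, b, delta in zip(xs, bs, deltas):
            viol = x @ w + b                 # negative => w lies outside the halfspace
            dist = max(-viol, 0.0) / np.linalg.norm(x)
            if dist > delta:
                w = w - lam * viol / (x @ x) * x   # relaxation step, Eq. (9)
                changed = True
        if not changed:                      # every constraint within its delta-neighborhood
            return w
    return w

# Two halfspaces forming a narrow cone in R^2; the procedure terminates in a few sweeps.
xs = [[1.0, 0.1], [1.0, -0.1]]
bs = [0.0, 0.0]
print(pseudo_relaxation(xs, bs, deltas=[0.05, 0.05], w0=[-1.0, 0.0]))
```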

Theorem 1. Let $\{w^q\}$ be the sequence generated by the relaxation procedure (9) when the pseudo-relaxation method is used. If $\lambda \in (0, 2]$, then the sequence $\{w^q\}$ always terminates.

The proof of this theorem can be found in [12]. The pseudo-relaxation method turns out to be an efficient technique for solving the system of linear inequalities defined in (5) if the solution set is full dimensional, i.e. not contained in any hyperplane. The pseudo-relaxation method is applied as follows to find a solution to the system of linear inequalities (5):
(1) Let $H_\xi^i$ be the halfspace defined by $x^i \cdot w + b^i - \xi_i \geq 0$, where $\xi_i > 0$. Let $C_\xi = \bigcap_{i=1}^{m} H_\xi^i$, and $\delta_{\max} = \max_i d(H^i, H_\xi^i)$. The halfspaces $H^i$ and $H_\xi^i$ are defined by parallel hyperplanes, and $d(H^i, H_\xi^i)$ denotes the perpendicular distance between these hyperplanes. Note that $C_\xi \neq \emptyset$ as long as $C$ is full dimensional and $\delta_{\max}$ is sufficiently small. In particular, if $C$ is a convex polyhedral cone, $C_\xi$ is clearly not empty for any choice of $\xi_i$.
(2) Apply the pseudo-relaxation method to the system $\{H_\xi^i\}$ using $\delta^i = d(H^i, H_\xi^i)$.
(3) As proved in [11], pseudo-relaxation terminates at some $w^q$ in finitely many steps. The point $w^q$ then becomes a solution to the system of inequalities (5) because $d(w^q, H_\xi^i) \leq \delta^i$ and $\delta^i = d(H^i, H_\xi^i)$.

Fig. 3 gives an intuitive idea, in a two-dimensional case, of why the pseudo-relaxation method quickly finds a point in $H^1 \cap H^2$.

Fig. 3. Pseudo-relaxation procedure: (a) relaxation, (b) pseudo-relaxation.


As seen in Fig. 3(a), the maximal distance relaxation method goes from the starting point $w^0$, which is chosen arbitrarily, to the corner point in infinitely many steps if $\lambda \in (0, 1]$. On the contrary, as illustrated in Fig. 3(b), from the same starting point the pseudo-relaxation method contributes a natural effect of opening the 'solid angle' of the convex polyhedral cone $H^1 \cap H^2$.

2.2.3. Pseudo-Relaxation Learning Algorithm for BAM

Consider an N-M BAM with $N$ neurons in the first layer and $M$ neurons in the second layer. Let $W_{ij}$ be the connection strength between the $i$th neuron in the first layer and the $j$th neuron in the second layer. Let $\theta_{X_i}$ be the threshold for the $i$th neuron in the first layer and $\theta_{Y_j}$ be the threshold for the $j$th neuron in the second layer. Let $V = \{(X^{(k)}, Y^{(k)})\}_{k=1,\ldots,p}$ be a set of training vector pairs. Suppose that each training vector is described in bipolar mode, namely $X^{(k)} \in \{-1, 1\}^N$ and $Y^{(k)} \in \{-1, 1\}^M$. The vectors in $V$ are guaranteed to be recalled if the following system of linear inequalities is satisfied for all $k = 1, \ldots, p$:

$$\left( \sum_{i=1}^{N} W_{ij} X_i^{(k)} - \theta_{Y_j} \right) Y_j^{(k)} > 0 \quad \text{for } j = 1, \ldots, M \qquad (10)$$

$$\left( \sum_{j=1}^{M} W_{ij} Y_j^{(k)} - \theta_{X_i} \right) X_i^{(k)} > 0 \quad \text{for } i = 1, \ldots, N \qquad (11)$$

Namely, the goal of learning in a BAM is to find a feasible solution to this system of inequalities. Therefore, the pseudo-relaxation method can be applied to the learning in a BAM. If inequalities (10) and (11) have a feasible solution, then for any positive $\xi$ the following system of linear inequalities has a solution:

$$\left( \sum_{i=1}^{N} W_{ij} X_i^{(k)} - \theta_{Y_j} \right) Y_j^{(k)} - \xi \geq 0 \quad \text{for } j = 1, \ldots, M \qquad (12)$$

$$\left( \sum_{j=1}^{M} W_{ij} Y_j^{(k)} - \theta_{X_i} \right) X_i^{(k)} - \xi \geq 0 \quad \text{for } i = 1, \ldots, N \qquad (13)$$

PRLAB can be described as a pseudo-relaxation procedure applied to the system of inequalities defined by (12) and (13). PRLAB examines each training pair $(X^{(k)}, Y^{(k)})$ one by one systematically and changes the weights and threshold values if inequalities (10) and (11) are not satisfied. It iterates through the training pairs using the following adaptation rules. For the neurons in the first layer,

$$\Delta W_{ij} = -\frac{\lambda}{1+M} \left( S_{X_i}^{(k)} - \xi X_i^{(k)} \right) Y_j^{(k)} \quad \text{if } S_{X_i}^{(k)} X_i^{(k)} \leq 0 \qquad (14)$$

$$\Delta \theta_{X_i} = \frac{\lambda}{1+M} \left( S_{X_i}^{(k)} - \xi X_i^{(k)} \right) \quad \text{if } S_{X_i}^{(k)} X_i^{(k)} \leq 0 \qquad (15)$$

and for the neurons in the second layer,

$$\Delta W_{ij} = -\frac{\lambda}{1+N} \left( S_{Y_j}^{(k)} - \xi Y_j^{(k)} \right) X_i^{(k)} \quad \text{if } S_{Y_j}^{(k)} Y_j^{(k)} \leq 0 \qquad (16)$$

$$\Delta \theta_{Y_j} = \frac{\lambda}{1+N} \left( S_{Y_j}^{(k)} - \xi Y_j^{(k)} \right) \quad \text{if } S_{Y_j}^{(k)} Y_j^{(k)} \leq 0 \qquad (17)$$

where

$$S_{X_i}^{(k)} = \sum_{j=1}^{M} W_{ij} Y_j^{(k)} - \theta_{X_i} \quad \text{and} \quad S_{Y_j}^{(k)} = \sum_{i=1}^{N} W_{ij} X_i^{(k)} - \theta_{Y_j}.$$

PRLAB always finds a solution in finitely many steps if the system (10) and (11) is consistent. However, PRLAB needs many learning epochs because the initial values of the weights and thresholds are chosen randomly.
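The adaptation rules can be sketched as follows; this is a hedged Python reconstruction assuming bipolar patterns and a single shared weight matrix as in the text, following Eqs. (14)-(17) as reconstructed above, and is not the authors' code.

```python
# Hedged sketch of PRLAB in Python, assuming bipolar patterns and a single shared weight
# matrix W as in the text; the adaptation rules follow Eqs. (14)-(17) as reconstructed
# above. This is not the authors' code.
import numpy as np

def prlab(X, Y, W, th_x, th_y, lam=1.8, xi=10.0, max_epochs=500):
    """Iterate over the training pairs until inequalities (10) and (11) hold for all pairs."""
    W = np.asarray(W, dtype=float).copy()
    th_x = np.asarray(th_x, dtype=float).copy()
    th_y = np.asarray(th_y, dtype=float).copy()
    p, N = X.shape
    M = Y.shape[1]
    for epoch in range(max_epochs):
        updated = False
        for k in range(p):
            x, y = X[k], Y[k]
            Sx = W @ y - th_x                 # S_X^(k): net input to the first layer
            Sy = x @ W - th_y                 # S_Y^(k): net input to the second layer
            bad_x = (Sx * x) <= 0             # inequality (11) violated
            bad_y = (Sy * y) <= 0             # inequality (10) violated
            if bad_x.any():
                corr = (lam / (1 + M)) * (Sx - xi * x) * bad_x
                W -= np.outer(corr, y)        # Eq. (14)
                th_x += corr                  # Eq. (15)
                updated = True
            if bad_y.any():
                corr = (lam / (1 + N)) * (Sy - xi * y) * bad_y
                W -= np.outer(x, corr)        # Eq. (16)
                th_y += corr                  # Eq. (17)
                updated = True
        if not updated:                       # a full epoch without corrections: done
            return W, th_x, th_y, epoch
    return W, th_x, th_y, max_epochs
```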

2.3. Quick Learning for BAM (QLBAM)

To reduce the number of learning epochs required by the PRLAB, we have already proposed the QLBAM, which uses two-stage learning [23,24]. In the first stage, the BAM is trained by Hebbian learning; it is then trained by the Pseudo-Relaxation Learning Algorithm for BAM (PRLAB) [12,13]. The QLBAM substitutes Hebbian learning for the random initial weights which are used by the original PRLAB. Namely, the BAM is trained by a correlation matrix in the first stage. The connection weights from the first layer to the second layer are learned by

$$W = \sum_{k=1}^{p} X^{(k)\mathrm{T}} Y^{(k)} \qquad (18)$$

and the connection weights from the second layer to the first layer are learned as follows:

$$W' = \sum_{k=1}^{p} Y^{(k)\mathrm{T}} X^{(k)} \qquad (19)$$

where $X^{(k)\mathrm{T}}$ is the transpose of $X^{(k)}$, and $p$ is the number of training pairs to be stored. In the second stage, the BAM is trained by the PRLAB described above (Eqs. (14)-(17)).

As described in Section 2.1, the recall of all training pairs is generally not guaranteed by Hebbian learning (Eqs. (18),(19)). However, it is possible that some of the training pairs can be stored by Hebbian learning when the number of training pairs is small. Even if none of the training pairs can be stored completely by Hebbian learning, parts of the training pairs may be stored. Therefore, the number of weight updates is reduced by using Hebbian learning in the first stage compared to choosing the initial weights randomly. This contribution of Hebbian learning in the first stage reduces the learning epochs of the QLBAM.
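A minimal sketch of the first stage follows (illustrative; the function name and pattern sizes are ours). The second stage would simply apply the PRLAB updates of Eqs. (14)-(17), sketched in Section 2.2.3, starting from these Hebbian values instead of random ones.

```python
# Sketch of the QLBAM's first stage (Eqs. (18)-(19)): Hebbian correlation weights replace
# the random initial weights of the original PRLAB. Illustrative only; the second stage
# would run the PRLAB updates (Eqs. (14)-(17)) starting from these values.
import numpy as np

def hebbian_init(X, Y):
    """First stage of the QLBAM: W = sum_k X^(k)T Y^(k); thresholds start at zero."""
    W = X.T.astype(float) @ Y          # Eq. (18); Eq. (19) gives the reverse direction, W^T
    th_x = np.zeros(X.shape[1])        # first-layer thresholds
    th_y = np.zeros(Y.shape[1])        # second-layer thresholds
    return W, th_x, th_y

# With a small training set, many pairs are already (partially) stable after this stage,
# so the subsequent PRLAB stage needs far fewer corrective epochs than a random start.
rng = np.random.default_rng(1)
X = rng.choice([-1, 1], size=(4, 20))
Y = rng.choice([-1, 1], size=(4, 15))
W, th_x, th_y = hebbian_init(X, Y)
print(W.shape)                         # (20, 15)
```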


The remarkable features of the QLBAM are: (1) it requires far fewer learning epochs than the PRLAB; (2) it guarantees the recall of all training pairs; (3) it is robust to noisy inputs; (4) it greatly improves the memory capacity in comparison with Hebbian learning. The other characteristics of the QLBAM can be found in [24].

3. Episodic Associative Memories

In this section, first the features of Pseudo-Noise (PN) sequences [16,17] are briefly described, and then the behavior of the proposed Episodic Associative Memories (EAMs) is explained.

3.1. Pseudo-Noise (PN) sequences

PN sequences are mainly used in spread spectrum communications. PN sequences have properties like a random noise, yet they can be generated by using shift registers. Therefore, the generation process of PN sequences is not random but deterministic. In the learning of the proposed EAMs, maximal-length sequences (m-sequences), which are one kind of PN sequences, are used. M-sequences have a number of suitable properties which are useful to the proposed EAMs. Some of these properties are given here [16,17].

Property 1. An m-sequence contains one more $+1$ than $-1$. The number of ones in the sequence is $\frac{1}{2}(L + 1)$, where $L$ is the sequence period.

Property 2. The periodic autocorrelation function $R_{PN}(l)$ is two-valued and is given by

$$R_{PN}(l) = \begin{cases} 1 & \text{for } l \equiv 0 \pmod{L} \\ -\dfrac{1}{L} & \text{for } l \not\equiv 0 \pmod{L} \end{cases} \qquad (20)$$

where $l$ is any integer.

Property 3. The modulo-2 sum of an m-sequence and any phase shift of the same sequence is another phase of the same m-sequence (shift-and-add property).

Property 4. Define a run as a subsequence of identical symbols within the m-sequence. The length of this subsequence is the length of the run. Then, for any m-sequence, there are
(1) 1 run of ones of length $r$,
(2) 1 run of minus ones of length $r - 1$,
(3) 1 run of ones and 1 run of minus ones of length $r - 2$,
(4) 2 runs of ones and 2 runs of minus ones of length $r - 3$,
(5) 4 runs of ones and 4 runs of minus ones of length $r - 4$,
...
(r) $2^{r-3}$ runs of ones and $2^{r-3}$ runs of minus ones of length 1,
where $r$ is the degree of the m-sequence ($L = 2^r - 1$) and equals the number of shift registers.

In the learning of the QLBAM, if the training pairs contain the same pattern more than once, the systems (10) and (11) become inconsistent. In a PN sequence, the same pattern never appears when the sequence is shifted. This is the reason why PN sequences are suitable for the learning of the QLBAM.

3.2. Behavior of the proposed Episodic Associative Memories (EAMs)

In the learning of the proposed EAMs, PN sequences are used as one side of the training pairs. Namely, to store an episode, a scene of the episode is stored with a PN sequence, and the next scene is stored with the PN sequence shifted by one bit. Such a procedure enables episodic memory. In the proposed EAMs, shifts of the PN pattern represent the concept of time within an episode. Other episodes are memorized using different PN sequences. When a scene of an episode is given to one layer of the proposed EAMs, the PN pattern corresponding to the scene is recalled on the other layer. Then the sequential scenes of the episode are recalled by shifting the PN pattern one by one.
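The following sketch (illustrative only; the LFSR feedback taps and the scene labels are assumptions, and a plain dictionary stands in for the QLBAM-trained BAM) shows how an m-sequence can be generated by a shift register, how an episode is paired with successively shifted PN patterns, and how the episode is traversed by shifting the PN key one bit at a time.

```python
# Illustrative sketch (not the authors' code). It generates a period-15 bipolar m-sequence
# with a 4-stage LFSR (the feedback taps are an assumption; the paper does not give its
# generator), pairs each scene of an episode with a successively shifted PN pattern, and
# walks the episode by shifting the PN key one bit at a time. A plain dictionary stands in
# for the QLBAM-trained BAM that performs the scene <-> PN mappings in the real model.
import numpy as np

def m_sequence(degree=4, taps=(1, 4)):
    """Bipolar maximal-length sequence of period 2**degree - 1 from a simple LFSR."""
    state = [1] + [0] * (degree - 1)
    seq = []
    for _ in range(2 ** degree - 1):
        seq.append(1 if state[-1] == 1 else -1)   # map bits {1, 0} to {+1, -1}
        fb = 0
        for t in taps:
            fb ^= state[t - 1]                    # XOR of the tapped stages
        state = [fb] + state[:-1]
    return np.array(seq)

pn = m_sequence()                                 # period L = 15
assert pn.sum() == 1                              # Property 1: one more +1 than -1
assert len({tuple(np.roll(pn, k)) for k in range(15)}) == 15   # all shifts are distinct

# Store an episode by pairing its scenes with successively shifted PN patterns (Section 3.2).
episode = ["A", "B", "C", "D", "E", "F"]
training_pairs = [(scene, np.roll(pn, k)) for k, scene in enumerate(episode)]

# Recall: once the PN pattern of the current scene is obtained, shifting it by one bit
# gives the key of the next scene of the episode.
key_to_scene = {tuple(key): scene for scene, key in training_pairs}
key = np.roll(pn, 0)                              # PN pattern recalled for scene 'A'
for _ in episode:
    print(key_to_scene[tuple(key)], end=" ")      # prints: A B C D E F
    key = np.roll(key, 1)
```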


Fig. 4. Recall example of the proposed EAM.


Fig. 4 shows a recall example of the proposed EAM. Suppose that the stored episode contains six scenes, (A, B, C, D, E, F), and the corresponding PN patterns are $(PN_1, PN_2, PN_3, PN_4, PN_5, PN_6)$, where $PN_i$ is generated by shifting $PN_{i-1}$ by one bit (see Fig. 6, Layer-2). Thus the stored training set is:

$\{(A, PN_1), (B, PN_2), (C, PN_3), (D, PN_4), (E, PN_5), (F, PN_6)\}$.

When a corrupted scene A' is applied to the proposed EAM as an initial state (Fig. 4(1)), the network can correctly recall the corresponding PN pattern, $PN_1$ (Fig. 4(1')), and the correct pair is reconstructed (Fig. 4(1'')). Then, only by shifting the PN pattern one by one, the sequential scenes appear (Fig. 4(2)-(6)).

4. Computer simulation results

In this section, we demonstrate the effectiveness of the proposed Episodic Associative Memories (EAMs) by computer simulation.

4.1. Episodic associations

In this simulation, two episodes were stored in the proposed EAM by using different PN sequences. Then corrupted patterns were applied to the proposed EAM. Fig. 5 shows the two training episodes used for the simulation. The upper episode in Fig. 5 was stored by shifting the following PN sequence:

(1, 1, 1, 1, -1, 1, -1, 1, 1, -1, -1, 1, -1, -1, -1)

and the lower episode was stored by shifting a different PN sequence:

(1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1, 1, -1, 1, -1).

In the learning of the QLBAM, the parameters were set to $\lambda = 1.8$ and $\xi = 10.0$. The noisy inputs, which were corrupted by randomly reversing each bit with a probability of 10%, were then applied to the proposed EAM.

Fig. 5. Training sequences of two episodes (Episode 1 and Episode 2).
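For reference, the bit-reversal noise used here can be sketched as follows (illustrative only; the pattern size is an assumption).

```python
# Illustrative sketch of the noise model used in the simulations: each bit of a bipolar
# scene is reversed independently with a given probability (10% here, as in the text);
# the pattern size is an assumption.
import numpy as np

def corrupt(pattern, prob, rng):
    """Randomly reverse each element of a bipolar +/-1 pattern with probability `prob`."""
    flips = rng.random(pattern.shape) < prob
    return np.where(flips, -pattern, pattern)

rng = np.random.default_rng(0)
scene = rng.choice([-1, 1], size=49)         # e.g. a 7x7 scene flattened to a vector
noisy = corrupt(scene, prob=0.10, rng=rng)
print(int((noisy != scene).sum()), "bits reversed")
```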

Fig. 6. Simulation result by the proposed EAM: (a) a corrupted pattern of 'A' was applied to the proposed EAM; (b) a corrupted pattern of 'a' was applied to the proposed EAM. Both 'A' and 'a' were corrupted by randomly reversing each bit with a probability of 10% and then applied to the proposed EAM.

Fig. 6 shows the result of this experiment. In Fig. 6(a), a corrupted pattern of 'A' was applied to the proposed EAM as an initial state. As shown in this figure, the corresponding PN pattern was recalled correctly from the corrupted pattern and the whole sequence was recalled by shifting the PN pattern. Fig. 6(b) shows the result when a corrupted 'a' was applied to the same EAM as another initial state. As seen in this figure, the correct episode was recalled by the proposed EAM.

We carried out a similar simulation using the conventional learning algorithm, Hebbian learning, for comparison. Fig. 7 shows the result of this experiment. The network learned by only Hebbian learning could not recall the correct episodes because the memory capacity of Hebbian learning is extremely low compared with that of the Quick Learning for BAM (QLBAM). In fact, none of the training episodes in Fig. 5 could be stored by Hebbian learning. In contrast, the memory capacity of the proposed EAM is large: the EAM could store the two long sequences shown in Fig. 8 using two PN sequences whose period is 63. The network learned by only Hebbian learning could not store any of the patterns under the same simulation conditions.

Fig. 7. Simulation result by the network learned by only Hebbian learning: (a) an uncorrupted pattern of 'A' was applied; (b) an uncorrupted pattern of 'a' was applied.

Fig. 8. Two examples of long sequences.

Fig. 9. Simulation result by the proposed EAM: (a) an incomplete pattern of 'A' was applied to the proposed EAM; (b) an incomplete pattern of 'a' was applied to the proposed EAM. Gray squares represent lacking bits.

4.2. Recall from incomplete patterns

In this simulation, the two episodes shown in Fig. 5 were stored in the proposed EAM in the same way as in Section 4.1. Then incomplete patterns were applied to the proposed EAM. In this experiment, about 42% of the bits were lacking, and the values of the lacking bits were set to 0. Fig. 9 shows the result of this experiment. In Fig. 9(a), an incomplete pattern of 'A', which had a lacking part on the left side, was applied to the proposed EAM as an initial state. As seen in Fig. 9(a), the corresponding PN pattern could be recalled correctly and the whole sequence of the episode was recalled. In Fig. 9(b), an incomplete pattern of 'a', which had a lacking part on the upper side, was applied to the same EAM as another initial state. Fig. 9(b) shows that the correct episode could be recalled by the proposed EAM, even though the input was incomplete.

4.3. Noise reduction effect

We compared the noise reduction performance of the proposed EAM with that of the network learned by only Hebbian learning, using the upper episode in Fig. 5.


Fig. 10. Sensitivity to noise (perfect recall rate versus noise level, for the proposed EAM and for Hebbian learning): the upper episode in Fig. 5 was stored in the proposed EAM by using the QLBAM algorithm.

The parameters were $\lambda = 1.8$ and $\xi = 10.0$ for the proposed EAM. Each scene of the episode was corrupted by noise and then applied to the networks as an initial state. Fig. 10 shows the relation between the noise level and the perfect recall rate. The results are based on 200 trials of the experiment. As shown in Fig. 10, the proposed EAM, which was learned by the QLBAM algorithm, showed much better noise reduction performance than the network learned by Hebbian learning. In this experiment, the network learned by Hebbian learning could store only the $(A, PN_1)$ and $(C, PN_3)$ pairs because of its low memory capacity, even when the noise level was 0.

4.4. Feasible dimension of inputs

In this simulation, to find the feasible dimension of inputs, we examined the relation between the number of learning epochs and the dimension of the inputs. In this experiment, one episode consisting of randomly generated scenes was stored using each of three PN sequences whose periods are 15, 31 and 63. The number of scenes in an episode was the length of the PN sequence; for example, when the PN sequence whose period is 15 is used, the corresponding episode includes 15 scenes. The results are based on 100 trials of the experiment with $\lambda = 1.8$ and $\xi = 10.0$. In the QLBAM algorithm, the maximum number of learning epochs was 500. If the learning did not converge within 500 epochs, the trial was counted as a successful trial with 500 epochs, as suggested in ([21], p. 121). The results are shown in Fig. 11. As seen in Fig. 11, when the dimension of the inputs exceeds the length of the PN sequence, the QLBAM converges within a very small number of learning epochs. In addition, we observed that the QLBAM always converged within 500 epochs under this condition.


Fig. 11. Relation between the number of learning epochs and the dimension of the inputs. One episode consisting of randomly generated scenes was stored using three PN sequences whose periods are 15, 31 and 63. The number of scenes in an episode was the length of the PN sequence.

Therefore, we can say that the dimension of the inputs should be larger than the length of the PN sequence.

5. Conclusions

Episodic Associative Memories (EAMs) have been proposed and examined. In the proposed model, PN sequences are stored as one side of the training pairs of an episode using the Quick Learning for BAM (QLBAM) algorithm. The proposed EAMs can recall episodic associations by shifting the PN sequence one by one. The features of the proposed EAMs are: (1) they can memorize and recall episodic associations; (2) they can store plural episodes; (3) they have high memory capacity owing to the QLBAM algorithm; (4) they are robust to noisy and incomplete inputs. Since the proposed EAMs have the above remarkable features, they can be applied to real world applications such as databases, artificial intelligence, animation and so on. In addition, because the BAM is a special case of the MAM (Multidirectional Associative Memory) [10], the proposed EAMs can be applied to multilayer networks.

Acknowledgment

The authors would like to thank Prof. Masao Nakagawa of Keio University and the reviewers for helpful comments.


References

[1] B. Kosko, Bidirectional associative memories, IEEE Trans. Syst. Man Cybern. 18 (1) (1988) 49-60.
[2] K. Nakano, Associatron - a model of associative memory, IEEE Trans. Syst. Man Cybern. 2 (3) (1972) 380-388.
[3] Y. Hirai, A model of human associative processor (HASP), IEEE Trans. Syst. Man Cybern. 13 (5) (1983) 851-857.
[4] Y. Hirai, Mutually linked HASP's: a solution for constraint-satisfaction problems by associative processing, IEEE Trans. Syst. Man Cybern. 15 (3) (1985) 423-442.
[5] Y. Hirai and Q. Ma, Modeling the process of problem-solving by associative networks capable of improving the performance, Biol. Cybernet. 59 (1988) 353-365.
[6] D.E. Rumelhart, J.L. McClelland and the PDP Research Group, Parallel Distributed Processing (MIT Press, 1986).
[7] Y.F. Wang, J.B. Cruz, Jr. and J.H. Mulligan, Jr., Two coding strategies for bidirectional associative memory, IEEE Trans. Neural Networks 1 (1) (1990) 81-91.
[8] Y.F. Wang, J.B. Cruz, Jr. and J.H. Mulligan, Jr., Guaranteed recall of all training pairs for bidirectional associative memory, IEEE Trans. Neural Networks 2 (6) (1991) 559-567.
[9] X. Zhuang, Y. Huang and S. Chen, Better learning for bidirectional associative memory, Neural Networks 6 (1993) 1131-1146.
[10] M. Hagiwara, Multidirectional associative memory, Proc. IJCNN 1 (Washington D.C., 1990) 3-6.
[11] H. Oh and S.C. Kothari, A new learning approach to enhance the storage capacity of the Hopfield model, Proc. IJCNN (Singapore, 1991) 2056-2062.
[12] H. Oh and S.C. Kothari, Adaptation of relaxation method for learning in bidirectional associative memory, Tech. Rept. TR-91-25, Dept. of Computer Science, Iowa State University, 1991.
[13] H. Oh and S.C. Kothari, A pseudo-relaxation learning algorithm for bidirectional associative memory, Proc. IJCNN 2 (Baltimore, 1992) 208-213.
[14] S. Agmon, The relaxation method for linear inequalities, Canadian J. Math. 6 (3) (1954) 382-392.
[15] T.S. Motzkin and I.J. Schoenberg, The relaxation method for linear inequalities, Canadian J. Math. 6 (3) (1954) 393-404.
[16] R.E. Ziemer and R.L. Peterson, Digital Communications and Spread Spectrum Systems (Macmillan, 1985).
[17] M.K. Simon, J.K. Omura, R.A. Scholtz and B.K. Levitt, Spread Spectrum Communications, Vol. I (Computer Science Press, 1985).
[18] H. Kitano, S.F. Smith and T. Higuchi, GA-1: A parallel associative memory processor for rule learning with genetic algorithms, Proc. ICGA (1991) 311-317.
[19] T. Higuchi, T. Furuya, K. Handa and N. Takahashi, IXM2: A parallel associative processor, IEICE Japan J75-D-I (8) (1992) 615-625.
[20] S. Abe and Y. Tonomura, Scene retrieval method using temporal condition changes, IEICE Japan J75-D-II (3) (1992) 512-519.
[21] J. Hertz, A. Krogh and R.G. Palmer, Introduction to the Theory of Neural Computation (Addison-Wesley, 1991).
[22] M. Hattori, M. Hagiwara and M. Nakagawa, Improved multidirectional associative memories for training sets including common terms, Proc. IJCNN 2 (Baltimore, 1992) 172-177.
[23] M. Hattori, M. Hagiwara and M. Nakagawa, Quick learning for bidirectional associative memory, Proc. WCNN 2 (Portland, 1993) 297-300.
[24] M. Hattori, M. Hagiwara and M. Nakagawa, Quick learning for bidirectional associative memory, IEICE Japan E77-D (4) (1994) 385-392.


Masafumi Hagiwara is an Associate Professor at Keio University. He was born in Yokohama, Japan on October 29, 1959. He received the B.E., M.E., and Ph.D. degrees in Electrical Engineering from Keio University, Yokohama, Japan, in 1982, 1984 and 1987, respectively. In 1987, he became a research associate of Keio University. Since 1995 he has been an Associate Professor. From 1991 to 1993, he was a visiting scholar at Stanford University. He received the Niwa Memorial Award, the Shinohara Memorial Young Engineer Award, the IEEE Consumer Electronics Society Chester Sall Award and the Ando Memorial Young Engineer Award in 1986, 1987, 1990 and 1994, respectively. His research interests are neural networks, fuzzy systems, and genetic algorithms. Dr. Hagiwara is a member of the IEICE, IEEE, JNNS and INNS.