Pattern Recognition Letters 19 (1998) 1087-1102
A discrete sequential bidirectional associative memory for multistep pattern recognition

Donq-Liang Lee 1

Department of Electronic Engineering, Ta-Hwa Institute of Technology, Chung-Lin, Hsin-Chu, Taiwan 307, ROC

Received 1 October 1997; received in revised form 6 May 1998

1 Fax: +886 3 5922774; e-mail: [email protected].
Abstract

The Discrete Chainable Bidirectional Associative Memory (DCBAM) has been proposed for multistep pattern recognition. However, our experiments show that the state (environment output) of the DCBAM may converge to limit cycles, which limits the applicability of the DCBAM. We show in this letter that it is possible to realize multistep retrieval using conventional Discrete Bidirectional Associative Memories (DBAMs). The proposed method, called the Bidirectional Sequential Encoding Method (BSEM), is different from the two approaches proposed by Zhou and Quek (1996). Since the multistep retrieval problem can be transformed into a bidirectional linear separation problem, a Modified Minimum Overlap Algorithm (MMOA) is proposed to train the weight matrix of the DBAM. Because the encoding (learning) of the weight matrix is constructed from a different (sequential) viewpoint, we give the DBAM a new name, the Discrete Sequential BAM (DSBAM). Computer simulations show that, compared with the DCBAM, the DSBAM yields better performance in capacity as well as in recall capability. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Bidirectional associative memory; Neural network; Multistep retrieval
1. Introduction

Associative memories, characterized by information storage and recall, have attracted much interest in recent years because of their wide range of applications in areas such as content addressable memory (CAM) and pattern recognition. The Discrete Bidirectional Associative Memory (DBAM), a generalization of the discrete Hopfield model (Hopfield, 1982, 1984), was first proposed by Kosko (1987, 1988). The DBAM is a minimal two-layer nonlinear feedback network in which information between two neuron fields Fx and Fy flows bidirectionally through a connection matrix M and its transpose M^T. An important attribute of the DBAM is its ability to recall a stored pattern from a noisy or partial input: when presented with an erroneous memory pattern, the state of the DBAM evolves to the equilibrium representing the correct memory pattern. Many alternative techniques for single-step retrieval using the DBAM have since been proposed (Wang et al., 1991; Leung, 1993; Lee and Wang, 1993; Oh
and Kothari, 1994; Wang et al., 1994). However, the problem of multistep retrieval has received little attention in the literature. To resolve this, Zhou and Quek (1996) proposed a Discrete Chainable Bidirectional Associative Memory (DCBAM). The DCBAM can be regarded as an extension of Kosko's DBAM in which only additional chaining feedbacks between the two neuron fields are added. The DCBAM is capable of multistep retrieval-chaining. However, it is found by our experiment that the state of the DCBAM may converge to limit cycles (see Section 3); in other words, undesired oscillation may occur.

The main purpose of this letter is to show that it is possible to realize multistep retrieval with the conventional DBAM structure. A Bidirectional Sequential Encoding Method (BSEM) is proposed to encode the weight matrix. The proposed technique is different from the two methods mentioned by Zhou and Quek (1996). Since a multistep retrieval problem can be transformed into a bidirectional linear separation problem, a Modified Minimum Overlap Algorithm (MMOA) is proposed to train the weight matrix of the DBAM. Because the encoding (learning) of the weight matrix is constructed from a different (sequential) viewpoint, we give the DBAM a new name, the Discrete Sequential BAM (DSBAM). Computer simulation shows the validity of the proposed DSBAM. Moreover, since any matrix is bidirectionally stable (Kosko, 1988), no oscillatory state exists; hence, both the capacity and the recall capability are increased.

2. Review of previous results (DBAM and DCBAM)

2.1. DBAM

The DBAM is a minimal two-layer (two neuron fields) nonlinear feedback neural network. We denote one field as Fx, with state X, and the other field as Fy, with state Y. Consider m training pairs {(X^s, Y^s)}_{s=1}^{m}, where X^s = [x_1^s, x_2^s, ..., x_n^s]^T, Y^s = [y_1^s, y_2^s, ..., y_p^s]^T, and each x_i^s or y_j^s is a bipolar value (i.e., +1 or -1). The connection matrix W = [w_{ij}] of the DBAM is conventionally determined by using the outer-product rule (Kosko, 1987, 1988),

W = \sum_{s=1}^{m} X^s (Y^s)^T,   (1)
where T denotes the transpose of a vector or matrix. Given any initial state (X(0), Y(0)), a final state can be obtained by iteratively updating the network state through the connecting weights as follows:

y_j(t) = sgn{ \sum_{i=1}^{n} w_{ij} x_i(t-1) },   j = 1, ..., p,   (2a)

x_i(t) = sgn{ \sum_{j=1}^{p} w_{ij} y_j(t-1) },   i = 1, ..., n,   (2b)

where sgn{a} = 1 for a > 0, sgn{a} = -1 for a < 0, and sgn{a} remains unchanged for a = 0. Starting at any initial state (X(0), Y(0)), each update cycle (Eq. (2a) or Eq. (2b)) lowers the energy

E(X, Y) = -X^T W Y.   (3)

Kosko (1988) proved that, in a finite number of iterations, the network dynamic (Eqs. (2a) and (2b)) converges to one of the fixed points, whose energy is a local minimum. Ideally, such a fixed point is one of the training pairs. This property makes the DBAM a powerful tool for pattern recognition and information retrieval. However, the DBAM is unable to provide multistep retrieval using a matrix W as in Eq. (1).
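To make these dynamics concrete, the following minimal sketch (our own illustration, not code from the paper; numpy and all function names are ours) implements the outer-product encoding of Eq. (1) and the update rule of Eqs. (2a) and (2b), including the convention that sgn leaves a component unchanged when its argument is zero:

import numpy as np

def encode(X_pats, Y_pats):
    # Eq. (1): W = sum_s X^s (Y^s)^T for bipolar column vectors X^s, Y^s
    return sum(np.outer(x, y) for x, y in zip(X_pats, Y_pats))

def sgn(u, prev):
    # sgn{a} = +1 if a > 0, -1 if a < 0, unchanged if a = 0
    return np.where(u > 0, 1, np.where(u < 0, -1, prev)).astype(int)

def dbam_recall(W, x, y, max_steps=100):
    # Iterate Eqs. (2a) and (2b). Each half-step lowers E(X, Y) = -X^T W Y
    # (Eq. (3)), so a fixed point is reached in finitely many steps (Kosko, 1988).
    for _ in range(max_steps):
        y_new = sgn(W.T @ x, y)    # Eq. (2a): y_j = sgn{sum_i w_ij x_i(t-1)}
        x_new = sgn(W @ y_new, x)  # Eq. (2b): x_i = sgn{sum_j w_ij y_j(t-1)}
        if np.array_equal(x_new, x) and np.array_equal(y_new, y):
            break
        x, y = x_new, y_new
    return x, y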
2.2. DCBAM

As an extension of the DBAM, the DCBAM (Discrete Chainable BAM) has been proposed (Zhou and Quek, 1996) for multistep retrieval. The dynamic of the DCBAM is described by the following equations. For each neuron j in Fy,

net input:            y_j''(t) = \sum_{i=1}^{n} m_{ij} x_i'(t-1),   (4a)
net output:           y_j'(t) = C{y_j''(t)},                        (4b)
environment output:   y_j(t) = sgn{y_j''(t)},                       (4c)

where C{a} is the same as sgn{a} except that C{a} = 0 if a = 0. Similarly, for each neuron i in Fx, the following state equations are given:

net input:            x_i''(t) = \sum_{j=1}^{n} m_{ij} y_j'(t-1),   (5a)
net output:           x_i'(t) = C{x_i''(t) + r_{x_i}(t)},           (5b)
environment output:   x_i(t) = sgn{x_i''(t) + r_{x_i}(t)},          (5c)
feedback signal:      r_{x_i}(t) = y_i''(t-1).                      (5d)

The dynamic described by Eqs. (4a)-(4c) and (5a)-(5d) is called the forward recalling mode. In the backward mode, one instead feeds the intermediate association states obtained at Fx, r_{y_j}(t) = x_j''(t-1), back into the neurons in Fy (Zhou and Quek, 1996). Only the forward mode is discussed in the remainder of this paper. Moreover, since there exist chaining feedbacks between Fx and Fy, the dimensions n and p must be the same, n = p.

Zhou and Quek (1996) show that the DCBAM is capable of automatic multistep retrieval. For example, let {V^1 → V^2 → ... → V^m} denote a pattern sequence of length m which is to be stored in a DCBAM. They show that the DCBAM is able to retrieve the pairs {(V^{k-1}, V^k)}, k = 2, ..., m, once the m - 1 training pairs {(X^1, Y^1) = (V^1, V^2), (X^2, Y^2) = (V^2, V^3), ..., (X^{m-1}, Y^{m-1}) = (V^{m-1}, V^m)} have been encoded in W by Eq. (1) and an initial pattern X(0) is presented to Fx. The idea behind the additional chaining feedbacks is to feed the intermediate output (e.g., Y) back to the input layer (e.g., X) and then use the updated input pattern to reverberatively recall the next pattern. However, it is found by our experiment that the state of a DCBAM may converge to limit cycles (see Section 3). In other words, undesired oscillation may occur.
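The forward recalling mode of Eqs. (4a)-(5d) can be sketched as follows (again our own illustration under the conventions stated above: odd time steps update Fy, even steps update Fx, and the chaining feedback of Eq. (5d) injects the previous net input of Fy):

import numpy as np

def C(u):
    # C{a} = +1 if a > 0, -1 if a < 0, 0 if a = 0
    return np.sign(u).astype(int)

def sgn(u, prev):
    return np.where(u > 0, 1, np.where(u < 0, -1, prev)).astype(int)

def dcbam_forward(W, x0, steps):
    n = len(x0)                        # the chaining feedback requires n = p
    x_out, x_env = x0.copy(), x0.copy()
    y_out, y_env = np.zeros(n, int), np.zeros(n, int)
    y_net = np.zeros(n)
    history = []
    for t in range(1, steps + 1):
        if t % 2 == 1:                 # update field Fy
            y_net = W.T @ x_out        # Eq. (4a): y_j'' = sum_i m_ij x_i'
            y_out = C(y_net)           # Eq. (4b): net output
            y_env = sgn(y_net, y_env)  # Eq. (4c): environment output
            history.append(('Y', t, y_env.copy()))
        else:                          # update field Fx
            u = W @ y_out + y_net      # Eqs. (5a) + (5d): x'' plus feedback r_x = y''(t-1)
            x_out = C(u)               # Eq. (5b)
            x_env = sgn(u, x_env)      # Eq. (5c)
            history.append(('X', t, x_env.copy()))
    return history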
3. Stability of the DCBAM

Unlike the DBAM, the state of a DCBAM may converge to limit cycles; that is, the sequence (X(t), Y(t)) may not converge to fixed points. Let us examine the following example.

Example 1. Suppose we want to store a transient sequence of states leading to an attractor,

V^1 → V^2 → V^3 → V^4,

where V^1 = [-1 1 1 -1 1]^T, V^2 = [-1 1 -1 -1 -1]^T, V^3 = [1 -1 -1 -1 -1]^T, V^4 = [1 1 -1 1 -1]^T. Here V^4 is desired to be the only stable attractor; the others can be referred to as transient (unstable) attractors. Following Zhou and Quek (1996), one can let (X^1, Y^1) = (V^1, V^2), (X^2, Y^2) = (V^2, V^3), (X^3, Y^3) = (V^3, V^4) and obtain
W = [  1   1   1   3   1
      -1  -1  -1  -3  -1
      -3   1   1  -1   1
      -1  -1   3   1   3
      -3   1   1  -1   1 ]   (6)
by Eq. (1). One can check that, with the above weight matrix, the three training pairs are all fixed points of the DBAM. However, let X(0) = [-1 -1 1 -1 1]^T be an initial state of the DCBAM having the same weight matrix; one then obtains the following sequence of updates:

Y'(1) = Y(1) = [-1 1 -1 -1 -1]^T,
X'(2) = X(2) = [-1 1 1 -1 1]^T,
Y'(3) = Y(3) = Y(1),
X'(4) = [-1 1 0 -1 0]^T,  X(4) = X(2),
Y'(5) = Y(5) = [-1 -1 -1 -1 -1]^T,
X'(6) = X(6) = [-1 1 -1 -1 -1]^T,
Y'(7) = Y(7) = [1 -1 -1 -1 -1]^T,
X'(8) = [0 1 -1 -1 -1]^T,  X(8) = X(6),
Y'(9) = Y(9) = Y(7),
X'(10) = X(10) = [1 1 -1 -1 -1]^T,
Y'(11) = Y(11) = [1 -1 -1 1 -1]^T,
X'(12) = X(12) = [1 -1 -1 -1 -1]^T,
Y'(13) = Y(13) = [1 1 -1 1 -1]^T,
X'(14) = [1 -1 -1 0 -1]^T,  X(14) = X(12),
Y'(15) = [1 0 0 1 0]^T,  Y(15) = Y(13),
X'(16) = X(16) = -X(2),
Y'(2g-1) = -Y'(2g-15),  Y(2g-1) = -Y(2g-15),  X'(2g) = -X'(2g-14),  X(2g) = -X(2g-14),  for g >= 9.

From the above observation we can conclude that

Y'(2g-1) = Y'(2g-29),  Y(2g-1) = Y(2g-29),  X'(2g) = X'(2g-28),  X(2g) = X(2g-28),  for g >= 16.
It is obvious that a limit cycle of period (length) 28 occurs. Moreover, we tested W with all 32 (2^5 = 32) possible bipolar states in the state space and found that all 32 states converge to the limit cycle described above. We can thus conclude that the state of the DCBAM may converge to a limit cycle; in fact, in this example there is no stable attractor within the entire state space of the DCBAM. In the following section we show that it is possible to realize multistep retrieval with the conventional DBAM structure. Since the encoding (learning) of the weight matrix is constructed from a different (sequential) viewpoint, we give the DBAM a new name, the Discrete Sequential BAM (DSBAM).
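The oscillation is easy to verify numerically. The sketch below (ours) encodes the three training pairs of Example 1 by Eq. (1) and iterates the forward mode of Eqs. (4a)-(5d), stopping at the first repeat of the full network state; with the initial state used above it reports the period-28 limit cycle:

import numpy as np

def C(u): return np.sign(u).astype(int)
def sgn(u, prev): return np.where(u > 0, 1, np.where(u < 0, -1, prev)).astype(int)

V1 = np.array([-1,  1,  1, -1,  1]); V2 = np.array([-1,  1, -1, -1, -1])
V3 = np.array([ 1, -1, -1, -1, -1]); V4 = np.array([ 1,  1, -1,  1, -1])

# Eq. (1) with (X^s, Y^s) = (V^s, V^{s+1}); this reproduces the matrix of Eq. (6)
W = np.outer(V1, V2) + np.outer(V2, V3) + np.outer(V3, V4)

x_out = np.array([-1, -1, 1, -1, 1])        # X(0) of Example 1
x_env = x_out.copy()
y_env = np.zeros(5, int)
seen = {}
for g in range(1, 100):                     # one g = one Fy update, then one Fx update
    y_net = W.T @ x_out                     # Eq. (4a)
    y_out, y_env = C(y_net), sgn(y_net, y_env)
    u = W @ y_out + y_net                   # Eqs. (5a) + (5d)
    x_out, x_env = C(u), sgn(u, x_env)
    state = (tuple(x_out), tuple(x_env), tuple(y_env))  # determines all later updates
    if state in seen:
        print('limit cycle of period', 2 * (g - seen[state]), 'steps')  # 28 here
        break
    seen[state] = g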
4. Discrete sequential bidirectional associative memory

Consider the sequence from V^1 to V^m and the following updating rule:

V(t+1) = sgn{Ŵ V(t)},     for t zero or even,   (7a)
V(t+1) = sgn{Ŵ^T V(t)},   for t odd.            (7b)

Eqs. (7a) and (7b) are the same as Eqs. (2a) and (2b) except that a different weight matrix Ŵ and a unified state vector V are used. In fact, multistep retrieval from V^1 to V^m is possible if one can find a matrix Ŵ satisfying the following mapping:

V^2 = sgn{Ŵ V^1},           V^3 = sgn{Ŵ^T V^2},
V^4 = sgn{Ŵ V^3},           V^5 = sgn{Ŵ^T V^4},
...
V^{m-1} = sgn{Ŵ^T V^{m-2}}, V^m = sgn{Ŵ V^{m-1}},  V^{m-1} = sgn{Ŵ^T V^m}.

Here m is assumed to be an even number. The DBAM, in this case, can be called a Discrete Sequential BAM (DSBAM), because a pattern sequence, instead of heteroassociative pattern pairs, is to be stored in the matrix Ŵ. Notice that, in the above representation, V^m is the only stable attractor; the others are all transient (unstable) attractors. Therefore, it is possible to realize the following relation (i.e., a transient sequence of states leading to an attractor):

V^1 → V^2 → V^3 → ... → V^{m-1} ↔ V^m,

by the DSBAM if a solution matrix Ŵ exists.

4.1. Bidirectional sequential encoding method

Let V^{m+1} = V^{m-1} be an augmented vector. The sequence can be recalled by Eqs. (7a) and (7b) if one can find a matrix Ŵ such that

sgn{ \sum_{i=1}^{n} ŵ_{ij} v_i^k } = v_j^{k+1},   k = 1, 3, ..., m-1 if m is even;  k = 1, 3, ..., m if m is odd,   (8a)

sgn{ \sum_{j=1}^{n} ŵ_{ij} v_j^k } = v_i^{k+1},   k = 2, 4, ..., m if m is even;  k = 2, 4, ..., m-1 if m is odd.   (8b)
Note that V^m will be the stable attractor if V^{m+1} = V^{m-1}. Moreover, since every matrix is bidirectionally stable (Kosko, 1988), we expect that no limit cycle will occur. Now define the following transformations:

L_j^u = V^{2u-1} v_j^{2u},   u = 1, 2, ..., m/2 if m is even;  u = 1, 2, ..., (m+1)/2 if m is odd,   (9a)

Z_i^u = V^{2u} v_i^{2u+1},   u = 1, 2, ..., m/2 if m is even;  u = 1, 2, ..., (m-1)/2 if m is odd.   (9b)
Then the problem of Eqs. (8a) and (8b) can be rewritten as finding a matrix Ŵ such that

L_j^u · Ŵ_j^c > 0,   j = 1, ..., n;  u = 1, 2, ..., m/2 if m is even,  u = 1, 2, ..., (m+1)/2 if m is odd,   (10a)

Z_i^u · Ŵ_i^r > 0,   i = 1, ..., n;  u = 1, 2, ..., m/2 if m is even,  u = 1, 2, ..., (m-1)/2 if m is odd,   (10b)

where Ŵ_j^c and Ŵ_i^r denote the jth column and the ith row of Ŵ, respectively, and "·" is the inner-product operator. The problem defined by Eqs. (10a) and (10b) is a bidirectional linear separation problem (or bidirectional perceptron problem), since each component ŵ_{ij} appears in both Eq. (10a) and Eq. (10b). To take care of the bidirectional learning, the conventional minimum overlap perceptron algorithm (Krauth and Mezard, 1987) can be modified to train the DSBAM.

4.2. Modified minimum overlap algorithm
m ÿ 1=2 if m is odd; where W^jc and W^i r denote the jth column and the ith row of W^ , respectively, and ``'' is the inner product operator. The problem de®ned by Eqs. (10a) and (10b) is a bidirectional linear separation problem (or bidirectional perceptron problem) since each component wij appears on both Eqs. (10a) and (10b). To take care of the bidirectional learning, the conventional minimum overlap perceptron algorithm (Krauth and Mezard, 1987) can be modi®ed to train the DSBAM. 4.2. Modi®ed minimum overlap algorithm The Minimum Overlap Algorithm (MOA, Krauth and Mezard, 1987) is an improved version of the Perceptron (Minsky and Papert, 1969) algorithm (PA). Given two training sets of patterns belonging, respectively, to two classes. Both algorithms are tuned to generate a decision hyperplane according to patterns in the training sets. The PA makes a change in the decision hyperplane if and only if the pattern being considered at a training step is misclassi®ed by the hyperplane at that step. Moreover, patterns are repeatedly and sequentially considered, one at a time (learning step), until none is misclassi®ed by the hyperplane. The MOA diers the PA in that all patterns are considered at each step. Moreover, among the misclassi®ed patterns at a step, it chooses the one which has a minimal overlap (or maximal distance) for that step (Krauth and Mezard, 1987). If the two classes are linearly separable, both algorithms will terminate with a correct decision hyperplane after a ®nite number of steps. However, unlike the PA, the hyperplane obtained from the MOA is optimal in the sense of stability. In other words, the MOA is tuned to maximize the classi®cation margin. Another famous variation of the PA is the Pocket Algorithm (Gallent, 1986) which can be used to handle non-separable training patterns in an optimal sense. However, it minimizes the number of misclassi®ed patterns but not the classi®cation margin. For this reason, only the MOA is discussed in the following. A DSBAM can be thought as a combination of 2n dependent perceptrons because the information be^ ij appears on tween two neuron ®elds ¯ows bidirectionally through W^ and W^ T . Since each component w both Eqs. (10a) and (10b), we proposed the following modi®ed MOA (MMOA) to train the DSBAM. Step 1. Set the initial weight W^
0 0 and select a ®xed positive number c. Step 2. Determine Lj and Zi that have minimum overlap with W^jc and W^i r , respectively, i.e., Lj W^jc min Luj W^jc ;
11a
Zi W^i r min Ziu W^i r :
11b
u
u
Step 3. If Lj W^jc > c and Zi W^i r > c, then stop. Otherwise, update W^ by W^
t 1 W^
t
1=nfF
Lj W^jc
Lj T F
Zi W^i r Zi g;
12
where F
a 0 if a > c, and F
a 1 if a 6 c. Go to Step 2. In Step 1, c is a number concerning stability margin. An optimal solution is obtained by choosing a large c. However, the number of required iterations is increased almost linearly with the size of c. With the aid of the MMOA, let us train a DSBAM to store the same sequence in Example 1. Let c 0.1, we obtain
Fig. 1. Results of the storage capacity test for DCBAMs and DSBAMs.
Fig. 2. Average convergence rates to the stored stable attractor versus length of the stored sequence for DCBAMs and DSBAMs.
Fig. 3. Average convergence rates to spurious stable attractors versus length of the stored sequence for DCBAMs and DSBAMs.
Ŵ = [  0.2    0.2   -0.2    0.2   -0.2
      -0.2   -0.2    0.2   -0.2    0.2
      -0.2   -0.2    0.2   -0.2    0.2
       0.2   -1      1      0.2    1
      -0.2   -0.2    0.2   -0.2    0.2 ]
by six iterations of the MMOA. Now consider the same initial state V(0) = [-1 -1 1 -1 1]^T as in Example 1. By Eqs. (7a) and (7b), one has

V(1) = sgn{Ŵ V(0)} = [-1 1 -1 -1 -1]^T = V^2,
V(2) = sgn{Ŵ^T V(1)} = [1 -1 -1 -1 -1]^T = V^3,
V(3) = sgn{Ŵ V(2)} = [1 1 -1 1 -1]^T = V^4,
V(4) = sgn{Ŵ^T V(3)} = V^3,
V(2g-1) = sgn{Ŵ V(2g-2)} = V^4,  V(2g) = sgn{Ŵ^T V(2g-1)} = V^3,  for g >= 2.
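For completeness, the BSEM transformations of Eqs. (9a) and (9b) and the MMOA of Eqs. (11a), (11b) and (12) can be sketched as follows. This is our reading of the procedure (function names and the sweep order over columns and rows are ours): every column j whose minimum-overlap pattern violates the margin receives (1/n)L_j, every such row i receives (1/n)Z_i, and recall then follows Eqs. (7a) and (7b), with sgn{Ŵ V} taken componentwise on \sum_i ŵ_{ij} v_i:

import numpy as np

def bsem_patterns(seq):
    # Eqs. (9a), (9b): L_j^u = v_j^{2u} V^{2u-1} and Z_i^u = v_i^{2u+1} V^{2u},
    # using the augmented vector V^{m+1} = V^{m-1}
    V = {k + 1: np.asarray(v) for k, v in enumerate(seq)}
    m, n = len(seq), len(seq[0])
    V[m + 1] = V[m - 1]
    L = [[V[k + 1][j] * V[k] for k in range(1, m + 1, 2)] for j in range(n)]
    Z = [[V[k + 1][i] * V[k] for k in range(2, m + 1, 2)] for i in range(n)]
    return L, Z

def mmoa(L, Z, c=0.1, max_cycles=500):
    n = len(L)
    W = np.zeros((n, n))
    for _ in range(max_cycles):
        stable = True
        for j in range(n):                          # Eq. (11a)
            p = min(L[j], key=lambda q: q @ W[:, j])
            if p @ W[:, j] <= c:                    # margin test of Step 3
                W[:, j] += p / n                    # column part of Eq. (12)
                stable = False
        for i in range(n):                          # Eq. (11b)
            p = min(Z[i], key=lambda q: q @ W[i, :])
            if p @ W[i, :] <= c:
                W[i, :] += p / n                    # row part of Eq. (12)
                stable = False
        if stable:
            break
    return W

def dsbam_recall(W, v, steps):
    v = np.asarray(v).copy()
    for t in range(steps):
        u = W.T @ v if t % 2 == 0 else W @ v        # Eqs. (7a)/(7b)
        v = np.where(u > 0, 1, np.where(u < 0, -1, v)).astype(int)
    return v

If the training loop terminates, the resulting matrix satisfies Eqs. (10a) and (10b), so the stored chain V^1 → V^2 → V^3 → V^4 is realized and recall ends in the alternating pair (V^3, V^4). The particular entries of Ŵ depend on the update order, so a matrix found this way need not coincide entry-by-entry with the one printed above.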
Table 1
Rate of convergence to oscillatory states for the DCBAM

Sequence length (m):    1    2    3     4     5     6     7     8
Convergence rate (%):   0    0    42.3  20.6  21.8  22.9  22.4  23.7
Fig. 4. Three pattern sequences for the pattern sequence recognition example.
It is obvious that the DSBAM converges to the stable attractor V^4 in two iterations. State-space statistics as in Example 1 were also collected: half of the 32 states converge to V^4 and the other half converge to -V^4; no limit cycle occurs. Note that -V^4 will also be a stable attractor whenever V^4 is, because sgn{Ŵ(-V)} = -sgn{Ŵ V} and sgn{Ŵ^T(-V)} = -sgn{Ŵ^T V}. In fact, if state V is attracted by V^4, then state -V will be attracted by -V^4, and the state trajectory from V to V^4 and the state trajectory from -V to -V^4 are complements of each other.

5. Computer simulations

To ascertain the improvement in memory capacity and recall probability, a large number of computer simulations have been performed. The parameter values used are n = 20 and c = 0 for Sections 5.1 and 5.2.

5.1. Capacity test

We varied the number of training patterns m for DCBAMs and DSBAMs. For each m, there are 100 independent sets, and each set contains m training patterns. The m patterns constitute a vector sequence of
Table 2
Recall probability against Hamming distance for the DSBAM

Hamming distance   1     3     5     7     9     11    13    15    17    19    21
Ṽ^1 → V^5          0.91  0.50  0.38  0.26  0.14  0.13  0.08  0.07  0.05  0.04  0.01
Ṽ^7 → V^11         0.90  0.66  0.49  0.34  0.27  0.24  0.18  0.15  0.11  0.11  0.08
Ṽ^13 → V^17        0.92  0.75  0.65  0.52  0.40  0.38  0.29  0.27  0.26  0.22  0.19
length m. For each element of the vectors in a sequence, the probabilities of being +1 and -1 are equal. Since the basic requirement for successful recall in a DCBAM is that each training pair be a fixed point of Eqs. (2a) and (2b), the memory capacity of a DCBAM equals that of Kosko's DBAM. In other words, for DCBAMs, if each pair of a training set is a fixed point of Eqs. (2a) and (2b), the set is called a successful set. On the other hand, we use the BSEM with the MMOA to encode the weight matrices of the DSBAMs. Since the MMOA will not terminate when a solution matrix does not exist, the maximum number of learning cycles for the MMOA is set to 500; that is, the MMOA is terminated after 500 learning cycles whether the set is a successful set or not. The percentages of successes are shown in Fig. 1. Compared with the DCBAM, the DSBAM offers higher storage capacity: the DCBAM has over 90% successful sets only for m <= 6, whereas the DSBAM maintains this up to m <= 10.

5.2. Recall capability test

The capabilities and performance characteristics of the DCBAMs and the DSBAMs are investigated here. The simulation conditions are similar to the above. For each m, 100 successful sets of DCBAMs and DSBAMs were generated, respectively. For each successful set, a weight matrix was encoded (Eq. (1) for the DCBAM, the MMOA for the DSBAM). After the weight matrices had been encoded, we tested the DCBAM and DSBAM with 1000 random patterns (the probabilities of being +1 and -1 are equal for all components). Simulation results are presented in Figs. 2 and 3 and Table 1; each datum is obtained from the average of 1000 results. The results clearly depict the superior performance of the DSBAM over the DCBAM. For example, when m = 5, the DSBAM presents approximately
Fig. 5. Test result of DSBAM recalling. (a)-(c) sequentially represent the states from the initial state (V(0), V(-1)) to the final state (V(4), V(3)). The initial state V(0) = Ṽ^1 is obtained by inverting 11 pixels (about 14%) of V^1. V(-1) is a vector whose components are all -1.
Fig. 6. Test result of DSBAM recalling. (a)-(c) sequentially represent the states from the initial state (V(0), V(-1)) to the final state (V(4), V(3)). The initial state V(0) = Ṽ^7 is obtained by inverting 11 pixels (about 14%) of V^7. V(-1) is a vector whose components are all -1.
88% more convergence to the stable prototype attractor (i.e., V^5 or -V^5) than the DCBAM (see Fig. 2). In this case, the DSBAM shows zero convergence to spurious stable states (false memories). By contrast, the DCBAM generated a large number of spurious stable states, with approximately 65% convergence to such states (see Fig. 3). Also, with the DSBAM, no limit cycle was found; the DCBAM, on the other hand, generated limit cycles, which attracted 22% of the initial input patterns (see Table 1). Finally, it is to be noted that appropriate modifications to the weight matrix W of the DCBAM would reduce the number of spurious memories and might increase the storage capacity (Wang et al., 1991; Leung, 1993; Lee and Wang, 1993; Oh and Kothari, 1994; Wang et al., 1994); however, such improvements in performance will always be restricted by the intrinsically unstable dynamic of DCBAM recalling.

5.3. A pattern sequence recognition example

This example shows that the DSBAM is capable of producing many temporal evolutions with a unique matrix Ŵ. The objective is to train a DSBAM to perform the three pattern sequences shown in Fig. 4.
Here all patterns are composed of 9 × 9 pixels, corresponding to vectors in {-1, 1}^81. Fig. 4 displays the three pattern sequences to be stored. The sequence in Fig. 4(a) represents the digits "1" to "5", and Figs. 4(b) and 4(c) are the corresponding numbers in Chinese and Roman representations, respectively. Proper adjustment of the BSEM is necessary in order to learn the three independent pattern sequences with only one weight matrix. Let m(k) be the length of the kth pattern sequence; in this example, m(1) = m(2) = m(3) = 5. Also, let l(k) and z(k) be, respectively, the numbers of L_j and Z_i patterns in the kth pattern sequence. From Eqs. (9a) and (9b), l(1) = l(2) = l(3) = 3 and z(1) = z(2) = z(3) = 2. Thus we have m = \sum_{k=1}^{3} m(k) = 15, l = \sum_{k=1}^{3} l(k) = 9 and z = \sum_{k=1}^{3} z(k) = 6. Now let the three sequences be numbered consecutively as V^1, ..., V^5, V^7, ..., V^11 and V^13, ..., V^17, with the augmented vectors V^6 = V^4, V^12 = V^10 and V^18 = V^16.
Then the problem can be restated as finding a weight matrix Ŵ such that

sgn{ \sum_{i=1}^{n} ŵ_{ij} v_i^k } = v_j^{k+1},   k = 1, 3, ..., 17,
Fig. 7. Test result of DSBAM recalling. (a)-(c) sequentially represent the states from the initial state (V(0), V(-1)) to the final state (V(4), V(3)). The initial state V(0) = Ṽ^13 is obtained by inverting 11 pixels (about 14%) of V^13. V(-1) is a vector whose components are all -1.
Fig. 8. Test result of DSBAM recalling. (a)-(d) sequentially represent the states from the initial state (V(0), V(-1)) to the final state (V(6), V(5)). The initial state V(0) = Ṽ^1 is obtained by inverting 19 pixels (about 23%) of V^1. V(-1) is a vector whose components are all -1.
sgn{ \sum_{j=1}^{n} ŵ_{ij} v_j^k } = v_i^{k+1},   k = 2, 4, ..., 16.
Moreover, the following transformations are defined:

L_j^u = V^{2u-1} v_j^{2u},   u = 1, 2, ..., l,
Z_i^1 = V^2 v_i^3,  Z_i^2 = V^4 v_i^5,  Z_i^3 = V^8 v_i^9,  Z_i^4 = V^{10} v_i^{11},  Z_i^5 = V^{14} v_i^{15},  Z_i^6 = V^{16} v_i^{17}.
The l + z = 15 transformed patterns can be used to obtain a weight matrix Ŵ by the MMOA. The parameter c is set to 20 in order to obtain a robust result. After learning has been performed, the initial state of the DSBAM is applied externally, and the state at each time step is recorded.

Table 2 shows the results of successful multistep recalling when the noisy patterns Ṽ^1, Ṽ^7 and Ṽ^13 are used as
Fig. 9. Test result of DSBAM recalling. (a)-(c) sequentially represent the states from the initial state (V(0), V(-1)) to the final state (V(4), V(3)). The initial state V(0) = Ṽ^7 is obtained by inverting 19 pixels (about 23%) of V^7. V(-1) is a vector whose components are all -1.
initial states of the DSBAM. The noise was created by randomly inverting bits from +1 to -1, or vice versa, until a specified Hamming distance was reached. For a noisy input, if the corresponding stored stable attractor (pattern) is recalled, the trial is a success; otherwise it is a failure. The percentage of successful recalls is summarized in Table 2, where each datum is obtained from the average of 1000 independent trial runs. With 3, 5 and 7 noisy pixels, the DSBAM attains approximately one half correct recalls for the sequences Ṽ^1 → V^5, Ṽ^7 → V^11 and Ṽ^13 → V^17, respectively. Some examples of the multistep recall procedure are depicted in Figs. 5-10. Figs. 5-7 show the steps of the multistep recalling processes when the noisy patterns Ṽ^1, Ṽ^7 and Ṽ^13, each with 11 noisy pixels, are used as initial states of the DSBAM. It can be observed from Figs. 5-7 that the correct transient patterns V^2, V^8 and V^14 are successfully recalled in only one step. After only four steps of Eqs. (7a) and (7b), the DSBAM reaches its stable attractors, which are identical to the three stored stable attractors V^5, V^11 and V^17, respectively. All the transient states are recalled correctly in sequence. These figures demonstrate that the proposed DSBAM possesses the ability of multistep retrieval. Figs. 8-10 show similar results when the noisy patterns Ṽ^1, Ṽ^7 and Ṽ^13, with 19 noisy pixels, are used as initial states of the DSBAM. Again, after six, four and six steps of Eqs. (7a) and (7b), respectively, the DSBAM reaches its stable attractors. However, pattern "2" in Fig. 8 and patterns "II" and "III" in Fig. 10 are not correctly recalled. A similar phenomenon has also been found in DCBAMs (Zhou and Quek, 1996). This implies that the attractivity (or basin of attraction) of a stable attractor is always larger than those of the transient (unstable) attractors leading to it.
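A test of the kind summarized in Table 2 can be sketched as follows (our illustration; dsbam_recall is the sketch given in Section 4, and W_hat and the pattern list V are hypothetical names assumed to hold the trained matrix and the stored patterns):

import numpy as np

def flip_pixels(v, k, rng):
    # invert k randomly chosen components of a bipolar pattern (Hamming distance k)
    v = v.copy()
    idx = rng.choice(len(v), size=k, replace=False)
    v[idx] = -v[idx]
    return v

def recall_rate(W, start, target, k, trials=1000, seed=0):
    # fraction of noisy starts with k flipped pixels that reach the stored attractor
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        v = dsbam_recall(W, flip_pixels(start, k, rng), steps=20)
        hits += int(np.array_equal(v, target))
    return hits / trials

# e.g., recall_rate(W_hat, V[1], V[5], k=3) would estimate the 0.50 entry of Table 2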
Fig. 10. Test result of DSBAM recalling. (a)-(d) sequentially represent the states from the initial state (V(0), V(-1)) to the final state (V(6), V(5)). The initial state V(0) = Ṽ^13 is obtained by inverting 19 pixels (about 23%) of V^13. V(-1) is a vector whose components are all -1.
6. Conclusions

In this paper, a way of realizing multistep retrieval by a DSBAM is proposed. The DSBAM differs from the DBAM only in its encoding; their structures as well as their update equations are functionally the same. This method is different from the two approaches proposed by Zhou and Quek (1996). For realizing multistep retrieval, the proposed DSBAM possesses several advantages over the DCBAM: (i) the structure is simpler, since no additional chaining feedback is required; (ii) oscillating memories and limit cycles are absent; (iii) the memory capacity is higher; and (iv) the multistep recalling capability is stronger. However, the computational effort for encoding the weight matrix is increased compared with conventional methods such as the outer-product rule (Hebb, 1949) and the pseudoinverse technique (Kohonen, 1984). How to reduce this computational effort is therefore an open problem deserving further study.
References

Gallent, S.I., 1986. Optimal linear discriminants. In: Proceedings of the Eighth International Conference on Pattern Recognition, Paris, France, pp. 849-852.
Hebb, D.O., 1949. The Organization of Behavior. Wiley, New York.
Hopfield, J.J., 1982. Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sci. USA 79, 2554-2558.
Hopfield, J.J., 1984. Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Nat. Acad. Sci. USA 81, 3088-3092.
Kohonen, T., 1984. Self-Organization and Associative Memory. Springer, Berlin.
Kosko, B., 1987. Adaptive bidirectional associative memories. Appl. Opt. 26 (23), 4947-4960.
Kosko, B., 1988. Bidirectional associative memories. IEEE Trans. Syst., Man and Cybern. 18 (1), 49-60.
Krauth, W., Mezard, M., 1987. Learning algorithms with optimal stability in neural networks. J. Phys. A: Math. Gen. 20, L745-L752.
Lee, D.L., Wang, W.J., 1993. Improvement of bidirectional associative memory by using correlation significance. Electron. Lett. 29 (8), 688-690.
Leung, C.S., 1993. Encoding method for bidirectional associative memory using projection on convex sets. IEEE Trans. Neural Networks 4 (5), 879-881.
Minsky, M.L., Papert, S., 1969. Perceptrons. MIT Press, Cambridge, MA.
Oh, H., Kothari, S.C., 1994. Adaptation of the relaxation method for learning in bidirectional associative memory. IEEE Trans. Neural Networks 5 (4), 576-583.
Wang, T., Zhuang, X., Xing, X., 1994. Designing bidirectional associative memories with optimal stability. IEEE Trans. Syst., Man and Cybern. 24 (5), 778-790.
Wang, Y.F., Cruz Jr., J.B., Mulligan Jr., J.H., 1991. Guaranteed recall of all training pairs for bidirectional associative memory. IEEE Trans. Neural Networks 2 (6), 559-567.
Zhou, R.W., Quek, C., 1996. DCBAM: A discrete chainable bidirectional associative memory. Pattern Recognition Letters 17 (9), 985-999.