Applied Mathematics and Computation 111 (2000) 209–218. www.elsevier.nl/locate/amc
Temporal learning rule and dynamic neural network model

Yukifumi Shigematsu a,*, Gen Matsumoto b

a Super Molecular Division, Electrotechnical Laboratory, 1-1-4 Umezono, Tsukuba, Ibaraki 305-8568, Japan
b Brainway Group, Brain Science Institute, RIKEN, Wako, Saitama 351-0198, Japan
Abstract

The central nervous system is a highly dynamic network which is constantly being changed by a learning process. A new temporal learning rule, a revised Hebbian rule with synaptic history, was proposed in order to organize dynamic associative memory. The learning rule was applied to a pulse-driven neural network model, and a temporal associative memory was self-organized by input temporal signals. This result leads to a new concept: the temporal sequence of events is memorized among the asymmetric connections of the network. It was also shown that dynamic neural networks were effectively organized using temporal information, and that grouping or isolation of multi-modal information was performed well by the temporal learning process. These results suggest that temporal information, in addition to spatial information, may be an important factor for organizing information-processing circuits in the nervous system. © 2000 Elsevier Science Inc. All rights reserved.

Keywords: Temporal learning rule; Self-organization; Signal interaction; Network dynamics
1. Introduction

The central nervous system receives external information from sensory organs, recognizes it and dynamically restructures its neural network into a spatio-temporal associative memory. It is possible by this learning and
memorizing process to construct a high order of intelligence. Neural cell models in conventional neural networks are fundamentally static; therefore, recurrent loops or time-delay elements are indispensable for treating time-sequence signals, and an appropriate learning rule has not been devised for temporal associative memory. Temporal information, however, is very important in our daily lives. Consider the classical conditioning experiment: after training, an animal can easily predict a late unconditioned stimulus from a preceding conditioned stimulus. The order of the time sequence is important and irreversible in temporal associative memory [1]. It is essential that real neural cells have the capability of temporal information processing, and a temporal learning rule may be inherent in their synapses. There are many reports of physiological results on synaptic plasticity in nerve cells for temporal signal processing [2–4]. Long-term potentiation (LTP) and long-term depression (LTD) are effects in which the synaptic efficacy is strengthened or weakened by an input stimulus and then kept at that value for a long time. Such activity-dependent changes in synaptic efficacy are a key factor in the formation of the neural circuits underlying learning and memory.

In this paper, we propose a new temporal learning rule and introduce a spatio-temporal associative memory. The learning rule is physiologically plausible: a revised Hebbian rule with synaptic history [5]. It realizes an associative memory in which a time-sequence pattern can be recalled step by step from an initial event, and it introduces an integrated understanding of spatio-temporal structure in neural networks.
2. Temporal learning rule

We attempted to devise a novel learning rule for temporal information processing based on an understanding of the physiological results on synaptic plasticity [5]. The Hebbian learning rule is well known: it enhances the connection efficacy of a particular synapse if the responses of both the input and output cells are simultaneously active. It can generally be formulated by the following equation:

ΔWij = Cij Xj(t) Yi(t),   (1)
where Wij is the synaptic efficacy from the j-cell to the i-cell, Xj(t) is the input signal from the j-cell, Yi(t) is the output response of the i-cell, and Cij is a parameter of learning speed. Xj and Yi are binary values of 1 or 0 for pulse responses. In order to extend the Hebbian rule to a temporal learning process, a synaptic history of the input signal is introduced:
Hij(t) = Xj(t) + q Hij(t − 1),   (2)
where q is the decay time constant (0 < q < 1) for the input history. The value of Hij(t) is the accumulated value of the input signal received from the j-cell at a synaptic site on the dendritic portion of the i-cell; it is analogous to an after-effect such as the intracellular calcium ion concentration. Two types of revised Hebbian rule were introduced using this history. One is an input-dependent learning rule, formulated as the product of the input signal and the synaptic history:

ΔWik = Cik [Hik(t) − H0] Xk(t),   (3)
where H0 is a learning threshold that distinguishes between enhancement and depression. The other is an output-dependent rule, in which particular attention is paid to the output cell. It is formulated as

ΔWhj = Chj [Hhj(t) − H0] Yh(t),   (4)

where H0 is as described for Eq. (3). These equations were applied to the case of the classical conditioning experiment, and the output-dependent learning rule was the one suitable for explaining the conditioning result. We therefore concluded that the output-dependent revised Hebbian rule with synaptic history (Eq. (4)) is the temporal learning rule for associative networks.
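As a concrete illustration, Eqs. (2) and (4) amount to a few lines of update code. The following is a minimal Python sketch under our own assumptions: the parameter values, the toy input trains and the stand-in output rule are illustrative choices, not taken from the paper.

```python
import numpy as np

# Sketch of the history update (Eq. (2)) and the output-dependent
# learning rule (Eq. (4)). All parameter values are assumptions.
q, H0, C = 0.8, 0.5, 0.1     # history decay, learning threshold, learning speed

rng = np.random.default_rng(0)
X = (rng.random((200, 5)) < 0.05).astype(float)  # 5 sparse input pulse trains
W = np.full(5, 0.1)                              # synaptic efficacies W_hj
H = np.zeros(5)                                  # synaptic histories H_hj(t)

for t in range(200):
    H = X[t] + q * H            # Eq. (2): decaying accumulation of input pulses
    Y = float(W @ X[t] > 0.2)   # stand-in output spike (threshold is arbitrary)
    W += C * (H - H0) * Y       # Eq. (4): when the cell fires, synapses with a
                                # recent input history (H > H0) are enhanced,
                                # the others are depressed
```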
3. Neural cell model with learning rule

In order to construct the neural networks, we introduced a pulse neural cell model [6]. The cell model was a spike accumulation and delta modulation (SAM) model, illustrated in Fig. 1. It consisted of three parts: (1) a leaky integrator for input pulses, (2) a binary threshold function to generate an output pulse train, and (3) a subtraction part that removes a given constant value from the internal potential at spike firing. The operating variables were the input pulse signal Xj(t) from the j-cell, the output signal Yi(t), the accumulated potential Ui(t) and the internal potential Vi(t). Ui(t) for the i-cell is described by

Ui(t) = Σj Wij Xj(t) + a Vi(t − 1),   (5)
Fig. 1. A model of temporal learning.
where a is a decay constant (0 < a < 1) for the potential Vi(t). Comparing Ui(t) with the threshold T, the output Yi(t) and internal potential Vi(t) are

Yi(t) = g[Ui(t) − T],   (6)

Vi(t) = Ui(t) − p Yi(t),   (7)
where g[z] = 1 if z > 0 and g[z] = 0 if z < 0, and p is the subtraction constant at pulse firing. We adopted a discrete time interval and parallel operation for the calculation; the time interval may include both the action potential and the subsequent depression period. The remaining value of the internal potential Vi(t) was held for the next step and acted as a charged signal for the next firing.

This pulse neural cell model was easily combined with the learning operation of the revised Hebbian rule with synaptic history. The input history decreases exponentially to zero after an input pulse, and two threshold levels (H1, H2) were set for enhancement and depression of the efficacy, as shown in Fig. 2(A). For computation, the learning rule of Eq. (4) was simplified into the stepwise relationship shown in Fig. 2(B). When the i-cell fired, Yi(t) = 1, the spike signal was fed back to each synapse and the new efficacy was calculated by comparing the history with the thresholds. The learning threshold H1 was used for enhancement, and the other threshold H2 for depression. When the after-effect of an input signal had decayed to nearly 0 and the history was smaller than H2, the efficacy was not changed; this means that the memory stored in the efficacy was not erased by non-correlated signals. The learning procedure was as follows. If Hij(t) > H1, then

ΔWij = c1 (Wmax − Wij) Yi(t).   (8)

If Hij(t) < H1 and Hij(t) > H2, then

ΔWij = c2 (Wmin − Wij) Yi(t).   (9)

Here c1 and c2 are parameters of learning speed, and Wmax and Wmin are the maximum and minimum values of the connection efficacy (Wmax > Wij > Wmin), respectively. The learning calculation was applied to the firing cell only.
Fig. 2. H(t) time sequence and ΔW.
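Putting Eqs. (5)–(9) together, a single SAM cell with the two-threshold learning rule can be sketched as below. This is a minimal Python illustration under our own assumptions: the class layout, parameter values and the external-drive argument are ours, and the per-synapse history shown here is the unsimplified form (the closing paragraph of this section describes the paper's shared-history simplification).

```python
import numpy as np

# Minimal sketch of one SAM cell with the stepwise temporal learning rule.
# Parameter values are illustrative assumptions, not the paper's settings.
class SAMCell:
    def __init__(self, n_in, a=0.9, T=1.0, p=1.0, q=0.8,
                 H1=0.7, H2=0.2, c1=0.1, c2=0.05, w_max=0.30, w_min=0.0):
        self.W = np.full(n_in, 0.1)   # synaptic efficacies W_ij
        self.H = np.zeros(n_in)       # per-synapse input histories H_ij(t)
        self.V = 0.0                  # internal potential V_i(t)
        self.a, self.T, self.p, self.q = a, T, p, q
        self.H1, self.H2, self.c1, self.c2 = H1, H2, c1, c2
        self.w_max, self.w_min = w_max, w_min

    def step(self, X, ext=0.0):
        """One discrete time step; ext is an optional external drive (our addition)."""
        self.H = X + self.q * self.H              # Eq. (2): input history
        U = self.W @ X + self.a * self.V + ext    # Eq. (5): accumulated potential
        Y = 1.0 if U > self.T else 0.0            # Eq. (6): threshold firing
        self.V = U - self.p * Y                   # Eq. (7): subtraction at firing
        if Y:                                     # learning only in the firing cell
            up = self.H > self.H1                           # Eq. (8): enhancement
            down = (self.H < self.H1) & (self.H > self.H2)  # Eq. (9): depression
            self.W[up] += self.c1 * (self.w_max - self.W[up])
            self.W[down] += self.c2 * (self.w_min - self.W[down])
        return Y
```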
Calculating the synaptic history Hij(t) at each synapse (*) of the i-cell would require an elaborate calculation, so it was simplified: a single history of the output Yj(t), instead of each input history, was calculated using Eq. (2) and delivered as the common input history Hj(t) to all cells receiving signals from the j-cell.

4. Temporal associative memory

Using the above pulse neural cell model and the revised Hebbian learning rule, a new type of temporal associative memory network was introduced. Each neural cell in this network was interconnected with all other cells and had input and output terminals, as shown in Fig. 3. The mutual connection efficacies were initially weak. The associative memory network was constructed by the following learning process. Particular cells in group A received input signals and fired first. This firing output from group A was transmitted to every other cell and raised the input history at each synapse of the receiving cells. When other cells in group B were then fired by a following input stimulus, the learning process was performed in the fired cells of group B: the efficacy of the connections from group A to group B was enhanced, because the values of the input history from group A were larger than the learning threshold H1. The firing signal from group B was likewise transmitted to every other cell and raised the corresponding histories. In the next step, other cells in group C were fired by the following input stimuli, and the learning process enhanced the connections from group B to group C; the next learning step did the same for the following fired cells in group D. This learning process established the time-sequence relations among the cells. The temporal firing pattern A→B→C→D was memorized by applying the stimuli repeatedly to the network. After the training, the temporal sequence of the input firing pattern was imprinted within the connections of the neural network, which finally became a temporal associative memory.
Fig. 3. Temporal associative memory.
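A hypothetical end-to-end run of this training and recall procedure, reusing the SAMCell sketch above, might look like the following; the network size, group sizes, drive strength and step counts are our assumptions.

```python
import numpy as np

# Train a fully connected network of SAM cells on the sequence A->B->C->D,
# then test recall. Sizes and stimulus parameters are illustrative assumptions.
N, group = 20, 5                                   # four groups of five cells
cells = [SAMCell(n_in=N, T=0.5) for _ in range(N)]
Y = np.zeros(N)                                    # firing pattern at last step

for epoch in range(50):                  # repeated presentation of the sequence
    for g in range(4):                   # stimulate one group per time window
        for _ in range(3):
            ext = np.zeros(N)
            ext[g * group:(g + 1) * group] = 2.0   # suprathreshold external drive
            X = Y.copy()                           # recurrent input from last step
            Y = np.array([c.step(X, ext[i]) for i, c in enumerate(cells)])

# Recall: drive group A briefly, then let the enhanced asymmetric
# connections propagate the firing through groups B, C and D.
Y = np.zeros(N)
for t in range(12):
    ext = np.zeros(N)
    if t < 2:
        ext[:group] = 2.0
    X = Y.copy()
    Y = np.array([c.step(X, ext[i]) for i, c in enumerate(cells)])
    print(t, np.flatnonzero(Y))          # indices of firing cells at each step
```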
The recall process of the associative memory was as follows. An initial stimulus was applied to the input terminals of group A for a short duration, and the cells in group A emitted firing spikes. This firing signal was transferred from group A over the enhanced mutual connections to group B, and the cells in group B were fired by it. A similar recall process then proceeded sequentially through groups C and D; the sequence of recall was induced by the enhanced mutual connections. The recall speed was independent of the time course of the training process, but was related to the intensity of the input firing and the mutual connection efficacies within the sequence. If the network was well trained, predictive recall of the trained sequence of events was possible.

In this temporal associative memory network, the temporal information was used to organize asymmetrical connections among the neural cells. The network memorized the sequence of the input signals or events, but not the durations of these events; this seems reasonable in the light of our memory of daily life. The learning rule also retains the characteristics of the Hebbian rule, and can therefore self-organize a spatial associative memory as well. A spatio-temporal associative neural network self-organizes by this learning rule: symmetrical connections are enhanced for spatial associative memory, and asymmetrical connections are enhanced for temporal associative memory.

5. Interaction among different signals

A second trial was performed to determine the effect of the interaction between different input signals on the temporal learning process. The neural network was a two-layered network consisting of an input layer and a receiving layer, as shown in Fig. 4. The input layer consisted of five cells. Two of them output different random pulse trains, and two output periodic pulse trains out of phase with each other. The fifth cell output the same pulse train as one of these four input signals. The average pulse rate for each cell was about one pulse per 20 time steps. The receiving-layer cells had weak excitatory mutual connections among the four nearest R-cells. We also considered the effect of noise: many neural cells generate spontaneous spikes without any input signal, so random noise input was introduced into each neural cell to simulate spontaneous spikes. The first quarter of R-cells (1–25) received a random pulse train from the first input cell; the second (26–50), third (51–75) and fourth (76–100) quarters received periodic, random and another periodic pulse train from the three other input cells, respectively. The fifth input cell sent its signal to all the R-cells (1–100), so each R-cell received two inputs: one from the fifth cell and one from its zone's input cell. (A short sketch of this input configuration is given after Fig. 4.)
Fig. 4. Pulse-driven neural networks with different pulse-train inputs.
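The input configuration just described can be sketched as follows; the exact period, phase and firing probability are assumptions chosen only to match the stated average rate of about one pulse per 20 time steps, and the spontaneous-noise input mentioned in the text is omitted.

```python
import numpy as np

# Hypothetical pulse trains for the five input cells of Fig. 4.
steps, period = 2000, 20
rng = np.random.default_rng(1)
t = np.arange(steps)

rand1 = (rng.random(steps) < 1 / period).astype(float)   # random train 1
rand2 = (rng.random(steps) < 1 / period).astype(float)   # random train 2
cyc1 = (t % period == 0).astype(float)                   # periodic train 1
cyc2 = (t % period == period // 2).astype(float)         # periodic, out of phase
common = cyc1.copy()       # fifth cell: same cyclic train as the second input

# Zone assignment (paper's indexing): R-cells 1-25 get rand1, 26-50 get cyc1,
# 51-75 get rand2, 76-100 get cyc2; all R-cells also receive `common`.
zone_inputs = [rand1, cyc1, rand2, cyc2]
```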
We set the signal of the fifth cell to the same periodic pulse train as the second input cell, and tested the interaction starting from several initial conditions of different connection weights (W1, W2, W3, W4, W5) from the five input cells to the R-cells. Calculation and learning were performed until the averaged connection value within some zone saturated at the maximum connection weight (Wmax = 0.30). Fig. 5 shows a result for an initial condition of strong zone connections (W1–4 = 0.27) and weak common connections (W5 = 0.03).
Fig. 5. Connection weight of each R-cell from its zone input cell (Wn) and from the fifth cell (W5) after the learning process. Initial weights were W1–4 = 0.27 and W5 = 0.03. The first quarter of R-cells (1–25) received random pulses; the cells in the second, third and fourth zones received cyclic, random and another cyclic input pulse train, respectively. The common input from the fifth cell was the same cyclic train as the second one.
Table 1
Wn and W5 after learning, by two pulse-train inputs (Wmax = 0.30)

       Initial   Zone 1   Zone 2   Zone 3   Zone 4
Wn     0.15      0.26     0.29     0.26     0.16
W5     0.15      0.23     0.29     0.20     0.19
Wn     0.03      0.26     0.29     0.28     0.09
W5     0.03      0.21     0.29     0.20     0.22
Learning of the weak common connection was dominated by the strong inputs to each of the zone cells. Synchronizing pulse signals collaborated to strengthen the connections from the input cells to the R-cells in the second zone, whereas asynchronous signals suppressed the connections from the fifth cell to the R-cells in the fourth zone. Random pulse signals led to moderate connections of the common input in the first and third zones.

Learning processes under various conditions were studied by setting different initial connection weights, such as Wn/W5 = 0.03/0.27, 0.15/0.15 and 0.03/0.03. A tendency similar to that described above was observed. If the initial connection weights of the common input and the zone inputs were of the same value, the two input signals competed for dominant connections to the R-cells, as shown in Table 1.

6. Summary and discussion

The real nervous system receives huge flows of spatial and temporal information from the external world. Input signals to the memory system are complex, for example simultaneous visual, auditory and motion signals, and the nervous system treats all of them for associative memorization. In this process, the role of spatial and temporal associative memory is very important. We have described temporal associative memory by means of very simple networks, and some important results were obtained by simulation of the temporal learning process. Spatial information processing alone is not sufficient for treating multi-modal or higher-level information, because spatial relations have no meaning across these domains. Temporal information processing can be useful, however, through time correlation: cells that receive correlated pulse input begin to fire concurrently and form a group by increasing the connection weights for similar inputs. The pulse neural network can also isolate temporally correlated activity from non-correlated activity by the temporal learning rule. These results indicate that temporal information processing, in addition to spatial information processing, is an important factor in the self-organization of associative networks in the nervous system.
We presented a primary function for temporal associative memory that the neural cell model should support. The change in synaptic efficacy may be the result of interference between potentiation and depression; we simplified the function to use only one component for this learning rule, but it then requires two learning thresholds. Some physiological results suggest more complex mechanisms, for example that the learning thresholds should vary as a function of previous synaptic activity [9].

We consider a hypothesis of a feedback signal from the cell body through the dendrite to the synapses in the learning rule. Stuart and Sakmann [8] showed that action potentials are initiated first in the axon hillock and then actively propagate back into the dendritic tree. The role of the dendritic action potential is unknown, but it is a candidate for this feedback signal.

This network model was studied using single-pulse calculation, but pulse bursts for strong input signals may be more realistic and effective. It is difficult to obtain a relation between events separated by a very long time interval with this processing method, because the time window for grouping is short; another function or mechanism should be added to the network for long-term association.

The recall process differs from that of conventional associative memory with loop circuits. The speed of recall is flexible, and it may be possible to predict subsequent events with an experienced network. This memory model stores the temporal relationship within the connections of the neural cells, and fundamentally does not require a special recurrent loop. There are, however, many recurrent loops in real neural networks, and they can be used to introduce other functions.

This learning rule depends only on local information, the input and output signals of the neural cell. Global information is also necessary to construct a large-scale complex network system; one type of global information is a value factor, such as reward or punishment, used to steer the learning process toward the required objective. For a more complex memory system, it is necessary to refer to the functions and structures of the real nervous system, such as the hippocampus and neocortex [9]. This model may have potential for simulating real complex nervous systems because of its physiologically plausible learning rule and realistic pulse-driven neural cell model.

References

[1] A. Herz, R. Sulzer, J.L. van Hemmen, Hebbian learning reconsidered: representation of static and dynamic objects in associative neural nets, Biol. Cybern. 60 (1989) 457–467.
[2] S. Fujii, K. Saito, H. Miyakawa, K. Ito, H. Kato, Reversal of long-term potentiation (depotentiation) induced by tetanus stimulation of the input to CA1 neurons of guinea pig hippocampal slices, Brain Res. 555 (1991) 112–122.
[3] E.M. Wexler, P.K. Stanton, Priming of homosynaptic long-term depression in hippocampus by previous synaptic activity, NeuroReport 4 (1993) 591–594.
[4] P.K. Stanton, T.J. Sejnowski, Associative long-term depression in the hippocampus: induction of synaptic plasticity by Hebbian covariance, Nature 339 (1989) 215–218.
[5] Y. Shigematsu, G. Matsumoto, M. Ichikawa, A temporal learning rule and associative memory, in: Brain Processes, Theories and Models, MIT Press, Cambridge, MA, 1995, pp. 164–172.
[6] Y. Shigematsu, S. Akiyama, G. Matsumoto, A dynamic neural cell model driven by pulse train, in: Proceedings of the International Symposium, Iizuka, NIS, 1992, pp. 130–133.
[7] A. Artola, S. Broecher, W. Singer, Different voltage-dependent thresholds for inducing long-term depression and long-term potentiation in slices of rat visual cortex, Nature 347 (1990) 69–72.
[8] G.J. Stuart, B. Sakmann, Active propagation of somatic action potentials into neocortical pyramidal cell dendrites, Nature 367 (1994) 69–72.
[9] T.V.P. Bliss, G.L. Collingridge, A synaptic model of memory: long-term potentiation in the hippocampus, Nature 361 (1993) 31–39.