Neural Networks, Vol. 6, pp. 203-215, 1993
0893-6080/93 $6.00 + .00 Copyright © 1993 Pergamon Press Ltd.
Printed in the USA. All rights reserved.
ORIGINAL CONTRIBUTION
Pulse Propagation Networks: A Neural Network Model That Uses Temporal Coding by Action Potentials

KEVIN T. JUDD* AND KAZUYUKI AIHARA†

*University of Tokyo and †Tokyo Denki University

(Received 17 February 1992; revised and accepted 24 July 1992)
Abstract--In this paper we study a model of a neural network that is fundamentally different from currently popular models. In this model we consider every action potential in the network, rather than average firing rates; this enables us to consider temporal coding by action potentials. This kind of model is not new, but we believe our results on computational ability to be new. We introduce a specific model, which we call a pulse propagation network (PPN), and consider this model from the point of view of information processing, as a dynamical system, and as a computing machine. We show, in particular, that as a computing machine it can operate with real numbers and consequently it is of a class more powerful than a conventional Turing machine. In the process of this analysis, we develop a framework of concepts and techniques useful to understand and analyze these PPN.
Keywords--Action potential, Dynamical system, Computing machine, Information processing.

1. INTRODUCTION AND OVERVIEW
In biological neural networks, information is transmitted by the conduction of action potentials along axons, and information processing takes place at the synapses, dendrites, and soma of neurons (Koch & Poggio, 1987). From the point of view of information processing, an important question is how much information is carried by an action potential. At one extreme, each action potential could represent a single bit of information, similar to a serial digital communication channel without error checking. At the other extreme, one bit of information may be represented as a burst of action potentials, so that the information is spread over so many action potentials that each pulse carries almost zero information and any one action potential could be removed with no information loss. This situation is similar to tone dialing and frequency modulation encoding, where digital information is encoded as a burst of tone.
In biological neural networks we expect that the information per action potential is somewhere between these two extremes, but that in some situations one of the two extreme models is a meaningful approximation, and perhaps in other situations neither is appropriate. The information per action potential may differ according to the function of the neural network: for example, whether it has a function in locomotion control; tactile, olfactory, visual, or auditory information reception and preprocessing; memory recall; language construction; or logical thought. The information per action potential may also differ according to the nature of the information processing taking place: for example, whether in an alert or resting state, or during dreaming. Currently popular neural network models tend to concentrate on information processing at the latter extreme, where information is represented as the firing rate of a neuron and the information per action potential is very low (Amari, 1978, 1989; Anderson & Rosenfeld, 1988; Grossberg, 1988; Hopfield, 1984). In this paper, we focus on information processing at the other extreme, where the information per action potential pulse is very high. To do this we use a fundamentally different model. We introduce a specific model, which we call a pulse propagation network (PPN), and consider this model from the point of view of information processing, as a dynamical system and as a computing machine. In a PPN, binary information can be represented as occurrence or nonoccurrence of action potentials, that is, as one bit of information per action potential, but it is also possible to represent continuous valued information (real numbers) as the time interval between two action potentials.
Acknowledgements: The authors wish to express their thanks to Prof. S. Amari for his assistance and encouragement of this research. The first author was supported by the Japan Society for the Promotion of Science Post Doctoral Fellowship. This research was partially supported by a Grant-in-Aid (02255107) for Scientific Research on Priority Areas from the Ministry of Education, Science and Culture of Japan.

Address reprint requests and all correspondence to Kazuyuki Aihara, Department of Electronic Engineering, Tokyo Denki University, 2-2 Nishiki-cho, Kanda, Chiyoda-ku, Tokyo 101, Japan.
In this paper, we concentrate on the second, more general case, where continuous valued information is represented as the time intervals between action potentials, and show that PPNs have some interesting characteristics when considered from the point of view of their dynamics, computing ability, and information processing; we summarize our results in the following paragraphs.

Viewed as dynamical systems, many currently popular models are fundamentally equilibrium-attractor models (Amari, 1978, 1989; Amari & Maginu, 1988; Amit, 1989; Anderson & Rosenfeld, 1988; Grossberg, 1988; Hirsch, 1989; Hopfield, 1984). Equilibrium-attractor models are, from the dynamical systems point of view, uninteresting except in their transients and basins of attraction. However, recurrent networks and some other models are not equilibrium-attractor models and do have interesting dynamics (Hirsch, 1989; Freeman, 1987). In contrast, PPNs have a rich variety of complex dynamics. As dynamical systems, PPNs are neither discrete nor continuous time systems; they are discrete event systems. To view PPNs as dynamical systems requires a minor extension of the usual definition of the phase space.

The most familiar model of a computing machine is the Turing machine (Arbib, 1967; Davis, 1985). A more general notion is the X-machine, which implements algorithms defined by a transition graph and operates with data of type X (Eilenberg, 1974; Herman & Isard, 1970). The transition graph defines the state transitions of the machine and the rules applied to data in each state. A Turing machine is a Z-machine because its calculations are performed on the integers Z, or binary numbers, and the rules of the Turing machine can be described as a transition graph. Machines that operate over the reals are R-machines, and various questions regarding the class defined by polynomial rules have been considered by Blum, Shub, and Smale (1989) and Smale (1990). Viewed as computing machines, it is known that a conventional neural network model is at least as powerful as a Turing machine (Minsky, 1967; Garzon, 1990) and can, in fact, implement a class of R-machine (Hayashihara, Yamashita, & Ae, 1990). Chaitin has proved by algorithmic information theory that almost all real numbers cannot be reduced to a Gödel number, that is, no Z-machine exists that prints out all their decimal digits (Chaitin, 1987); on the other hand, R-machines, which use real numbers as an "entity," can be constructed to print out any real number's decimal digits. Consequently, in a very definite and unambiguous sense, R-machines are more powerful than Z-machines, since Z-machines are a special case of the R-machines, but true R-machines can perform tasks that Z-machines cannot. (The
question of whether R-machines can solve the halting problem for Z-machines, that is, be super-Turing machines (Stannett, 1990), is a different question.) We will show that PPNs implement a class of R-machines: the class of machines defined by piecewise smooth rules. We also show, as a consequence, that a PPN can approximate any R-machine arbitrarily closely.

In a PPN, information can be represented dynamically as the time between action potentials, or more generally, as arbitrary temporal patterns of action potentials. This temporal coding of information could have advantages in biological neural networks. Many standard neural network models propose to represent continuous values as average firing rates of neurons. However, the average firing rate is not a well defined number, whereas the interval between two action potentials is precisely defined. Furthermore, in temporal coding, only two action potentials define a value, but many action potentials are required to approximate a value as an average firing rate. Hence, temporal coding is more precise and faster, as the sketch below illustrates.
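For illustration, here is a minimal Python sketch of this comparison (the helper names and the Poisson rate model are ours, chosen only to make the point concrete): a temporal code recovers a real value exactly from two spike times, while a rate code can only approximate it by counting many spikes over a window.

```python
import numpy as np

def interval_encode(value, t0=0.0):
    """Encode a real value as the interval between two spike times."""
    return (t0, t0 + value)

def interval_decode(spike_pair):
    """Temporal decoding is exact: the value is the spike-time difference."""
    t0, t1 = spike_pair
    return t1 - t0

def rate_decode(spike_times, window):
    """A rate code only approximates a value: count spikes in a window."""
    return len(spike_times) / window

# Temporal code: two spikes suffice and the value is recovered exactly.
value = 0.734
print(interval_decode(interval_encode(value)))  # 0.734

# Rate code: a Poisson spike train with mean rate `value` needs a long
# observation window before the empirical rate approximates 0.734.
rng = np.random.default_rng(0)
window = 100.0
spikes = np.cumsum(rng.exponential(1.0 / value, size=1000))
print(rate_decode(spikes[spikes < window], window))  # roughly 0.73
```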
Pulse Propagation Networks consider discrete events such as action potential stimulations, in the PABLO numerical simulation (Perkel, 1976; Sampson, 1984). This is, the approach we adopt since it allows a theoretical development. If the bulk of information is assumed to be carried by action potentials, all we really need to know is whether an action potential is generated by a stimulated neuron, and at what times newly generated action potentials stimulate the other neurons to which they are connected. The details of the propagation of an action potential, for example, are to a major extent, irrelevant. It has been shown through mathematical analysis and verified by experiments on squid giant axons, that a one dimensional model is nearly always sufficient to determine if a stimulated excitable neuron will generate an action potential (Judd & Matsumoto, 1991; Judd, Hanyu, Takahashi, & Matsumoto, 1991 ); similar one dimensional models and results exist for oscillatory neurons (Glass, Guevara, & Shrier, 1985). Such onedimensional models can be derived from the HodgkinHuxley or FitzHugh-Nagumo equations. For an excitable neuron, the dynamics in response to stimulation is given by an equation r' = f ( r + 7", s). In this equation, r is a scalar that represents the internal state of the neuron, and r' is the new internal state after the neuron has been stimulated by the action potential of strength s, given that the last stimulation of the neuron occurred a time T earlier. The function f i s of the form shown in Figure 1. The generation of an action potential as a result of a stimulation is determined by the new state r'. I f r ' is less than some value R ( s ) , then no action potential results from the stimulation; otherwise, there is an action potential. The value R ( s ) is shown in Figure 1. In the oscillatory case, the function f i s a circle map. A simple model of action potential generation such as the above provides much of the significant detail of a neuron's dynamics from the point of view of information processing, and hence reduces the complexity of the calculations in simulations. The important aspect of such a model is that the generation of action potentials is determined by two quantities: a scalar internal state of the neuron and the time since its last stimulation. The only other information required is the time it takes for an action potential to propagate between neurons. (Of course, we must also allow for the transfer behaviour of synapses that determine the stimulation strength s; synapses are not simply weighted connections (Shepherd, 1979). However, in this preliminary analysis we shall maintain the common assumption that synapses are simple weights.) Consequently, we will assume a one-dimensional model of neuron behaviour and proceed with the following theoretical approach to the analysis of neural
FIGURE 1. Change of state function. Under most conditions the behaviour of a stimulated nonoscillatory neuron can be described by a one-dimensional change of state function of a form similar to that shown here (Judd & Matsumoto, 1991; Judd, Takahashi, & Matsumoto, 1991). For an oscillatory neuron the change of state function is a circle map (Glass, Guevara, & Shrier, 1985). The change of state function is r' = f(r + T, s), where r is a scalar that represents the internal state of the neuron, and r' is the new internal state after the neuron has been stimulated by an action potential of strength s, given that the last stimulation of the neuron occurred a time T earlier. The generation of an action potential as a result of a stimulation is determined by the new state r'. The change of state function has a deep, narrow well at the state value R(s). If r' is less than R(s) then no action potential results from the stimulation. The function shown here is derived from a singular limit of the FitzHugh-Nagumo equations and is

f(τ, s) = τ + ε log(1 + 1/ε) + ε log(1 − s/g(γe^{−τ})),  for s < g(γe^{−τ}),

f(τ, s) = log(B/(1 − (1 − B)γe^{−τ})) + ε log((1 − g(γe^{−τ}))/(s − g(γe^{−τ}))),  for s > g(γe^{−τ}),

where g(r) = α + B + (1 − B)r, γ = (1 + 1/ε)^{−1} and 0 < α ≪ 1. The parameters ε, B and α define the characteristics of the singular flow. In this figure the parameter values are ε = 0.1, α = 0.05, B = 0.1, s = 0.3 and T = 0.
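The change of state function lends itself to direct computation. The following Python sketch implements the formula as reconstructed above, with the caption's parameter values; since the printed formula has been reconstructed from a degraded source, treat this as illustrative rather than authoritative.

```python
import numpy as np

EPS, ALPHA, B = 0.1, 0.05, 0.1      # caption values: eps, alpha, B
GAMMA = 1.0 / (1.0 + 1.0 / EPS)     # gamma = (1 + 1/eps)^(-1)

def g(r):
    return ALPHA + B + (1.0 - B) * r

def f(tau, s):
    """New internal state after a stimulation of strength s arriving a
    time tau = r + T after the state was last reset."""
    thresh = g(GAMMA * np.exp(-tau))
    if s < thresh:
        # sub-threshold stimulation: the state merely advances
        return tau + EPS * np.log(1 + 1 / EPS) + EPS * np.log(1 - s / thresh)
    # supra-threshold stimulation: an action potential is generated
    return (np.log(B / (1 - (1 - B) * GAMMA * np.exp(-tau)))
            + EPS * np.log((1 - thresh) / (s - thresh)))

print(f(0.0, 0.3))  # the parameter point plotted in Figure 1 (s = 0.3, T = 0)
```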
We introduce here a model of a neural network which will be called a PPN. The following is a complete definition of the PPN model.

DEFINITION. A PPN is a network of a finite number of (neural) elements that have an internal state r and a response to stimulation given by an equation r' = f(r + T, s). In this equation, r is a scalar that represents the internal state of the element, and r' is the new internal state after the element has received a stimulation of strength s, given that the last stimulation occurred a time T earlier. The generation of an action potential as a result of a stimulation to an element is determined by the new state r' and stimulation strength s; if r' is less than some value R(s), then no action potential results from the stimulation. A PPN has a finite number of connections
between elements. The connections are one-way paths along which action potentials, or information pulses, propagate. When an action potential propagates along a connection, there are two forms of delay: a fixed, finite, nonzero delay dependent only upon the connection (it can be thought of as the length of the connection), and a variable delay g(r', s) that is independent of the connection (this delay is in the generation of the action potential). Each connection in the network has associated with it a (synaptic) weight s, which gives the strength of the stimulation at the receiving element. The stimulations are assumed to act instantaneously. The functions f and g are assumed to be C^r with r > 0, except possibly at R, where they may be discontinuous. If a neural element should receive stimulation from more than one connection simultaneously, then we will choose (arbitrarily) the largest of the stimulations to occur.

The exact form of the functions f and g is not important in the following, although we have something similar to Figure 1 in mind. We will also consider inhibitory/refractory neurons, whose change of state function can be written as a piecewise linear function. The question of what to do about simultaneous stimulation of neurons is not important because stimulations are assumed to act instantaneously--one can equally well take the sum of the simultaneous stimulations. Neither is a loss of smoothness or discontinuity in the change of state function f at r = R important. What is important is that the occurrence of stimulations that result in r' = R will often mark the boundaries between regions where the PPN has very different information processing, dynamical evolutions, or computational results; the demonstration of this is a significant result of this paper.

The aim of this paper is to consider two questions. What is the dynamics of a PPN? What is the potential of a PPN for computation and information processing? We give first a simple example.
The simplest neural elements one might use in a PPN are absolutely refractory/inhibitory elements, that is, they will respond to a second stimulation after a previous firing or after an inhibitory pulse only if the second pulse arrives after some refractory/inhibitory period R has elapsed. The simplest networks of such neural elements are composed of either strictly excitatory or strictly inhibitory connections with only fixed delays, that is, g ≡ 0. To illustrate such PPNs we consider the construction of a binary pseudo-random number generator.

EXAMPLE 1. Two mutually exciting, absolutely refractory neural elements (see Figure 2(a)) form an oscillator if the combined delay of both connections T is greater than the refractory period R. The oscillation can be started by a stimulation to one of the elements. Suppose we have two such oscillators 1 and 2, with periods T1 and T2, where T1 < T2. These are coupled to a fifth absolutely refractory/inhibitory element, with oscillator 1 by an excitatory connection and oscillator 2 by an inhibitory connection of the same delay, as shown in Figure 2(b). Suppose that oscillator 1 is started first by a stimulation and oscillator 2 is started by a stimulation a time x later. The output of this network, firing of the fifth element, can only occur at integer multiples of T1 as a result of stimulations from oscillator 1, but sometimes firing will not occur at these times because the stimulation arrives during an inhibitory period induced by inhibition from oscillator 2. The nonfiring or firing of the fifth element can be represented as a sequence of binary digits, 0 for nonfiring and 1 for firing. It can be seen that a 1 will occur at the n-th position in the binary sequence if the condition

nT1 − x > R (mod T2)
is satisfied. It can easily be shown that if T1/T2 is irrational, then there is a unique binary sequence for each 0 ≤ x < T1, which is aperiodic if x/T2 is irrational.
FIGURE 2. A simple PPN. (a) Two mutually exciting absolutely refractory/inhibitory neurons. These two neurons form an oscillator if both neurons make supra-threshold stimulations and the total delay in the propagation of an action potential from one neuron to the other and back is greater than the refractory period. The oscillation can be started by a stimulation to one neuron. In (b), the outputs of two such oscillators are directed to a fifth neuron. One oscillator provides an excitatory stimulation and the other an inhibitory stimulation. By adjusting the periods of the two oscillators, as described in the text, this PPN can implement the linear congruence algorithm to generate a binary pseudo-random sequence.
FIGURE 3. The total activity of a large PPN. Some computer simulations of PPN have been made with elements that are either absolutely refractory/inhibitory elements, relative refractory elements, or the realistic model elements described in Figure 1. The PPN were connected into regular grids with local connections and randomly connected clusters. Shown here in (a) is a typical measurement of the total activity of a PPN exhibiting complex behaviour. The total activity is the total number of action potentials propagating at a given instant, that is, if the PPN has a future (St, Ft), then the total activity is the number of events in Ft. The total activity of a PPN is analogous to the EEG of a real neural network. Both long period oscillations with complex wave-forms and apparently aperiodic behaviour have been observed. PPN with the realistic model elements can be proved to be chaotic. This simulation is of 50 neurons connected randomly to 3 other neurons with connection strengths chosen uniformly between -0.3 and 1.0. The neurons have a threshold of 0.1 and a relative refractory behaviour similar to real neurons. The simulation involved more than 10,000 events. (b) is an expanded view of the boxed region of (a). One can see a self-similarity in this signal.
Thus, this five element PPN can function as a binary pseudo-random number generator with x functioning as the seed; in fact, it is an implementation of the linear congruence algorithm. It should be noted that this implementation is "better" than any pseudo-random number generator that can be implemented on a digital computer (a finite Z-machine), since their "random" sequences are always periodic, but that it is not truly random because the binary sequence is quasi-periodic.
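To make the example concrete, here is a small Python sketch (ours, not the authors') that generates the output sequence directly from the firing condition nT1 − x > R (mod T2); the parameter values are arbitrary.

```python
import math

def ppn_prng_bits(T1, T2, R, x, n_bits):
    """Binary output of Example 1: bit n is 1 iff the excitatory pulse at
    time n*T1 falls outside the inhibitory period, i.e.
    (n*T1 - x) mod T2 > R."""
    return [1 if (n * T1 - x) % T2 > R else 0 for n in range(1, n_bits + 1)]

# With T1/T2 irrational the sequence is aperiodic (quasi-periodic). A
# floating point simulation only approximates this, of course: every
# float is a rational number, so the true PPN is strictly stronger.
bits = ppn_prng_bits(T1=1.0, T2=math.sqrt(2), R=0.25, x=0.3, n_bits=32)
print("".join(map(str, bits)))
```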
We see from this simple example that a PPN is capable of fairly complex dynamics even when the neural elements are not individually very complex. We also see that a PPN can perform computations impossible on digital computers, or more precisely, on any Turing machine, since the above is effectively equivalent to calculating the binary expansion of an arbitrary real number. Arbitrary real numbers, however, are not computable, that is, their binary expansions cannot be calculated with a Turing machine (Chaitin, 1987).

Some computer simulations of larger, randomly and regularly connected PPNs have been made. It was found that these networks apparently could have very complex behaviour. With the more realistic one-dimensional change of state function described earlier, it is possible for PPNs to have truly chaotic dynamics. Figure 3 is a typical time sequence of the total activity, that is, the total number of action potential pulses propagating at a given instant. The total activity is the equivalent of an EEG of a biological neural network, and superficially the time sequence of the total activity resembles a real EEG. We now ask, what are the dynamics and information processing potential of PPN in general?

3. THE DYNAMICS OF PPN

We will give a formal description of the dynamics of a PPN. As dynamical systems, PPNs are neither discrete nor continuous time systems; they are discrete event systems. To describe the dynamics of a PPN, an extended concept of phase space is required; however, with realistic constraints it can be shown that PPNs are finite-dimensional, discrete dynamical systems defined by piecewise smooth maps.

For a PPN, let St be the state of all the neural elements at the time t, that is, St gives each element's
internal state r and the time of its last stimulation T. Let E_ij^t be the event of a pulse from neural element i arriving at element j at time t. Two alternatives can occur as a result of the event E_ij^t; they are:
1. The state of the element j changes, or
2. The state of j changes and a new action potential pulse is created which divides and propagates to all elements to which j is connected.
Case 2 is equivalent to:
2'. The state of j changes and a finite number of new events E_jk^τ are created which occur at different times in the future, that is, for some τ > t. From the definition of PPN, τ = D_jk + g(r', s), where D_jk is the fixed delay of the connection from element j to element k and g(r', s) is the delay in the creation of the action potential.

The dynamics of a PPN can be described as follows. First note that St is piecewise constant, changing only when some event E_ij^t occurs. Let Ft be the future events at any time t, that is, a sequence of events Ft = [E_ij^s, E_kl^τ, ...], ordered according to the times at which they occur, t ≤ s ≤ τ ≤ .... Note also that Ft changes only when s = t, that is, when the first event E_ij^s in the sequence occurs. In Case 1 above, this event is removed from the sequence of future events. In Case 2, the first event is removed and a finite number of new future events are added; these new events are slotted in amongst the other future events. The future behaviour of the network is determined by the pair (St, Ft) and so such a pair will be called the future. We can calculate the evolution of a future by determining the effect of each event E_ij^t on St and Ft, that is, at the instant of an event E_ij^t, (St, Ft) → (S't, F't), where S't is the new state after the event E_ij^t has occurred and F't is the new sequence of future events. For an initial state S0 and initial events F0, the set of futures (St, Ft), for t < T, will be called the evolution up to time T and will be denoted ET(S0, F0). The entire evolution will be denoted E(S0, F0). We note again that the future (St, Ft) is piecewise constant, and only changes when the next event of Ft occurs.

The future (St, Ft) is the "phase space" of the PPN, but in general, it is not equivalent to the usual notion of phase space from dynamical systems theory because the future event list Ft is continually changing. If the neurons have a nonzero refractory period, which is a realistic constraint, then it can be shown that the number of events in Ft has an upper bound. As a consequence, it can be shown that the evolution E(S0, F0) is described by a finite-dimensional, discrete dynamical system with a piecewise smooth map (Judd & Aihara, 1992). However, since we will not use this equivalence in this paper we will not prove it.
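The pair (St, Ft) maps directly onto an event-driven simulation. The Python sketch below (our illustration; f, R, g, the delays and the weights are placeholders for the model of Section 2) processes the future event list as a priority queue, mirroring Cases 1 and 2 above.

```python
import heapq

def simulate_ppn(elements, connections, f, R, g, initial_events, t_end):
    """Event-driven evolution of a PPN.

    elements:       dict j -> [r, t_last], internal state and last stimulation time
    connections:    dict j -> list of (k, D_jk, s_jk) outgoing connections
    f, R, g:        change of state function, threshold, generation delay
    initial_events: list of (t, i, j, s) pulses already in flight (F_0)
    """
    future = list(initial_events)       # F_t, the future event list
    heapq.heapify(future)               # ordered by time of occurrence
    while future:
        t, i, j, s = heapq.heappop(future)
        if t > t_end:
            break
        r, t_last = elements[j]
        r_new = f(r + (t - t_last), s)  # r' = f(r + T, s)
        elements[j] = [r_new, t]        # Case 1: the state of j changes
        if r_new >= R(s):               # Case 2: an action potential is created
            for k, D_jk, s_jk in connections.get(j, []):
                # new event at tau = t + D_jk + g(r', s), slotted into F_t
                heapq.heappush(future, (t + D_jk + g(r_new, s), j, k, s_jk))
    return elements
```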
4. PPN AS COMPUTING MACHINES

We will first note that a PPN is capable of performing any calculation that a Turing machine can make. That a PPN can make a calculation that a Turing machine cannot was demonstrated in Example 1. Together these imply that PPNs are strictly more powerful than Z-machines. Some observations of PPNs in relation to X-machines over different fields X will be made.

An X-machine is a computing machine that applies a fixed algorithm to data of type X (Eilenberg, 1974; Herman & Isard, 1970), though the algorithm may enable the X-machine to apply an algorithm defined by the input data. The algorithm of an X-machine is defined by a directed graph called the transition graph. The transition graph has three types of nodes: a single input node and possibly several output nodes, identified by having attached just one out-going or in-going edge, respectively; function nodes with one in-going and one out-going edge; and branch nodes with one in-going edge and more than one out-going edge. The machine operates by putting one element of X-data at the input node, then carrying this datum along the directed edges. Each function node has associated with it a function, or rule, for transforming X-data. When the datum arrives at a function node, the function is applied to it and the new datum is carried off. Each branch node has associated with it a test. At a branch node the test is applied to the datum and the machine proceeds on an edge determined by the outcome of the test. The machine stops when it arrives at an output node, and the output of the machine is the current datum. More complex machines receiving more than one input or operating on vectors are also possible, but are effectively the same as the simple machine just described. Turing machines are equivalent to Z-machines. It should be noted that X-machines can be subclassified by the kind of rules, or functions, they use. For example, R-machines can be restricted to polynomial functions (Blum et al., 1989; Smale, 1990).
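For concreteness, here is a minimal Python rendering of an X-machine's transition graph (our illustration; the node encoding is ours): function nodes transform the datum, branch nodes route it, and the run stops at an output node.

```python
# Node encodings (ours): ("fn", h, next) applies the rule h and moves on;
# ("branch", test, next_true, next_false) routes on a test; ("out",) stops.
def run_x_machine(graph, start, datum):
    node = graph[start]
    while node[0] != "out":
        if node[0] == "fn":
            _, h, nxt = node
            datum = h(datum)              # function node: transform the datum
            node = graph[nxt]
        else:
            _, test, nxt_t, nxt_f = node  # branch node: test and route
            node = graph[nxt_t if test(datum) else nxt_f]
    return datum

# A toy R-machine computing |x| with polynomial rules and one sign test:
graph = {
    "in":   ("branch", lambda x: x >= 0, "id", "neg"),
    "id":   ("fn", lambda x: x, "done"),
    "neg":  ("fn", lambda x: -x, "done"),
    "done": ("out",),
}
print(run_x_machine(graph, "in", -3.5))  # 3.5
```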
We now ask, are PPNs X-machines, and if so, what kind of X-machines are they? That is, what is X and what kind of rules can these machines use? The description in the last section of a PPN is formally similar to a Turing machine, where St is the state of the machine, Ft is the tape, the events E_ij^t are the symbols on the tape with t denoting the position on the tape, and the rules of the machine are the connections of the network, that is, these rules determine how new symbols are placed on the tape. As stated, this is not a finite state machine, that is, it is not strictly a Z-machine, but in any case, the following result holds:

THEOREM 1. Any Z-machine can be represented by a PPN with absolutely refractory/inhibitory elements and g ≡ 0.

Despite the formal similarity noted above, the easiest proof of this theorem is simply to implement a Turing machine in "hardware" as in the following sketch. Figure 4 illustrates the hardware components needed. First of all, note that one can make a two element oscillator and arrange that this provides a global synchronization clock (Figure 4(a)).
FIGURE 4. Components of a Turing machine. A Turing machine can be implemented using a PPN by connecting together the components shown here. (a) is an oscillator which is used to provide a global synchronization clock as in a conventional digital computer; and (b) is a memory cell based on an oscillator with the same period and synchronized to the global clock. The memory cell holds a value 1 if the oscillator is running and 0 if it is not. There are connections to start the oscillator with an excitatory stimulation and to stop it with an inhibitory stimulation, that is, a write 1 and a write 0 or erase input. (c) is a Turing machine tape implemented as a chain of the above memory cells. One of these cells can be read or written to and is the location of the Turing machine head. Special signals L, R and S cause the information stored in the tape memory to be shifted left or right, or to stay where it is. Here the negations of these signals, ¬L and ¬R, are used to give inhibitory stimulations to neurons connecting the cells to prevent information being shifted, and ¬S is used to erase memory cells during a shift. For a Turing machine an infinite tape is required, which contradicts the finiteness in the definition of a PPN, but since only a finite portion of the tape is ever used there is no difficulty in extending the definition to allow for an infinite tape. (d) are implementations of logical OR, AND and NOT functions. We can arrange that all signals are synchronized by the global clock and hence we can use this clock signal to implement the logical NOT and AND functions. A Turing machine can now be built by connecting these components together.
Then one can construct the other necessary hardware: memory cells for the internal state of the machine (Figure 4(b)); a shift register memory that functions as the Turing machine tape, which can shift left, right, or not at all in response to special pulses, and has a tape reading and writing head (Figure 4(c)); and logical OR, AND and NOT circuits, with the aid of the synchronization clock (Figure 4(d)). It is then possible to construct any Turing machine, given its set of rules, by connecting logical circuits to provide the necessary shift, write, and read signal pulses according to the rules of the machine. The construction in this sketch is, of course, not necessarily the most efficient means of representing a given Turing machine, nor is it absolutely necessary that a global clock be provided.

This ability to represent any Z-machine shows that a PPN must be a class of X-machine that contains the Z-machines. However, we have already seen in Example 1 that a PPN can perform some calculations Z-machines cannot, and so this class of machines is strictly larger than the Z-machines.
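As a sanity check on the gate constructions of Figure 4(d), the following sketch (ours; the weights and thresholds are illustrative, not taken from the paper) shows how clocked pulse logic yields OR, AND and NOT: at each clock tick an element fires iff its summed stimulation reaches threshold, and NOT is an element excited by the clock and inhibited by its input.

```python
# One clock tick of pulse logic: inputs are 1 (pulse) or 0 (no pulse).
def fires(stimulations, threshold):
    return int(sum(stimulations) >= threshold)

def OR(a, b):         # either excitatory pulse suffices
    return fires([a, b], threshold=1)

def AND(a, b):        # requires both simultaneous excitatory pulses
    return fires([a, b], threshold=2)

def NOT(a, clock=1):  # the clock excites; the input inhibits
    return fires([clock, -a], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", OR(a, b), AND(a, b), NOT(a))
```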
Since a PPN can represent any real number as the time between two events, does this make a PPN an R-machine? We consider this question next.

To be an R-machine a PPN requires three properties: to be able to represent arbitrary real numbers, calculate arbitrary functions, and test which is the greater of two arbitrary real numbers. Taking a restricted class of functions is also reasonable; the algebraic theory of R-machines of Blum et al. (1989) considers polynomial functions. We will show that a PPN can be an R-machine using a class of piecewise smooth functions.

A PPN can take any real number as an input, represented as the time between two events. In general, it can receive a finite set of real numbers S as input information in the form of the periods between a number of input stimulations relative to some input reference pulse. That is, the reference pulse tells the PPN that it is about to receive input; this input also defines time 0. The finite set of real numbers S is defined either by consecutive stimulations to one input or stimulations to several inputs. In the following, we will assume for convenience that the set S includes 0. Without loss of generality, we may assume that the output of the machine is a single real number y, represented as the time between an output reference pulse and a second information pulse. The values that y can take define the data set X of the PPN.

Consider first the case of a PPN where g is a constant, which can be assumed to be zero. In this case, the propagation delays of all connections are fixed. Let D be the set of all connection delays. Since stimulations act instantaneously, the time at which any event occurs must be in the set
S + Z^+(D) = { s + Σ_{t∈D} z_t t | s ∈ S, z ∈ Z^+ }.
Consequently, the output y of the PPN, which is the difference in the times of two events, must be a positive member of the set

S − S + Z(D) = { s − s' + Σ_{t∈D} z_t t | s, s' ∈ S, z ∈ Z }.

Thus, for a PPN where g ≡ 0, the output y is restricted to a member of a set T + Z(D), where T is determined by the input, and so the field over which calculations can be made by such a PPN is Z(D). The PPN is at most a Z(D)-machine. If the set D contains only rationals, then the PPN can be no more than a Z-machine, since by rescaling time, the output can only be an integer. If D contains irrationals, then Z(D) is dense in R. In this case, a Z(D)-machine is not an R-machine, but any real number could be approximated arbitrarily closely by increasing computation time. Conversely, real numbers could be expanded uniquely as in Example 1, and such machines are consequently strictly more powerful than Z-machines.

Now consider the case of a PPN where g is not constant or piecewise constant. (If g is piecewise constant, the PPN is essentially the same as the constant case.) Suppose instead that g has a range J = (0, a) ⊂ R, a > 0. (Since g is somewhere continuous and nonconstant, its range must contain an open interval. A more complex range gives essentially the same results as the range J.) For this PPN, with the set of fixed connection delays D and input stimulation times S, an event must occur at a time in the set

S + Z^+(D + J) = { s + Σ_{t∈D} z_t(t + u) | s ∈ S, z ∈ Z^+, u ∈ J }.

Consequently, the output y of the PPN can be a positive member of the set

S − S + Z(D + J) = { s − s' + Σ_{t∈D} z_t(t + u) | s, s' ∈ S, z ∈ Z, u ∈ J }.

It can be seen that if Z(D) ∩ J = ∅, then the above set is a union of disjoint open intervals, but if Z(D) ∩ J ≠ ∅, then the intervals overlap and y can be any positive real number, and the PPN can be an R-machine.

We now ask the question, if the range of g contains an open interval J, and Z(D) ∩ J ≠ ∅, can the PPN then represent any R-machine? That is, can the PPN have arbitrary rules? The answer is no. To be an arbitrary R-machine we would require that for any real scalar function y = h(x), there is a PPN such that given an input x, its output is y = h(x). We have instead that the rules are piecewise smooth functions, as described by the following theorem:
THEOREM 2. Let a PPN take a single input x, give an output y, and have connection delays D. If W is the set of x and S0 for which the PPN has finite calculation time for an initial state S0, then W can be divided into connected open sets on each of which the output function y = h(x, S0) is a finite composition of the operations of evaluating the functions f, g and addition of constants from D. Furthermore, the closure of the union of these open sets contains W.

We note, as a consequence of Theorem 2, that if g ≡ 0, then h is piecewise constant, since from the preceding discussion we have that in this case the range of h is in Z(D). If D contains irrationals, then a function h could be constructed to approximate any Borel measurable function arbitrarily closely, provided the sets on which h is constant can be suitably constructed. This is the subject of Theorem 3. In the case where g ≢ 0, such approximations would also be possible, since g is C^r. However, an arbitrarily close approximation may require arbitrarily long computations to construct the open sets on which h is defined.
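The density of Z(D) when D contains an irrational is easy to see numerically. The sketch below (ours) searches integer combinations z1·1 + z2·√2 for increasingly close approximations to an arbitrary target; the price of accuracy is larger coefficients, that is, longer computation time.

```python
import math

# Z(D) for D = {1, sqrt(2)}: integer combinations z1*1 + z2*sqrt(2).
# sqrt(2) irrational => these combinations are dense in R, so any target
# is approximated arbitrarily closely as the coefficients grow.
target = math.pi
best = float("inf")
for bound in (10, 100, 1000):
    for z2 in range(-bound, bound + 1):
        z1 = round(target - z2 * math.sqrt(2))  # best integer partner for z2
        best = min(best, abs(z1 + z2 * math.sqrt(2) - target))
    print(f"coefficients up to {bound}: error {best:.2e}")
```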
To prove Theorem 2 we consider how an evolution E_T(S0, F0) changes with S0 and F0. We consider having a fixed set of m initial events F0(x), where x ∈ X ⊆ R^{+m} is a vector giving the times at which the m events occur. If S ⊆ R^{2N} is the set of allowable states of the N elements of the PPN, then any evolution of the PPN can be specified by a point in S × X. If u = (S0, x) ∈ S × X, we will write E_T(u) = E_T(S0, F0(x)).

LEMMA 1. If T < ∞, then E_T(u), u ∈ S × X, contains only finitely many events.

Proof. Each connection has a minimum nonzero delay, and hence any chain of causal events E_ij^t, t < T, must be of finite length. Each element of a PPN has only a finite number of connections, and so can only produce a finite number of new events, and hence can only initiate a finite number of chains of events. For u = (S0, x) ∈ S × X, the set of initial events F0(x) is finite, and hence the above E_T(u) contains only a finite number of events.

DEFINITION. A stimulation s of a neural element in a state τ, given that the last stimulation occurred a time T earlier, is called critical if

f(τ + T, s) = R,
that is, the new state of the element is at the border between action potential generation and no action potential generation. An event E_ij^t of a PPN in a state St is called critical if the corresponding stimulation is critical or there exists another event E_kj^t, that is, there are at least two stimulations of the neural element occurring simultaneously. An evolution E_T is called critical if it contains a critical event.

Critical events are a key to understanding the evolution of PPN; the following lemma shows that if the evolution of a PPN contains no critical events up to some time T for a given initial condition u = (S0, x) ∈ S × X, then any initial condition in a neighbourhood of u has essentially the same evolution as for u. First, we must make precise the notion of "up to time T."
DEFINITION. A suspension of an evolution E(u) is an evolution up to a time T, such that no event of E_T(u) occurs at the time T, that is, there is no event E_ij^T. A C^r function T(u) from U ⊆ S × X to R^+ is called a suspension on U if, for all u ∈ U, E_{T(u)}(u) is a suspension.

LEMMA 2. If a suspended evolution E_T0(u0) is noncritical, then all of the following are true.
1. There exists an open connected set U ⊆ S × X containing u0 and a suspension T on U with T(u0) = T0, such that for all u ∈ U, E_{T(u)}(u) is noncritical.
2. For any event E_ij^{t0} in E_T0(u0) there exists a C^r function t on U with t(u0) = t0, such that for all u ∈ U there is an event E_ij^{t(u)} in E_{T(u)}(u).
3. There are no events in E_{T(u)}(u) other than those given in 2.
4. If S_{t0} is a state of E_T0(u0), there exists a C^r function s on U with s(u0) = S_{t0}, such that for all u ∈ U, s(u) is a state of E_{T(u)}(u).
5. There are no states in E_{T(u)}(u) other than those given in 4.
6. The functions t and s of 2 and 4 are finite compositions of the functions f, g and addition of constants from D.

Proofs of this and the following lemmas are given in the appendix. In simple language, Lemma 2 states that if a suspended evolution E_T0(u0) is noncritical, then there is an open connected set of initial conditions U containing u0 such that every evolution E(u) for u ∈ U up to some time T(u) is noncritical and identical to the evolution E_T0(u0) in the sense that the same events occur, but possibly at different times. Or, put another way, only critical events can significantly change evolutions.

Let WT ⊆ S × X be the set of evolutions which terminate in a time T, that is, if (S0, F0) ∈ WT, then Ft = ∅ for all t > T. Note that for w ∈ WT, the function T(w) = t, for any t > T, is a suspension on WT.

LEMMA 3. Let V ⊆ WT be the set of noncritical evolutions. Let v1 ∈ V1 ⊆ V and v2 ∈ V2 ⊆ V, where V1 and V2 are open connected sets, as in Lemma 2, for the evolutions E_T(v1) and E_T(v2), respectively. If V1 ∩ V2 ≠ ∅, then V1 ∪ V2 satisfies Lemma 2 for both E_T(v1) and E_T(v2).

Lemma 3 shows that an open set satisfying Lemma 2 can be expanded to a maximal set by considering whether points on the boundary of the set are noncritical. (If not, then an open neighbourhood of the point can be included in the set.) Applying this idea, we obtain the following lemma.

LEMMA 4. Let w ∈ WT have a noncritical evolution E_T(w), and let U(w) be the maximal set satisfying Lemma 2 for E_T(w). If u ∈ ∂U(w), then E(u) is a critical evolution.

The maximal sets U(w) in Lemma 4 represent an important concept and will be given a name.

DEFINITION. Let T(u) be a suspension on U. Define an evolution domain of T to be a maximal set satisfying Lemma 2 for some E_{T(u)}(u), u ∈ U.

An evolution domain is a set of initial conditions whose evolutions are identical in the sense of Lemma 2. All evolutions in an evolution domain are noncritical, and the next lemma shows that the critical evolutions form the boundaries between evolution domains.

LEMMA 5. Let T(u) be a suspension on U. If E_{T(u)}(u), u ∈ U, is a critical evolution, then there exists an evolution domain W ⊂ U such that u ∈ ∂W. If W is an evolution domain in U, then ∂W is a piecewise C^r codimension 1 submanifold.
Lemmas 4 and 5 imply that WT can be decomposed into open connected sets (the evolution domains) on which all evolutions are identical in the sense of Lemma 2. Furthermore, these sets fit together tightly, with the boundaries between evolution domains being formed by critical evolutions. The output function h of Theorem 2 will in fact be defined on evolution domains.

Proof of Theorem 2. By Lemma 4 we have that WT can be decomposed into evolution domains. On each of these domains we have by Lemma 2 that there are fixed chains of events, and the times of the events are given by C^r functions which are finite compositions of f, g and addition of constants from D. In particular, this holds for the two events that specify y, and hence for h on WT. Clearly, W = ∪_{T>0} WT. By the converse part of Lemma 4, we have that any critical evolution is on the boundary of one of the above open sets, and hence the closure of these sets must contain W. ∎

We conclude this section with a discussion of the dynamics and information processing of PPN in the light of the preceding proof. We note that the important concept that has been introduced is that of the evolution domain. From the point of view of the dynamics of a PPN, evolution domains provide a means to analyze recurrent behaviour. For example, let T1 and T2 be two suspensions on U ⊆ S × X such that T2 > T1. Suppose D is an evolution domain of the suspension T2 and there exists a C^r map φ: D → U such that for u ∈ D, if the evolution E(u) has a future (S, F) at time T2(u), then E(φ(u)) has the future (S, F) at time T1(φ(u)). That is, for each u ∈ D there exists a φ(u) which evolves to the same future as u, but in a shorter time. Now suppose E_T2 ∘ φ(D) ⊆ D; then the evolution of D is recurrent. Now standard tools from dynamical systems theory can be used. For example, the recurrence implies there exists a periodic evolution in D, and if E_T2 ∘ φ were a contraction mapping, then all evolutions in D would converge asymptotically to this periodic evolution.

From the point of view of information processing, an evolution domain is the set of initial conditions that, by the time of the suspension T, have an undifferentiated information content. To see this, we make the following useful observation. If T1 and T2 are two suspensions on U, with T2 > T1, then every evolution domain of T2 is a subset of an evolution domain of T1. This is true because any critical evolution up to T1 is also critical up to T2. The implication of this observation is that evolution domains can only be subdivided as time progresses; they cannot change shape. This subdivision can be viewed as a decision process that extracts information from the initial conditions. We note that the evolution of a PPN in different evolution domains can be very different, and hence provide clear distinction of information present in the initial conditions.
5. APPROXIMATION OF R-MACHINES

Theorem 2 suggests that although a PPN is not a general R-machine, it might at least be able to approximate one arbitrarily closely.

THEOREM 3. If h is a bounded positive function on a bounded domain X ⊂ R^+, then for any f and g there exists a sequence of PPN P_n taking a single input x, giving an output y, with output function y = h_n(x), such that h_n → h in L^1(X).
Proof. We prove this result by giving a method of constructing the P_n. The construction is by no means the most efficient, but is sufficient. We show first that for any 0 < a < b ∈ X and y ∈ R^+ there is a PPN with an output function

h(x) = { none,  x < a
       { y,     a < x < b
       { none,  x > b.
First note that for any f we can choose the connection weights so that every connection is either fully excitatory or fully inhibitory. This can be done by making the weights large positive or negative values. Now consider the PPN shown in Figure 5, which can be made to do the job of providing h. We assume that initially all elements are in a rest state and ready to fire on stimulation. This PPN has two input elements I and X. Element I receives the input reference pulse that defines time zero, and X receives the information pulse at time x. There are two output elements O and Y, with O receiving the output reference pulse at some time t and Y receiving the information pulse at the time t + y. There are two pairs of elements that act as oscillators O1 and O2. The element A acts as a gate to allow or disallow an information pulse to pass.

The PPN operates as follows. First of all, the input reference pulse to I causes it to stimulate and start oscillator O1, and the information pulse to X causes it to stimulate A at some time x + d. Oscillator O1 makes multiple inhibitory connections to A. The delays of these connections are so arranged that A receives a continuing sequence of closely spaced inhibitory pulses that have the effect, once they begin, of completely inhibiting A. We arrange that the first inhibitory pulse from O1 arrives at A in a time less than d, so that element A is completely inhibited before it can be stimulated from X. However, oscillator O1 also receives an inhibitory pulse from I which is timed so that it stops the oscillation of O1. As a result, when O1 stops, A is no longer inhibited. We arrange that A is able to be stimulated at a time a + d, which implies stimulation from X can only be successful if x > a. However, there is a second oscillator O2 that is started by a pulse from I and has multiple inhibitory connections to A in the same manner as O1.
FIGURE 5. The basic R-machine component. Here is illustrated the basic component required to build an approximation to any R-machine. The operation of this PPN is described in detail in the proof of Theorem 3. The only unusual aspect of this network is the multiple inhibitory connections. These are a number of inhibitory connections from the same source neuron which have differing delays and hence cause the neuron A to be inhibited for extended periods.
That is, once O2 is started and the first inhibitory pulse arrives at A, then A is completely inhibited. We arrange that A is inhibited by O2 after a time b + d. Consequently, the stimulation from X can only be successful if a < x < b. If a < x < b, then A is successfully stimulated and will then stimulate the output elements O and Y. We arrange that the difference in the delay of the connections to these elements is y. Note that if x < a or x > b, then the stimulation of A is not successful and there is no output from the PPN. Thus the PPN has the required output function.

It now follows that for any piecewise constant function an L^1(X)-equivalent function can be constructed from several PPN of the above type. Furthermore, for any bounded function h there is a sequence of piecewise constant functions h_n such that h_n → h in L^1(X). ∎

In the language of the preceding section, the PPN of Figure 5 used in Theorem 3 has exactly three evolution domains, x < a, a < x < b, and x > b. The two critical events that define the boundaries of the evolution domains are a critical stimulation of A by X at a time a + d, and the simultaneous events of stimulation of A by X and inhibition of A by O2 at time b + d.

The consequence of Theorem 3 is that for any R-machine there are PPN whose output approximates the R-machine arbitrarily closely. This is done by approximating all functions in the R-machine as above and approximating the decisions at branch nodes by comparing outputs of functions against fixed delays.
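The approximation scheme of the proof is easy to emulate numerically. The sketch below (ours; the target function is arbitrary) treats each Figure 5 component as a window that outputs y when a < x < b, and combines a bank of such components into a piecewise constant h_n that converges to a bounded positive h in L^1.

```python
import numpy as np

def window_component(a, b, y):
    """One Figure-5-style component: output y if a < x < b, else nothing."""
    return lambda x: y if a < x < b else None

def approximate(h, x_lo, x_hi, n):
    """A bank of n components: a piecewise constant approximation of h."""
    edges = np.linspace(x_lo, x_hi, n + 1)
    comps = [window_component(a, b, h(0.5 * (a + b)))
             for a, b in zip(edges[:-1], edges[1:])]
    def h_n(x):
        for c in comps:
            out = c(x)
            if out is not None:
                return out
        return 0.0  # only at window boundaries, a measure-zero set
    return h_n

# The L1 error shrinks as n grows, i.e., h_n -> h in L1(X).
h = lambda x: 1.0 + np.sin(x) ** 2       # bounded, positive target
for n in (4, 16, 64):
    h_n = approximate(h, 0.0, 3.0, n)
    xs = np.linspace(0.0, 3.0, 10_000, endpoint=False) + 3.0 / 20_000
    err = np.mean([abs(h(x) - h_n(x)) for x in xs]) * 3.0
    print(f"n = {n:3d}: L1 error ~ {err:.4f}")
```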
6. CONCLUSION

It has been demonstrated through Theorems 1 and 2 that PPN have considerable potential for information processing. They are more powerful than Z-machines,
but not as powerful as general R-machines, although it is possible for them to approximate such machines arbitrarily closely. The discussion following the proof of Theorem 2 suggests ways in which to continue the analysis begun here and to further investigate the potential of PPN. The proof of Theorem 2 shows that the key to information processing by PPN is in the control of critical events. It is hoped that further analysis of the dynamics of PPN will reveal ways to control critical events and perform novel forms of information processing. Computer simulations of PPN have already shown that critical events can be controlled.

Perhaps further analysis can provide some insight into information processing of biological neural networks. One implication of our results is that the complex fluctuations seen in EEG signals, which are generally taken to be noise, may be indicative of deeper, more complex forms of dynamical information processing similar to that of PPN. If this is the case, it was noted that the temporal coding of PPN is more precise and faster than information processing based on representation of continuous values by average firing rates. Of course, we do not suggest that biological networks are organized as the networks used in our proofs. These networks were constructed merely to show that the potential for dynamical information processing exists. Further study of PPN and computer simulations have also shown that PPN can be constructed to spontaneously learn and recognize quite arbitrary recurrent temporal sequences of stimulations (Judd & Aihara, 1992).
REFERENCES

Amari, S. (1978). Mathematical theory of neural networks (in Japanese). Tokyo: Sangyo-tosyo.
Amari, S. (1989). Mathematical foundations of neurocomputing (METR 89-06). University of Tokyo, Tokyo.
Amari, S., & Maginu, K. (1988). Statistical neural dynamics of associative memory. Neural Networks, 1(1), 63-74.
Amit, D. J. (1989). Modelling brain function: The world of attractor neural networks. Cambridge: Cambridge University Press.
Anderson, J. A., & Rosenfeld, E. (Eds.). (1988). Neurocomputing: Foundations of research. Cambridge, MA: MIT Press.
Arbib, M. A. (1967). Finite automata. New York: McGraw-Hill.
Blum, L., Shub, M., & Smale, S. (1989). On the theory of computational complexity over the real numbers. Bulletin of the American Mathematical Society, 21(1), 1-47.
Chaitin, G. J. (1987). Algorithmic information theory. Cambridge tracts in theoretical computer science. Cambridge: Cambridge University Press.
Davis, M. (1985). Computability and unsolvability. New York: McGraw-Hill.
Eilenberg, S. (1974). Automata, languages and machines, Vol. A. New York: Academic Press.
FitzHugh, R. (1969). Mathematical models of excitation and propagation in nerves. In H. P. Schwan (Ed.), Biological engineering (pp. 1-85). New York: McGraw-Hill.
Freeman, W. J. (1987). Simulation of chaotic EEG patterns with a dynamical model of the olfactory system. Biological Cybernetics, 56, 139-150.
Garzon, M. (1990). Cellular automata and discrete neural networks. Physica D, 45, 431-440.
Glass, L., Guevara, M. R., & Shrier, A. (1985). Universal bifurcation and the classification of cardiac arrhythmias, perspectives in biological dynamics and theoretical medicine. Annals of the New York Academy of Sciences, 504, 168-178.
Grossberg, S. (Ed.). (1988). Neural networks and natural intelligence. Cambridge, MA: MIT Press.
Hayashihara, M., Yamashita, M., & Ae, T. (1990). On machine power of neural networks depending on continuity or discreteness of sigmoid functions (in Japanese). Transactions of the Institute of Electronics, Information and Communication Engineers, D(8), 1220-1226.
Herman, G. T., & Isard, S. D. (1970). Computability over arbitrary fields. Journal of the London Mathematical Society, 2(2), 73-79.
Hirsch, M. W. (1989). Convergent activation dynamics in continuous time networks. Neural Networks, 2, 331-349.
Hodgkin, A. L., & Huxley, A. F. (1952). A quantitative description of membrane current and its application to conduction and excitation. Journal of Physiology, 117, 500-544.
Hopfield, J. J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proceedings of the National Academy of Sciences USA, 81, 3088-3092.
Judd, K., & Aihara, K. (1992). Generation, recognition and learning of recurrent signals by pulse propagation networks (in preparation).
Judd, K., Hanyu, Y., Takahashi, N., & Matsumoto, G. (1991). A simple model of nerve excitation that agrees with experiment (submitted for publication).
Judd, K., & Matsumoto, G. (1991). On stimulated dynamical systems and applications to action potential generation in neurons (submitted for publication).
Koch, C., & Poggio, T. (1987). Biophysics of computation: Neurons, synapses and membranes. In G. M. Edelman, W. E. Gall, & W. M. Cowan (Eds.), (pp. 637-698). New York: John Wiley & Sons.
MacGregor, R. J. (1988). Neural and brain modeling, neuroscience series 1. San Francisco, CA: Academic Press.
Minsky, M. (1967). Computation: Finite and infinite machines. London: Prentice-Hall.
Nagumo, J., Arimoto, S., & Yoshizawa, S. (1962). An active pulse transmission line simulating nerve axon. Proceedings of the Institute of Radio Engineers, 50(10), 2061-2070.
Perkel, D. H. (1976). A computer program for simulating a network of interacting neurons. Computers and Biomedical Research, 9, 31-43, 45-66, and 67-73.
Rall, W. (1959). Branching dendritic trees and motoneuron membrane resistivity. Experimental Neurology, 1, 491-527.
Sampson, J. R. (1984). Biological information processing. New York: John Wiley & Sons.
Shepherd, G. M. (1979). The synaptic organization of the brain. Oxford: Oxford University Press.
Smale, S. (1990). Some remarks on the foundation of numerical analysis. SIAM Review, 32, 221.
Stannett, M. (1990). X-machines and the halting problem: Building a super-Turing machine. Journal of Formal Aspects of Computing, 2, 331-341.
APPENDIX

Proof of Lemma 2. Consider any event $E^t_{jk}$, occurring when the PPN is in a given state, in the evolution $E_{T_0}(u_0)$. For this event let $\tau_k$ be the state of element $k$, $s_{jk}$ the stimulation, and $T_k$ the time since the last stimulation. Since the event is noncritical, $f(\tau_k + T_k, s_{jk}) \neq R$, and no other event occurs at time $t$. Since $f$ is $C^r$, there exists an open, connected set $Y_{jk} \subseteq \mathbb{R} \times \mathbb{R}^+$ with $(\tau_k, T_k) \in Y_{jk}$, such that for all $(\tau, T) \in Y_{jk}$, $f(\tau + T, s_{jk}) \neq R$. Note that $Y_{jk}$ is a set of noncritical perturbations of the event $E^t_{jk}$.

Suppose that the event $E^t_{jk}$ is not an initial event, but was produced by an event $E^s_{ij}$, $s = t - T_k$. Using the above notation for the element $j$, there is an open, connected set $Y_{ij}$ containing $(\tau_j, T_j)$ of noncritical perturbations of the event $E^s_{ij}$. By the definition of a PPN we have that

$$T_k = D_{jk} + g(f(\tau_j + T_j, s_{ij}), s_{ij}),$$

where $D_{jk}$ is the fixed delay of the connection of element $j$ to element $k$. Define

$$\tilde{Y}_{ij} = \{ (\tau, T) \in Y_{ij} \mid (\tau_k, D_{jk} + g(f(\tau + T, s_{ij}), s_{ij})) \in Y_{jk} \}.$$

Since $f$ and $g$ are $C^r$ it follows that $\tilde{Y}_{ij}$ is open and contains $(\tau_j, T_j)$. Note that $\tilde{Y}_{ij}$ is a set of noncritical perturbations of the event $E^s_{ij}$ such that the resulting perturbation of the event $E^t_{jk}$ is a noncritical stimulation. Furthermore, it is easily seen that this remains true if we extend the above definition to

$$\tilde{Y}_{ij} = \{ (\tau, T) \in Y_{ij} \mid (\tau_k, D_{jk} + g(f(\tau + T, s_{ij}), s_{ij})) \in Y_{jk}, \ k \in K_j \},$$

where $K_j$ is the set of all elements to which the element $j$ connects. Note that for each $k \in K_j$ there is an event $E^t_{jk}$ for some $t$, and hence $\tilde{Y}_{ij}$ is a set of noncritical perturbations of the event $E^s_{ij}$ such that the resulting perturbations of the events $E^t_{jk}$, $k \in K_j$, are noncritical stimulations. However, it may happen that one of the perturbed events $E^t_{jk}$ occurs at the same time as some other event. We overcome this difficulty as follows. The event $E^t_{jk}$ occurs for $(\tau, T) \in \tilde{Y}_{ij}$ at the time

$$t_{jk}(\tau, T) = s + D_{jk} + g(f(\tau + T, s_{ij}), s_{ij}).$$

Note that $t_{jk}(\tau_j, T_j) = t$ and that $t_{jk}$ is $C^r$ because $f$ and $g$ are $C^r$. Since $E_{T_0}(u_0)$ is noncritical there exists an open, connected set $Z_{ijk} \subseteq \tilde{Y}_{ij}$, with $(\tau_j, T_j) \in Z_{ijk}$, such that for all $(\tau, T) \in Z_{ijk}$, $t_{jk}(\tau, T)$ is not the same as the time of any other event. Note that $Z_{ijk}$ is a set of noncritical perturbations of the event $E^s_{ij}$ such that the resulting perturbation of the event $E^t_{jk}$ is also noncritical. Furthermore, since $f$ and $g$ are $C^r$, there exists an open, connected set $Z_{ij} \subseteq \bigcap_{k \in K_j} Z_{ijk}$, with $(\tau_j, T_j) \in Z_{ij}$, such that $t_{jk}(\tau, T) \neq t_{jm}(\tau, T)$ for all $k, m \in K_j$ and $(\tau, T) \in Z_{ij}$. The set $Z_{ij}$ is then an open, connected, nonempty set of noncritical perturbations of the event $E^s_{ij}$ such that the resulting perturbations of the events $E^t_{jk}$, $k \in K_j$, are also noncritical. Define $Y^{(1)}_{ij}$ to be an open, connected subset strictly contained in $Z_{ij}$. Note that the closure of $Y^{(1)}_{ij}$ is a connected, nonempty set of noncritical perturbations of the event $E^s_{ij}$ such that the resulting perturbations of the events $E^t_{jk}$, $k \in K_j$, are also noncritical.
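For orientation, the nesting of the sets just constructed may be summarized as (this display is our summary of the construction, not notation from the lemma itself)

$$Y^{(1)}_{ij} \subset Z_{ij} \subseteq \bigcap_{k \in K_j} Z_{ijk} \subseteq \tilde{Y}_{ij} \subseteq Y_{ij},$$

each set being open, connected, and consisting of noncritical perturbations of $E^s_{ij}$.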
It easily follows that we may now construct, by the preceding technique, open, connected sets $Y^{(n)}_{ij}$ containing $(\tau_j, T_j)$ such that any $(\tau, T)$ in the closure of $Y^{(n)}_{ij}$ is a noncritical perturbation of the event $E^s_{ij}$, and such that any event in a chain of $n$ or fewer events initiated by $E^s_{ij}$ is given a noncritical perturbation. By Lemma 1 there exists a finite $N$ such that every chain of events in $E_{T_0}(u_0)$ is of length less than $N$. Now consider the sets $Y^{(N)}_{ij}$ for the initial events. Let $U \subseteq S \times X$ be the set of initial conditions consistent with the sets $Y^{(N)}_{ij}$, that is, such that every initial event lies in its corresponding set. The set $U$ contains $u_0$ and is open and connected. Now choose a function $T(u)$ on $U$ that is greater than the perturbed times of all events of $E_{T_0}(u_0)$ whose unperturbed times are less than $T_0$, and such that $T(u)$ is less than the perturbed time of any event occurring after $T_0$ which is produced by an event of $E_{T_0}(u_0)$. The function $T$ is a suspension on $U$ since for any $u$ in the closure of $U$ every event is noncritical; in particular, no event whose unperturbed time was less than $T_0$ can occur at the same perturbed time as any event occurring after $T_0$ which is produced by an event of $E_{T_0}(u_0)$, and hence $T(u)$ can be chosen to be different from the times of these perturbed events and to divide the perturbed events of $E_{T_0}(u_0)$ from the new events they produce. The set $U$ and the function $T$ satisfy the requirements of (1), that is, $E_{T(u)}(u)$ is noncritical for $u \in U$.

In the preceding proof of part (1), it was shown for every event $E^t_{jk}$ produced by an event $E^s_{ij}$ that the perturbed time of the event $E^t_{jk}$ for $(\tau, T) \in Z_{ijk}$ is
$$s + D_{jk} + g(f(\tau + T, s_{ij}), s_{ij}),$$
and furthermore it can be seen that the perturbed new state of element $k$ is given by

$$f(\tau_k + D_{jk} + g(f(\tau + T, s_{ij}), s_{ij}), s_{jk}).$$

Since $f$ and $g$ are $C^r$ functions it follows that both the perturbed new state and the perturbed time of the event are $C^r$ functions of $(\tau, T)$. Furthermore, in any chain of perturbed events the times of the events and the new states of the elements are compositions of $C^r$ functions and are hence also $C^r$. This establishes (2), (4), and (6). To prove (3) and (5), note that any event is either produced by another event or is an initial event. Hence, for any event there is a finite chain of events connecting it to an initial event, and the only way to produce a new event by changing the initial conditions, without changing the initial events, is through a critical event. Since $E_{T(u)}(u)$ is noncritical for all $u \in U$, the only events in this evolution are perturbations of events in $E_{T_0}(u_0)$; this establishes (3). Furthermore, note that a state only changes as a result of an event, and hence (5) follows from (3). ■
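To make the composition structure used above concrete, the following sketch traces one chain of events numerically, computing each event time via $t_{jk} = s + D_{jk} + g(f(\tau + T, s_{ij}), s_{ij})$. It is a minimal illustration only: a PPN requires merely that $f$ and $g$ be $C^r$, so the particular smooth forms below, the parameter values, and the names f, g, and propagate_chain are all our assumptions for the example, not part of the model specification.

import math

# Illustrative smooth (C^r) choices for the element functions; the actual
# f and g of a PPN are only required to be C^r, so these specific forms
# are assumptions made for the sake of the example.
def f(x, s):
    """New state of an element given accumulated quantity x and stimulation s."""
    return math.tanh(x) + 0.5 * s

def g(state, s):
    """Response delay contributed by the element; chosen to stay positive."""
    return 0.1 * (1.0 + state ** 2) + 0.05 * s

def propagate_chain(tau0, T0, stimulations, delays, t0=0.0):
    """Follow one chain of events: each event time is
    t_jk = s + D_jk + g(f(tau + T, s_ij), s_ij), so event times and new
    states along the chain are compositions of f and g, hence C^r
    functions of the initial (tau, T)."""
    tau, T, t = tau0, T0, t0
    times = []
    for s_ij, D_jk in zip(stimulations, delays):
        new_state = f(tau + T, s_ij)        # perturbed new state of the element
        T_next = D_jk + g(new_state, s_ij)  # time until the next stimulation
        t = t + T_next                      # perturbed time of the next event
        times.append(t)
        tau, T = new_state, T_next
    return times

# A small perturbation of (tau0, T0) perturbs every event time smoothly:
base = propagate_chain(0.2, 0.3, stimulations=[1.0, 0.8, 1.2], delays=[1.0, 0.7, 0.9])
pert = propagate_chain(0.2 + 1e-6, 0.3, stimulations=[1.0, 0.8, 1.2], delays=[1.0, 0.7, 0.9])
print(base)
print([p - b for p, b in zip(pert, base)])  # shifts of order 1e-6: smooth dependence

Running the sketch shows event-time shifts of the same order as the initial perturbation, which is the smooth dependence established in part (2).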
Proof of Lemma 3. For any event in $E_T(v_1)$ there exists a corresponding event in $E_T(v_2)$, since there exists $v_3 \in V_1 \cap V_2$ and $E_T(v_3)$ has the same events as both $E_T(v_1)$ and $E_T(v_2)$. Let $t_i(v)$ and $s_i(v)$ be $C^r$ functions giving the perturbed time of an event and the perturbed new state of an element on $V_i$, for $i = 1$ or $2$. By Lemma 2 there exists an open set $V_3 \subseteq V_1 \cap V_2$ containing $v_3$ for which there are $C^r$ perturbation functions $t_3(v)$ and $s_3(v)$ for the above event and state. Necessarily, we have that $t_1(v) = t_3(v) = t_2(v)$ and $s_1(v) = s_3(v) = s_2(v)$ on $V_3$. It follows that $t_i(v)$ and $s_i(v)$ can be extended in a unique way into $C^r$ functions on $V_1 \cup V_2$. ■
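Explicitly, the unique extension asserted here can be written (in our notation) as

$$t(v) = \begin{cases} t_1(v), & v \in V_1, \\ t_2(v), & v \in V_2, \end{cases}$$

which is well defined because $t_1$ and $t_2$ give the perturbed time of the same event wherever both are defined, and is $C^r$ because each branch is $C^r$ on an open set; the same construction applies to $s(v)$.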
Proof of Lemma 4. Suppose that $u \in \partial U(w) \cap W_T$ and $E_T(u)$ is a noncritical evolution. If $U'$ is an open set as in Lemma 2, then $U(w) \cap U' \neq \emptyset$, and by Lemma 3, $U(w) \cup U'$ satisfies Lemma 2 for $E_T(w)$, which contradicts $U(w)$ being maximal. Suppose on the other hand that $u \in \partial U(w) \setminus W_T$. Then $E(u)$ has an event $E^t_{ij}$ with $t > T$. If $T < s < t$, then $T(v) = s$ is a suspension for $v \in W_T$, and furthermore, $s$ is strictly greater than the time of any event in any evolution $E(v)$ for $v \in U(w)$. Hence $E^t_{ij}$ cannot correspond to any event in an evolution $E(v)$ for $v \in U(w)$, since there is no $C^r$ function $t(v)$ giving the time of a perturbed event which connects the event $E^t_{ij}$ to an event in an evolution $E(v)$ for $v \in U(w)$. Consequently, $E^t_{ij}$ must be a new event, and such a new event can only be produced by a critical event. ■
Proof of Lemma 5. Let $T(u)$ be a suspension on $U$. An event $E^t_{ij}$ will be called a simple event of $E_{T(u)}(u)$, $u \in U$, if it is an initial event, or if every event in the chain of events that produced it from an initial event is noncritical and the state of element $j$ is not $R$, that is, the last stimulation was not critical. By arguments identical to parts of Lemma 2, it can be shown that there exists an open neighbourhood $W \subseteq U$ of $u$ and $C^r$ functions $\tau_j$, $T_j$, and $t_j$ on $W$ such that, for $w \in W$, $\tau_j(w)$ is the state of element $j$, $T_j(w)$ is the time since the last stimulation of element $j$, and $t_j(w)$ is the time of the event. Furthermore, $W$ can be chosen so that $\tau_j(w) \neq R$ on $W$. A simple event $E^t_{ij}$ receives a critical stimulation in $W$ if and only if $f(\tau_j(w) + T_j(w), s_{ij}) = R$. Since $\tau_j(w) \neq R$ and $f$, $\tau_j$, and $T_j$ are $C^r$, if this equation has a solution $w \in W$, then by the implicit function theorem there exists an open neighbourhood $W_0 \subseteq W$ of $w$ and a codimension 1, $C^r$-manifold $M$ of solutions in $W_0$. Similarly, a simple event $E^t_{ij}$ is simultaneous with another simple event $E^t_{lm}$ on $W$, and is hence critical, if and only if

$$t_j(w) = t_m(w).$$

Once again, if this equation has a solution $w \in W$, then by the implicit function theorem there exists an open neighbourhood $W_0 \subseteq W$ of $w$ and a codimension 1, $C^r$-manifold $M$ of solutions in $W_0$. Hence, if a simple event is critical, then there is a local codimension 1 $C^r$-submanifold $M$ in $W_0$ of evolutions in which the simple event is critical. Note that if $w \in W_0 \setminus M$ then the simple event is not critical.

Let $E^t_{ij}$ be the first critical event in a chain of events in some evolution $E_{T(u)}(u)$, $u \in U$. If the state of element $j$ is not $R$ for this event, then it is a simple event, and by the above there exists an open neighbourhood $W_0$ of $u$ and a local codimension 1 $C^r$-submanifold $M$ in $W_0$ of evolutions in which this event $E^t_{ij}$ is critical.

Let $E_{T(w)}(w)$, $w \in U$, be a critical evolution. By Lemma 1, every evolution $E_{T(u)}(u)$, $u \in U$, has finitely many events, and hence only finitely many critical events. Suppose there are $n$ simple critical events in $E_{T(w)}(w)$ which are the first critical event in some chain of events. By the above there exists an open neighbourhood $W_0$ of $w$ and $n$ codimension 1 $C^r$-submanifolds $M_i$, $i = 1, \ldots, n$, one for each of these $n$ critical events. Note that $w \in \bigcap_i M_i$, and $\bigcap_i M_i$ is a submanifold with codimension at least 1. Furthermore, note that any critical event is either a simple critical event of the above type, or else its existence depends upon a previous critical event of the above type, and hence it only occurs in an already defined submanifold of critical evolutions. Since the evolution $E_{T(w)}(w)$ has only finitely many critical events, there are at most finitely many submanifolds of critical evolutions meeting at $w$. Consequently, in any neighbourhood of $w$ there must exist a noncritical evolution, and hence $w$ is on the boundary of an evolution domain. This proves the first part of the lemma.

To show that the boundary of an evolution domain is piecewise $C^r$, we first note that the evolution domain contains only finitely many different events; hence there can be only a finite number of different combinations of critical events in the critical evolutions on the boundary of the evolution domain. Since each of these critical evolutions is contained either in a local codimension 1 $C^r$-submanifold or in an intersection of such manifolds, it follows that the boundary of the evolution domain is composed of a finite number of codimension 1 $C^r$-submanifolds. ■
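As a worked illustration of the implicit function theorem step above (with an invented criticality function, purely for illustration): suppose the critical stimulation condition reduces near a point of $\mathbb{R}^2$ to $F(w_1, w_2) = w_1^2 + w_2 - R = 0$, with $F$ of class $C^r$. Then $\partial F / \partial w_2 = 1 \neq 0$, so the implicit function theorem gives

$$M = \{ (w_1, w_2) : w_2 = R - w_1^2 \},$$

locally the graph of a $C^r$ function, that is, a codimension 1 $C^r$-submanifold of critical evolutions; off $M$ the simple event is noncritical, exactly as in the proof.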