Journal of Statistical Planning and Inference 33 (1992) 5-25
North-Holland

Stochastic methods for neural systems

Edward J. Wegman* and Muhammad K. Habib**

Received 1 August 1990; accepted 3 October 1990

Abstract: This paper is a survey of recent developments in the application of stochastic modeling and statistical analysis of biological and artificial neuron systems. We focus first on a general description of information processing in the central nervous system including the basic physiology of synaptic communications. We discuss some of the current models of and inference for membrane potential including temporal models. We discuss some of the stochastic modeling and analysis of neural spike trains. From a focus on communications between individual pairs of neurons, we turn our attention to broader scale neural network considerations, closing our discussion with models of networks with hidden layers.

Key words and phrases: Artificial neuron systems; neural spike trains; semi-martingales; central nervous system.
1. Introduction
This paper presents a survey of some recent developments in the application of stochastic modeling and statistical analysis of biological and artificial neural systems. Recent advances in neurophysiology and experimental psychology provide statisticians with experimentally generated data about information processing in the nervous system which is amenable to careful statistical modeling and analysis. Experiments in electrical neurophysiology fall naturally into four categories. The first category consists of recording the electrical activity of ionic channels in single neurons. Understanding the dynamics of ionic channel openings and closings under a variety of experimental conditions and in response to different stimuli sheds light on how neurons receive and integrate information at the microscopic level. Yang and Swenberg (1992) apply an estimation method, introduced by Le Cam (1986), in order to estimate important parameters in studies of the electrical behavior of ionic channels.

Correspondence to: Dr. Edward Wegman, Center for Computational Statistics, 157 Science-Technology II, George Mason University, Fairfax, VA 22030, USA.
* The research of this author was sponsored by the Army Research Office under contract DAAL03-87-K-0087, by the Office of Naval Research under contract N00014-89-J-1502 and by the National Science Foundation under grant DMS-90002237.
** The research of this author was sponsored by the Office of Naval Research under contract N00014-89-J-1807.
The second category consists of experiments of intracellular recordings to examine the subthreshold behavior of the somal membrane potential of single neurons. These studies shed light on neuronal integration of synaptic input as reflected by the difference in voltage across the somal membrane of nerve cells. On the quantitative side, the somal membrane potential of a neuron is modeled as a solution of stochastic differential equations driven by point processes, a Wiener process or, in general, by a semi-martingale. These models include parameters which reflect important neurophysiological properties such as the effective somal membrane time constant, amplitudes and frequencies of post-synaptic potentials and the variability of synaptic input. Methods of estimation of these parameters are discussed in Habib and Thavaneswaran (1992). See also Heyde (1992) for an elegant discussion of recent methods of inference for stochastic processes. The third category of experiments is concerned with the extra-cellular recordings of trains of action potentials (or spike trains) generated by neurons spontaneously or in response to stimuli. This is a vigorous area of research in experimental and theoretical neurobiology. The statistical analysis of these spike trains, which are modeled as stochastic point processes, has attracted much attention from statisticians (see e.g. Brillinger, 1988). In addition to studies of spike trains of single neurons, there has been great interest in the study of the correlation between spike trains of two or more neurons. Quantitative neurophysiological studies of two or more simultaneously recorded spike trains using measures of cross correlation have proven effective in indicating the existence and type of synaptic connections and other sources of functional interaction among neurons in small neural networks. See, for example, Toyama, Kimura and Tanaka (1981) and Michalski et al. (1983). For discussions of the statistical aspects of correlation studies see Habib and Sen (1985) and Aertsen et al. (1989). The fourth category consists of experiments concerned with information represented in large neural networks, using labeling techniques (such as the 2-deoxyglucose or '2DG' method). These experiments provide data about the way in which a sensory stimulus is represented in large neural networks in the sensory areas of the brain (see e.g. Juliano et al., 1981). Unfortunately, these types of studies have received comparatively little attention from statisticians.

In Section 2, a brief description of information processing in the nervous system is given for the benefit of those readers who are not familiar with this subject. The following sections are concerned with the stochastic modeling and statistical analysis of biological and artificial neural systems.
2. Information processing in the nervous system
The central nervous system (CNS) is viewed as a communication system which receives, processes and transmits a large amount of information. Some degree of uncertainty is inherent in the behavior of the neural communication system because
of its anatomical and functional complexity and the non-deterministic nature of the
responses of its components to identical stimuli or experience. This uncertainty (or stochastic variability) is reflected in the electrical behavior of single neurons as well as small and large neural networks.

The basic unit of the nervous system which receives and transmits information is the nerve cell or neuron. (See Figure 1 for a schematic diagram of the neuron.) The neuron has three morphological regions: the cell body (or soma), the dendrites and the axon. The soma contains the nucleus and many of the organelles involved in metabolic processes. The dendrites form a series of highly branched protoplasmic tree-like outgrowths from the cell body. The dendrites and the soma are the sites of most specialized junctions where signals are received from other neurons. The axon is typically a protoplasmic extension which exits the soma at the initial segment (or axon hillock). Near its end, the axon branches into numerous axonal terminals, which are responsible for transmitting signals from the neuron to other parts of the system. The junction between two neurons is called the synapse, which is an anatomically specialized junction between two neurons where the electrical activity in one neuron influences the activity of the other. A single motor neuron in the spinal cord receives, probably, some 15 000 synaptic junctions, while neurons in the brain may have more than 100 000 synapses.

The electrical activity in the nervous system is due to the presence of organic as well as inorganic electrically charged ions. Among the important inorganic ions are sodium (Na+), potassium (K+) and chloride (Cl-). These ions are present outside (i.e. in the extracellular fluid) as well as inside the cell. The cell's membrane is selectively permeable to different ions. This leads to a difference in concentration of ions on both sides of the membrane, which in turn leads to a difference in potential across the membrane. The transmembrane potential is regulated by, among other things, active as well as passive membrane transport mechanisms (see e.g. Kuffler and Nicholls, 1976), and is defined by convention as inside minus outside potential. In the absence of synaptic events, the membrane potential is kept at a certain level called the resting potential (which is about -60 to -70 mV). The potential difference between the inside and the outside of the membrane of an excitable cell depends on the ionic concentration gradients and the selective permeability of the membrane. The ions are transported across the membrane through structures or pathways which are called channels (such as potassium channels and sodium channels). These channels exist in active and inactive states (it is also said that the channels are closed or open). See, for instance, Neher and Stevens (1977). Ionic channels open and close in a stochastic manner. Transitions between these states are usually modeled as a finite-state Markov process.

When a chemical synapse is activated due to the arrival of action potentials along the axon of the presynaptic neuron, a chemical substance called a neural transmitter is released into the synaptic cleft. The transmitter then crosses the synaptic cleft, combines with the receptor sites of the postsynaptic membrane and produces a change in potential. This potential change is called the postsynaptic potential (PSP).
Fig. 1. Schematic diagram of two neurons with a magnified image of their synapse.
If the PSP results in reducing the potential difference in the post-synaptic membrane (i.e. if the membrane is depolarized), the PSP is called an excitatory post-synaptic potential (EPSP). On the other hand, if the membrane is hyperpolarized (i.e. the difference in potential across the membrane is increased) as a result of the arrival of the post-synaptic potential, it is called an inhibitory post-synaptic potential (IPSP). Between synaptic events the membrane potential decays exponentially to a resting potential. At a spatially restricted area of the neural soma (where the sodium conductance, per unit area, is high relative to that of the remaining somal membrane), the excitatory and inhibitory post-synaptic potentials are integrated. Now, if the membrane potential reaches a level called the neuron's threshold (about -30 mV), the membrane undergoes a very rapid, transient change which is known as the action potential. Action potentials may last only 1 ms, during which time the membrane potential changes from -60 to about +30 mV and then returns to the resting potential. After each action potential there is a period of reduced excitability called the refractory period. Action potentials are nearly identical in shape and hence they carry a limited amount of information in the shape of their wave form. It is believed, then, that the temporal patterns of spike trains are the information carriers in many areas in the CNS. The purpose of many studies is to investigate the temporal behavior of spike trains in many areas in the CNS under different controlled conditions, that is, in the presence and absence of external stimuli, and under normal as well as experimentally modified conditions.
3. Statistical analysis of ion channel activity
As has been mentioned in Section 2, the neuronal membrane is selectively permeable to different ions. That is, the membrane contains different types of ionic channels (or gates) which selectively allow the passage of specific ions across the membrane. When a channel allows the passage of a certain ion, it is said that the channel is open, and otherwise it is said to be closed. Tuckwell (1989) briefly discusses a two-state Markov model that describes the transition between states of a single channel. However, it is well established that channels exist in several open and closed substates. One proposed scheme is to assume the channel has n open states, o_1, o_2, ..., o_n, and m closed states, c_1, c_2, ..., c_m. This can schematically be represented as follows:

o_1 ⇌ o_2 ⇌ ··· ⇌ o_n ⇌ c_1 ⇌ c_2 ⇌ ··· ⇌ c_m.

At any time the channel is assumed to be in one of the kinetically distinct states S = S_1 ∪ S_2, where S_1 = {o_1, o_2, ..., o_n} and S_2 = {c_1, c_2, ..., c_m}. Let X_t denote the kinetic state of the channel at time t. The level of conductance of the channel can be described by

u(t) = 1 if X_t ∈ S_1, and u(t) = 0 if X_t ∈ S_2.

Yang and Swenberg (1992) studied the kinetic transitions of ionic channels through the observed current record B(t) of N channels, where

B(t) = ∑_{j=1}^N u_j(t),

and addressed some general questions of model identifiability.
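To fix ideas in the simplest two-state case (n = m = 1), the following sketch, our own illustration rather than code from Tuckwell (1989) or Yang and Swenberg (1992), simulates a single open/closed channel with exponentially distributed dwell times and recovers the transition rates from the observed dwell times. The rate names lam_open and lam_close and all numerical values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_channel(t_max, lam_open, lam_close):
    """Two-state (closed=0 / open=1) Markov channel.
    lam_open: rate of leaving the closed state; lam_close: rate of leaving open."""
    t, state = 0.0, 0
    times, states = [0.0], [0]
    while t < t_max:
        rate = lam_open if state == 0 else lam_close
        t += rng.exponential(1.0 / rate)   # exponential dwell time
        state = 1 - state
        times.append(t)
        states.append(state)
    return np.array(times), np.array(states)

times, states = simulate_channel(1000.0, lam_open=2.0, lam_close=5.0)
dwells = np.diff(times)
open_dwells = dwells[states[:-1] == 1]     # dwell i starts in state states[i]
closed_dwells = dwells[states[:-1] == 0]

# Mean open dwell = 1/lam_close, mean closed dwell = 1/lam_open.
print("lam_close estimate:", 1.0 / open_dwells.mean())
print("lam_open estimate: ", 1.0 / closed_dwells.mean())
```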
4. Stochastic models of and inference for the membrane potential
Theoretical and experimental aspects of neuronal integration of synaptic input as reflected by the difference in voltage across the somal membrane of nerve cells have been extensively studied. Among the important factors influencing synaptic integration mechanisms are the geometry of the post-synaptic neuron (Rall, 1977), the spatial organization, the temporal patterns of synaptic inputs, and their amplitudes. For instance, Rall (1964) showed that the somatic potential is critically dependent on the timing of synaptic inputs arranged in an orderly sequence of distances from the soma of the post-synaptic neuron, the temporal patterns of activation of the synapses, and the shapes and amplitudes of the post-synaptic potentials (PSP). In this section we discuss models of the subthreshold behavior of the somal membrane potential; that is, its behavior between the time it is equal to a resting potential up to the time the potential reaches the neuron's threshold.
There is an extensive literature concerning stochastic models of the membrane potential of neurons (Johannesma (1968), Ricciardi and Sacerdote (1979), Cope and Tuckwell (1979), Kallianpur (1983)). These models are descriptive in the sense that they are not studied along with real data in order to investigate certain aspects of neuronal synaptic integration, but rather, these models relate the subthreshold behavior of the transmembrane potential to important neurophysiological parameters. The so-called leaky integrator model of membrane potential appears to be more realistic than the rest of the models. In the leaky integrator model, the fact that the membrane potential decays between synaptic events is taken into account. A parameter representing the membrane time constant is then included in the model. In addition, the post-synaptic potentials (PSP) are modeled as point events, i.e. events occurring randomly in time according to Poisson process models. The excitatory post-synaptic potentials (EPSP) are assumed to arrive at the soma according to independent Poisson processes. Similar assumptions are made concerning the inhibitory post-synaptic potentials (IPSP). Parameters representing the rates of arrival as well as the amplitudes of the EPSPs and IPSPs are included. It is clear, then, that estimating these parameters from real data by recording intracellularly the voltage trajectory of the somal membrane of neurons should shed light on the important aspects of the mechanisms which underlie neuronal integration of synaptic input.

These quantitative studies of neuronal integration lend themselves to studies of the neural basis of higher brain functions, such as learning and memory. For example, parameters of the models may be estimated under different conditions during the period of alteration of synaptic connectivity due to normal experience through the critical period. Another application is to estimate these parameters before and after applying experimental paradigms based on the principles of classical conditioning. We believe that advanced methods of inference for stochastic processes along with appropriate stochastic modeling should enhance the results of experimental studies.
5. Temporal stochastic neuronal models

It has conventionally been assumed that the synaptic inputs to a neuron can be
treated as inputs delivered to a single summing point on the neuron's surface (the axon hillock). That such an assumption is restrictive is clearly indicated by the well-established anatomical observation that several types of neurons in the CNS have extensively branched dendritic receptive surfaces and that synaptic inputs occur both on the somatic region and the dendrites. However, the temporal models are appropriate for modeling experimentally generated intracellular recordings of the membrane potential in the absence of any experimental information about the spatial aspects of synaptic input. The temporal stochastic model of the subthreshold behavior of the somal membrane potential we will discuss below is based on the following assumptions and experimentally established neurophysiological observations.
1. The subthreshold state of the neuron is assumed to be characterized by the difference in potential across its somal membrane (membrane potential) near a spatially restricted area of the soma at which the sodium conductance, per unit area, is high relative to that of the remaining somal membrane. This area is thought to be the action potential (or spike) generating area and is frequently called the trigger zone (axon hillock). The membrane potential at any time t is modeled as a stochastic process, V(t), defined on a probability space (Ω, ℱ, P). V(t) is assumed to be subject to instantaneous changes due to the occurrence of (idealized) post-synaptic potentials (PSP) of two different types: excitatory post-synaptic potentials (EPSP) and inhibitory post-synaptic potentials (IPSP). In the absence of post-synaptic activity, the membrane potential decays exponentially with rate ρ. Therefore an incremental decay ΔV(t) during a small time interval Δt is given by ΔV(t) = -ρV(t)Δt.

2. There exist thousands of synapses on the surfaces of certain types of neurons. For example, there exist on the order of 20 000 synapses on the surface of a typical motor neuron. The number of synapses may increase to 100 000 for some types of sensory neurons. In response to a stimulus, a rather small number of synapses are activated. Tanaka (1983) established that approximately 10 to 30 synapses are activated on the surface of some types of simple and complex cells in the visual cortex in response to certain visual stimuli. The rest of the synapses, though, may still be active due to the spontaneous or other activity of the presynaptic neurons projecting to the neuron under study. We, therefore, model two types of synaptic activity. The first is post-synaptic input received due to stimulus presentation; we assume that there are n_1 excitatory synapses and n_2 inhibitory synapses. The EPSPs are assumed to arrive according to stochastic point processes N_k^e(t), with stochastic intensities λ_k^e(t) and stochastic amplitudes (or potential displacements) a_k^e(t), k = 1, 2, ..., n_1. Similarly, the IPSPs are assumed to arrive according to stochastic point processes N_k^i(t), with intensities λ_k^i(t) and amplitudes a_k^i(t), k = 1, 2, ..., n_2. It is also assumed that this synaptic input is summed linearly at the trigger zone.

3. The rest of the synaptic input is lumped together, and it is assumed that the potential displacements are small in magnitude and occur frequently and independently. This input may then be modeled by a diffusion process driven by a standard Wiener process W(t). See Kallianpur (1983).

Based on the stated experimental observations and theoretical assumptions, we model V(t) as the solution of the following stochastic differential equation:

dV(t) = (μ - ρV(t)) dt + σ dW(t) + ∑_{k=1}^{n_1} a_k^e(t) dN_k^e(t) - ∑_{k=1}^{n_2} a_k^i(t) dN_k^i(t).   (5.1)
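A minimal Euler-type discretization of model (5.1) can be sketched as follows, assuming for simplicity constant PSP amplitudes, constant Poisson arrival rates for the pooled EPSP and IPSP streams, and parameter values of our own choosing; this is a simulation sketch, not an estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters (our choices, not fitted values):
mu, rho, sigma = 0.0, 1.0 / 5.0, 0.2   # drift, decay rate, diffusion scale
a_e, a_i = 0.5, 0.3                    # EPSP and IPSP amplitudes (mV)
lam_e, lam_i = 50.0, 30.0              # pooled EPSP and IPSP rates (1/s)
dt, T = 1e-4, 1.0

n = int(T / dt)
V = np.empty(n + 1)
V[0] = 0.0                             # potential measured from rest
for k in range(n):
    dNe = rng.poisson(lam_e * dt)      # EPSP arrivals in (t, t + dt]
    dNi = rng.poisson(lam_i * dt)      # IPSP arrivals in (t, t + dt]
    dW = rng.normal(0.0, np.sqrt(dt))  # Wiener increment
    V[k + 1] = V[k] + (mu - rho * V[k]) * dt + sigma * dW + a_e * dNe - a_i * dNi
print("mean and sd of V over the path:", V.mean(), V.std())
```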
The special case in which a_k^e(t) = α_k^e [V^e - V(t)] (a_k^i(t) = α_k^i [V^i - V(t)]) takes into account an important physiological property, namely reversal potentials. It is well established experimentally that the amplitudes of the post-synaptic potentials depend on the pre-existing value of the membrane potential. It is also well established
that the arrival of an action potential at a presynaptic terminal causes a release of a transmitter substance (for the cerebral cortex this could be a variety of substances including acetylcholine, glutamate, or glycine). In any case, a transmitter's action on the neuronal membrane at a given synaptic junction can be characterized by means of the experimentally observable reversal potential. This is the membrane potential at which the observed change in membrane potential caused by transmitter-induced conductance change is zero. Reversal potentials have been utilized in deterministic modeling of neuronal membranes (Rall, 1964).

It is desirable to extend model (5.1) in various directions in order to enhance its applicability. For instance, we may replace the term μ - ρV(t) by a continuous function f = f(V(t), t) which may not necessarily be linear in V. The Wiener process W may be replaced by a second order martingale, M^c, with continuous trajectories. This results in the following more general model:

dV(t) = f(V(t), t) dt + σ(t) dM^c(t) + ∫_U G(V(t-), u) Ñ(du, dt),   (5.2)

where Ñ is a linear combination of compensated point processes representing the stimulus evoked synaptic potentials, and U is a measurable space. In the special case when M^c = W and Ñ is a compensated Poisson process (i.e. Ñ(du, dt) = N(du, dt) - μ(du) dt, with EN(du, dt) = μ(du) dt), Kallianpur and Wolpert (1984) discussed the existence and uniqueness of a strong solution of (5.2).

Another important characteristic of central nervous system information processing is the dependence of both the magnitude and time course of the post-synaptic potential evoked by a given synapse on the spatial location of the active junction. This important feature is not considered in most existing stochastic models of single neurons, which have concerned themselves only with the influences of temporal summation of synaptic inputs. More specifically, as mentioned earlier, it has conventionally been assumed that the synaptic inputs to a neuron can be treated as inputs delivered to a single summing point on the neuron's surface (triggering zone). That such an assumption is unjustified is clearly indicated by the well-established anatomical fact that a great number of the neurons in the CNS have extensively branched dendritic receptive surfaces, and that synaptic inputs may occur both on the somatic region and the dendrites. Another common assumption is that synapses located on distal dendritic branches have little effect on the spike initiation zone of a neuron. According to this view, distally-located synapses would merely set the overall excitability of the neuron and would be ineffective in generating neural discharge activity. Synapses located near the soma of a neuron, on the other hand, are widely believed to directly and strongly influence neuronal firing behavior. A major extension of this view was suggested by Rall (1977), based on calculations of passive electrotonic current spread through the dendritic tree. Rall's work showed that distal synapses can play a functionally much more interesting role than previously assumed. More specifically, if the synaptic input to the dendrite has the appropriate
spatio-temporal characteristics, distal synapses can influence neuronal firing to a
much greater extent than is predicted on the basis of their dendritic location. In view of Rall's demonstration and in recognition of the suggestions (based on experimental evidence) that such a mechanism plays an important role in feature-extraction by single sensory neurons (Fernald, 1971), it seems necessary to carry out modeling studies to evaluate the potential for different spatial distributions of synaptic inputs to influence sensory neuron behavior.

The stochastic model (5.1) may be extended then to incorporate the important feature of synaptic spatial distribution. This extension is based on Rall's model neuron (Rall, 1977). In Rall's model neuron the cable properties of a system of branched dendrites are reduced to a one-dimensional equivalent dendrite. Considering the nerve cell as a line segment of finite length L, the subthreshold behavior of the membrane's potential, V(t, x), may be modeled as a solution of the following stochastic partial differential equation:

dV(t, x) = ( ∂²V(t, x)/∂x² - ρV(t, x) + μ ) dt + σ dW(t, x)
    + ∑_{j=1}^{n_1} α_j^e δ(x - x_j^e) [V_j^e(x) - V(t, x)] dN(λ_j^e(t), x, t)
    - ∑_{k=1}^{n_2} α_k^i δ(x - x_k^i) [V_k^i(x) - V(t, x)] dN(λ_k^i(t), x, t),   (5.3)
where δ is the delta function (or delta distribution), and x_j^e (x_k^i) is the location of the excitatory (inhibitory) synaptic inputs, which occur according to independent point processes with rates λ_j^e (λ_k^i) and amplitudes α_j^e (α_k^i), j = 1, 2, ..., n_1; k = 1, 2, ..., n_2. The solution of (5.3) is a stochastic process, {V(t, x), 0 ≤ t ≤ T, 0 ≤ x ≤ L}, which can be represented as a stochastic integral with respect to N(t, x). This model may be further extended along the same lines as model (5.2). See Habib and Thavaneswaran (1990) for a discussion of parameter estimation for models similar to (5.2) and (5.3).
6. Statistical analysis of spike trains
The transmission of trains of action potentials (or spike trains) between neurons is the primary means of communication of information in the nervous system. There exists an extensive body of literature on experimental and quantitative studies of the properties of extracellularly recorded spike trains, using microelectrodes, from single or multiple neurons. The goals of these studies are to elucidate the different functional roles of neurons in various areas in the nervous system in encoding and transmitting information in response to stimuli and to analyze patterns of spike trains generated by the neurons that are spontaneously active (i.e. activity in the absence of stimulation). The statistical analysis of stimulus evoked or spontaneously
generated spike trains may lead to estimates of physiological and anatomical properties of interest. For example, in sensory areas of the brain, such as the visual, auditory and somatosensory cortices, it is of great interest to determine the characteristics of patterns of neuronal responses to different stimuli, and the feature-extracting capabilities of the cortical neurons must be assessed with respect to each stimulus parameter (such as direction of movement, orientation, and velocity of each stimulus). This will lead to understanding the neuronal coding and information representation concerning the various stimulus parameters. For experimental studies of the spike train activity of single neurons see Whitsel et al. (1972). On the quantitative side there has been a large volume of studies in which spike trains were modeled as realizations of stochastic point processes; see for example Perkel, Gerstein and Moore (1967), Fienberg (1974), Yang and Chen (1978), Cope and Tuckwell (1979), De Kwaadsteniet (1982). Only weakly stationary stochastic point process models have been investigated and only stationary segments of spike trains have been selected in most published studies. An exception is the work of Johnson and Swami (1983), who proposed a non-stationary point process model to describe the spike activity of single auditory nerve fibers. The associated counting process of their model possessed a multiplicative random intensity, and they studied the effect of the refractory period on the post-stimulus time histogram.

The studies mentioned so far were concerned with single unit behavior. Quantitative neurophysiological studies of two or more simultaneously recorded spike trains using measures of cross-correlation and related statistical techniques have proven effective in indicating the existence and type of synaptic connections and other sources of functional interaction among studied neurons. For some of the influential theoretical studies of association between simultaneously recorded spike trains see Perkel, Gerstein and Moore (1967), Gerstein and Perkel (1972), Palm, Aertsen and Gerstein (1988). On the experimental side see Toyama, Kimura and Tanaka (1981), Michalski et al. (1983), Tanaka (1983), Bach and Kruger (1986). In these studies the spike trains were assumed to be jointly weakly stationary. This stringent assumption is not likely to hold in reality, in particular for stimulus driven neurons. The incorporation of non-stationary processes is then crucial to studies of the discharge spike activity of neurons driven by external stimuli; see Johnson and Swami (1983) for a discussion of certain classes of neurons in the auditory cortex that fire in a non-stationary fashion. Non-stationary models are particularly suitable for studies of neuronal aspects which change due to experience. Habib and Sen (1985) proposed the cross-correlation surface to study the non-stationary association between simultaneously recorded spike trains. Borisyuk et al. (1985) consider a non-stationary model of spike trains in which the counting process has a multiplicative random intensity with finite dimensional parameters. A similar model was proposed by Chornoboy, Schramm and Karr (1988). A discussion of the dynamics of neuronal firing correlation for non-stationary spike trains was recently presented by Aertsen et al. (1989).
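The cross-correlation measures referred to above are often computed, in their simplest stationary form, as a cross-correlogram of spike-time differences. The following sketch is our own minimal version with synthetic spike times; the bin width, lag window and the simulated 5 ms lead of one train over the other are illustrative assumptions, not values from the studies cited.

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, max_lag, bin_width):
    """Histogram of time differences t_b - t_a over all pairs within max_lag."""
    diffs = []
    for t in spikes_a:
        near = spikes_b[(spikes_b > t - max_lag) & (spikes_b < t + max_lag)]
        diffs.extend(near - t)
    bins = np.arange(-max_lag, max_lag + bin_width, bin_width)
    counts, edges = np.histogram(diffs, bins=bins)
    return counts, edges

# Two illustrative trains: B tends to fire about 5 ms after A.
rng = np.random.default_rng(2)
a = np.sort(rng.uniform(0.0, 10.0, 200))            # spike times in seconds
b = np.sort(np.concatenate([a + 0.005 + rng.normal(0, 0.001, a.size),
                            rng.uniform(0.0, 10.0, 100)]))
counts, edges = cross_correlogram(a, b, max_lag=0.02, bin_width=0.001)
print("peak at lag (s):", edges[counts.argmax()])   # expect about 0.005
```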
Let us consider non-stationary point process models of spike trains. Let T_1, T_2, T_3, ... be the (idealized) times of the occurrence of action potentials extracellularly recorded from a certain neuron. These times are modeled as a realization of a point process defined on a complete probability space (Ω, ℱ, P). Let N(t) be the counting process associated with the sequence {T_n}, i.e. N(t) counts the number of spikes in the interval (0, t], with N(0) = 0, or

N(t) = ∑_{n ≥ 1} I_{[T_n ≤ t]},
where I_A is the indicator of the set A. Let {ℱ_t, 0 ≤ t < T} be a non-decreasing family of sub-σ-fields of ℱ and define the internal history of N(t) by ℋ_t = σ{N(s), 0 ≤ s ≤ t}. By the Doob-Meyer decomposition, N(t) may be written as

N(t) = Λ(t) + M(t),

where M(t) is a martingale, i.e. E[M(t)] = 0, and Λ(t) is a predictable increasing process. If we assume that N satisfies the two conditions

P{N(t + dt) - N(t) = 1 | ℋ_t} = λ(t) dt + o(dt),
P{N(t + dt) - N(t) ≥ 2 | ℋ_t} = o(dt),

then

E[N(t + dt) - N(t) | ℋ_t] = λ(t) dt + o(dt),

and hence Λ(t) may be obtained as

Λ(t) = ∫_0^t λ(s) ds.
In this case λ is called the conditional intensity of N. Modeling λ is of great importance from a practical point of view. For example, if the neuron fires slowly and it is believed that it recovers before the next action potential is generated, it may be appropriate then to assume that λ(t) is ℋ_0-measurable, i.e. λ(t) does not depend on N(t). In this special case, N(t) is a doubly stochastic Poisson process. This model was employed by Habib and Sen (1985) in modeling non-stationary spike trains.
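A doubly stochastic Poisson spike train of this kind can be simulated by first drawing the random intensity path, which does not depend on N, and then thinning a homogeneous Poisson process in the spirit of Lewis-Shedler thinning. The randomly phased sinusoidal rate and all numerical values below are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Random rate path drawn first (independent of the spikes to come):
phase = rng.uniform(0.0, 2 * np.pi)
lam = lambda t: 20.0 + 15.0 * np.sin(2 * np.pi * t + phase)   # spikes/s
lam_max = 35.0                                                # upper bound on lam

# Thinning: candidates at rate lam_max, each kept with prob lam(t)/lam_max.
T = 10.0
t, spikes = 0.0, []
while True:
    t += rng.exponential(1.0 / lam_max)
    if t > T:
        break
    if rng.uniform() < lam(t) / lam_max:
        spikes.append(t)
spikes = np.array(spikes)
print("N(T) =", spikes.size, " (expected about", 20.0 * T, ")")
```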
Borisyuk et al. (1985) modeled λ multiplicatively by

λ(t) = λ_0(t) exp{β'Z(t)},   (6.1)

where λ_0 is a baseline process, Z(t) is a vector process which represents the influence of the pre-synaptic neurons projecting to the neuron of interest and β is a finite-dimensional parameter which represents the degree of influence or synaptic weights. Chornoboy, Schramm and Karr (1988) considered the following model:

λ(t) = λ_0 + ∑_{j=1}^J ∫_0^t h_j(t - s) dN_j(s),   (6.2)

where λ_0 is a constant representing spontaneous activity, N_j, j = 1, 2, ..., J, is the input received from the pre-synaptic neurons and h_j(t) are kernels representing the synaptic weights. An interesting model of λ which reflects both the influence of a limited number of influential cells (those pre-synaptic cells which are activated in response to a stimulus) as well as a large number of the spontaneously active presynaptic cells may be given by

dλ(t) = α dt + β dM(t) + ∑_{j=1}^J w_j dN_j(t),   (6.3)
where M(t) is a martingale with continuous trajectories. In this case we have a finite number of parameters to estimate. See Habib (1985) for the estimation of α and β with M assumed to be a Wiener process in the absence of the w_j's. Habib and Thavaneswaran (1992) considered the case where M is a second order martingale, but still in the absence of the point process input. The general case (6.3) has not yet been treated in the literature. Another interesting model for λ is one in which λ is modeled by

λ(t) = λ_0 + ∫_0^t h(t - s) dN(s) + ∫_0^t g(t - s) dX(s),   (6.4)

where {X(t)} may be either an observed point process or a mixed process. In (6.4), λ depends on N(t) and, hence, extends the previously discussed models for λ.
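To illustrate a self-exciting intensity of the form (6.4) (taking g ≡ 0 for brevity), the following sketch simulates N by Ogata-style thinning with an exponential kernel h(u) = α e^{-βu}; the kernel form and parameter values are our own assumptions, not choices made in the papers cited.

```python
import numpy as np

rng = np.random.default_rng(4)

lam0, alpha, beta = 5.0, 8.0, 20.0   # baseline, kernel scale, kernel decay
# h(u) = alpha * exp(-beta * u); stationarity requires alpha / beta < 1.

def intensity(t, events):
    past = np.asarray(events)
    past = past[past < t]
    return lam0 + alpha * np.exp(-beta * (t - past)).sum()

T, t, events = 5.0, 0.0, []
while t < T:
    # Intensity only decays until the next event, so this bounds it above.
    lam_bar = intensity(t, events) + alpha
    t += rng.exponential(1.0 / lam_bar)
    if t >= T:
        break
    if rng.uniform() < intensity(t, events) / lam_bar:
        events.append(t)

print("spikes:", len(events), " empirical rate:", len(events) / T,
      " theoretical rate:", lam0 / (1.0 - alpha / beta))
```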
In (6.4) h and g may be assumed to be square-integrable functions on [0, T] with the inner product

⟨f, g⟩ = ∫_0^T f(u) g(u) du.

In this case h and g are infinite-dimensional parameters, i.e. functions. The method of maximum likelihood fails in this case since the likelihood function is unbounded in h and g. The method of sieves, anticipated in Wegman (1975) and developed by Grenander (1981), provides a way to estimate h and g. The idea behind the method of sieves for estimating parameters in an infinite-dimensional space is to construct a nested sequence of suitable finite-dimensional subspaces of the parameter space. The likelihood function is then maximized on the finite-dimensional sieves, yielding a sequence of 'sieve' estimators. This is accomplished in such a way that as the dimension of the sieves increases (at a proper speed), the sieve estimators exhibit desirable asymptotic properties such as strong consistency and an appropriate asymptotic distribution (see e.g. Nguyen and Pham, 1982).
Consider as sieves increasing sequences A_n and B_n, n = 1, 2, ..., of finite-dimensional subspaces of L²[0, T] with dimensions d_n and d_n' such that A_n ⊂ A_{n+1} and B_n ⊂ B_{n+1}, n = 1, 2, ..., and ∪_n A_n and ∪_n B_n are dense in L²[0, T]. Let {φ_1, φ_2, ..., φ_{d_n}} and {ψ_1, ψ_2, ..., ψ_{d_n'}} be bases in A_n and B_n respectively, for n = 1, 2, .... Now the projections of the functions h(t) and g(t) on A_n and B_n respectively are given by

h^(n)(t) = ∑_{i=1}^{d_n} h_i φ_i(t)   and   g^(n)(t) = ∑_{j=1}^{d_n'} g_j ψ_j(t).

Now the finite-dimensional parameters {h_1, h_2, ..., h_{d_n}} and {g_1, g_2, ..., g_{d_n'}} can be estimated using the method of maximum likelihood or the method of optimal estimating functions (see e.g. Thavaneswaran and Thompson (1986), Habib and Thavaneswaran (1990)). McKeague (1986) established strong consistency of the sieve estimator for a model similar to the one considered here under the condition that d_n = O(n) and d_n' = O(n).
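As an illustration of the sieve idea, the sketch below fits histogram-basis coefficients for the kernel h in a model of the form (6.4) (again with g ≡ 0, and λ_0 treated as known) by projected gradient ascent on the point-process log-likelihood ∑_k log λ(T_k) - ∫_0^T λ(t) dt. The histogram basis, sieve dimension, placeholder data and optimization details are all our own simplifications, not the procedures of the papers cited.

```python
import numpy as np

rng = np.random.default_rng(5)

T, dt = 5.0, 1e-3
events = np.sort(rng.uniform(0, T, 60))   # placeholder stand-in spike data

d, tau = 8, 0.4                           # sieve dimension, kernel support
edges = np.linspace(0.0, tau, d + 1)
grid = np.arange(0.0, T, dt)

# c[i, k] = number of past spikes whose lag at grid time k falls in bin i.
lags = grid[None, :] - events[:, None]
c = np.stack([((lags > edges[i]) & (lags <= edges[i + 1])).sum(axis=0)
              for i in range(d)]).astype(float)

lam0 = 5.0                                # baseline treated as known here
h = np.full(d, 0.1)                       # histogram coefficients h_1..h_d
spike_idx = np.minimum((events / dt).astype(int), grid.size - 1)

for _ in range(200):                      # projected gradient ascent
    lam = lam0 + h @ c                    # intensity on the grid
    grad = c[:, spike_idx] @ (1.0 / lam[spike_idx]) - c.sum(axis=1) * dt
    h = np.maximum(h + 1e-3 * grad, 0.0)  # keep the intensity non-negative

print("estimated sieve coefficients:", np.round(h, 3))
```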
7. Stochastic neural networks

The subject of neural networks (or artificial neural systems) has recently received considerable attention from scientists in several disciplines such as theoretical and experimental neurobiology, psychology, physics, computer science, electrical and computer engineering, linguistics, etc. Statisticians and mathematicians have yet to make their mark on this important area of research. Neural networks model biological systems (e.g. visual and auditory systems and their applications to robotics), cognitive simulation (e.g. learning machines), computation and optimization theory and pattern recognition. Existing neural networks consist of a set of (extremely simple) neuron-like computational units that are connected to one another through (synaptic) junctions. The efficacy of these junctions is controlled by the synaptic weights (which emulate the amplitudes of post-synaptic potentials in biological neurons). These weights are assumed to be modifiable through experience or learning in order to enhance or improve the performance of the network. There are many different types of networks, most notably associators, which are single (or double) layer networks of neural elements that are fully connected to each other (see e.g. Rosenblatt (1959), Hopfield (1982), Hopfield and Tank (1986), Anderson (1986) and Wacholder, Han and Mann (1989)). The second type of network is called a pattern classifier network. These are multilayer networks in which all connections are feed-forward connections between layers, where the layers between the input and output layers are called the hidden layers (see e.g. Grossberg
(1982, 1987), Rumelhart and McClelland (1986), Sejnowski and Kienker (1986), Fukushima (1988), Kohonen and Makisara (1989), White (1989) and Alkon et al. (1990)). Before discussing these two types of networks in some detail and proposing several extensions of the models governing the dynamical states of the units, we briefly discuss two pioneering neuronal models, namely the McCulloch-Pitts model and the perceptron of Rosenblatt.

McCulloch and Pitts (1943) proposed one of the earliest models to represent the dynamical state of a neuron. In this model, the neuron is modeled as a binary device; that is, it can be in only one of two possible states, {0, 1}, say. This emulates the firing behavior of biological neurons described in terms of trains of action potentials (or spikes) generated by the neuron. The neuron receives inhibitory and excitatory synaptic input; however, it cannot fire if it receives inhibitory synaptic input. This simplifying assumption clearly violates well-known observations concerning the firing mechanisms of real neurons. The artificial neuron then multiplies the synaptic input by certain weights and linearly sums the input. This linear combination of synaptic input is then compared to a threshold, and if it exceeds the threshold the neuron fires (i.e. it exhibits the state 1 and otherwise it stays in state zero). That is, if x_i, i = 1, 2, ..., n, are the inputs projected within an active period of length δ to the neuron and w_ij, i, j = 1, 2, ..., n, are the corresponding synaptic weights connecting neuron j to neuron i, then the state of the neuron can be modeled by

Y_j(t) = I( ∑_{i ≠ j} w_ij x_i(t) > θ_j ),   (7.1)
where I is the indicator function and θ_j is the neuronal threshold. The authors showed that such elements could be used to construct sequential networks that can perform combinations of logical operations of a high degree of complexity. This model generated considerable interest following its presentation and became known as the McCulloch-Pitts model.

Rosenblatt (1959) generalized the McCulloch-Pitts model by adding a learning rule and introduced a three-layer network called the perceptron. This model was built to emulate some functions of the visual system. The visual system roughly consists of three areas. The first is the retina, which is the sensory part of the system that receives information from visual stimuli. The retina then projects its output signals to an association area called the lateral geniculate nucleus (LGN) that in turn projects to the visual cortex. Rosenblatt designed his perceptron as follows: a number of units in a region in the retina projected to a single A-unit (association unit) in the LGN and in turn the association units projected to R-units (or response units) in the visual cortex. The goal of a trained perceptron is the activation of the appropriate R-unit in response to a given input or stimulus. For simplicity, it was assumed that only one R-unit would be active at a time. To guarantee this a set of reciprocal inhibitory connections was used, so that an R-unit inhibited all the A-units that did not project to it. Therefore, when an R-unit was activated, it indirectly
suppressed its competitor R-units. The perceptron also created great excitement in the scientific community at the time.
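The McCulloch-Pitts unit (7.1) is a one-line computation; the sketch below, with illustrative weights chosen by us to realize a logical AND, shows the thresholded linear summation directly.

```python
import numpy as np

def mcculloch_pitts(x, w, theta):
    """State of a McCulloch-Pitts unit: 1 if the weighted input sum
    exceeds the threshold theta, else 0 (equation (7.1))."""
    return int(np.dot(w, x) > theta)

# A unit computing logical AND of two binary inputs (illustrative weights).
w, theta = np.array([1.0, 1.0]), 1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", mcculloch_pitts(np.array(x), w, theta))
```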
7.1. Network with no hidden layers
Hopfield (1982) proposed a neural network that consists of McCulloch-Pitts threshold units and is capable of solving certain optimization problems. The external state (or output) X_i of each neuron was modeled by

X_i = I( ∑_{j ≠ i} w_ij X_j + I_i ≥ θ_i ),   i = 1, 2, ..., n,   (7.2)

where I is the indicator function, w_ij is the weight of the synapse connecting neuron i to neuron j, I_i is the input received from neurons external to the network, and θ_i is a neuronal threshold. The global performance of the network is measured by an energy function E of the form
E = -(1/2) ∑_i ∑_{j ≠ i} w_ij X_i X_j - ∑_i I_i X_i + ∑_i θ_i X_i.   (7.3)

The objective is to incrementally modify the synaptic weights w_ij in order to minimize the energy or objective function E, which is developed in such a way that the global minimum of E corresponds to the optimal solution of the problem at hand. The increments of w_ij are modeled by

Δw_ij = E[X_i X_j],   (7.4)

where the average, E, is taken over a past time period. Substituting for w_ij in (7.3) by w_ij + Δw_ij will produce an increment ΔX_i which induces a change ΔE in E as follows:
ΔE = -∑_{i=1}^n [ ∑_{j ≠ i} w_ij X_j + I_i - θ_i ] ΔX_i.

Hopfield (1982) imposed the condition that w_ij = w_ji for all i and j to guarantee a solution. This is the steepest descent algorithm, which may lead to local minima of E (i.e. a suboptimal solution) rather than the global one. In order to get around this difficulty the algorithm is then implemented several times with different initial values for the synaptic weights until a satisfactory solution is obtained.
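The descent property just described is easy to see numerically: with symmetric weights and zero diagonal, asynchronous updates of the 0-1 states by rule (7.2) never increase the energy (7.3). The random weights, inputs and network size below are illustrative choices of our own.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 20
W = rng.normal(0, 1, (n, n))
W = (W + W.T) / 2.0               # Hopfield's symmetry condition w_ij = w_ji
np.fill_diagonal(W, 0.0)
I = rng.normal(0, 1, n)           # external inputs
theta = np.zeros(n)               # thresholds
X = rng.integers(0, 2, n).astype(float)

def energy(X):
    # Equation (7.3): E = -1/2 sum_{i != j} w_ij X_i X_j - sum_i I_i X_i + sum_i theta_i X_i
    return -0.5 * X @ W @ X - I @ X + theta @ X

for sweep in range(10):           # asynchronous updates; E is non-increasing
    for i in rng.permutation(n):
        X[i] = float(W[i] @ X + I[i] >= theta[i])
    print(f"sweep {sweep}: E = {energy(X):.3f}")
```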
Hopfield (1984) extended this network to take into account the internal states of the neurons. The internal state of each neuron was modeled by the differential equation

C_i (du_i/dt) = -u_i/R_i + ∑_{j ≠ i} w_ij x_j + I_i,   (7.5)

where the external state x_j of the j-th neuron is modeled as

x_j = g(u_j),   (7.6)
where g is a sigmoid function, e.g. g(u) = 1/(1 + e^{-u}), u_j is the solution of the differential equation (7.5), and R_j and C_j represent the resistance and capacitance of the neuron. The energy function in this case is

E = -(1/2) ∑_i ∑_{j ≠ i} w_ij x_i x_j + ∑_i (1/R_i) ∫_0^{x_i} g^{-1}(s) ds - ∑_i I_i x_i.   (7.7)

Together with the boundedness of E, equation (7.7) shows that the time evolution of the system is a motion in state space that seeks out minima of E. Notice that the system is purely deterministic and hence the solutions of the differential equations governing the internal states of the units are functions of the arbitrary values initially chosen for the synaptic weights. Hopfield and Tank (1986) used this network to solve several combinatorial optimization problems, among them the traveling salesman problem (TSP). See also Wacholder et al. (1989).

In order to improve the performance of the Hopfield network in the search for the optimal solution or global minimum, Levy and Adams (1987) used a hybrid of the method of simulated annealing, discussed by Geman and Hwang (1982), and the neural approach of Hopfield and Tank (1986). They proposed the following diffusion model of the internal state of each neuron:
du_i/dt = -u_i/τ_i + ∑_{j=1}^N w_ij X_j + I_i + λ Z_i(t),   (7.8)

where Z_i(t), i = 1, 2, ..., N, are independent zero-mean Gaussian white noise processes, λ and T are parameters and T plays the role of temperature in simulated annealing. As T tends slowly to 0 the algorithm then leads to a global minimum of E. The performance of this network was evaluated by Cervantes and Hildebrant (1987).

We propose the following model to describe the internal state of each neuron:

du_i(t) = -(1/τ_i) u_i(t) dt + ∑_{j=1}^n w_ij(t) dN_j(t) + dM_i(t),   (7.9)
where N_j(t) is the counting process that counts the number of action potentials (spikes) generated by the j-th (pre-synaptic) neuron in the time interval (0, t), w_ij are the synaptic weights, and M_i(t) are continuous trajectory martingales that describe the collective input received from neurons external to the network. Model (7.9) extends all previously proposed model neurons and provides flexibility in the modeling process. For example, the counting processes N_j(t) may be assumed to be homogeneous Poisson processes or, more generally, counting processes with stochastic intensities λ_j(t). These intensities may be modeled in a meaningful way to take advantage of specific features which the problem at hand may have. The martingales M_i(t) may be taken to be Wiener processes if one can appeal to a central limit theorem derived argument.

Hopfield's learning rule (7.4) is an example of correlation-based rules which are motivated by the well-known Hebb's hypothesis of synaptic plasticity. Hebb's
hypothesis (Hebb, 1949) basically states that if the firing pattern of a neuron A
repeatedly contributes to the firing of neuron B, then the weight of the synapse joining A to B increases. This hypothesis has been modified to include the reverse hypothesis also, that is, where the firing patterns of A and B are frequently uncorrelated the synaptic weight decreases. Another example is the Delta learning (or Widrow-Hoff) rule with a teacher. This rule is used in neural networks involved in pattern recognition. That is, if there is a target activation o_i(t) provided by the actual pattern or teacher, the synaptic weights are modified according to the rule

Δw_ij = α [o_i(t) - u_i(t)] x_j(t).   (7.10)
The amount of learning is proportional to the difference between the actual activation level of the network at a certain time t and the desired activation level provided by the teacher. Another important learning rule is the competitive learning rule without a teacher (see e.g. von der Malsburg (1973), Grossberg (1982), Fukushima (1975, 1978), Fukushima and Miyaki (1982), Kohonen (1984)).
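A minimal sketch of the delta rule (7.10) for a single linear unit with a teacher follows; the target weight vector, learning rate and Gaussian inputs are illustrative assumptions of our own.

```python
import numpy as np

rng = np.random.default_rng(7)

# Delta (Widrow-Hoff) rule, equation (7.10): dw = alpha * (target - actual) * x.
w_true = np.array([1.5, -2.0, 0.5])   # teacher's underlying linear map
w = np.zeros(3)
alpha = 0.05
for t in range(2000):
    x = rng.normal(0, 1, 3)
    target = w_true @ x               # activation provided by the teacher
    actual = w @ x                    # current activation of the unit
    w += alpha * (target - actual) * x
print("learned weights:", np.round(w, 3))   # should approach w_true
```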
7.2. Network with hidden layers
Neural networks without hidden layers have proven to be useful in a wide variety of applications. A characteristic feature of such networks is that they map similar input patterns into similar output patterns. This constraint on performance may lead to an inability of these networks to learn certain mappings from input to output. Minsky and Papert (1969) argued that the addition of hidden layers allows the network to develop internal representations of the input patterns in the hidden units that support any required mapping from input to output units. Rumelhart, Hinton and Williams (1986) developed a learning rule, which they called the generalized learning rule, to govern the modification of the synaptic weights of the output as well as the hidden layers. To be specific, consider a neural network with one hidden layer of m units, one input layer of n units and an output layer of one unit. The activation level (or external state) of each hidden unit, h_j, j = 1, 2, ..., m, is assumed to be a non-linear function ψ of a weighted sum of inputs x = (x_1, x_2, ..., x_n)':

h_j = ψ( ∑_{i=1}^n v_ij x_i ) = ψ(v_j' x),   j = 1, 2, ..., m,   (7.11)

where v_ij is the synaptic weight of the synapse connecting the i-th input unit to the j-th hidden unit and ψ is a non-decreasing differentiable (or sigmoid) function. Also, assume that the activation level (or external state) of the output unit is given by

Y = F( ∑_{j=1}^m β_j ψ(v_j' x) ),   (7.12)
where F is another sigmoid function. To simplify notation, we rewrite (7.12) as Y = f(x, θ), where θ = (β', v')'. Rumelhart et al. (1986) proposed the following learning rule to modify θ:

θ_t = θ_{t-1} + α ∇f(x_t, θ_{t-1}) (Y_t - f(x_t, θ_{t-1})),   (7.13)

where Y is the vector of the target output and α is the learning rate. This recursive
method of modifying θ is called the method of back-propagation. White (1989) used methods of stochastic approximation to prove that θ_t converges with probability 1 to an optimal solution θ*, or θ_t → ∞ with probability 1. White also established the asymptotic normality of θ_t. The dynamic model (7.12) may be extended to accommodate more realistic features of biological neurons as follows: let u_j(t) be the internal state (or subthreshold potential) of an output (or a hidden) unit in the network which is modeled as a solution of the stochastic differential equation

du_j(t) = μ_j dt - ρ_j u_j(t) dt + ∑_{i=1}^n w_ij dN_i(t) + σ_j dW_j(t),

where w_ij is the weight of the synapse between the i-th unit and the j-th unit, N_i(t), i = 1, 2, ..., n, are counting processes which may be assumed to be independent for analytical tractability, and W_j(t) is a Wiener process (or a second order martingale, in general). The recursive estimates of the synaptic weights in this case may be calculated as in Thavaneswaran and Habib (1988).
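A compact sketch of the recursion (7.13) for the one-hidden-layer network (7.11)-(7.12) is given below, using online back-propagation on squared error to learn the XOR mapping, a mapping that, as noted above, requires hidden units. The network sizes, learning rate and training scheme are our own illustrative choices, not the specifications of Rumelhart et al. (1986) or White (1989).

```python
import numpy as np

rng = np.random.default_rng(8)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

n, m = 2, 4                       # input and hidden dimensions
V = rng.normal(0, 1, (m, n))      # input-to-hidden weights v_ij
beta = rng.normal(0, 1, m)        # hidden-to-output weights
alpha = 0.5                       # learning rate

def forward(x):
    h = sigmoid(V @ x)            # hidden activations, as in (7.11)
    y = sigmoid(beta @ h)         # output activation, as in (7.12)
    return h, y

# Online updates theta_t = theta_{t-1} + alpha * grad f * (Y - f), as in (7.13).
# Training may occasionally settle in a local minimum; rerun with a new seed.
for t in range(20000):
    x = rng.integers(0, 2, n).astype(float)
    target = float(int(x[0]) ^ int(x[1]))
    h, y = forward(x)
    delta_out = (target - y) * y * (1 - y)
    beta += alpha * delta_out * h
    V += alpha * np.outer(delta_out * beta * h * (1 - h), x)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", round(forward(np.array(x, dtype=float))[1], 2))
```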
References

Aertsen, A.M., G.L. Gerstein, M.K. Habib and G. Palm (1989). Dynamics of neuronal firing correlation: modulation of 'effective connectivity'. J. Neurophysiol. 61, 900-917.
Alkon, D.L., K.T. Blackwell, G.S. Barbour, A.K. Rigler and T.P. Vogl (1990). Pattern-recognition by an artificial network derived from biological neural systems. Biol. Cybernet. 62, 363-376.
Anderson, J.A. (1986). Cognitive capabilities of a parallel system. In: E. Bienenstock, F. Fogelman Soulie and G. Weisbuch, Eds., Disordered Systems and Biological Organization. Springer, New York.
Bach, M. and J. Kruger (1986). Correlated neuronal variability in monkey visual cortex revealed by a multi-electrode. Exp. Brain Res. 61, 451-456.
Borisyuk, G.N., R.M. Borisyuk, A.B. Kirillov, E.I. Kovalenko and V.I. Kryukov (1985). A new statistical method for identifying interconnections between neuronal network elements. Biol. Cybernet. 52, 301-306.
Brillinger, D.R. (1988). Maximum likelihood analysis of spike trains of interacting nerve cells. Biol. Cybernet. 59, 189-200.
Cervantes, J.H. and R.R. Hildebrant (1987). Comparison of three neuron-based computation schemes. Proceedings of the IEEE First International Conference on Neural Networks, 657-667.
Chornoboy, E.S., L.P. Schramm and A.F. Karr (1988). Maximum likelihood identification of neural point processes. Biol. Cybernet. 59, 265-275.
Cope, D.K. and H.C. Tuckwell (1979). Firing rates of neurons with random excitation and inhibition. J. Theoret. Biol. 80, 1-14.
De Kwaadsteniet, J.W. (1982). Statistical analysis and stochastic modeling of neural spike train activity. Math. Biosci. 60, 17-71.
Fernald, R.D. (1971). A neuron model with spatially distributed synaptic input. Biophys. J. 11, 323-340.
Fienberg, S.E. (1974). Stochastic models for single neuron firing trains. Biometrics 30, 399-427.
Fukushima, K. (1975). Cognitron: a self-organizing multi-layered neural network. Biol. Cybernet. 20, 121-136.
Fukushima, K. (1978). Self-organizing neural network with a function of associative memory: feedback cognitron. Biol. Cybernet. 28, 201-208.
Fukushima, K. (1988). Neocognitron: a hierarchical neural network capable of visual pattern recognition. Neural Networks 1, 119-130.
Fukushima, K. and S. Miyaki (1982). Neocognitron: a new algorithm for pattern recognition. Pattern Recognition 15, 455-469.
Geman, S. and C.R. Hwang (1982). Nonparametric maximum likelihood estimation by the method of sieves. Ann. Statist. 10, 401-414.
Gerstein, G.L. and D.H. Perkel (1972). Mutual temporal relationships among neuronal spike trains. Biophys. J. 12, 453-473.
Grenander, U. (1981). Abstract Inference. John Wiley, New York.
Grossberg, S. (1982). Studies of Mind and Brain: Neural Principles of Learning, Perception, Cognition and Motor Control. Reidel, Boston.
Grossberg, S. (1987). Competitive learning: from interactive activation to adaptive resonance. Cognitive Sci. 11, 23-63.
Habib, M.K. (1985). Parameter estimation for randomly stopped processes and neuronal modeling. Mimeo Series No. 1492, Institute of Statistics, University of North Carolina, Chapel Hill.
Habib, M.K. and P.K. Sen (1985). Non-stationary stochastic point-process models in neurophysiology with applications to learning. In: P.K. Sen, Ed., Biostatistics: Statistics in Biomedical, Public Health and Environmental Sciences. Elsevier/North-Holland, Amsterdam, 481-509.
Habib, M.K. and A. Thavaneswaran (1990). Inference for stochastic neuronal models. Appl. Math. Comput. 38, 51-73.
Habib, M.K. and A. Thavaneswaran (1992). Optimal estimation for semi-martingale neuronal models. J. Statist. Plann. Inference 33, 143-156. (This issue.)
Hebb, D. (1949). Organization of Behavior. John Wiley, New York.
Heyde, C.C. (1992). New developments in inference for temporal stochastic processes. J. Statist. Plann. Inference 33, 121-129. (This issue.)
Hopfield, J.J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sci. U.S.A. 79, 2554-2558.
Hopfield, J.J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Nat. Acad. Sci. U.S.A. 81, 3088-3092.
Hopfield, J.J. and D.W. Tank (1986). Computing with neural circuits: a model. Science 233, 625-633.
Johannesma, P. (1968). Diffusion models for the stochastic activity of neurons. In: E.R. Caianiello, Ed., Neural Networks. Springer, New York, 116-144.
Johnson, D.H. and A. Swami (1983). The transmission of signals by auditory-nerve fiber discharge patterns. J. Acoust. Soc. Amer. 74, 493-501.
Juliano, S., P.J. Hand and B. Whitsel (1981). Patterns of increased metabolic activity in somatosensory cortex of monkeys, Macaca fascicularis, subjected to controlled cutaneous stimulation: a 2-deoxyglucose study. J. Neurophysiol. 46, 1260-1284.
Kallianpur, G. (1983). On the diffusion approximation to a discontinuous model for a single neuron. In: P.K. Sen, Ed., Contributions to Statistics. North-Holland, Amsterdam.
Kallianpur, G. and R. Wolpert (1984). Infinite dimensional stochastic differential equation models for spatially distributed neurons. Appl. Math. Optim. 12, 125-172.
Kohonen, T. (1984). Self-Organization and Associative Memory. Springer, Berlin.
Kohonen, T. and K. Makisara (1989). The self-organizing feature maps. Phys. Scr. 39, 168-172.
Kuffler, S.W. and J.G. Nicholls (1976). From Neuron to Brain. Sinauer, Sunderland, MA.
Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer, New York.
Levy, B.C. and M.B. Adams (1987). Global optimization with stochastic neural networks. Proceedings of the IEEE First International Conference on Neural Networks.
Malsburg, C. von der (1973). Self-organization of orientation sensitive cells in the striate cortex. Kybernetik 14, 85-100.
McCulloch, W.S. and W. Pitts (1943). A logical calculus of ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115-133.
McKeague, I.W. (1986). Estimation for a semi-martingale model using the method of sieves. Ann. Statist. 14, 579-589.
Michalski, A., G.L. Gerstein, S. Czarkowska and T. Tarnecki (1983). Interactions between cat striate cortex neurons. Exp. Brain Res. 51, 97-107.
Minsky, M. and S. Papert (1969). Perceptrons. MIT Press, Cambridge, MA.
Neher, E. and C.F. Stevens (1977). Conductance fluctuations and ionic pores in membranes. Ann. Rev. Biophys. Bioeng. 6, 345-381.
Nguyen, H.T. and T.D. Pham (1982). Identification of non-stationary diffusion models by the method of sieves. SIAM J. Control Optim. 20, 603-611.
Palm, G., A.M.H.J. Aertsen and G.L. Gerstein (1988). On the significance of correlations among neuronal spike trains. Biol. Cybernet. 59, 1-11.
Perkel, D.H., G.L. Gerstein and G.P. Moore (1967). Neuronal spike trains and stochastic point processes II. Simultaneous spike trains. Biophys. J. 7, 419-440.
Rall, W. (1964). Theoretical significance of dendritic trees for neuronal input-output relations. In: R. Reiss, Ed., Neural Theory and Modeling. Stanford University Press, Stanford, CA.
Rall, W. (1977). Core conductor theory and cable properties of neurons. In: Handbook of Physiology. The Nervous System I, Vol. 1. American Physiological Society, Bethesda, MD.
Ricciardi, L.M. and L. Sacerdote (1979). The Ornstein-Uhlenbeck process as a model for neuronal activity. Biol. Cybernet. 35, 1-9.
Rosenblatt, F. (1959). Two theorems of statistical separability in the perceptron. In: Mechanization of Thought Processes: Proceedings of a Symposium Held at the National Physical Laboratory, Vol. I.
Rumelhart, D.E., G.E. Hinton and R.J. Williams (1986). Learning internal representations by error propagation. In: D.E. Rumelhart and J.L. McClelland, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I: Foundations. MIT Press, Cambridge, MA.
Rumelhart, D.E. and J.L. McClelland, Eds. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I: Foundations; Vol. II: Applications. MIT Press, Cambridge, MA.
Sejnowski, T.J. and P.K. Kienker (1986). Learning symmetry groups with hidden units: beyond the perceptron. Physica D 22, 260-275.
Tanaka, K. (1983). Cross-correlation analysis of geniculate neuronal relationships in cats. J. Neurophysiol. 49, 1303-1318.
Thavaneswaran, A. and M.K. Habib (1988). Recursive estimation for semi-martingales. J. Appl. Math. 19, 1901-1909.
Thavaneswaran, A. and M.E. Thompson (1986). Optimal estimation for semi-martingales. J. Appl. Probab. 23, 409-417.
Toyama, K., M. Kimura and K. Tanaka (1981). Cross-correlation analysis of interneuronal connectivity in cat visual cortex. J. Neurophysiol. 46, 191-201.
Tuckwell, H.C. (1989). Stochastic Processes in the Neural Sciences. Society for Industrial and Applied Mathematics, Philadelphia, PA.
Wacholder, E., J. Han and R.C. Mann (1989). A neural network algorithm for the multiple traveling salesman problem. Biol. Cybernet. 61, 11-19.
Wegman, E.J. (1975). Maximum likelihood estimation of a probability density. Sankhya Ser. A 37, 211-224.
White, H. (1989). Some asymptotic results for learning in single hidden-layer feed-forward network models. J. Amer. Statist. Assoc. 84, 1003-1013.
Whitsel, B., J. Roppolo and G. Werner (1972). Cortical information processing of stimulus motion on primate skin. J. Neurophysiol. 35, 691-717.
Yang, G.L. and T.C. Chen (1978). On statistical methods in neuronal spike-train analysis. Math. Biosci. 38, 1-34.
Yang, G.L. and C.E. Swenberg (1992). Estimation of open dwell time and problems of identifiability in channel experiments. J. Statist. Plann. Inference 33, 107-119. (This issue.)