Stochastic methods for neural systems

Edward J. Wegman* and Muhammad K. Habib**

Journal of Statistical Planning and Inference 33 (1992) 5-25. North-Holland.

Received 1 August 1990; accepted 3 October 1990

Abstract: This paper is a survey of recent developments in the application of stochastic modeling and statistical analysis of biological and artificial neuron systems. We focus first on a general description of information processing in the central nervous system, including the basic physiology of synaptic communications. We discuss some of the current models of and inference for membrane potential, including temporal models. We discuss some of the stochastic modeling and analysis of neural spike trains. From a focus on communications between individual pairs of neurons, we turn our attention to broader scale neural network considerations, closing our discussion with models of networks with hidden layers.

Key words and phrases: Artificial neuron systems; neural spike trains; semi-martingales; central nervous system.

1. Introduction

This paper presents a survey of some recent developments in the application of stochastic modeling and statistical analysis of biological and artificial neural systems. Recent advances in neurophysiology and experimental psychology provide statisticians with experimentally generated data about information processing in the nervous system which is amenable to careful statistical modeling and analysis. Experiments in electrical neurophysiology fall naturally into four categories. The first category consists of recording the electrical activity of ionic channels in single neurons. Understanding the dynamics of ionic channel openings and closings under a variety of experimental conditions and in response to different stimuli sheds light on how neurons receive and integrate information at the microscopic level. Yang and Swenberg (1992) apply an estimation method, introduced by Le Cam (1986), in order to estimate important parameters in studies of the electrical behavior of ionic channels.

Correspondence to: Dr. Edward Wegman, Center for Computational Statistics, 157 Science-Technology II, George Mason University, Fairfax, VA 22030, USA.

* The research of this author was sponsored by the Army Research Office under contract DAAL03-87-K-0087, by the Office of Naval Research under contract N00014-89-J-1807 and by the National Science Foundation under grant DMS-90002237.

** The research of this author was sponsored by the Office of Naval Research under contract N00014-89-J-1502.

0378-3758/92/$05.00 (c) 1992 Elsevier Science Publishers B.V. All rights reserved


The second category consists of experiments of intracellular recordings to examine the subthreshold behavior of the somal membrane potential of single neurons. These studies shed light on neuronal integration of synaptic input as reflected by the difference in voltage across the somal membrane of nerve cells. On the quantitative side, the somal membrane potential of a neuron is modeled as a solution of stochastic differential equations driven by point processes, a Wiener process or, in general, by a semi-martingale. These models include parameters which reflect important neurophysiological properties such as the effective somal membrane time constant, amplitudes and frequencies of post-synaptic potentials and the variability of synaptic input. Methods of estimation of these parameters are discussed in Habib and Thavaneswaran (1992). See also Heyde (1992) for an elegant discussion of recent methods of inference for stochastic processes.

The third category of experiments is concerned with the extra-cellular recordings of trains of action potentials (or spike trains) generated by neurons spontaneously or in response to stimuli. This is a vigorous area of research in experimental and theoretical neurobiology. The statistical analysis of these spike trains, which are modeled as stochastic point processes, has attracted much attention of statisticians (see e.g. Brillinger, 1988). In addition to studies of spike trains of single neurons, there has been great interest in the study of the correlation between spike trains of two or more neurons. Quantitative neurophysiological studies of two or more simultaneously recorded spike trains using measures of cross correlation have proven effective in indicating the existence and type of synaptic connections and other sources of functional interaction among neurons in small neural networks. See, for example, Toyama, Kimura and Tanaka (1981) and Michalski et al. (1983). For discussions of the statistical aspects of correlation studies see Habib and Sen (1985) and Aertsen et al. (1989).

The fourth category consists of experiments concerned with information represented in large neural networks, using labeling techniques (such as the 2-deoxyglucose or '2DG' method). These experiments provide data about the way in which a sensory stimulus is represented in large neural networks in the sensory area of the brain (see e.g. Juliano et al., 1981). Unfortunately, these types of studies have received comparatively little attention from statisticians.

In Section 2, a brief description of information processing in the nervous system is given for the benefit of those readers who are not familiar with this subject. The following sections are concerned with the stochastic modeling and statistical analysis of biological and artificial neural systems.

2. Information processing in the nervous system

The central nervous system (CNS) is viewed as a communication system which receives, processes and transmits a large amount of information. Some degree of uncertainty is inherent in the behavior of the neural communication system because

of its anatomical and functional complexity and the non-deterministic nature of the

responses of its components to identical stimuli or experience. This uncertainty (or stochastic variability) is reflected in the electrical behavior of single neurons as well as small and large neural networks. The basic unit of the nervous system which receives and transmits information is the nerve cell or neuron. (See Figure 1 for a schematic diagram of the neuron.) The neuron has three morphological regions: the cell body (or soma), the dendrites and the axon. The soma contains the nucleus and many of the organelles involved in metabolic processes. The dendrites form a series of highly branched protoplasmic tree-like outgrowths from the cell body. The dendrites and the soma are the sites of most specialized junctions where signals are received from other neurons. The axon is typically a protoplasmic extension which exits the soma at the initial segment (or axon hillock). Near its end, the axon branches into numerous axonal terminals, which are responsible for transmitting signals from the neuron to other parts of the system. The junction between two neurons is called the synapse: an anatomically specialized junction where the electrical activity in one neuron influences the activity of the other. A single motor neuron in the spinal cord receives, probably, some 15 000 synaptic junctions, while neurons in the brain may have more than 100 000 synapses. The electrical activity in the nervous system is due to the presence of organic as well as inorganic electrically charged ions. Among the important inorganic ions are sodium (Na+), potassium (K+) and chloride (Cl-). These ions are present outside (i.e. in the extracellular fluid) as well as inside the cell. The cell's membrane is selectively permeable to different ions. This leads to a difference in concentration of ions on both sides of the membrane which in turn leads to a difference in potential across the membrane. The transmembrane potential is regulated by, among other things, active as well as passive membrane transport mechanisms (see e.g. Kuffler and Nicholls, 1976), and is defined by convention as inside minus outside potential. In the absence of synaptic events, the membrane potential is kept at a certain level called the resting potential (which is about -60 to -70 mV). The potential difference between the inside and the outside of the membrane of an excitable cell depends on the ionic concentration gradients and the selective permeability of the membrane. The ions are transported across the membrane through structures or pathways which are called channels (such as potassium channels and sodium channels). These channels exist in active and inactive states (it is also said that the channels are closed or open). See, for instance, Neher and Stevens (1977). Ionic channels open and close in a stochastic manner. Transitions between these states are usually modeled as a finite-state Markov process. When a chemical synapse is activated due to the arrival of action potentials along the axon of the presynaptic neuron, a chemical substance called a neural transmitter is released into the synaptic cleft. The transmitter then crosses the synaptic cleft, combines with the receptor sites of the postsynaptic membrane and produces a change in potential. This potential change is called the postsynaptic potential (PSP). If


Fig. 1. Schematic diagram of two neurons with a magnified image of their synapse.

the PSP results in reducing the potential difference across the post-synaptic membrane (i.e. if the membrane is depolarized), the PSP is called an excitatory post-synaptic potential (EPSP). On the other hand, if the membrane is hyperpolarized (i.e. the difference in potential across the membrane is increased) as a result of the arrival of the post-synaptic potential, it is called an inhibitory post-synaptic potential (IPSP). Between synaptic events the membrane potential decays exponentially to a resting potential. At a spatially restricted area of the neural soma (where the sodium conductance, per unit area, is high relative to that of the remaining somal membrane), the excitatory and inhibitory post-synaptic potentials are integrated. Now, if the membrane potential reaches a level called the neuron's threshold (-30 mV), the membrane undergoes a very rapid, transient change which is known as the action potential. Action potentials may last only 1 ms, during which time the membrane potential changes from -60 to about +30 mV and then returns to the resting potential. After each action potential there is a period of reduced excitability called the refractory period. Action potentials are nearly identical in shape and hence they contain a limited amount of information in their shape or wave form. It is believed then that the temporal patterns of spike trains are the information carriers in many areas in the CNS. The purpose of many studies is to investigate the temporal behavior of spike trains in many areas in the CNS under different controlled conditions, that is, in the presence and absence of external stimuli, and under normal as well as experimentally modified conditions.


3. Statistical analysis of ion channel activity

As has been mentioned in Section 2, the neuronal membrane is selectively permeable to different ions. That is, the membrane contains different types of ionic channels (or gates) which selectively allow the passage of specific ions across the membrane. When a channel allows the passage of a certain ion, it is said that the channel is open and otherwise it is said to be closed. Tuckwell (1989) briefly discusses a two-state Markov model that describes the transition between states of a single channel. However, it is well established that channels exist in several open and closed substates. One proposed scheme is to assume that the channel has $n$ open states, $o_1, o_2, \ldots, o_n$, and $m$ closed states, $c_1, c_2, \ldots, c_m$. This can schematically be represented as follows:

$$o_1 \rightleftharpoons o_2 \rightleftharpoons \cdots \rightleftharpoons o_n \rightleftharpoons c_1 \rightleftharpoons c_2 \rightleftharpoons \cdots \rightleftharpoons c_m.$$

At any time the channel is assumed to be in one of the kinetically distinct states $S = \{S_1, S_2\}$, where $S_1 = \{o_1, o_2, \ldots, o_n\}$ and $S_2 = \{c_1, c_2, \ldots, c_m\}$. Let $X_t$ denote the kinetic state of the channel at time $t$. The level of conductance of the channel can be described by

$$u(t) = \begin{cases} 1 & \text{if } X_t \in S_1, \\ 0 & \text{if } X_t \in S_2. \end{cases}$$

Yang and Swenberg (1992) studied the kinetic transitions of ionic channels through the observed current record $B(t)$ of $N$ channels, where

$$B(t) = \sum_{j=1}^{N} u_j(t),$$

and addressed some general questions of model identifiability.
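As a concrete illustration of the kind of channel model just described, the following sketch simulates a single two-state (closed/open) Markov channel on a fine time grid and the aggregate record $B(t)$ over independent channels. The transition rates, time step and channel count are hypothetical values chosen only for illustration, not quantities taken from the studies cited above.

```python
import numpy as np

# Minimal sketch (not from the paper): a two-state (closed/open) Markov channel
# with hypothetical opening/closing rates, and the aggregate current record
# B(t) = sum_j u_j(t) over N independent channels, observed on a time grid.

rng = np.random.default_rng(0)

def simulate_channel(alpha, beta, dt, n_steps, state0=0):
    """Simulate one channel; state 1 = open (u(t) = 1), state 0 = closed.
    alpha = closed -> open rate, beta = open -> closed rate (per unit time)."""
    u = np.empty(n_steps, dtype=int)
    state = state0
    for k in range(n_steps):
        p_switch = (alpha if state == 0 else beta) * dt  # P(transition in dt), dt small
        if rng.random() < p_switch:
            state = 1 - state
        u[k] = state
    return u

def aggregate_current(n_channels, alpha, beta, dt, n_steps):
    """Aggregate conductance-level record B(t) over independent channels."""
    return sum(simulate_channel(alpha, beta, dt, n_steps) for _ in range(n_channels))

if __name__ == "__main__":
    dt, T = 1e-4, 1.0                     # 0.1 ms resolution, 1 s record
    B = aggregate_current(n_channels=50, alpha=2.0, beta=8.0, dt=dt, n_steps=int(T / dt))
    # The stationary open probability alpha/(alpha+beta) = 0.2 can be compared
    # with the empirical mean of B(t)/N as a simple sanity check.
    print("mean fraction open:", B.mean() / 50)
```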

4. Stochastic models of and inference for the membrane potential

Theoretical and experimental aspects of neuronal integration of synaptic input as reflected by the difference in voltage across the somal membrane of nerve cells have been extensively studied. Among the important factors influencing synaptic integration mechanisms are the geometry of the post-synaptic neuron (Rall, 1977), the spatial organization, the temporal patterns of synaptic inputs, and their amplitudes. For instance, Rall (1964) showed that the somatic potential is critically dependent on the timing of synaptic inputs arranged in an orderly sequence of distances from the soma of the post-synaptic neuron, the temporal patterns of activation of the synapses, and the shapes and amplitudes of the post-synaptic potentials (PSP). In this section we discuss models of the subthreshold behavior of the somal membrane potential; that is, its behavior between the time it is equal to a resting potential up to the time the potential reaches the neuron's threshold.


There is an extensive literature concerning stochastic models of the membrane potential of neurons (Johannesma (1968), Ricciardi and Sacerdote (1979), Cope and Tuckwell (1979), Kallianpur (1983)). These models are descriptive in the sense that they are not studied along with real data in order to investigate certain aspects of neuronal synaptic integration, but rather, these models relate the subthreshold behavior of the transmembrane potential to important neurophysiological parameters. The so-called leaky integrator model of membrane potential appears to be more realistic than the rest of the models. In the leaky integrator model, the fact that the membrane potential decays between synaptic events is taken into account. A parameter representing the membrane time constant is then included in the model. In addition, the post-synaptic potentials (PSP) are modeled as point events, i.e. events occurring randomly in time according to Poisson process models. The excitatory post-synaptic potentials (EPSP) are assumed to arrive at the soma according to independent Poisson processes. Similar assumptions are made concerning the inhibitory post-synaptic potentials (IPSP). Parameters representing the rates of arrival as well as the amplitudes of the EPSPs and IPSPs are included. It is clear, then, that estimating these parameters from real data by recording intracellularly the voltage trajectory of the somal membrane of neurons should shed light on the important aspects of the mechanisms which underlie neuronal integration of synaptic input. These quantitative studies of neuronal integration lend themselves to studies of the neural basis of higher brain functions, such as learning and memory. For example, parameters of the models may be estimated under different conditions during the period of alteration of synaptic connectivity due to normal experience through the critical period. Another application is to estimate these parameters before and after applying experimental paradigms based on the principles of classical conditioning. We believe that advanced methods of inference for stochastic processes along with appropriate stochastic modeling should enhance the results of experimental studies.

5. Temporal stochastic neuronal models

It has conventionally been assumed that the synaptic inputs to a neuron can be treated as inputs delivered to a single summing point on the neuron's surface (the axon hillock). That such an assumption is restrictive is clearly indicated by the well-established anatomical observation that several types of neurons in the CNS have extensively branched dendritic receptive surfaces and that synaptic inputs occur both on the somatic region and the dendrites. However, the temporal models are appropriate for modeling experimentally generated intracellular recordings of the membrane potential in the absence of any experimental information on the spatial aspects of synaptic input. The temporal stochastic model of the subthreshold behavior of the somal membrane potential we will discuss below is based on the following assumptions and experimentally established neurophysiological observations.


1. The subthreshold state of the neuron is assumed to be characterized by the difference in potential across its somal membrane (membrane potential) near a spatially restricted area of the soma at which the sodium conductance, per unit area, is high relative to that of the remaining somal membrane. This area is thought to be the action potential (or spike) generating area and is frequently called the trigger zone (axon hillock). The membrane potential at any time $t$ is modeled as a stochastic process, $V(t)$, defined on a probability space $(\Omega, \mathcal{F}, P)$. $V(t)$ is assumed to be subject to instantaneous changes due to the occurrence of (idealized) post-synaptic potentials (PSP) of two different types: excitatory post-synaptic potentials (EPSP) and inhibitory post-synaptic potentials (IPSP). In the absence of post-synaptic activity, the membrane potential decays exponentially with rate $\varrho$. Therefore an incremental decay $\Delta V(t)$ during a small time interval $\Delta t$ is given by $\Delta V(t) = -\varrho V(t)\,\Delta t$.

2. There exist thousands of synapses on the surfaces of certain types of neurons. For example, there exist on the order of 20 000 synapses on the surface of a typical motorneuron. The number of synapses may increase to 100 000 for some types of sensory neurons. In response to a stimulus, a rather small number of synapses are activated. Tanaka (1983) established that approximately 10 to 30 synapses are activated on the surface of some types of simple and complex cells in the visual cortex in response to certain visual stimuli. The rest of the synapses, though, may still be active due to the spontaneous or other activity of the presynaptic neurons projecting to the neuron under study. We, therefore, model two types of synaptic activity. The first is post-synaptic input received due to stimulus presentation; we assume that there are $n_1$ excitatory synapses and $n_2$ inhibitory synapses. The EPSPs are assumed to arrive according to stochastic point processes $N_k^e(t)$, with stochastic intensities $\lambda_k^e(t)$ and stochastic amplitudes (or potential displacements) $a_k^e(t)$, $k = 1, 2, \ldots, n_1$. Similarly, the IPSPs are assumed to arrive according to stochastic point processes $N_k^i(t)$, with intensities $\lambda_k^i(t)$ and amplitudes $a_k^i(t)$, $k = 1, 2, \ldots, n_2$. It is also assumed that this synaptic input is summed linearly at the trigger zone.

3. The rest of the synaptic input is lumped together and it is assumed that its potential displacements are small in magnitude and occur frequently and independently. This input then may be modeled by a diffusion process driven by a standard Wiener process $W(t)$. See Kallianpur (1983).

Based on the stated experimental observations and theoretical assumptions, we model $V(t)$ as the solution of the following stochastic differential equation:

$$dV(t) = (\mu - \varrho V(t))\,dt + \sigma\,dW(t) + \sum_{k=1}^{n_1} a_k^e(t)\,dN_k^e(t) - \sum_{k=1}^{n_2} a_k^i(t)\,dN_k^i(t). \qquad (5.1)$$
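A minimal Euler-Maruyama sketch of a model of the form (5.1) is given below. It assumes constant PSP amplitudes, homogeneous Poisson arrival rates and arbitrary illustrative parameter values; the paper allows stochastic amplitudes and intensities, so this is only a simplified special case.

```python
import numpy as np

# Sketch of an Euler-Maruyama discretization of a leaky-integrator model of the
# form (5.1): dV = (mu - rho*V) dt + sigma dW + sum_k a_e dN_e - sum_k a_i dN_i.
# Constant PSP amplitudes and homogeneous Poisson arrivals are simplifying
# assumptions; all parameter values below are hypothetical.

rng = np.random.default_rng(1)

def simulate_membrane(T=1.0, dt=1e-4, mu=0.0, rho=20.0, sigma=0.5,
                      n1=10, lam_e=5.0, a_e=0.5,
                      n2=5, lam_i=5.0, a_i=0.5, V0=0.0):
    n_steps = int(T / dt)
    V = np.empty(n_steps + 1)
    V[0] = V0
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        # number of EPSPs / IPSPs arriving in (t, t+dt] across all synapses
        dNe = rng.poisson(n1 * lam_e * dt)
        dNi = rng.poisson(n2 * lam_i * dt)
        V[k + 1] = V[k] + (mu - rho * V[k]) * dt + sigma * dW + a_e * dNe - a_i * dNi
    return V

if __name__ == "__main__":
    V = simulate_membrane()
    print("mean membrane displacement over the record:", V.mean())
```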

The special case that $a_k^e(t) = \alpha_k^e[V_e - V(t)]$ ($a_k^i(t) = \alpha_k^i[V_i - V(t)]$) takes into account an important physiological property, namely reversal potentials. It is well established experimentally that the amplitudes of the post-synaptic potentials depend on the pre-existing value of the membrane potential. It is also well established


that the arrival of an action potential at a presynaptic terminal causes a release of a transmitter substance (for the cerebral cortex this could be a variety of substances including acetylcholine, glutamate, or glycine). In any case, a transmitter's action on the neuronal membrane at a given synaptic junction can be characterized by means of the experimentally observable reversal potential. This is the membrane potential at which the observed change in membrane potential caused by transmitter induced conductance change is zero. Reversal potentials have been utilized in deterministic modeling of neuronal membranes (Rall, 1964). It is desirable to extend model (5.1) in various directions in order to enhance its applicability. For instance, we may replace the term $\mu - \varrho V(t)$ by a continuous function $f = f(V(t), t)$ which may not necessarily be linear in $V$. The Wiener process $W$ may be replaced by a second order martingale, $M^c$, with continuous trajectories. This results in the following more general model:

$$dV(t) = f(V(t), t)\,dL(t) + \sigma(t)\,dM^c(t) + \int_U G(V(t^-), u)\,N(du, dt), \qquad (5.2)$$

where $N$ is a linear combination of compensated point processes representing the stimulus evoked synaptic potentials, and $U$ is a measurable space. In the special case when $M = W$ and $N$ is a compensated Poisson process (i.e. $\bar N(du, dt) = N(du, dt) - \mu(du)\,dt$, with $EN(du, dt) = \mu(du)\,dt$), Kallianpur and Wolpert (1984) discussed the existence and uniqueness of a strong solution of (5.2).

Another important characteristic of central nervous system information processing is the dependence of both the magnitude and time course of the post-synaptic potential evoked by a given synapse on the spatial location of the active junction. This important feature is not considered in most existing stochastic models of single neurons, which have concerned themselves only with the influences of temporal summation of synaptic inputs. More specifically, as mentioned earlier, it has conventionally been assumed that the synaptic inputs to a neuron can be treated as inputs delivered to a single summing point on the neuron's surface (triggering zone). That such an assumption is unjustified is clearly indicated by the well-established anatomical fact that a great number of the neurons in the CNS have extensively branched dendritic receptive surfaces, and that synaptic inputs may occur both on the somatic region and the dendrites. Another common assumption is that synapses located on distal dendritic branches have little effect on the spike initiation zone of a neuron. According to this view, distally-located synapses would merely set the overall excitability of the neuron and would be ineffective in generating neural discharge activity. Synapses located near the soma of a neuron, on the other hand, are widely believed to directly and strongly influence neuronal firing behavior. A major extension of this view was suggested by Rall (1977), based on calculations of passive electrotonic current spread through the dendritic tree. Rall's work showed that distal synapses can play a functionally much more interesting role than previously assumed. More specifically, if the synaptic input to the dendrite has the appropriate

spatio-temporal characteristics, distal synapses can influence neuronal firing to a

much greater extent than is predicted on the basis of their dendritic location. In view of Rall's demonstration and in recognition of the suggestions (based on experimental evidence) that such a mechanism plays an important role in feature-extraction by single sensory neurons (Fernald, 1971), it seems necessary to carry out modeling studies to evaluate the potential for different spatial distributions of synaptic inputs to influence sensory neuron behavior. The stochastic model (5.1) may be extended then to incorporate the important feature of synaptic spatial distribution. This extension is based on Rall's model neuron (Rall, 1977). In Rall's model neuron the cable properties of a system of branched dendrites are reduced to a one-dimensional equivalent dendrite. Considering the nerve cell as a line segment of finite length $L$, the subthreshold behavior of the membrane potential, $V(t, x)$, may be modeled as a solution of the following stochastic partial differential equation:

$$dV(t,x) = \left(\frac{\partial^2 V(t,x)}{\partial x^2} - V(t,x)\right)dt + \sigma\,dW(t,x) + \sum_{j=1}^{n_1} \alpha_j^e\,\delta(x - x_j^e)\,[V_e(x) - V(t,x)]\,dN(\lambda_j^e(t), x, t) - \sum_{k=1}^{n_2} \alpha_k^i\,\delta(x - x_k^i)\,[V_i(x) - V(t,x)]\,dN(\lambda_k^i(t), x, t), \qquad (5.3)$$

where $\delta$ is the delta function (or delta distribution), and $x_j^e$ ($x_k^i$) is the location of the excitatory (inhibitory) synaptic inputs, which occur according to independent point processes with rates $\lambda_j^e$ ($\lambda_k^i$) and amplitudes $\alpha_j^e$ ($\alpha_k^i$), $j = 1, 2, \ldots, n_1$; $k = 1, 2, \ldots, n_2$. The solution of (5.3) is a stochastic process, $\{V(t,x),\ 0 \le t \le T,\ 0 \le x \le L\}$, which can be represented as a stochastic integral with respect to $N(t; x)$. This model may be further extended along the same lines as model (5.2). See Habib and Thavaneswaran (1990) for a discussion of parameter estimation for models similar to (5.2) and (5.3).
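The following is a rough finite-difference sketch in the spirit of model (5.3): the membrane potential on a cable of length $L$ is updated with a discrete Laplacian, a leak term, space-time noise, and reversal-potential jumps applied at fixed synapse locations when Poisson events occur. The boundary treatment, grid sizes, rates and amplitudes are all illustrative assumptions rather than quantities from the paper.

```python
import numpy as np

# Rough finite-difference sketch in the spirit of model (5.3): a cable equation
# dV = (V_xx - V) dt + noise, with Poisson-driven jumps of reversal-potential
# type, alpha*(V_rev - V), applied at fixed synapse locations.  Sealed-end
# (zero-flux) boundaries, constant rates and amplitudes are assumptions.

rng = np.random.default_rng(2)

def simulate_cable(L=1.0, nx=50, T=0.5, dt=2e-5, sigma=0.1,
                   exc=((0.3, 10.0, 0.2, 1.0),),    # (location, rate, alpha, V_rev)
                   inh=((0.7, 10.0, 0.2, -1.0),)):
    dx = L / (nx - 1)
    V = np.zeros(nx)
    for _ in range(int(T / dt)):
        lap = np.zeros(nx)
        lap[1:-1] = (V[2:] - 2 * V[1:-1] + V[:-2]) / dx**2
        lap[0], lap[-1] = lap[1], lap[-2]          # crude zero-flux boundaries
        V = V + (lap - V) * dt + sigma * np.sqrt(dt) * rng.normal(size=nx)
        for (x0, rate, alpha, v_rev) in exc + inh:
            if rng.random() < rate * dt:           # a synaptic event at location x0
                j = int(round(x0 / dx))
                V[j] += alpha * (v_rev - V[j])
    return V

if __name__ == "__main__":
    print("final potential profile (coarse):", np.round(simulate_cable()[::10], 3))
```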

6. Statistical analysis of spike trains

The transmission of trains of action potentials (or spike trains) between neurons is the primary means of communication of information in the nervous system. There exists an extensive body of literature on experimental and quantitative studies of the properties of extracellularly recorded spike trains, using microelectrodes, from single or multiple neurons. The goals of these studies are to elucidate the different functional roles of neurons in various areas in the nervous system in encoding and transmitting information in response to stimuli and to analyze patterns of spike trains generated by the neurons that are spontaneously active (i.e. activity in the absence of stimulation). The statistical analysis of stimulus evoked or spontaneously


generated spike trains may lead to estimates of physiological and anatomical properties of interest. For example, in sensory areas of the brain, such as the visual, auditory and somatosensory cortices, it is of great interest to determine the characteristics of patterns of neuronal responses to different stimuli, and the feature-extracting capabilities of the cortical neurons must be assessed with respect to each stimulus parameter (such as direction of movement, orientation, and velocity of each stimulus). This will lead to understanding the neuronal coding and information representation concerning the various stimulus parameters. For experimental studies of spike train activity of single neurons see Whitsel et al. (1972). On the quantitative side there has been a large volume of studies in which spike trains were modeled as realizations of stochastic point processes; see for example Perkel, Gerstein and Moore (1967), Fienberg (1974), Yang and Chen (1978), Cope and Tuckwell (1979), De Kwaadsteniet (1982). Only weakly stationary stochastic point process models have been investigated and only stationary segments of spike trains have been selected in most published studies. An exception is the work of Johnson and Swami (1983), who proposed a non-stationary point process model to describe the spike activity of single auditory nerve fibers. The associated counting process of their model possessed a multiplicative random intensity. The effect of the refractory period on the post-stimulus time histogram was studied. The studies mentioned so far were concerned with single unit behavior. Quantitative neurophysiological studies of two or more simultaneously recorded spike trains using measures of cross-correlation and related statistical techniques have proven effective in indicating the existence and type of synaptic connections and other sources of functional interaction among studied neurons. For some of the influential theoretical studies of association between simultaneously recorded spike trains see Perkel, Gerstein and Moore (1967), Gerstein and Perkel (1972), Palm, Aertsen and Gerstein (1988). On the experimental side see Toyama, Kimura and Tanaka (1981), Michalski et al. (1983), Tanaka (1983), Bach and Kruger (1986). In these studies the spike trains were assumed to be jointly weakly stationary. This stringent assumption is not likely to hold in reality, in particular for stimulus driven neurons. The incorporation of non-stationary processes is then crucial to studies of discharge spike activity of neurons driven by external stimuli; see Johnson and Swami (1983) for a discussion of certain classes of neurons in the auditory cortex that fire in a non-stationary fashion. Non-stationary models are particularly suitable for studies of neuronal aspects which change due to experience. Habib and Sen (1985) proposed the cross-correlation surface to study the non-stationary association between simultaneously recorded spike trains. Borisyuk et al. (1985) consider a non-stationary model of spike trains in which the counting process has a multiplicative random intensity with finite dimensional parameters. A similar model was proposed by Chornoboy, Schramm and Karr (1988). A discussion of the dynamics of neuronal firing correlation for non-stationary spike trains was recently presented by Aertsen et al. (1989). Let us consider non-stationary point process models of spike trains. Let

$T_1, T_2, T_3, \ldots$ be the (idealized) times of the occurrence of action potentials extracellularly recorded from a certain neuron. These times are modeled as a realization of a point process defined on a complete probability space $(\Omega, \mathcal{F}, P)$. Let $N(t)$ be the counting process associated with the sequence $\{T_n\}$, i.e. $N(t)$ counts the number of spikes in the interval $(0, t]$, with $N(0) = 0$, or

$$N(t) = \sum_{n \ge 1} I_{\{T_n \le t\}},$$

where $I_A$ is the indicator of the set $A$. Let $\{\mathcal{F}_t,\ 0 \le t < T\}$ be a non-decreasing family of sub-$\sigma$-fields of $\mathcal{F}$ and define the internal history of $N(t)$ by $\mathcal{F}_t^N = \sigma\{N(s),\ 0 \le s \le t\}$. The counting process $N(t)$ admits the Doob-Meyer decomposition

$$N(t) = \Lambda(t) + M(t),$$

where $M(t)$ is a martingale, i.e. $E[M(t) \mid \mathcal{F}_s] = M(s)$ for $s \le t$. If we assume that $N$ satisfies the two conditions

$$P\{N(t + dt) - N(t) = 1 \mid \mathcal{F}_t\} = \lambda(t)\,dt + o(dt),$$

$$P\{N(t + dt) - N(t) \ge 2 \mid \mathcal{F}_t\} = o(dt),$$

then

$$E[N(t + dt) - N(t) \mid \mathcal{F}_t] = \lambda(t)\,dt + o(dt),$$

and hence $\lambda(t)$ may be obtained as

$$\lambda(t) = \lim_{dt \to 0} \frac{E[N(t + dt) - N(t) \mid \mathcal{F}_t]}{dt}.$$

In this case $\lambda$ is called the conditional intensity of $N$. Modeling $\lambda$ is of great importance from a practical point of view. For example, if the neuron fires slowly and it is believed that it recovers before the next action potential is generated, it may be appropriate then to assume that $\lambda(t)$ is $\mathcal{F}_0$-measurable, i.e. $\lambda(t)$ does not depend on $N(t)$. In this special case, $N(t)$ is a doubly stochastic Poisson process. This model was employed by Habib and Sen (1985) in modeling non-stationary spike trains. Borisyuk et al. (1985) modeled $\lambda$ by

(6.1)

where $\lambda_0$ is a baseline process, $Z(t)$ is a vector process which represents the influence of the pre-synaptic neurons projecting to the neuron of interest and $B$ is a finite-dimensional parameter which represents the degree of influence or synaptic weights. Chornoboy, Schramm and Karr (1988) considered the following model:

$$\lambda(t) = \lambda_0 + \sum_{j=1}^{J} \int_0^t h_j(t - s)\,dN_j(s), \qquad (6.2)$$

where $\lambda_0$ is a constant representing spontaneous activity, $N_j$, $j = 1, 2, \ldots, J$, is the input received from the pre-synaptic neurons and $h_j(t)$ are kernels representing the synaptic weights.

An interesting model of $\lambda$ which reflects both the influence of a limited number of influential cells (those pre-synaptic cells which are activated in response to a stimulus) as well as a large number of the spontaneously active presynaptic cells may be given by

$$d\lambda(t) = \alpha\,dt + \beta\,dM(t) + \sum_{j=1}^{J} w_j\,dN_j(t), \qquad (6.3)$$

where $M(t)$ is a martingale with continuous trajectory. In this case we have a finite number of parameters to estimate. See Habib (1985) for the estimation of $\alpha$ and $\beta$ with $M$ assumed to be a Wiener process in the absence of the $w_j$'s. Habib and Thavaneswaran (1992) considered the case where $M$ is a second order martingale, but still in the absence of the point process input. The general case (6.3) has not yet been treated in the literature.
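A discretized sketch of a spike train driven by an intensity of the type (6.3) is shown below. Here $M$ is taken to be a Wiener process, a single presynaptic Poisson stream stands in for the point-process input, and a spike is generated in each small bin with probability $\lambda(t)\,dt$; all parameter values are hypothetical.

```python
import numpy as np

# Discretized sketch of a spike train with a stochastic intensity of the type
# (6.3): d(lambda) = alpha dt + beta dM + sum_j w_j dN_j.  M is taken to be a
# Wiener process and a single presynaptic Poisson input is used; a spike is
# generated in each small bin with probability lambda(t)*dt.  Parameter values
# are hypothetical.

rng = np.random.default_rng(3)

def simulate_spike_train(T=10.0, dt=1e-3, lam0=5.0, alpha=0.0, beta=0.5,
                         w1=0.2, presyn_rate=2.0):
    n_steps = int(T / dt)
    lam = lam0
    spikes = []
    for k in range(n_steps):
        t = k * dt
        dM = rng.normal(0.0, np.sqrt(dt))            # continuous martingale input
        dN1 = rng.poisson(presyn_rate * dt)          # presynaptic spike count in bin
        lam = max(lam + alpha * dt + beta * dM + w1 * dN1, 0.0)  # keep intensity >= 0
        if rng.random() < lam * dt:                  # P{spike in (t, t+dt]} ~ lam*dt
            spikes.append(t)
    return np.array(spikes)

if __name__ == "__main__":
    spikes = simulate_spike_train()
    print("number of spikes:", spikes.size, " empirical rate:", spikes.size / 10.0)
```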

Another interesting model for $\lambda$ is one in which $\lambda$ is modeled by

$$\lambda(t) = \lambda_0 + \int_{T_0}^{t} h(t - s)\,dN(s) + \int_{-\infty}^{t} g(t - s)\,dX(s), \qquad (6.4)$$

where $\{X(t)\}$ may be either an observed point process or a mixed process. In (6.4), $\lambda$ depends on $N(t)$ and, hence, extends the previously discussed models for $\lambda$. In (6.4) $h$ and $g$ may be assumed to be square-integrable functions on $[0, T]$ with the inner product

$$\langle f, g \rangle = \int_0^T f(u)\,g(u)\,du.$$

In this case $h$, $g$ are infinite-dimensional parameters, i.e. functions. The method of maximum likelihood fails in this case since the likelihood function is unbounded in $h$ and $g$. The method of sieves, anticipated in Wegman (1975) and developed by Grenander (1981), provides a way to estimate $h$ and $g$. The idea behind the method of sieves for estimating parameters in an infinite-dimensional space is to construct a nested sequence of suitable finite-dimensional subspaces of the parameter space. The likelihood function is then maximized on the finite-dimensional sieves, yielding a sequence of 'sieve' estimators. This is accomplished in such a way that as the dimension of the sieves increases (at a proper speed), the sieve estimators exhibit desirable asymptotic properties such as strong consistency and an appropriate asymptotic distribution (see e.g. Nguyen and Pham, 1982).

Consider as sieves increasing sequences $A_n$ and $B_n$, $n = 1, 2, \ldots$, of finite-dimensional subspaces of $L^2[0, T]$ with dimensions $d_n$ and $d_n'$ such that $A_n \subset A_{n+1}$ and $B_n \subset B_{n+1}$, $n = 1, 2, \ldots$, and $\bigcup_n A_n$ and $\bigcup_n B_n$ are dense in $L^2[0, T]$. Let $\{\phi_1, \phi_2, \ldots, \phi_{d_n}\}$ and $\{\psi_1, \psi_2, \ldots, \psi_{d_n'}\}$ be bases in $A_n$ and $B_n$ respectively, for $n = 1, 2, \ldots$. Now the projections of the functions $h(t)$ and $g(t)$ on $A_n$ and $B_n$ respectively are given by

$$h^{(n)}(t) = \sum_{i=1}^{d_n} h_i\,\phi_i(t)$$

and

$$g^{(n)}(t) = \sum_{j=1}^{d_n'} g_j\,\psi_j(t).$$

Now, the finite-dimensional parameters $\{h_1, h_2, \ldots, h_{d_n}\}$ and $\{g_1, g_2, \ldots, g_{d_n'}\}$ can be estimated using the method of maximum likelihood or the method of optimal estimating functions (see e.g. Thavaneswaran and Thompson (1986), Habib and Thavaneswaran (1990)). McKeague (1986) established strong consistency of the sieve estimator for a model similar to the one considered here under the condition that $d_n = O(n)$ and $d_n' = O(n)$.
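As a small numerical illustration of the sieve idea, the sketch below projects a kernel $h$ on $[0, T]$ onto nested subspaces spanned by the first $d_n$ elements of a cosine basis and reports the $L^2$ approximation error as $d_n$ grows. The cosine basis and the target kernel are illustrative choices, not the ones used in the references above.

```python
import numpy as np

# Small illustration of the sieve idea: project a kernel h on [0, T] onto
# nested finite-dimensional subspaces spanned by the first d_n elements of a
# cosine basis.  The basis and the target kernel are illustrative choices.

T = 1.0
u = np.linspace(0.0, T, 1000)
h = np.exp(-5.0 * u)                      # "true" kernel to be approximated
du = u[1] - u[0]

def cosine_basis(d, u, T):
    """Orthonormal cosine basis phi_0, ..., phi_{d-1} on [0, T]."""
    B = [np.full_like(u, 1.0 / np.sqrt(T))]
    B += [np.sqrt(2.0 / T) * np.cos(i * np.pi * u / T) for i in range(1, d)]
    return np.stack(B)

for d in (2, 4, 8, 16):
    phi = cosine_basis(d, u, T)
    coef = phi @ h * du                   # <h, phi_i> in the L2[0, T] inner product
    h_proj = coef @ phi                   # h^(n)(t) = sum_i h_i phi_i(t)
    err = np.sqrt(np.sum((h - h_proj) ** 2) * du)
    print(f"d_n = {d:2d}   L2 error of the sieve projection = {err:.4f}")
```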

7. Stochastic neural networks

The subject of neural networks (or artificial neural systems) has recently received considerable attention from scientists in several disciplines such as theoretical and experimental neurobiology, psychology, physics, computer science, electrical and computer engineering, linguistics, etc. Statisticians and mathematicians are yet to make their mark on this important area of research. Neural networks model biological systems (e.g. visual and auditory systems and their applications to robotics), cognitive simulation (e.g. learning machines), computation and optimization theory and pattern recognition. Existing neural networks consist of a set of (extremely simple) neuron-like computational units that are connected to one another through (synaptic) junctions. The efficacy of these junctions is controlled by the synaptic weights (which emulate the amplitudes of post-synaptic potentials in biological neurons). These weights are assumed to be modifiable through experience or learning in order to enhance or improve the performance of the network. There are many different types of networks, most notably associators, which are single (or double) layer networks of neural elements that are fully connected to each other (see e.g. Rosenblatt (1959), Hopfield (1982), Hopfield and Tank (1986), Anderson (1986) and Wacholder, Han and Mann (1989)). The second type of network is called a pattern classifier network. These are multilayer networks in which all connections are feed-forward connections between layers, where the layers between the input and output layers are called the hidden layers (see e.g. Grossberg


(1982, 1987), Rumelhart and McClelland (1986), Sejnowski and Kienker (1986), Fukushima (1988), Kohonen and Makisara (1989), White (1989) and Alkon et al. (1990)). Before discussing these two types of networks in some detail and proposing several extensions of the models governing the dynamical states of the units, we briefly discuss two pioneering neuronal models, namely the McCulloch-Pitts model and the perceptron of Rosenblatt. McCulloch and Pitts (1943) proposed one of the earliest models to represent the dynamical state of a neuron. In this model, the neuron is modeled as a binary device; that is, it can be in only one of two possible states, {0 or 1}, say. This emulates the firing behavior of biological neurons described in terms of trains of action potentials (or spikes) generated by the neuron. The neuron receives inhibitory and excitatory synaptic input; however, it cannot fire if it receives inhibitory synaptic input. This simplifying assumption clearly violates well-known observations concerning the firing mechanisms of real neurons. The artificial neuron then multiplies the synaptic input by a certain weight and linearly sums the input. This linear combination of synaptic input is then compared to a threshold, and if it exceeds the threshold the neuron fires (i.e. it exhibits the state 1 and otherwise it stays in state zero). That is, if $x_i$, $i = 1, 2, \ldots, n$, are the inputs projected within an active period of length $S$ to the neuron and $w_{ij}$, $i, j = 1, 2, \ldots, n$, are the corresponding synaptic weights connecting neuron $j$ to neuron $i$, then the state of the neuron can be modeled by

$$Y_j(t) = I\Big\{\sum_{i \ne j} w_{ij}\,x_i(t) > \theta_j\Big\}, \qquad (7.1)$$

where $I$ is the indicator function and $\theta_j$ is the neuronal threshold. The authors showed that such elements could be used to construct sequential networks that can perform combinations of logical operations of a high degree of complexity. This model generated considerable interest following its presentation and became known as the McCulloch-Pitts model.
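A minimal sketch of a McCulloch-Pitts unit of the form (7.1) follows, together with the classical weight and threshold choices that realize the logical AND and OR operations; these particular weights are textbook examples rather than values from the paper.

```python
import numpy as np

# Minimal McCulloch-Pitts threshold unit in the spirit of (7.1): the output is
# the indicator that a weighted sum of binary inputs exceeds a threshold.
# The AND/OR weight and threshold choices are standard textbook examples.

def mp_unit(x, w, theta):
    """Return 1 if sum_i w_i x_i > theta, else 0."""
    return int(np.dot(w, x) > theta)

def AND(x1, x2):
    return mp_unit([x1, x2], w=[1.0, 1.0], theta=1.5)

def OR(x1, x2):
    return mp_unit([x1, x2], w=[1.0, 1.0], theta=0.5)

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"x = ({a}, {b})   AND = {AND(a, b)}   OR = {OR(a, b)}")
```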

Rosenblatt (1959) generalized the McCulloch-Pitts model by adding a learning rule and introduced a three-layer network called the perceptron. This model was built to emulate some functions of the visual system. The visual system roughly consists of three areas. The first is the retina, which is the sensory part of the system that receives information from visual stimuli. The retina then projects its output signals to an association area called the lateral geniculate nucleus (LGN) that in turn projects to the visual cortex. Rosenblatt designed his perceptron as follows: a number of units in a region in the retina projected to a single A-unit (association unit) in the LGN and in turn the association units projected to R-units (or response units) in the visual cortex. The goal of a trained perceptron is the activation of the appropriate R-unit in response to a given input or stimulus. For simplicity, it was assumed that only one R-unit be active at a time. To guarantee this a set of reciprocal inhibitory connections was used, so that an R-unit inhibited all the A-units that did not project to it. Therefore, when an R-unit was activated, it indirectly suppressed its competitor R-units. The perceptron also created great excitement in the scientific community at the time.

7.1. Network with no hidden layers

Hopfield (1982) proposed a neural network that consists of McCulloch-Pitts threshold units which is capable of solving certain optimization problems. The external state (or output) of each neuron was modeled by

$$X_i = I\Big\{\sum_{j \ne i} w_{ij} X_j + I_i \ge \theta_i\Big\}, \qquad i = 1, 2, \ldots, n, \qquad (7.2)$$

where $I$ is the indicator function, $w_{ij}$ is the weight of the synapse connecting neuron $i$ to neuron $j$, $I_i$ is the input received from neurons external to the network, and $\theta_i$ is a neuronal threshold. The global performance of the network is measured by an energy function $E$ of the form

$$E = -\tfrac{1}{2}\sum_{i}\sum_{j \ne i} w_{ij} X_i X_j - \sum_{i} I_i X_i + \sum_{i} \theta_i X_i. \qquad (7.3)$$

The objective is to incrementally modify the synaptic weights $w_{ij}$ in order to minimize the energy or objective function $E$, which is developed in such a way that the global minimum of $E$ corresponds to the optimal solution of the problem at hand. The increments of $w_{ij}$ are modeled by

$$\Delta w_{ij} = \mathrm{E}[X_i X_j], \qquad (7.4)$$

where the average, $\mathrm{E}$, is taken over a past time period. Substituting for $w_{ij}$ in (7.3) by $w_{ij} + \Delta w_{ij}$ will produce an increment $\Delta X_i$ which induces a change $\Delta E$ in $E$ as follows:

$$\Delta E = -\sum_{i=1}^{n}\Big[\sum_{j \ne i} w_{ij} X_j + I_i - \theta_i\Big]\Delta X_i.$$

Hopfield (1982) imposed the condition that $w_{ij} = w_{ji}$ for all $i$ and $j$ to guarantee a solution. This is the steepest descent algorithm which may lead to local minima of $E$ (i.e. a suboptimal solution) rather than the global one. In order to get around this difficulty the algorithm is then implemented several times with different initial values for the synaptic weights until a satisfactory solution is obtained.
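The sketch below implements the discrete network just described: units are updated asynchronously by the threshold rule (7.2) with symmetric weights, and the energy (7.3) is evaluated before and after to illustrate that it does not increase. Storing two patterns through a Hebb-type outer product is an illustrative choice of weights, not a prescription from the paper.

```python
import numpy as np

# Sketch of a discrete Hopfield network: units updated asynchronously by the
# threshold rule (7.2) with symmetric weights, and the energy (7.3) tracked to
# show it is non-increasing.  Storing patterns by a Hebb-type outer product is
# an illustrative choice of weights.

rng = np.random.default_rng(4)

def energy(X, W, I, theta):
    return -0.5 * X @ W @ X - I @ X + theta @ X

def run_hopfield(W, I, theta, X, sweeps=10):
    n = X.size
    for _ in range(sweeps):
        for i in rng.permutation(n):                  # asynchronous updates
            X[i] = int(W[i] @ X + I[i] >= theta[i])   # rule (7.2)
    return X

if __name__ == "__main__":
    patterns = np.array([[1, 0, 1, 0, 1, 0], [1, 1, 1, 0, 0, 0]])
    S = 2 * patterns - 1                              # +/-1 coding for the weights
    W = (S.T @ S).astype(float)
    np.fill_diagonal(W, 0.0)                          # w_ii = 0, w_ij = w_ji
    I = np.zeros(6)
    theta = np.zeros(6)
    X = rng.integers(0, 2, size=6)
    print("initial energy:", energy(X, W, I, theta))
    X = run_hopfield(W, I, theta, X)
    print("final state:", X, " final energy:", energy(X, W, I, theta))
```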

Hopfield (1984) extended this network to take into account the internal states of the neurons. The internal state of each neuron was modeled by the differential equation

$$C_i\,\frac{du_i}{dt} = -\frac{u_i}{R_i} + \sum_{j \ne i} w_{ij}\,x_j + I_i, \qquad (7.5)$$

where the external state $x_j$ of the $j$-th neuron is modeled as

$$x_j = g(u_j), \qquad (7.6)$$

where $g$ is a sigmoid function, e.g. $g(u) = 1/(1 + e^{-u})$, and $R_j$ and $C_j$ represent the resistance and capacitance of the neuron. The energy function in this case is the solution of the differential equation

(7.7)

Together with the boundedness of $E$, equation (7.7) shows that the time evolution of the system is a motion in state space that seeks out minima of $E$. Notice that the system is purely deterministic and hence the solutions of the differential equations governing the internal states of the units are functions of the arbitrary values initially chosen for the synaptic weights. Hopfield and Tank (1986) used this network to solve several combinatorial optimization problems, among them the traveling salesman problem (TSP). See also Wacholder et al. (1989). In order to improve the performance of the Hopfield network in the search for the optimal solution or global minimum, Levy and Adams (1987) used a hybrid of the method of simulated annealing, discussed by Geman and Hwang (1982), and the neural approach of Hopfield and Tank (1986). They proposed the following diffusion model of the internal state of each neuron:

$$\frac{du_i}{dt} = -\frac{u_i}{\tau_i} + \sum_{j=1}^{N} w_{ij} X_j + I_i + \lambda\,\cosh(\,\cdot\,)\,Z_i(t), \qquad (7.8)$$

where $Z_i(t)$, $i = 1, 2, \ldots, N$, are independent zero-mean Gaussian white noise processes, $\lambda$ and $\tau$ are parameters and $T$ plays the role of temperature in simulated annealing. As $T$ tends slowly to 0 the algorithm then leads to a global minimum of $E$. The performance of this network was evaluated by Cervantes and Hildebrant (1987). We propose the following model to describe the internal state of each neuron:

$$du_i(t) = -\frac{1}{\tau_i}\,u_i(t)\,dt + \sum_{j=1}^{N} w_{ij}(t)\,dN_j(t) + dM_i(t), \qquad (7.9)$$

where $N_j(t)$ is the counting process that counts the number of action potentials (spikes) generated by the $j$-th (pre-synaptic) neuron in the time interval $(0, t)$, $w_{ij}$ are the synaptic weights, and $M_i(t)$ are continuous trajectory martingales that describe the collective input received from neurons external to the network. Model (7.9) extends all previously proposed model neurons and provides flexibility in the modeling process. For example, the counting processes $N_j(t)$ may be assumed to be homogeneous Poisson processes or, more generally, counting processes with stochastic intensities $\lambda_j(t)$. These intensities may be modeled in a meaningful way to take advantage of specific features which the problem at hand may have. The martingales $M_i(t)$ may be taken to be Wiener processes if one can appeal to a central limit theorem derived argument. Hopfield's learning rule (7.4) is an example of correlation-based rules which are motivated by the well-known Hebb's hypothesis of synaptic plasticity. Hebb's

hypothesis (Hebb, 1949) basically states that if the firing pattern of a neuron A repeatedly contributes to the firing of neuron B, then the weight of the synapse joining A to B increases. This hypothesis has been modified to include the reverse hypothesis also, that is, where the firing patterns of A and B are frequently uncorrelated the synaptic weight decreases. Another example is the Delta learning (or Widrow-Hoff) rule with a teacher. This rule is used for neural networks involved in pattern recognition. That is, if there is a target activation $o_i(t)$ provided by the actual pattern or teacher, the synaptic weights are modified according to the rule

$$\Delta w_i = \alpha\,[o_i(t) - u_i(t)]\,x_i(t). \qquad (7.10)$$

The amount of learning is proportional to the difference between the actual activation level of the network at a certain time $t$ and the desired activation level provided by the teacher. Another important learning rule is the competitive learning rule without a teacher (see e.g. von der Malsburg (1973), Grossberg (1982), Fukushima (1975, 1978), Fukushima and Miyaki (1982), Kohonen (1984)).

7.2. Network with hidden layers

Neural networks without hidden layers have proven to be useful in a wide variety of applications. A characteristic feature of such networks is that they map similar input patterns into similar output patterns. This constraint in performance may lead to an inability of these networks to learn certain mappings from input to output. Minsky and Papert (1969) argued that the addition of hidden layers allows the network to develop internal representations of the input patterns in the hidden units that support any required mapping from input to output units. Rumelhart, Hinton and Williams (1986) developed a learning rule, which they called the generalized learning rule, to govern the modification of the synaptic weights of the output as well as the hidden layers. To be specific, consider a neural network with one hidden layer of $m$ units, one input layer of $n$ units and an output layer of one unit. The activation level (or external state) of each hidden unit, $h_j$, $j = 1, 2, \ldots, m$, is assumed to be a non-linear function $\psi$ of a weighted sum of the inputs $x = (x_1, x_2, \ldots, x_n)^{\mathsf T}$:

$$h_j = \psi(v_j^{\mathsf T} x), \qquad j = 1, 2, \ldots, m, \qquad (7.11)$$

where $v_{ij}$ is the synaptic weight of the synapse connecting the $i$-th input unit to the $j$-th hidden unit and $\psi$ is a non-decreasing differentiable (or sigmoid) function. Also, assume that the activation level (or external state) of the output unit is given by

$$Y = F\Big(\sum_{j=1}^{m} \beta_j\,\psi(v_j^{\mathsf T} x)\Big), \qquad (7.12)$$

where $F$ is another sigmoid function. To simplify notation, we rewrite (7.12) as $Y = f(x, \theta)$,


where 0=(/I’, modify 8:

v’). Rumelhart

e, = s, where

et al. (1986) proposed

, + a rf’(x,

e, _ ,)( Y, -f(x,

Y is the vector of the target output

the following

learning

rule to

(7.13)

8, _ I)),

and 01 is the learning

rate. This recursive

method of modifying 0 is called the method of back-propagation. White (1989) used methods of stochastic approximation to prove that 0, converges with probability 1 to an optimal solution 0” or en --t 00 with probability 1. White also established the asymptotic normality of Q,,. The dynamic model (7.12) may be extended to accommodate more realistic features of biological neurons as follows: let u;(t) be the internal state (or subthreshold potential) of an output (or a hidden) unit in the network which is modeled as a solution of the stochastic differential equation duj(t)=pj
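A sketch of the recursion (7.13) for the single hidden-layer model (7.11)-(7.12), with sigmoid $F$ and $\psi$ and the gradient written out coordinate-wise, is given below. The target rule, learning rate and network size are illustrative assumptions.

```python
import numpy as np

# Sketch of the stochastic-gradient recursion (7.13) for the single hidden-layer
# model y = F(sum_j beta_j * psi(v_j^T x)) with sigmoid F and psi.  The target
# rule, learning rate and network size are illustrative assumptions.

rng = np.random.default_rng(5)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, beta, V):
    h = sigmoid(V @ x)                    # hidden activations psi(v_j^T x), (7.11)
    y = sigmoid(beta @ h)                 # output unit, (7.12)
    return y, h

n, m, alpha = 2, 4, 0.5
beta = rng.normal(scale=0.1, size=m)
V = rng.normal(scale=0.1, size=(m, n))

for t in range(20000):
    x = rng.uniform(-1.0, 1.0, size=n)
    target = float(x[0] * x[1] > 0.0)     # hypothetical same-sign target rule
    y, h = forward(x, beta, V)
    err = target - y
    # theta_t = theta_{t-1} + alpha * grad f(x_t, theta_{t-1}) * (Y_t - f(...)),
    # written out for the output weights beta and the hidden-layer weights V.
    d_out = y * (1.0 - y)
    grad_beta = d_out * h
    grad_V = np.outer(d_out * beta * h * (1.0 - h), x)
    beta += alpha * err * grad_beta
    V += alpha * err * grad_V

correct = 0
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=n)
    y, _ = forward(x, beta, V)
    correct += int((y > 0.5) == (x[0] * x[1] > 0.0))
print("held-out accuracy on the same-sign rule:", correct / 1000)
```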

The dynamic model (7.12) may be extended to accommodate more realistic features of biological neurons as follows: let $u_j(t)$ be the internal state (or subthreshold potential) of an output (or a hidden) unit in the network, which is modeled as a solution of the stochastic differential equation

$$du_j(t) = \mu_j\,dt - \varrho_j u_j(t)\,dt + \sum_{i=1}^{n} w_{ij}\,dN_i(t) + \sigma_j\,dW_j(t),$$

where $w_{ij}$ is the weight of the synapse between the $i$-th unit and the $j$-th unit, $N_i(t)$, $i = 1, 2, \ldots, n$, are counting processes which may be assumed to be independent for analytical tractability, and $W_j(t)$ is a Wiener process (or a second order martingale, in general). The recursive estimates of the synaptic weights in this case may be calculated as in Thavaneswaran and Habib (1988).

References

Aertsen, A.M., G. Gerstein, M.K. Habib and G. Palm (1989). Dynamics of neuronal firing correlation: modulation of 'Effective Connectivity'. J. Neurophysiol. 61, 900-917.
Alkon, D.L., K.T. Blackwell, G.S. Barbour, A.K. Rigler and T.P. Vogl (1990). Pattern-recognition by an artificial network derived from biological neural systems. Biol. Cybernet. 62, 363-376.
Anderson, J.A. (1986). Cognitive capabilities of a parallel system. In: E. Bienenstock, E. Fegelman, F. Soulie and G. Weisbuck, Eds., Disordered Systems and Biological Organization. Springer, New York.
Bach, M. and J. Kruger (1986). Correlated neuronal variability in monkey visual cortex revealed by a multi-electrode. Exp. Brain Res. 61, 451-456.
Borisyuk, G.N., R.M. Borisyuk, A.B. Kirillov, E.I. Kovalenko and V.I. Kryukov (1985). A new statistical method for identifying interconnections between neuronal network elements. Biol. Cybernet. 52, 301-306.
Brillinger, D.R. (1988). Maximum likelihood analysis of spike trains of interacting nerve cells. Biol. Cybernet. 59, 189-200.
Cervantes, J.H. and R.R. Hildebrant (1987). Comparison of three neuron-based computation schemes. Proceedings of the IEEE First International Conference on Neural Networks, 657-667.
Chornoboy, E.S., L.P. Schramm and A.F. Karr (1988). Maximum likelihood identification of neural point processes. Biol. Cybernet. 59, 265-275.
Cope, D.K. and H.C. Tuckwell (1979). Firing rates of neurons with random excitation and inhibition. J. Theoret. Biol. 80, 1-14.
De Kwaadsteniet, J.W. (1982). Statistical analysis and stochastic modeling of neural spike train activity. Math. Biosci. 60, 17-71.
Fernald, R.D. (1971). A neuron model with spatially distributed synaptic input. Biophys. J. 11, 323-340.
Fienberg, S.E. (1974). Stochastic models for single neuron firing trains. Biometrics 30, 399-427.
Fukushima, K. (1975). Cognitron: a self-organizing multi-layered neural network. Biol. Cybernet. 20, 121-136.
Fukushima, K. (1978). Self-organizing neural network with a function of associative memory: feedback cognitron. Biol. Cybernet. 28, 201-208.
Fukushima, K. (1988). Neocognitron: a hierarchical neural network capable of visual pattern recognition. Neural Networks 1, 119-130.
Fukushima, K. and S. Miyaki (1982). Neocognitron: a new algorithm for pattern recognition. Pattern Recognition 15, 455-469.
Geman, S. and G.R. Hwang (1982). Nonparametric likelihood estimation by the method of sieves. Ann. Statist. 10, 401-414.
Gerstein, G.L. and D.H. Perkel (1972). Mutual temporal relationships among neuronal spike trains. Biophys. J. 12, 453-473.
Grenander, U. (1981). Abstract Inference. John Wiley, New York.
Grossberg, S. (1982). Studies of Mind and Brain: Neural Principles of Learning, Perception, Cognition and Motor Control. Reidel, Boston.
Grossberg, S. (1987). Competitive learning: from interactive activation to adaptive resonance. Cognitive Sci. 11, 23-63.
Habib, M.K. (1985). Parameter estimation for randomly stopped processes and neuronal modeling. Mimeo Series No. 1492, Institute of Statistics, The University of North Carolina, Chapel Hill.
Habib, M.K. and P.K. Sen (1985). Non-stationary stochastic point-process models in neurophysiology with applications to learning. In: P.K. Sen, Ed., Biostatistics: Statistics in Biomedical, Public Health and Environmental Sciences. Elsevier/North-Holland, Amsterdam, 481-509.
Habib, M.K. and A. Thavaneswaran (1990). Inference for stochastic neuronal models. Appl. Math. Comput. 38, 51-73.
Habib, M.K. and A. Thavaneswaran (1992). Optimal estimation for semi-martingale neuronal models. J. Statist. Plann. Inference 33, 143-156. (This issue.)
Hebb, D. (1949). Organization of Behavior. John Wiley, New York.
Heyde, C.C. (1992). New developments in inference for temporal stochastic processes. J. Statist. Plann. Inference 33, 121-129. (This issue.)
Hopfield, J.J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Nat. Acad. Sci. U.S.A. 79, 2554-2558.
Hopfield, J.J. (1984). Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Nat. Acad. Sci. U.S.A. 81, 3088-3092.
Hopfield, J.J. and D.W. Tank (1986). Computing with neural circuits: a model. Science 233, 625-633.
Johannesma, P. (1968). Diffusion models for the stochastic activity of neurons. In: E.K. Caraniello, Ed., Neural Networks. Springer, New York, 116-144.
Johnson, D.H. and A. Swami (1983). The transmission of signals by auditory-nerve fiber discharge patterns. J. Acoust. Soc. Amer. 74, 493-501.
Juliano, S., P.J. Hand and B. Whitsel (1981). Patterns of increased metabolic activity in somatosensory cortex of monkeys Macaca fascicularis, subjected to controlled cutaneous stimulation: a 2-deoxyglucose study. J. Neurophysiol. 46, 1260-1284.
Kallianpur, G. (1983). On the diffusion approximation to a discontinuous model for a single neuron. In: P.K. Sen, Ed., Contributions to Statistics. North-Holland, Amsterdam.
Kallianpur, G. and R. Wolpert (1984). Infinite dimensional stochastic differential equation models for spatially distributed neurons. Appl. Math. Optim. 12, 125-172.
Kohonen, T. (1984). Self-Organization and Associative Memory. Springer, Berlin.
Kohonen, T. and K. Makisara (1989). The self-organizing feature maps. Phys. Scr. 39, 168-172.
Kuffler, S.W. and I.G. Nicholls (1976). From Neuron to Brain. Sinauer, Sunderland, MA.
Le Cam, L. (1986). Statistical Methods in Asymptotic Decision Theory. Springer, New York.
Levy, B.C. and M.B. Adams (1987). Global optimization with stochastic neural networks. Proceedings of the IEEE International Conference on Neural Networks.
Malsburg, C. von der (1973). Self-organization of orientation sensitive cells in the striate cortex. Kybernetik 14, 85-100.
McCulloch, W.S. and W. Pitts (1943). A logical calculus of ideas immanent in nervous activity. Bull. Math. Biophys. 5, 115-133.
McKeague, I.W. (1986). Estimation for a semi-martingale model using the method of sieves. Ann. Statist. 13, 579-589.
Michalski, A., G.L. Gerstein, S. Czarkowska and T. Tarnecki (1983). Interactions between cat striate cortex neurons. Exp. Brain Res. 51, 97-107.
Minsky, M. and S. Papert (1969). Perceptrons. MIT Press, Cambridge, MA.
Neher, E. and C.F. Stevens (1977). Conductance fluctuations and ionic pores in membranes. Ann. Rev. Biophys. Bioeng. 6, 345-381.
Nguyen, H.T. and T.D. Pham (1982). Identification of non-stationary diffusion models by the method of sieves. SIAM J. Control Optim. 20, 603-611.
Palm, G., A.M.H.J. Aertsen and G.L. Gerstein (1988). On the significance of correlations among neuronal spike trains. Biol. Cybernet. 59, 1-11.
Perkel, D.H., G.L. Gerstein and G.P. Moore (1967). Neuronal spike trains and stochastic point processes II. Simultaneous spike trains. Biophys. J. 7, 419-440.
Rall, W. (1964). Theoretical significance of dendritic trees for neuronal input-output relations. In: R. Reiss, Ed., Neural Theory and Modeling. Stanford University Press, Stanford, CA.
Rall, W. (1977). Core conductor theory and cable properties. In: Handbook of Physiology. The Nervous System I, Vol. I. American Physiological Society, Bethesda, MD.
Ricciardi, L.M. and L. Sacerdote (1979). The Ornstein-Uhlenbeck process as a model for neuronal activity. Biol. Cybernet. 35, 1-9.
Rosenblatt, F. (1959). Two theorems of statistical separability in the perceptron. In: Mechanization of Thought Processes: Proceedings of a Symposium Held at the National Physical Laboratory, Vol. 1.
Rumelhart, D.E., G.E. Hinton and R.J. Williams (1986). Learning internal representations by error propagation. In: D.E. Rumelhart and J.L. McClelland, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I: Foundations. MIT Press, Cambridge, MA.
Rumelhart, D.E. and J.L. McClelland, Eds. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I: Foundations; Vol. II: Applications. MIT Press, Cambridge, MA.
Sejnowski, T.J. and P.K. Kienker (1986). Learning symmetry groups with hidden units: beyond the perceptron. Physica D 22, 260-275.
Tanaka, K. (1983). Cross-correlation analysis of geniculate neuronal relationships in cats. J. Neurophysiol. 49, 1303-1318.
Thavaneswaran, A. and M.K. Habib (1988). Recursive estimation for semi-martingales. J. Appl. Math. 19, 1901-1909.
Thavaneswaran, A. and M.E. Thompson (1986). Optimal estimation for semi-martingales. J. Appl. Probab. 23, 409-417.
Toyama, K., M. Kimura and K. Tanaka (1981). Cross-correlation analysis of interneuronal connectivity in cat visual cortex. J. Neurophysiol. 40, 191-201.
Tuckwell, H.C. (1989). Stochastic Processes in the Neural Sciences. Society for Industrial and Applied Mathematics, Philadelphia, PA.
Wacholder, E., J. Han and R.C. Mann (1989). A neural network algorithm for the multiple traveling salesman problem. Biol. Cybernet. 61, 11-19.
Wegman, E.J. (1975). Maximum likelihood estimation of a probability density. Sankhya Ser. A 37, 211-224.
White, H. (1989). Some asymptotic results for learning in single hidden-layer feed-forward network models. J. Amer. Statist. Assoc. 84, 1003-1013.
Whitsel, B., J. Roppolo and G. Werner (1972). Cortical information processing of stimulus motion on primate skin. J. Neurophysiol. 35, 691-717.
Yang, G.L. and T.C. Chen (1978). On statistical methods in neuronal spike-train analysis. Math. Biosci. 38, 1-34.
Yang, G.L. and C.E. Swenberg (1992). Estimation of open dwell time and problems of identifiability in channel experiments. J. Statist. Plann. Inference 33, 107-119. (This issue.)