Neural network architectures for motion perception and elementary motion detection in the fly visual system


Neural Networks, Vol. 3, pp. 487-505, 1990. Printed in the USA. All rights reserved. 0893-6080/90 $3.00 + .00 Copyright © 1990 Pergamon Press plc



ORIGINAL CONTRIBUTION

Neural Network Architectures for Motion Perception and Elementary Motion Detection in the Fly Visual System

H. ÖĞMEN
University of Houston

and

S. GAGNÉ
Université Laval

(Received 27 March 1989; revised and accepted 18 January 1990)

Abstract--Two distinct but complementary neural architectures for motion perception are presented. Part I of the paper describes a directionally selective, local motion detector based on the fly visual system. Model predictions are compared with experimental data and the relationship of the neural model with the Reichardt functional model is discussed. Part II of the paper introduces a motion-sensitive network, insensitive to the direction of motion. First, the local temporal discrimination problem is addressed in shunting networks and the limitations of the local scheme are used to motivate spatial interactions. Then, asynchronous interactions are studied in center-surround antagonistic shunting networks. Some properties of the resulting asynchronous shunting networks are established. Finally, extensions of the work and integration of these architectures into a global model are discussed.

Keywords--Motion perception, Vision, Gated dipole, Asynchronous shunting networks, Fly visual system.

Supported in part under Grant RIG-19381 by University of Houston, Grant A-9731 by NSERC (Canada), and Grant EQ2830 by FCAR (Quebec). Requests for reprints should be sent to Haluk Öğmen, Department of Electrical Engineering, University of Houston, Houston, TX 77204-4793.

1. INTRODUCTION

Vision is, for all but a few animals, the most important sensory interface between the environment and the central nervous system. Seeing is particularly important for a fast-moving animal such as the fly, which is the subject of interest herein. Perception and evaluation of motion in real time are prerequisites for achieving its high degree of performance. No artificial system has comparable optimal properties. It is then not surprising that many research efforts have been directed toward the understanding of motion perception in the visual system of this invertebrate (e.g., Egelhaaf, 1985a,b,c; Egelhaaf & Reichardt, 1987; Egelhaaf, Hausen, Reichardt, & Wehrhahn, 1988; Franceschini, 1985; Guo, 1988; Hausen, 1984; Lenting, Mastebroek, & Zaagman, 1984; Marmarelis & McCann, 1973; Mastebroek, Zaagman, & Lenting, 1982; McCann, 1973; Öğmen & Gagné, 1987a,b,c, 1988a; Reichardt, 1961, 1979, 1986; Reichardt & Guo, 1986; Reichardt & Poggio, 1979; Reichardt, Poggio, & Hausen, 1983; Reichardt & Egelhaaf, 1988; Riehle & Franceschini, 1984; de Ruyter van Steveninck, Zaagman, & Mastebroek, 1986; Schmid & Bülthoff, 1988; Srinivasan & Dvorak, 1979, 1980; Torre & Poggio, 1981; Zaagman, 1977; Zaagman, Mastebroek, & Kuiper, 1978).

For philosophical and historical reasons, the early work on fly perception took a behavioristic approach (Poggio & Reichardt, 1973; Reichardt, 1961, 1979). By considering the animal as a "black box" and using the Volterra series representation for nonlinear systems, a functional model based on the opto-motor response was proposed (the Reichardt correlator) and its predictions have been tested successfully in various experimental paradigms (rev.: Reichardt, 1979). This functional model has been the major building block of more elaborate constructs, and modified versions of it have been proposed for the human visual system (elaborated Reichardt detectors (ERD)) (van Doorn & Koenderink, 1983; van Santen & Sperling, 1984, 1985; Wilson, 1985). Although this model reveals interesting functional properties of motion detection and helps to predict plausible neural models, it does not give precise details concerning its real implementation. Too many

different mechanisms can be inferred from such a phenomenological model because functional properties of the global system cannot uniquely be related to its building blocks. For example, a highly nonlinear block, when followed by a block with low-pass filter characteristics,¹ can yield a global system that behaves like its linear approximation (i.e., high-order harmonics are negligible).² This is the basis of a method known as the Kochenburger approach in classical nonlinear control theory (Gille, 1977). Moreover, in general a given experimental procedure probes a limited subset of the properties of a system, and a coherent set of mechanisms of the model can not only generate this subset but also predict other emergent properties whereby multidimensional behavior is synthesized. This paper, by proposing a neural model of motion detection, aims to bridge the gap between functional models and neural data.

Visual motion detection is an early preattentive mechanism preceding many higher levels of information processing (Nakayama, 1985). The visual system of the fly does not elude this general scheme. The fly uses motion detection in many important aspects of perception, such as target detection in camouflage and target tracking, without any elaborate perception of form. In the human visual system, these basic tasks are interrelated by a complex, three-dimensional form perception. It is known that the peripheral part of the human visual system can detect motion without form perception, and initiate foveation of the moving target to induce form perception (Gregory, 1978). We suggest that there exist different specialized architectures to perform these complementary, but distinct tasks.

In Part I of the paper, a neural model for directionally selective motion detection in the visual system of the fly is proposed. This type of architecture deals with the general problem of motion detection and target tracking. In Part II, neural filters that are insensitive to the direction of motion are introduced. These filters are proposed as bottom-up components in a general scheme explaining multidimensional perception. The two parts are self-contained and can be read independently.

Real-time model setting requires a careful analysis of image transients. This is especially true for motion

¹ This is the case in most practical systems containing mechanical structures on which the nonlinear subsystem acts. For the fly, the measured torque is a smoothed (i.e., low-pass filtered) version of the neural command signal, which can possess abrupt variations.

² Wiener-type system identification methods applied to motion detector cells in the fly visual system showed that the predominating nonlinearity at the level of these cells was of degree one (Marmarelis & McCann, 1973).

perception. However, most of the models proposed so far, such as the original Reichardt correlator (Reichardt, 1961) and its elaborated versions (van Santen & Sperling, 1985), contain explicitly or implicitly a steady-state assumption. These formulations are valid only for stimuli moving with a constant velocity. This is a severe restriction, and although some recent efforts have considered this aspect (Egelhaaf & Reichardt, 1987; Reichardt & Guo, 1986), many questions still remain open. The main motivation behind our approach is to consider image transients as a central issue rather than a side effect, then to analyze rigorously the adaptation effects to tonic input signals and the transformation of tonic signals into phasic signals, and to build a real-time neural network architecture based on these results. There are many neural mechanisms, such as transmitter processes, asynchronous interactions between excitatory and inhibitory potentials, and voltage dependent c

through spatial and temporal adaptation mechanisms yielding nonlinear intensity processing and adaptivity properties. For a general review of motion perception the reader is referred to Nakayama (1985) and Ullman (1981).

2. PART I: NEURAL CORRELATES OF MOTION DETECTION IN THE FLY VISUAL SYSTEM

2.1. The Fly Visual System

The primary visual system of the fly is composed of three optic ganglia: lamina, medulla, and lobula complex (lobula and lobula plate) (Figure 1). Photoreceptor cells [R1-6]³ of the compound eye send their axons in an orderly fashion to well-defined anatomical and functional units of the lamina named cartridges. This retinotopy is preserved at the medulla level and the whole structure converges to a large class of motion sensitive cells in the lobula complex. One of these cells, the H1 cell, is a wide-field directional motion detector sensitive to ipsilateral regressive motion (Hausen, 1984). Accumulating evidence indicates that this cell is fed by small-field motion detectors located in the medulla (rev.: Buchner, 1984). Some experimental results suggest that these elementary motion detectors (EMD) have their inputs coming from a subsystem of the lamina cartridge (Franceschini, 1985; Laughlin, 1984; Riehle & Franceschini, 1984; Srinivasan & Dvorak, 1980). This subsystem conveys information centripetally by action potentials and its spatial and temporal properties have been well characterized (Arnett, 1972; Mimura, 1974, 1976). It has two types of units: one with a phasic on-off response, and one with a tonic response characterized by an overshoot and a plateau (Arnett, 1972). Although the exact anatomical mapping of this circuitry is not certain, a large body of experimental findings supports these views. In any case, exact anatomical localizations are not critical in this study.

³ The fly has two receptor groups: the R1-6 group, concerned with low-luminance detection, and the R7-8 group, which is involved in color detection at higher intensities.

2.2. Models for Lamina Units

Neural models for on-off and sustained units of the lamina have been proposed and compared with experimental data (Öğmen & Gagné, 1987a, 1990). These models stem from the study of transmitter processes as a mechanism to produce temporal adaptations. The square synapses in Figure 2 have to be considered as "special synapses" with a slowly accumulating transmitter package that gates the input signal. The amount of available transmitter at this synapse is denoted by $z(t)$. The dynamics of the transmitter is characterized by the following differential equation:

$$\frac{dz(t)}{dt} = \alpha\,(\beta - z(t)) - \mathcal{J}(t)\,z(t), \tag{1}$$

which states that the input signal $\mathcal{J}(t)$ depletes the transmitter proportionally to its strength and to the available amount of transmitter $z(t)$. A mechanism replenishes the consumed transmitter. In the passive accumulation interpretation, the term $\alpha(\beta - z(t))$ says that transmitter is produced at a rate proportional to the depleted amount $(\beta - z(t))$. In the active production interpretation, the term $\alpha\beta$ is the transmitter production rate and $-\alpha z(t)$ represents feedback inhibition by the end product. The gating effect of the transmitter is expressed at the output by the multiplication of the input signal with the available transmitter amount. Thus, the output signal $S(t)$ of this first stage is given by:

$$S(t) = \mathcal{J}(t)\,z(t), \tag{2}$$

which simply states that the signal transmitted to the postsynaptic cell, $S(t)$, is proportional not only to the input signal $\mathcal{J}(t)$ but also to the available amount of transmitter $z(t)$. To show that a temporal adaptation occurs, suppose that a step input (i.e., $\mathcal{J}(t) = 0$ for $t < 0$ and $\mathcal{J}(t) = \mathcal{J}$, a positive constant, for $t \ge 0$) is applied. Since $\mathcal{J}(t) = 0$ for $t < 0$, initially the transmitter is fully accumulated, that is $z(0^-) = \beta$. The solution of eqn (1) is:

$$z(t) = \frac{\alpha\beta}{\alpha + \mathcal{J}}\left(1 - e^{-(\alpha + \mathcal{J})t}\right) + \beta\,e^{-(\alpha + \mathcal{J})t} \tag{3}$$

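for $t \ge 0$. Substituting this expression in eqn (2), one finds for $t \ge 0$:

$$S(t) = \frac{\alpha\beta\mathcal{J}}{\alpha + \mathcal{J}}\left(1 - e^{-(\alpha + \mathcal{J})t}\right) + \beta\mathcal{J}\,e^{-(\alpha + \mathcal{J})t} \tag{4}$$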

which shows that after the onset of the stimulus, the transmitted signal overshoots to the value $\beta\mathcal{J}$. Following this overshoot, the transmitter depletion produces an "adaptation" in the response since $S(t)$ decays exponentially to the plateau value $\alpha\beta\mathcal{J}/(\alpha + \mathcal{J})$, which represents the "adapted response." In summary, we have just shown that transmission with depletable transmitters results in temporal adaptation, which is the discrimination of temporal activity from temporal background (a signal constant in time).

FIGURE 1. Fly visual system: A horizontal cut through the fly compound eye. OM: ommatidium, LE: lens, CO: crystal cone, SC: Semper cells, RC: retinula cell, RH: rhabdomere, PI: pigment granules, LA: lamina, CE: external chiasma, NE: neuron cell body, ME: medulla, CI: inner chiasma, LO: lobula, LP: lobula plate, CB: central brain, AA: a ~ e angle. Inset: A section through an ommatidium perpendicular to the main axis of the ommatidium (from Zaagman, 1977).

The second problem faced by the peripheral neurons is the discrimination of spatial inputs from spatial background (the part of the signal which is constant in space). Evidently, the solution of this problem requires lateral interactions since a single channel does not carry information about the background stimulus. For simplicity, let us start with two channels and introduce lateral inhibition between gated signals at a second stage as shown in Figure 2. Mathematical characterization of interactions at this second stage is given by the following equations:

$$\frac{dx_{on}}{dt} = -A x_{on} + (B - x_{on})\,[I + J_{on}]\,z_{on} - (D + x_{on})\,I z_{off} \tag{5}$$

$$\frac{dx_{off}}{dt} = -A x_{off} + (B - x_{off})\,I z_{off} - (D + x_{off})\,[I + J_{on}]\,z_{on} \tag{6}$$

where $A$ gives the rate of spontaneous decay of the cell activity.⁴ $B$ and $-D$ relate to the upper and lower levels of saturation of the activity. The term multiplying $(B - x)$ is the excitatory input while the term multiplying $(D + x)$ is the inhibitory input. This is a shunting equation, and a review of general properties of shunting equations can be found in Grossberg (1988b). It is important to note here that shunting equations are equivalent to the membrane equation through a simple change of variables (Öğmen & Gagné, 1990). $I$ is an arousal signal feeding both the ON and OFF channels. Furthermore, it is assumed that variations in the transmitter amount are much slower than variations in the cell potential.

⁴ We will use $x$ to designate both the cell and its activity. The distinction should follow from the context.

FIGURE 2. The gated dipole is composed of two opponent channels denoted as "on" and "off" channels. Both channels receive the same arousal signal I. In addition to this arousal signal, the on-channel receives an extra input J. Synapses marked by squares denote synapses with multiplicatively acting (i.e., gating) transmitters. Following this stage, channels compete through lateral inhibition.

We study first the case where no external input is applied to the off-channel, that is $K = 0$, and the on-channel receives an additional input $J_{on}$.⁵ Thus, $\mathcal{J} = I + J_{on}$ for the on-channel and $\mathcal{J} = I$ for the off-channel. The slow time scale of transmitter dynamics with respect to cell dynamics allows us to write eqns (5) and (6) in their equilibrium state as:

$$x_{on} = \frac{B\,[I + J_{on}]\,z_{on} - D\,I z_{off}}{A + [I + J_{on}]\,z_{on} + I z_{off}} \tag{7}$$

and

$$x_{off} = \frac{B\,I z_{off} - D\,[I + J_{on}]\,z_{on}}{A + [I + J_{on}]\,z_{on} + I z_{off}}. \tag{8}$$

⁵ Note that in general this case applies even when $K \neq 0$ through a change of variables.

The reason for calling these cells ON and OFF cells is the following: Suppose that $J_{on}(t) = 0$ for $t < 0$. Then the activities will equilibrate to the following value:

$$x_{on} = x_{off} = \frac{(B - D)\,I z_{on}}{A + 2 I z_{on}} \tag{9}$$

with

$$z_{on} = z_{off} = \frac{\alpha\beta}{\alpha + I}. \tag{10}$$

Let us now consider what happens when $J_{on}$ is turned on ($J_{on}(t) = J$, a positive constant) at $t = 0^+$. Substituting this value in eqns (7) and (8) and comparing with eqn (9), since $z_{on}(0^+) \approx z_{off}(0^+) = \alpha\beta/(\alpha + I)$, one can see that $x_{on}$ will overshoot and $x_{off}$ will undershoot. Thus, $x_{on}$ responds to the onset of its specific input $J_{on}$. Solving eqn (1) in these conditions, one can see that following the overshoot, the transmitter at the on-channel will equilibrate to:

$$z_{on} = \frac{\alpha\beta}{\alpha + I + J}, \tag{11}$$

while the amount of transmitter at the off-channel will stay unchanged. This decrease in the transmitter amount will cause a decrease in the activity of $x_{on}$, that is the overshoot will be followed by a plateau (temporal adaptation). Now, if one turns off the specific input ($J_{on}(t) = 0$), inspection of eqns (8), (10), and (11) shows that the off-cell will rebound. This is called antagonistic rebound, and since $x_{off}$ responds to the offset of the stimulus it is called the off-cell.

We will now show that lateral interactions generate spatial adaptation. If a spatially homogeneous input is applied, then $J = K$. Equations (7) and (8) become:

$$x_{on} = x_{off} = \frac{\alpha\beta\,(B - D)(I + J)}{A\,(\alpha + I + J) + 2\alpha\beta\,(I + J)}. \tag{12}$$

For definiteness, suppose that the cell fires iff $x > 0$. Then, it is sufficient to impose $B \le D$ to have the homogeneous activity suppressed regardless of the fluctuations on $J_{on}$. Suppose now that we form an array of such gated dipoles as illustrated in Figure 3. Then, the dynamics of the $i$th cell in this array is characterized by the following equations:

$$\frac{dx_i}{dt} = -A x_i + (B - x_i)\,[I + J_i]\,z_i - (D + x_i) \sum_{k \neq i} [I + J_k]\,z_k \tag{13}$$

$$\frac{dz_i}{dt} = \alpha\,(\beta - z_i) - (I + J_i)\,z_i. \tag{14}$$

According to the network connections illustrated in Figure 3, lateral connections are limited to the immediate neighbors, that is $i - 1 \le k \le i + 1$. If a given location $i$ is stimulated, then the cell $x_i$ will respond with a sustained activity composed of an overshoot and a plateau. By comparing this architecture with the basic gated dipole (cf. Figure 2) it is easily seen that cells $x_{i+1}$ and $x_{i-1}$ are equivalent to the off-cells for this stimulation: they respond to the offset of the stimulus at location $i$. Predictions of this model and comparisons with electrophysiological data can be found in Öğmen & Gagné (1987a, 1990). These data show that sustained units exhibit spatial and temporal adaptation as well as transient off responses, as predicted by the model.
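The behavior described for the array of gated dipoles can be explored numerically. The following Python sketch integrates eqns (13) and (14) with a forward Euler scheme; it is an illustration under stated assumptions and not the simulation code used for the published comparisons. All numerical constants, the ring topology (used only to avoid edge cells), and the names `run` and `local_step` are hypothetical. With two inhibitory neighbors per cell, a uniform input cancels exactly when $B = 2D$, which is why that value is used here; this choice is an assumption of the sketch and not a parameter reported in the text.

```python
# Sketch of eqns (13) and (14) on a ring of channels; every constant is an assumption.
ALPHA, BETA = 0.005, 10.0   # transmitter accumulation rate and ceiling
A, B, D = 1.0, 2.0, 1.0     # B = 2*D: two inhibitory neighbors balance one excitatory input
I_AR = 0.05                 # arousal signal I
N, DT = 6, 0.01             # number of channels (ring) and Euler step
T_ON, T_OFF, T_END = 10.0, 110.0, 260.0
CENTER = 2                  # index of the locally stimulated channel


def run(inputs_at, t_end):
    """Euler-integrate eqns (13), (14); inputs_at(t) returns the list of J_i."""
    z = [ALPHA * BETA / (ALPHA + I_AR)] * N   # transmitters adapted to I alone
    x = [0.0] * N
    history = []
    for n in range(int(round(t_end / DT))):
        t = n * DT
        j = inputs_at(t)
        g = [(I_AR + j[i]) * z[i] for i in range(N)]        # gated signals
        x_next = []
        for i in range(N):
            lateral = g[(i - 1) % N] + g[(i + 1) % N]       # immediate neighbors only
            dx = -A * x[i] + (B - x[i]) * g[i] - (D + x[i]) * lateral
            x_next.append(x[i] + DT * dx)
        z = [z[i] + DT * (ALPHA * (BETA - z[i]) - (I_AR + j[i]) * z[i])
             for i in range(N)]
        x = x_next
        history.append((t + DT, x))
    return history


def local_step(t):
    """Input J = 0.15 on the CENTER channel between T_ON and T_OFF, zero elsewhere."""
    return [0.15 if (i == CENTER and T_ON <= t < T_OFF) else 0.0 for i in range(N)]


uniform = run(lambda t: [0.15] * N, 100.0)   # spatially homogeneous input
local = run(local_step, T_END)               # single stimulated channel

uniform_max = max(abs(v) for _, xs in uniform for v in xs)
during = [xs for t, xs in local if T_ON < t <= T_OFF]
after = [xs for t, xs in local if t > T_OFF]
center_peak = max(xs[CENTER] for xs in during)
center_plateau = during[-1][CENTER]
neighbor_rebound = max(xs[CENTER + 1] for xs in after)

print(f"uniform input: max |x_i| = {uniform_max:.2e}")
print(f"local input: center peak {center_peak:.3f}, plateau {center_plateau:.4f}")
print(f"neighbor rebound after offset: {neighbor_rebound:.3f}")
```

Under these assumed parameters the uniform input leaves every cell at rest (spatial adaptation), the stimulated cell shows an overshoot followed by a lower positive plateau (temporal adaptation), and its neighbor is inhibited during the stimulus and rebounds at the offset.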

FIGURE 3. The model for sustained units is an array of gated dipoles. Each channel receives an input $J_i$, as well as the arousal signal $I$. Lateral inhibition takes place after the gating stage of depletable transmitters.

A slightly modified version of this architecture has been proposed for the on-off units of the lamina (Öğmen & Gagné, 1987a, 1990). This architecture is depicted in Figure 4. As mentioned earlier, the response $x_{on}$ is composed of an overshoot followed by a plateau due to the habituation process of the transmitter. By setting a high signal threshold, one captures essentially the transient phase of the response and thus obtains a cell that responds phasically to the onset of the input. The on-off unit is then synthesized by bringing together this response and the antagonistic rebound at a third-order cell $y_i$ characterized by:

$$\frac{dy_i}{dt} = -A y_i + (B - y_i)\left(F\,[\tilde{x}_{on,i} - \Gamma]^+ + G\,[\tilde{x}_{off,i} - \Gamma]^+\right), \tag{15}$$

where $[a]^+ = \max(0, a)$, $F$, $G$ are constants and $\Gamma$ is the signal threshold.⁶ We call this structure an augmented gated dipole.
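To illustrate how thresholding turns the sustained dipole responses into a phasic on-off signal, the Python sketch below drives a third-order cell with rectified on- and off-cell activities. It is a minimal sketch under several assumptions: the transmitter follows the closed-form solution of eqn (1) for piecewise-constant input, the dipole cells sit at their equilibria (7) and (8) (the slow-transmitter approximation), both rectified terms share one threshold, and every numerical value, as well as the helper names `relax`, `z_on_at`, and `on_off_response`, is hypothetical.

```python
import math

# Illustrative constants only; none of them is taken from the paper.
ALPHA, BETA = 0.005, 10.0        # transmitter accumulation rate and ceiling
A, B, D = 1.0, 1.0, 1.0          # shunting constants of the dipole and of y
I_AR, J_ON = 0.05, 0.10          # arousal signal I and specific input J_on
F, G, GAMMA = 10.0, 10.0, 0.01   # gains and signal threshold of the third-order cell
T_ON, T_OFF, T_END, DT = 10.0, 110.0, 200.0, 0.01


def rect(a):
    """[a]^+ = max(0, a)."""
    return max(0.0, a)


def relax(z_start, s, elapsed):
    """Closed-form solution of eqn (1) for a constant input s."""
    z_inf = ALPHA * BETA / (ALPHA + s)
    return z_inf + (z_start - z_inf) * math.exp(-(ALPHA + s) * elapsed)


def z_on_at(t):
    """Transmitter of the on-channel for a step of J_on between T_ON and T_OFF."""
    z_rest = ALPHA * BETA / (ALPHA + I_AR)
    if t < T_ON:
        return z_rest
    if t < T_OFF:
        return relax(z_rest, I_AR + J_ON, t - T_ON)
    z_at_offset = relax(z_rest, I_AR + J_ON, T_OFF - T_ON)
    return relax(z_at_offset, I_AR, t - T_OFF)


def on_off_response():
    """Euler-integrate the third-order cell y driven by rectified x_on and x_off."""
    z_off = ALPHA * BETA / (ALPHA + I_AR)
    y, out = 0.0, []
    for n in range(int(round(T_END / DT))):
        t = n * DT
        s_on = I_AR + (J_ON if T_ON <= t < T_OFF else 0.0)
        g_on, g_off = s_on * z_on_at(t), I_AR * z_off
        x_on = (B * g_on - D * g_off) / (A + g_on + g_off)     # eqn (7)
        x_off = (B * g_off - D * g_on) / (A + g_on + g_off)    # eqn (8)
        drive = F * rect(x_on - GAMMA) + G * rect(x_off - GAMMA)
        y += DT * (-A * y + (B - y) * drive)
        out.append((t + DT, y))
    return out


response = on_off_response()
on_peak = max(y for t, y in response if T_ON < t <= T_ON + 20.0)
mid_level = max(y for t, y in response if 50.0 <= t <= 100.0)
off_peak = max(y for t, y in response if t > T_OFF)
print(f"on transient {on_peak:.3f}, during plateau {mid_level:.1e}, off transient {off_peak:.3f}")
```

With the assumed threshold, the plateau of $x_{on}$ stays below $\Gamma$, so the third-order cell is active only for a short period after the onset and after the offset of the input.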

2.3. Directionally Selective Elementary Motion Detectors

We intend to present an elementary motion detector (EMD) model for the insect visual system using both the augmented gated dipole and the array of gated dipoles. Theoretically, detection of motion requires a minimum of two spatially distinct channels. In fact, experimental evidence suggests that the fly visual system operates with this minimal construct, that is the fly can detect motion by two spatially adjacent channels (Riehle & Franceschini, 1984). Our model for the EMD is depicted in Figure 5. Each lamina cartridge contains a sustained (box marked S) and an on-off (box marked O) unit. Sustained units are laterally connected to achieve spatial adaptation as explained in the previous section. Comparing Figures 5 and 3, one can see that each box S in Figure 5 corresponds to a cell $x_i$ in Figure 3. The internal architecture of each box O is given in Figure 4. Neighboring sustained and on-off units make excitatory connections to the EMD cells located in the medulla (box marked EMD). The output of the sustained unit is relayed through a delaying function (box marked D) denoted by $f(\cdot)$. A well-known candidate for this function is a pure delay created by axonal transmission. However, a cell with slow dynamics is also possible. Excitatory connections from sustained and on-off units enable the EMD to sense the presence and temporal variations of stimuli exciting two spatially adjacent channels.

⁶ Note that while we use a relation of the type linear above threshold (i.e., $[\cdot]^+$), a more general formulation can make use of sigmoidal functions.

FIGURE 4. The model for on-off units: the augmented gated dipole. A third-order cell is added to the basic gated dipole anatomy of Figure 2. The threshold of the cell $x_{on}$ is chosen such that mainly the overshoot of its response is transmitted. The third-order cell sums this "on" response with the "off" response transmitted by $x_{off}$.

The temporal adaptation mechanism of the sustained unit enhances novel stimuli. Therefore, the information from the sustained unit is complex and reflects the presence, intensity, and the temporal variations of the stimulus. The spatial adaptation mechanism of the sustained units suppresses the background activity and only the presence of a local stimulus is signaled by these cells. The on-off units signal to the EMDs a local change. The basic function of the EMD is to compare past activities on a channel with present changes on the neighboring channel. If there is a correlation, then a local cue for motion is signaled to higher-order integrating neurons (cf. section 2.4). A simple mechanism capable of this correlation measurement is a facilitatory one: the output of the sustained unit is delayed and compared with the output of the neighboring on-off unit by adding these signals and by thresholding the sum. The threshold $\Gamma_{emd}$ is chosen such that, in normal conditions, a single input to the EMD (i.e., the delayed output of a sustained unit or the output of an on-off unit) induces no output by itself. However, when the inputs to the EMD coincide in time, the pooled activity can be high enough to cause the cell to fire. This nonlinear facilitatory mechanism has already


FIGURE 5. Model of the elementary motion detector (EMD) of the fly. Each lamina cartridge contains a sustained (box marked S) and an "on-off" (box marked O) unit. As shown by lateral connections, the sustained units are laterally connected (cf. Figure 3). Each cartridge receives its light input according to its optical axis as shown in the figure. The outputs of the sustained and on-off units feed the elementary motion detectors located in the medulla. The outputs of sustained units are relayed through a delaying mechanism (box D). (This delaying element exists in every cartridge. Only one is shown in the figure.) Each cartridge of the medulla contains EMDs with opposite preferred directions, represented with boxes labeled EMD. The arrow at the bottom of the box shows the preferred direction of the EMD. Each EMD receives excitatory inputs from sustained and on-off units, and the spatial location of these units determines the preferred direction of the EMD. The curve at the top of the box shows that the EMD has a threshold-type nonlinearity. If the delayed activity of the sustained unit coincides in time with the activity of the neighboring on-off unit and if the pooled activity exceeds the threshold of the EMD, the EMD will fire. Note that only EMDs tuned to horizontal movement are shown in a single cartridge. Other medulla cartridges contain similar EMDs. Finally, the lobula plate cell H1 integrates the outputs of these horizontally tuned EMDs (only one is shown). This integration process also implements the opponency: regressive motion selective EMDs make excitatory connections while progressive motion selective EMDs make inhibitory connections. Note that the exact anatomical localization of some of these units is not known with accuracy.

been proposed for evaluating coincidence of patterns (Grossberg, 1970) and for computing multiplication of neural signals (Srinivasan & Bernard, 1976). Because of this nonlinearity and intensity dependence, this scheme is called facilitation at low input intensities and occlusion at high input intensities. The facilitatory activity from the sustained unit is graded and gradually decays. In order to obtain a temporal overlap of activities, so that the threshold can be exceeded and motion be signaled, the neighboring stimulation should occur within a time interval set by the decay of the facilitatory activity. Thus, the EMD is tuned to a range of velocities. The tuning curve is graded and roughly follows the decay of the facilitatory input coming from the sustained units. Moreover, the firing threshold of the EMD plays a critical role in the velocity tuning curve of the EMD. A high threshold requires a good temporal match and results in a sharp velocity tuning curve. On the other hand, a low threshold permits matches over longer periods of time and results in a wider velocity tuning curve.

Another aspect that deserves consideration is the intensity dependence of the facilitatory activity. It implies that the output of the EMD is a function of the input intensity and a trade-off is required between velocity tuning and input sensitivity. The above remarks can be translated into predictions: An EMD can be stimulated with static inputs with very high intensities, and under these conditions the temporal adaptation effect will lead in general to a transient output. This is in agreement with data which show that a single spot with a very high intensity can produce an activity at the motion detecting cells (Riehle & Franceschini, 1984). The graded effect of facilitation predicts, in agreement with experimental data (Riehle & Franceschini, 1984), that a temporal overlap of stimulation of neighboring channels is not necessary to produce motion detection (cf. Figure 6B). The above remarks also suggest a strategy to deal with the intensity sensitivity-velocity tuning trade-off: A nonspecific signal reflecting the total input can be used as an arousal signal to the EMD. However, understanding the extent to which such a strategy is used in the fly visual system requires further experimental studies.

Thus, the general setting of the model is a "memorize-and-compare" scheme. The sustained units signal the presence of an input by accentuating transients and spatial discontinuities. This signal is delayed and compared with local temporal changes signaled by the neighboring on-off units. If there is a good match between these two activities, the EMD interprets these correlated activities as the consequence of a motion. It is important to note that there are two basic memory mechanisms in this model: delaying units with graded activity and transmitters. Habituation due to transmitter depletion is in fact a simple form of memory since the amount of available transmitter changes slowly and reflects previous levels of input.
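The "memorize-and-compare" scheme and the role of the threshold in velocity tuning can be made concrete with a deliberately simplified Python sketch. It is a toy abstraction, not the neural model itself: the delayed sustained signal is assumed to decay exponentially, the on-off response is reduced to a single pulse, time is in arbitrary units, and all names and constants (`DELAY`, `TAU`, `S0`, `PULSE`, `emd_output`, `tuning_width`) are hypothetical.

```python
import math

# Toy abstraction with assumed values in arbitrary time and amplitude units.
DELAY = 5.0    # delay introduced by the delaying unit (box D)
TAU = 20.0     # decay time constant of the facilitatory signal
S0 = 1.0       # peak of the delayed sustained-unit output
PULSE = 1.0    # amplitude of the neighboring on-off response


def facilitation(t_since_first):
    """Delayed, exponentially decaying trace left by the sustained unit."""
    if t_since_first < DELAY:
        return 0.0
    return S0 * math.exp(-(t_since_first - DELAY) / TAU)


def emd_output(asynchrony, gamma_emd):
    """Thresholded sum at the moment the second channel is stimulated.

    asynchrony > 0: the first channel leads (preferred direction);
    asynchrony < 0: the second channel leads (null direction).
    """
    pooled = facilitation(asynchrony) + PULSE
    return max(0.0, pooled - gamma_emd)


def tuning_width(gamma_emd, step=0.5, max_async=200.0):
    """Total span of asynchronies that drive the EMD above threshold."""
    n = int(max_async / step)
    hits = sum(1 for k in range(n + 1) if emd_output(k * step, gamma_emd) > 0.0)
    return step * hits


for gamma in (1.2, 1.8):
    print(f"threshold {gamma}: preferred {emd_output(10.0, gamma):.3f}, "
          f"null {emd_output(-10.0, gamma):.3f}, "
          f"tuning width {tuning_width(gamma):.1f}")
```

Because each threshold used here exceeds both `S0` and `PULSE`, neither input alone produces an output, and synchronous stimulation (asynchrony 0) also fails because the delayed trace has not yet arrived. Raising the threshold shortens the range of asynchronies that produce a response, which corresponds to the sharper velocity tuning described above.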
Let us denote the activity of the EMD at position $i$ by $x_{emd,i}(t)$. The dynamics of an EMD with preferred direction from channel $i$ to channel $i + 1$ is described by eqns (13)-(15); eqns (5), (6) with the following substitutions: $x_{on} \to \tilde{x}_{on,i}$, $x_{off} \to \tilde{x}_{off,i}$, $J_{on} \to J_i$, where $\tilde{z}_{on,i}$ and $\tilde{z}_{off,i}$ obey the corresponding versions of eqn (1); and the following equation:

$$\frac{dx_{emd,i}}{dt} = -A x_{emd,i} + (B - x_{emd,i})\left(f\!\left([x_i - \Gamma]^+\right) + [y_{i+1} - \Gamma]^+\right). \tag{16}$$

2.4. Wide-field Integration and Motion Opponency

This first stage of local motion detection somehow has to give rise to wide-field vision. In the fly visual system, it is known that the lobula cells realize that function (Hausen, 1984). The following equation is proposed for the activity $x_{lob}$ of the lobula cells:

$$\frac{dx_{lob}}{dt} = -A x_{lob} + (B - x_{lob}) \sum_{i,j} H_{i,j}\,[x_{emd,i,j} - \Gamma_{emd}]^+ - (D + x_{lob}) \sum_{i,j} J_{i,j}\,[\bar{x}_{emd,i,j} - \Gamma_{emd}]^+, \tag{17}$$

where $x_{emd,i,j}$ and $\bar{x}_{emd,i,j}$ are elementary motion detectors with opposite preferred directions, the subscript $j$ denotes the EMD's spatial sampling distance,⁷ and the coefficients $H_{i,j}$ and $J_{i,j}$ account for the connection strengths. These connection strengths determine the receptive field profiles of lobula plate cells. Consequently, the general expression (17) can be specialized to various classes of cells found in the lobula plate by a proper choice of $H_{i,j}$ and $J_{i,j}$. The right-hand side of this equation has three terms. The first one is the spontaneous decay term accounting for the return to the resting potential of the cell once the stimulation is turned off. The second term is an excitatory input from EMDs having the same preferred direction as the lobula plate cell. The last term is an inhibitory input from EMDs with opposite preferred direction. Thus, we have a shunting competition between EMDs with opposite preferred directions and this results in motion opponency. If a single EMD for each direction is activated through stimulation of only two adjacent cartridges, summations in eqn (17) can be replaced by single inputs and at equilibrium this yields:

$$x_{lob} = \frac{B H_{i,j}\,[x_{emd,i,j} - \Gamma_{emd}]^+ - D J_{i,j}\,[\bar{x}_{emd,i,j} - \Gamma_{emd}]^+}{A + H_{i,j}\,[x_{emd,i,j} - \Gamma_{emd}]^+ + J_{i,j}\,[\bar{x}_{emd,i,j} - \Gamma_{emd}]^+}. \tag{18}$$

The equivalence of some alternative models (Adelson & Bergen, 1985; Watson & Ahumada, 1985) to the Reichardt correlator being discussed elsewhere (van Santen & Sperling, 1985), we will compare our neural model with the functional model of Reichardt. Inspection of Figure 5 shows that, at the functional level, the sustained unit with the delaying unit corresponds to the "low-pass filters" while the augmented gated dipole (on-off units) can be identified as a "high-pass filter." Moreover, the facilitation scheme realizes the multiplication and the H1 cell the opponency and spatial integration.⁸ Furthermore, between the two "logical circuit" models (NAND and AND) of Barlow-Levick (1965), the facilitatory scheme is better identified with the "AND" model. However, our model is neither a purely excitatory model nor a purely inhibitory

model since both excitation and inhibition play an important role. This issue is critical since a simplistic reduction of motion detection into a single synaptic interaction can yield contradictory interpretations of different experimental data, as will be discussed in the last section.

⁷ This "distance" is the difference of the indices of sustained and on-off units. For example, in eqn (16) it is equal to 1.

⁸ Note that these identifications are crude and our model has nonlinearities at an early stage.

Although our model is formulated for the fly visual system, it can be elaborated for the human visual system by introducing more complex receptive field profiles in eqn (13) to prevent spatial aliasing. Since such filters will be studied in Part II, comparisons in this respect with models proposed for the human visual system (Adelson & Bergen, 1985; van Santen & Sperling, 1984; Watson & Ahumada, 1985) will be given in Part II and Appendix C. There are important differences between our model and these correlators. The "steady-state" assumption of the original Reichardt correlator and ERDs has already been mentioned in the Introduction. We will now look at adaptive properties of motion detectors. Experimental data show that the "filters" in the detector are adaptive, that is they have variable time constants (Borst & Egelhaaf, 1987; de Ruyter van Steveninck et al., 1986). Since our model is dependent on temporal and spatial adaptation, adaptive filters can be identified in many stages. For instance, if eqn (13) is written in the following form:

$$\frac{dx_i}{dt} = -\left(A + [I + J_i]\,z_i + \sum_{k \neq i} [I + J_k]\,z_k\right) x_i + B\,[I + J_i]\,z_i - D \sum_{k \neq i} [I + J_k]\,z_k, \tag{19}$$

one can see that this differential equation has time-varying coefficients, that is, it results in an adaptive filter. This is a corollary of shunting interactions and this issue will be discussed further in Part II. From experimental measurements of the saturation curve of the H1 cell, Lenting et al. found the following expression to fit the data (Lenting et al., 1984):

$$s(x) = \frac{x}{a + x}, \tag{20}$$

where $a$ is a positive constant. In the basic Reichardt correlator and ERDs, saturation is not taken into account. In our model, saturation also is a corollary of the shunting mechanism. By putting [X~,-~d.,F~.~]" = 0 in eqn (18) (since the motion produced in the experiments is in the preferred direction), one can see that saturation of H1 has the same form in our model. Note that eqn (20) has the form of a Weber law; the relationship between the Weber law and shunting equations will be further discussed in Part II.

It is known that adequate stimulation of two adjacent channels (cartridges) in the fly visual system modifies the activity of lobula cells (Riehle & Franceschini, 1984). This experimental procedure is remarkable since it makes it possible to probe the activity of a single EMD while all other EMDs remain inactive. Therefore, in these conditions, eqn (17) can be replaced by eqn (18). This is the paradigm used in the Riehle-Franceschini experiments. Figure 6 shows the responses of the H1 neuron to different stimulations and the corresponding model simulations. Two photoreceptor cells are microstimulated according to the sequence and duration times shown under the horizontal axis (time axis). Spike histograms of Figure 6A show that for the sequence corresponding to the preferred direction, the cell responds with a vigorous spike discharge, while for the sequence corresponding to the null direction, no discharge is observed. Although only an excitatory effect is apparent in A, examination of histogram F shows that stimulation of the EMD in the null direction inhibits the activity of H1. This effect is revealed in the recordings shown in histogram F because in this case the H1 neuron had a spontaneous activity. Single spot and synchronous stimulations of channels give no spike discharge, as shown in histograms C, D, and E. Histograms B show that a temporal overlap of stimuli is not necessary to generate the on-off response. These recordings suggest that stimulation of the first channel has a facilitatory effect on the on-off response of the adjacent channel and that EMDs with opposite preferred directions compete.

The model also predicts that while the transmitter habituation process is a useful mechanism to detect local variations from background, if not compensated with other strategies, it will lead to a decrease in the amplitude of the motion detector cell response. Thus, in a straightforward implementation, the model predicts that increasing the background intensity will cause a decrease in the cell response.

In summary, Part I studied transmitter processes as a mechanism to produce temporal adaptation. When coupled with lateral inhibition to produce spatial adaptation, the resulting anatomy leads to a rather general scheme called the gated dipole. Variants of the gated dipole are used to model early processing units. The outputs of these units are combined to achieve elementary motion detection, that is, local, directionally selective motion detection. Furthermore, spatially distributed activities of EMDs are integrated at a lobula level cell with competition between opposite directions. The resulting architecture is tuned to a range of velocities. Moreover, it is not a pure velocity sensor since other parameters such as contrast affect the response.

3. PART II: ASYNCHRONOUS SHUNTING NETWORKS

In this part of the paper, a motion-sensitive neural network, insensitive to the direction of motion, will

FIGURE 6. Responses of the H1 cell to various stimuli (histograms A to F; horizontal axis: time in ms). The left column shows the Riehle-Franceschini data.

be described. As stated in the Introduction, this part starts with a theoretical analysis of asynchronous interactions between excitatory and inhibitory signals as a possible mechanism for temporal adaptation. The temporal discrimination problem is addressed first and solutions are then found. However, the local nature of processing imposes important limitations on the solutions. As in Part I, the local temporal discrimination network is then generalized to include spatial interactions. Through these interactions, local temporal processing is generalized to spatio-temporal processing and the network becomes motion sensitive. The resulting asynchronous center-surround antagonistic shunting network is studied and compared with synchronous shunting networks. Moreover, Appendix C shows that many spatio-temporal models proposed in the literature can be expressed in terms of additive networks.

3.1. Local Temporal Discrimination

Temporal discrimination is the detection of temporal variations in a signal while suppressing temporally constant components. In its simplest form it is equivalent to the mathematical differentiation operator. Let us start by considering the backward difference formula:

$$\frac{\Delta I(t)}{\Delta t} = \frac{I(t) - I(t - \Delta t)}{\Delta t}. \tag{21}$$

This equation suggests as a possible neural implementation the use of delay and inhibition. This is illustrated in Figure 7 (right), where the input is delayed by an interneuron which inhibits the neuron receiving the input. This is a feed-forward solution, that is, a nonrecurrent circuit. A priori, a similar recurrent circuit, as shown in Figure 7 (left), is also possible. In that case, the output is sent back with a delay to shut off the activity of the cell. If the nodes are linear, then the recurrent anatomy will lead to an oscillatory behavior for sustained inputs. This follows from the fact that the difference equation corresponding to the recurrent anatomy,

$$x(t) = I(t) - x(t - \Delta t), \tag{22}$$

where $x(t)$ is the output of the unit, has a purely imaginary pole. Therefore, the recurrent anatomy is ruled out.

FIGURE 7. Recurrent (left) and nonrecurrent (right) anatomies. In the recurrent anatomy the input signal excites the cell $x_1$. The output of this cell inhibits itself through the feedback loop via $x_2$. In the nonrecurrent anatomy the input signal excites both the main cell $x_1$ and an interneuron $x_2$. This interneuron inhibits $x_1$ and the overall anatomy is a feedforward circuit.

The next step is to generalize this result to nonlinear neuron models. Two theorems by Grossberg (1970) established that for additive equations only the nonrecurrent anatomy can achieve the desired temporal discrimination task. The following theorems are a straightforward extension of these theorems to shunting networks. The mathematical expressions for the shunting recurrent anatomy are given by:

$$\frac{dx_1(t)}{dt} = -\alpha x_1(t) + (B - x_1(t))\,I(t) - (D + x_1(t))\,\beta\,[x_2(t - \sigma) - \Omega]^+, \tag{23}$$

$$\frac{dx_2(t)}{dt} = -\gamma x_2(t) + (B - x_2(t))\,\delta\,[x_1(t - \tau) - \Gamma]^+, \tag{24}$$

where $x_i$ is the average potential of cell $i$, with $\alpha$, $\gamma$ ($\alpha > 0$, $\gamma > 0$) being the spontaneous decay rates. Parameters $\sigma$ and $\tau$ ($\sigma \ge 0$, $\tau \ge 0$) are the transmission delays while $\Omega$, $\Gamma$ are the output signal thresholds. $I$ ($I \ge 0$) is the external input; $\beta$ and $\delta$ are positive amplification factors. $B$ and $-D$ ($B \ge 0$, $D \ge 0$) correspond to the saturation levels. According to the connections of Figure 7 (left), the external input $I$ excites the main cell, whose activity $x_1$ excites the auxiliary cell 2. Moreover, the activity $x_2$ of this auxiliary cell inhibits the main cell. We will study the response of this system for an asymptotically steady-state input, that is, when $I = \lim_{t \to \infty} I(t)$ exists. The input intensity $I$ is chosen such that it can drive an output signal without inhibition,⁹ that is:

$$I > \frac{\alpha \Gamma}{B - \Gamma}. \tag{25}$$

THEOREM 1 (Shunting recurrent). Under the above hypothesis, either the limits $x_i = \lim_{t \to \infty} x_i(t)$ exist ($i = 1, 2$) with $x_1 > \Gamma$, or $x_1(t)$ oscillates above $\Gamma$ infinitely often and at arbitrarily large times.

Proof. The proof of this theorem is identical to the one given in Appendix A of Grossberg (1970) with the exception that the inequality $I > \alpha \Gamma$ is replaced by inequality (25). ∎

⁹ If $I$ is smaller than this level, it is considered as noise and filtered by the system.
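The linear argument behind eqns (21) and (22) can be checked numerically. The following is a minimal Python sketch, not part of the original model: it assumes unit time steps, a delay of one step, zero initial conditions, and a hypothetical step input; the function names are illustrative only.

```python
def nonrecurrent(inputs, delay=1):
    """Feedforward circuit: x(t) = I(t) - I(t - delay), cf. eqn (21) up to scaling."""
    return [inputs[t] - (inputs[t - delay] if t >= delay else 0.0)
            for t in range(len(inputs))]


def recurrent(inputs, delay=1):
    """Feedback circuit of eqn (22): x(t) = I(t) - x(t - delay)."""
    x = []
    for t, value in enumerate(inputs):
        feedback = x[t - delay] if t >= delay else 0.0
        x.append(value - feedback)
    return x


# Hypothetical sustained input: off for three steps, then on.
step = [0.0] * 3 + [1.0] * 7

ff = nonrecurrent(step)   # responds once, at the onset, then stays at zero
fb = recurrent(step)      # keeps alternating for as long as the input is on

print(ff)
print(fb)
```

Under these assumptions the feedforward output is nonzero only at the onset of the step, whereas the feedback output alternates indefinitely, which is the oscillatory behavior that rules out the linear recurrent anatomy.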

The nonrecurrent equations are given by:

$$\frac{dx_1(t)}{dt} = -\alpha x_1(t) + (B - x_1(t))\,I(t) - (D + x_1(t))\,\beta\,[x_2(t - \sigma) - \Omega]^+, \tag{26}$$

$$\frac{dx_2(t)}{dt} = -\gamma x_2(t) + (B - x_2(t))\,\delta\,I(t), \tag{27}$$

where the external input $I$ excites both the main cell and the auxiliary collateral whose activity inhibits the main cell. Let us define:

0 - D+I' (28)

.L = ~1 (29)

and

I = lira, .d*(x,(t) -- fi). (30)

THEOREM 2 (Shunting nonrecurrent). Let $I(t)$ be a nonnegative signal function and suppose that $I = \lim_{t \to \infty} I(t)$ exists and that eqn (25) holds. If

/ >,I - fl(B - fl) ~, (~1)+ I o?

then $x_1(\infty) = \lim_{t \to \infty} x_1(t) > \Gamma$. If $I$ is bounded from above such that

l < 7 ~ J (32)

and

6BM'(@(B - 1") + ~al') / 7> ~ (33)

and

/~Ba](; + 61) ' ~: ;0,x] (34)

where A l = ] - 1, then $x_1(\infty) \le \Gamma$. Furthermore, if

-, + ~1 log~, + ff[ - 6AI(flB;,) '(60AI - fl(B - [D) (35)

then f _ ( 1 ) = f_ Ii=t ≤ 6OAI and $x_1(\infty) \le \Gamma$.

Proof. The proof is given in Appendix A.

The first theorem asserts that for a sustained input either there is a parasitic leakage activity or the output is composed of oscillations. Thus, the recurrent shunting anatomy is not suitable for the suppression of temporally sustained inputs. The second theorem studies the response of the shunting nonrecurrent anatomy to such inputs. It shows that there exist parameter values for which temporal discrimination can be achieved. However, the theorem also shows that the dynamic range of the discriminator is limited to an interval whose lower limit determines the noise level and whose upper limit is a consequence of the saturation property of cells: the inhibitory collateral saturates and becomes ineffective for higher input intensities. This problem is related to the noise-saturation dilemma and it can be shown that shunting competition can overcome this difficulty (Grossberg, 1973).

3.2. Spatio-Temporal Discrimination by Asynchronous Shunting Networks

We now modify the nonrecurrent anatomy by introducing lateral interactions to obtain spatial adaptation through shunting competition. The resulting equation becomes:

$$\frac{\partial x(s,t)}{\partial t} = -A\,x(s,t) + (B - x(s,t))\,(G_e * u_e)(s,t) - (D + x(s,t))\,(G_i * u_i)(s,t), \tag{36}$$

where $*$ is the spatial convolution operator, $s$ the spatial variable ($s \in S$), and $G_e$, $G_i$, $u_e$, and $u_i$ are, respectively, the excitatory kernel, inhibitory kernel, excitatory input, and inhibitory input. In this equation a continuous spatial variable $s$ is used instead of discrete indices $i$ for the cells.

DEFINITION 1. The kernels $G_e$ and $G_i$ are called, respectively, excitatory and inhibitory receptive fields.

DEFINITION 2. Equation (36) is synchronous (i.e., lumpable, matched) iff

$$u_e(s,t) = u_i(s,t) \quad \forall s, t.$$

If it is not synchronous then it is asynchronous.

The synchronism implies that excitatory and inhibitory signals are identical. Then the network topology determines the transformation of the input pattern into a new pattern of activities. The next definition introduces an important type of asynchronism which can arise in vivo, for example, due to different transmission delays or low-pass filtering with a small time constant.

DEFINITION 3. If $u_i(s,t) = u_e(s, t - \tau)$ then the asynchronism is of the simple delay type.

This is the type of asynchronism used in local temporal discrimination in Theorems 1 and 2. In the center-surround organization, the input is not delivered to single channels but to a network of cells with


varying weights. A delay in the inhibitory paths is added to combine the desired properties of the nonrecurrent local discriminator with those of antagonistic shunting networks. The next theorem investigates the implications of this classical delayed inhibition scheme within shunting networks.

THEOREM 3 (Synchronism). Suppose that the asynchronism is of simple delay type. Then, for a steady-state input the asynchronous shunting network is asymptotically synchronous. Moreover, for an input in translation with velocity $v$ the asynchronous equation is equivalent to a synchronous equation with a shifted inhibitory receptive field. The shift is a linear function of $v$.

Proof. The proof is given in Appendix B.
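The second statement of Theorem 3 can be illustrated numerically. For a translating input $u_e(s,t) = f(s - vt)$, the simple-delay input is $u_i(s,t) = u_e(s, t - \tau) = u_e(s + v\tau, t)$, so convolving the delayed input with the inhibitory kernel gives the same result as convolving the undelayed input with a kernel displaced by $v\tau$. The sketch below is only an illustration under stated assumptions, not the proof of Appendix B: it assumes a one-dimensional integer grid, an unnormalised Gaussian inhibitory kernel, and a hypothetical bar stimulus; all function names and parameter values are illustrative.

```python
import math


def gaussian(x, sigma=3.0):
    """Unnormalised Gaussian receptive-field profile (illustrative choice)."""
    return math.exp(-0.5 * (x / sigma) ** 2)


def bar(s, t, v, width=4.0):
    """Hypothetical input u_e(s, t): a bar of the given width centred at v * t."""
    return 1.0 if width / 2.0 >= abs(s - v * t) else 0.0


def delayed_inhibition(s, t, v, tau, grid):
    """(G_i * u_i)(s, t) with the simple-delay input u_i(s, t) = u_e(s, t - tau)."""
    return sum(gaussian(s - r) * bar(r, t - tau, v) for r in grid)


def shifted_inhibition(s, t, v, tau, grid):
    """Same quantity from the undelayed input and a kernel displaced by v * tau."""
    return sum(gaussian(s - r + v * tau) * bar(r, t, v) for r in grid)


grid = range(-60, 61)        # integer positions, so that v * tau lands on the grid
v, tau, t = 2.0, 3.0, 5.0    # hypothetical velocity, delay, and observation time

pairs = [(delayed_inhibition(s, t, v, tau, grid),
          shifted_inhibition(s, t, v, tau, grid)) for s in range(-10, 21)]
max_gap = max(abs(a - b) for a, b in pairs)
print(max_gap)
```

The displacement used in `shifted_inhibition` is `v * tau`, which is linear in the velocity, in agreement with the last sentence of the theorem.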

In order to illustrate the meaning of this theorem, simulations of the synchronous and asynchronous networks are shown in Figures 9 and 10. The input for the simulations is illustrated in Figure 8. It represents a white bar in front of a black background. At time zero the white bar is turned on and kept steady for a given time period. Then it begins to move sideways and finally comes to a stop. This input pattern is chosen to illustrate the response of the networks to both static and moving stimuli. The kernels $G_e$ and $G_i$ are chosen as Gaussian functions of distance to yield Gaussian-shaped receptive fields.

FIGURE 8. The input pattern used in the simulations of the networks. The vertical axis corresponds to the intensity and the axes $s$ and $t$ are, respectively, the space and time axes. A white bar in front of a black background is kept steady first; it then starts to move towards the right and comes to a stop.

As shown in Figure 9, the synchronous network detects nonuniform regions of the input pattern. To see this, put $u_e(s,t) = u_i(s,t) = u(s,t)$ in eqn (36) in order to obtain the synchronous equation (cf. Definition 2), which can be written in the following form:

$$\frac{\partial x(s,t)}{\partial t} = -\mathcal{A}(s,t)\,x(s,t) + \mathcal{B}(s,t). \tag{37}$$

FIGURE 9. Response of the synchronous network to the input pattern of Figure 8. An intermediate snapshot is given in (a). The vertical axis corresponds to the activity of the cells and the other two axes are space and time. The network performs a satisfactory edge detection for static phases of the input. Its performance is severely hampered when the input is in motion.

The next step is to express the input as a function of pattern weights $\theta(s,t)$ and the background activity $U(t)$ by using the following definition:

$$u(s,t) = \theta(s,t)\,U(t) \tag{38}$$

with

$$U(t) = \int_S u(s,t)\,ds \tag{39}$$

and

$$\int_S \theta(s,t)\,ds = 1. \tag{40}$$

Then we have:

$$\mathcal{A}(s,t) = A + U(t)\,(\theta * \mathrm{SOG}) \tag{41}$$

$$\mathcal{B}(s,t) = V\,U(t)\,(\theta * \mathrm{DOG}) \tag{42}$$

where $\mathrm{DOG} = G_e(s) - G_i(s)$ (difference of Gaussians), $\mathrm{SOG} = G_e(s) + G_i(s)$ (sum of Gaussians), $*$ is the spatial convolution operator, and we assumed $B = D = V$ for simplicity. The differential equation (37) is a first-order linear differential equation with variable coefficients. Solving this equation for a step input (i.e., $u(s,t) = 0$ for $t < 0$ and $u(s,t) = u(s)$ for $t \ge 0$) and assuming zero initial conditions, one obtains:

$$x(s,t) = V\,\frac{U\,(\theta * \mathrm{DOG})}{A + U\,(\theta * \mathrm{SOG})} \times \left(1 - e^{-(A + U(\theta * \mathrm{SOG}))\,t}\right). \tag{43}$$

The steady-state value of $x(s,t)$ has the properties of infinite dynamic range since

$$\lim_{U \to \infty} x(s,\infty) = V\,\frac{\theta * \mathrm{DOG}}{\theta * \mathrm{SOG}} \tag{44}$$

and Weber law modulation due to the form:

$$x(s,\infty) = V\,\frac{U\,(\theta * \mathrm{DOG})}{A + U\,(\theta * \mathrm{SOG})} \tag{45}$$

(rev.: Grossberg, 1983). Many visual models making use of convolution and/or DOGs (e.g., Adelson & Bergen, 1985; Fleet et al., 1985; Marr, 1982; Richter & Ullman, 1982; Watson & Ahumada, 1985) are a special case of this expression. When expressed dynamically they can be classified as additive models and cannot explain the above properties, which are related to the brightness constancy and brightness contrast properties of visual perception. Shunting competition changes the constant coefficient differential equation into a variable coefficient equation. This enables the network to tune its sensitivity by automatic gain control. Since we are studying here the response of the network to dynamic inputs, transients are also of prime importance. Equation (43) shows that the gain term $\mathcal{A}$ determines the rate at which the steady state is reached. Thus, in contrast to additive models, shunting models have variable reaction time, that is, transients are a function of the input (Öğmen & Gagné, 1987c, 1988b). Basic properties of the additive networks and comparisons with alternative models proposed in the literature are given in Appendix C.

FIGURE 10. Response of the asynchronous network to the input pattern of Figure 8. An intermediate snapshot is given in (a). The vertical axis corresponds to the activity of the cells and the other two axes are space and time. The ASN shows a strong reaction to the onset of the stimulus. It then exponentially decays toward the response of the synchronous network (asymptotic synchronicity). For stimuli in motion, the ASN detects moving edges with a strong response.

As shown in Figure 9, the network performs an edge detection through parallel convolution. The steady state is reached with an exponential rate depending on the input intensity. Figure 9a shows an intermediate snapshot during motion, where the ambiguity of the detected "actual" statistics (edges) with respect to decaying activity, reminiscent of the static input, is shown. This arises from the fact that the reaction time of each cell will vary according to the input intensity on its receptive field. As an example, a previously excited cell will decay with rate $A$ if the input is moved outside its receptive field. However, the activity of a newly excited cell will increase with a rate $A + U(\theta * \mathrm{SOG})$.

Figure 10 shows the response of the asynchronous network and reveals a trade-off between position and motion measurements. Moving edges are enhanced since a shifted inhibitory field implies less inhibition at the peak position, that is, detection of moving edges with higher activity than static edges (compare the amplitude of detected static and moving edges in Figure 10). This activity increase for moving edges is obtained at the expense of a loss in positional precision, since wider receptive fields result from the shift of the inhibitory field (compare the width of static edges with moving edges in Figure 10). Moreover, the asynchronism induces a strong sensitivity to the onset of the stimulus as shown in Figure 10. The extension of local temporal discrimination to


ASN is motivated by the noise-saturation dilemma. The solution makes use of shunting inhibition and implies as a result the suppression of homogeneous regions. Although this side effect may seem undesirable at first, consideration of matching problems shows that by solving the local noise-saturation dilemma, the visual system also resolves the matching ambiguity¹⁰ (Öğmen & Gagné, 1987b). Although the synchronous network can explain fading of homogeneous regions and related data (Grossberg, 1987b, c), it does not explain fading of stabilized images as shown in the experiments of Ditchburn and Ginsborg (1952) and Riggs, Ratliff, Cornsweet, and Cornsweet (1953). Theorem 3 shows that the ASN has the spatial homogeneity suppression properties of the synchronous network. It also shows that for moving stimuli the receptive field is functionally modified. This modification results in an enhanced detection of moving edges when compared to static edges, as shown in Figure 10. Thus, if the output of this filtering stage has a threshold-like nonlinearity (e.g., $[x(s,t)]^+$), then, depending on the threshold, either static edges (and in general static inputs) will be suppressed and dynamic edges will be signaled, or both static and enhanced dynamic features will be signaled.

An examination of Figure 10 reveals another property of the ASN: leading and trailing edges are detected antagonistically. This sensitivity to precedence can be eliminated by combining the output of an on-center off-surround network with the output of an off-center on-surround network, as illustrated in Figure 11.

In summary, in this part of the paper we studied asynchronous interactions between excitatory and inhibitory signals as a mechanism to achieve temporal adaptation. Spatial interactions are introduced through shunting competition. The resulting networks are adaptive and have nonlinear intensity processing properties.

4. CONCLUDING REMARKS AND EXTENSIONS

FIGURE 11. Precedence insensitive oriented asynchronous shunting contrast filter. As shown in Figure 10, the ASN detects leading and trailing edges of the moving stimuli antagonistically. This asymmetry is the result of an asymmetric shift in the effective receptive field profile. To synthesize a network which is insensitive to this precedence, the outputs of an on-center off-surround network and an off-center on-surround network are combined after a thresholding nonlinearity. At the same spatial locations one on-center off-surround and one off-center on-surround network coexist. Their outputs are topographically mapped at a third level layer after a thresholding operation.

Recently, Schmid & Bülthoff (1988) combined neuropharmacology and electrophysiology to study the role of inhibition in motion detection in the insect visual system. They found that the direction selectivity of the H1 cell can be abolished by blocking the inhibitory neurotransmitter GABA. These observations seem to favor an inhibitory detection scheme as in the Torre-Poggio (1981) model (which is related to the vetoing model of Barlow and Levick (1965)) rather than a facilitatory additive mechanism as suggested by the Riehle-Franceschini data. We suggest that the apparent contradiction between these two experimental findings arises from an oversimplification of the motion detection problem by reducing it merely to a single synaptic interaction. In our model, both excitatory and inhibitory interactions play an important role, and extensions of our work include quantitative studies of both the Riehle-Franceschini and Schmid-Bülthoff data. Extensions of the work also include testing of the EMD model for human perceptual data such as contrast modulation in Glass patterns (Pradzny, 1985).

Neural models presented herein are developed as specialized architectures of a unified global model of visual perception. These models are not pure motion sensors and carry information about other aspects of the image such as texture and contrast. For the model of Part I, lobula cells interact to further process this information. This stage of interactions is under active investigation, and it is interesting to note that an opponent interaction strategy between directionally selective cells of the lobula using the gated dipole offers a possible explanation for motion aftereffects observed in this visual system (Srinivasan & Dvorak, 1980).

The boundary contour subsystem (BCS) of a global model proposed by Grossberg and co-workers is depicted in Figure 12. The asynchronous shunting network is proposed to substitute for the synchronous filters of the oriented contrast (OC) system. The ASN corresponds to the simple cells, and the precedence insensitive cells of Figure 11 to the complex cells, of the OC filter. By such a modification, predictions of the BCS concerning long-range boundary completion effects can be extended to illusions involving motion.

FIGURE 12. The boundary contour system of the Grossberg-Mingolla model: The oriented contrast filter (OC) sends its output to the cooperative-competitive loop (CC) (from Grossberg, 1987b).

Consider for example the grating shown in Figure 13. When the vertical stripes are stationary, the central region is perceived correctly as a uniform surface. Tynan and Sekuler discovered that a horizontal motion of these stripes induces perception of boundary completion and filling-in across the opaque region (Tynan & Sekuler, 1975). The proposed modification of the BCS offers a possible explanation for this phenomenon: when activities of detected static boundaries are not strong enough to activate the long-range completion process, higher activities of detected moving contours can drive the long-range cooperation loop to a new equilibrium where completion of boundaries across the opaque region occurs. This result is also compatible with the finding that flicker can induce the same illusion (Genter & Weisstein, 1981), since the ASN is strongly sensitive to flicker. Extensions of this work include extensive computer simulations of the modified BCS for perceptual phenomena related to moving and flickering phantoms and its relationship to target visibility (Brown & Weisstein, 1988), depth perception, and motion aftereffects (Smith & Over, 1979; Weisstein, Maguire, & Williams, 1982).

FIGURE 13. Vertical gratings used in the moving visual phantoms illusions. When the input is static, the central region is perceived correctly as homogeneous. If the gratings move sideways, a boundary completion and filling-in is perceived across the opaque region.

¹⁰ Spatial (temporal) matching of images requires the detection of spatially (temporally) nonuniform regions, since uniform regions do not carry relevant information and can bias the matching procedure.

REFERENCES

Adelson, E. H., & Bergen, J. R. (1985). Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America A, 2, 284-299.
Arnett, D. W. (1972). Spatial and temporal integration properties of units in first optic ganglion of dipterans. Journal of Neurophysiology, 35, 429-444.
Barlow, H. B., & Levick, W. R. (1965). The mechanism of directionally selective units in rabbit's retina. Journal of Physiology, 178, 477-504.
Borst, A., & Egelhaaf, M. (1987). Temporal modulation of luminance adapts time constant of fly movement detectors. Biological Cybernetics, 56, 209-215.
Brown, J. M., & Weisstein, N. (1988). A phantom context effect: Visual phantoms enhance target visibility. Perception & Psychophysics, 43, 53-56.
Buchner, E. (1984). Behavioural analysis of spatial vision in insects. In M. A. Ali (Ed.), Photoreception and vision in invertebrates (pp. 561-621). New York: Plenum Press.
Cornsweet, T. N. (1970). Visual perception. New York: Academic Press.
Ditchburn, R. W., & Ginsborg, B. L. (1952). Vision with a stabilized retinal image. Nature, 170, 36-37.
van Doorn, A. J., & Koenderink, J. J. (1983). The structure of the human motion detection system. IEEE Transactions on Systems, Man, and Cybernetics, 13, 916-922.
Egelhaaf, M. (1985a). On the neuronal basis of figure-ground discrimination by relative motion in the visual system of the fly I. Behavioural constraints imposed on the neuronal network and the role of the optomotor system. Biological Cybernetics, 52, 123-140.
Egelhaaf, M. (1985b). On the neuronal basis of figure-ground discrimination by relative motion in the visual system of the fly II. Figure-detection cells, a new class of visual interneurones. Biological Cybernetics, 52, 195-209.
Egelhaaf, M. (1985c). On the neuronal basis of figure-ground discrimination by relative motion in the visual system of the fly III. Possible input circuitries and behavioural significance of the FD-cells. Biological Cybernetics, 52, 267-280.

Egelhaaf, M., & Reichardt, W. (1987). Dynamic response properties of movement detectors: Theoretical analysis and electrophysiological investigation in the visual system of the fly. Biological Cybernetics, 56, 69-87.
Egelhaaf, M., Hausen, K., Reichardt, W., & Wehrhahn, C. (1988). Visual course control in flies relies on neuronal computation of object and background motion. Trends in Neurosciences, 11, 351-358.
Fleet, D. J., Hallet, P. E., & Jepson, A. D. (1985). Spatiotemporal inseparability in early visual processing. Biological Cybernetics, 52, 153-164.
Franceschini, N. (1985). Early processing of colour and motion in a mosaic visual system. Neuroscience Research (Suppl. 2), S17-S49.
Genter II, C. R., & Weisstein, N. (1981). Flickering phantoms: A motion illusion without motion. Vision Research, 21, 963-
Gille, J.-C. (1977). Introduction aux systèmes asservis non linéaires. Paris: Dunod.
Gregory, R. L. (1978). Eye and brain, the psychology of seeing (3rd ed.). New York: McGraw-Hill.
Grossberg, S. (1970). Neural pattern discrimination. Journal of Theoretical Biology, 27, 291-337.
Grossberg, S. (1972a). A neural theory of punishment and avoidance. I. Qualitative theory. Mathematical Biosciences, 15, 39-67.
Grossberg, S. (1972b). A neural theory of punishment and avoidance. II. Quantitative theory. Mathematical Biosciences, 15, 253-285.
Grossberg, S. (1973). Contour enhancement, short-term memory, and constancies in reverberating neural networks. Studies in Applied Mathematics, 52, 217-257.
Grossberg, S. (1982). Studies of mind and brain: Neural principles of learning, perception, development, cognition, and motor control. Boston: Reidel Press.
Grossberg, S. (1983). The quantized geometry of visual space: The coherent computation of depth, form, and lightness. The Behavioral and Brain Sciences, 6, 625-657.
Grossberg, S. (1987a). The adaptive brain, Vols. I and II. Amsterdam: North Holland.
Grossberg, S. (1987b). Cortical dynamics of three-dimensional form, color, and brightness perception, I: Monocular theory. Perception & Psychophysics, 41, 87-116.
Grossberg, S. (1987c). Cortical dynamics of three-dimensional form, color, and brightness perception, II: Binocular theory. Perception & Psychophysics, 41, 117-158.
Grossberg, S. (1988a). Neural networks and natural intelligence. Cambridge, MA: MIT Press.
Grossberg, S. (1988b). Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks, 1, 17-61.
Guo, A. (1988). Figure-ground and pattern discrimination in the visual system of the fly. In D. T. Yew, K. F. So, & D. S. C. Tsang (Eds.), Vision: Structure and function (pp. 489-549). Singapore: World Scientific.
Hausen, K. (1984). The lobula-complex of the fly: Structure, function, and significance in visual behaviour. In M. A. Ali (Ed.), Photoreception and vision in invertebrates (pp. 523-559). New York: Plenum Press.
Laughlin, S. (1984). The roles of parallel channels in early visual processing by the arthropod compound eye. In M. A. Ali (Ed.), Photoreception and vision in invertebrates (pp. 457-481). New York: Plenum Press.
Lenting, B. P. M., Mastebroek, H. A. K., & Zaagman, W. H. (1984). Saturation in a wide-field directionally selective movement detection system in fly. Vision Research, 24, 1341-1347.
Marmarelis, P. Z., & McCann, G. D. (1973). Development and application of white-noise modeling techniques for studies of insect visual nervous system. Kybernetik, 12, 74-89.
Marr, D. (1982). Vision. San Francisco: Freeman.

503 Mastebroek, H. A. K., Zaagman, W. H., & Lenting, B. P. M. (19801. Movement detection: Performance of a wide-field element in the visual system of the blowfly. VMon Research, 20. 467-474. Mastebroek, H. A. K., Zaagman, W. H., & Lenting. B. P. M. 119821. Memory-like effects in fly vision: Spatio-temporal interactions in a wide-field neuron. Biological Qvbernetie,~, 43, 147-155. McCann, G. D. (19731, Thc fundamental mechanism of motion detection in the insect visual system. Kvbernetik, 12, 64-73, Mimura, K. (1974). Analysis of visual information in lamina neurons of the fly. Journal 0[ Comparative l'hysiology A, 88,335372. Mimura, K. ( 19761. Some spatial properties in the first optic ganglion of the fly. Journal ~[' Comparative t'hysiology A, 62.871114. Nakayama, K. 119851. Biological image motion processing: A review. Vision Research, 25,625-660. Ogmen. H., & Gagn6, S, (1987a). Early spatio-temporal integration by gated competitive nctworks. Proceedings of IEEE First International ConlOrence on Neural Network.s, San Diego (pp. 805-811 ). O~men, H., & Gagnc, S. (1987b). Dynamic pattern matching with filling-in. Proceedings Of lEEE International CTonj?rence on Systems Matt and Cybernetics (pp. 733-738). O~men. H., & Gagn6, S. (1987c). Low level encoding of motion by asynchronous shunting networks. Proceedings O/Miconex "87, Ogmen, H., & Gagne, S. (1988a). Short-range motion dctcction in the insect visual system. Neural Networks, 1 (Suppl. 11,519. O~men, H., & Gagn6, S. (1988b). Reaction time and gain control in neural dynamics. Proceedings ol the ("anadian Medical and Biological Engineering Socie(v ('onl?re~ce (pp. 53-54/. O~mcn, H., & GagnC'. S. (1990), Neural models for sustaincd and on olf units of insect lamina. Biolo~,,ical ('vber~tetics'. Poggio, T., & Rcichardt, W. ( 19731. Considerations on models of movcment detection, Kybernetik, 13, 223-227. Pradzny, K. (1985). Studies of some new phenomenon of motion perception. Biological Cvberneties, 52, 187-194. 
Reichardt, W. (1961). Autocorrelation, a principle for evaluation of sensory information by the central nervous system. In W. A. Rosenblith (Ed.), Principles of sensory communications (pp. 303-317). New York: Wiley.
Reichardt, W. (1979). Functional characterization of neural interactions through an analysis of behavior. In F. O. Schmitt, & F. W. Worden (Eds.), The neurosciences: Fourth study program (pp. 81-103). Cambridge, MA: MIT Press.
Reichardt, W. (1986). Processing of optical information by the visual system of the fly. Vision Research, 26, 113-126.
Reichardt, W., & Guo, A. (1986). Elementary pattern discrimination (behavioural experiments with the fly Musca domestica). Biological Cybernetics, 53, 285-306.
Reichardt, W., & Poggio, T. (1979). Figure-ground discrimination by relative movement in the visual system of the fly, part I: Experimental results. Biological Cybernetics, 35, 81-100.
Reichardt, W., Poggio, T., & Hausen, K. (1983). Figure-ground discrimination by relative movement in the visual system of the fly, part II: Towards the neural circuitry. Biological Cybernetics, 46 (Suppl.), 1-30.
Reichardt, W., & Egelhaaf, M. (1988). Properties of individual movement detectors as derived from behavioural experiments on the visual system of the fly. Biological Cybernetics, 58, 287-294.
Richter, J., & Ullman, S. (1982). A model for the temporal organization of X- and Y-type receptive fields in the primate retina. Biological Cybernetics, 43, 127-145.
Riehle, A., & Franceschini, N. (1984). Motion detection in flies: Parametric control over on-off pathways. Experimental Brain Research, 54, 390-394.


Riggs, L. A., Ratliff, F., Cornsweet, J. C., & Cornsweet, T. N. (1953). The disappearance of steadily fixated visual test objects. Journal of the Optical Society of America A, 43, 495-501.
de Ruyter van Steveninck, R. R., Zaagman, W. H., & Mastebroek, H. A. K. (1986). Adaptation of transient responses of a movement-sensitive neuron in the visual system of the blowfly Calliphora erythrocephala. Biological Cybernetics, 53, 451-463.
van Santen, J. P. H., & Sperling, G. (1984). Temporal covariance model of human motion perception. Journal of the Optical Society of America A, 1, 451-473.
van Santen, J. P. H., & Sperling, G. (1985). Elaborated Reichardt detectors. Journal of the Optical Society of America A, 2, 300-321.
Schmid, A., & Bülthoff, H. (1988). Using neuropharmacology to distinguish between excitatory and inhibitory movement detection mechanisms in the fly Calliphora erythrocephala. Biological Cybernetics, 59, 71-80.
Smith, A. T., & Over, R. (1979). Motion aftereffect with subjective contours. Perception & Psychophysics, 25, 95-98.
Srinivasan, M. V., & Bernard, G. D. (1976). A proposed mechanism for multiplication of neural signals. Biological Cybernetics, 21, 227-236.
Srinivasan, M. V., & Dvorak, D. R. (1979). The waterfall illusion in an insect visual system. Vision Research, 19, 1435-1437.
Srinivasan, M. V., & Dvorak, D. R. (1980). Spatial processing of visual information in the movement-detecting pathway of the fly. Journal of Comparative Physiology A, 140, 1-23.
Torre, V., & Poggio, T. (1981). An application: A synaptic mechanism possibly underlying motion detection. In W. Reichardt, & T. Poggio (Eds.), Theoretical approaches in neurobiology (pp. 39-46). Cambridge, MA: MIT Press.
Tynan, P., & Sekuler, R. (1975). Moving visual phantoms: A new contour completion effect. Science, 188, 951-952.
Ullman, S. (1981). Analysis of visual motion by biological and computer systems. IEEE Computer, 14, 57-69.
Watson, A. B., & Ahumada, A. J. (1985). Model of human visual-motion sensing. Journal of the Optical Society of America A, 2, 322-342.
Weisstein, N., Maguire, W., & Williams, M. C. (1982). The effect of perceived depth on phantoms and the phantom aftereffect. In J. Beck (Ed.), Organization and representation in perception. Hillsdale, NJ: Erlbaum.
Wilson, H. R. (1985). A model for direction selectivity in threshold motion perception. Biological Cybernetics, 51, 213-222.
Zaagman, W. H. (1977). Some characteristics of the neural activity of directionally selective movement detectors in the visual system of the blowfly. Ph.D. dissertation, Rijksuniversiteit te Groningen.
Zaagman, W. H., Mastebroek, H. A. K., & Kuiper, J. W. (1978). On the correlation model: Performance of a movement detecting neural element in the fly visual system. Biological Cybernetics, 31, 163-168.

APPENDIX A

Proof of Theorem 2. Since $\lim_{t \to \infty} I(t) = I$ exists, by integrating eqn (27) one obtains:

$$\ldots \tag{46}$$

Now integrate eqn (26):

$$\ldots \tag{47}$$

First suppose that $[f]^+ = 0$. Then it follows from (25) that $\ldots > \Gamma$. Now consider the case when $[f]^+ \neq 0$, that is $[f]^+ = \beta(\ldots) > 0$. The proof will be done by deriving a contradiction. Suppose that $\ldots$; then

$$\ldots \tag{48}$$

that is:

$$\ldots \tag{49}$$

which is in contradiction with (31). The second part of the theorem follows from the fact that $f$ is a concave function of $I$ in $\ldots$ To prove the last part, first note that (33) implies:

$$\ldots \tag{50}$$

Since $\ldots$ is a strictly increasing function of $I$, inequalities (25) and (50) imply that the inhibition is activated, that is $[f]^+ \neq 0$. Making use of the following function:

$$F(I) = \ldots \tag{51}$$

one obtains:

$$F(I_1) - F(I_2) = \ldots \tag{52}$$

then by (35)

$$\ldots \tag{53}$$

Now since $dF(I)/dI = f(I)$, the mean value theorem can be applied to prove that there exists an $I$ in $\ldots$ such that:

$$\ldots \tag{54}$$

Since $f$ is a strictly increasing function of $I$:

$$\ldots \tag{55}$$

$$\ldots \tag{56}$$

By (55) we prove that $\ldots$ Showing $\ldots$ is equivalent to showing:

$$\ldots \tag{57}$$

Since $\ldots$

$$\ldots \tag{58}$$

$$\ldots \tag{59}$$

which implies that

$$\ldots \tag{60}$$

since $f$ is a concave function of $I$ in this interval. This is in contradiction with the fact that (57) holds for $\ldots$ This proves the theorem.

APPENDIX B

Proof of Theorem 3. The asymptotic synchronism follows from the fact that $u_e(\infty) = u_i(\infty)$. More generally, since the input is in steady state, the previous relation holds after $\tau$. To show the shift property we will use the following lemma:

LEMMA 1. If the input is in translation with a velocity $v$ then:

$$s_2 = v(t_2 - t_1) + s_1 \;\Rightarrow\; u(s_1, t_1) = u(s_2, t_2) \quad \forall\, s_1, s_2, t_1, t_2. \tag{61}$$

Proof. This result directly follows from the equation of rigid motion under uniform illumination.

Apply Lemma 1 with the following substitutions: $s_1 = r$, $s_2 = v\tau + r$, $t_1 = t - \tau$, $t_2 = t$.

This yields:

$$u(r, t - \tau) = u(v\tau + r, t).$$

Hence,

$$u(s, t - \tau) * G_i(s) = \int u(v\tau + r, t)\, G_i(s - r)\, dr.$$

Then by substituting $l = v\tau + r$:

$$u(s, t - \tau) * G_i(s) = \int u(l, t)\, G_i(s + v\tau - l)\, dl,$$

thus

$$u(s, t - \tau) * G_i(s) = u(s, t) * G_i(s + v\tau).$$

Then it follows from Definition 2 that the resulting equation is synchronous and the inhibitory receptive field is shifted by $v\tau$, which is a linear function of $v$.

APPENDIX C

The additive equation is obtained by dropping the shunting factors in eqn (36), that is:

$$\frac{\partial x(s, t)}{\partial t} = -A x(s, t) + u_e(s, t) * G_e(s) - u_i(s, t) * G_i(s). \tag{62}$$

We will first derive some mathematical properties of synchronous and asynchronous additive networks and then compare the results with models proposed in the literature.

(i) Synchronous additive networks (SAN). By letting $u_e(s, t) = u_i(s, t) = u(s, t)$ in eqn (62), one obtains the synchronous additive network equation:

$$\frac{\partial x(s, t)}{\partial t} = -A x(s, t) + DOG(s) * u(s, t). \tag{63}$$

When the kernels are chosen as Gaussian functions, DOG represents the difference of Gaussians. The impulse response of this network is given by:

$$h_{SAN}(s, t) = e^{-At}\, DOG(s). \tag{64}$$

The impulse response is separable, that is $h(s, t)$ can be written as $h(s, t) = h_s(s)\, h_t(t)$. Moreover, it can be readily shown that the system is linear, time-invariant, and spatially shift-invariant. Hence, results from linear, time (and space) invariant system theory can be applied. In particular, the output of this network for an arbitrary input $u(s, t)$ is given by the convolution integrals:

$$x(s, t) = h(s, t) * *_t\, u(s, t), \tag{65}$$

where $*_t$ denotes convolution in time. And with separability:

$$x(s, t) = h_t(t) *_t \bigl(h_s(s) * u(s, t)\bigr) = e^{-At} *_t \bigl(DOG(s) * u(s, t)\bigr). \tag{66}$$

Note that these convolution integrals result in multiplications in the Fourier domain. This illustrates the popularity of LTI models with their Fourier domain analysis. For this reason, these models are sometimes called Fourier models. If local temporal discrimination is introduced through the use of a different number of interneurons obeying additive equations with the same time constant, the SAN with temporal discrimination becomes:

$$\begin{aligned}
\frac{\partial x_0(s, t)}{\partial t} &= -A x_0(s, t) + u(s, t), \\
\frac{\partial x_1(s, t)}{\partial t} &= -A x_1(s, t) + x_0(s, t), \\
&\;\;\vdots \\
\frac{\partial x_m(s, t)}{\partial t} &= -A x_m(s, t) + x_{m-1}(s, t), \\
&\;\;\vdots \\
\frac{\partial x_n(s, t)}{\partial t} &= -A x_n(s, t) + x_{n-1}(s, t),
\end{aligned} \tag{67}$$

and

$$\frac{\partial x(s, t)}{\partial t} = -A x(s, t) + DOG * \bigl(x_n(s, t) - x_m(s, t)\bigr). \tag{68}$$

The impulse response is given by:

$$h_{SAN}(s, t) = \left( \frac{1}{(n+1)!}\, t^{n+1} - \frac{1}{(m+1)!}\, t^{m+1} \right) e^{-At}\, DOG(s). \tag{69}$$

(ii) Asynchronous additive networks (AAN). There are different neural mechanisms that can yield asynchronous inputs to the network starting from the image $u(s, t)$. We will consider the following simple case:

a. Excitatory and inhibitory inputs to (62) are delivered through neurons obeying additive equations with different time constants.
b. The inhibitory input is delayed with respect to the excitatory input.

Thus:

$$\frac{\partial u_e(s, t)}{\partial t} = -A_e u_e(s, t) + u(s, t), \tag{70}$$

$$\frac{\partial u_i(s, t)}{\partial t} = -A_i u_i(s, t) + u(s, t - \tau). \tag{71}$$

This particular AAN is defined by eqns (70), (71), and (62) and its impulse response is given by

$$h_{AAN}(s, t) = K_e \left( e^{-A_e t} - e^{-At} \right) G_e(s) - K_i \left( e^{-A_i (t - \tau)} - e^{-A (t - \tau)} \right) G_i(s), \tag{72}$$

where $K_e = (A - A_e)^{-1}$ and $K_i = (A - A_i)^{-1}$. Thus $h(s, t)$ is inseparable. Moreover, these networks are linear, time-invariant, and spatially shift-invariant.

(iii) Comparisons. Considering the spatial aspect of the SAN (since it is separable), one can see that popular DOG models (e.g., Marr, 1982) are equivalent to SAN. Moreover, SAN with local temporal discrimination and Gabor function kernels is formally equivalent to spatio-temporal filters used in Adelson & Bergen (1985) and Watson & Ahumada (1985) models. The basic difference between AAN and SAN is the spatiotemporal inseparability introduced through asynchronous interactions. This inseparability and related experimental data is discussed in Fleet et al. (1985). These authors studied a class of inseparable filters called center-surround models (CS) (Fleet et al., 1985; Richter & Ullman, 1982). With slow interneurons (i.e., $A_e \ll A$ and $A_i \ll A$) CS models become equivalent to AAN. However, none of these models have nonlinear and adaptive properties exhibited by ASN.
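The separability contrast between SAN and AAN can be checked numerically with a minimal sketch. It samples the impulse responses of eqns (64) and (72) on a space-time grid and counts the significant singular values of each sampled matrix: a separable response is an outer product and has numerical rank 1, while an inseparable one has a higher rank. The grid, the Gaussian widths, and the values of A, A_e, A_i, and tau are illustrative assumptions rather than values from the paper. The helper names are hypothetical, and the delayed term of eqn (72) is assumed to be zero before the delay.

```python
import numpy as np

def gaussian(s, sigma):
    """Unit-area Gaussian kernel sampled on the grid s."""
    return np.exp(-s**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

def temporal_cascade(t, a_in, a_out, delay=0.0):
    """Impulse response of two first-order additive stages in series
    (rates a_in, then a_out), shifted by delay and zero before it."""
    td = np.clip(t - delay, 0.0, None)
    k = 1.0 / (a_out - a_in)
    return np.where(t >= delay, k * (np.exp(-a_in * td) - np.exp(-a_out * td)), 0.0)

# Illustrative values only (assumptions, not taken from the paper).
A, A_e, A_i, tau = 10.0, 2.0, 1.0, 0.05
sigma_e, sigma_i = 1.0, 3.0

s = np.linspace(-10.0, 10.0, 201)   # space samples
t = np.linspace(0.0, 2.0, 400)      # time samples
G_e, G_i = gaussian(s, sigma_e), gaussian(s, sigma_i)

# SAN, eqn (64): h(s, t) = exp(-A t) DOG(s), a single outer product.
h_san = np.outer(G_e - G_i, np.exp(-A * t))

# AAN, eqn (72): each spatial kernel carries its own temporal profile.
h_aan = (np.outer(G_e, temporal_cascade(t, A_e, A))
         - np.outer(G_i, temporal_cascade(t, A_i, A, delay=tau)))

def numerical_rank(h, tol=1e-9):
    """Number of singular values above tol relative to the largest one."""
    sv = np.linalg.svd(h, compute_uv=False)
    return int(np.sum(sv > tol * sv[0]))

rank_san = numerical_rank(h_san)
rank_aan = numerical_rank(h_aan)
print("SAN numerical rank:", rank_san)
print("AAN numerical rank:", rank_aan)
```

With these assumed parameters the SAN matrix has rank 1 and the AAN matrix has rank 2, since the AAN response is a sum of two outer products whose spatial kernels and temporal profiles both differ. If the two temporal profiles were made identical (equal rates and no delay), the AAN rank would collapse to 1.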