Neural Networks, Vol. 3, pp. 245-263, 1990. Printed in the USA. All rights reserved.
0893-6080/90 $3.00 + .00 Copyright © 1990 Pergamon Press plc
ORIGINAL CONTRIBUTION
Neural Mapping and Space-Variant Image Processing
HANSPETER A. MALLOT, WERNER VON SEELEN AND FOTIOS GIANNAKOPOULOS
Institut für Neuroinformatik, Ruhr-Universität, Bochum
(Received 12 December 1988; revised and accepted 15 September 1989)
Abstract: The mapping of sensory surfaces onto cortical areas as well as maps in inter-areal projections are classified and discussed in terms of space-variant image processing. The neurobiological phenomena discussed in this context include retinotopic maps, columnar organization, interactions between a retinotopic map and a uniform operation in a cortical area, and patchy or segregated projections of two or more areas to a common target area. We present a mathematical framework for describing the interaction of neural mappings and local image processing operations which allows functional interpretations. In an example from visual navigation, we show that neural maps are powerful tools for the parallel processing of visual information. Since mappings of the various types result in a spatial encoding of arbitrary stimulus information, image processing operations can be applied to information from other modalities or for higher-level problems as well. For this aspect of neural mapping, we adopt the term parametric mapping.
Keywords: Visual cortex, Space-variant image processing, Retinotopic mapping, Optical flow analysis, Uniformity of visual cortex, Columnar organization, Ocular dominance columns, Visual receptive fields.

LIST OF MATHEMATICAL SYMBOLS

$\mathbb{R}^2$, $\mathbb{R}^3$: Two- and three-dimensional real number space
$S$, $S_n$: A subset of $\mathbb{R}^2$, mnemonic for source area
$T$: A subset of $\mathbb{R}^2$, mnemonic for target area
$x$, $y$, $z$: Vectors of two to three components
$\mathcal{R}$, $\mathcal{R}_n$: Coordinate transforms $\mathbb{R}^2 \to \mathbb{R}^2$
$M_a(x)$: Areal magnification factor at $x \in S$
$M_l(x; \varphi)$: Linear magnification factor at $x \in S$ in direction $\varphi \in [0, 2\pi]$
$M_c(x)$: Cellular magnification factor at $x \in S$
$c_n(y)$: Density of input from $S_n$ at location $y \in T$
Perspective map, $\mathbb{R}^3 \to \mathbb{R}^2$
$s(x)$, $e(x)$: Functions $\mathbb{R}^2 \to \mathbb{R}$, spatial distributions of stimuli or excitation
$K(x; y)$: A function $S \times T \to \mathbb{R}$, a kernel of a linear integral operator
$k(x)$: A function $\mathbb{R}^2 \to \mathbb{R}$, usually a convolution kernel
$\tilde{k}(r)$: A function $\mathbb{R}_0^+ \to \mathbb{R}$ such that $k(x) = \tilde{k}(\|x\|)$
$k_{\mathcal{R}}(x)$: Mean mapped filter kernel included in the kernel $K$
$\mathcal{R}_K$: A function $T \to \mathbb{R}^2$. Mean topographic map included in the kernel $K$
$\varphi_i(x)$: A function $\mathbb{R}^2 \to \mathbb{R}$. Retinal sampling or window function
$s_i$: Sampling coefficient associated with the window function $\varphi_i$
$\alpha_i(y)$: A function $\mathbb{R}^2 \to \mathbb{R}$. Point spread function for the coefficient $s_i$
1. INTRODUCTION

During the past decade, receptotopic mappings have become an increasingly important topic in the study of almost all sensory modules of the brain. Simultaneously, the intrinsic architecture of the cortex has been elucidated by means of new anatomical and physiological methods. Columnar internal organization and patchy inter-areal connectivity are now considered general principles of cortical architecture. In the emerging picture, the cortical position of a neuron largely determines (a) which piece of data it receives, (b) how this data is processed, and (c) to what target the processed information is eventually transferred. These three aspects of mapping correspond to the notions of a sensory or afferent map describing the projection into a given area, an internal map describing the computations performed
This work was supported by the Deutsche Forschungsgemeinschaft, Grant No. Se 251/30-1, and by the German Federal Department of Research and Technology (BMFT), Grant No. ITR8800A2. We are grateful to Thomas Zielke and Gerd J. Giefing for valuable discussions and to K. Rehbinder who carefully prepared the drawings. Correspondence: Hanspeter A. Mallot, Institut für Neuroinformatik, Ruhr-Universität ND 04/174, D-4630 Bochum 1, Federal Republic of Germany.
H. A. Mallot, W. von Seelen, and F. Giannakopoulos
within that area, and a motor or efferent map according to which information is transferred to a target area. Sequences of this type can be combined into networks of several functionally mapped areas. In this case, the output maps of primary areas coincide with input maps of areas further down in the sequence. We shall argue that this organization is crucial for the understanding of "natural" computation; it may be considered a specification of the idea of the localization of cortical function.

The scope of this paper is to give a mathematical framework for the description, simulation, and computational interpretation of neural maps. Figure 1 schematically shows the different types of mappings discussed in this paper. The rationale underlying all global mapping studies (e.g., Daniel & Whitteridge, 1961; Tusa, Rosenquist, & Palmer, 1979; van Essen, 1985; Allman & Kaas, 1971; Kaas, Nelson, Sur, & Merzenich, 1981; Suga, Niwa, Taniguchi, & Margoliash, 1987) is the topographic point-to-point mapping, or coordinate transform, depicted in Figure 1a. Explicit mapping functions have been proposed by Schwartz (1977) for visual area V1 in the monkey and by Mallot (1985) for visual areas 17, 18, and 19 in the cat. In a topographic mapping, position is the only receptive field parameter that varies as the recording electrode proceeds through the cortex.

In general, however, a space dependence of any receptive field property or a combination thereof can be considered a mapping. Mappings of this type are many-to-many and can be treated as space-variant operators (Figure 1b, section 3). They have been described electrophysiologically (Hubel & Wiesel, 1977; Cynader, Swindale, & Matsubara, 1987; Swindale, Matsubara, & Cynader, 1987; Suga et al., 1987) and have been termed "computational" maps by Suga et al. (1987) and Knudsen, du Lac, and Esterly (1987). Kohonen's (1988a) self-organizing feature maps are of this type, too.

As an important special case of space-variant image processing, we investigate the cascade composed of a topographic mapping and a subsequent space-invariant filter ("mapped filter", cf. Figure 1c, section 4). Space-invariance or uniformity of the internal operation in the visual cortex was reported by Hubel and Wiesel (1974). Although strict uniformity has been doubted in more recent studies (Dow, Snyder, Vautin, & Bauer, 1981), we selected this case as an instructive example. Even if uniformity of the intrinsic processing is assumed, space-variant distributions of receptive field orientation (and size) result from the mapping plus filter cascade. Similar types of space-variance have been found in the cat visual cortex by Payne & Berman (1983) and Berman, Wilkes, & Payne (1987).

As a step towards a theory of the networks composed of cortical areas, the convergence of several projections onto one common target area is discussed in section 5. In this case, fibers tend to segregate in patches, stripes, etc. (Figure 1d).

One general effect of mapping is the encoding of stimulus information into the spatial location within a cortical area. One might prefer to think of representations of this type in terms of two-dimensional histograms rather than maps. If ordinary early vision operations are applied to these image-like representations, interesting image processing can result. This computational aspect of mapping was pointed out earlier by Ballard (1981, 1987) and Barlow (1986). In section 6, we discuss neurobiologically plausible mechanisms for the construction of parametric maps and their functional implications. Finally, an application of mapping to a computational problem is presented in section 7.

While the electrophysiological assessment of neural
FIGURE 1. Mapping types. a. Topographic mappings are piecewise continuous one-to-one maps of a source layer S onto a target layer T. b. Space-variant filters involve many-to-many connections. For each pair of points x ∈ S, y ∈ T, the kernel K(x; y) specifies a connection strength. psf: point spread function; rf: receptive field profile. c. Mapped filters are a special type of space-variant filters that can be realized as a cascade of a topographic mapping and a space-invariant filter. d. Patchy projection of two source layers S_1 and S_2 to a common target area.
mappings with single-unit recordings is rather tedious, there are now a number of anatomical and physiological methods that provide more appropriate data (e.g., Tootell, Switkes, Silverman, & Hamilton, 1988; Cynader et al., 1987; Blasdel & Salama, 1986). We expect that the mathematical framework presented here will prove useful for the interpretation of this type of data.

2. TOPOGRAPHIC MAPPING
In the simplest case, a neural mapping may be regarded as a rule for the projection of a fiber bundle from one neural sheet to another (Figure 1a). In mathematical terms, a mapping of this type will be treated as a function $\mathcal{R}$ relating two subsets of the plane, which will be denoted S (for source area; S also denotes the visual field) and T (for target area), respectively:

$$\mathcal{R}: S \to T, \qquad S, T \subseteq \mathbb{R}^2. \tag{1}$$
We assume throughout this paper that $\mathcal{R}$ is one-to-one and onto, i.e., invertible, and that both $\mathcal{R}$ and $\mathcal{R}^{-1}$ are piecewise continuous and differentiable. Mappings with these properties will be called topographic mappings in this paper. $\mathcal{R}$ may be considered a "rubber-sheet transformation" if it is understood that (a) there might be a few slits in the rubber sheet and (b) the elasticity of the rubber need not be isotropic. We prefer the real two-dimensional notation (1) to the complex notation used by Schwartz (1980) since it applies to nonconformal¹ mappings as well. There is good evidence that nonconformal mappings do occur in the visual cortex (Mallot, 1985).

Areal magnification corresponds to the absolute value of the determinant of the Jacobian $J_{\mathcal{R}}$ of $\mathcal{R}$:

$$M_a(x) := |\det J_{\mathcal{R}}(x)| = \left| \frac{\partial \mathcal{R}_1}{\partial x_1}\frac{\partial \mathcal{R}_2}{\partial x_2} - \frac{\partial \mathcal{R}_1}{\partial x_2}\frac{\partial \mathcal{R}_2}{\partial x_1} \right|, \qquad x = (x_1, x_2) \in S. \tag{2}$$
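Linear magnification along an orientation $\varphi$ is given by

$$M_l(x; \varphi) := \left\| J_{\mathcal{R}}(x) \begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix} \right\|, \qquad x \in S. \tag{3}$$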
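The magnification factors (2) and (3) can be evaluated numerically for any differentiable map. The following is a minimal sketch, not the authors' implementation: the helper names (`power_map`, `jacobian`, `areal_magnification`, `linear_magnification`) and the central-difference step are assumptions made for illustration. As an example map it uses the complex power function with exponent 0.43, the value quoted in the text for the cat's visual areas, written on Cartesian coordinates.

```python
import math

P = 0.43  # exponent of the complex power map (value quoted in the text)

def power_map(x1, x2, p=P):
    """Complex power map z -> z**p on Cartesian coordinates (principal branch)."""
    w = complex(x1, x2) ** p
    return w.real, w.imag

def jacobian(mapping, x1, x2, h=1e-6):
    """Central-difference Jacobian of a map R^2 -> R^2 at (x1, x2)."""
    f1p, f2p = mapping(x1 + h, x2)
    f1m, f2m = mapping(x1 - h, x2)
    g1p, g2p = mapping(x1, x2 + h)
    g1m, g2m = mapping(x1, x2 - h)
    return (((f1p - f1m) / (2 * h), (g1p - g1m) / (2 * h)),
            ((f2p - f2m) / (2 * h), (g2p - g2m) / (2 * h)))

def areal_magnification(mapping, x1, x2):
    """Eq. (2): absolute value of the Jacobian determinant."""
    (a, b), (c, d) = jacobian(mapping, x1, x2)
    return abs(a * d - b * c)

def linear_magnification(mapping, x1, x2, phi):
    """Eq. (3): length of the image of the unit vector in direction phi."""
    (a, b), (c, d) = jacobian(mapping, x1, x2)
    u, v = math.cos(phi), math.sin(phi)
    return math.hypot(a * u + b * v, c * u + d * v)
```

Because the pure power map is conformal, its linear magnification is the same in every direction, $p\,r^{p-1}$, and the areal magnification is its square. A nonconformal map (for instance, the power map followed by an anisotropic affine transformation) would give direction-dependent values from `linear_magnification`.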
A group of mappings of this type, namely the retinotopic maps in the cat visual cortex areas 17, 18, and 19, has been studied in previous work (Mallot, 1985). The mapping of all three areas is based on the complex power function with the exponent p = 0.43 which, in polar coordinates, can be written

$$\mathcal{R}: \mathbb{R}^2 \to \mathbb{R}^2, \qquad \mathcal{R}(r, \varphi) := (r^p, p\varphi). \tag{4}$$

For a better approximation of the electrophysiological mapping data (Tusa et al., 1979), corrections were introduced via affine transformations that render the overall map nonconformal. Figure 2 illustrates this coordinate transform for the areas 17 and 18 by using the simulated cortical representation of a visual scene that has some relevance for the cat. It is difficult to see what computational advantages these distortions might have. This simple illustration shows that computational interpretations of neural mapping should not be based on special properties of one mathematical rule. Rather, a framework for the interpretation of apparently irregular mappings is required. This need becomes even clearer if the multiplicity of mapped representations in all mammals is considered. For instance, in macaque monkeys, whose area V1 can in fact be approximated by the complex logarithm, there are some 15 other retinotopically organized areas with different maps (van Essen, 1985). However, most applications of coordinate transforms that have been proposed so far use the complex logarithmic map and rely heavily on its special properties (e.g., Sawchuk, 1974; Casasent & Psaltis, 1976; Reitboeck & Altmann, 1984; Jain, Bartlett, & O'Brien, 1987).

FIGURE 2. a. Sketch of a realistic scene from the cat's environment. b, c. Representation in cortical areas 17 and 18, respectively.

3. SPACE-VARIANT IMAGE PROCESSING

3.1. Maps are Defined by Receptive Field Properties

Basically, the information processing performed by a given neuron is described by its receptive field. Therefore, the one-to-one concept (1), where the notion of the receptive field does not exist, must be replaced by a many-to-many mapping. (Many-to-one could describe a single receptive field, but not an array of fields.) In this section, we show how this idea can be formulated in terms of space-variant image processing, where different image processing operations are applied in parallel to different parts of the image.

To define a neural mapping, we first have to select a distinguishable property of a receptive field. For most sensory modalities, position on the sensory surface is such a property. In this case, there is a trade-off between the sharpness of the mapping and the size of the receptive fields.² Other maps represent orientation, velocity, spatial frequency, or color in the visual system and frequency or interaural latency in the auditory system, etc. A two-dimensionally extended area of the brain is said to map a certain stimulus property if the following conditions hold:
• The receptive fields of neurons in that area are sufficiently sharply tuned to the stimulus property in question.
• The peak of the tuning curve changes continuously with the cell's position in the neural layer.

This definition works quite well if, for example, receptive field position in the cat area 17 is considered. However, in general, the choice of the receptive field property presents a problem. For instance, in area 17 of the cat, properties like orientation, size, ocularity, disparity, or velocity vary jointly with the cell's position in the cortex. The maps of the various tuning parameters can either be produced by just one underlying map (simple mapping) or they can be mutually unrelated (nested mapping).

¹ Conformal mappings are characterized by the preservation of local angles. That is, if two curves in the domain of the map intersect at a given angle, their images will intersect at that same angle in the range of the mapping. The conformal mappings are but a small subset of the mappings studied here.

² Erickson (1968) has introduced the terms topographic and nontopographic sensory modalities depending on the size of the receptive fields. Note that there is no necessary relation between nontopographic modalities and nontopographic maps: The former need not be mapped at all while the latter may represent some arbitrary stimulus characteristic of an otherwise topographic modality.
3.2. Definitions

In terms of space-variant image processing, the situation can be described as follows: Space-variant linear operators are characterized by a kernel K(x; y) which, for each site y ∈ T in the cortical target area and each stimulation point x ∈ S, specifies the weight by which the input influences the response (Figure 1b):

$$s(x) \mapsto e(y) := \int_S s(x) K(x; y)\, dx. \tag{5}$$

Here, s: S → R denotes the stimulus (input distribution) and e: T → R the excitation. The integral is taken over the two-dimensional domain S.

Receptive fields and point spread functions can both be obtained from the kernel K(x; y). For fixed y ∈ T, the receptive field profile of a neuron located at position y is obtained. Vice versa, for fixed x ∈ S, K is the point spread function or point image for a stimulus at position x. Put differently, K(x; y) is a family of receptive field profiles parameterized by y (cf. Figure 1b). The more sharply the rf-profiles change with y, the clearer the mapping. Therefore, if position is the mapped property, sharpness of the mapping relates to the size of the receptive fields. In pure coordinate transforms (section 2), K is of the form $\delta(y - \mathcal{R}(x))$ and the size of the receptive fields is zero.

The centroid or center of gravity for a receptive field function $K(x; y_0)$ is given by

$$m_K(y_0) := \frac{\int x\, K(x; y_0)\, dx}{\int K(x; y_0)\, dx}, \tag{6}$$

where $m_K \in \mathbb{R}^2$. The mean topographic map for a kernel K can be defined by assigning to each point y ∈ T the centroid of the corresponding receptive field:

$$\mathcal{R}_K(y) := m_K(y). \tag{7}$$

Note that $\mathcal{R}_K$ goes from the target area "backwards" to the source area. Of course, this is how topographic maps are measured in electrophysiology, where an electrode location in the cortex and a receptive field center in the visual field are related. In section 4.6, we will prove that $\mathcal{R}_K$ is in fact the inverse of the correct mapping function for a certain class of kernels.

Cellular magnification: Consider a point stimulus $s(x) = \delta(x - x_0)$ for some $x_0 \in S$. A measure for the total excitation elicited by this stimulus is the total energy of the corresponding distribution of excitation, i.e., the point spread function. While this total excitation may be interpreted as an amplification or gain factor, it will be taken as a model of cellular magnification (Myerson, Manis, Miezin, & Allman, 1977) in the continuous approach presented here. This interpretation assumes that each cell in T has the same dynamic range such that an increase in total excitation corresponds to a proportional increase in the number of cells carrying that excitation. We denote by $M_c(x_0)$ the cellular magnification at a location $x_0 \in S$ and define:

$$M_c(x_0) := \left( \int_T K^2(x_0; y)\, dy \right)^{1/2}. \tag{8}$$

Linear and areal magnification factors can only be defined for operators with reasonable mean topographic mappings $\mathcal{R}_K$ (7). The definitions are given in section [...]

3.3. Space-Variance of Input and Intrinsic Processing

While retinal processing and retinotopic point-to-point maps are obviously space-variant operations, internal cortical processing is often considered uniform, or space-invariant. This view is supported by both physiological (Hubel & Wiesel, 1974) and anatomical studies. For a more recent discussion of the issue see Orban (1985) and Dow et al. (1981). Space-variance in the internal processing, at least in the striate cortex, occurs on a finer spatial scale, i.e., on the level of columnar organization. In terms of the theory developed here, the two types of space-variance can be described as follows:

Input space-variance: The original (retinal) input s(x) is transformed by a neural mapping into a cortical version $s(\mathcal{R}^{-1}(y))$. The subsequent intrinsic operation acts on this distorted image.

Intrinsic space-variance: The intrinsic weighting function depends on both input- and output-site, rather than on their mere difference. In the linear model, a 4D kernel K(y', y) is required. To model columnar intrinsic organization, K might depend periodically on y.

The types of space-variant processing discussed in this paper are summarized in Table 1.
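The definitions of section 3.2 can be made concrete with a discretized kernel. The sketch below is an illustration under stated assumptions rather than a model from the paper: source and target are one-dimensional grids instead of two-dimensional areas, the toy kernel `K` (Gaussian receptive fields that widen with target position) is invented for the example, and all function names are hypothetical. It shows how the response (5), the receptive field and point spread function, the centroid (6), and the cellular magnification (8) are all read off the same kernel.

```python
import math

def gaussian(u, sigma):
    return math.exp(-0.5 * (u / sigma) ** 2)

# Hypothetical 1D discretization of source area S and target area T.
dx = dy = 0.1
xs = [i * dx for i in range(101)]   # source sites x in [0, 10]
ys = [j * dy for j in range(101)]   # target sites y in [0, 10]

def K(x, y):
    """Toy space-variant kernel: receptive fields widen with target position y."""
    sigma = 0.3 + 0.05 * y
    return gaussian(x - y, sigma)

def respond(s):
    """Eq. (5): e(y) = integral of s(x) K(x; y) dx, as a Riemann sum."""
    return [sum(s_i * K(x, y) for s_i, x in zip(s, xs)) * dx for y in ys]

def receptive_field(y):
    """K(. ; y) for fixed target site y."""
    return [K(x, y) for x in xs]

def point_spread(x):
    """K(x ; .) for fixed source site x."""
    return [K(x, y) for y in ys]

def centroid(y):
    """Eq. (6): center of gravity of the receptive field at target site y."""
    rf = receptive_field(y)
    return sum(x * w for x, w in zip(xs, rf)) / sum(rf)

def cellular_magnification(x):
    """Eq. (8): L2 norm of the point spread function for a stimulus at x."""
    return math.sqrt(sum(w * w for w in point_spread(x)) * dy)
```

Feeding a discrete point stimulus (a single entry of height `1 / dx`) into `respond` returns the point spread function at that source site, which is the discrete counterpart of the delta-function argument used above for cellular magnification.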
TABLE 1
Neural Mappings as Space-Variant Operators. Summary of the Operators Discussed in this Paper

Topographic Map:
$e(y) = s \circ \mathcal{R}^{-1}(y)$

Mapped Filter:
$e(y) = \int_T s \circ \mathcal{R}^{-1}(y')\, k(y - y')\, dy' = \int_S s(x)\, |\det J_{\mathcal{R}}(x)|\, k(y - \mathcal{R}(x))\, dx$

Patchy Connectivity:
$e(y) = \sum_n \int_T s_n \circ \mathcal{R}_n^{-1}(y')\, c_n(y')\, k(y - y')\, dy' = \sum_n \int_{S_n} s_n(x)\, c_n(\mathcal{R}_n(x))\, |\det J_{\mathcal{R}_n}(x)|\, k(y - \mathcal{R}_n(x))\, dx$

Constant Cellular Magnification Model:
$c_n(\mathcal{R}_n(x)) \cdot |\det J_{\mathcal{R}_n}(x)| = 1$, i.e., $e(y) = \sum_n \int_S s_n(x)\, k(y - \mathcal{R}_n(x))\, dx$

Transmission Grating Model:
$\mathcal{R}_n(x) = x$ for all n, i.e., $e(y) = \sum_n \int_S s_n(x)\, c_n(x)\, k(y - x)\, dx$

General Space-Variant Filter:
$e(y) = \int_S s(x)\, K(x; y)\, dx$
4. MAPPED FILTERS
In this section, we present rigorous results concerning an important special class of space-variant, linear operators. These are cascades of a continuous mapping from the retina (or a brain area) to some (other) brain area, and a convolution with a square-integrable kernel k. These cascades are the simplest model for the analysis of the interaction of input mappings that change local neighborhoods with an internal map which, in this case, has no positional variation at all. It is an extension of the continuous space-invariant (homogeneous) theory presented by von Seelen (1968) and Marko (1969). We shall show that a clear distinction between pure coordinate transforms and computational mappings fails even with this simple model, since both topographical position and computational properties of the resulting receptive fields are affected. Mapped filters are particularly suited to study the relationship between point-spread-functions and receptive fields, as pointed out by Mallot (1987). For electrophysiological results on this relationship, cf. McIlwain (1986).
4.1. The Mapping Plus Filter Cascade
DEFINITION. As in (1), consider two bounded subsets of the plane, $S, T \subseteq \mathbb{R}^2$, representing the visual field and a two-dimensional brain area, respectively. Let $\mathcal{R}$ be a topographic mapping S → T as in (1) and $k: \mathbb{R}^2 \to \mathbb{R}$ a convolution kernel describing the internal coupling in the brain area T. Finally, let s and e denote square integrable functions S → R and T → R, respectively, i.e., a distribution of stimuli (an image) or excitation. We then call the linear operator

$$s \mapsto e(y) := \int_T s(\mathcal{R}^{-1}(y'))\, k(y - y')\, dy' \tag{9}$$

a mapped filter with intrinsic kernel k and mapping $\mathcal{R}$. Again, x and y denote vectors and the integral is taken over a two-dimensional domain. (A detailed biological interpretation of this operator has been given by Mallot, 1987.)

In order to compare (9) to the general form of a space-variant operator (5), we apply the transformation theorem and obtain

$$e(y) = \int_S s(x)\, k(y - \mathcal{R}(x))\, |\det J_{\mathcal{R}}(x)|\, dx. \tag{10}$$

As in (2), $J_{\mathcal{R}}(x)$ denotes the Jacobian of $\mathcal{R}$. The absolute value of its determinant is the local areal magnification factor of the mapping $\mathcal{R}$ (2). From (10), we find the resultant space-variant kernel of the mapped filter,

$$K(x; y) = k(y - \mathcal{R}(x))\, |\det J_{\mathcal{R}}(x)|. \tag{11}$$

It is clear from (10) and (11) that for linear and affine mappings $\mathcal{R}$, the mapped filter reduces to an ordinary convolution. Figure 3 shows the resultant kernels $K(x; y_0)$ for fixed $y_0$ and different mappings $\mathcal{R}$.

Mapped filters are a special class of space-variant operators that are easy to construct. The converse question, whether a given space-variant operator can be realized or approximated by a mapped filter, is therefore of considerable interest. In the remaining part of this section, we will discuss a number of mathematical properties of mapped filters:
• Is the description by mapping and intrinsic kernel unique?
• What class of space-variant operators can be generated by sequences of mapped filters?
• Given a mapped filter, how can we retrieve mapping and intrinsic kernel? This problem arises in neurophysiological analysis and in technical applications if an arbitrary space-variant operator is to be constructed as a mapped filter.
• How do mapping and intrinsic kernel interact in the visual cortex?

Our results show that tailoring of mapped filters to image processing applications is quite feasible. The number and applicability of operators generated by mapped filters can be further increased if nets or feedback circuits of mapped areas are considered (cf. section 5).
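The equivalence of the two forms (9) and (10) of the mapped filter can be checked numerically. The sketch below is a one-dimensional illustration under assumptions that are not taken from the paper: the mapping is R(x) = sqrt(x), so that |R'(x)| plays the role of the Jacobian determinant, the intrinsic kernel and the stimulus are Gaussians, and the integrals are approximated with a midpoint rule. All names are hypothetical.

```python
import math

def R(x):
    """Example mapping (an assumption): R(x) = sqrt(x)."""
    return math.sqrt(x)

def R_inv(y):
    return y * y

def dR(x):
    """Derivative of R, the 1D stand-in for det J_R(x)."""
    return 0.5 / math.sqrt(x)

def k(u, sigma=0.2):
    """Intrinsic convolution kernel (unnormalized Gaussian)."""
    return math.exp(-0.5 * (u / sigma) ** 2)

def s(x):
    """Stimulus in source coordinates."""
    return math.exp(-(x - 4.0) ** 2)

S = (1.0, 9.0)             # source interval
T = (R(S[0]), R(S[1]))     # target interval, here (1.0, 3.0)

def midpoint(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def e_target(y):
    """Eq. (9): convolution of the mapped stimulus, integrated over T."""
    return midpoint(lambda yp: s(R_inv(yp)) * k(y - yp), *T)

def e_source(y):
    """Eq. (10): the same response, integrated over S with kernel (11)."""
    return midpoint(lambda x: s(x) * k(y - R(x)) * abs(dR(x)), *S)
```

For any target position `y`, `e_target(y)` and `e_source(y)` agree up to discretization error, which is the content of the transformation theorem step leading from (9) to (10).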
FIGURE 3. Mapped filter kernels. a. Rotationally symmetric convolution kernel k(y), a difference of Gaussians. b. Resulting kernel (cf. eqn (11)) for a linear mapping function (constant areal magnification). The mapping leads to a distortion of the argument of k. c. Resulting kernel for the complex logarithmic mapping. In addition to the distortion of the argument of k, an asymmetry is induced by the space-variant areal magnification.
4.2. Uniqueness
We start with a simple case where the separation of a mapped filter into a map $\mathcal{R}$ and an intrinsic kernel k is not unique: consider the mapping $\mathcal{R}'(x) := \mathcal{R}(x) + s$ and the kernel $k'(x) := k(x + s)$ for a fixed shift vector s. Obviously, the mapped filters built on $\mathcal{R}$ and k on the one hand and on $\mathcal{R}'$ and k' on the other are identical, since $k'(y - \mathcal{R}'(x)) = k(y - \mathcal{R}(x))$ and the Jacobians of $\mathcal{R}$ and $\mathcal{R}'$ coincide. Uniqueness up to this shift vector can be proved for an important special class of kernels:

THEOREM 1. Let $k: \mathbb{R}^2 \to \mathbb{R}$ be a rotationally symmetric kernel, i.e., k is of the form $k(x) = \tilde{k}(\|x\|)$ for $x \in \mathbb{R}^2$ with $\tilde{k}: \mathbb{R}_0^+ \to \mathbb{R}$ one-to-one and $\tilde{k} \not\equiv 0$. The integral of k over $\mathbb{R}^2$ exists and is different from zero. Let further $\mathcal{R}: S \to T$ be a topographic mapping as in (1) with $\det J_{\mathcal{R}}(x) \neq 0$ and $0 \in T$. Then k and $\mathcal{R}$ in the resultant kernel $k(y - \mathcal{R}(x)) \cdot |\det J_{\mathcal{R}}(x)|$ are uniquely determined.

The proof of Theorem 1 can be found in Appendix A. The symmetry requirements for k can easily be relaxed. In order to illustrate the relation of intrinsic and resultant kernel, however, rotational symmetry is a convenient assumption and has already been used in Figure 3.

4.3. Magnification Factors for Mapped Filters
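Areal and linear magnification factors of mapped filters are simply the corresponding magnifications of the internal mapping $\mathcal{R}$ as defined in (2, 3). For the cellular magnification, we have from (8) and (11):

$$M_c(x) = \left( \int_T k^2(y - \mathcal{R}(x))\, |\det J_{\mathcal{R}}(x)|^2\, dy \right)^{1/2} = \|k\| \cdot M_a(x), \tag{12}$$

where $\|k\|$ denotes the $L_2$ norm of k.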
That is, for mapped filters, areal and cellular magnification differ only by a constant. It is implicitly assumed, of course, that the density of cells in both the source and the target area is constant. In section 5, we study a case where a variable cell density in the target area is involved.

4.4. Which Operators Can Be Written as Mapped Filters?

From the definition (9), it follows immediately that the zeroth moment of the resultant kernel $K(x; y_0)$, taken over x, does not depend on $y_0$. Therefore, a necessary condition for a space-variant operator to be a mapped filter is the space-invariance of its zeroth moment. We call operators of this type space-invariant in the mean and state the following:

THEOREM 2. Let A, B be two operators which are space-invariant in the mean (in the sense that the zeroth moments of their kernels are space-invariant). Further let the range of B be part of the domain of A such that the composition $A \circ B$ exists. Then, the composed operator $A \circ B$ is again space-invariant in the mean.
Theorem 2 gives a necessary condition for the constructibility of a general space-variant operator out of mapped filters. The proof is straightforward and will be omitted. The converse question, whether all operators that are space-invariant in the mean can be constructed as sequences of mapped filters, cannot yet be answered.
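Two properties of mapped filters stated above lend themselves to a quick numerical check: the zeroth moment of the resultant kernel (11) over x does not depend on the target position, and cellular and areal magnification differ only by the constant $\|k\|$, as in (12). The following one-dimensional sketch makes several assumptions that are not from the paper: the mapping R(x) = sqrt(x), a Gaussian intrinsic kernel, a midpoint-rule quadrature, and target positions kept away from the borders of T so that boundary effects are negligible.

```python
import math

SIGMA = 0.2

def k(u):
    """Intrinsic kernel (unnormalized Gaussian, an assumption)."""
    return math.exp(-0.5 * (u / SIGMA) ** 2)

def R(x):
    return math.sqrt(x)

def dR(x):
    return 0.5 / math.sqrt(x)

def K(x, y):
    """Resultant kernel of the mapped filter, eq. (11), in one dimension."""
    return k(y - R(x)) * abs(dR(x))

S = (0.01, 100.0)   # source interval
T = (0.1, 10.0)     # target interval R(S)

def midpoint(f, a, b, n=20000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def zeroth_moment(y):
    """Integral of K(x; y) over x for a fixed target position y."""
    return midpoint(lambda x: K(x, y), *S)

def cellular_magnification(x0):
    """Eq. (8) applied to the mapped filter kernel."""
    return math.sqrt(midpoint(lambda y: K(x0, y) ** 2, *T))

def areal_magnification(x0):
    """1D stand-in for eq. (2): |R'(x0)|."""
    return abs(dR(x0))
```

In this setting `zeroth_moment(y)` returns the integral of k for every interior `y`, and the ratio `cellular_magnification(x0) / areal_magnification(x0)` equals the $L_2$ norm of k regardless of `x0`.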
4.5. Optimal Sampling for Mapped Representations
In order to represent an image s(x) in a mapped area in the distorted version $s(\mathcal{R}^{-1}(y))$, the input sampling grid should not be spatially uniform but depend locally on the areal magnification of $\mathcal{R}$. In the visual system, input sampling is performed by the retinal ganglion cells and is therefore described in terms of receptive fields rather than discrete grids of δ-functions. As was pointed out by Fischer (1973), a relation between the receptive field size and sensitivity on one hand and the mapping function on the other hand can be predicted in the case of uniform optimal sampling. We now derive optimal sampling and interpolation functions for mapped representations.

Consider a grid of cortical input sites $\{y_i\}_{i=1,\dots,N}$. Each input site receives an input value $s_i$ which is determined by the retinal stimulus and a retinal receptive field function $\varphi_i(x)$. The input value $s_i$ is spread over the cortical surface according to an interpolation or point-spread-function $\alpha_i(y)$. In other words, instead of the ideal mapped image $s(\mathcal{R}^{-1}(y))$, input to the cortex now consists of the sampled version $\hat{s}(y)$:

$$\hat{s}(y) := \sum_{i=1}^{N} s_i\, \alpha_i(y), \tag{13}$$

where the coefficients $s_i$ are determined from the input image s via the window functions $\varphi_i$:

$$s_i := \int \varphi_i(x)\, s(x)\, dx. \tag{14}$$
Optimality of sampling can be achieved by choosing appropriate input sites $\{y_i\}$, point-spread-functions $\alpha_i$, and receptive field functions $\varphi_i$, such that the representation error $\int (s(\mathcal{R}^{-1}(y)) - \hat{s}(y))^2\, dy$ is minimized. Let us assume that the point-spread-functions $\{\alpha_i\}$ form an orthogonal system, i.e., $\alpha_i$ and $\alpha_j$ are uncorrelated for all $i \neq j$. If furthermore $\|\alpha_i\| = 1$ for all i, it follows from the theory of orthogonal systems (cf. Tolstoy, 1962):

$$s_i = \int_T s(\mathcal{R}^{-1}(y))\, \alpha_i(y)\, dy = \int_S s(x)\, \alpha_i(\mathcal{R}(x))\, |\det J_{\mathcal{R}}(x)|\, dx. \tag{15}$$

Comparison with (14) yields

$$\varphi_i(x) = \alpha_i(\mathcal{R}(x))\, |\det J_{\mathcal{R}}(x)|. \tag{16}$$
If the $\{\alpha_i\}$ are correlated, optimal sampling functions $\{\varphi_i\}$ can be derived along the same lines by first orthogonalizing the system $\{\alpha_i\}$. As an illustration, we choose for $\mathcal{R}$ the complex root function (4), for $\{y_i\}$ an equidistant Cartesian grid, and for $\alpha_i$ difference-of-Gaussians with zero mean, centered around $y_i$. The functions $(1/\sqrt{x})\,\alpha(\sqrt{x} - y_i)$, i.e., sections along the horizontal meridian through the optimal sampling functions, are plotted for a number of equidistant cortical positions $y_i$ in Figure 4. The optimal sampling functions are space-invariant in the mean, i.e., the product of receptive field size and sensitivity is constant, as is the relative receptive field overlap of cells mapping to equidistant cortical locations. Both properties have been found in the cat retina by Creutzfeldt, Sakmann, Scheich, and Korn (1970) and Fischer and May (1970). Space-invariance in the mean, however, is not preserved in the operators proposed by Braccini, Gambardella, and Sandini (1981) and Porat and Zeevi (1988) as a model of retinal filtering. A more realistic model that accounts for local scatter and global continuous variation of receptive field size has been proposed by Koenderink and van Doorn (1978).

A somewhat surprising result of this consideration is that in order to obtain optimal sampling in a retinotopically mapped cortical area, the preferred orientations of retinal receptive fields should vary with retinal position (cf. Figures 4, 5a). More specifically, preferred orientation should depend on the radial angle between the cell's location and the horizontal meridian, since the areal magnification is more or less a radial function of retinal position and decreases with eccentricity. Further details of this relation depend on the interpolation function $\alpha$. For different spatio-temporal interpolation functions and moving stimuli, the occurrence of both radial and tangential arrangements was simulated by von Seelen, Mallot, and Giannakopoulos (1987). Neurophysiological and anatomical studies of retinal ganglion cell receptive fields have shown that a tangential orientation bias is in fact present in the cat retina (Levick & Thibos, 1980; Schall, Vitek, & Leventhal, 1986).

In the cat geniculo-striate pathway, areal magnification is obtained without cellular magnification since the retinal ganglion cells are distributed more sparsely towards the periphery. On the other hand, both cellular and areal magnifications are found in the monkey (Orban, 1984). Since in both cases the receptive field profiles of the retinal ganglion cells are adapted to the subsequent distortion so that optimal sampling is obtained, it follows from (16) that mapped filters are an appropriate description in either situation. Put differently, mapped filters need not be interpreted as a cascade of a pure mapping plus a pure convolution but are applicable to a much broader class of problems.
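The trade-off between receptive field size and sensitivity implied by (16) can be sketched along one meridian. The example below is a hedged illustration, not a reproduction of Figure 4: it assumes the one-dimensional mapping R(x) = sqrt(x) (so the sampling function carries the derivative factor 1/(2 sqrt(x))), a difference of normalized Gaussians with widths 0.2 and 0.4 as the cortical point spread function, and hypothetical helper names. As a proxy for "size times sensitivity" it uses the integral of the absolute value of the sampling function, which is a choice made here for illustration.

```python
import math

def gauss(u, sigma):
    """Normalized Gaussian."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def alpha(u):
    """Cortical point spread function: zero-mean difference of Gaussians."""
    return gauss(u, 0.2) - gauss(u, 0.4)

def R(x):
    return math.sqrt(x)

def dR(x):
    return 0.5 / math.sqrt(x)

def phi(x, y_i):
    """Eq. (16) in 1D: optimal retinal sampling function for cortical site y_i."""
    return alpha(R(x) - y_i) * abs(dR(x))

def midpoint(f, a, b, n=20000):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def l1_norm(y_i):
    """Integral of |phi_i| over the retina: a proxy for size times sensitivity."""
    return midpoint(lambda x: abs(phi(x, y_i)), 0.01, 100.0)

def center_sensitivity(y_i):
    """Value of phi_i at the retinal point that maps onto y_i."""
    return phi(y_i ** 2, y_i)
```

Under these assumptions `l1_norm` is the same for all interior cortical sites, while `center_sensitivity` falls off as 1/y_i: sampling functions for more eccentric sites are broader and proportionally less sensitive.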
FIGURE 4. Optimal retinal sampling functions for uniform resolution in the cortical representation. A space-invariant difference of Gaussians is chosen as a cortical point-spread-function. The mapping function is the complex root.

4.6. Retrieval of Mapping and Intrinsic Kernel

Suppose we have a space-variant operator which we want to express as a cascade of a mapping and a convolution. In this section, we assume that this segmentation is possible. We define candidates for the intrinsic kernel and the mapping of a space-variant operator (with space-invariant mean).

DEFINITION. Consider a linear operator with a kernel $K: S \times T \to \mathbb{R}$ and a topographic mapping $\mathcal{R}: S \to T$ (which, of course, we suspect is the underlying retinotopic mapping). We then call the function $k_{\mathcal{R}}: \mathbb{R}^2 \to \mathbb{R}$, with

$$k_{\mathcal{R}}(y) = \frac{1}{\|T\|} \int_T \frac{K(\mathcal{R}^{-1}(y'); y' + y)}{|\det J_{\mathcal{R}}(\mathcal{R}^{-1}(y'))|}\, dy', \tag{17}$$
the mean intrinsic kernel of K with respect to ~?. ttere, IITi! denotes the (Lebesque-) measure of T, i.e., its area. (lf T is unbound, it suffices to take any bound subset o f T , since the integration in (17) is only an averaging operation.) Note that k.~ is well dcfincd in the sense that from k(x;y) = k(y - ~?(x)). Idet J :(x)l, it follows that k: = k. Equation (17) thus solves the problem of retrieving the intrinsic kernel i]'!,~is already known.
i
o
We now address the harder problem of retrieving the mapping $\mathcal{R}$. The mean topographic mapping $\mathcal{R}_K$ of an operator with kernel $K$ was already defined in (7). Here, we show that, under certain assumptions, $\mathcal{R}_K$ is the inverse of the sought mapping $\mathcal{R}$.
THEOREM 3. Consider a mapped filter $M$ as defined in (9) with an intrinsic kernel $k$ and a topographic mapping $\mathcal{R}$. In addition, we assume that $k$ is rotationally symmetric, i.e., $k(x) = k(\|x\|)$, and that the mapping $\mathcal{R}$ is conformal. Then the mean topographic mapping $\mathcal{R}_M$ of $M$ is the inverse of $\mathcal{R}$ except for boundary effects.
The proof is given in Appendix B. Theorem 3 confirms the intuitive expectation that, in mapped filters, the center of gravity of the receptive field of a cortical cell (which is, of course, a location in the visual field) is in fact mapped to the cortical position of that cell by the underlying mapping function.
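The statement of Theorem 3 can be checked numerically. The following is a minimal sketch and is not taken from the paper: it assumes the complex logarithm as the conformal mapping, a Gaussian as the rotationally symmetric intrinsic kernel, and it takes the mean topographic mapping to be the center of gravity of the resultant kernel $K(x; y) = k(y - \mathcal{R}(x))\,|\det J_{\mathcal{R}}(x)|$. The function names, the sampling grid, and all parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian(u, sigma):
    """Rotationally symmetric intrinsic kernel k(u) on the complex plane (assumed Gaussian)."""
    return np.exp(-np.abs(u) ** 2 / (2.0 * sigma ** 2))

def receptive_field_centroid(y, sigma=0.1, half_width=2.5, n=501):
    """Center of gravity of K(x; y) = k(y - R(x)) |det J_R(x)| for R(x) = log(x).

    Points of the plane are written as complex numbers; for an analytic map
    the Jacobian determinant is |R'(x)|^2 = 1 / |x|^2.
    """
    x0 = np.exp(y)  # R^{-1}(y): the location predicted by the theorem
    re = np.linspace(x0.real - half_width, x0.real + half_width, n)
    im = np.linspace(x0.imag - half_width, x0.imag + half_width, n)
    x = re[None, :] + 1j * im[:, None]       # retinal sampling grid (assumed)
    K = gaussian(y - np.log(x), sigma) / np.abs(x) ** 2
    return (x * K).sum() / K.sum()

y_cell = np.log(3.0 + 1.0j)                  # cortical position of the cell
centroid = receptive_field_centroid(y_cell)
predicted = np.exp(y_cell)                   # inverse mapping applied to y_cell
print(abs(centroid - predicted))             # should be close to zero
```

The grid is chosen so that it stays in the right half-plane, away from the branch cut of the logarithm; this is the kind of boundary effect the theorem excludes.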
4.7. Intrinsic and Resultant Kernels
4.7.1. Simulation. As compared to the intrinsic coupling function, the resultant kernel of a mapped filter is changed in a systematic way. In this section, we study the relation between intrinsic and resultant kernels by way of examples. In Figure 3, resultant kernels $K(x; y_0)$ (i.e., receptive field functions) constructed out of an internal center-surround organization are shown for a simple affine transformation and the complex logarithm. The space-variant organization for modelled maps of the cat's areas 17, 18, and 19 (Mallot, 1985) is shown in Figure 5. Here, contour lines for relative levels of excitation and inhibition, respectively, are used to show receptive field organization. The result is that a radially organized asymmetry is induced in the receptive field organization.
FIGURE 5. Space-variance of resultant kernels shown by contour lines for the maps of areas 17, 18, and 19. Intrinsic kernels are DOG-functions as shown in Figure 3a; contour lines of both intrinsic and resultant kernels correspond to 20% of the maximal excitation (filled black regions) and 80% of the maximal inhibition (open circles and crescents). Mapping functions are taken from the model of Mallot (1985). Resultant kernels show a distortion (elongation, size) as well as an induced asymmetry which is roughly radially oriented.
The position-dependent bias of orientation selectivity predicted by Figure 5 has been found electrophysiologically by various authors (cat area 17: Payne & Berman, 1983; Schall et al., 1986; cat areas 17 and 18: Berman et al., 1987; cat areas PMLS, PLLS: Rauschecker, von Grünau, & Poulin, 1987). The theoretical considerations presented here show that topographic mapping may be the underlying reason for this effect. Even if the map does not perform a polar coordinate transform (such as the complex logarithm), polar and radial arrangements in the visual field can arise (Figure 5). Payne and Berman (1983) and Rauschecker et al. (1987) have proposed optical flow analysis as an obvious functional interpretation of this radial organization. However, this interpretation makes sense only if the observer fixates the focus of expansion of the optical flow pattern. Psychophysically, no such behavior has been demonstrated. In section 7, we will introduce a topographic mapping for optical flow computation that works without fixating the focus of expansion.
As an example for functional interpretations of receptive field asymmetries, consider the representation of a moving stimulus in a layer of neurons with spatio-temporal receptive fields. If these receptive fields are rotationally symmetric, the peak of excitation will trail the actual position of the stimulus by an amount depending on stimulus velocity and temporal response characteristics. The representation will suffer from dispersion. As was pointed out by Frömel (1980), asymmetric receptive fields with different temporal characteristics for excitatory and inhibitory mechanisms can compensate for this effect. Asymmetries of this type were in fact demonstrated in electrophysiological recordings in the cat's superior colliculus (Frömel, 1980). In the spatial arrangement shown in Figure 5, the compensation applies to motion in radial directions.
How can the asymmetry or orientation of a receptive field function be formally defined? One way is the axis of maximal inertia as defined by the second moments (e.g., Rosenfeld & Kak, 1982). However, in analogy to Theorem 3, the second moments of intrinsic and resultant kernel may be identical even for mappings that produce severe asymmetries. They are therefore not suitable for studying the effect depicted in Figures 3 and 5. Another problem with the axis of inertia is that, strictly speaking, it can only be defined for nonnegative functions.
Oriented receptive field functions, however, will always have negative (inhibitory) parts. We therefore propose a different definition of a mapping-induced orientation: topographic mapping affects a receptive field mechanism more strongly the larger its spatial extent is. If we assume an inhibitory and an excitatory mechanism of different size, i.e., a simple center-surround organization, the location of these mechanisms in the resultant kernel can be shifted by different amounts. In fact, this is what happens in Figure 3, where the locations of inhibitory and excitatory parts are visualized by contour lines. The relative shift of excitatory and inhibitory mechanisms is a plausible neural model for the construction of orientation selective units. Receptive field organization of this type has been proposed for the superior colliculus by Frömel (1980) and for retinal ganglion cells by Dawis, Shapley, Kaplan, & Tranchina (1984).
4.7.2. Models of Asymmetry. The asymmetry induced by retinotopic mapping and the considerations on optimal sampling presented in section 4.5 can serve as a model for the global orientation bias with receptive field position. As was shown in section 4.5, a bias of retinal dendritic field orientation is required to produce an isotropic cortical representation. We conjecture that the global bias is related to input space-variance while cortical orientation columns are due to intrinsic space-variance. Both effects may be independently superimposed. A different view has been presented by Vidyasagar (1987), who discusses global bias as a source of cortical orientation selectivity.
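The remark in section 4.7 that a mapping shifts a large mechanism more than a small one can be illustrated with a short sketch. It is an illustration under stated assumptions, not the simulation behind Figures 3 and 5: the mapping is the complex logarithm, each mechanism is a Gaussian of cortical width `sigma`, "location" is read as the position of the maximum of the mechanism along a ray through the fovea, and all names and values are hypothetical.

```python
import numpy as np

def mechanism_peak(r0, sigma, n=200001):
    """Radial position of the maximum of one Gaussian mechanism after mapping.

    Along a ray through the fovea, the resultant profile for R(x) = log(x) is
    k(log r0 - log r) / r^2, where 1 / r^2 is the Jacobian determinant.
    """
    r = np.linspace(0.05 * r0, 3.0 * r0, n)
    profile = np.exp(-(np.log(r) - np.log(r0)) ** 2 / (2.0 * sigma ** 2)) / r ** 2
    return r[np.argmax(profile)]

r0 = 1.0                                        # retinal eccentricity R^{-1}(y0)
center_peak = mechanism_peak(r0, sigma=0.1)     # narrow excitatory mechanism
surround_peak = mechanism_peak(r0, sigma=0.3)   # wide inhibitory mechanism
print(center_peak, surround_peak)
```

Under these assumptions the maximum lies at $r_0 \exp(-2\sigma^2)$, so the wider mechanism is displaced further toward the fovea and the two mechanisms separate along the radial direction.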
5. INTEGRATION OF MAPS: PATCHES, BLOBS, AND STRIPES
5.1. Patchy Connectivity
So far, we have dealt with single cortical areas and their connections in simple sequences. However, complicated networks or directed graphs are much more realistic for describing the interactions of cortical areas in the visual system (van Essen, 1985). In nets of areas, the convergence of projections from two or more source areas to a common target often involves segregation, i.e., the disruptive arrangement of different fiber populations in stripes (e.g., ocular dominance in V1) or patches. The spatial pattern of this segregation has been investigated in terms of pattern formation (Meinhardt, 1982), the representation of higher-dimensional parameter spaces (Mitchison & Durbin, 1986), or its computational benefits (Yeshurun & Schwartz, 1989). Here, we focus on realistic descriptions of input segregation as a prerequisite for further computational interpretations. For different types of interaction between two images, different spatial patterns of input segregation can be named. For instance, stripes occur if more or less equivalent images (such as left and right view of a stereo pair) are combined. High acuity intensity images and coarse color information are combined in a more asymmetric way by distinct color blobs in a continuous luminance representation.
5.2. Formalization
Suppose that $N$ source areas $S_1, \ldots, S_N \subset \mathbb{R}^2$ converge onto one common target area $T \subset \mathbb{R}^2$. We assume in this section that the $S_n$ and $T$ are bounded. Associated with each source area $S_n$, $1 \le n \le N$, is a mapping function $\mathcal{R}_n : S_n \to T$. In the definition of a topographic map (1), we implicitly assumed that all cells of the target area receive input from the source area, i.e., that areal and cellular magnification are the same. In patchy connectivity, this is no longer the case since a representation of a small region may be sparsely scattered over a wide area. For instance, if the representation of the left retina is interrupted by an ocular dominance stripe representing the right retina, the geometrical magnification at this site is huge since small shifts in retinal position may correspond to a cortical shift across the dominance stripe. However, the cellular magnification does not increase. We introduce a set of functions $c_n : T \to \mathbb{R}_0^+$ with
$$\sum_{n=1}^{N} c_n(y) = 1 \tag{18}$$
such that at a location $y \in T$, $c_n(y)$ is the density of cells receiving input from $S_n$. These functions describe the pattern of patchy connectivity. The total input into $T$ is given by $\sum_{n=1}^{N} c_n(y) \cdot s_n(\mathcal{R}_n^{-1}(y))$, where $s_n(x)$ denotes the stimulus distribution in the source area $S_n$ (cf. eqn (9)). Let $k$ denote an intrinsic convolution kernel operating in the target area $T$ and $e$ the output or excitation in the target area. As in (9), the overall operation becomes:
$$e(y) = \int_T \sum_{n=1}^{N} c_n(y')\, s_n(\mathcal{R}_n^{-1}(y'))\, k(y - y')\, dy' \tag{19}$$
$$\phantom{e(y)} = \sum_{n=1}^{N} \int_{S_n} s_n(x)\, c_n(\mathcal{R}_n(x))\, |\det J_{\mathcal{R}_n}(x)|\, k(y - \mathcal{R}_n(x))\, dx. \tag{20}$$
This general model accounts for most cases of patchy connectivity. We will discuss two special cases of (19) that will be used in the examples below (cf. Figure 6).
Transmission Grating Model: Suppose that $\mathcal{R}_n = \mathcal{R}_m$ for all $n, m \in \{1, \ldots, N\}$, e.g., $\mathcal{R}_n(x) = x$ for all $n$, $x$. An example is discussed by Schwartz (1982), who uses two inputs and $c_{1,2}(y) := \tfrac{1}{2}(1 \pm \sin y_1)$ as a pattern of input segregation. Note that in the transmission grating model, those parts of the inputs that happen to be mapped to a $y \in T$ with $c_n(y) \neq 1$ are partially deleted. In Schwartz' model, this affects one half of each image. The advantage of the transmission grating model is that it can be interpreted as a spatial modulation of the input stimuli.
Constant Cellular Magnification Model: This special case of (19) avoids the problem of partial deletion and models the case depicted in Figure 1d. For the mapping from one $S_n$ to $T$, we obtain the cellular magnification $M_n$ by applying (8) to the kernel of the summands of (20):
$$M_n = c_n(\mathcal{R}_n(x)) \cdot |\det J_{\mathcal{R}_n}(x)| \cdot \|k\|. \tag{21}$$
The assumption that $M_n$ is a constant amounts to:
$$|\det J_{\mathcal{R}_n^{-1}}(y)| = c_n(y) \quad \text{for } n \in \{1, \ldots, N\}. \tag{22}$$
Equation (22) can be satisfied only if $\sum_{n=1}^{N} \|S_n\| = \|T\|$. In all other cases, it has to be modified by an appropriate factor. Intuitively, (22) means that any areal magnification arising in one of the maps $\mathcal{R}_n$ is accompanied by a dilution or attenuation of the input signal $s_n(x)$. In our continuous model, this corresponds to the absence of cellular magnifications even if strong areal magnifications are present. We substitute (22) into (20) and obtain:
$$e(y) = \sum_{n=1}^{N} \int_{S_n} s_n(x)\, k(y - \mathcal{R}_n(x))\, dx. \tag{23}$$
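A one-dimensional discretization may help to read the transmission grating case of (19). The sketch below is an illustration only: it assumes identity mappings, the grating $c_{1,2}(y) = \tfrac{1}{2}(1 \pm \sin y)$ attributed to Schwartz (1982) in the text, a Gaussian intrinsic kernel, and a regular sampling grid. The function name and all numerical values are hypothetical.

```python
import numpy as np

def transmission_grating(s1, s2, y, kernel):
    """Discretized eq. (19) in one dimension with identity mappings.

    s1, s2 : stimulus samples of the two source areas on the grid y
    kernel : samples of the intrinsic kernel k on a grid with the same spacing
    """
    c1 = 0.5 * (1.0 + np.sin(y))          # density of cells driven by input 1
    c2 = 1.0 - c1                         # the two densities sum to one
    total_input = c1 * s1 + c2 * s2       # input arriving in the target area
    dy = y[1] - y[0]
    return np.convolve(total_input, kernel, mode="same") * dy

y = np.linspace(0.0, 20.0 * np.pi, 4001)
dy = y[1] - y[0]
t = np.arange(-200, 201) * dy
k = np.exp(-t ** 2 / (2.0 * 0.5 ** 2))
k /= k.sum() * dy                         # normalize the kernel to unit integral

# Two identical uniform inputs: the interlaced representation stays uniform.
e = transmission_grating(np.ones_like(y), np.ones_like(y), y, k)
print(e[2000])
```

With two different inputs, each one contributes only where its own density is high, which is the partial deletion mentioned for the transmission grating model.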
An interesting mathematical problem arising with the constant cellular magnification model is the following: suppose a pattern of patchy connectivity is given as in (18). Under which conditions do the corresponding mapping functions $\mathcal{R}_n$ exist, such that the partial differential equation (22) can be satisfied? In the next section, a case with separable functions $c_n$ is presented, where an explicit solution is possible.
5.3. Two Equivalent Inputs: Ocular Dominance Stripes
Ocular dominance stripes in area V1 are the best known example for discrete connectivity, in this case originating in the contra- and the ipsilateral geniculate bodies. Moreover, neuroanatomical findings suggest that a stripewise combination of different inputs is often used in cortices. Let the connectivity pattern be described by separable functions $c_{1,2}(y) := f_{1,2}(y_1) \cdot g(y_2)$. We choose $g \equiv 1$ and $f$ periodic with period length $\lambda$. For $y_1 \in [-\lambda/2, \lambda/2]$, $c_n$ can be defined by:
$$c_n(y) := f_n(y_1); \qquad f_1(y_1) := \begin{cases} 1 & \text{for } |y_1| \le a/2 \\ 0 & \text{otherwise} \end{cases}, \qquad f_2(y_1) := 1 - f_1(y_1), \tag{24}$$
where $a < \lambda$ is the width of the stripes representing source area $S_1$. In the transmission grating model (Figure 6a), both $\mathcal{R}_1$ and $\mathcal{R}_2$ in (19) are the identity. We expand (24) into a Fourier series:
$$e(y) = \int_T \left[ s_1(y') \left( b_0 + \sum_{i=1}^{\infty} b_i \cos\frac{2\pi i\, y_1'}{\lambda} \right) + s_2(y') \left( (1 - b_0) - \sum_{i=1}^{\infty} b_i \cos\frac{2\pi i\, y_1'}{\lambda} \right) \right] k(y - y')\, dy', \tag{25}$$
where $b_0 := a/\lambda$ and $b_i := \frac{2}{i\pi} \sin\frac{i\pi a}{\lambda}$ are the Fourier-cosine coefficients of $f_1$ from (24). Omitting the convolution and setting $b_0 = \tfrac{1}{2}$, (25) reduces to eqn (3) of Schwartz (1982).
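The Fourier-cosine coefficients used in (25) can be checked by summing the series and comparing it with the stripe function $f_1$ of (24). This is a small sketch under the reading $b_0 = a/\lambda$ and $b_i = \frac{2}{i\pi}\sin\frac{i\pi a}{\lambda}$; the function name, the number of terms, and the test values are assumptions made for illustration.

```python
import numpy as np

def stripe_series(y1, a, lam, n_terms=2000):
    """Partial Fourier-cosine sum of the stripe pattern f_1 with width a and period lam."""
    i = np.arange(1, n_terms + 1)
    b0 = a / lam
    bi = 2.0 / (i * np.pi) * np.sin(i * np.pi * a / lam)
    cosines = np.cos(2.0 * np.pi * np.outer(y1, i) / lam)
    return b0 + cosines @ bi

# One point in the middle of a stripe and one in the middle of a gap.
y1 = np.array([0.0, 0.5])
approx = stripe_series(y1, a=0.4, lam=1.0)
print(approx)   # close to f_1 = 1 inside the stripe and f_1 = 0 in the gap
```

Near the stripe borders the partial sums converge slowly and overshoot, so the comparison is only meaningful away from the discontinuities.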
Equation (25) allows the following interpretations:
1. The process outlined represents a spatial amplitude modulation in which, with an appropriate filtering, the input images can be shifted in the spatial frequency domain.
2. Using the described interlacing technique, images can be combined in any frequency range. In this way, it is possible to reduce the dimensions of the images and to adapt them to locally restricted operators.
3. Using two images multiplied with 180° phase shifted sinusoidal transmission gratings allows measuring the disparity (Schwartz, 1982; Yeshurun & Schwartz, 1989) and/or the degree of fusion with an appropriate filtering or by computing the differences between the two multiplexed images.
For the constant cellular magnification model (Figure 6b), the result is slightly different. In the simple separable case (24), one can explicitly solve (22) for the $\mathcal{R}_n^{-1}$ and obtains (26). In intervals where $f_n = c_n = 0$, the value of $\mathcal{R}_n^{-1}$ does not change as $y_1$ is increased ("jumps" in the graph of $\mathcal{R}_n$ in Figure 6b). The maps can, however, be inverted if discontinuities in the resulting functions $\mathcal{R}_n$ are tolerated. Of course, these discontinuities correspond to the boundaries of ocular dominance stripes. Inserting (26) into (20) shows that both the mapping functions $\mathcal{R}_n$ and the weights $c_n$ contain periodic parts. Applying this result to (20) shows, therefore, that this case can be interpreted as a combination of amplitude and frequency modulation.
It should be noted that the separable periodic connectivity pattern (24) is only an approximation of the true pattern. In a recent anatomical study by Löwel and Singer (1987), a spatial frequency analysis of the ocular dominance pattern in the cat is presented. While there is a peak at a column spacing of about 800 µm, this is much less pronounced than one might expect for a spatial modulator. Of course, the general models of this section can still be applied.
FIGURE 6. Ocular dominance stripes: a. Transmission grating model. The mapping functions $\mathcal{R}_{1,2}$ are identical and independent of the stripe pattern $c_{1,2}$. One half of each image is deleted. b. Constant cellular magnification model. The mapping functions are slightly different for the left and right eye and depend on the stripe pattern by (26). All parts of the images are represented by equal amounts of cortical space.
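For the constant cellular magnification model with the stripe pattern (24), one mapping that satisfies (22) is obtained by integrating $f_n$ along $y_1$, i.e., $\mathcal{R}_n^{-1}(y) = \left(\int_0^{y_1} f_n(t)\, dt,\; y_2\right)$. This form is an assumption made here for illustration (the lower integration limit in particular is arbitrary). The sketch below computes the $y_1$ component numerically; function names and parameter values are hypothetical.

```python
import numpy as np

def stripe_f1(y1, a, lam):
    """f_1 from (24), continued periodically: 1 inside stripes of width a, 0 elsewhere."""
    r = (y1 + lam / 2.0) % lam - lam / 2.0
    return (np.abs(r) <= a / 2.0).astype(float)

def inverse_maps(y1, a, lam):
    """First components of the two inverse mappings as running integrals of f_1 and f_2."""
    f1 = stripe_f1(y1, a, lam)
    dy = y1[1] - y1[0]
    x_left = np.cumsum(f1) * dy            # source coordinate of input 1
    x_right = np.cumsum(1.0 - f1) * dy     # source coordinate of input 2
    return x_left, x_right

y1 = np.linspace(0.0, 4.0, 4001)
x_left, x_right = inverse_maps(y1, a=0.4, lam=1.0)
print(x_left[300], x_left[700])            # equal: plateau inside a stripe of input 2
```

The derivative of each running integral is the corresponding $c_n$, which is condition (22) in this separable case, and the plateaus are the intervals in which the inverse mapping does not change.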
5.4. Two Nonequivalent Inputs
Consider as a second example a two-dimensional discretization where input from $S_1$ is confined to small input sites embedded in continuous input from $S_2$. We have $c_1(y) := a \sum_{l=1}^{L} \delta(y - y_l)$ for some $a < \|T\|/L$, where the $y_l$, $l = 1, \ldots, L$, are the input sites. In this example, we are only interested in the "inter-area receptive field" of a cell at position $y_0 \in T$ in the discretely mapped source area $S_1$. (That is, we suppose that $s_2(x) \equiv 0$.) With the transmission grating model ($\mathcal{R}_n(x) = x$), one obtains:
$$e(y) = a \sum_{l=1}^{L} s(y_l)\, k(y - y_l) = a \int_T s(y') \sum_{l=1}^{L} \delta(y' - y_l)\, k(y - y')\, dy'. \tag{27}$$
For a hexagonal grid $y_l$ and a Gaussian as intrinsic kernel, the distribution of excitation $e(y)$ is depicted in Figure 7a. More information can be gained from the inter-area receptive field of a unit at position $y$, $f_y$ (note that $x = \mathcal{R}_1(x) = y$ in the transmission grating model):
$$f_y(x) := a \sum_{l=1}^{L} \delta(x - y_l)\, k(y - x). \tag{28}$$
We can now apply our previous results on space-variant operators to (27) and (28). First consider the mean topographic mapping of the space-variant operator (27) as defined in section 4.4. Unlike the case studied in Theorem 3, the centroid of $f_y$ is shifted as compared to $y$. The pattern of this shift is largely determined by the width of the intrinsic coupling $k$. In horizontal penetrations with a recording microelectrode, this pattern would correspond to a clustering of receptive field positions in the vicinity of the input sites. The size of the shift depends on the width of the kernel $k$. Since we have a clear definition of the position of the resultant receptive field, we can define an induced asymmetry in the same way as in section 4.5. Let $B$ denote the range or coupling width of the intrinsic kernel. We define the axis of asymmetry for a cell at position $y$, $a(y)$, and an intrinsic kernel $k$ by
$$a(y) := \frac{\partial}{\partial B}\, \frac{\sum_{l} y_l\, k((y - y_l)/B)}{\sum_{l} k((y - y_l)/B)}. \tag{29}$$
Figure 7b shows the asymmetries of inter-area receptive fields of cells from the target area $T$. The vectors are positioned at the receptive field centers. The results show that a regular pattern of asymmetries can be brought about by discrete connectivity. Figure 7b shows a nested map where coarse position corresponds to position in the input layer while fine position corresponds to a functional specialization of the receptive fields. While the centric orientation pattern of Figure 7b is well in line with electrophysiological recordings (cf. von Seelen, 1970; Braitenberg & Braitenberg, 1979), (29) cannot serve as a model of cortical hypercolumns, since the spatial sampling of input fibers in area 17 is much finer than the hypercolumn spacing. Braitenberg (1985) has presented a model which uses spatial discretizations on a larger scale, i.e., cytochrome oxidase blobs, and combines small asymmetric fields into receptive fields of realistic size using cell assemblies. In general, discretizations, whether in the input (as discussed here), in internal or in feedback connectivity, can be considered a means of producing spatial variations of receptive field parameters.
FIGURE 7. Asymmetry induced by a hexagonal grid of input sites. a. Cortical distribution of excitation for simultaneous stimulation of all input sites. Intrinsic coupling is a Gaussian with width in the order of the separation of the input sites. b. Induced asymmetry of the receptive fields of cells between the input sites (cf. eqn (29)). The receptive field centers are shifted towards the input sites while the vector of induced asymmetry points away from them.
5.5. Multiple Equivalent Inputs
Figure 8 shows one schematic example for the integration of early vision data from different channels in a patchy representation. Suppose input from different channels is combined in patches in a coarsely topographic area. The channels may deal with texture, motion, luminance, color and so forth. In image segmentation, discontinuity information from all of these channels has to be integrated to come up with one consistent interpretation. Patchy connectivity can reduce this interaction problem to that of the local detection and straightening of edges. The finding that patches dealing with different input parameters often have irregular overlap (Sherk, 1986; Swindale et al., 1987; Löwel, Freeman, & Singer, 1987) fits nicely into this interpretation.
FIGURE 8. Integration of input data from different modules in a patchy representation. Discontinuity data from the various channels can be combined by simple edge straightening operations.
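The shift of receptive field centers and the induced asymmetry discussed in section 5.4 for a hexagonal grid of input sites can be sketched numerically. The code below is an illustration under assumptions, not the computation behind Figure 7: the intrinsic kernel is a Gaussian of width `B`, the centroid of the inter-area receptive field is the kernel-weighted mean of the input sites, and the asymmetry is taken as the derivative of that centroid with respect to `B`, approximated by a central difference. Grid size, names, and values are hypothetical.

```python
import numpy as np

def hexagonal_sites(n_rings=6, spacing=1.0):
    """Input sites y_l on a hexagonal lattice around the origin."""
    idx = np.arange(-n_rings, n_rings + 1)
    i, j = np.meshgrid(idx, idx, indexing="ij")
    x = spacing * (i + 0.5 * j)
    y = spacing * (np.sqrt(3.0) / 2.0) * j
    return np.column_stack([x.ravel(), y.ravel()])

def field_centroid(y, sites, B):
    """Centroid of the inter-area receptive field of a cell at cortical position y."""
    d2 = ((sites - y) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2.0 * B ** 2))       # Gaussian k((y - y_l) / B)
    return (w[:, None] * sites).sum(axis=0) / w.sum()

def induced_asymmetry(y, sites, B, dB=1e-4):
    """Central-difference estimate of the change of the centroid with B."""
    upper = field_centroid(y, sites, B + dB)
    lower = field_centroid(y, sites, B - dB)
    return (upper - lower) / (2.0 * dB)

sites = hexagonal_sites()
y = np.array([0.3, 0.0])                   # cell between the sites (0,0) and (1,0)
centroid = field_centroid(y, sites, B=0.3)
asym = induced_asymmetry(y, sites, B=0.3)
print(centroid, asym)
```

For this cell the centroid lies between the nearest input site at the origin and the cell position, and the asymmetry vector points away from that site, in line with the description in the caption of Figure 7.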
6. PARAMETRIC MAPPING
The general idea for computational interpretations of neural mappings is that, once a map is established, the subsequent local operations are much more powerful than if operating on the original image. In this view, a two-dimensional layer of neurons comprises a general data format that can represent much more than the mere stimulus position through the location of neural excitations. This data format can be used on various hierarchical levels of processing; it can be interpreted as a parallel implementation of parameter estimators. Mechanisms by which general spatial representations, or "parametric maps", can be brought about include:
1. Internal columnar organization such as cortical orientation columns leads to a mixed spatial representation of both stimulus position and orientation.
2. In mapped filters (section 4), several receptive field properties change according to cortical position. For example, it was shown in section 4.7 that a space-dependent asymmetry is imposed even on space-invariant intrinsic filters. Conversely, it follows that cortical position encodes a combination of both retinal position and orientation. The same holds for receptive field size and stimulus velocity.
3. Patchy connectivity can organize continuous local parameter variations, as was demonstrated in section 5.4. In effect, this is equivalent to internal columnar organization. Both cases result in a nested map where coarse coordinates encode stimulus position while fine coordinates encode an independent receptive field parameter such as orientation. An example is the variation of ocularity between the discrete ocular dominance stripes.
4. If an input map is equated with an output map, meaningful spatial representations can arise. For instance, in eye movement control, the input image is represented in the upper layers of the superior colliculus. The basal layers contain an output map in the sense that "saccade-related bursts" precede a saccadic eye movement to the topographically related position (Sparks & Nelson, 1988). In a sense, the output map represents motor programs in a spatial code.
5. In the auditory system, spatial encoding of stimulus frequency is already established by the mechanical properties of the basilar membrane. This spatially encoded "auditory image" (Suga, 1988) is processed in much the same way as ordinary images in the visual system. Although the projection of the spiral ganglion to the auditory cortex is a topographic map (sometimes with "fovealization", Suga et al., 1987), the whole system acts as a parametric map.
Although the mechanisms mentioned are quite different, they all result in a spatial encoding of stimulus parameters other than stimulus position. We conjecture that types 1 and 3, which lead to nested maps, are the most common ones in the brain. As compared to the approaches of Barlow (1986) and Kohonen (1988a), these mechanisms employ only simple features whose existence in the neocortex can easily be demonstrated. At this point, however, it is not clear how the above mechanisms can be used to construct parametric mappings of global computational aspects.
TABLE 2. Processing parameter images. On the left, parameters are listed that can be represented spatially in the cortex. Applying early vision operators (top row) to these "parameter images" corresponds to advanced information processing tasks.
Early vision operators (top row): Lateral Inhibition; Feature Detection; Motion Detection; Smoothing, Regularization, Relaxation.
Parameters (left column): Position; Color; Disparity; Oriented Motion (hypercolumns); Pitch Plus Intensity (in bats); Patches of Different Parameters.
Entries: Contrast Enhancement; Retinex (separate reflectance and illumination); Features; Motion; Stereo-Matching; Motion in Depth; Surface Reconstruction; Motion Capture; Acceleration; Aperture Problem; Recognition of Insect from Wingbeat Pattern; Detection of Frequency Modulation; Dynamic Parameter Relations; Integration of Modules.
Two-dimensional layers, mapping, local operators, and the spatially discrete combination of inputs form the basis for a geometrical organization of information processing. In this context, it is not the absolute position of a cell that is decisive, but its functional vicinity, i.e., the information represented in a neighborhood of the cell. Table 2 presents a number of examples of how these parametric mappings, once they are established, can interact with simple early-vision operations. On the whole, parametric representations can be used to exploit the redundancy in the input stimuli. This can be achieved by encoding in nearby spatial positions stimulus situations that tend to occur jointly or that resemble each other (von Seelen & Mallot, 1988). Strictly speaking, this idea requires that the inherent topology of the stimulus phase space be two dimensional. An application of this principle in speech recognition has been presented by Kohonen (1988b).
7. APPLICATION: INVERSE PERSPECTIVE MAPPING
In the previous sections of this paper, we have developed a mathematical framework for the analysis of various types of neural mappings. Here, we present an example of the computational applications of topographic maps, i.e., an optimal map for obstacle detection based on optical flow computation. Consider an observer (e.g., a typical mammal) whose movement is confined to a plane. That is, he has one or two degrees of freedom for translations and one for rotations around the normal axis of the plane (i.e., his dorso-ventral axis). In this situation an obstacle can be defined as anything rising above this plane. This is a minimal definition of an obstacle which does not require additional information about the nature of the object blocking the way. With no obstacles around, both the optical flow generated by translatory egomotion and stereoscopic disparities take a simple form which is determined by the camera (eye) geometry. Any deviation from this expected pattern must be due to an obstacle, which is the more important the larger the deviation is. The problem in the detection of such deviations is that variations of optical flow or disparity can be due to either perspective foreshortening (in the background) or the 3D structure of the scene. Since we are only interested in the latter, i.e., the presence of elevated points, we could try to use a coordinate transform to eliminate the effects of perspective (cf. Figure 10).
Note that for an orthogonal coordinate system $A := \{a, b, c\}$, perspective mapping onto the $a, b$-plane at a distance $d$ from the center of projection is given by
$$\mathcal{P}_A : \mathbb{R}^3 \to \mathbb{R}^2, \qquad \mathcal{P}_A(E) := -\frac{d}{(E \cdot c)} \begin{pmatrix} (E \cdot a) \\ (E \cdot b) \end{pmatrix} \tag{30}$$
for all points $E = (e_1, e_2, e_3) \in \mathbb{R}^3$ with $E \cdot c \neq 0$. We introduce a world coordinate system $H := \{x, y, z\}$, where $x$ and $y$ span the horizontal plane while $z$ points upwards. The camera model is described by a second coordinate system $C := \{u, v, w\}$, where $u$ and $v$ span the image plane and $w$ is the optical axis. Both frames share a common origin, the center of projection or nodal point, $N$, at distance $h$ (height) and $f$ (focal length) from the horizontal and image planes, respectively. The coordinate transform from the camera centered system to the world system is described by an orthogonal matrix $T$ which is composed of the column vectors $u$, $v$, $w$ (cf. Figure 9). To remove the space-variance due to perspective, we construct an "inverse perspective mapping": starting with a point $E_I$ in the image plane, we want to find the corresponding point $E_H$ in the horizontal plane, i.e., the $x, y$ plane of the world coordinate system $H$. Let $x'$, $y'$ denote the coordinates in the horizontal plane and $u'$, $v'$ those in the image plane. By tracing the ray through $E_I$ and the nodal point to its intersection with the horizontal plane, one can easily show that the collineation
$$\mathcal{Q} : \mathbb{R}^2 \to \mathbb{R}^2, \qquad \begin{pmatrix} x' \\ y' \end{pmatrix} = \mathcal{Q}(u', v') := \frac{-h}{u_3 u' + v_3 v' - w_3 f} \begin{pmatrix} u_1 u' + v_1 v' - w_1 f \\ u_2 u' + v_2 v' - w_2 f \end{pmatrix} \tag{31}$$
is the sought coordinate transform. (For a derivation in terms of projective geometry, cf. Mallot, Schulze, & Storjohann, 1989a.) As in (30), we denote by $\mathcal{P}_C$ and $\mathcal{P}_H$ the projection onto camera and horizontal plane, respectively. Then, by construction, we have:
$$\mathcal{Q} \circ \mathcal{P}_C = \mathcal{P}_H. \tag{32}$$
FIGURE 9. Coordinate system for inverse perspective mapping. N: nodal point or center of projection. y, z: axes of the world centered coordinate system. v, w: axes of the camera centered system. The third axes (x, u) coincide and are orthogonal to the paper plane. F: fixation point; E: a point in three-space; $E_I$, $E_H$: its projections to the image and horizontal planes, respectively; E its homogeneous representation.
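The relation between the camera projection, the inverse perspective mapping, and the projection onto the horizontal plane can be checked numerically. The following is a sketch under assumptions: the perspective projection is taken with a negative sign (image plane behind the nodal point), the camera is tilted downward by a hypothetical angle, and the function names, camera parameters, and test point are illustrative only.

```python
import numpy as np

def perspective(E, a, b, c, d):
    """Perspective projection of E onto the a,b-plane at distance d (assumed sign convention)."""
    return -d / (E @ c) * np.array([E @ a, E @ b])

def inverse_perspective(p, T, f, h):
    """Map an image point p = (u', v') to the horizontal plane at height h below the nodal point."""
    u, v, w = T[:, 0], T[:, 1], T[:, 2]
    ray = p[0] * u + p[1] * v - f * w      # image point in world coordinates
    return -h / ray[2] * ray[:2]           # intersection of the ray with z = -h

tilt = 0.4                                  # downward tilt of the optical axis (assumed)
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, -np.sin(tilt), -np.cos(tilt)])
w = np.array([0.0, np.cos(tilt), -np.sin(tilt)])
T = np.column_stack([u, v, w])              # orthogonal matrix with columns u, v, w
f, h = 0.05, 1.5                            # focal length and camera height (assumed)
x, y, z = np.eye(3)                         # world axes

E = np.array([0.5, 4.0, -1.2])              # a point 0.3 above the ground plane
image_point = perspective(E, u, v, w, f)
mapped = inverse_perspective(image_point, T, f, h)
direct = perspective(E, x, y, z, h)
print(mapped, direct)                       # the two results should agree
```

For a point on the ground plane ($e_3 = -h$) the mapped position equals its true horizontal position, while elevated points are displaced away from the observer.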
d~", (E) •
-J. (/.} • m.
dt
(33)
where J., denotes the Jacobian matrix of the projection (31)). Figure 10c shows the resulting motion field for the scene depicted in Figure 10a and translatory egomotion. If we apply the inverse perspective mapping ':? prior to the computation of the image flow, i.e.. if we compute the image flow from the transformed image shown in Figure lob, the result is: nl~, ,
-
,m,,,! -
-J :
~
1
.,(El
"m
h
I
=
' i'm'l.
\\
.%
\ ,111':1 :
(32)
It is only this projection, 9',, that we have to deal with in the sequel. The dependence on the camera coordinate system was removed by the inverse perspective map. In practice, of course, ,bt cannot be obtained directly, since the original image acquisition used the projection !¢< in the first place. If the camera frame, i.e., the observer, is moving in the horizontal plane at a constant speed m, the image of a stationary 3D point E will move in the image plane at a speed m[ which is determined by m;
/
. ,:,: ,,',:,:,:,:,','
\ I
-J
,,(E)
FIGURE 10. Optical flow and inverse perspective mapping. a. Perspective view of a 3D scene, b. Flowfleld for egomotlon In this scene, c. The same flowfleld as detected after inverse perspective mapping, d. Body-scaled obstacle detection based on the length of the motion vectors in c.
stacle can be defined simply as a deviation from that rule and no higher-level information is needed• The comparison of the two images is performed via topographic mappings '3/. and !~/~, for the left and right eye, respectively, and a subsequent substraction which leaves only those regions above threshold that display an obstacle. D :=, b [!h elcv I. - elev I
(35)
• m
(34)
where elev := h + e, is the elevation of the point E above the ground plane, i.e., its importance as an obstacle. (Note that e~ < 0 in typical cases.) From
here. it is easy to detect the obstacle with a local uniform operation, such as an unidirectional motion detector. Thresholding the result to cut off the egomotion vector itself, the obstacle can be made to stand out clearly (Figure 10d). No further information about the obstacle is required to trigger some sort ot "'avoidance behavior". For a more detailed discussion of optical flow computation and inverse perspective mapping, see Mallot, Biilthoff, & Little. 1989b. Inverse perspective mapping can be applied to stereopsis as well. Again, if no obstacles were present, the image generated in the left camera would be identical to that in the right one, except for a coordinate transform that includes the inverse perspective mappings for both cameras. Thus. an o b -
FIGURE 11. Iso
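The scale factors behind (34) and (35) can be made explicit with a short worked derivation. This is only a sketch under assumptions that are not spelled out in this excerpt: a pinhole camera whose nodal point lies at height h above the ground plane, an inverse perspective map that is the central projection through the nodal point onto the ground plane, and, for stereo, two nodal points at the same height separated by a horizontal baseline b. The symbols $\mathcal{Q}$, $c_L$, $c_R$ and $E_{xy}$ are labels chosen here for illustration and need not match the paper's notation.

```latex
% Assumed setup (labels are illustrative, not the paper's notation):
% nodal point at the origin, ground plane at z = -h,
% E = (e_1, e_2, e_3), elev := h + e_3, E_{xy} := (e_1, e_2).
\begin{align*}
\mathcal{Q}(E) &= -\frac{h}{e_3}\,E_{xy}
                = \frac{h}{h-\mathrm{elev}}\,E_{xy}
   && \text{(central projection onto the ground plane)} \\
\frac{d}{dt}\,\mathcal{Q}(E) &= -\frac{h}{h-\mathrm{elev}}\; m
   && \text{(stationary $E$, observer velocity $m$ in the horizontal plane)} \\
\mathcal{Q}_i(E) &= c_i + \frac{h}{h-\mathrm{elev}}\,\bigl(E_{xy} - c_i\bigr),
   \quad i \in \{L, R\}
   && \text{(nodal points above $c_L$, $c_R$)} \\
\mathcal{Q}_L(E) - \mathcal{Q}_R(E) &= \frac{\mathrm{elev}}{h-\mathrm{elev}}\,(c_R - c_L),
   \qquad \lVert c_R - c_L \rVert = b
   && \text{(disparity in the mapped images)}
\end{align*}
```

Under these assumptions a point on the ground (elev = 0) yields a mapped flow vector with the length of the egomotion vector and zero disparity, while both quantities grow with elevation and depend on E only through elev. The sign in the second line depends on the coordinate convention and is not claimed to match (34).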
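The body-scaled obstacle detection of Figure 10d can be illustrated numerically. The following is a minimal sketch, assuming a pinhole camera with its nodal point at the origin, a ground plane at z = -h, and an inverse perspective map implemented as central projection onto that plane. The function names (`inverse_perspective`, `mapped_flow`, `obstacle_mask`, `elevation_from_flow`) and the tolerance `tol` are hypothetical choices for this example and do not come from the paper.

```python
import numpy as np


def inverse_perspective(points, h):
    """Project 3D points (camera coordinates, e3 < 0 below the camera)
    through the nodal point onto the ground plane z = -h."""
    points = np.asarray(points, dtype=float)
    scale = -h / points[:, 2]              # equals h / (h - elev)
    return points[:, :2] * scale[:, None]


def mapped_flow(points, m, h, dt=1e-3):
    """Finite-difference flow in the mapped image for an observer that
    translates with horizontal velocity m; scene points are stationary."""
    points = np.asarray(points, dtype=float)
    shift = np.array([m[0], m[1], 0.0]) * dt
    before = inverse_perspective(points, h)
    # In the camera frame a stationary point is displaced by -m * dt.
    after = inverse_perspective(points - shift, h)
    return (after - before) / dt


def obstacle_mask(flow, m, tol=0.05):
    """Mark vectors that are longer than the egomotion vector by more
    than the (assumed) relative tolerance tol."""
    speed = np.linalg.norm(flow, axis=1)
    return speed > (1.0 + tol) * np.linalg.norm(m)


def elevation_from_flow(flow, m, h):
    """Invert |m'| = h / (h - elev) * |m| for the elevation."""
    speed = np.linalg.norm(flow, axis=1)
    return h * (1.0 - np.linalg.norm(m) / speed)


if __name__ == "__main__":
    h = 1.5                                # camera height (assumed units)
    m = (1.0, 0.0)                         # egomotion in the ground plane
    pts = [[2.0, 1.0, -1.5],               # point on the ground, elev = 0
           [2.0, 1.0, -1.0]]               # obstacle point, elev = 0.5
    flow = mapped_flow(pts, m, h)
    print(obstacle_mask(flow, m))
    print(elevation_from_flow(flow, m, h))
```

The thresholding step is a purely local, uniform operation on the mapped flow field: no knowledge about the obstacle beyond the length of the local motion vector is used, which is the point made in the text.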
As in (34), elev := h + e3 is the elevation of the obstacle above the ground plane, i.e., a measure for its importance. Equation (35) shows that disparity in the mapped image depends only on the z component of the imaged point. Surfaces of constant disparity are therefore the horizontal planes. Figure 11 illustrates the situation.

Inverse perspective mapping, although very useful for visual information processing, has not been demonstrated in electrophysiological studies. However, Epstein (1984), who investigated the deviation of the cat's area 17 map from a conformal mapping, argued that inverse perspective does account for that deviation. In monkeys, which live in full three-dimensional environments rather than on a plane, this compensation of perspective has not been found and would in fact be useless.

8. CONCLUSION

1. Neural mapping is a powerful strategy for parallel information processing. Even problems that by their nature do not involve two-dimensional distributions are tackled with image processing techniques in mapped two-dimensional layers of neurons, i.e., cortices.
2. Position and spatial nearness in cortical networks are crucial parameters since all internal operations are local. We therefore favor an advanced localization theory of cortical function over the idea of assemblies of cells at arbitrary positions.
3. Natural neural networks have a rich anatomical structure of which the mapping types mentioned here are only a part. For the study of neural computation, this structure should not be neglected. To what extent geometric organization of neural nets can be used for the processing of symbolic information is not clear at present.

REFERENCES

Allman, J. M., & Kaas, J. H. (1971). Representation of the visual field in striate and adjoining cortex of the owl monkey (Aotus trivirgatus). Brain Research, 35, 89-106.
Ballard, D. H. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition, 13, 111-122.
Ballard, D. H. (1987). Cortical connections and parallel processing: structure and function. In M. A. Arbib & A. R. Hanson (Eds.), Vision, brain, and cooperative computation (pp. 563-621). Cambridge, MA: MIT Press.
Barlow, H. (1986). Why have multiple cortical areas? Vision Research, 26, 81-90.
Berman, N. E. J., Wilkes, M. E., & Payne, B. R. (1987). Organization of orientation and direction selectivity in areas 17 and 18 of cat cerebral cortex. Journal of Neurophysiology, 58, 676-699.
Blasdel, G. G., & Salama, G. (1986). Voltage-sensitive dyes reveal a modular organization in monkey striate cortex. Nature, 321, 579-585.
Braccini, C., Gambardella, G., & Sandini, G. (1981). A signal theory approach to the space and frequency variant filtering performed by the human visual system. Signal Processing, 3, 231-240.
Braitenberg, V. (1985). Charting the visual cortex. In A. Peters & E. G. Jones (Eds.), Cerebral Cortex, Vol. 3: Visual Cortex (pp. 379-410). New York and London: Plenum Press.
Braitenberg, V., & Braitenberg, C. (1979). Geometry of orientation columns in the visual cortex. Biological Cybernetics, 33, 179-186.
Casasent, D., & Psaltis, D. (1976). Position, rotation, and scale invariant optical correlation. Applied Optics, 15, 1793-1799.
Creutzfeldt, O. D., Sakmann, B., Scheich, H., & Korn, A. (1970). Sensitivity distribution and spatial summation within receptive-field center of retinal on-center ganglion cells and transfer function of the retina. Journal of Neurophysiology, 33, 654-671.
Cynader, M. S., Swindale, N. V., & Matsubara, J. A. (1987). Functional topography in cat area 18. The Journal of Neuroscience, 7, 1401-1413.
Daniel, P. M., & Whitteridge, D. (1961). The representation of the visual field on the cerebral cortex in monkeys. Journal of Physiology (London), 159, 203-221.
Dawis, S., Shapley, R., Kaplan, E., & Tranchina, D. (1984). The receptive field organization of x-cells in the cat: spatiotemporal coupling and asymmetry. Vision Research, 24, 549-564.
Dow, B. M., Snyder, A. Z., Vautin, R. G., & Bauer, R. (1981). Magnification factor and receptive field size in foveal striate cortex of the monkey. Experimental Brain Research, 44, 213-228.
Epstein, L. I. (1984). An attempt to explain the differences between the upper and lower halves of the striate cortical map in the cat's field of view. Biological Cybernetics, 49, 175-177.
Erickson, R. P. (1968). Stimulus coding in topographic and nontopographic afferent modalities: On the significance of the activity of individual sensory neurons. Psychological Review, 75, 447-475.
Fischer, B. (1973). Overlap of receptive field centers and representation of the visual field in the cat's optic tract. Vision Research, 13, 2113-2120.
Fischer, B., & May, H. U. (1970). Invarianzen in der Katzenretina: Gesetzmäßige Beziehungen zwischen Empfindlichkeit, Größe und Lage receptiver Felder von Ganglienzellen. Experimental Brain Research, 11, 448-464.
Frömel, G. (1980). Extraction of objects from structured backgrounds in the cat superior colliculus. Part II. Biological Cybernetics, 38, 75-83.
Hubel, D. H., & Wiesel, T. N. (1974). Uniformity of monkey striate cortex. A parallel relationship between field size, scatter, and magnification factor. The Journal of Comparative Neurology, 158, 295-306.
Hubel, D. H., & Wiesel, T. N. (1977). Functional architecture of macaque monkey visual cortex. Proceedings of the Royal Society (London) B, 198, 1-59.
Jain, R., Bartlett, S. L., & O'Brien, N. (1987). Motion Stereo Using Ego-Motion Complex Logarithmic Mapping. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9, 356-369.
Kaas, J. H., Nelson, R. J., Sur, M., & Merzenich, M. M. (1981). Organization of somatosensory cortex in primates. In F. O. Schmitt, F. G. Worden, G. Adelman, & S. G. Dennis (Eds.), The organization of the cerebral cortex (pp. 237-261). Cambridge, MA: MIT Press.
Knudsen, E. I., du Lac, S., & Esterly, S. D. (1987). Computational maps in the brain. Annual Review of Neuroscience, 10, 41-65.
Koenderink, J. J., & van Doorn, A. J. (1978). Visual detection of spatial contrast; influence of location in the visual field, target extent and illuminance level. Biological Cybernetics, 30, 157-167.
Kohonen, T. (1988a). Self-organization and associative memory. Springer Series in Information Sciences, Vol. 8 (2nd ed.). Berlin: Springer Verlag.
Kohonen, T. (1988b). The "neural" phonetic typewriter. IEEE Computer, 21, 11-22.
Levick, W. R., & Thibos, L. N. (1980). Orientation bias of cat retinal ganglion cells. Nature, 286, 389-390.
Löwel, S., & Singer, W. (1987). The pattern of ocular dominance columns in flat-mounts of the cat visual cortex. Experimental Brain Research, 68, 661-666.
Löwel, S., Freeman, B., & Singer, W. (1987). Topographic organization of the orientation column system in large flat-mounts of the cat visual cortex: a 2-deoxyglucose study. The Journal of ...
Schall, J. D., Vitek, D. J., & Leventhal, A. G. (1986). Retinal constraints on orientation specificity in cat visual cortex. The Journal of Neuroscience, 6, 823-836.
Schwartz, E. L. (1977). Afferent geometry in the primate visual cortex and the generation of neuronal trigger features. Biological Cybernetics, 28, 1-14.
Schwartz, E. L. (1980). Computational anatomy and functional architecture of striate cortex: A spatial mapping approach to perceptual coding. Vision Research, 20, 645-669.
Schwartz, E. L. (1982). Columnar architecture and computational anatomy in primate visual cortex: segmentation and feature extraction via spatial frequency coded difference mapping. Biological Cybernetics, 42, 157-168.
Sherk, H. (1986). Coincidence of patchy inputs from the lateral geniculate complex and area 17 to the cat's Clare-Bishop area. The Journal of Comparative Neurology, 253, 105-120.
Sparks, D. L., & Nelson, J. S. (1987). Sensory and motor maps in the mammalian superior colliculus. Trends in Neurosciences, 10, 312-317.
Suga, N. (...). Neural computation for auditory imaging. Neural ...

show that k()'
'R(x)).ldetJ(x)+
= ht~ - >(x)) . , d e t J . t x h.
(3~)
where h and ,.~ satisfx the same requirements., a~, k and ,J~. implie~,
Neural ,.~lal~pm,q k
h ;rod 'C
k(~
3"1
26.7
~. N,atc that (36) i>, e q u i v a l e n t /z(~,
:~().'))dctJly')i
to
'0,iihh:
',,()) ."
•
(371 "lTIc imq~ping :~ i', '.\ell d c l i r i c d sirlc¢ ~-J~\\as r e q u i r e d I~ bc or'ict~-om.'. \\'c ,4~o\\ litst the r c l a l i o n k = const • h: Sino." II ~ T. v~¢ gill1 chn(~,c ) H in (37) illl(] obt,;iin: /, I.~ )
hly
h(I)l) de\ J ((I) .
i l q i l dot
./ Ill) /~ hiy) d~
143)
.tl
\'.hctc m.., I,, Illc /,,..rolh monlcnl ~)1 lhc intrit+Nc kernel I, xA.'c ...'\i'liIIIC]b : ,I~ fill\) il [;i\I(lr, ,,,,.,rlgsiiild slii,,l\ the rc',.LIItiIIL.' Ill',)lllt-'Ill'~ ol ~. the t.'xl+~ilIlSi'nn ol <~ i',l'.O',siI"It-" MIl+..c,i~ is doIII~iIIniII. Ill COIllpOI1¢II l~.;
13,'; )
N o l o |hil| I h c rir',t illl)lllgrlts ~i b(.llh ]., ~md h arc / c r o d u c I() |lie ~,111IllCII'} I¢t]tllrClllt.'lll~ \ \ C l~lkg tIIc lirst illOI11¢111s ill I~oth ~illcs (~1 (3k) ~illd n b l m n
II
~,'),I~'.
,"()')k(.~
~,~ '
(3u)
I~. h,i I I. 2; .~ ( ~.. / ) . Stil',Mllulin~ L'\I?I'L'''~L'd l'~\ the Ill()IIlL'II[S 01 1". I c i
i)
(i
r').
(44)
thr, i n l o ( 4 3 h :h, can bc
.
t r o n i ('47i. \\c" II~i~c' dot J " dcl J . d o t d . I I o t h Jacobian~ and lhc' ill[c'Tral o l h \\c_'ll_' r e q u i r e d Io bc dirlc.rcnt lr()lll / c r o . II¢llCC h i l l i . (I :ind l h c r c l o r c I r o n l i t S ) k ( l ) - .clcl J' (())lhl,~). l h c illlitliit.'llt.'~s ol ,l~ (.'{ill I)t_' ~hil\\ll b} COlllral~itlMli()ll: \Vc h a \ c I(~ ~hn\\ Ihat '1 "~ • ,i; id. lilt.' i d c n t i t \ . \Vc Cl,~,~tilllc' th~il '1 , itl Cllltl c'(UlMtlc'r lhc Itlllc'li~lll
IL%');
dot J ( ) ' ) i
(-Ill)
,.let J ((i)
u
/'(3
~, : . ' : ( . ~ ' . ;
;,l.~" I
(45)
,h.
(4n)
_.."~ 4
i21):(21)'
" u.
(-17)
#'.l'.ll i I)!
~ul+~',llItllill~(441 ~md 147) ii+tt~I4") vtultl~: b\
h.,(~)
"~(~,)
.
_,: \,
.u
A
~(~).
V, ilh
b(~")). (J2)
I'lndll\. ',~.cIIt)IL'Ill;it ll,.ml i~ icl. it Iollo\,.,, their the cl)It,,Iillll .I 1(11 rclatirl,.1'1~ ~ind h i-,unity I'hls complctt--,, the prt~td of" I'hct~rcil] I
II.
I,/~
(41)
" i~(~'~)
Ft+r ~ \ \ i t h ~ , '+(~") ,,.~: ¢~tll llO;\ ,..'Ollslruc! a ,.'~.)tilriicliditm t-'h(),+Ml1~ h i l ' ~ in ( 4 ] ) I11c \,ilu,..',.
1,(i.t
i P k(Jld,.
m
/,f.~ - 3')
I i~ ~
d.._.nol,.:lh.+.'nlomcnl,, ol ,( ~iil,.lit',l~i,,li:ilpart ).,.lCN',c.+:li'.cl } I"ro111 (he l'()lil|iOlIitl",\llllllt-'I[\. iT It~Ilm',,lh;l[ all llII)lllCn(~It/. OI']~ ,a.ilh odd i tlr / ~.ittIi~.I1,I"~ir tilt."H.\ICC t.'~.t.'iI ll1OIlICIIlx.~+llICt.'~in,,I~.'li',.¢;
F r o m (371. (3;':I. '.,.,._' ha',c ll)l
:
m
", " :
.. (( . .....
)
" .
~ ....
I .~ ) .
(4S)
~lllc"J .~ !l~ in t'OllhllllIitl,l',o[h L'Ollll+~illICllls ()I ~ :lru'hill'IllOlllC drlcl the [ ~ii',Im:imb,~" nl ~iI[~u',.IcI,./.,~.dI1bh l lcn,.c ',. <,;
Proof of Theorem 3
\\',...' l'q,,.',,Clt! :l ',tricI p r o o l k l r the \-';is,..' T I~ Suh,,titutirL,.z I n u n I l l ) i n h l 171. lind d,.:nc, lirL,.z the l11\{.'l'S,,2 ill '1~ lw .". ,at-. hil~.,,. '
11 I , R . tht-' n.',.uh hnld,, tnr ~lll p o i n t , .~. H " ",ut-'h their the -.Upl'~ut o l / , ( ) ) ,) i,, :l " , u h w t ol T