Neural Networks 11 (1998) 441–447
Contributed article
Two-dimensional Gabor-type receptive field as derived by mutual information maximization

K. Okajima*

NEC Corporation, Tsukuba, Japan

Received 19 May 1997; accepted 25 December 1997

* Requests for reprints should be sent to Dr K. Okajima at Exploratory Research Laboratory, Fundamental Research Laboratories, 34 Miyukigaoka, Tsukuba, 305 Japan.
Abstract

Two-dimensional receptive fields are investigated from an information theoretic viewpoint. It is known that the spatially localized and orientation- and spatial-frequency-tuned receptive fields of simple cells in the visual cortex are well described by Gabor functions. This paper shows that the Gabor functions are derived as solutions for a certain mutual-information maximization problem. This means that, in a low signal-to-noise ratio limit, the Gabor-type receptive field can extract the maximum information from input local images with regard to their categories. Accordingly, this suggests that the receptive fields of simple cells are optimally designed from an information theoretic viewpoint. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: Gabor function; Information; Visual cortex; Simple cell; Receptive field
1. Introduction

The spatially localized and spatial-frequency (and orientation)-tuned receptive field (RF) of a simple cell in the visual cortex is well described by the Gabor function (Marcelja, 1980; Daugman, 1980), which is defined as a plane wave, or a complex exponential function, localized by a Gaussian envelope $G_\sigma(x)$ (Gabor, 1946; Daugman, 1985):

$$\phi(x; k_0) = G_\sigma(x)\,\exp(i k_0 \cdot x) \qquad (1)$$

Its Fourier transform $\tilde{\phi}(k; k_0)$ is also localized, around $k_0$, in the frequency domain. Therefore, as a filter it shows (broadly tuned) band-pass characteristics (Pollen and Ronner, 1983). An example of the (two-dimensional) Gabor function is shown in Fig. 1. The problem considered in this paper is why the visual system in the brain adopts such a function to analyse its input.

Okajima (1997) showed that the well-known property of the Gabor function, that it is maximally localized in the space and frequency domains, is closely related to its information theoretic capability. Following Linsker's approach (Linsker, 1988, 1993), he showed that the Gabor function is derived as a solution for a certain mutual-information maximization problem. This means that, under rather general assumptions, it can extract the maximum information from input local signals. Thus, he suggested that the RFs of simple cells must be optimally designed from an information theoretic viewpoint.

To explain the two-dimensional RFs of simple cells in more detail, one has to investigate RF functions in two-dimensional cases. Okajima (1997) did not, however, deal explicitly with the two-dimensional cases, which present some difficulty, as we shall see in Section 3. Thus, the purpose of this paper is to investigate two-dimensional RF functions from an information theoretic viewpoint.

The next section describes the two objective functions to be investigated in this paper. The first one, which is basically the same as that used by Okajima (1997), will be analysed in Section 3. It will be shown that in two-dimensional cases, when the input signal statistics are isotropic, maximization of this objective function does not result in orientation-tuned Gabor functions. In contrast, it will be shown in Section 4 that the other objective function leads to the Gabor function as an optimal RF function even in two-dimensional cases. Thus, the result still suggests that the Gabor-type RFs of simple cells are optimally designed under a certain information theoretic criterion, which is, however, slightly different from that used in Okajima (1997). Preliminary results are presented in Okajima (1996).
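As a concrete illustration of Eq. (1), the following short sketch samples a two-dimensional Gabor function on a pixel grid; the grid size, envelope width and preferred frequency are arbitrary illustrative choices, not values from the paper. Its real and imaginary parts correspond to the cosine- and sine-type functions of Fig. 1(a) and (b).

```python
import numpy as np

def gabor_2d(size, k0, sigma):
    """Sample the complex Gabor function of Eq. (1),
    phi(x; k0) = G_sigma(x) exp(i k0 . x), on a size x size grid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))   # Gaussian window G_sigma(x)
    carrier = np.exp(1j * (k0[0] * x + k0[1] * y))          # plane wave exp(i k0 . x)
    return envelope * carrier

# Real part: cosine-type Gabor (Fig. 1(a)); imaginary part: sine-type (Fig. 1(b)).
phi = gabor_2d(size=64, k0=(0.5, 0.0), sigma=8.0)
```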
Fig. 1. An example of the Gabor function. (a) The real part of the function (or the cosine-type Gabor function). (b) The imaginary part of the function (or the sine-type Gabor function).
By using computer simulations, Kohonen (1994) demonstrated that Gabor-function-like RFs are self-organized through a learning algorithm called the adaptive-subspace self-organizing feature map (ASSOM). Olshausen and Field (1996) also used computer simulations to demonstrate that Gabor-function-like RFs can emerge by learning a sparse code for natural images. The results obtained in this paper might be related to these simulation results, although the relationship is not yet clear.
2. Objective functions

2.1. Feature extraction and the information obtained by measuring the feature
Shift invariance of the input signal statistics is assumed throughout this paper. That is, it is assumed that if a certain image $f_g(x)$ is presented, its displaced images $f_g(x - x_0)$ will also be presented with the same probability. In this case, any image can be formally specified by two parameters: $g$, which specifies its category, and $x_0$, which specifies its position. Now suppose we extract a feature $a$ from an input image $f(x)$ using an RF function $\phi(x)$:

$$a = (\phi, f + n) \equiv \sum_x \phi(x)\{f(x) + n(x)\} \qquad (2)$$

Here $n$ represents an additive noise, assumed to be uncorrelated Gaussian noise independent of the image. We obtain a certain amount of information by measuring the feature. This paper considers the following two measures for this amount of information:

$$MI_1[\phi] = H[a] - H[a|f] = H[a] - H[a|G, X_0] \qquad (3)$$

$$MI_2[\phi] = H[a] - H[a|G] \qquad (4)$$

Here $MI_1$ is the mutual information between $a$ and $f$, where $H[a] = -\sum_a P(a)\log[P(a)]$ and $H[a|f] = -\sum_f P(f)\sum_a P(a|f)\log[P(a|f)]$ are respectively the entropy and the conditional entropy, which are defined using the probability $P$. On the other hand, $MI_2$ represents the mutual information between $a$ and $g$; that is, the average amount of information obtained about the category of the image by measuring the feature $a$. In Eq. (4), $H[a|G]$ denotes the conditional entropy defined by $H[a|G] = -\sum_g P(g)\sum_a P(a|g)\log[P(a|g)]$.

Below, I shall determine the optimal RF function $\phi$ by maximizing the mutual information $MI_1$ or $MI_2$. Before that, however, a 'localization term' will be introduced into the objective functions.

2.2. Localization term

Empirically, most of the features that have been useful for analysing images are localized ones. Accordingly, Okajima (1997) confined himself to localized features, and determined the RF function by maximizing $MI_1$ with local signals considered as signals (the local signals are obtained by seeing the original signals through a certain window function $w(x)$). Through such a procedure, he derived the 'localization term'

$$-\mu \sum_x u(x)|\phi(x)|^2 / \|\phi\|^2 \qquad (5)$$

in the objective function to be maximized. Here, $\mu$ is a Lagrange multiplier and $u(x) \equiv 1/w(x)^2$ is a 'localization potential' defined by the window function, which is assumed to be well approximated by the second-order expansion around its minimum:

$$u(x) \approx u_0 + b x^2 \qquad (6)$$

Meanwhile, we can show that we obtain almost the same result as in Okajima (1997) if we start by explicitly assuming this 'localization term' (5) and maximize

$$l_1 = MI_1[\phi] - \mu \sum_x u(x)|\phi(x)|^2 / \|\phi\|^2 \qquad (7)$$

instead of maximizing Eq. (3) for local signals. In this case, however, $\mu$ is not a Lagrange multiplier, but a fixed parameter determining the relative weight of the 'localization term', or a penalty term for non-localized RF functions. In this paper I also confine myself to localized RF functions and, for simplicity, I shall take the latter approach. That is, I shall determine the optimal RF function by maximizing $l_1$ in Eq. (7) or $l_2$ in Eq. (8):

$$l_2 = MI_2[\phi] - \mu \sum_x u(x)|\phi(x)|^2 / \|\phi\|^2 \qquad (8)$$
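To make these quantities concrete, the following minimal sketch computes the feature $a$ of Eq. (2) and the localization penalty of Eqs. (5) and (7) for a given RF function; the grid size, potential parameters and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def feature(phi, f, sigma_n):
    """Eq. (2): a = (phi, f + n), with n uncorrelated Gaussian noise.
    np.vdot conjugates phi; for a real-valued phi this is exactly Eq. (2)."""
    n = sigma_n * rng.standard_normal(f.shape)
    return np.vdot(phi, f + n)

def localization_penalty(phi, u, mu):
    """The term of Eqs. (5)/(7): mu * sum_x u(x)|phi(x)|^2 / ||phi||^2."""
    p2 = np.abs(phi) ** 2
    return mu * np.sum(u * p2) / np.sum(p2)

# Illustrative usage on a 64x64 grid with the quadratic potential of Eq. (6),
# u(x) = u0 + b x^2 (u0, b and mu are arbitrary choices here).
half = 32
y, x = np.mgrid[-half:half, -half:half].astype(float)
u = 0.0 + 1e-3 * (x**2 + y**2)
phi = np.exp(-(x**2 + y**2) / (2 * 8.0**2)) * np.exp(1j * 0.5 * x)  # a Gabor RF
f_img = rng.standard_normal(phi.shape)                               # a random 'image'
a = feature(phi, f_img, sigma_n=1.0)
penalty = localization_penalty(phi, u, mu=0.1)
```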
3. RF functions maximizing the objective function $l_1$

Okajima (1997) analysed a mutual-information maximization problem whose objective function is basically the same as Eq. (7) and showed that when the power spectrum of the signals has a peak at a frequency $k_0$, a Gabor function of frequency $k_0$ becomes the optimal RF function under rather general assumptions. This section first summarizes the results obtained by Okajima (1997), some of which will be used in Section 4. Then, a problem in two-dimensional cases will be discussed.

Okajima (1997) assumed a low signal-to-noise ratio (SNR) limit in his analysis (when the input signal statistics
are Gaussian, this assumption is not necessary). In a low SNR limit, the mutual information $MI_1$ can be approximated as

$$MI_1 \approx \frac{(\phi, C\phi)}{2\sigma_n^2 \|\phi\|^2} \qquad (9)$$

where $C$ is the covariance matrix of $f$ and $\sigma_n^2$ is the variance of the noise. Therefore, we can maximize Eq. (7) by maximizing

$$l_1' = (\phi, C\phi) - \mu'\sum_x u(x)|\phi(x)|^2 = \sum_k p(k)|\tilde{\phi}(k)|^2 - \mu'\sum_x u(x)|\phi(x)|^2 \qquad (10)$$
under the constraint $\|\phi\|^2 = \sum_x |\phi(x)|^2 = 1$. Here $\mu' = 2\sigma_n^2\mu$ is a constant, $p(k)$ is the power spectrum of the input signals, and $\tilde{\phi}(k)$ is the Fourier transform of the RF function $\phi(x)$. In deriving the right-hand side of Eq. (10), I used the equality $(\phi, C\phi) = \sum_k p(k)|\tilde{\phi}(k)|^2$, which is valid when $C$ is shift invariant. Then, by expanding $p(k)$ and $u(x)$ around their maximum or minimum up to the second order as $p(k) \approx p_0 - a(k - k_0)^2$ and $u(x) \approx u_0 + bx^2$, Eq. (10) is rewritten as

$$l_1' \approx -\left[ a\sum_k (k - k_0)^2|\tilde{\phi}(k)|^2 + \mu' b\sum_x x^2|\phi(x)|^2 \right] + p_0 - \mu' u_0 \qquad (11)$$
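The role of the maximal-localization property can be stated compactly. The following one-dimensional uncertainty relation (a standard result, not spelled out in this form in the paper) shows why a Gabor function exactly maximizes Eq. (11):

$$\left(\sum_k (k - k_0)^2|\tilde{\phi}(k)|^2\right)\left(\sum_x x^2|\phi(x)|^2\right) \;\ge\; \frac{1}{4}\,\|\phi\|^4,$$

with equality, in the continuum idealization, precisely when $\phi(x) = G_\sigma(x)\exp(ik_0 x)$, i.e., for a Gabor function. Writing $\Delta k^2 = \sum_k (k - k_0)^2|\tilde{\phi}(k)|^2$ and $\Delta x^2 = \sum_x x^2|\phi(x)|^2$ with $\|\phi\| = 1$, minimizing the bracketed term $a\,\Delta k^2 + \mu' b\,\Delta x^2$ of Eq. (11) subject to $\Delta k^2\,\Delta x^2 \ge 1/4$ forces equality and gives $\Delta k^2 = \tfrac{1}{2}\sqrt{\mu' b/a}$, $\Delta x^2 = \tfrac{1}{2}\sqrt{a/(\mu' b)}$, which fixes the width of the Gaussian envelope.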
Okajima (1997) showed that the Gabor function $\phi(x; k_0)$ of frequency $k_0$ becomes the optimal RF function that exactly maximizes objective function (11), by using the well-known property that the Gabor function is maximally localized in the space and frequency domains.

3.1. 'Effective power spectrum'

Since the Gabor function $\phi(x; k_0)$ thus obtained is localized in space, we actually have to prepare a set of Gabor functions $\phi(x - x_0; k_0)$, whose centers $x_0$ are located at various positions in space, in order to analyse input signals all over space. Now, suppose that we have measured the features $a(x_0; k_0) = \sum_x \phi(x - x_0; k_0)f(x)$ at positions $x_0$, which are sampled in space with an interval satisfying the sampling theorem. Then, what will be the next optimal RF function, provided we already know these $a(x_0; k_0)$? Intuitively speaking, the frequency components of the signals around $k_0$ are already extracted by the set of Gabor functions $\phi(x - x_0; k_0)$. Accordingly, the power spectrum $p'(k)$ of the remaining frequency components will have the profile depicted schematically in Fig. 2, with a new peak at a frequency $k_0'$ (see Okajima, 1997). Then, by considering this $p'(k)$ to be a new 'effective power spectrum', the procedure described in the previous subsection can be repeated to obtain a new Gabor function of frequency $k_0'$. By repeating these procedures again and again, we can obtain Gabor functions of various frequencies in sequence.

Fig. 2. Schematic representations of the power spectrum $p(k)$ and an 'effective power spectrum' $p'(k)$, i.e., the remaining frequency components of $p(k)$ that are not extracted by the Gabor function $\phi(x; k_0)$.
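In code, this sequential procedure might be sketched as follows. Note that the subtractive update for $p'(k)$ below is only a schematic stand-in for the exact expression derived in Okajima (1997), chosen to reproduce the qualitative profile of Fig. 2; the spectrum, bandwidth and iteration count are likewise illustrative assumptions.

```python
import numpy as np

def next_effective_spectrum(p, k, k0, bandwidth):
    """Schematic update: attenuate the components around k0 that the set of
    Gabor filters phi(x - x0; k0) has already extracted.  The Gaussian
    attenuation profile is an illustrative assumption, not the exact rule."""
    extracted = np.exp(-(k - k0)**2 / (2.0 * bandwidth**2))
    return p * (1.0 - extracted)

# Repeatedly peel off the spectrum around the current peak, obtaining
# Gabor functions of frequencies k0, k0', ... in sequence.
k = np.linspace(0.0, np.pi, 512)
p = np.exp(-k**2)                      # an illustrative low-pass spectrum
for _ in range(3):
    k0 = k[np.argmax(p)]               # peak of the current 'effective' spectrum
    p = next_effective_spectrum(p, k, k0, bandwidth=0.3)
```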
3.2. Two-dimensional isotropic cases

In two-dimensional cases, however, objective function (7) does not give a Gabor function (of nonzero frequency) when the input image statistics are isotropic. This is because, in two-dimensional isotropic cases, the power spectrum (or the 'effective power spectrum') cannot have an isolated peak, except when it has a peak at zero frequency. Meanwhile, in order to derive a Gabor function as a solution for the maximization problem, we need a power spectrum having an isolated peak. For example, suppose the power spectrum has a profile with a peak at zero frequency. Then, under appropriate assumptions, a Gabor function of zero frequency (i.e., a Gaussian) is obtained as the optimal RF function. However, when we extract the frequency components around zero frequency with the set of this RF function, the remaining 'effective power spectrum' might have a crater-like profile, which does not have an isolated peak (see Fig. 3). Thus we see that, in two-dimensional isotropic cases, one cannot obtain Gabor functions of nonzero frequencies by maximizing objective function (7).

Fig. 3. A crater-like two-dimensional isotropic power spectrum (schematic representation).

Let us approximate the 'localization potential' as $u(x) \approx u_0 + bx^2$, and consider the problem of maximizing

$$\sum_k p(k)|\tilde{\phi}(k)|^2 - \mu' b\sum_x x^2|\phi(x)|^2 \qquad (12)$$
under the constraint $\|\phi\|^2 = 1$. By using the equality

$$\sum_x x^2|\phi(x)|^2 = -\sum_k \tilde{\phi}^*(k)\,\nabla_k^2\,\tilde{\phi}(k) \qquad (13)$$

we obtain the following eigen equation from the objective function (12):

$$\left[-p(k) - \mu' b\,\nabla_k^2\right]\tilde{\phi}(k) = \lambda\,\tilde{\phi}(k) \qquad (14)$$

The solution which maximizes (12) is the eigenfunction corresponding to the lowest eigenvalue. Meanwhile, when $p(k)$ is isotropic, the lowest 'energy state' of Eq. (14) is an 's-state', which is also isotropic. Accordingly, a Gabor function of nonzero frequency cannot be the solution, since it has an oriented profile and is not isotropic.
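Eq. (14) is a Schrödinger-type eigenproblem in the frequency domain, with $-p(k)$ acting as a potential well and $-\mu' b\,\nabla_k^2$ as a kinetic term. A minimal one-dimensional finite-difference sketch is given below; the grid, the peaked spectrum and the value of $\mu' b$ are illustrative choices, and a 1D section can of course only illustrate the mechanics, not the isotropy argument itself.

```python
import numpy as np

def lowest_state(p, dk, mu_b):
    """Discretize [-p(k) - mu'b d^2/dk^2] phi~ = lambda phi~ (Eq. (14)) and
    return the lowest eigenvalue and its eigenfunction (the maximizer of (12))."""
    n = len(p)
    lap = (np.diag(-2.0 * np.ones(n)) +
           np.diag(np.ones(n - 1), 1) +
           np.diag(np.ones(n - 1), -1)) / dk**2        # finite-difference Laplacian
    H = -np.diag(p) - mu_b * lap
    vals, vecs = np.linalg.eigh(H)
    return vals[0], vecs[:, 0]

k = np.linspace(-np.pi, np.pi, 257)
p = np.exp(-(k - 1.0)**2 / 0.1)        # a spectrum peaked at k0 = 1 (illustrative)
lam, phi_k = lowest_state(p, dk=k[1] - k[0], mu_b=1e-2)
# phi_k is localized around k0, i.e., a Gabor-like profile in the k-domain.
```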
4. RF functions maximizing the objective function $l_2$

Next, let us move on to objective function (8). It will be shown that maximization of this objective function results in a Gabor function even in two-dimensional isotropic cases.

Here let us also assume a low SNR limit. Then the first term in $MI_2$ (see Eq. (4)) is expanded in $1/\sigma_n$ ($\sigma_n^2$: the variance of the noise) as

$$H[a] \approx \frac{1}{2} + \frac{1}{2}\log 2\pi + \frac{1}{2}\log(\sigma_n^2 + \sigma_s^2) \approx \frac{1}{2} + \frac{1}{2}\log 2\pi + \frac{1}{2}\log\sigma_n^2 + \frac{1}{2\sigma_n^2}\sum_k |\tilde{\phi}(k)|^2\langle|F(k)|^2\rangle - \frac{1}{4\sigma_n^4}\sum_{k,k'} |\tilde{\phi}(k)|^2|\tilde{\phi}(k')|^2\langle|F(k)|^2\rangle\langle|F(k')|^2\rangle \qquad (15)$$

(see Appendix A). Here $\sigma_s^2$ is defined as

$$\sigma_s^2 = (\phi, C\phi) = \sum_k p(k)|\tilde{\phi}(k)|^2 = \sum_k \langle|F(k)|^2\rangle|\tilde{\phi}(k)|^2$$

where $F(k)$ denotes the Fourier transform of the input image and $\langle\,\rangle$ denotes an averaging operation. Similarly, the second term in $MI_2$ is expanded as

$$H[a|G] \approx \frac{1}{2} + \frac{1}{2}\log 2\pi + \left\langle \frac{1}{2}\log\Big[\sigma_n^2 + \sum_k |\tilde{\phi}(k)|^2|F_g(k)|^2\Big]\right\rangle_g \approx \frac{1}{2} + \frac{1}{2}\log 2\pi + \frac{1}{2}\log\sigma_n^2 + \frac{1}{2\sigma_n^2}\sum_k |\tilde{\phi}(k)|^2\langle|F_g(k)|^2\rangle_g - \frac{1}{4\sigma_n^4}\sum_{k,k'}|\tilde{\phi}(k)|^2|\tilde{\phi}(k')|^2\langle|F_g(k)|^2|F_g(k')|^2\rangle_g \qquad (16)$$

Here, I explicitly write the averaging operation over $g$ as $\langle\,\rangle_g$. However, since the variables inside the brackets do not depend on $x_0$, this averaging operation can be replaced by the average over $g$ and $x_0$ (i.e., over $f$), $\langle\,\rangle$. Thus, Eq. (8) is rewritten as

$$l_2 = MI_2 - \mu\sum_x u(x)|\phi(x)|^2 \approx \frac{1}{4\sigma_n^4}\sum_{k,k'}\Big[\langle|F(k)|^2|F(k')|^2\rangle - \langle|F(k)|^2\rangle\langle|F(k')|^2\rangle\Big]|\tilde{\phi}(k)|^2|\tilde{\phi}(k')|^2 - \mu\sum_x u(x)|\phi(x)|^2, \qquad \|\phi\|^2 = 1 \qquad (17)$$

It should be noted that the first term on the right-hand side of Eq. (17) is second order in $|\tilde{\phi}|^2$, while that in Eq. (10) is linear in $|\tilde{\phi}|^2$. This will cause the localization of the function in the frequency domain even when the power spectrum has a crater-like profile and its maximum is 'degenerate' over every orientation.

Suppose here, for example, that the input image statistics are Gaussian. Then, using

$$\langle|F(k)|^2|F(k')|^2\rangle = \langle|F(k)|^2\rangle\langle|F(k')|^2\rangle \;\; \text{when } k \neq \pm k'; \qquad \langle|F(k)|^4\rangle = 2\langle|F(k)|^2\rangle^2 = 2p(k)^2 \;\; \text{otherwise} \qquad (18)$$

(see Appendix B), Eq. (17) is rewritten as

$$l_2 = \frac{1}{4\sigma_n^4}\sum_k p(k)^2|\tilde{\phi}(k)|^4 - \mu\sum_x u(x)|\phi(x)|^2 \qquad (19)$$

We see that the first term in Eq. (19) tends to localize the RF function in the frequency domain around the region where the power spectrum takes its maximum, while the second term tends to localize the function in the space domain. This is the same as in objective function (10). An important point here, however, is that even when the power spectrum has a crater-like profile as shown in Fig. 3, the first term is larger when the function is localized around a certain frequency on the ridge of the crater than when it is extended all over the ridge. To see this, we rewrite Eq. (19) as

$$l_2 = \sum_k p''(k)|\tilde{\phi}(k)|^2 - \mu\sum_x u(x)|\phi(x)|^2 \qquad (20)$$

where $p''(k)$ is defined as

$$p''(k) = \frac{1}{4\sigma_n^4}\,p(k)^2|\tilde{\phi}(k)|^2 \qquad (21)$$

As described in Section 3, we already know that a Gabor function is the solution maximizing $l_2$ in Eq. (20) when $p''$ has a peak at a certain frequency. In this case, however, since $p''$ depends on the solution $\phi$ itself, Eqs. (20) and (21) must be solved self-consistently.
Let us suppose temporarily that a Gabor function $\phi(x; k_0)$ of frequency $k_0$ ($k_0$ being a certain frequency on the ridge of the crater-like power spectrum) is a solution. Then, from Eq. (21), $p''$ will have a peak at the frequency $k_0$, because the Fourier transform of the Gabor function, $\tilde{\phi}(k; k_0)$, is a Gaussian with its center at $k_0$. Meanwhile, we already know that when $p''$ has a peak at $k_0$, the solution maximizing Eq. (20) is the Gabor function of frequency $k_0$. Accordingly, we see that this Gabor function was actually a self-consistent solution.

Fig. 4 shows a numerical solution obtained by solving the maximization problem in Eq. (19) with a computer. We see that even when the image statistics are isotropic, an oriented, Gabor-like RF function is obtained from objective function (8).

Fig. 4. Numerical solutions that maximize objective function (19). The power spectrum of an uncorrelated random signal filtered through an isotropic two-dimensional DoG (difference of Gaussians) filter is adopted for $p(k)$, and an exponential function, $\exp(cx^2)$, is adopted for the localization potential $u(x)$.
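The self-consistency argument can be mimicked numerically by a fixed-point iteration that alternates Eq. (21), recomputing $p''$ from the current $\tilde{\phi}$, with the eigenproblem of Eq. (20) (cf. Eq. (14)). The one-dimensional discretization below, with an illustrative crater-like spectrum and parameter values rather than those used for Fig. 4, is only a sketch of this scheme.

```python
import numpy as np

def lowest_state(p_eff, dk, mu_b):
    """Eigenfunction of [-p''(k) - mu'b d^2/dk^2] with the lowest eigenvalue,
    i.e., the maximizer of Eq. (20) on the grid (cf. Eq. (14))."""
    n = len(p_eff)
    lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / dk**2
    vals, vecs = np.linalg.eigh(-np.diag(p_eff) - mu_b * lap)
    v = vecs[:, 0]
    return v / np.sqrt(np.sum(v**2))

k = np.linspace(-np.pi, np.pi, 257)
dk = k[1] - k[0]
p = np.exp(-(np.abs(k) - 1.0)**2 / 0.05)     # crater-like spectrum (1D section)
sigma_n = 1.0

# Break the degeneracy slightly and iterate Eqs. (20)-(21) to a fixed point.
phi_k = np.exp(-(k - 1.0)**2)                # initial guess near one ridge point
for _ in range(50):
    p_eff = p**2 * phi_k**2 / (4 * sigma_n**4)   # Eq. (21)
    phi_k = lowest_state(p_eff, dk, mu_b=1e-3)   # maximize Eq. (20)
# phi_k converges to a state localized around a single ridge frequency,
# whose inverse transform is a Gabor-like RF.
```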
5. Discussion

We started from objective function (8) and found that, in a low SNR limit, it can be rewritten as the objective function in Eq. (17). We also found that objective function (17) leads to an oriented, two-dimensional Gabor function as an optimal RF function even when the input image statistics are isotropic.

In the previous section, we considered a case where the input image statistics are Gaussian. I think, however, that the result may be general, and not confined to Gaussian cases. Let us consider here another simple example. Suppose a sine-wave grating of frequency $k$ and of constant amplitude is shown as an input image, each grating presented with a probability $P(k)$. In this case, Eq. (17) is calculated as

$$l_2 = \sum_k p''(k)|\tilde{\phi}(k)|^2 - \mu\sum_x u(x)|\phi(x)|^2 \qquad (22)$$

where

$$p''(k) \propto P(k)\Big(|\tilde{\phi}(k)|^2 - \overline{|\tilde{\phi}|^2}\Big), \qquad \overline{|\tilde{\phi}|^2} \equiv \sum_k P(k)|\tilde{\phi}(k)|^2 \qquad (23)$$

(note that $\langle|F(k)|^2|F(k')|^2\rangle - \langle|F(k)|^2\rangle\langle|F(k')|^2\rangle \propto P(k)\delta_{k,k'} - P(k)P(k')$ holds in this case). Therefore, we can follow the same procedure as described in Section 4 to see that a Gabor function becomes a self-consistent solution maximizing Eq. (22), even when the probability $P(k)$ is isotropic with a crater-like profile.
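This claim can be checked with a few lines of arithmetic. In the sketch below, $N$ grating frequencies on the ridge are taken to be equally probable, and the first term of Eq. (22) is evaluated for a $\tilde{\phi}$ concentrated on a single ridge frequency and for one spread uniformly over the ridge; the discretization and normalization are illustrative assumptions.

```python
import numpy as np

def first_term(P, phi2):
    """Sum_k p''(k)|phi~(k)|^2 with p''(k) ~ P(k)(|phi~(k)|^2 - mean),
    i.e., Eqs. (22)-(23) up to a positive constant."""
    mean = np.sum(P * phi2)                        # bar{|phi~|^2} of Eq. (23)
    return np.sum(P * (phi2 - mean) * phi2)

N = 64
P = np.full(N, 1.0 / N)          # isotropic: equal probability on the ridge

phi2_local = np.zeros(N); phi2_local[0] = 1.0    # concentrated on one frequency
phi2_spread = np.full(N, 1.0 / N)                 # spread over the whole ridge

print(first_term(P, phi2_local))    # (1/N)(1 - 1/N): strictly positive
print(first_term(P, phi2_spread))   # 0: the spread state gains nothing
```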
Neurons in the visual cortex receive their input signals as spike trains, whose frequency is thought to code the strength of the signal. A trade-off must then exist between the SNR and the processing speed: if we require a higher SNR, the neurons will need a longer averaging time to evaluate the frequency. On the other hand, if we require a high processing speed, the neurons must deal with signals of low SNR. In the latter case, the low SNR assumption made in this paper might be justified.

Even when we look at a single object, its retinal image might be projected at slightly different positions from time to time. Accordingly, we expect a neuron in the visual cortex to 'see' the same feature (such as a line or an edge of a certain orientation) at various positions within its receptive field. Objective function (7) requires the neurons to discriminate a feature from one presented at a displaced position even when the feature itself is the same. Objective function (8), however, requires the neurons to discriminate a feature from others only when their categories (such as orientations) are different. The result obtained in this paper suggests that the RFs of simple cells in the visual cortex are designed based on the latter strategy.

It should be emphasized that objective function (8) leads to a Gabor function even when the input signal statistics are Gaussian. Let us imagine that the visual system is equipped with a certain learning mechanism which self-organizes the RFs of simple cells in such a way that objective function (8) is maximized. If this is the case, from the results in Section 4, we expect that the learning mechanism will self-organize Gabor-function-like oriented receptive fields even before animals open their eyes, driven by the random spontaneous firing of retinal ganglion cells. This expectation is in accordance with some self-organization simulation results (Miyashita and Tanaka, 1992; Miyashita et al., 1997) as well as with the experimental finding that orientation selectivity of visual cortical neurons is observed even before the animals have visual experience (Wiesel and Hubel, 1974).

Acknowledgements

This work was performed under the management of FED as part of the MITI R&D of Industrial Science and
Technology Frontier programme (Bioelectronic Devices project) supported by NEDO.
Appendix A. Derivation of Eqs. (15)–(17)

Let us rewrite Eq. (2) as $a = a_s + a_n$, where $a_s = (\phi, f)$ represents the signal component and $a_n = (\phi, n)$ represents the noise component. We are assuming that $a_s$ and $a_n$ are mutually independent, and that $a_n$ is a Gaussian variable. Then, in a low SNR limit, we may regard $P(a)$ and $P(a|g)$ as almost Gaussian, whose variances are respectively written as $\sigma^2 = \sigma_s^2 + \sigma_n^2$ and $\sigma'^2_g = \sigma_{s,g}^2 + \sigma_n^2$, where $\sigma_s^2$ and $\sigma_{s,g}^2$ are given by $\sigma_s^2 = \sum_k |\tilde{\phi}(k)|^2\langle|F(k)|^2\rangle$ and $\sigma_{s,g}^2 = \sum_k |\tilde{\phi}(k)|^2|F_g(k)|^2$.

If $P(a)$ is exactly Gaussian, the entropy $H[a]$ is exactly calculated as $\frac{1}{2}\log 2\pi e\sigma^2$ (see Rieke et al., 1997). When $P(a)$ and $P(a|g)$ are almost Gaussian, $H[a]$ and $H[a|G]$ are respectively expressed as

$$H[a] = \frac{1}{2}\log 2\pi e\sigma^2 + h, \qquad H[a|G] = \left\langle \frac{1}{2}\log 2\pi e\sigma'^2_g + h' \right\rangle_g \qquad (A1)$$

where $h$ and $h'$ represent correction terms whose order is, as shown below, $o(1/\sigma_n^6)$. Accordingly, by expanding Eq. (A1) in $1/\sigma_n$ up to the fourth order, we obtain Eq. (17).

The order of magnitude of the correction term $h$ is estimated below. For simplicity, let us assume $\langle a_s\rangle = 0$ and $\langle a_n\rangle = 0$. Let us also assume that the characteristic function $F(y) = \int P(a)\exp(iya)\,da$ and the cumulant function $W(y) = \log F(y)$ exist, and that the following cumulant expansion is possible:

$$W(y) = \sum_{m \neq 0} \frac{(iy)^m}{m!}\langle a^m\rangle_c \qquad (A2)$$

where $\langle a^m\rangle_c$ is the $m$-th cumulant (Kubo, 1962). Since $a_s$ and $a_n$ are independent and $a_n$ is a Gaussian variable, the expansion (A2) is rewritten as

$$W(y) = -\frac{\sigma_s^2 + \sigma_n^2}{2}y^2 - \frac{i e_3}{6}y^3 + \frac{e_4}{24}y^4 + \cdots \qquad (A3)$$

where $e_3, e_4, \ldots$ are the cumulants of the signal component $a_s$, i.e., $e_3 = \langle a_s^3\rangle_c$, $e_4 = \langle a_s^4\rangle_c$, and so on. In deriving Eq. (A3), we used $\langle a^2\rangle_c = \sigma^2$ and $\langle a_n^m\rangle_c = 0$ ($m \geq 3$) for the Gaussian variable $a_n$ (Kubo, 1962).

Now, let us expand the correction term $h$ in these cumulants. Three points should be made here. First, we see that when all the higher order cumulants $e_3, e_4, \ldots$ are zero, the correction term $h$ also becomes zero, since in this case $P(a)$ becomes Gaussian (Kubo, 1962). Second, the expansion begins from the second-order terms. This is because when the variance is constant, the entropy $H[a]$ takes its maximum when all the higher order cumulants $e_3, e_4, \ldots$ are zero; i.e., when $P(a)$ is Gaussian (see Rieke et al., 1997). Third, we see that $h$ depends on $e_3, e_4, \ldots$ only in the form of $(e_3/\sigma^3), (e_4/\sigma^4), \ldots$. Therefore, using the above properties, we see that the most dominant term in $h$ is $(e_3/\sigma^3)^2$, whose order is $o(1/\sigma_n^6)$.

To see the third point stated above, let us formally write the probability distribution as $P(a) = \frac{1}{2\pi}\int \exp(W(y))\exp(-iya)\,dy$. By substituting Eq. (A3) and introducing $y' = \sigma y$, this is rewritten as

$$P(a) = \frac{1}{2\pi}\int \exp\left(-\frac{y'^2}{2} - \frac{i(e_3/\sigma^3)}{6}y'^3 + \frac{(e_4/\sigma^4)}{24}y'^4 + \cdots\right)\exp(-iy'a/\sigma)\,\frac{dy'}{\sigma} \qquad (A4)$$

which indicates that $P(a)$, and accordingly $H[a]$, also depend on $e_3, e_4, \ldots$ only in the form of $(e_3/\sigma^3), (e_4/\sigma^4), \ldots$.

Finally, I show that when the variance is constant, the entropy $H[a]$ takes its maximum when the probability distribution $P(a)$ is Gaussian. Suppose a stochastic variable $x$ takes a value $x_i$ with a probability $P_i$. Then the entropy is written as $H[x] = -\sum_i P_i\log P_i$. We want to maximize $H$ under the constraint $\sum_i x_i^2 P_i = \sigma^2$ (the mean is assumed to be zero for simplicity). Since we also have another obvious constraint $\sum_i P_i = 1$, we obtain

$$\delta\left[-\sum_i P_i\log P_i - m_1\sum_i P_i - m_2\sum_i x_i^2 P_i\right] = 0 \qquad (A5)$$

$$\Rightarrow\; -\sum_i \delta P_i\left[\log P_i + 1 + m_1 + m_2 x_i^2\right] = 0 \;\Rightarrow\; P_i = \mathrm{const}\cdot\exp(-m_2 x_i^2)$$

The Lagrange multiplier $m_2$ is determined by the constraint as $m_2 = 1/(2\sigma^2)$. The order of $h'$ can also be estimated as $o(1/\sigma_n^6)$ by the same procedure.
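The Gaussian entropy formula used in Eq. (A1) is easy to verify numerically. The following minimal check (the grid range and resolution are arbitrary choices) discretizes the Gaussian density and compares the Riemann-sum entropy with $\frac{1}{2}\log 2\pi e\sigma^2$.

```python
import numpy as np

# Check H[a] = (1/2) log(2 pi e sigma^2) for a Gaussian (used in Eq. (A1))
# by discretizing the density; the factor dx converts the sum to an integral.
sigma = 2.0
x = np.linspace(-10 * sigma, 10 * sigma, 20001)
dx = x[1] - x[0]
P = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
H_num = -np.sum(P * np.log(P)) * dx                 # differential entropy estimate
H_exact = 0.5 * np.log(2 * np.pi * np.e * sigma**2)
print(H_num, H_exact)                               # agree to high accuracy
```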
Appendix B. Derivation of Eq. (18)

Let us define the Fourier transform of an image $f$ as $F(k) = \frac{1}{\sqrt{N}}\sum_x f(x)\exp(-ikx)$, where $N = \sum_x 1$. Then, the first term in Eq. (18) is written as

$$\langle|F(k)|^2|F(k')|^2\rangle = \frac{1}{N^2}\sum_{x_1,x_2,x_3,x_4}\langle f(x_1)f(x_2)f(x_3)f(x_4)\rangle\exp\big(-ik(x_1 - x_2) - ik'(x_3 - x_4)\big) \qquad (B1)$$

Let us assume here for simplicity $\langle f\rangle = 0$. Then we have (Kubo, 1962)

$$\langle f(x_1)f(x_2)f(x_3)f(x_4)\rangle = \langle f(x_1)f(x_2)f(x_3)f(x_4)\rangle_c + \langle f(x_1)f(x_2)\rangle\langle f(x_3)f(x_4)\rangle + \langle f(x_1)f(x_3)\rangle\langle f(x_2)f(x_4)\rangle + \langle f(x_1)f(x_4)\rangle\langle f(x_2)f(x_3)\rangle \qquad (B2)$$

When the statistics of $f$ are Gaussian, the first term on the right-hand side of Eq. (B2) becomes zero because, in this case, all the cumulants higher than the second order become zero (Kubo, 1962). By substituting Eq. (B2) into Eq. (B1), we obtain Eq. (18) (note that, for example, $\langle f(x_1)f(x_2)\rangle = c(x_1 - x_2)$ holds, where $c(x)$ is the correlation function, which satisfies $\sum_x c(x)\exp(-ikx) = p(k) = \langle|F(k)|^2\rangle$).
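The moment relations (18) can also be checked by a small Monte Carlo experiment. The sketch below uses white Gaussian signals; the signal length, frequency indices and number of trials are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of Eq. (18): for Gaussian signals,
# <|F(k)|^4> = 2 <|F(k)|^2>^2, and <|F(k)|^2 |F(k')|^2> factorizes for k != +-k'.
rng = np.random.default_rng(0)
N, trials = 64, 20000
F = np.fft.fft(rng.standard_normal((trials, N)), axis=1) / np.sqrt(N)
P2 = np.abs(F)**2

k1, k2 = 5, 11                                   # two frequencies with k2 != +-k1
print(np.mean(P2[:, k1]**2), 2 * np.mean(P2[:, k1])**2)        # ratio close to 1
print(np.mean(P2[:, k1] * P2[:, k2]),
      np.mean(P2[:, k1]) * np.mean(P2[:, k2]))                  # approximately equal
```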
References

Daugman, J.G. (1980). Two-dimensional spectral analysis of cortical receptive field profiles. Vision Research, 20, 847–856.

Daugman, J.G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A, 2, 1160–1169.

Gabor, D. (1946). Theory of communication. Journal of the Institution of Electrical Engineers, 93, 429–457.

Kohonen, T. (1994). Self-Organizing Feature Map. Springer-Verlag, New York.

Kubo, R. (1962). Generalized cumulant expansion method. Journal of the Physical Society of Japan, 17, 1100–1120.

Linsker, R. (1988). Self-organization in a perceptual network. Computer, 21 (3), 105–117.

Linsker, R. (1993). Deriving receptive fields using an optimal encoding criterion. In: Hanson, S.J., Cowan, J.D., & Giles, C.L. (Eds.), Advances in Neural Information Processing Systems, vol. 5. Morgan Kaufmann, San Mateo, CA, pp. 953–960.

Marcelja, S. (1980). Mathematical description of the responses of simple cortical cells. Journal of the Optical Society of America, 70, 1297–1300.

Miyashita, M., & Tanaka, S. (1992). A mathematical model for the self-organization of orientation columns in visual cortex. NeuroReport, 3, 69–72.

Miyashita, M., Kim, D.-S., & Tanaka, S. (1997). Cortical direction selectivity without directional experience. NeuroReport, 8, 1187–1192.

Okajima, K. (1996). The Gabor-type RF as derived by the mutual-information maximization. Extended Abstracts of the International Workshop on Brainware, Tokyo, Japan, pp. 119–121.

Okajima, K. (1997). The Gabor function extracts the maximum information from input local signals. Neural Networks, 11, 435–439.

Olshausen, B.A., & Field, D.J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381, 607–609.

Pollen, D.A., & Ronner, S.F. (1983). Visual cortical neurons as localized spatial frequency filters. IEEE Transactions on Systems, Man, and Cybernetics, SMC-13, 907–916.

Rieke, F., Warland, D., de Ruyter van Steveninck, R., & Bialek, W. (1997). Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA.

Wiesel, T.N., & Hubel, D.H. (1974). Ordered arrangement of orientation columns in monkeys lacking visual experience. Journal of Comparative Neurology, 158, 307–318.