Fisher information in Poissonian model neurons

Eitan Z. Gross

Statistical Analyses and Bioinformatics, 211 Cambridge Place Drive, Little Rock, AR 72227, USA
Article history: Received 2 November 2018

Keywords: Fisher information; Mutual information; Poissonian neurons; Stam's inequality
Abstract

Mutual information (MI) is widely used to analyze the neural code in a variety of stochastic neuronal sensory systems. Unfortunately, MI is analytically tractable only for simple coding problems. One way to address this difficulty is to relate MI to Fisher information, which is easier to compute and to interpret with regard to neurophysiological parameters. The relationship between the two measures is not always clear, however, and often depends on the probability distribution that best describes the noise. Using Stam's inequality we show here that deviations from Gaussianity in the neuronal response distribution can result in a large overestimation of MI, even in the small-noise regime. This result is especially relevant when studying neural codes represented by Poissonian neurons.
1. Introduction

Information theory has been employed in the analysis of neural response data in a variety of sensory systems [1–3]. Such studies typically compare the Shannon mutual information obtained under different classifications of stimulus and response ensembles. Unfortunately, mutual information (MI) is analytically tractable only for simple coding problems [4]. One way to address this difficulty is to relate MI to Fisher information [5]. For many neural population coding models, Fisher information is relatively easy to compute and to interpret with regard to neurophysiological parameters (e.g., neural response gain and dynamic range) as well as psychophysical behavior (e.g., discrimination threshold; [6,7]). The relationship between MI and Fisher information is not always clear, however, and often depends on the probability distribution that best describes the noise. It is commonly believed that in the limit of small Gaussian noise, Fisher information constitutes a lower bound for MI. However, using Stam's inequality [8], it was argued recently [9] that this is not necessarily true for neurons with non-Gaussian response distributions. Furthermore, a recent study [10] has shown that when the error distribution is non-normal, Fisher information serves as an upper (rather than a lower) bound for Lindley information (an index of the expected change in probability of an observed variable and a latent attribute under knowledge of their joint distribution, relative to knowledge only of their independent marginal distributions). Here we follow that approach to show that for Poissonian neurons, Fisher information serves as an upper bound on MI. Poisson-like spike trains are believed by many in the field to be the fundamental unit of cortical communication [11–14].

2. Results

Let the scalar Y denote the output of some visual cortical neurons due to a signal S generated by the neurons in the retina in response to a stimulus θ. We introduce an additive noise ε,

\[
Y = g(S) + \varepsilon, \tag{1}
\]
where g(S) is a neuronal gain function and ε represents an arbitrary noise. For an unbiased estimator, θ̂ = g(S), we may write the Fisher information as

\[
J(\hat{\theta}) = \int \left( \frac{\partial \ln p(y \mid \hat{\theta})}{\partial \hat{\theta}} \right)^{2} p(y \mid \hat{\theta}) \, dy .
\]

Assuming an additive noise with a smooth density q(·), we can write p(y | θ̂) = q(y − θ̂). Under this assumption, J(θ̂) is independent of θ̂ and thus becomes a constant that summarizes the total local dispersion of a distribution, i.e.,

\[
J[\varepsilon] = \int \left( \frac{\partial \ln q(\varepsilon)}{\partial \varepsilon} \right)^{2} q(\varepsilon) \, d\varepsilon .
\]

The Shannon entropy is also independent of θ̂ and is identical to the noise entropy,

\[
H[\varepsilon] = -\int q(\varepsilon) \ln q(\varepsilon) \, d\varepsilon .
\]

For a given amount of Fisher information, the Shannon entropy of a continuous random variable is minimized if and only if the variable is normally distributed [8]. Thus, Stam's inequality implies that

\[
H[\varepsilon] \ge \tfrac{1}{2} \ln\!\left( \frac{2\pi e}{J[\varepsilon]} \right).
\]

Therefore, for a normally distributed noise with variance σ², the Shannon entropy is H[ε] = (1/2) ln(2πeσ²), while the Fisher information is J[ε] = 1/σ². We can use these expressions for the Shannon entropy and Fisher information to define a new measure ∆ of non-Gaussianity of the noise distribution [9],

\[
\Delta = H[\varepsilon] - \tfrac{1}{2} \ln\!\left( \frac{2\pi e}{J[\varepsilon]} \right). \tag{2}
\]
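As a numerical illustration of Eq. (2) (not part of the original paper), the Python sketch below estimates H[ε], J[ε], and ∆ on a grid for two unit-variance noise densities: a Gaussian, for which ∆ should be essentially zero, and a Laplace density, for which ∆ is strictly positive (≈ 0.27 nats analytically). The grid range and resolution are arbitrary choices.

```python
# Minimal numerical sketch (not from the paper): estimate the non-Gaussianity
# measure Delta of Eq. (2) for two unit-variance noise densities.
import numpy as np

def delta_measure(q, eps):
    """Return (H, J, Delta) for a density q sampled on the grid eps."""
    d = eps[1] - eps[0]
    q = q / np.trapz(q, eps)                      # normalize the density
    H = -np.trapz(q * np.log(q), eps)             # Shannon (differential) entropy
    dlogq = np.gradient(np.log(q), d)             # d ln q / d eps
    J = np.trapz(dlogq**2 * q, eps)               # Fisher information of the noise
    return H, J, H - 0.5 * np.log(2 * np.pi * np.e / J)

eps = np.linspace(-12, 12, 20001)
gauss = np.exp(-eps**2 / 2) / np.sqrt(2 * np.pi)   # N(0, 1)
b = 1 / np.sqrt(2)                                 # Laplace scale giving unit variance
laplace = np.exp(-np.abs(eps) / b) / (2 * b)

for name, q in [("Gaussian", gauss), ("Laplace", laplace)]:
    H, J, D = delta_measure(q, eps)
    print(f"{name:8s}  H = {H:6.3f}  J = {J:6.3f}  Delta = {D:6.3f}")
```

Any smooth density can be substituted for the two examples; by Stam's inequality the printed ∆ can never be negative, up to discretization error.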
Given an invertible gain function, g(S), we can now write the mutual information (MI) as

\[
I[\hat{\theta}, y] = H[y] - \int p(\hat{\theta}) \, H(y \mid \hat{\theta}) \, d\hat{\theta}
= H[y] - \int p(\hat{\theta}) \, H(\varepsilon) \, d\hat{\theta}. \tag{3}
\]

From Eq. (2), we get H[ε] = ∆ + (1/2) ln(2πe/J[ε]). Substituting into Eq. (3) yields
\[
I[\hat{\theta}, y] = H[y] - \int p(\hat{\theta}) \, H(y \mid \hat{\theta}) \, d\hat{\theta}
= H[y] - \int p(\hat{\theta}) \left( \Delta + \frac{1}{2} \ln\!\left( \frac{2\pi e}{J[\varepsilon]} \right) \right) d\hat{\theta}. \tag{4}
\]

Since θ̂ is unbiased, it follows that J[ε] = J(θ̂). We can thus re-write Eq. (4) as
\[
I[\hat{\theta}, y] = H[y] - \int p(\hat{\theta}) \, H(y \mid \hat{\theta}) \, d\hat{\theta}
= H[y] - \int p(\hat{\theta}) \, \Delta \, d\hat{\theta} - \frac{1}{2} \int p(\hat{\theta}) \ln\!\left( \frac{2\pi e}{J(\hat{\theta})} \right) d\hat{\theta}
\]
\[
= \left[ H[y] - H[\hat{\theta}] \right] + \left[ H[\hat{\theta}] - \frac{1}{2} \int p(\hat{\theta}) \ln\!\left( \frac{2\pi e}{J(\hat{\theta})} \right) d\hat{\theta} \right] - \Delta. \tag{5}
\]
Changing variables yields

\[
H[\hat{\theta}] - \frac{1}{2} \int p(\hat{\theta}) \ln\!\left( \frac{2\pi e}{J(\hat{\theta})} \right) d\hat{\theta}
= H[\theta] - \frac{1}{2} \int p(\theta) \ln\!\left( \frac{2\pi e}{J(\theta)} \right) d\theta. \tag{6}
\]
Consequently, Eq. (5) can be re-written as [9]

\[
I[\hat{\theta}, y] = I[\theta, y] = \left[ H[y] - H[g(S)] \right] + H[\theta] - \frac{1}{2} \int p(\hat{\theta}) \ln\!\left( \frac{2\pi e}{J(\hat{\theta})} \right) d\hat{\theta} - \Delta = I_F + H - \Delta, \tag{7}
\]

where H = H[y] − H[g(S)] and I_F = H[θ] − (1/2) ∫ p(θ̂) ln(2πe/J(θ̂)) dθ̂ is the Fisher information term. As Eq. (7) suggests, when the noise entropy term is equal to zero (i.e., H = 0) and the noise distribution is Gaussian (i.e., ∆ = 0), MI equals the Fisher information term [9,15], i.e., I[θ̂, y] = H[θ] − (1/2) ∫ p(θ̂) ln(2πe/J(θ̂)) dθ̂. Thus, only if the noise is Gaussian does the Fisher information constitute a lower bound on MI.
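As a quick sanity check of the decomposition in Eq. (7) (a sketch with assumed parameters, not the author's code), consider the fully Gaussian case: a Gaussian stimulus, an identity gain g(S) = S, and additive Gaussian noise, for which ∆ = 0 and every term in Eq. (7) has a closed form.

```python
# Hedged sketch: check I = I_F + H - Delta (Eq. (7)) in the all-Gaussian case,
# where every term has a closed form and Delta = 0.
import numpy as np

sigma_theta = 1.0          # std of the Gaussian stimulus (assumed)
sigma_eps = 0.2            # std of the additive Gaussian noise (assumed)

# Exact mutual information of the Gaussian channel y = theta + eps
I_exact = 0.5 * np.log(1.0 + sigma_theta**2 / sigma_eps**2)

# Terms of Eq. (7): J = 1/sigma_eps^2, H[theta] = 0.5*ln(2*pi*e*sigma_theta^2)
J = 1.0 / sigma_eps**2
H_theta = 0.5 * np.log(2 * np.pi * np.e * sigma_theta**2)
I_F = H_theta - 0.5 * np.log(2 * np.pi * np.e / J)            # Fisher-information term
H_term = 0.5 * np.log(1.0 + sigma_eps**2 / sigma_theta**2)    # H = H[y] - H[g(S)]
Delta = 0.0                                                    # Gaussian noise

print(f"I (exact)        = {I_exact:.4f} nats")
print(f"I_F + H - Delta  = {I_F + H_term - Delta:.4f} nats")
```

Both printed values agree exactly, and for small noise the correction H becomes negligible, recovering the usual statement that MI is well approximated by the Fisher-information term when the noise is Gaussian.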
In many cases, however, the noise distribution is not Gaussian. Under these circumstances, the Fisher information, as we show below, might constitute an upper bound on MI. To illustrate, consider a Poissonian neuron with an output (response) probability given by

\[
P(\mu, n) = \frac{\mu^{n}}{n!} \, e^{-\mu}, \tag{8}
\]

where n is the number of spikes emitted in a given time interval and µ is the average number of such spikes obtained over many trials. In Fig. 1 we plot results from computer simulations that show the ratio between MI and the Fisher information as a function of n for a Poissonian neuron with µ = 3. As can be seen, the ratio approaches unity asymptotically from below.
Fig. 1. Computer simulations of the ratio of mutual information to Fisher information (Eq. (7)), MI/I_F, vs. n for a Poissonian neuron with µ = 3. The ratio approaches unity asymptotically.
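A minimal Python sketch in the spirit of Fig. 1 (not the author's simulation code; the uniform prior, the linear gain µ = rθ, and the grid sizes are illustrative assumptions, and the mean count is scaled via r rather than plotting against n at fixed µ) compares the mutual information between stimulus and spike count with the Fisher-information term I_F of Eq. (7):

```python
# Hedged numerical sketch (not from the paper): mutual information vs. the
# Fisher-information term I_F of Eq. (7) for a Poissonian neuron.
# Assumptions: uniform prior over theta in [1, 2] and a linear gain mu = r * theta.
import numpy as np
from scipy.stats import poisson

theta = np.linspace(1.0, 2.0, 400)                 # stimulus grid
dtheta = theta[1] - theta[0]
p_theta = np.ones_like(theta)
p_theta /= np.sum(p_theta) * dtheta                # uniform prior density
H_theta = np.log(theta[-1] - theta[0])             # differential entropy of the prior

for r in [50.0, 200.0, 1000.0]:                    # gain; larger r -> more spikes
    mu = r * theta                                 # mean spike count per stimulus
    n = np.arange(0, int(mu.max() + 10 * np.sqrt(mu.max())))
    pmf = poisson.pmf(n[:, None], mu[None, :])     # P(n | theta), shape (len(n), len(theta))

    # Mutual information I(theta; n) = H[n] - E_theta[ H[n | theta] ]
    p_n = pmf @ p_theta * dtheta                   # marginal response distribution
    H_n = -np.sum(p_n * np.log(p_n + 1e-300))
    H_cond = -np.sum(pmf * np.log(pmf + 1e-300), axis=0)
    MI = H_n - np.sum(p_theta * H_cond) * dtheta

    # Fisher information of a Poisson neuron: J(theta) = (d mu/d theta)^2 / mu
    J = r**2 / mu
    I_F = H_theta - 0.5 * np.sum(p_theta * np.log(2 * np.pi * np.e / J)) * dtheta
    print(f"r = {r:6.0f}:  MI = {MI:.3f}  I_F = {I_F:.3f}  MI/I_F = {MI / I_F:.3f}")
```

Under these assumed settings the ratio should remain below one and climb toward unity as the mean spike count grows, mirroring the asymptotic behavior reported in Fig. 1.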
This asymptotic behavior can be accounted for by the fact that in the limit of large µ (e.g., upon increasing the time interval), the Poisson distribution approaches a Gaussian with the same mean and variance,

\[
G(\mu, n) = \frac{1}{\sqrt{2\pi\mu}} \, e^{-(n-\mu)^{2}/2\mu}. \tag{9}
\]
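This convergence can be checked directly (a small sketch, not from the paper) by comparing the Poisson pmf with the Gaussian of Eq. (9) evaluated at integer n for increasing µ:

```python
# Hedged sketch: the Poisson distribution approaches the Gaussian of Eq. (9)
# as mu grows, summarized here by the total variation distance on integer n.
import numpy as np
from scipy.stats import poisson, norm

for mu in [3, 10, 30, 100, 300]:
    n = np.arange(0, int(mu + 12 * np.sqrt(mu)) + 1)
    p_poisson = poisson.pmf(n, mu)
    p_gauss = norm.pdf(n, loc=mu, scale=np.sqrt(mu))   # Eq. (9) evaluated at integers
    tv = 0.5 * np.sum(np.abs(p_poisson - p_gauss))
    print(f"mu = {mu:4d}:  total variation distance = {tv:.4f}")
```

The printed distance shrinks as µ grows, consistent with the non-Gaussianity correction becoming negligible at large spike counts.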
3. Discussion

Fisher information has been widely used as an approximation for the mutual information (MI), which is often very difficult to compute. The relationship between the two information measures is not always clear. In the current paper we have shown that deviations from Gaussianity in the neuronal response distribution can result in a large overestimation of MI, even in the small-noise regime. As a concrete example of a non-Gaussian neuron, we studied the mutual information for a Poissonian neuron and demonstrated that the Fisher information for this neuron constitutes an upper bound on the mutual information. As shown in Fig. 1, the mutual information approaches the Fisher information asymptotically. It should be noted that while the noise of a Poissonian neuron is non-additive, the effective noise, which summarizes the noise characteristics of the whole neural population with regard to the stimulus dimension, can be approximately additive [16]. Our derivation is in general applicable to any gain function g(·). Often there exists a transform g(·) for which the Fisher information is uniform; uniform Fisher information implies that the noise is additive to a first-order approximation.

Acknowledgments

The author wishes to thank two anonymous reviewers for their useful comments on the manuscript.

References
[1] A. Borst, F.E. Theunissen, Information theory and neural coding, Nat. Neurosci. 2 (1999) 947–957.
[2] F. Rieke, Spikes: Exploring the Neural Code, MIT Press, 1999.
[3] W. Bialek, F. Rieke, R.R. de Ruyter van Steveninck, D. Warland, Reading a neural code, Science 252 (1991) 1854–1857.
[4] J.J. Atick, Could information theory provide an ecological theory of sensory processing? Network: Comput. Neural Syst. 22 (2011) 4–44.
[5] R.A. Fisher, Theory of statistical estimation, Proc. Cambridge Philos. Soc. 22 (1925) 700–725.
[6] H.S. Seung, H. Sompolinsky, Simple models for reading neuronal population codes, Proc. Natl. Acad. Sci. USA 90 (1993) 10749–10753.
[7] P. Series, A.A. Stocker, E.P. Simoncelli, Is the homunculus aware of sensory adaptation? Neural Comput. 21 (2009) 3271–3304.
[8] A.J. Stam, Some inequalities satisfied by the quantities of information of Fisher and Shannon, Inf. Control 2 (1959) 101–112.
[9] X.X. Wei, A.A. Stocker, Mutual information, Fisher information, and efficient coding, Neural Comput. 28 (2016) 305–326.
[10] C. Markon, A generalized definition of reliability based on Lindley information, 2017, PsyArXiv.
[11] W.J. Ma, J.M. Beck, P.E. Latham, A. Pouget, Bayesian inference with probabilistic population codes, Nat. Neurosci. 9 (2006) 1432–1438.
[12] C. Geisler, N. Brunel, X.J. Wang, Contributions of intrinsic membrane dynamics to fast network oscillations with irregular neuronal discharges, J. Neurophysiol. 94 (2005) 4344–4361.
[13] M.E. Mazurek, M.N. Shadlen, Limits to the temporal fidelity of cortical spike rate signals, Nat. Neurosci. 5 (2002) 463–471.
[14] M.N. Shadlen, W.T. Newsome, The variable discharge of cortical neurons: implications for connectivity, computation, and information coding, J. Neurosci. 18 (1998) 3870–3896.
[15] J.-P. Nadal, N. Parga, Nonlinear neurons in the low-noise limit: a factorial code maximizes information transfer, Network: Comput. Neural Syst. 5 (1994) 565–581.
[16] F. Rieke, D.A. Bodnar, W. Bialek, Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory afferents, Proc. Biol. Sci. 262 (1995) 259–265.