Progress in Retinal and Eye Research 31 (2012) 467–480
Optical superresolution and visual hyperacuity

Gerald Westheimer*,¹

Division of Neurobiology, 144 Life Sciences Addition, University of California, Berkeley, CA 94720-3200, USA
Article history: Available online 23 May 2012

Abstract
Classically, diffraction theory sets a boundary for the resolving capacity of optical instruments. Yet some visual thresholds have values much better than the traditional resolution limit. Recent developments in superresolution, an area of optical physics and engineering with claims of transcending the stated resolution limits of optical instruments, are reviewed, and their possible relevance to visual spatial processing and to the exploration of the eye's structure is assessed. In optical or diffractive superresolution the transmitted spatial-frequency band is not so much extended as either multiplexed with or displaced into regions that are usually beyond reach, with no overall gain in information transfer, because prior knowledge is used to make inferences of possible object structure from the image. The Uncertainty Principle for photon position and momentum is never disobeyed. The study of the neural substrate of visual hyperacuity does, however, overlap that of "geometrical superresolution," in which techniques are used for transcending limits imposed by the receptor lattice in analyzing fine image structure.

© 2012 Elsevier Ltd. All rights reserved.
Keywords: Image analysis; Diffraction; Retinal image; Visual optics; Visual cortex; Visual spatial processing
Contents
1. Introduction. The classical analysis of optical resolution: the diffraction limit
2. Exceeding the diffraction limit
2.1. Optical techniques of superresolution
2.2. Superresolution in object/image information transfer
2.3. "Molecular superresolution": refined nanoscale spatial localization
2.4. Geometrical superresolution
3. Superresolution and the eye
3.1. Superresolution and the eye's optics
3.2. Aliasing and the retinal mosaic
3.3. Very low spatial visual thresholds secondary to optical factors
4. Visual hyperacuity
4.1. Superresolution concepts not involved in visual hyperacuity
4.2. Overlapping considerations in geometrical superresolution and visual hyperacuity
4.3. Pre-neural stages
4.4. Neural circuitry
4.5. Translation of concepts from geometrical superresolution technology to study of visual hyperacuity circuitry
5. The third dimension: axial and depth resolution
6. Conclusion, current developments and future directions
6.1. Implications for neurocomputational approaches
6.2. Clinical implementation
Glossary
References
* Tel.: +1 510 642 4828; fax: +1 520 643 6791. E-mail address: [email protected].
¹ Percentage of work contributed by the author in the production of the manuscript: 100% contribution by the sole author.
1350-9462/$ – see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.preteyeres.2012.05.001
Fig. 1. Diffraction limit. Top: Spread of light in the image of a point target (A), and the light distribution when two point targets are separated by the half-width of the point-spread function (B). This is the Rayleigh resolution limit, where the notch between the two peaks is deep enough to signal that the target is a doublet. Top right: the optical transfer function (C), which decreases to zero at the cut-off spatial frequency, beyond which no object sinusoidal stimuli are passed by the instrument. Bottom: Two equivalent formulations of the effect of an aperture on an incoming plane wave: (D) diffraction changes the electromagnetic wave propagation as formulated in diffraction theory, and (E) confining a photon's position to the aperture results in an uncertainty of its momentum, here the direction.
1. Introduction. The classical analysis of optical resolution: the diffraction limit

According to well-established physical principles embodied in diffraction theory, the image of a monochromatic point object created by any conventional technique cannot be more compact than allowed by the wavelength λ and the aperture diameter a. Aberrations, focus defects, a more extensive wavelength band, etc. always make it wider (Fig. 1A).²

² Throughout the review, distances in the lateral dimension, i.e., in transverse planes orthogonal to the optical axis, are expressed as angles subtended at the entrance pupil of the instrument or the eye. Except in Section 5, discussion is restricted to in-focus image planes, conjugate to the object planes in terms of geometrical optics. Imperfections such as aberrations or deformations of the wavefront, irregular apertures or deviations from monochromaticity of the light have to be taken into account in applications to specific situations, but they do not impair the overall validity of the argument.

Faced with the distinction, important in the spectroscopy of the time, whether a spectral line was single or double, Lord Rayleigh (1879) proposed the rule of thumb that this cannot be judged to be the case unless there is a separation of at least the half-width of the diffraction distribution (Fig. 1B). This Rayleigh resolution criterion is widely and justifiably used as a measure of performance. It features a pronounced notch in the joint light distribution of the two points, whereas at the related Sparrow limit there is just the beginning of such a notch. The application to double-star resolution nevertheless contains an arbitrary component, as was first pointed out by Toraldo di Francia (1955): two sources separated by much less than the Rayleigh or even the Sparrow limit generate an image that, while very close in shape to the image of a single source, is nevertheless wider and therefore in principle distinguishable.

When examining optical performance it has become customary to use not point or line targets but sinusoidal gratings. These have the advantage that the diffraction limit can then be expressed simply and concisely by a single value, the cut-off spatial frequency (Fig. 1C). Object sinusoids with a higher spatial frequency are not transmitted by the optical system. The width of the band of transmitted spatial frequencies is fixed at a/λ cycles/radian in object space, a rule which is inviolate and is grounded in the quantum-mechanical foundations of physical phenomena.
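To fix the magnitudes involved, the cut-off frequency and Rayleigh limit can be evaluated numerically. The following minimal sketch uses illustrative values (a 3-mm pupil and 555-nm light; these specific numbers are assumptions for the example, not figures from the text):

```python
import math

# Illustrative parameters (assumed for this sketch): a 3-mm aperture and
# 555-nm green light, near the peak of the eye's daylight sensitivity.
a = 3e-3              # aperture diameter, m
wavelength = 555e-9   # m

# Cut-off spatial frequency a/lambda, in cycles/radian of object space,
# converted to the more familiar cycles/degree.
cutoff_cpr = a / wavelength
cutoff_cpd = cutoff_cpr * math.pi / 180

# Rayleigh criterion for a circular aperture: 1.22*lambda/a radians.
rayleigh_arcsec = math.degrees(1.22 * wavelength / a) * 3600

print(f"cut-off: {cutoff_cpr:.0f} cycles/radian = {cutoff_cpd:.1f} cycles/degree")
print(f"Rayleigh limit: {rayleigh_arcsec:.0f} arcsec")
```

With these values the cut-off comes to roughly 94 cycles/degree and the Rayleigh limit to about 47 arcsec, of the order of the arcminute-scale figures for foveal vision cited later in the review.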
As is the case for all fundamental particles, knowledge about a photon's trajectory obeys the Uncertainty Principle. Restricting its passage to an aperture, and hence delimiting its possible range of locations, makes its momentum, here the direction of propagation, less determinate (Fig. 1E). This is an alternative statement of the diffraction phenomena as derived from electromagnetic theory. According to the common interpretation, the strength of the electromagnetic disturbance at any point is a measure of the probability of photon capture. The concordance all the way from quantum mechanics to the diffraction image of a point source and its equivalent, the cut-off spatial frequency of a/λ cycles/radian, demands scrutiny of any claim of superresolution, with its implication of exceeding this diffraction limit.
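The concordance can be made explicit in a few lines. The following is a schematic order-of-magnitude sketch (numerical factors of order unity are not tracked): a photon of wavelength λ carries momentum p = h/λ, and confining its position to an aperture of width a forces a transverse momentum spread and hence an angular spread of order λ/a,

$$\Delta x \,\Delta p_x \ge \frac{\hbar}{2}, \qquad \Delta x \approx a \;\Rightarrow\; \Delta\theta \approx \frac{\Delta p_x}{p} \gtrsim \frac{\hbar/(2a)}{h/\lambda} = \frac{\lambda}{4\pi a} \sim \frac{\lambda}{a},$$

which is the same λ/a scale that sets the width of the diffraction point-spread function and the a/λ cycles/radian cut-off.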
2. Exceeding the diffraction limit

Yet, without in any way denying the absolute nature of the diffraction limit, there are nevertheless stratagems for transcending it. They are based, broadly speaking, either on subtly manipulating the passage of optical beams through the devices, or on sophisticated information-theoretical analyses of object/image relationships, and, as should be expected, they all demand that any extra knowledge comes at a cost. In an early exposition of the topic, Cox and Sheppard (1986) proposed reserving the word superresolution for the former and coining the term ultraresolution for the latter. Though this nomenclature, even at the price of a proliferation of terms, would have helped emphasize the distinction, it did not catch on. Hence the two methods will be presented here under the headings of optical and geometrical superresolution, respectively, the first involving the optical paths through the instrument and the second the analysis and interpretation of images.
2.1. Optical techniques of superresolution

Well before the advent of the modern Fourier theory of optics, Abbe (1873) taught that when a microscope specimen is illuminated by a parallel beam of coherent light, its spatial-frequency content is spread out in the objective's principal focal plane and truncated there by the diameter of the aperture. All spatial frequencies beyond this cut-off value (Fig. 2) will be absent. This is, however, the case only when the zero order enters the objective in its center. Oblique illumination, in which the zero order is shifted to the edge of the aperture, does allow entry of the frequency band between 0 and a/λ, which includes the range between a/2λ and a/λ not ordinarily admitted. "Dark-field" microscope illumination, where the zero order, carrying the uniform background light, is directed just beyond the edge of the aperture, had long been known to allow increased resolution.
Fig. 2. Abbe's theory. Top: Optical arrangement of microscope viewing in Abbe's theory (1873): a parallel beam, collimated from a monochromatic point source (or from a laser), is diffracted by the specimen. The range of diffraction orders that enters the microscope objective depends on the wavelength of the light and the numerical aperture. The target's spatial Fourier spectrum is formed in the objective's principal focal plane, because the beams entering it, though diffracted, are parallel. Bottom: The spatial-frequency content of the image is limited by the aperture to a band of width a/λ, ordinarily centered on zero, which carries the uniform background light level, and extending from −a/2λ to +a/2λ. Alternatively (right), as occurs in dark-field illumination, the band can be shifted laterally, allowing entry of spatial frequencies to one side that are ordinarily blocked by the aperture.
The procedure clarifies an issue: the band of width a/λ mandated by diffraction theory may be located anywhere within the spatial-frequency spectrum generated through diffraction of coherent light by the specimen; the usually quoted cut-off frequency is only the special case in which the range is centered on zero spatial frequency. Dark-field illumination therefore does not contradict the traditional diffraction limit; it merely shifts the accepted spatial frequencies, substituting a range usually excluded for one usually admitted.

The Abbe theory describes the situation for light that is coherent, i.e., that originates from a laser or a very small monochromatic point source. The case for incoherent light, where the target is either self-luminous or is illuminated by a large source, is somewhat but not radically different. At the stage of photon capture in the image space, the intensities rather than the phase-dependent amplitudes of the electromagnetic disturbance add. This has the consequence that the spatial-frequency passband is twice as wide as in the coherent case, but instead of being wide open between zero and the cut-off frequency, it falls off gradually.

It is possible to use optical techniques to funnel other spatial-frequency bands through an aperture not just instead of, but also in addition to, the one that usually passes through it, though this inevitably introduces disentanglement difficulties when attempting to reconstitute the original target from the image. One of the earliest proposals for optical superresolution was put forward by Lukosz (1966) and involves the interposition of a diffraction grating between the object and the aperture (Fig. 3). High spatial-frequency target components are then directed into the aperture and participate in imagery, albeit conflated with light from the direct beam. More recently the principle has been implemented, in a spatial-frequency equivalent of the radio heterodyne technique, by superimposing a set of sinusoidal light fringes on an object. The product of two sinusoidal signals of different frequencies is the sum of two cosinusoidal signals whose frequencies are, respectively, the sum and the difference of the original frequencies. If an illuminating beam with a sinusoidal spatial frequency near but inside the cut-off is superimposed on a target, the transmitted (or reflected) light is the product of the incoming light and the target contrast, and hence contains components at the summed spatial frequencies (certainly beyond the cut-off limit) but also at the difference frequencies, whereby target spatial frequencies ordinarily excluded by the aperture are shifted into a region passed by it. This heterodyning can be done either in the object plane (Fig. 4), by illuminating (or transilluminating) the target with a sinusoidal light distribution, sometimes called "structured illumination" (Gustafsson, 2000), or in the Fourier domain by using suitably placed masks. In either case, more than one of the object's spatial-frequency bands are superimposed in the generated image, leading to ambiguity; complete reconstruction requires multiple exposures. Alternatively, the aperture can be sequentially relocated (synthetic-aperture optics), but then the coherence length and temporal constancy of the signal matter.

Other target attributes such as wavelength or polarization can also be put to superresolution purposes. For example, if the target is known not to have special polarization properties, one can use the single aperture's passband for two virtual aperture locations, each with one of the two separable directions of polarization. In the same vein, the light's wavelength can be multiplexed into separate bands, each probing a different spatial-frequency region.

These techniques of extending the spatial-frequency transmission of an optical device in no way invalidate the classical diffraction limit; but they illustrate that displacing the aperture, and/or multiplexing the bundles of light passed through it, can increase knowledge about the target in some respects while at the same time reducing it (i.e., making assumptions) in others. Most frequently the assumption involves temporal invariance: it is supposed that the situation has remained unchanged during the period in which the multiple measurements are acquired. Gain in knowledge about the object always involves a counterbalancing cost of uncertainty in another target property, and the techniques all require elaborate procedures for object reconstruction.

The discussion up to now has concentrated on the purely optical factors limiting the spatial content of objects that is transmitted into the in-focus image by an optical device. To the extent that mention was made of strictures about the nature of the targets to which it applies, such as having a coherence length within some time constraints, or having specific polarization properties, it draws attention to the information-theoretical aspects of the resolution process, which is the primary concern of the next section.
Fig. 3. Lukosz superresolution schema. Schema devised by Lukosz (1966) to pass beams containing high spatial-frequency object components, which would ordinarily be intercepted by an aperture stop, through an instrument's entrance pupil. Ray PQ, representing a higher target spatial frequency, is deviated by the diffraction grating to reach the image plane at P′+1 (the +1 diffraction order), where its association with target point P rather than with other target points has to be established. This concept marks the beginning of the contemporary superresolution developments.
Fig. 4. Structured illumination schema. Example of the structured-illumination stratagem of superresolution, formulated in the Fourier realm of an optical system for incoherent light. The contrast transfer function reaches zero at and beyond the cut-off spatial frequency. The object consists of a narrow band at spatial frequency a, beyond the diffraction cut-off. A sinusoidal grating of spatial frequency b within this limit is superimposed, resulting in a target configuration consisting of the product of the two, which is the sum of bands with frequencies a + b and a − b (right); the second of these lies within the spatial-frequency region passed by the optical system and will, therefore, be included in the image. Hence the band at a, though beyond the diffraction limit, is represented in the image. But because it is superimposed on the components of the ordinary image, some processing is needed to reveal it.
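The arithmetic of Fig. 4 is easily verified numerically. The sketch below is illustrative only: the cut-off at 100 cycles, the target band at 130 and the carrier at 80 are assumed values chosen for convenience, and an ideal low-pass filter stands in for the diffraction-limited aperture.

```python
import numpy as np

n = 4096
x = np.linspace(0, 1, n, endpoint=False)       # spatial coordinate, arbitrary units
f_cutoff, f_target, f_carrier = 100, 130, 80   # cycles/unit (illustrative values)

target = 1 + 0.5 * np.cos(2 * np.pi * f_target * x)   # detail beyond the cut-off
carrier = 1 + np.cos(2 * np.pi * f_carrier * x)       # fringe within the cut-off
emitted = target * carrier                             # reflected/transmitted light

spectrum = np.fft.rfft(emitted)
freqs = np.fft.rfftfreq(n, d=1 / n)
spectrum[freqs > f_cutoff] = 0             # ideal aperture: hard low-pass filter
image_spec = np.abs(spectrum) * 2 / n      # amplitudes of surviving components

for f in (80, 50, 130):
    print(f"amplitude at {f} cycles: {image_spec[f]:.2f}")
```

The carrier itself passes at full strength, the target band at 130 cycles is blocked, but its heterodyne product at 130 − 80 = 50 cycles survives inside the passband, from which the original component can be recovered if the carrier frequency is known.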
Diffraction theory as ordinarily practiced computes the electromagnetic disturbance in image space initially free of any additional constraints imposed by the apparatus that captures the photons. But the receiving mechanism's interaction with the electromagnetic field may not be passive; it may, for example, act as a waveguide, as has been suggested to be the case for some retinal receptors (Toraldo di Francia, 1949). Computation of the disturbance that is registered would then have to include an additional factor (Westheimer, 1959), with the result that the effective image distribution differs from the free-field case.

2.2. Superresolution in object/image information transfer

Optical devices convey information about objects through the medium of light. The physical properties of light passage through the device, and the limitations imposed on them by diffraction, belong to one discipline and are embodied in the truncated passband of spatial frequencies based on aperture and wavelength of light. But in practice another discipline also requires attention: the laws governing information transfer. As formulated by Shannon, the information in a message is measured by how many possible alternatives are excluded. Applied to object/image transfer, this concentrates on the particular properties of objects that are expected to be represented in the image. The formulation of each individual study reduces these, either explicitly or implicitly, to a specific subset: intensity, wavelength, polarization, coherence, two- or three-dimensional spatial or temporal detail. Traditional resolution refers just to the case of two-dimensional spatial detail (lateral resolution), though sometimes the depth dimension is also considered (axial resolution; see Section 5 below). One usually begins by assuming that nothing is known about the object world; the diffraction limit then outlines the range of object details that an image transfer allows to be gained and, by exclusion, those that it leaves undetermined. On the other hand, it might be known ahead of time that the ensemble of possible objects is restricted. Then distinctions can be made by concentrating on the expected differences and disregarding image aspects that might have arisen from sources known beforehand to be absent. This was first clearly articulated by Toraldo di Francia in 1955 in the instructive case of two-star resolution. When it is known that the target cannot be other than either a single star or a star doublet
with equal total light intensity, the range of possible image structures is determined by the instrument's point-spread function and a single parameter, the separation of the sources. Any non-zero separation will induce a widening of the image distribution. Only if one had independent knowledge that the choice of possible targets is limited to a single or a double star would a width measurement suffice to yield secure knowledge and, depending on signal/noise factors in the image, a correct decision; "resolution" could then be obtained for separations smaller, perhaps much smaller, than the Rayleigh limit. In other words, "superresolution" would have been achieved at the price of drastically restricting the information transfer. If the target could just as well have been not only one or two points but also a short connecting line, then image width would not have sufficed, and some measure like the Rayleigh or Sparrow criteria, with their dependence on the presence or absence of the classical central notch, would have to be insisted on.

A different but equivalent interpretation of the situation obtains in the spatial-frequency domain. A single and a double target have different spatial spectra. When this is the only source of transmitted knowledge, the decision between them has to be made on the basis of the difference of their components within the instrument's cut-off spatial frequency. Yet some uncertainty, namely the targets' spatial distribution beyond the cut-off frequency, remains (Harris, 1964). If there are probabilistic pointers within the transmitted spectrum to major differences beyond the cut-off frequency, the information gained depends on the degree of firmness of this association, the prior. Under certain circumstances, when the prerequisite probability distributions and optical transfer functions are available, such situations can be handled quite rigorously through the modern interpretation of the Bayes principle: the likelihood of the measured spread function having arisen from targets of a range of separations can be computed, the prior probability of the presentation of targets of different separations folded in, and the result is the probability of the target being single or a doublet (Westheimer, 2009a). At a stretch, the term "superresolution" may be applied here, but, of course, no diffraction boundary has been breached.

The procedures involving statistics serve as a reminder that signal/noise considerations always enter.
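In symbols, and purely as a schematic rendering of the procedure just described (with s the source separation, s = 0 denoting the single star, and I the measured image data):

$$p(s \mid I) \;=\; \frac{p(I \mid s)\,p(s)}{\sum_{s'} p(I \mid s')\,p(s')},$$

the decision going to whichever alternative, single (s = 0) or doublet (s > 0), carries the greater posterior probability.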
The simple case of Toraldo di Francia's two-star superresolution proposal, discussed above, offers a quick insight. The decision whether the target is a single or a double star rests on the precision of determining whether the image distribution is wider than a single star's and, since this depends on the signal/noise ratio in the receptor activations, it will improve with an increase in the number of captured photons. Lukosz (1966) and Cox and Sheppard (1986) have discussed resolution limitations in terms of degrees of freedom. These can be given a quite general formulation involving bandwidths and train-lengths in the spatial-frequency and time domains, in which noise, at the minimum that due to photon statistics (Fox, 2006), enables calculation of total information capacity. This necessarily constitutes an upper limit, but it opens up the possibility of trade-offs between, for example, one spatial dimension and another, or between a spatial dimension and time, or, if prior knowledge makes it possible, between intensity or contrast levels within their own dimension, or between them and another dimension.

2.3. "Molecular superresolution": refined nanoscale spatial localization

Extraordinary advances have recently been made in characterizing molecular structure by techniques to which the term superresolution has been applied. When fluorescence that has been activated in a labeled probe attached to a protein molecule is captured in a microscope, the image size cannot be smaller than mandated by diffraction, which corresponds to an object width of the order of 100 nm and an axial depth of 200 nm, dimensions large by molecular standards. But in principle, in an uncluttered scene, the location of the centroid of a non-overlapped feature can be determined with very much better precision than the width of the diffraction image, depending on the number of received photons and the image-processing grain of the detection apparatus (Patterson et al., 2010). The situation is in all details (centroid detection, the need for spatial feature isolation, the deleterious effect of reduced contrast) identical to the one that had been explained as applying to the target-localizing capability of human vision (Westheimer, 1976), for which the term hyperacuity was coined because it was felt that the word "resolution," in the sense of detecting the separation of individual feature components, was not applicable. The major development in this nanometer-scale molecular analysis has been the use of several fluorescent markers in the same protein region at the same time. By sophisticated design of their wavelength properties and of the timing of their activation and quenching, structural details have been revealed by optical means that transcend microscopy resolution limits by orders of magnitude. However, as in all the other situations, these achievements still remain within the laws of diffraction. To the extent that their precision seems to have been exceeded, this is because uncertainties have been exchanged: a bunch of photons is assumed to have come from a single source, allowing the localization uncertainty of a single photon to be submerged in a population mean; and no changes are presumed to have taken place during the finite time used for data acquisition.
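The statistical core of this localization argument is simply the standard error of a mean. A minimal simulation, with an assumed point-spread half-width of 25 arcsec (an illustrative figure, not one from the text), shows the centroid of N photons pinned down to σ/√N:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each photon lands with spread sigma set by the point-spread function,
# but the centroid of N photons has standard error sigma/sqrt(N).
sigma = 25.0                       # PSF spread, arcsec (illustrative assumption)
for n_photons in (100, 10_000):
    # 500 repeated "exposures", each collecting n_photons photon positions
    trials = rng.normal(0.0, sigma, size=(500, n_photons))
    centroid_sd = trials.mean(axis=1).std()
    print(f"N={n_photons:>6}: centroid SD = {centroid_sd:5.2f} arcsec "
          f"(theory {sigma / np.sqrt(n_photons):5.2f})")
```

With 10⁴ photons the centroid is located to about a quarter of an arcsecond even though the light spread itself is a hundred times wider; this is the common arithmetic underlying both single-molecule localization and visual hyperacuity.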
Mention might also be made of techniques for elucidating molecular structure by examining the change that such structures impose on an incident electromagnetic disturbance at extremely close range, i.e., within a fraction of the wavelength of light. These are called near-field techniques (Betzig and Trautman, 1992) to distinguish them from the conventional imaging procedures, called far-field, where the examination is carried out with the intervention of lenses in planes removed by many orders of magnitude of wavelengths. Near-field microscopy requires the placing of suitable probes within nanometers of the structure and does not as yet have a role in eye research.
2.4. Geometrical superresolution

Having been generated by an optical system with identified spatial transfer characteristics, the light distribution is subject to the spatial processing properties of the image-receiving apparatus, which is most usually partitioned into pixels whose size is not negligible on the scale of the optical resolution limit and which therefore introduces resolution issues of its own. A whole theory of what has been called geometrical superresolution has been developed to deal with this problem and to design procedures for eliminating, or at least minimizing, the losses or distortions caused by the finite size of the elements in the light-capturing layer (Zalevsky and Mendlovic, 2004). For example, regular pixel tiling can introduce aliasing, i.e., spurious frequencies when grating targets are funneled through the pixels, in the manner of moiré fringes. Elaborate methods can be devised to deal with these situations and with their correlates, viz., the recovery of frequency components in the incident optical image to which the receiving layer might be thought impervious because they are finer than the pixel elements. Because refined localization of image features by the human observer in hyperacuity tasks involves just this kind of overcoming of limits imposed by the finite size of the elements of the receiving layer, more detailed consideration of geometrical superresolution will be deferred to Section 4.2 below.

Occasionally the word superresolution is used for the image improvement that results from the repetition of many exposures in which individually noisy transmission had left details uncertain. Averaging of multiple exposures, each within standard resolution limits, hardly deserves the qualifier "super" when the stricture has to be invoked that the object remain invariant across the time span.

3. Superresolution and the eye

3.1. Superresolution and the eye's optics

Optical superresolution procedures to extend the spatial-frequency passband of the eye's optics have not been implemented to enhance visual performance, because so many other techniques (telescopes, microscopes) have been invented for seeing fine details beyond those ordinarily passed by the unaided human eye. The retinal and neural stages of vision fit the spatial-frequency range ordinarily allowed into the optical image by intermediate-size pupils, and modern adaptive-optics methodology can extend this, for very large pupil diameters, well beyond the effective analyzing limit of the retinal mosaic.

The widely discussed Wigner distribution function comes to the fore here (Lohmann, 1993; Zalevsky et al., 2000). It has the virtue that the segments of the target light distribution used for Fourier analysis are not arbitrarily imposed, as is the case for Gabor functions or in the wavelet system, but are provided by the light distribution itself that is being analyzed (Westheimer, 2012). The reciprocal relationship between target dimension and spatial-frequency spectrum, expressed in the fact that the area covered by a target's Wigner distribution function remains constant with magnification, can be utilized to determine just what magnification is needed to enable certain target details to be resolved in specific circumstances. Once the needed spatial-frequency band has been established, the scale along the target distance axis is stretched until the associated narrowing of the spatial-frequency axis is sufficient to bring the required target frequency components within the known cut-off frequency of the eye's optics.
Prior knowledge of the target details that need to be detected and of the eye’s cut-off spatial frequency then enables the appropriate magnification to be determined (Fig. 5).
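As a back-of-envelope rendering of this magnification rule (the 40 and 15 cycles/degree figures are taken from the Fig. 5 example; treating their ratio as the required magnification simply evaluates the reciprocity argument stated above):

```python
# Required magnification so the finest needed target component falls
# inside the usable passband of the eye's optics (values from Fig. 5).
f_component = 40.0   # cycles/degree: finest component in the unmagnified target
f_usable = 15.0      # cycles/degree: band reliably passed by the eye's optics

magnification = f_component / f_usable
print(f"required magnification: about {magnification:.1f}x")   # ~2.7x, i.e. three-fold
```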
Fig. 5. Wigner distributions. Target magnification changes the spatial-frequency spectrum. Right: Wigner distribution function for a 0.08°-wide bright rectangular target. x axis: spatial frequency; y axis: distance in the target light distribution. Prominent spatial-frequency components, predominantly near the borders, extend to 40 cycles/degree and higher. Left: Three-fold magnification expands distances along the distance axis and, in accord with the reciprocal relationship between target width x and spatial frequency ν evident in the kernel of the Fourier transformation cos(2πxν), compresses the spatial-frequency spectrum commensurately, keeping the important components within 15 cycles/degree.
One area of application of optical superresolution methodology in visual optics has important clinical potential. In vivo visualization of fine structural details of the human retina by ophthalmoscopy is accomplished by passage of light through the eye's optics and hence is subject to the restrictions imposed by the eye's pupil. It could therefore very well profit from augmentation of the spatial-frequency band. This needs, as we have seen above, sophisticated instrumentation and image processing, and is now being attempted by structured illumination in which sinusoidal fringes are projected on the retina. (The procedure differs from ordinary ophthalmoscopy, where the incident light forms a uniform field, and from laser scanning ophthalmoscopy, where it is a very small scanning point.) The light returned from the fundus is then the product of the incident fringe light and the reflectance coefficient, and therefore includes components with spatial frequency equal to the difference between the fringe frequency and higher target frequencies. In principle, information about fundus structure beyond the cut-off spatial frequency governed by the eye's pupil can be accessed, but the technical problems of extracting it are formidable (Shroff et al., 2009).

3.2. Aliasing and the retinal mosaic

Starting with Helmholtz (1867/1924, vol. II, p. 35), some observers have reported seeing very fine fringes in foveal views of high spatial-frequency gratings. Two recent developments have given the phenomenon a firmer grounding. Good histological slices through the central fovea illustrate that there are indeed patches within which the receptors are arrayed in a very regular lattice, and lasers can generate full-contrast interference patterns on the retina at frequencies even beyond the cone spacing. Hence conditions are favorable for probing responses not only at the limit of resolution of such a lattice but also beyond it. And in accord with the accepted view of the processing of high-frequency spatial signals by a receiving layer with regular elements, there are responses at frequencies well beyond the element spacing (Williams, 1985). This aliasing fits into the rubric of geometrical superresolution, but it is fragile and, in any case, outside the range of natural visual sensations, because the optical resolution limit of the eye with normal optics and that of the cone mosaic have over the course of evolution converged to a common value. The restricted extent of the mosaic's regions of regularity had earlier been identified as preventing aliasing (Yellott, 1982). Aliasing should be classified as an illusion in that it represents a mismatch between the physical stimulus (fringes of very high spatial frequency) and the sensory experience (fringes with spatial frequency determined by the difference between the incident light pattern and the spacing of the receptor mosaic).
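The difference-frequency arithmetic of such aliases is easy to demonstrate. In this sketch the numbers are illustrative assumptions: cone rows 0.5 arcmin apart (a 120/degree sampling rate, hence a 60 cycles/degree Nyquist limit) and a 90 cycles/degree interference fringe:

```python
import numpy as np

cone_spacing = 1 / 120                      # deg; ~0.5 arcmin foveal cone rows
f_nyquist = 1 / (2 * cone_spacing)          # 60 cycles/degree
f_grating = 90.0                            # interference fringe beyond Nyquist

positions = np.arange(0, 2, cone_spacing)   # receptor positions over 2 deg
samples = np.cos(2 * np.pi * f_grating * positions)   # receptor excitations

spec = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(len(samples), d=cone_spacing)
print(f"Nyquist limit: {f_nyquist:.0f} cycles/degree")
print(f"apparent (alias) frequency: {freqs[np.argmax(spec[1:]) + 1]:.0f} cycles/degree")
```

The 90 cycles/degree fringe reappears in the sampled excitations as a 120 − 90 = 30 cycles/degree pattern, matching the description of the percept as a coarse fringe set by the difference between the stimulus frequency and the mosaic spacing.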
3.3. Very low spatial visual thresholds secondary to optical factors

The performance of the visual system is commonly probed by determining thresholds, i.e., by gathering response information while varying the stimulus magnitude along a stimulus dimension. Spatial thresholds are those in which this dimension is distance along the surface of the retina or, what amounts to the same, visual angle in object space, all other attributes, specifically time, intensity, color, and so on, remaining invariant. Thresholds have been reported that are well below the resolution limits of the eye's optics and of retinal structure. They belong to two different categories according to whether the visual attribute responsible for the detection is contrast or distance in visual space. The latter category includes the hyperacuity tasks proper, such as vernier alignment or stereoscopic depth-difference detection, which will be discussed below. But the celebrated detection of a telegraph wire against a uniform background (Hecht and Mintz, 1939) is based on discerning a difference in brightness and not in location, and so is the 1 arcsec threshold for observer DL's determination of a position difference within a line triplet of 1.5 arcmin separation (Klein and Levi, 1985). In both of these instances, though the measurements are performed by changing distances in the eye's object space, they involve the detection of differences in retinal light contrast secondary to changes in the width of small targets, rather than the discrimination of position differences of a stimulus feature (Fig. 6).
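A small convolution sketch reproduces the logic of the Klein and Levi interpretation shown in Fig. 6. The Gaussian profile and its 0.4-arcmin width are stand-ins for the eye's line-spread function (assumptions for illustration):

```python
import numpy as np

# Three bright lines, the outer pair 1.5 arcmin either side of center, the
# middle line displaced slightly; after blur, the cue is the *difference in
# depth* of the two troughs, not a resolvable change in peak positions.
x = np.arange(-4, 4, 0.005)        # retinal distance, arcmin
sigma = 0.4                        # Gaussian stand-in for the line-spread function

def blurred_triplet(middle_offset):
    positions = (-1.5, middle_offset, 1.5)
    return sum(np.exp(-((x - p) ** 2) / (2 * sigma ** 2)) for p in positions)

for offset in (0.0, 1 / 60):       # 0 and 1 arcsec displacement of the middle line
    img = blurred_triplet(offset)
    left_trough = img[(x > -1.5) & (x < offset)].min()
    right_trough = img[(x > offset) & (x < 1.5)].min()
    print(f"offset {offset * 60:3.0f} arcsec: trough depths differ by "
          f"{abs(left_trough - right_trough):.2e}")
```

The two trough minima come to differ slightly while the three peaks remain effectively in place, so the observer's task reduces to a contrast discrimination, as the figure caption states.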
Fig. 6. Contrast provenance of the three-line bisection threshold. Interpretation of the 1 arcsec bisection threshold in the three-line configuration of 3 arcmin overall width (Klein and Levi, 1985) in terms of discrimination of the contrast of the two notches in the retinal light distribution. The image of the slightly unequally-spaced three lines, after convolution with the eye's line-spread function, features a slightly deeper trough at the wider spatial gap than at the narrower one. Threshold is reached when the contrast in the two gaps can be discriminated even though the separation differences between the peaks cannot.
Basic to these phenomena is the understanding that even the smallest target cannot be imaged on the retina with a light distribution more compact than the point-spread function. The actual light distribution (the convolution of the target with the point-spread function) changes mainly in contrast and very little in shape when a target, say a very narrow dark line, increases in size from infinitesimal to a fraction of the point-spread function's width. Its increase in detectability has been convincingly ascribed to the increasing depth of the dimple it creates against the light background. This should surprise no more than the complementary effect, the visibility of stars subtending a visual angle of a tiny fraction of an arcsecond if they are sufficiently bright. But it does not speak to the ability of the human visual system to partition location in space to values below the diffraction limit or the width of a retinal receptor.

4. Visual hyperacuity

Hyperacuity covers the class of spatial visual tasks in which thresholds are smaller than the eye's classical resolution limit (Westheimer, 1975). The term superresolution had not become common when these capabilities were subjected to detailed study; nor is it, strictly speaking, applicable, because observers' decisions are in the domain of relative localization rather than the resolution of feature elements. However, the explanation of these visual functions overlaps the topic now called "geometrical superresolution," which is devoted to the filtering properties of sensor systems.

In true visual hyperacuity the stimulus variations mark out distinctions that are not, as pointed out above, due merely to contrast changes alone. A specific example will illustrate (Fig. 7). Whereas the eye's resolution limit rarely reaches 30″, a bar of width 3′5″ seen against a uniform background can be distinguished from one of 3′ width. This width discrimination is robust to contrast variations and is, therefore, a true measure of an observer's performance in determining location differences. It is one of many such abilities in which the human observer can assign location to pattern components, in this case the two opposite edges of the bar, with a precision that transcends by almost an order of magnitude the spacing of the retinal receptor elements and the width of the optical point-spread function.

Pattern differences can be defined either as distances in visual space or, equivalently, by their spatial-frequency spectra, where a full description ordinarily demands both amplitude and phase. The essence of the phenomena under discussion here is preserved for mirror-symmetrical patterns, obviating the need for explanations in terms of target Fourier phase differences (Westheimer, 1977). A comparison of a specific situation in the two domains of space and spatial frequency is instructive; it favors the former (Fig. 7). Just as resolution is given a more compact and singular delineation by the cut-off spatial frequency than by the distributed description in terms of summed point-spread functions (Fig. 1B and C), so hyperacuity performance is characterized more concisely by statements about location in space than by differences in the domain of spatial frequency.

Both terms, superresolution and hyperacuity, carry the implication that traditional limits of, respectively, resolution and visual acuity are being transcended and that information is being utilized that is unreachable under ordinary circumstances. It is worthwhile, therefore, to assess how far the concepts of superresolution, as they have been technologically realized, can provide insight into the still wide-open enquiry into the neural substrate of hyperacuity.
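The space-versus-frequency comparison of Fig. 7 can be made concrete in a few lines. In this sketch the bar widths (3.0 and 3.1 arcmin) follow the Fig. 7 caption, and equal total flux is enforced by normalization:

```python
import numpy as np

# Two bars of equal total light flux, 3.0' and 3.1' wide. Their difference
# is sharply localized at the edges in the space domain but widely
# distributed across the spatial-frequency spectrum.
x = np.arange(-8, 8, 0.01)                     # arcmin

def bar(width_arcmin):
    profile = (np.abs(x) < width_arcmin / 2).astype(float)
    return profile / profile.sum()             # normalize to equal total flux

diff_space = bar(3.1) - bar(3.0)
edge_zone = np.abs(diff_space) > 0.5 * np.abs(diff_space).max()
print("large spatial differences confined to |x| between",
      f"{np.abs(x[edge_zone]).min():.2f}' and {np.abs(x[edge_zone]).max():.2f}'")

spec_diff = np.abs(np.abs(np.fft.rfft(bar(3.1))) - np.abs(np.fft.rfft(bar(3.0))))
spread = np.mean(spec_diff > 0.01 * spec_diff.max())
print(f"spectral differences exceed 1% of their maximum over {spread:.0%} of the band")
```

In position the discriminable change is pinned to a narrow zone at each edge; in the frequency domain it is smeared over much of the spectrum, which is the sense in which the spatial description is the more concise one.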
Fig. 7. A width-discrimination hyperacuity task represented in the position and spatial-frequency domains. Top: A bright bar on a dark background, 3 arcmin wide, can be distinguished in foveal vision as being narrower than an otherwise identical one 3.1 arcmin wide, a spatial threshold five or more times smaller than the resolution limit. Total light flux in the two configurations has been made identical, and the distinction is made on the basis of the differing edge locations. Bottom: The spatial-frequency spectra of the two just-discriminable configurations. In the realm of spatial frequencies the difference is widely distributed, but in the realm of position it is highly localized.

4.1. Superresolution concepts not involved in visual hyperacuity

Because hyperacuity is evidenced under normal optical viewing conditions, procedures that extend the spatial-frequency spectrum by directing normally extrinsic beams through the natural pupil (Section 2.1 above) are evidently not being utilized. If "superresolution" were a term reserved for just these special optical procedures, more recently called "diffractive superresolution" (Zalevsky, 2011), it would be a concept apart, justifying the original impetus to coin the word "hyperacuity" to provide a clear distinction. Much the same can be said about the ideas that in a hyperacuity response some temporal averaging takes place, or that it is based on Bayesian inference.

Because superresolution is a relatively new word, it has also been appropriated for procedures which have nothing to do with acquiring information beyond the diffraction limit. One example is the averaging of many exposures to improve quality when single images are degraded by noise, though it needs remembering that this requires the assumption that the original target remains unchanged during the process. Because most hyperacuity thresholds are unaffected by a reduction of exposure duration to as low as 10 ms, such averaging is not in play.
Another notion to which the word superresolution has been applied is the linking of the components of an object's spatial-frequency spectrum that are contained within the optical device's cut-off spatial frequency to some components beyond it. Even though these high spatial-frequency components are not available in the image, they may be used to draw conclusions about target properties if they are known to be associated with visible ones in the available image. This would in fact be an example of Bayesian inference, much discussed in modern perception theory (Kersten et al., 2004), and might be framed as follows (a numerical sketch of these steps is given at the end of this subsection):

1. An array of possible visual targets is assumed, whose configurations differ minutely in the relative locations of their components and, consequently, have widely distributed differences in their spatial-frequency spectra, both within and beyond the cut-off frequency, but always with a fixed association between what is inside the cut-off frequency (and hence passed into the image) and what is beyond it.

2. In a given presentation, the likelihood is assessed that the seen spatial-frequency content (necessarily within the cut-off spatial frequency) corresponds to each of the members of the target array, the members being available with a known prior probability distribution.

3. For each of these targets the product is formed of its prior probability and the likelihood of the present image having arisen from it.

4. Based on an analysis of the normalized distribution of these products, a decision is made as to which member of the array is the most likely to have in fact been the target on this occasion.

This is the procedure used in machine vision. How applicable might this line of thinking be to visual hyperacuity? To begin with, such a computational methodology presupposes that the quantitative data necessary for the practical implementation of Bayesian induction (the prior probability distribution of targets, the ability to assess the likelihood that a given view had arisen from the various targets) are available, something that has yet to be realized. Some observations run counter to this kind of approach. Specifically, one can judge whether the separation of one pair of features is wider than that of another pair, at a hyperacuity level, across a variety of interchangeable feature pairs, including previously unknown or unsuspected ones. What seems to be discerned is the magnitude of the spatial separation of a set of borders or contours, abstracted from the details of the object that generates the contours. Or, more accurately phrased, judgments can be made at a hyperacuity level of whether a spatial interval is larger or smaller than a comparison interval, substantially decoupled from the manner of demarcation of the intervals.
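A minimal numerical rendering of the four steps above, applied to the single-versus-double star decision of Section 2.2; every number here (point-spread width, noise level, candidate separations, priors) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 601)        # image coordinate, arcmin
sigma = 0.5                        # point-spread half-width (assumed)
noise_sd = 0.02                    # photon/detector noise (assumed)

def ideal_image(sep):
    # fixed total flux: two half-intensity points sep apart (sep=0: single star)
    return 0.5 * (np.exp(-((x - sep / 2) ** 2) / (2 * sigma ** 2)) +
                  np.exp(-((x + sep / 2) ** 2) / (2 * sigma ** 2)))

separations = np.array([0.0, 0.2, 0.4])   # step 1: the assumed target array
priors = np.array([0.5, 0.25, 0.25])      # known prior probabilities

observed = ideal_image(0.4) + rng.normal(0, noise_sd, x.size)  # one presentation

# step 2: likelihood of the observed image under each candidate (Gaussian noise)
log_lik = np.array([-np.sum((observed - ideal_image(s)) ** 2) / (2 * noise_sd ** 2)
                    for s in separations])
# steps 3-4: posterior = prior * likelihood, normalized; pick the maximum
post = priors * np.exp(log_lik - log_lik.max())
post /= post.sum()
for s, p in zip(separations, post):
    print(f"separation {s:.1f}': posterior {p:.3f}")
```

The posterior concentrates on the correct separation once the noise is small relative to the image differences; with larger noise the decision degrades gracefully, in keeping with the signal/noise dependence noted earlier.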
4.2. Overlapping considerations in geometrical superresolution and visual hyperacuity

When, however, the inquiry extends to transcending receptor size and spacing, a topic subsumed under the term "geometrical superresolution" (Zalevsky, 2011), some of its considerations do apply to visual hyperacuity. The basic limitation is the partitioning of the receptor operation into compartments, each with a single indivisible spatial signature or label. The technical topic of sampling theory (Blackman and Tukey, 1958) has application only insofar as it is understood that the representation at each discrete location is the integral of the excitation over the whole acceptance area of each receptor; sampling in the sense of identifying the signal height at a sparse series of single separated locations does not occur. In the technical literature the word "pixel" is widely used. Design and analytical procedures in the optical technology of superresolution are usually built around the application of arbitrary operations such as the superposition of masks, imposed image motion or scanning ("time multiplexing"), and it is commonly presupposed that pixels tile the image space uniformly and have invariant response characteristics. Migration of these concepts into the realm of hyperacuity processing
thus, in the first instance, invites investigation of the extent to which they apply to the image-dissecting apparatus there in play, namely the retina and the first projection to the visual cortex. At the outset, as in geometrical superresolution, what is being transcended are the properties of the layer of receptor elements. Often considered in conjunction with the optical image, these constitute the "pre-neural stages" of processing and can usefully be examined separately from the neural stages that they precede.

4.3. Pre-neural stages

As a beginning proposition, consider the situation in the very best human hyperacuity performance, in the center of the human fovea, with a near-perfect hexagonal receptor mosaic, each element of which is subserved by at least one unique neural connection and has high differential light sensitivity. To visualize the precondition for hyperacuity, assume for a moment a simpler situation still: a square receptor (pixel) lattice, a line target aligned with the lattice, and an optical spread function so narrow that all the light is well contained within a single column of receptors. A small position shift that retains the funneling of all the incoming light into the same receptors will not cause any response difference and would, therefore, not be registered. There would be no hyperacuity for a regular receptor lattice with elements wider than the light spread. This situation changes if the spread function's width exceeds that of the lattice elements, or if there is lattice irregularity, or if movement and time-integration produce smearing. Now a position difference can be detected by inter-receptor comparisons. Such a needed mismatch between overall light spread and receptor spacing is indeed the case in normal vision, though it may possibly be avoided by image stabilization and adaptive-optics image sharpening, and it may perhaps not apply in peripheral vision with its extensive neural spatial summation.

To outline the task that devolves on subsequent neural mechanisms, one can estimate the signals that might arise in an array of receptors when there is a just-detectable location difference of a line. Under normal viewing conditions, a good approximation for the optical light spread is that of a diffraction-limited optical system with a 2.5–3 mm diameter pupil (Fig. 8 top) (Westheimer, 2006). Aberrations, the wavelength of the light, accommodation instability, and a whole host of other effects enter in any individual situation and render more exact computation beside the point. For the most acute observations, in the center of the human fovea, the retinal receptor lattice's geometrical configuration and acceptance properties can, to a first approximation, be modeled by a hexagonal structure, as demonstrated histologically (Fig. 8 bottom), where each element has its own individual neuronal connections. It is evident that there is enough mismatch between receptor width and light spread to allow hyperacuity, even though this ideal arrangement no longer applies elsewhere in the retina.

Applying these data, it is possible to describe the signals within the pre-neural stages in a representative hyperacuity task in which an observer can distinguish line or edge positions differing by, say, 0.1 arcmin. A mosaic is hypothesized in which every other row of the lattice is offset by half a module, the response is considered for a traditional line-spread function aligned with the mosaic, and the output is summed over adjacent columns of vertically-aligned elements.
A computation of this sort results in a distribution of receptor light absorptions as shown in Fig. 8 (middle). The estimated number of receptor photon events per presentation and exposure is of the order of at least 10⁴, giving a signal/noise ratio large enough to render patterns of differential receptor activation such as shown in Fig. 8 (middle) quite stable (see Cox and Sheppard, 1986, Fig. 1).
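The stability claim follows from Poisson photon statistics: the relative fluctuation of a photon count N scales as the inverse of its square root,

$$\frac{\sigma_N}{N} = \frac{\sqrt{N}}{N} = \frac{1}{\sqrt{N}} \approx \frac{1}{\sqrt{10^4}} = 1\%,$$

so activation differences of a few percent between adjacent receptor columns stand clear of the noise.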
Fig. 8. Representative line-spread function, foveal receptor mosaic, and excitation in adjoining receptor lattice columns. Top: Light distribution in the retinal image in a hyperacuity situation in which an observer judges the separation of a line pair 4 arcmin apart. Middle: Distribution of light excitation levels in adjacent columns of a hexagonal receptor lattice typical of the center of the human fovea (schematic below, and in an actual histological section at bottom). At left, relative distribution of light in adjacent receptor columns half a lattice-width apart. At right, shown in different colors, the distributions when the line stimulus differs by 0.1 arcmin in location. This is a representative hyperacuity discrimination in which an observer can distinguish between line separations of 4 arcmin and 4.1 arcmin (arrows below). In the diagram, necessarily somewhat schematic, it is assumed that light is captured over the hexagonal apertures of the receptors and that summation occurs over columns half a receptor width apart.
These calculations are based on light absorption. Because of the compressive non-linearity that is a feature of the transduction process (Naka and Rushton, 1966), the differences in the synaptic signal in the retina will be smaller. A fundamental difference between resolution and localization emerges here. At the two-line resolution limit, the retinal excitation distribution has two ridges separated by a notch (Fig. 1B), each one receptor column wide. The trough that needs to be detected there (Liang and Westheimer, 1993) is several times deeper than the differences between adjacent receptor columns indicated in Fig. 8, middle. It follows that hyperacuity localization of individual peaks or borders is accomplished with sub-pixel precision by some sort of operation utilizing not output differences between individual contiguous mosaic elements within the distribution originating from a single target feature, but parameters derived from all the elements of the distribution. That it does not matter in such tasks whether the contour is a line or an edge, as had been demonstrated quite early by Best (1900), implies that the nature of the possible neural operations is rather general. How successful have neurophysiology and psychophysics, the two disciplines that take over at this point, been in providing an understanding of this operation?

4.4. Neural circuitry
The output of the retinal receptors is subject to various operations such as compressive non-linearity in the intensity domain and, spatially, summation as well as antagonistic surround inhibition. An adequate characterization of this transformation can be developed (Westheimer, 2007) and convolved with receptor-excitation functions to yield a theoretical excitation distribution in the inner retina. However, just how it is embodied in the activity of overlapping and interconnected populations of neurons is far from clear. The high sensitivity to small excitation differences in single neural units with immediately adjoining spatial signatures cannot at present be demonstrated neurophysiologically, but a closely-related finding suggests that an adequate neural substrate exists. Recordings from retinal ganglion cells, first in the cat (Shapley and Victor, 1986) and later in the monkey (Lee et al., 1995), show that their response level reflects small position changes of an edge very precisely in their output, which nevertheless represents only a single fixed spatial value (line label) (Fig. 9). Experimental limitations make estimates of receptive-field width and shape uncertain, but the field is wider by at least an order of magnitude than the smallest detectable position change and is subject to stimulus- and context-dependent surround inhibition. From this it is clear that position differences in the hyperacuity range will be reflected in changes in the excitation level of a minimum of half a dozen, but probably many more, elements whose output is channeled to the visual cortex.

Because the available probes (single-cell recording, imaging procedures) are as yet too coarse, and probably also because of the distributed nature of the signals in these regions, direct information on the neural substrate of visual hyperacuity at levels beyond the retina is so far lacking. Once signals have entered the cortex there is a great deal of interconnectivity both within and between levels; neat layering into a hierarchy, each tier with its own processing characteristics, does not occur.
Fig. 9. Ganglion cell edge responses (from Lee et al., 1995). Spike response of a macaque magnocellular retinal ganglion cell to an edge located at various positions inside the receptive field during 50-ms presentations. The output is sufficiently finely graded to differentiate edge locations to within about 2 arcmin, which matches human position hyperacuity thresholds for targets with similar stimulus parameters. The signal represents the strength of the neuron's output and carries the position signature (local sign) of the neuron, regardless of the spatial stimulus distribution within the receptive field that triggered it. A more complete reconstruction of the target situation therefore requires an ensemble of such neurons.
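The caption's last point, that position must be recovered from an ensemble of line-labeled responses, can be illustrated with a toy population read-out. Everything below (nine units, sigmoidal edge tuning, an 8-arcmin receptive-field scale, noiseless template matching) is an invented, minimal stand-in for whatever operation the visual system actually performs.

```python
# Hypothetical ensemble read-out (illustrative only): each unit responds with a
# graded rate to an edge inside its receptive field but carries only its own
# position label; edge location is recovered from the population as a whole.
import numpy as np

labels = np.linspace(-10, 10, 9)        # local signs of nine units (arcmin)
rf_scale = 8.0                          # assumed receptive-field scale (arcmin)

def rates(edge_pos):
    """Graded response of each unit to an edge at edge_pos (sigmoidal tuning)."""
    return 1.0 / (1.0 + np.exp(-(edge_pos - labels) / rf_scale))

def decode(observed):
    """Recover edge position by matching the observed rate vector to templates
    (a noiseless idealization of a population read-out)."""
    candidates = np.linspace(-3, 3, 6001)               # 0.001-arcmin grid
    errors = [np.sum((rates(c) - observed) ** 2) for c in candidates]
    return candidates[int(np.argmin(errors))]

# Edge positions differing by 0.25 arcmin, far finer than any receptive field,
# are distinguishable from the ensemble even though no single unit signals
# position as such:
print(decode(rates(0.0)), decode(rates(0.25)))
```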
4.5. Translation of concepts from geometrical superresolution technology to study of visual hyperacuity circuitry

Psychophysical analysis of hyperacuity mechanisms has been extensive and, since adequate accounts are available (e.g., Westheimer, 2009b), need not be revisited here. What will be examined instead is whether help may be derived from some of the theoretical concepts that have proved useful in geometrical superresolution techniques, which enable enhanced retrieval of information contained in an optical image that has been passed through a receptor layer with defined geometrical limitations, for example by operations such as filtering, masking, or superimposed movement.

The result of a single experiment, which addresses the role of factors in the realms of time, space and motion, helps to highlight the operation of the apparatus by which the location of a feature element is identified by the human visual system (Westheimer and McKee, 1977). A pair of vernier lines, detectably offset in one direction, is exposed for 5 ms each in four spatially displaced locations in sequence; that is, a misaligned vernier pattern is briefly swept across the retina over a short distance. This is preceded and followed, for 5 ms each, by the presentation of a single line (Fig. 10) so situated that, if the whole configuration were pooled spatially and temporally, it would appear as a vernier ribbon whose offset is in the direction opposite to that of the moving vernier line pair. The question becomes: does the observer perceive the offset to be the one in the sequence of instantaneously exposed vernier line pairs, or the one in the spatially and temporally summed light of the whole? Or, better: what are the time and space parameters at which the perceived offset reverses from that in the sweeping vernier line pair to that in the somewhat blobby ribbon?
[Fig. 10 panel annotations: "Spatio-temporal integration of signals over 2-3 arcmin and 30-50 msec"; time markers of 25 ms and 5 ms.]
Fig. 10. Centroid generation. Demonstration of light summation in a temporal-spatial window for a hyperacuity response. A vernier pattern offset to the left (heavy lines) is swept across the retina in four 5-ms steps, preceded and followed by single lines that, when integrated with it, generate a light ribbon whose centroid is misaligned to the right; this is what the observer reports. The lines presented in each of the six time stations, 5 ms apart, are shown heavy; the lighter ones indicate what had been shown in previous steps of this particular trial.
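The pooling account of Fig. 10 can be sketched numerically. In the toy computation below all geometry is invented for illustration: line entries are tagged as belonging to the upper or lower half of the pattern, everything within an assumed 30-ms window is summed, and the offset judgement is made on the difference between the pooled upper and lower centroids.

```python
# Illustrative sketch of the pooling account of Fig. 10 (all numbers assumed).
import numpy as np

# Light entries as (time_ms, half_field, x_arcmin); six 5-ms stations.
events = [
    (0,  "upper", +2.0),                       # leading single line (upper field)
    (5,  "upper", -1.0), (5,  "lower", -0.5),  # vernier pair: upper offset LEFT
    (10, "upper", -0.5), (10, "lower",  0.0),
    (15, "upper",  0.0), (15, "lower", +0.5),
    (20, "upper", +0.5), (20, "lower", +1.0),
    (25, "upper", +2.0),                       # trailing single line (upper field)
]

window_ms = 30.0                               # assumed temporal integration window
pooled = [(f, x) for t, f, x in events if t < window_ms]
upper = np.mean([x for f, x in pooled if f == "upper"])
lower = np.mean([x for f, x in pooled if f == "lower"])

print("instantaneous pair offset: -0.50 (upper left of lower)")
print(f"pooled centroid offset:    {upper - lower:+.2f} (upper right of lower)")
# Each briefly exposed pair is offset one way; the centroids of the pooled
# ribbon are offset the other way, and it is the pooled offset that is seen.
```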
The answer for good observers in the center of the fovea is 20-30 ms and 2-3 arcmin. That is to say, light components arbitrarily laid down on the retina within this time and space are summed, and their centroid is determined with a precision that allows the direction of relative displacement of two neighboring patches to be judged down to about a fifth of the width of the elements of the receptor mosaic.

As is seen in Fig. 8, detection involves quite small excitation differences among neighboring receptors, and this opens up signal/noise considerations. Operations depending on light capture ordinarily perform better the higher the intensity, usually as its square root. Nowadays interest has shifted from target luminance (Baker, 1949) to contrast (Westheimer et al., 1999), where the effects depend critically on the brightness level and the kind of pattern. Performance suffers when stimuli differ from the background only in chromaticity (Morgan and Aiba, 1985) and not in luminance. As regards image movement, two sets of observations are relevant. Contrary to the conjecture that the micronystagmus during 'steady' fixation plays a facilitating role (Marshall and Talbot, 1942), good localization thresholds can be obtained with stabilized images (Keesey, 1960) as well as with stimuli of very short duration, 10 ms (Westheimer and Pettet, 1990).

The contention that the location of a target element is derived from excitation within a region larger than the optical point-spread function and the grain of the receptor elements is reinforced by the phenomenon now named "crowding," i.e., a performance decrement when interfering contours are introduced close to the test pattern (for a review, see Levi, 2008). Hyperacuity thresholds suffer when irrelevant stimuli are situated in the vicinity of a feature that has to be accurately localized. This interaction is maximal when the disturbing target is quite clearly articulated and sufficiently separated not to have been optically intermingled (Westheimer and Hauske, 1975). These phenomena can be interpreted as evidence for the existence in sensory space of circuits for hyperacuity processing whose operation needs relative immunity from competing stimulation. The difficulty of generating operating models once consideration reaches cortical circuitry is illustrated by the following finding (Malania et al., 2007). When the threshold in a hyperacuity task has been raised by a simple crowding stimulus, this interference can under certain circumstances be diminished by additional features with which the crowding feature in turn interacts, i.e., there can be "uncrowding." Further, recent work points to some fluidity and plasticity here, leading even to hints of perceptual grouping in accord with Gestalt laws (Sayim et al., 2010). This should not surprise, since the locus of this processing is cortical and communication between such centers is invariably both feedforward and feedback. In general, while the results of this research permit an overview of the nature of the neural circuitry, they do not at present favor the development of convincing quantitative models.

A further indication of the complexity of the neural circuitry of these early visual stages is given by the way localization signals can be read out. There is no doubt that location information of a visual feature with hyperacuity precision is based on neural operations within a confined temporal and spatial zone, about 20 ms and about 2 arcmin in the human fovea.
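The square-root dependence of light-capture operations noted above can be illustrated by a small simulation: the standard error of a centroid computed from N photon positions drawn from a line-spread of width sigma falls as sigma/sqrt(N). The width and photon counts below are arbitrary assumptions, chosen only to make the scaling visible.

```python
# Monte-Carlo sketch (assumed values): centroid precision vs. photon count.
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0                                   # line-spread width, receptor units

for n_photons in (100, 400, 1600):
    # 2000 repeated trials, each estimating the centroid from n_photons arrivals
    centroids = [rng.normal(0.0, sigma, n_photons).mean() for _ in range(2000)]
    print(f"N={n_photons:5d}  centroid s.d. = {np.std(centroids):.4f}"
          f"  (theory {sigma / np.sqrt(n_photons):.4f})")
# Quadrupling the number of captured photons halves the centroid's scatter.
```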
However, whereas an observer can make such a location assignment for a wide range of features, this cannot be done when they are presented in isolation, either in time or in space. The extremely small values of hyperacuity thresholds are manifested only as location differences for target configurations whose components are presented within narrow limits of position and synchrony. Spatial discrimination thresholds of a few seconds of arc in the human fovea can be demonstrated with a pair of lines or borders, but these have to be presented within 2-10 arcmin and
50 ms of each other (Westheimer, 2009b). To the extent that stimulus conditions deviate from these optimal ones, thresholds suffer. This is particularly the case for dim light, low contrast, heterochromatic isoluminance, opposite contrast polarity, and for extra-foveal regions, whose retinal grain grows coarser with eccentricity.

One very active area of research at present is the adaptive capacity of the neural circuits: performance in a given task is usually not static but varies over time and with context, depending on attention, expectation and learning, even after discounting the initial need to gain familiarity before responses become stable (Fahle et al., 1995). Unfortunately it is rather difficult to navigate this terrain of perceptual learning (Fine and Jacobs, 2002), of considerable relevance in applied vision and in the clinic, because the results of even well-conducted studies can differ owing to individual variations and to what may seem inconsequential differences in stimulus parameters and training protocols.

5. The third dimension: axial and depth resolution

Consideration has so far been confined to the examination of the spatial disposition of feature images in a transverse plane conjugate to the target plane with respect to the optical device. Stated most succinctly, the wavefront diverging from the target has been changed by the optics into an ideally spherical one centered on the image plane, and questions have been raised about limitations in discerning lateral object position, and position differences, from the image light distribution in that plane. In general, both theoretically and practically, there is a favored transverse plane for this purpose, almost always the geometrical image plane. On occasion, however, a separate question can be raised: what is the smallest detectable difference in target distance along the axial dimension? This is the question of axial resolution. The analysis based on light coherence, utilized in optical coherence tomography (for a review, see Drexler and Fujimoto, 2008), proceeds along quite different lines, although issues of axial resolution also enter in that context.

Just as the Airy disk describes the light distribution in the transverse plane at the focus, so distributions in other planes can be calculated, and in this way the intensity of the electromagnetic disturbance can be depicted also in the third, axial dimension (Fig. 11).
The central lobe extends forwards and backwards about twice as far as it does laterally and, to a first approximation, is symmetrical around the focal plane (Zernike and Nijboer, 1949). This elementary, ideal-case description of the spread of light from a point object into an ellipsoid with surrounding low-intensity rings can serve as a basis for axial resolution in the same manner as the Airy disk and receptor width do for lateral resolution. Whereas in some theories retinal receptor cells are modeled as waveguides with an acceptance aperture in a single transverse plane, it is more likely that light enters and is processed along the whole length of the outer segment, which is many times longer than the central ellipsoid in Fig. 11. Hence receptor compartmentalization is not nearly as central an issue for axial resolution (and, for that matter, for accuracy-of-focus detection) as it is for lateral resolution.

Yet this topic equally requires precision in its initial formulation in information-theoretical terms. What is the ensemble of physical situations within which distinctions are to be made? A representative example might be the following. Embedded in a transparent medium are two particles. Whereas the question asked in the previous sections dealt with their minimum distinguishable lateral separation, here it would be: what is the minimum separation along the axial dimension that can be distinguished? The problem is somewhat opaque if there is only a single illuminating beam and the two particles are aligned in the direction of propagation of the light, so that their beams are superimposed. But it becomes realistic if they are sufficiently separated laterally to allow examination and comparison of their individual three-dimensional image distributions. A static analysis of the cross-sectional light distribution in any single plane would give ambiguous results unless the possible target dispositions had been strictly delimited beforehand, i.e., unless it had previously been established that there are only two point targets of known lateral separation, and which of them might be the nearer. The options become much wider once the analysis is allowed to be dynamic, with the implied understanding that the configuration remains invariant during the test duration. A swiveling coherent illuminating beam would produce characteristic time-varying interference patterns. Alternatively, either the target or the image plane could be moved in an axial direction.
Fig. 11. Isophotes in the lateral and axial directions around the focal plane in diffraction-limited imagery. Light distribution in a plane containing the optic axis in the vicinity of the focal plane for purely diffraction-limited imagery, showing "isophotes." The cross-section in the focal plane represents the intensity distribution in Airy's disk. The intersecting dashed lines are the beam limits according to geometrical optics. Reprinted by permission from E.H. Linfoot and E. Wolf, Proc. Phys. Soc. 69:823, 1956.
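The three-dimensional distribution sketched in Fig. 11 follows from standard diffraction integrals. The snippet below evaluates the two textbook cross-sections for a uniform circular pupil, the on-axis intensity near focus and the in-focus Airy pattern, in the usual normalized axial and lateral coordinates u and v; it is a sketch of the classical formulas, not of the Linfoot-Wolf computation reproduced in the figure.

```python
# Diffraction-limited intensity along and across the optic axis for a uniform
# circular pupil (standard textbook forms, normalized coordinates u and v).
import numpy as np
from scipy.special import j1

def axial(u):
    """On-axis intensity near focus: I(u)/I(0) = [sin(u/4)/(u/4)]^2."""
    q = np.where(u == 0, 1e-12, u) / 4.0
    return np.where(u == 0, 1.0, (np.sin(q) / q) ** 2)

def lateral(v):
    """In-focus (Airy) pattern: I(v)/I(0) = [2 J1(v)/v]^2."""
    v = np.where(v == 0, 1e-12, v)
    return (2 * j1(v) / v) ** 2

# First zeros: lateral at v ~ 3.832, axial at u = 4*pi ~ 12.566. In physical
# units the axial extent of the central lobe exceeds the lateral one by a
# factor that depends on the numerical aperture of the system.
print(axial(np.array([0.0, 4 * np.pi])), lateral(np.array([0.0, 3.8317])))
```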
A particularly sensitive technique is parallax detection, which is the principle of human stereoscopic depth discrimination. Lateral separation in the image plane will change with the incident direction of the illuminating beam, and the determination becomes one of a difference in separation which, as was seen above, can be achieved with great precision. Human stereoscopic depth discrimination is a hyperacuity in that the threshold is just a few seconds of arc (for a review, see Westheimer, 1994). Depth differences are detected via the difference in parallax of features in the retinal images of the two eyes, so that this particular implementation of axial resolution involves the simultaneous comparison of two separate image representations of the same targets, rather than the temporal comparison in a single plane by moving illuminating beams. Fundamental differences between vernier acuity and stereoscopic acuity remain a major challenge in visual neuroscience.

6. Conclusion, current developments and future directions

Sophisticated procedures allow information to be gathered, by optical or by image-processing means, about spatial details in objects that are finer than the traditional Rayleigh resolution limit mandated by the electromagnetic theory of light and by the Uncertainty Principle applied to photons. The phrase "breaking the diffraction limit" carries with it the wrong implication. In every case, whatever gains in spatial information may have been achieved are accompanied by commensurate restrictions in knowledge about other object properties. For a given wavelength of light, an optical system's aperture determines the width of the passband of spatial frequencies that are admitted, but its placement along the spatial-frequency spectrum is open, as is the possibility of multiplexing, i.e., funneling more than one band through the aperture, with the consequent problem of disentangling the superimposed components in the image.

6.1. Implications for neurocomputational approaches

None of these aspects of the current armamentarium of "superresolution" applies to the human observer's ability to localize visual features with a precision that far exceeds the traditional performance limitation imposed by ocular optics and retinal structure. However, another area, classified under the rubric "geometrical superresolution," does relate to visual hyperacuity: the extraction of knowledge about the object world through processing of images that have been captured by a discretely tiled sensory layer. Most of the sophisticated methods of image analysis that have been proposed and technologically implemented utilize components that are fundamentally different from those in the retina, with its limited regularity of receptive elements and the overlapping acceptance functions of its neural units (receptive fields). Hence such mathematically defined operations as the Wigner distribution function mentioned above, utilized in engineering and technological applications, have not yet found their place in vision research. Before subjecting visual phenomena to analysis in such terms, possible impediments have to be faced.
The lattice of image-sampling elements is uniform in only a very limited region; there is little evidence for the viability of Fourier decomposition of spatial signals at the level of neural processing; and, being biologically based, most parameters inserted in the equations will show variability, non-linearity and susceptibility to unknown interactions, rendering the results of modeling less universal and convincing than the mathematical rigor and computational power of their formulation would suggest. Though the versatility and elaboration of the neural circuitry through which the remarkable extension of human feature localization beyond the traditional optical limits emerges cannot be encompassed by current engineering practices (human performance across an enormous range of features is not equaled by
any current artificial device), the ingenuity and insight driving research in optical and geometrical superresolution bid fair to advance our knowledge of the neural substrate of visual hyperacuity and of available techniques in ophthalmic diagnosis. The reverse is also true: the adaptability to changing stimulus conditions that is an important feature of the visual system is being recognized and emulated in engineering circles.

6.2. Clinical implementation

Enormous strides have been made in technical optics in the last few decades. Interestingly, apart from the wrenching revision of the fundaments of physics through quantum theory, the solid framework for light provided by Maxwell's electromagnetic theory has stood the test of time for a century and a half. Developments have been largely in the materials that have become available to generate, capture and channel light energy. While lasers have already found a place in ophthalmic practice, the application of much of the new technology is yet to come. Of particular relevance is the ability to make ocular structures, especially retinal ones, available to inspection by optical means. Differentiation for prognostic purposes by separation of incident and reflected beams according to wavelength, polarization, coherence, etc., is still a largely uncharted field.

This review is centered on the limits of dissection of the spatial dimension both by optics and by the human sensory apparatus. One of the impediments has been overcome: adaptive optics now allows utilization of the full bandwidth permitted by diffraction theory. This opens up the next challenge. The innovative ideas generated by scientists under the heading of optical superresolution can in principle lead to the visualization of retinal structures in finer detail than allowed by the diffraction barrier of even the widest pupil, though the technical obstacles will be formidable. The nanoscale refined-localization techniques, often included under the rubric of superresolution, could very well find application in ophthalmic practice, in particular for diagnostic purposes. When it comes to the interpretation of fine detail in images transcending the limitations imposed by the lattice of detecting elements, the topic now defined as geometrical superresolution, it remains to be seen whether the flow of interaction will be from the inventive engineering community to researchers in the neural processing of visual signals, or the reverse. The significant issue is the contrast between the units from which the visual nervous system is assembled, with their categorical advantages of versatility and fluidity of functioning, and the intellectual rigor characterizing geometrical superresolution theory and the specifiability of the components with which it works. Finally, while Snellen visual acuity is easily the most ubiquitous measure of visual function in the eye clinic, the diagnostic potential of visual hyperacuity testing has yet to be fulfilled.

7. Glossary

Acuity: Literally sharpness; performance in the task of differentiating target particulars, as in distinguishing letters on a Snellen chart.

Aliasing: When a sequence of sampled data from a distribution becomes a meaningful entity that is not actually represented in the original distribution. For example, discrete sampling of a sinusoidal distribution of a given wavelength can yield a sequence of data points forming a sine wave with a wavelength not present in the original.
Diffraction Limit: Performance barrier of optical devices imposed by diffraction theory for light regarded as waves (or by the Uncertainty Principle for light regarded as photons). It depends on the wavelength of light and (inversely) on the aperture, and manifests
itself in the cut-off spatial frequency and the finite width of the light spread in the image of a point object.

Diffractive Superresolution or Optical Superresolution: Ability to capture details of the structure of an object that are beyond the conventionally defined diffraction limit of the optical device.

Fourier Representation in Optics: Alternative way of describing objects and images, in terms of the amplitude and phase of sinusoidal light distributions which, when superimposed, unambiguously reconstitute the spatial patterns.

Geometrical Superresolution: Information about an object gained beyond the resolution limit imposed by the properties of the image-processing apparatus.

Hyperacuity: Visual capabilities in which spatial localization is achieved transcending the optical resolution limits of the eye and the retinal receptor mosaic.

Information: When an ensemble of elements and the probabilities of their occurrence have been previously defined (e.g., the letters of the alphabet in an English text), information is the numerical expression of the reduction of uncertainty that results from the occurrence of an event (e.g., when one or more letters are revealed).

Resolution: Ability to separate details in the representation of an object; specifically, to detect from the image structure whether the generating object was single or double.

Sampling: Reading out values of a distribution not continuously but at discrete intervals along the signal train.

Spatial Frequency: In the Fourier representation of optical targets, the compactness of the sinusoidal patterns, measured in cycles per unit distance.

Cut-off Spatial Frequency: Highest spatial frequency that diffraction theory allows to be passed by an optical device; proportional to the aperture and, inversely, to the wavelength of light.

Uncertainty Principle: In quantum mechanics, the product of the uncertainties in the position and the momentum of a fundamental particle cannot fall below a fixed bound. Applied to a photon, the more its location is restricted by the size of the aperture through which it passes, the less certain the direction of its propagation.

References

Abbe, E., 1873. Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung. Arch. Mikroskop. Anatomie 9, 413-468.
Baker, K.E., 1949. Some variables influencing vernier acuity. J. Opt. Soc. Am. 39, 567-576.
Best, F., 1900. Ueber die Grenze der Erkennbarkeit von Lagenunterschieden. Albrecht von Graefes Arch. Ophthalmol. 51, 453-460.
Betzig, E., Trautman, J.K., 1992. Near-field optics: microscopy, spectroscopy, and surface modification beyond the diffraction limit. Science 257, 189-195.
Blackman, R.B., Tukey, J.W., 1958. The Measurement of Power Spectra. Dover, New York.
Cox, I.J., Sheppard, C.J.R., 1986. Information capacity and resolution in an optical system. J. Opt. Soc. Am. A 3, 1152-1158.
Drexler, W., Fujimoto, J.G., 2008. State-of-the-art retinal optical coherence tomography. Prog. Retin. Eye Res. 27, 45-88.
Fahle, M., Edelman, S., Poggio, T., 1995. Fast perceptual learning in hyperacuity. Vision Res. 35, 3003-3013.
Fine, I., Jacobs, R.A., 2002. Comparing perceptual learning across tasks: a review. J. Vision 2, 190-203.
Fox, M., 2006. Quantum Optics. An Introduction. Oxford University Press, Oxford.
Gustafsson, M., 2000. Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy. J. Microscopy 198, 82-87.
Harris, J.L., 1964. Resolving power and decision making. J. Opt. Soc. Am. 54, 606-611.
Hecht, S., Mintz, E.U., 1939. The visibility of single lines at various illuminations and the retinal basis of resolution. J. Gen. Physiol. 22, 593-612.
Helmholtz, H., 1867/1924. In: Southall, J.P.C. (Ed.), Treatise on Physiological Optics. Optical Society of America.
Keesey, U.T., 1960. Involuntary movements and visual acuity. J. Opt. Soc. Am. 50, 769-775.
Kersten, D., Mamassian, P., Yuille, A., 2004. Object perception as Bayesian inference. Annu. Rev. Psychol. 55, 271-304.
Klein, S.A., Levi, D.M., 1985. Hyperacuity thresholds of 1 sec: theoretical predictions and empirical validation. J. Opt. Soc. Am. A 2, 1170-1190.
Lee, B.B., Wehrhahn, C., Westheimer, G., Kremers, J., 1995. The spatial precision of macaque ganglion cell responses in relation to vernier acuity of human observers. Vision Res. 35, 2743-2758.
Levi, D.M., 2008. Crowding: an essential bottleneck in object recognition. Vision Res. 48, 635-654.
Liang, J., Westheimer, G., 1993. Method for measuring visual resolution at the retinal level. J. Opt. Soc. Am. A 10, 1691-1696.
Lohmann, A.W., 1993. Image rotation, Wigner rotation and the fractional Fourier transform. J. Opt. Soc. Am. A 10, 2181-2186.
Lukosz, W., 1966. Optical systems with resolving power exceeding the classical limit. J. Opt. Soc. Am. 56, 1463-1472.
Malania, M., Herzog, M.H., Westheimer, G., 2007. Grouping of contextual elements that affect vernier thresholds. J. Vision 7 (2), 1-7.
Marshall, W.H., Talbot, S.A., 1942. Recent evidence for neural mechanisms in vision leading to a general theory of sensory acuity. Biological Symposia 7, 117-164.
Morgan, M.J., Aiba, T.S., 1985. Positional acuity with chromatic stimuli. Vision Res. 25, 689-695.
Naka, K.I., Rushton, W.A.H., 1966. S-potentials from colour units in the retina of fish (Cyprinidae). J. Physiol. 185, 536-555.
Patterson, G., Davidson, M., Manley, S., Lippincott-Schwartz, J., 2010. Superresolution imaging using single-molecule localization. Annu. Rev. Phys. Chem. 61, 345-367.
Rayleigh, L., 1879. Investigations in optics, with special reference to the spectroscope. Philosophical Magazine 8, 261-274.
Regan, D., 2000. Human Perception of Objects: Early Processing of Spatial Form. Sinauer, Sunderland, MA.
Sayim, B., Westheimer, G., Herzog, M.H., 2010. Gestalt factors modulate basic spatial vision. Psychol. Sci. 21, 641-644.
Shapley, R.M., Victor, J., 1986. Hyperacuity in cat retinal ganglion cells. Science 231, 999-1002.
Shroff, S.A., Fienup, J.R., Williams, D.R., 2009. Phase-shift estimation in sinusoidally illuminated images for lateral superresolution. J. Opt. Soc. Am. A 26, 413-424.
Toraldo di Francia, G., 1949. Retina cones as dielectric antennas. J. Opt. Soc. Am. 39, 324.
Toraldo di Francia, G., 1955. Resolving power and information. J. Opt. Soc. Am. 45, 497-501.
Westheimer, G., 1959. Retinal light distribution for circular apertures in Maxwellian view. J. Opt. Soc. Am. 49, 41-44.
Westheimer, G., 1975. Visual acuity and hyperacuity. Invest. Ophthalmol. 14, 570-572.
Westheimer, G., 1976. Diffraction theory and visual hyperacuity. Am. J. Optom. Physiol. Opt. 53, 362-364.
Westheimer, G., 1977. Spatial frequency and light-spread descriptions of visual acuity and hyperacuity. J. Opt. Soc. Am. 67, 207-212.
Westheimer, G., 1994. The Ferrier lecture, 1992. Seeing depth with two eyes: stereopsis. Proc. R. Soc. Lond. B Biol. Sci. 257, 205-214.
Westheimer, G., 2006. Specifying and controlling the optical image on the retina. Prog. Retin. Eye Res. 25, 19-42.
Westheimer, G., 2007. Irradiation, border location and the shifted-chessboard pattern. Perception 36, 483-494.
Westheimer, G., 2009a. Visual acuity: information theory, retinal image structure and resolution thresholds. Prog. Retin. Eye Res. 28, 178-186.
Westheimer, G., 2009b. Hyperacuity. In: Squire, L.A. (Ed.), Encyclopedia of Neuroscience, vol. 5. Academic Press, Oxford, pp. 45-50.
Westheimer, G., 2012. Spatial and spatial-frequency analysis in visual optics. Ophthalmic Physiol. Opt. 32, 271-281.
Westheimer, G., Brincat, S., Wehrhahn, C., 1999. Contrast dependency of foveal spatial functions: orientation, vernier, separation, blur and displacement discrimination and the tilt and Poggendorff illusions. Vision Res. 39, 1631-1639.
Westheimer, G., Hauske, G., 1975. Temporal and spatial interference with vernier acuity. Vision Res. 15, 1137-1141.
Westheimer, G., McKee, S.P., 1977. Integration regions for visual hyperacuity. Vision Res. 17, 89-93.
Westheimer, G., Pettet, M.W., 1990. Contrast and duration of exposure differentially affect vernier and stereoscopic acuity. Proc. R. Soc. Lond. B Biol. Sci. 241, 42-46.
Williams, D.R., 1985. Aliasing in human foveal vision. Vision Res. 25, 195-205.
Yellott, J.I., 1982. Spectral analysis of spatial sampling by photoreceptors: topological disorder prevents aliasing. Vision Res. 22, 1205-1210.
Zalevsky, Z., 2011. Exceeding the diffraction and the geometrical limits of imaging systems: a review. In: Dolev, S., Oltean, M. (Eds.), Optical Supercomputing 2010. Lecture Notes in Computer Science, vol. 6748. Springer-Verlag, Berlin, pp. 119-130.
Zalevsky, Z., Mendlovic, D., 2004. Optical Superresolution. Springer, Tel Aviv.
Zalevsky, Z., Mendlovic, D., Lohmann, A.W., 2000. Understanding superresolution in Wigner space. J. Opt. Soc. Am. A 17, 2422-2430.
Zernike, F., Nijboer, B.R.A., 1949. Théorie de la diffraction des aberrations. In: Fleury, P., Maréchal, A., Anglade, C. (Eds.), La Théorie des Images Optiques. Editions de la Revue d'Optique, Paris, pp. 227-235.