Stereopsis by harmonic analysis


EUGENE LEVINSON² and RANDOLPH BLAKE

Cresap Neuroscience Laboratory, Department of Psychology, Northwestern University, Evanston, Illinois 60201, U.S.A.

(Received 19 October 1976; in revised form 1 May 1978)

Abstract-Stereoscopically viewed vertical gratings whose cycle widths differ by 10% fuse into a single grating rotated in depth. The perceived tilt can be predicted on the basis of a barwise computation of geometrical disparities, or on the basis of a comparison in terms of harmonic content. We have found that monocular gratings of similar cycle width but different harmonic content cannot be fused, whereas gratings similar in harmonic content but not in cycle width give good depth sensations. Such observations support the idea of stereopsis by harmonic analysis.

INTRODUCTION

A human observer viewing the world binocularly perceives normal depth relationships among the objects seen. If this observer linearly magnifies one eye’s visual field along the horizontal meridian (by viewing through a cylindrical lens, for example), the visual world then will appear distorted. In particular, objects which are truly frontoparallel relative to the observer will be seen as tilted or rotated in depth about a vertical axis (Ogle, 1950). Not surprisingly, such perceived rotation in depth can also be produced using artificially generated stereograms: a vertical grating viewed by the left eye and a vertical grating of, say, 10% smaller cycle width presented to the right eye binocularly fuse into a grating rotated in depth out of the frontoparallel plane (Fig. 1), its left edge apparently farther from the observer than its right edge (Blakemore, 1970; Fiorentini and Maffei, 1971).

The perceived tilt produced by such a stereogram can be exactly predicted on the basis of geometrical disparities between corresponding features of the two monocular views, computed bar-by-bar. Figure 2 schematically illustrates this computation for a pair of gratings similar to that shown in Fig. 1. The monocular projections through corresponding contours of the two gratings always intersect in a plane displaced from frontoparallel; the fused binocular grating therefore appears tilted in depth. The geometry of the situation has been worked out thoroughly by Ogle (1950). Note that the cycle width of each monocular grating represents the variable of critical importance for disparity computation. We shall call this explanation the “spatial disparity hypothesis”.

As an alternative to the spatial disparity model, consider that the stereoscopic sensation of tilt may result from a binocular comparison of two monocular gratings in terms of spatial frequency (the reciprocal of cycle width). Lending credence to this idea is the sizable body of evidence supporting at least a crude analysis of complex visual input by spatial frequency-selective channels (Sekuler, 1974). Such a visual harmonic analysis might be performed separately on each monocular view, and small differences in the results of the dual harmonic analyses could be interpreted as rotations from frontoparallel. This model we shall refer to as the “harmonic analysis hypothesis”.

Although neither the spatial disparity hypothesis nor the harmonic analysis model has received direct experimental support, each has ostensibly been refuted. Blakemore (1970) has reported that the collapse of binocular fusion and tilt does not simply depend on overall retinal disparity, computed cumulatively across the entire width of a stereoscopic grating display. Moreover, Blakemore noted that the perception of tilt is not affected by steady movement in one half-image of a stereoscopic grating, a situation which should render bar-by-bar disparity computation impossible. Wilson (1976), on the other hand, has found that apparent tilt is enhanced when a gradient of cycle width is applied across the monocular gratings, a manipulation which spreads the harmonic content of each monocular pattern across many spatial frequencies and therefore confounds any comparison of half-images based upon harmonic analysis. Each of these previous investigators, then, has shown that rotation in depth can be obtained under conditions which make unlikely one or the other of the two hypotheses, i.e. spatial disparity vs harmonic analysis.

The present study, on the other hand, represents an attempt to directly pit the spatial disparity hypothesis against the harmonic analysis hypothesis. We have devised several stereograms which allow us to determine whether a comparison based on cycle width or on spatial frequency determines perceived rotation in depth. Observation of these patterns conclusively supports stereopsis by harmonic analysis.³

¹ Supported by grants from the National Institutes of Health (5-S05-RR07028 and 5-F32-EY05148).

² Present address: Department of Psychology, Washington University, St. Louis, Missouri 63130, U.S.A.

³ Note that we are dealing only with the form of stereopsis which involves rotation in depth about a vertical axis. The generation of other stereoscopic sensations may not depend upon harmonic analysis. Note also that the spatial disparity hypothesis, as we interpret it, requires a computation of disparities between corresponding bars of the two monocular gratings, a computation based on cycle width. This is not the same thing as a point-by-point cross-correlation of the two half-images, an operation which is, in effect, equivalent to a harmonic analysis.

Fig. 2. A schematic representation of how the spatial disparity hypothesis can predict the sensation of depth generated by Fig. 1 (after Ogle, 1950). The reader is viewing the representation from above, such that “vertical” in the schematic is perpendicular to the page.
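The bar-by-bar computation that Fig. 2 depicts can be sketched numerically. What follows is a minimal illustration, not the authors' procedure. It assumes a simplified top-view geometry in which both half-images are treated as superimposed in one frontoparallel plane at distance `d`, viewed by two eyes separated by `a`. The function name and all numerical values (6.5 cm eye separation, 57 cm viewing distance, 1 cm cycle width) are hypothetical choices made only for the example.

```python
import math

def fused_bar_positions(w_left, w_right, n_bars, a=6.5, d=57.0):
    """Top-view (x, z) intersection of the two eyes' lines of sight through bar n.

    Assumptions (illustrative, not from the paper): eyes at (-a/2, 0) and
    (+a/2, 0); both half-images superimposed in the plane z = d; bar n sits
    at x = n * w_left in the left eye's image and x = n * w_right in the
    right eye's image. Units are arbitrary but must be consistent (cm here).
    """
    points = []
    for n in range(-n_bars, n_bars + 1):
        x_l, x_r = n * w_left, n * w_right
        # Solve -a/2 + t*(x_l + a/2) == a/2 + t*(x_r - a/2) for t.
        t = a / (a + x_l - x_r)
        points.append((-a / 2 + t * (x_l + a / 2), t * d))
    return points

# Right eye's cycle width 10% smaller than the left eye's, as in Fig. 1.
pts = fused_bar_positions(w_left=1.0, w_right=0.9, n_bars=3)
(x0, z0), (x1, z1) = pts[0], pts[-1]
slant_deg = math.degrees(math.atan2(z0 - z1, x1 - x0))
for x, z in pts:
    print(f"x = {x:7.3f}  z = {z:7.3f}")
print(f"slant about the vertical axis: {slant_deg:.1f} deg")
```

Under these assumptions the intersections fall on a single straight line in the top view, with the left edge farther from the observer than the right. This is the tilted plane that a disparity computation based on cycle width predicts.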

METHODS

Vertical gratings were electronically generated on a pair of cathode ray tube displays, matched for phosphor (P4) and average luminance (7 cd/m²). Conventional raster generation techniques were employed. Luminance of either display was found to vary linearly with applied voltage. The displays were positioned in a mirror stereoscope such that one screen was visible to the observer’s left eye and the other to his right eye. The raster on each display was 5° visual angle square; the edges of the rasters served as convenient stimuli for binocular fusion of the two display screens. The authors served as principal observers. Each stereogram was also viewed by three individuals who were naive as to the purposes of the study; the reports of all observers agreed. The demonstration stereograms presented with this paper are typical of the many stimulus patterns actually used.

⁴ The relative contrasts of gratings described in this paper were set such that each spectral component in any grating was of the same amplitude as the corresponding spectral component of the real square-wave. For example, a real square-wave of contrast m might be stereoscopically paired with a sinusoid of contrast 4m/π at the fundamental frequency. We did make some additional observations using other contrast combinations; the results did not depend upon either relative or absolute contrast, provided that the contrasts of both monocular gratings in a stereopair were high enough to make the gratings individually visible.
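The factor relating a square-wave's contrast to that of its fundamental follows from the Fourier series of a square-wave, whose k-th odd harmonic has amplitude 4m/(πk). The sketch below uses only the Python standard library. It checks that relation numerically and lists the component amplitudes for a square-wave and for the same wave with its fundamental removed, which is the pseudo-square-wave described in the text. The helper names, the sample count and the choice m = 0.5 are illustrative assumptions, not details from the paper.

```python
import math

def sine_amplitude(waveform, k, n_samples=4096):
    """Amplitude of the sin(2*pi*k*x) component over one cycle (x in [0, 1))."""
    total = 0.0
    for i in range(n_samples):
        x = (i + 0.5) / n_samples
        total += waveform(x) * math.sin(2 * math.pi * k * x)
    return 2 * total / n_samples

m = 0.5  # hypothetical contrast of the real square-wave

def square(x):
    """Square-wave modulation of amplitude m about the mean luminance."""
    return m if (x % 1.0) < 0.5 else -m

def pseudo_square(x):
    """Square-wave with its fundamental component subtracted."""
    return square(x) - (4 * m / math.pi) * math.sin(2 * math.pi * x)

for k in (1, 3, 5, 7):
    a_sq = sine_amplitude(square, k)
    a_ps = sine_amplitude(pseudo_square, k)
    print(f"harmonic {k}: square {a_sq:.4f}, pseudo-square {a_ps:.4f}, "
          f"4m/(pi*k) = {4 * m / (math.pi * k):.4f}")
```

The pseudo-square-wave keeps the 3rd, 5th and 7th harmonics at the same amplitudes as the square-wave, while its component at the fundamental is numerically zero.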


Our observations depend upon the fact that stereoscopic fusion of sinusoidal patterns breaks down and the sensation of depth vanishes when the cycle width of one monocular grating exceeds that of the other by about 20% (Blakemore, 1970; Fiorentini and Maffei, 1971). The logic behind the observations can be best appreciated by referring to Fig. 3. The column labelled “waveform” shows luminance profiles for four different gratings; the column labelled “spectrum” gives the corresponding Fourier line spectrum for each. Grating A is a “square-wave” grating of spatial frequency f c/deg. Notice that it contains sinusoidal components at f and at all odd harmonics of f (i.e. at 3f, 5f, 7f, etc.). Grating B, which we shall call a “pseudo-square-wave”, contains all the harmonic components of A except that at the fundamental frequency f. Despite the absence of energy at f, grating B resembles a genuine square-wave of fundamental frequency f; in particular, gratings A and B appear to have the same cycle width (Campbell, Howell and Robson, 1971). Grating C is simply a sinusoid of frequency f. The cycle width of grating C is physically the same and appears the same as that of grating B; spectral components of the gratings, however, occupy non-overlapping ranges. Grating D is a sinusoid whose spatial frequency, 3f, exactly matches that of the lowest frequency component in the pseudo-square-wave. Notice, however, that the cycle widths of these two gratings, D and B, differ by a factor of 3.⁴

Consider a stereogram composed of the genuine square-wave (grating A) and a sinusoidal grating 10% different in cycle width from grating C. This stereopair is shown in Fig. 4; when fused, the two half-images produce a robust sensation of tilt which is comparable to that generated by the stereogram in Fig. 1. Of course, this depth sensation is predictable both by the spatial disparity hypothesis and by the harmonic analysis model.

Now suppose that in place of the genuine square-wave we substitute the pseudo-square-wave (grating B), leaving the pure sinusoidal half-image unchanged. Such a stereopair appears in Fig. 5. Again, the cycle widths of the two monocular gratings differ by 10%; therefore the spatial disparity hypothesis predicts that this stereogram, too, should visually fuse into a grating tilted in depth. On the other hand, the harmonic component of lowest spatial frequency in the pseudo-square-wave is approximately 3 times greater in frequency than the pure sinusoidal half-image, a difference far outside the limits for fusion and depth. According to the harmonic analysis model, then, the pair should yield no sensation of depth and, indeed, should be impossible to fuse. Looking now at Fig. 5, we see that there is no hint of fusion, let alone depth; rather, the two half-images engage in binocular rivalry. This observation seems inconsistent with the spatial disparity model, but makes sense based upon the notion of stereopsis by harmonic analysis.

It is, incidentally, very unlikely that differences in the waveforms of monocular half-images are responsible for the failure of stereopsis for the pair in Fig. 5. The waveform of a real square-wave is quite different from that of a sinusoid (see Fig. 3), but, as shown in Fig. 4, two such gratings can yield a strong sensation of rotation in depth. The tilt effect, then, does not depend upon waveform similarity; it is critically affected by harmonic similarity.

Now imagine a stereogram consisting of grating B (the pseudo-square-wave) and a sinusoidal grating whose spatial frequency is 10% different from that of grating D (Fig. 6). Because the cycle widths of these two gratings are vastly different, a sensation of depth on the basis of bar-by-bar disparity should be impossible. On the other hand, the sinusoidal half-image differs in spatial frequency by only 10% from the 3f component in the pseudo-square-wave grating; if stereopsis can occur by harmonic analysis, one might perceive a fused grating rotated in depth. If we now stereoscopically view Fig. 6, we can easily achieve fusion and the grating does appear tilted out of the frontoparallel plane. Only the harmonic analysis hypothesis predicts such a percept.

When a sinusoid similar to grating D (like the one in Fig. 6) is paired with a real square-wave (grating A), fusion and tilt can be obtained by applying a slow (2 Hz), repetitive variation to the sinusoid’s cycle width (within 10% limits). The stereoscopically combined grating is perceived as swinging (as would a gate) into and out of the frontoparallel surface. The temporal modulation makes the fused, rotating grating stand out vividly despite binocular rivalry between the high-frequency sinusoid and the fundamental component of the square-wave; others (Julesz and Miller, 1975; Mayhew and Frisby, 1976) have already shown that stereoscopic fusion and binocular rivalry can occur simultaneously within the same pattern. Application of such temporal modulation to the rivalrous stereopair shown in Fig. 5, on the other hand, fails to promote fusion. Similarity of the pure sinusoid’s spatial frequency to the frequency of the third harmonic component in the square-wave, then, leads to the sensation of tilt, even though cycle widths differ substantially. Once again stereopsis appears to originate through harmonic analysis.

Fig. 1. A stereogram composed of two sinusoidal gratings which differ by about 10% in cycle width. This figure and the other stereograms presented with this paper should be held at about arm’s length, although some readers may find the sensation of depth more vivid at a slightly nearer or farther viewing position. Diverging the eyes to achieve fusion makes the left edge of the grating seem rotated away from the observer; crossing the eyes (i.e. the left eye viewing the right half-image) makes the right edge appear farther away.

Fig. 3. Spatial luminance profiles (waveforms) for gratings used in the present study, and the corresponding Fourier line spectra. Details are given in the text. (Columns: Waveform, Spectrum; spectrum axis marked at F, 3F, 5F and 7F.)

Fig. 4. A stereogram composed of a square-wave grating and a sinusoid of about 10% smaller cycle width. Crossing the eyes to obtain fusion makes the right edge of the grating appear farther away.

Fig. 5. A stereogram composed of a pseudo-square-wave pattern and a sinusoid of about 10% smaller cycle width. Fusion is impossible.

DISCUSSION

The idea that stereopsis may occur by harmonic analysis is both intriguing and timely, for it represents a radical departure from traditional notions about stereopsis (Ogle, 1950) and is consistent with much recent evidence for spatial frequency selectivity in human vision (Sekuler, 1974). Prior to the present study, however, the only available evidence bearing on this question has been indirect (Blakemore, 1970; Wilson, 1976). Our results give direct, positive support to the harmonic analysis hypothesis; we must, nevertheless, reconcile our observations with those made in the previous investigations.

If we consider together the experiments of Blakemore (1970), Wilson (1976), and the present paper, we are led to conclude that there are at least two mechanisms which can mediate stereoscopic vision: given the proper conditions, stereopsis might occur either by spatial disparity computation or by harmonic analysis, with the proviso that similarity of harmonic content between the monocular images is required for either mechanism to function. Blakemore observed rotation in depth under conditions which preclude bar-by-bar disparity calculation (e.g. when one member of a stereopair drifts); it is reasonable to suppose that these are conditions favorable for the activation of a harmonic content comparator. Wilson observed tilt even when he varied cycle width across the extent of each monocular grating, thus introducing an overlap in the spatial frequency spectra of the paired gratings too extensive to allow anything but a spatial disparity computer to come into play (but note that the harmonic contents of the gratings were similar).

The present observations are entirely consistent with this interpretation. We created a stereopair (Fig. 5) for which neither the harmonic content comparator nor the spatial disparity computer can operate, because similarity of cycle width between the two monocular gratings does not mean that there is similarity of harmonic content. The failure of stereopsis during observation of Fig. 5, then, merely shows that harmonic similarity is a necessary condition for any kind of stereoscopic sensation (see also Julesz, 1971; Julesz and Miller, 1975). We also created a stereopair (Fig. 6) for which only a harmonic content comparator can function. The fact that we obtain rotation in depth for Fig. 6, then, conclusively demonstrates that harmonic similarity is also a sufficient condition for stereopsis.

That there might be more than one mechanism which can mediate stereoscopic vision is not a new idea (Ogle, 1950; Julesz, 1971). The novel aspect of the present study is the finding that one of these mechanisms can operate by comparing harmonic content in the two monocular half-images. The stereograms we have presented here therefore provide a crucial test. Two gratings which are similar in cycle width, but which differ in spatial frequency content, do not give rise to sensations of depth. Two gratings which are similar in spatial frequency content but not in cycle width do yield a depth sensation. These observations represent strong evidence that stereopsis can be achieved solely via harmonic analysis.
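The logic of this crucial test can be laid out as a small decision table. The sketch below is an illustrative reading of the two hypotheses, not a model proposed in the paper. It assumes each hypothesis reduces to a single tolerance check against the roughly 20% fusion limit cited in the text, with cycle widths and frequencies expressed relative to f = 1. The function names, the `Grating` type, the direction of each 10% offset, and the truncation of each spectrum to its first few harmonics are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

FUSION_LIMIT = 0.20  # approximate limit cited in the text (about 20%)

@dataclass(frozen=True)
class Grating:
    cycle_width: float               # apparent cycle width, in units of 1/f
    components: Tuple[float, ...]    # spatial frequencies present, in units of f

def within_limit(a: float, b: float) -> bool:
    """True if the larger value exceeds the smaller by no more than the limit."""
    lo, hi = sorted((a, b))
    return (hi - lo) / lo <= FUSION_LIMIT

def disparity_predicts_depth(left: Grating, right: Grating) -> bool:
    """Spatial disparity reading: only cycle widths need to be similar."""
    return within_limit(left.cycle_width, right.cycle_width)

def harmonic_predicts_depth(left: Grating, right: Grating) -> bool:
    """Harmonic reading: some pair of spectral components must be similar."""
    return any(within_limit(fl, fr)
               for fl in left.components for fr in right.components)

square = Grating(1.0, (1.0, 3.0, 5.0, 7.0))        # grating A
pseudo_square = Grating(1.0, (3.0, 5.0, 7.0))      # grating B
near_c = Grating(0.9, (1 / 0.9,))                  # sinusoid 10% off grating C
near_d = Grating(1 / 3.3, (3.3,))                  # sinusoid 10% off grating D

pairs = {"Fig. 4": (square, near_c),
         "Fig. 5": (pseudo_square, near_c),
         "Fig. 6": (pseudo_square, near_d)}
for name, (left, right) in pairs.items():
    print(name,
          "disparity:", disparity_predicts_depth(left, right),
          "harmonic:", harmonic_predicts_depth(left, right))
```

Under these assumptions the harmonic check predicts depth for Figs. 4 and 6 but not for Fig. 5, matching the reported percepts. The cycle-width check predicts depth for Figs. 4 and 5 but not for Fig. 6.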

REFERENCES

Blakemore C. (1970) A new kind of stereoscopic vision. Vision Res. 10, 1181-1199.
Campbell F. W., Howell E. R. and Robson J. G. (1971) The appearance of gratings with and without the fundamental Fourier component. J. Physiol., Lond. 217, 17-18P.
Fiorentini A. and Maffei L. (1971) Binocular depth perception without geometrical cues. Vision Res. 11, 1299-1305.
Julesz B. (1971) Foundations of Cyclopean Perception. University of Chicago Press, Chicago.
Julesz B. and Miller J. E. (1975) Independent spatial-frequency-tuned channels in binocular fusion and rivalry. Perception 4, 125-143.
Mayhew J. B. and Frisby J. P. (1976) Rivalrous texture stereograms. Nature 264, 53-56.
Ogle K. N. (1950) Researches in Binocular Vision. Hafner, New York.
Sekuler R. (1974) Spatial vision. A. Rev. Psychol. 25, 195-232.
Wilson H. R. (1976) The significance of frequency gradients in binocular grating perception. Vision Res. 16, 983-989.