SPATIO-TEMPORAL AVERAGING AND THE DYNAMIC VISUAL NOISE STEREOPHENOMENON

R. A. NEILL

Physics Department, University of Newcastle, N.S.W., Australia

(Received 26 October 1979; in revised form 11 June 1980)
Abstract-When dynamic visual noise (D.V.N.) is viewed binocularly under conditions where the noise display brightness for one eye differs from that for the other, the noise appears to divide into laterally streaming planes that are separated in depth. The aim of this paper is to confirm that spatio-temporal averaging plays a role in this D.V.N. stereophenomenon. It is proposed that the duration of spatio-temporal averaging for this effect is at least 100 msec. If this is so, then for a given noise density the subjective appearance of the streaming noise planes should be independent of presentation frame rate* in the range 20-120 Hz. A detailed study of the effect of the frame rate of D.V.N. on the apparent streaming velocity was made. The results of the experiment are compared with the predictions of three existing models which have been proposed to explain the D.V.N. stereophenomenon. Finally, an alternative model is proposed; this model is based on spatio-temporal averaging processes.

* Dynamic visual noise can be presented in three ways: as a cinematographic sequence of static random noise “frames”, as a dynamic noise display presented on a raster, and as a point-to-point dynamic random noise plot. For a cinematographic display, frame rate is the rate at which independent static random noise fields are presented. Each frame has a definite onset time and offset time; the entire static noise field is presented at full intensity during the “ON” period. A raster display consists of a grid made up of a descending series of near-horizontal lines which are scanned by an electron beam. The beam scans across each line and then returns to the beginning of the line below it, which in turn is scanned. After the complete raster grid has been covered, the beam returns to the top line and repeats the sequence. In such a display “frame rate” is defined as the frequency at which the electron beam scans past any given point. (In order to avoid flicker, television receivers use “half-frames” in which half of the grid lines are scanned at the first pass and the other half are scanned at the second pass. In this paper the words “frame rate”, when applied to a television receiver, are used to describe the rate at which half-frames are scanned.) The noise is created by modulating the strength of the electron beam. As soon as the beam passes a point, the brightness of that point will begin to decay; at no time is the entire screen uniformly illuminated. Thus terms such as onset and offset are not particularly meaningful in this case. In a point-to-point plot the noise is presented randomly on a two-dimensional matrix. Both the vertical and horizontal positions of the electron beam (and hence the noise dot) are changed randomly. In this case “frame rate” is a meaningless term.

INTRODUCTION

When dynamic visual noise (D.V.N.) is viewed binocularly with an interocular delay, the noise field takes on a structured appearance. The noise appears to divide into two or more planes in depth; the noise elements which lie on the forward-protruding planes seem to stream laterally in a direction from the undelayed eye to the delayed eye; noise lying on recessed planes streams in the opposite direction. The streaming motion effect is a robust one: subjects who show little or no response to stereotests such as those based on static or dynamic random dot stereopatterns will often perceive the D.V.N. stereophenomenon (Dunlop et al., 1980). The interocular delay can be created either by placing a light attenuating filter in front of one eye (Tyler, 1974) or by introducing actual interocular delays into haploscopic displays (Ross, 1974).

The D.V.N. stereophenomenon is apparently related to the familiar Pulfrich effect (Pulfrich, 1922). The conventional explanation for the Pulfrich effect is that the light attenuation due to the neutral density filter causes an interocular delay. This in turn creates a spatial disparity, thus yielding a subjective elliptic orbit of the swinging pendulum. It is important to note that the neutral density filter does not simply create an interocular delay. The reduction in luminance due to the filter also creates greater visual persistence (Allport, 1968). Several papers (for example Wilson and Anstis, 1969; Rogers and Anstis, 1972; Julesz and White, 1969) have sought to measure visual delay as a function of luminance. The results of each of these experiments can also be interpreted, at least in part, in terms of increased visual persistence as a function of decreased luminance.

Morgan (1977) has proposed a differential visual persistence model for the Pulfrich stereophenomenon. In Morgan’s model the perceived position of a laterally moving object will be a spatio-temporal average of the positions which the object has occupied over the period of persistence. The perceived position of the object will trail its actual position; increased visual persistence due to light attenuation will cause the object to seem to lie even further behind the true spatial position. Hence differential visual persistence could cause the perceived position of the filtered image to lag that of the other. This would lead to the occurrence of the Pulfrich effect.
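The averaging idea in Morgan’s model can be illustrated numerically. The sketch below is only an illustration under stated assumptions and is not Morgan’s own formulation: it assumes a target moving at constant velocity and a uniform weighting over the persistence period, and the velocity and persistence values are hypothetical. Under those assumptions the averaged position trails the true position by the velocity multiplied by half the persistence period, so a longer persistence in the filtered eye produces a positional difference between the two monocular images.

```python
def perceived_position(t, velocity, persistence, samples=1000):
    """Average position of a target moving at constant velocity.

    The average is taken with uniform weight over the window
    [t - persistence, t]. Uniform weighting is an assumption made for
    this sketch; the text does not specify a weighting function.
    """
    total = 0.0
    for i in range(samples + 1):
        tau = t - persistence * i / samples  # sample time inside the window
        total += velocity * tau              # true position at time tau
    return total / (samples + 1)


def positional_lag(velocity, persistence):
    """Distance by which the averaged position trails the true position."""
    t = 1.0  # arbitrary observation time; the lag does not depend on it
    return velocity * t - perceived_position(t, velocity, persistence)


if __name__ == "__main__":
    velocity = 10.0           # deg/sec, hypothetical
    persistence_clear = 0.10  # sec, hypothetical (unfiltered eye)
    persistence_dim = 0.15    # sec, hypothetical (filtered eye)
    lag_clear = positional_lag(velocity, persistence_clear)
    lag_dim = positional_lag(velocity, persistence_dim)
    print("lag, unfiltered eye (deg):", round(lag_clear, 4))
    print("lag, filtered eye (deg):", round(lag_dim, 4))
    print("difference between the eyes (deg):", round(lag_dim - lag_clear, 4))
```

A different weighting function (for example an exponential decay) would change the size of the lag but not its direction.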
Morgan’s differential persistence model is more easily applied to the Pulfrich-type phenomenon which occurs when viewing stroboscopic apparent motion displays (Morgan and Thompson, 1975; Morgan, 1976) than is the simple visual latency model as proposed by Fertsch (Pulfrich, 1922). In contradiction to Morgan’s model, Ross and Hogben (1975) have shown that a real delay is able to create the stroboscopic Pulfrich effect, even under conditions where differential persistence would not occur. This result shows that Morgan’s differential persistence model cannot be applied to the Pulfrich effect at the expense of the conventional visual latency model. However, the two models can be reconciled. Morgan (1979) and Burr and Ross (1979) both showed that, for interstimulus intervals of less than about 50 msec, the instantaneous position of a stroboscopically moving object is perceptually the position it would occupy if it were moving in a continuous manner. Hence a temporal lag introduced into the right or left image of an apparent motion display should create a virtual disparity identical to the spatial disparity which would occur with a real motion stimulus. Clearly the Pulfrich effect does occur as a result of the interocular delay creating spatial disparity, whether real or virtual. When it is present, however, increased visual persistence of the delayed eye will lead to spatio-temporal averaging over a longer period; this may accentuate the Pulfrich effect.

While it is also likely that, when it is present, differential persistence contributes to the D.V.N. stereophenomenon, as a first approximation it seems reasonable to neglect its effect. The results which justify this conclusion are: (1) Ross (1974) and Ross and Hogben (1974, p. 1198) reported a very similar phenomenon in which a real temporal delay introduced into haploscopic displays of equal binocular luminance (hence no differential persistence) yielded a streaming motion in depth effect. (2) Tyler (1977) also reported that the D.V.N. stereophenomenon resulted from real delays of 25-100 msec.
Thus it appears that the interocular delay created by a neutral density filter can be considered to be the source of the D.V.N. stereophenomenon.

Just as it appears that spatio-temporal averaging contributes to the stroboscopic Pulfrich effect, it is possible that it also contributes to the D.V.N. stereophenomenon. Using dynamic random dot stereograms with right and left patterns separated by real interocular delays, Ross and Hogben (1974) found that fusion and clear perception of depth could be maintained for 50-75 msec. They found that performance on their test regimen approached chance levels at interocular delays of 150 msec. Julesz and White (1969) displayed cinematographic series of independent static random dot stereograms with right and left members of each pair separated temporally by one frame. They too found that binocular fusion could be maintained with interocular delays of at least 50 msec.

The visual processing required for perception of depth and form in dynamic random dot stereograms such as those used by Ross and Hogben and by Julesz and White would appear to be somewhat greater than for the perception of the D.V.N. stereoeffect. Random dot stereograms require the perception of both the discontinuity which occurs at the border of the disparate region and of the disparity itself (Marr and Poggio, 1979). In the D.V.N. stereophenomenon there is no discontinuity of disparity within the noise field; it seems reasonable to assume that the duration of memory for continuous disparity may be close to Ross and Hogben’s asymptote of approximately 150 msec. Ross (1976) claimed that the memory process involved in this phenomenon may be able to cope with interocular delays of up to 2 sec; however, he did not publish experimental results to justify this claim.

The D.V.N. stereophenomenon has usually been investigated with interocular delays of up to 100 msec*; almost certainly the spatio-temporal average which leads to memory for disparity is taken over this duration. Models which are proposed to explain the D.V.N. stereophenomenon should therefore in some way account for these persistence effects.

* Ross and Hogben (1974, p. 1198) reported that streaming occurred with real delays in the range 18-54 msec. Wilson and Anstis (1969) measured visual delay as a function of luminance and found that they could produce delays of up to 100 msec using neutral density filters (neglecting the possible perturbing effect of differential visual persistence on this delay figure). Much of the research into the D.V.N. stereophenomenon has been performed with delays provided by light attenuating filters (Tyler, 1974; Mezrich and Rose, 1977; Ward and Morgan, 1978); hence this research has probably been performed with delays of up to 100 msec. Tyler (1977) found that an interocular delay of 75 msec gave optimal depth sensation in the D.V.N. stereophenomenon.

METHOD
A convenient method of testing the proposal that spatio-temporal averaging plays a role in the D.V.N. stereophenomenon is to check whether the subjective appearance of the effect changes with frame rate. If spatio-temporal averaging plays a significant role in the effect, the apparent velocity of streaming should be independent of frame rate for frequencies which are high enough to allow averaging to occur over several frames (above about 10-20 Hz).

In order to check for any functional dependence between frame rate and apparent velocity of streaming, it was necessary to present to the subject a visual display, consisting of dynamic visual noise, which had a variable frame rate. A means of measuring the apparent drift velocity of the coherent sheets had to be provided, with the obvious constraint that the appearance of the streaming motion be unperturbed by the measurement technique. It was also important to keep constant such factors as overall noise density, number of lines per frame and average display brightness. This meant that, with the exception of very low frame rates (where flicker became evident), the monocular subjective appearance of the display was independent of frame rate.

In the second part of the experiment, the subject was presented with a dynamic random dot pattern which was displayed in the point-to-point mode and as such had no frame rate at all. This experiment should have revealed any overall differences between the subjective perception of streaming in raster type noise displays and noise displays which are randomly plotted in both vertical and horizontal directions.

Apparatus
A variable frequency sawtooth generator, which could be synchronised to the +gate output of an oscilloscope, was constructed. The sawtooth waveform was fed into the vertical deflection system of a free-running oscilloscope and was used to provide a stable raster with a measurable number of lines. Changing the sawtooth frequency varies the frame rate, but also alters the number of lines on the raster. By simultaneously adjusting the horizontal sweep rate, the number of lines can be kept constant. In this way, for a given noise frequency, subjective noise density will be independent of frame rate as long as the interframe interval is less than the subjective integration period.

The variable amplitude output of a gas-discharge based white noise source was compared with a fixed voltage level in a comparator unit. Whenever the voltage of the noise exceeded the reference voltage, the comparator unit triggered a monostable multivibrator, which gave a 200 nsec positive going pulse. This pulse was NANDed with the Schmitt output of the sawtooth generator, thus ensuring that the oscilloscope was blanked during fly-back between frames. The output of the NAND was a negative going, 200 nsec pulse which was suitable for unblanking the Tektronix 7603 oscilloscope used in these experiments. The random train of square pulses was input to the Z-axis of the oscilloscope, each pulse illuminating a point on the raster. The display thus took the form of a dynamic random dot pattern. The luminance of the random dots was measured by matching their brightness to that of a large uniformly illuminated field. The luminance of the field was then measured with a Tektronix 516 photometer. The oscilloscope was arranged on the visual axis such that the display subtended 10° horizontally by 8° vertically. The subject’s head was aligned by a forehead rest and chin cup.
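The relationship between frame rate, line count and sweep rate described above can be sketched as simple arithmetic. The example below is an illustration only: it uses the values quoted in the Procedure (a raster of 200 lines and an average of 60,000 dots per sec) and the seven frame rates used in the experiment, and it assumes that fly-back time is negligible, which the text does not state. The function and key names are hypothetical.

```python
FRAME_RATES_HZ = [20, 25, 35, 45, 60, 90, 120]  # frame rates used in the experiment


def raster_settings(frame_rate_hz, lines_per_frame=200, dot_rate_per_sec=60000):
    """Derive raster timing figures for one frame rate.

    Assumes fly-back time is negligible, so the horizontal line rate is
    simply frame rate multiplied by the number of lines per frame.
    """
    line_rate_hz = frame_rate_hz * lines_per_frame
    return {
        "interframe_interval_ms": 1000.0 / frame_rate_hz,
        "line_rate_hz": line_rate_hz,
        "dots_per_frame": dot_rate_per_sec / frame_rate_hz,
        "dots_per_line": dot_rate_per_sec / line_rate_hz,
    }


if __name__ == "__main__":
    for rate in FRAME_RATES_HZ:
        s = raster_settings(rate)
        print(rate, "Hz:",
              round(s["interframe_interval_ms"], 1), "msec per frame,",
              s["line_rate_hz"], "lines per sec,",
              round(s["dots_per_frame"]), "dots per frame")
```

With the dot rate held fixed, the number of dots in any one frame falls as frame rate rises, while the number of dots presented per unit time stays the same. This is why noise density is expected to appear constant only while several frames fall within the integration period.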
A vertical mirror was placed slightly below the screen of the Tektronix oscilloscope and was oriented at 45° to the visual axis. Another oscilloscope, a B.W.D. 501, was placed at right angles to the line of sight and was viewed via the mirror. Using this arrangement, the reflected image of the B.W.D. screen subtended 10° by 8° and lay directly below that of the Tektronix; the two screens were separated by about 2°. Both oscilloscopes had P31 phosphors.

The purpose of the second oscilloscope was to provide a visual reference, with variable horizontal angular speed, which could be used to match the apparent velocity of the streaming planes. The vertical axis of the oscilloscope was driven by a triangular waveform generator of moderate frequency. This provided a reference line whose length could be set as desired. The horizontal axis was driven by an asymmetrical triangular waveform generator, an H.P. 3310A Function Generator. The slow sweep represented 85% of the period of the waveform. The reference line was driven off the oscilloscope screen during the faster return ramp.

Ward and Morgan (1978) found that it was possible to voluntarily set up smooth tracking eye movements when observing a random noise display, even without light attenuation at one eye. Further, they found that when the D.V.N. stereophenomenon was intentionally tracked, the dots appeared to move more quickly than when fixation was on a single point. While it is not critical to eliminate eye movements altogether, smooth tracking movements should be avoided. It was thus important to avoid the induction, by the reference line, of involuntary smooth tracking eye movements. Ward and Morgan found that the smooth tracking of D.V.N. was sinusoidal; thus an asymmetrical triangular waveform should break down any tendency toward sinusoidal tracking. The 15% dead time should reduce the probability of induction of optokinetic nystagmus, that is, repetitive tracking consisting of a smooth sweep followed by a saccade.

The angular speed of the sweep of this reference line across the oscilloscope screen was adjustable and could be read off the frequency scale of the horizontal waveform generator, which had previously been calibrated. A General Radio 1191-Z counter was used to monitor the average dot density of the random dot display and frame rate was measured with an Advance Instruments T.C.9 Timer Counter. Figure 1 is a block diagram of the experimental apparatus.
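The way a generator frequency maps onto an angular speed of the reference line can be sketched as follows. This is a hedged illustration, not the calibration actually used: it assumes that the slow ramp carries the line across the full 10° width of the screen, whereas the text gives only the 85% slow-sweep fraction and not the sweep amplitude. The function names and the default extent are hypothetical.

```python
def reference_speed(frequency_hz, sweep_extent_deg=10.0, slow_fraction=0.85):
    """Angular speed (deg/sec) of the reference line during the slow ramp.

    sweep_extent_deg is an assumed value: the angle covered by the slow
    ramp is not given in the text.
    """
    slow_ramp_duration = slow_fraction / frequency_hz  # sec spent on the slow sweep
    return sweep_extent_deg / slow_ramp_duration


def frequency_for_speed(speed_deg_per_sec, sweep_extent_deg=10.0, slow_fraction=0.85):
    """Generator frequency (Hz) needed to produce a given slow-ramp speed."""
    return speed_deg_per_sec * slow_fraction / sweep_extent_deg


if __name__ == "__main__":
    for speed in (4.0, 5.0, 10.0, 40.0):  # deg/sec, illustrative values
        print(speed, "deg/sec needs", round(frequency_for_speed(speed), 3), "Hz")
```

Because speed is proportional to frequency under this assumption, a linear frequency scale on the generator can be read directly as a linear speed scale once a single calibration point is known.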
For part two of the experiment, the Tektronix 7603 was used in a point-to-point plotting mode. Two independent 20-bit digital pseudorandom sequence generators were constructed. The eight least significant bits of these generators were converted into analog voltages and used to drive the vertical and horizontal axes of the oscilloscope. An amplifier provided blanking of the screen during the time-of-flight between the points. (For further details see Neill and Kennewell, 1979.) In all other respects the apparatus was as for part one.

Procedure
The density of the dynamic random dot pattern was adjusted to an average of 60,000 dots per sec. The dots had a luminance of 2 nt against a dark background; the apparatus was set up in a darkened room. For each frame rate selected, the time-base sweep rate of the Tektronix oscilloscope was adjusted to yield a raster of 200 lines ±2%. The subject’s head was aligned so that his eyes were approximately level with the centre of the noise display. Each subject had a light attenuating filter (1.0 N.D.) placed in front of his right eye; he then observed the noise display, which had been set at a frame rate of 60 Hz. The subject was instructed not to track the streaming effect, but to fixate on the centre of the visual noise display. After the subject had perceived the streaming effect, he was shown how to adjust the angular speed of the reference marker, which was either a faint vertical line (which subtended 5°) or a spot, so that it matched the apparent streaming velocity of the most dominant perceptual sheet. The subject was allowed to make as many practice velocity matchings as he desired.

The frame rate was then altered to a previously determined value and, after a 5 min adaptation period, the subject was required to make six velocity matches. The velocity of the reference line was randomly readjusted by the experimenter after each match; thus approximately half of the matches were made from above and half from below. After each set of six apparent velocity matches, the frame rate was readjusted and the matching procedure repeated. Frame rates used in this experiment were 20, 25, 35, 45, 60, 90 and 120 Hz; these were presented to each subject in a different random order.

Once the subject had performed the velocity matching procedure for all seven frame rates, the apparatus was set up in the point-to-point plotting mode. Dot density was once again set at 60,000 dots per sec and dot luminance was adjusted to 2 nt. The apparent streaming velocity was once again estimated. For the most part the subject was allowed to determine his own pace; however, in all cases the experimental session was limited to 3 hr.

Ten subjects participated in the experiments, 4 males and 6 females; all subjects had normal or corrected to normal visual acuity and all subjects had bifoveal fixation. Subjects D.D., P.D. and J.F. were experienced observers of the type of display used here; all other observers were inexperienced and were unaware of the purpose of the experiment.

Fig. 1. Schematic illustrating the layout of the experimental apparatus. The oscilloscopes and mirror were arranged to give a vertical separation of the two screens of about two degrees. The timebase of the Tektronix 7603 was free running; sweep rate was adjusted to yield a raster of 200 lines for each frame rate used.

RESULTS
Most subjects elected to match the velocity of the proximal coherent sheet which lay closest to the plane of the display screen. Subject J.F. matched the velocity of a sheet which protruded further from the screen. D.D. and T.W. matched the sheet which lay closest to, but distal to, the plane of the display screen. As the experiment was not designed to establish the mean apparent velocity of streaming of any particular coherent sheet, the results of these subjects can be included in the analyses which follow. The results of both parts of the experiment are detailed in Table 1.

Table 1. Apparent streaming velocity (deg sec⁻¹) against frame rate. Quantities in brackets are one standard deviation

                                            Frame rate (Hz)
Subject (sex, age)   20         25         35         45         60         90         120        Point to point
C.D. (F, 26)         4.9(0.7)   5.0(0.6)   5.0(0.4)   4.6(0.4)   5.5(0.5)   5.5(0.5)   4.1(0.6)   5.3(0.7)
D.D. (M, 56)         4.5(0.3)   5.1(0.6)   4.7(0.5)   5.2(0.5)   4.7(0.6)   3.6(0.4)   4.4(0.4)   3.3(0.5)
G.D. (F, 15)         10.5(1.0)  9.9(1.4)   8.9(0.6)   8.9(1.3)   9.4(1.6)   9.1(1.1)   9.5(0.8)   8.8(1.7)
H.D. (F, 17)         4.0(0.5)   4.9(1.1)   4.2(0.9)   4.3(0.4)   4.0(1.0)   4.0(0.6)   4.3(1.1)   4.6(0.7)
P.D. (F, 50)         4.6(0.7)   4.4(0.5)   5.2(0.4)   5.2(0.7)   4.6(0.7)   4.3(0.7)   4.2(0.7)   4.8(0.6)
J.F. (M, 28)         37(5)      36(4)      43(5)      45(6)      34(4)      47(6)      47(4)      45(8)
J.H. (F, 44)         4.1(0.7)   4.2(0.6)   3.7(0.8)   3.1(0.7)   4.2(0.7)   4.1(0.5)   3.5(0.6)   4.1(0.6)
E.R. (F, 19)         5.1(0.5)   4.7(0.8)   5.0(0.5)   4.8(0.8)   4.9(0.7)   4.9(0.6)   5.0(0.7)   4.9(0.4)
B.W. (M, 25)         4.1(0.8)   4.6(0.5)   4.7(0.8)   4.4(0.6)   4.7(0.8)   4.7(0.5)   4.8(0.7)   4.7(0.6)
T.W. (M, 27)         4.1(0.8)   5.1(1.0)   4.2(0.6)   3.9(0.6)   4.4(0.5)   4.8(0.6)   5.0(1.0)   5.8(0.4)

The standard deviations involved in the velocity matching procedures are large, typically of the order ±20% of the apparent velocity of streaming. This resulted from the random readjustment of the angular velocity of the reference line after each trial. Possibly the streaming velocity for any particular coherent sheet is not uniquely defined, but consists of a distribution of velocities over a reasonably wide range.

From the results of the experiment it appears that there is little or no demonstrable relationship between frame rate and apparent streaming velocity. Because the results of subjects J.F. and G.D. would heavily bias any statistical analysis of the data, it was necessary to “normalise” the results of the ten subjects. This was done for each subject by taking the mean of the eight average velocities listed across Table 1 and dividing each average velocity by that mean (see Table 2). In this way the results of each individual are given equal weight in a one way repeated measures analysis of variance. From the analysis of the above results, F(7,63) = 0.62, P > 0.25, it can be concluded that frame rate and apparent streaming velocity are independent of each other over the range of 20-120 Hz* (see Fig. 2). Even the flicker which was evident at the two lowest frame rates did not significantly influence the results. It can also be concluded that there is no significant difference between apparent streaming velocity as perceived in dynamic visual noise presented on a raster type display and that presented on a point-to-point display.

* These results are in contradiction to the results of Mezrich and Rose (1977) who performed a similar experiment and found that, as presentation frame rate is varied between 30 and 60 Hz, apparent streaming velocity is approximately doubled (p. 907). However, the results of Mezrich and Rose are presented as semi-quantitative observations; the authors do not provide data regarding the number of subjects used, noise density, number of raster lines, display area or display luminance. It is therefore impossible to replicate their experiments or to suggest reasons for the discrepancy.

DISCUSSION

The experimental results confirm the proposal that, in the range 20-120 Hz, apparent streaming velocity should be independent of frame rate. It is necessary to compare the experimental results with the predictions of three existing models for the D.V.N. stereophenomenon. The three models to be considered here are the random spatial disparity model proposed by Tyler (1974), an apparent motion model proposed by Mezrich and Rose (1977) and a temporal disparity model attributed to Ross (1974).
Random spatial disparity model
This model was introduced by Tyler in 1974 and considered in greater detail in 1977 (Tyler, 1977). Tyler’s hypothesis is that an interocular delay renders the right and left random patterns independent, thus producing a random distribution of spatial disparities in a D.V.N. stimulus. Figure 3 illustrates the geometry of Tyler’s model (see Tyler, 1974, Fig. 1). From the figure, a sequence of crossed or uncrossed disparities will have an associated rate of directed apparent horizontal motion. For as long as the sequence of random disparities remains either crossed or uncrossed, apparent motion follows. A sufficient density of dots will maintain a sequence of similar disparities; this leads the noise display to take on the characteristic structured appearance. The resulting perception is of a continuum of moving planes, the velocity at each depth being proportional to the distance from the central fixation plane.

Fig. 2. A plot of mean normalised velocity against frame rate for the apparent streaming effect. Closed circles are for the raster display; open circle is for the point-to-point plotting display. The error bars are ± one standard deviation for ten subjects. The points lie close to the dashed line, implying that the apparent velocity is independent of frame rate.

As proposed, the model does not make allowance for visual persistence. In his 1977 paper, Tyler stated that the “D.V.N. patterns in successive frames are completely uncorrelated, and so the presence of an interocular delay means that at any instant the patterns before the two eyes are uncorrelated” (p. 376) and later on the same page “when a binocular decorrelation is produced by a small interocular delay (5-100 msec)”. An interocular delay of 5 msec is not sufficient to create a binocular decorrelation; this is shown by the results of Ross and Hogben (1974) and Julesz and White (1969). As Ward and Morgan point out, even if visual persistence is neglected, the minimum interocular delay which could produce this decorrelation would be half of the interframe interval (about 10 msec in Tyler’s case). As pointed out above, once the effect of visual persistence is considered, as it should be, the interocular delay required for independence of right and left images is at least 50 msec and possibly as much as, or more than, 150 msec. The random spatial disparity model therefore requires an interocular delay of at least 50 msec before a full appreciation of the D.V.N. stereophenomenon can occur; this is contradicted by Tyler’s own experimental results.

Consider the situation where interocular delay exceeds the interframe interval by an amount such that pairing is no longer between successive frames. Figure 4 illustrates the case where interocular delay is three times the interframe interval of the display system. As illustrated, a sequence of either crossed or uncrossed disparities can have either direction of apparent motion associated with it. This is because such a sequence involves both chance spatial disparity and chance apparent movement. The display should therefore take on an incoherent appearance, unlike the coherent streaming motion of the D.V.N. stereophenomenon.
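The relation between interocular delay and interframe interval that this argument turns on can be tabulated directly. The sketch below is illustrative only: it takes 75 msec as the interocular delay (the value Tyler, 1977, is quoted as finding optimal) and evaluates it at the seven frame rates used in this experiment. The function names are hypothetical.

```python
FRAME_RATES_HZ = [20, 25, 35, 45, 60, 90, 120]  # frame rates used in the experiment


def delay_in_frames(delay_msec, frame_rate_hz):
    """Interocular delay expressed as a multiple of the interframe interval."""
    interframe_interval_msec = 1000.0 / frame_rate_hz
    return delay_msec / interframe_interval_msec


def frame_rate_for_ratio(delay_msec, ratio):
    """Frame rate (Hz) at which the delay equals `ratio` interframe intervals."""
    return ratio * 1000.0 / delay_msec


if __name__ == "__main__":
    delay_msec = 75.0  # illustrative interocular delay
    for rate in FRAME_RATES_HZ:
        print(rate, "Hz: delay spans",
              round(delay_in_frames(delay_msec, rate), 2), "interframe intervals")
    print("delay equals one interframe interval at",
          round(frame_rate_for_ratio(delay_msec, 1), 1), "Hz")
```

For a delay of this size the delay exceeds the interframe interval at every frame rate tested, which is the regime in which the pairing is no longer between successive frames.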
Table 2. The data for each subject have been normalised to a mean value of 1.0 (a dimensionless representation of apparent velocity). This was done to allow a one way repeated measures analysis of variance to be performed

                          Frame rate (Hz)
Subject   20     25     35     45     60     90     120    Point to point
C.D.      0.98   1.01   1.01   0.92   1.08   1.11   0.82   1.07
D.D.      1.01   1.15   1.06   1.17   1.06   0.81   0.99   0.74
G.D.      1.12   1.06   0.95   0.95   1.00   0.97   1.01   0.94
H.D.      0.93   1.14   0.98   1.00   0.93   0.93   1.00   1.07
P.D.      0.99   0.94   1.12   1.12   0.99   0.92   0.90   1.03
J.F.      0.89   0.86   1.03   0.81   1.13   1.08   1.13   1.08
J.H.      1.06   1.08   0.95   0.80   1.08   1.06   0.90   1.06
E.R.      1.04   0.96   1.02   0.98   1.00   1.00   1.02   1.00
B.W.      0.89   1.00   1.02   0.96   1.02   1.02   1.05   1.02
T.W.      0.88   1.09   0.90   0.84   0.94   1.03   1.07   1.24

Entries are normalised velocity (dimensionless).
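The normalisation and the one way repeated measures analysis of variance can be sketched in a few lines. This is a minimal illustration and not the author’s original computation: the function names are hypothetical, the velocities for subject D.D. are taken from Table 1, and the normalised values are those listed in Table 2. Because the tabulated values are rounded, the F ratio obtained this way need not agree exactly with the reported F(7,63) = 0.62.

```python
def normalise(velocities):
    """Divide each mean velocity by the subject's mean across conditions."""
    mean = sum(velocities) / len(velocities)
    return [v / mean for v in velocities]


def repeated_measures_anova(data):
    """One way repeated measures ANOVA.

    data: one row per subject, one value per condition.
    Returns (F, df_conditions, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # subject-by-condition residual
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_error / df_error), df_cond, df_error


# Subject D.D., Table 1 means: 20, 25, 35, 45, 60, 90, 120 Hz, point to point
DD_VELOCITIES = [4.5, 5.1, 4.7, 5.2, 4.7, 3.6, 4.4, 3.3]

# Table 2, one row per subject, columns in the same order as above
TABLE_2 = [
    [0.98, 1.01, 1.01, 0.92, 1.08, 1.11, 0.82, 1.07],  # C.D.
    [1.01, 1.15, 1.06, 1.17, 1.06, 0.81, 0.99, 0.74],  # D.D.
    [1.12, 1.06, 0.95, 0.95, 1.00, 0.97, 1.01, 0.94],  # G.D.
    [0.93, 1.14, 0.98, 1.00, 0.93, 0.93, 1.00, 1.07],  # H.D.
    [0.99, 0.94, 1.12, 1.12, 0.99, 0.92, 0.90, 1.03],  # P.D.
    [0.89, 0.86, 1.03, 0.81, 1.13, 1.08, 1.13, 1.08],  # J.F.
    [1.06, 1.08, 0.95, 0.80, 1.08, 1.06, 0.90, 1.06],  # J.H.
    [1.04, 0.96, 1.02, 0.98, 1.00, 1.00, 1.02, 1.00],  # E.R.
    [0.89, 1.00, 1.02, 0.96, 1.02, 1.02, 1.05, 1.02],  # B.W.
    [0.88, 1.09, 0.90, 0.84, 0.94, 1.03, 1.07, 1.24],  # T.W.
]

if __name__ == "__main__":
    print("D.D. normalised:", [round(v, 2) for v in normalise(DD_VELOCITIES)])
    f_ratio, df_cond, df_error = repeated_measures_anova(TABLE_2)
    print("F(%d,%d) = %.2f" % (df_cond, df_error, f_ratio))
```

After normalisation every subject has a mean close to 1.0, so the between-subject sum of squares is almost zero and the test is driven by how the condition means scatter relative to the within-subject residual.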
Fig. 3. The geometry of Tyler’s model. In this case interframe interval equals filter induced lag. A sequence of uncrossed disparities can only have an associated direction of apparent motion from the delayed to the undelayed eye. A sequence of crossed disparities will have oppositely directed apparent motion. Different shapes (○, □, △ etc.) identify individual members of a sequence of identical dots. The conventions adopted in this diagram also apply to Figs 4 and 5.
[Diagram axes: horizontal position against time (t1, t2, t3), with the apparent position at t1 and the filter induced lag marked.]
This is in contrast to the situation, illustrated in Fig. 3, where interocular delay equals the interframe interval. In this case an apparent motion from the undelayed eye to the delayed eye follows as a direct consequence of a sequence of chance crossed disparities; oppositely directed motion can only be associated with uncrossed disparities. Thus, according to Tyler’s model, if filter induced lag exceeds the interframe interval of the display system, the D.V.N. stereoeffect should collapse. For the levels of luminance used in this project, interocular delays of around 75 msec would apply on average (Wilson and Anstis, 1969). Therefore the D.V.N. stereophenomenon should not have occurred for frame rates above about 15-20 Hz. This is clearly not the case. The random spatial disparity model can, however, be modified to account for these discrepancies.

Apparent motion model
Mezrich and Rose (1977) have proposed a model in which monocularly available (but conflicting) apparent velocities are converted into depth as a result of interocular delay. Mezrich and Rose suggest that D.V.N. displays are rich in apparent motion stimuli, but that, because conflicting apparent motions crincel each other, these are normally perceived only as vagrant motions over limited areas of the display. The introduction of an interocular delay to a D.V.N. display creates a retinal disparity in much the same way as a spatial disparity is created in the conventional Pulfrich effect. Thus apparent motion is converted into depth. This allows simultaneous perception of oppositely directed motion over the entire area of the display because “overlapping, conflicting velocities
are separated at different depth” (Mezrich and Rose, 1977). It is difficult to apply this concept of retinal disparity to dynamic visual noise because, unlike the P&rich pendulum, there is no specific object moving across the line of sight. Mezrich and Rose contend that the horizontal apparent motion of noise clusters of similar shape creates the disparity required to separate oppositely-directed apparent motions. As the model requires perceptual matching of noise element patches of similar shape, it does not exclude the possibility of the D.V.N. stereoeffect occurring when independent right and left displays are viewed. This is in contradiction to the results of Ross (1974) and Tyler (1977) who showed that independent right and left D.V.N. displays are unable to produce the D.V.N. stereoeffect. Mezrich and Rose have proposed that there is a characteristic time associated with the D.V.N. stereoeffect. Over the period of this characteristic time, t,, apparent motion clusters will move through a characteristic angle, BC.The apparent velocity of streaming will be approximated by the expression u, = O& According to the apparent motion model, if the interframe interval exceeds this characteristic time streaming velocity will be reduced. Mezrich and Rose selected a characteristic time of l/60 sec. Thus a frame rate of 60 Hz or above will produce the characteristic streaming velocity. Frame rates below 60 Hz will produce streaming velocities below u, This prediction differs significantly from that of the random spatial disparity model, it also differs significantly from the results of the present investigation. It would be possible to account for the frame rate invariance of apparent streaming velocity if a characteristic time of
[Figure 4: panels A and B, each marking the apparent dot positions at t4 and t5.]
Fig. 4. Consider the case where filter-induced lag equals three times the interframe interval. A given direction of apparent motion can have either (A) uncrossed or (B) crossed disparity associated with it. A mirror image of the dot sequence shows that oppositely directed motion can also yield either uncrossed or crossed disparity. This predicted result is contradicted by experiment.
100 msec were selected. Unfortunately this would require a characteristic angle of about one degree; as Mezrich and Rose (1977) point out, “there is nothing in the visual impression that appears to single out such a gross structure as 1° fluctuations”. Overall the model does not appear to adequately account for the results of the present investigation.
Temporal disparity model
When a horizontally moving object is tracked by the eye, each eye sees a different area of foreground and background. When the eyes track an object from left to right, the left eye sees parts of the background before the right and the right eye sees parts of the foreground before the left. Ross (1974) proposed that
the visual system may use this temporal disparity of like retinal events to give an indication of the structure of foreground and background space. Ross’s suggestion of temporal disparity processing has been interpreted as a model for the D.V.N. stereophenomenon by several workers (Morgan and Thompson, 1975; Tyler, 1977; Mezrich and Rose, 1977). One of the problems associated with this model is that when the eyes track a real object, both foreground and background move across the retinae in the same direction, with one eye delayed for the foreground and the other eye delayed for the background. A converse situation applies to the D.V.N. stereophenomenon: in this case only one eye is delayed, with foreground and background moving in opposite directions. The two situations should not be equated automatically; in the former case the motion causes the delay, in the latter case the delay causes the motion. The model does, however, explain the fact that the stereoeffect does not occur in independent right and left D.V.N. displays (where the apparent motion model fails); it can account for the fact that the effect occurs over a wide range of interocular delays (where the random spatial disparity model fails); and, while the model does not rely on spatio-temporal averaging, it is able to accommodate its existence. In the temporal disparity model the apparent velocity of the streaming motion should be determined by the temporal disparity alone; it therefore correctly predicts that apparent streaming velocity should be independent of presentation frame rate. However, the model is ad hoc in nature, it is very difficult to confirm (or deny), and it requires two distinct forms of stereoscopic vision: one operating within narrow spatial and temporal limits and another operating in parallel over a wider temporal range. If it is possible to develop a model for the D.V.N. stereophenomenon which makes use of spatio-temporal averaging and spatial disparity, it is not necessary to invoke a dual form for stereopsis and the temporal disparity model becomes unnecessary.
Spatio-temporal averaging model for the D.V.N. stereophenomenon
This model is a modified form of the Random Spatial Disparity model. Basically, the model follows from the assertion that the perceived position of an object is the spatio-temporal average of its spatial position taken over some integration period. The magnitude of this integration period is unknown. It would almost certainly be at least 100 msec and it could be as high as 2 sec (Ross, 1976). A period of 100-200 msec would seem the most likely. Assume for descriptive convenience that this integration period is 100 msec and frame rate is 50 Hz. In this case the spatio-temporal average will be taken over five frames. Where there is no interocular delay, averaging for the left eye will occur over the same five frames as averaging for the right eye. Hence the right and left patterns will be perceptually identical. Now consider the situation where an interocular delay of 40 msec is introduced (see Fig. 5). Integration for right and left eyes will still occur over five frames, but the frames will no longer correspond exactly. For example, if the right eye is delayed, when it is averaging frames 1-5, the left eye will be averaging frames 3-7. Note that frames 3, 4 and 5 are common to both eyes. These common frames will provide the background fixation plane upon which chance random disparities are built. The random disparities result from the perturbing effect of frames 1 and 2 in the right eye and frames 6 and 7 in the left eye. In this way, as a result of the spatio-temporal averaging, a range of random disparities occur. With reference to Fig. 5 it is clear that a sequence of crossed disparities will have an associated direction of motion from the undelayed to the delayed eye. For as long as uncrossed disparities are maintained, apparent motion will be in the reverse direction. This model is clearly an extension of Tyler’s random spatial disparity model. But, unlike his model, it is able to explain the occurrence of the D.V.N. stereophenomenon under conditions where interframe interval does not equal interocular delay. It is not even necessary for the delay to be an integral multiple of the interframe interval. If the integration period is sufficiently long (so that
Fig. 5. The geometry of the spatio-temporal averaging model. A mirror image of the dot sequence creates crossed disparity and oppositely directed motion. Averaging occurs over several frames, thus eliminating the difficulty, illustrated in Fig. 4, which is associated with the random spatial disparity model.
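The frame bookkeeping in the worked example (100 msec integration period, 50 Hz frame rate, 40 msec interocular delay) can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the original study: the function name `binocular_windows`, its parameters, the frame numbering, and the rounding of the delay to a whole number of frames are all assumptions made for the illustration. The model itself does not require the delay to be an integral multiple of the interframe interval.

```python
def binocular_windows(integration_ms, frame_rate_hz, delay_ms, first_frame=1):
    """Frames averaged by each eye during one integration period.

    Hypothetical helper: frame numbering and rounding to whole frames
    are simplifying assumptions, not part of the original model.
    """
    frame_interval_ms = 1000.0 / frame_rate_hz
    n_frames = round(integration_ms / frame_interval_ms)  # frames per window
    shift = round(delay_ms / frame_interval_ms)           # delay in frames

    # The delayed eye is still averaging earlier frames while the
    # undelayed eye has already moved on by `shift` frames.
    delayed = list(range(first_frame, first_frame + n_frames))
    undelayed = list(range(first_frame + shift, first_frame + shift + n_frames))

    return {
        "delayed": delayed,
        "undelayed": undelayed,
        # Shared frames: the proposed background fixation plane.
        "common": [f for f in delayed if f in undelayed],
        # Unshared frames: the proposed source of random disparities.
        "delayed_only": [f for f in delayed if f not in undelayed],
        "undelayed_only": [f for f in undelayed if f not in delayed],
    }


if __name__ == "__main__":
    windows = binocular_windows(integration_ms=100, frame_rate_hz=50, delay_ms=40)
    for name, frames in windows.items():
        print(name, frames)
```

For the example values this gives frames 1-5 for the delayed eye and frames 3-7 for the undelayed eye, with frames 3, 4 and 5 in common and frames 1, 2 and 6, 7 seen by one eye only.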
integration and averaging occur over several frames) the subjective form of the D.V.N. stereoeffect will be unchanged (except for the perception of flicker at lower frame rates) over the range of frame rates from 20-120 Hz. In this case the visual system will treat in an identical manner a 120 Hz display with a dot density of n dots per frame and a 20 Hz display with a dot density of 6n dots per frame. Thus, if the integration period exceeds about 100 msec, the spatio-temporal averaging model predicts frame-rate velocity independence for the D.V.N. stereophenomenon over the range of frame rates used in the above described experiments. The results clearly indicate that streaming velocity is frame rate independent, thus lending support to the spatio-temporal averaging model. On a geometric basis, the model predicts the perception of random spatial disparities, that is, a continuous range of depths. However, on a subjective level, perception is either: of a pair of coherent depth planes, one protruding and the other recessed with respect to fixation; of multiple pairs of distinct coherent depth planes; of a cylinder rotating about its vertically aligned axis (Ross, 1976); or of a continuum of depth planes. D.D. was the only subject in the present project who reported perception of a continuum of depth planes after an extended viewing period. The other subjects were evenly divided over the other perceptual forms. While detailed consideration of this point is beyond the scope of this paper, it is tempting to speculate that the cause of the perception of discrete depths (whether in the form of single or multiple planes, or of a cylinder with a distinct “radius”) is some form of binocular depth mixture. Foley and Richards stated (1978, abstract, p.
251): “If two thin binocular vertical lines which have the same direction but different disparities with respect to a fixation point are flashed simultaneously, they interact in the determination of perceived depth (binocular depth mixture).” Possibly the dense, random range of disparities, which are proposed to occur in the D.V.N. stereophenomenon, interact to form a discrete depth perception plane. The oft-perceived cylinder could be a result of the limitation of the range of disparity which occurs near the edge of the D.V.N. display. Increased visual persistence due to the reduction in luminance of the filtered image will cause averaging to occur over a longer period in the filtered eye. According to the spatio-temporal averaging model this differential averaging will enhance the D.V.N. stereophenomenon.
CONCLUSIONS
The experimental results presented in this paper support the assertion that spatio-temporal averaging contributes to the D.V.N. stereophenomenon and that the duration of the integration period is at least 100 msec. Models which are proposed to account for the properties of stereopsis in general, and for the
perception of the D.V.N. stereophenomenon in particular, should take account of this averaging process. The fact that the perception of the D.V.N. stereophenomenon usually goes beyond the geometry of the visual display is an illustration of one of the properties of human vision: the visual system is continually seeking to extract order out of apparent disorder.
Acknowledgements: The author would like to thank Dr J. A. Kennewell for technical assistance; also Dr D. B. Dunlop, Mrs P. Dunlop and Dr T. M. Caelli for useful discussion and assistance.
REFERENCES
Allport D. A. (1968) Phenomenal simultaneity and the perceptual moment hypothesis. Br. J. Psychol. 59, 395-406.
Burr D. C. and Ross J. (1979) How does binocular delay give information about depth? Vision Res. 19, 523-532.
Dunlop D. B., Neill R. A. and Dunlop P. (1980) Measurement of dynamic stereoacuity and global stereopsis. Aust. J. Ophth. 8, 35-46.
Julesz B. and White B. (1969) Short term visual memory and the Pulfrich Phenomenon. Nature 222, 639-641.
Marr D. and Poggio T. (1979) Computational theory of human stereo vision. Proc. R. Soc. 204, 301-328.
Mezrich J. J. and Rose A. (1977) Coherent motion and stereopsis in dynamic visual noise. Vision Res. 17, 903-910.
Morgan M. J. (1976) Pulfrich effect and the filling in of apparent motion. Perception 5, 187-195.
Morgan M. J. (1977) Differential visual persistence between the two eyes: a model for the Fertsch-Pulfrich effect. J. exp. Psychol. 3, 484-495.
Morgan M. J. (1979) Perception of continuity in stroboscopic motion: a temporal frequency analysis. Vision Res. 19, 491-500.
Morgan M. J. and Thompson P. (1975) Apparent motion and the Pulfrich effect. Perception 4, 3-18.
Neill R. A. and Kennewell J. A. (1979) A clinical test for stereopsis. Aust. Phys. Sci. Med. 2, 463-480.
Pulfrich C. (1922) Die Stereoskopie im Dienste der isochromen und heterochromen Photometrie. Naturwissen 10, 553-564.
Rogers B. J. and Anstis S. M. (1972) Intensity versus adaptation and the Pulfrich stereophenomenon. Vision Res. 12, 909-928.
Ross J. (1974) Stereopsis by binocular delay. Nature 248, 363-364.
Ross J. (1976) The resources of binocular perception. Scient. Am. 234, 80-87.
Ross J. and Hogben J. H. (1974) Short-term memory in stereopsis. Vision Res. 14, 1195-1201.
Ross J. and Hogben J. H. (1975) The Pulfrich effect and short-term memory in stereopsis. Vision Res. 15, 1289-1290.
Tyler C. W. (1974) Stereopsis in dynamic visual noise. Nature 250, 781-782.
Tyler C. W. (1977) Stereomovement from interocular delay in dynamic visual noise: a random spatial disparity hypothesis. Am. J. Optom. 54, 374-386.
Ward R. and Morgan M. J. (1978) Perceptual effect of pursuit eye movements in the absence of a target. Nature 274, 158-159.
Wilson J. A. and Anstis S. M. (1969) Visual delay as a function of luminance. Am. J. Psychol. 82, 350-358.