Vision Res. Vol. 34, No. 7, pp. 913-925, 1994
Copyright © 1994 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0042-6989/94 $6.00 + 0.00
Pergamon

Some Temporal Aspects of Stereoacuity
T. KUMAR,* D. A. GLASER*
Received 12 May 1993; in revised form 30 June 1993
Stereoacuity thresholds improve considerably with practice when measured using three vertical lines 15' apart horizontally and presented briefly. For experienced observers, these thresholds are relatively independent of exposure time for stimulus durations smaller than 100 msec. The thresholds are 2-3 times larger when the outer flanking lines are shown continuously than when they and the central target line are turned on and off simultaneously. When the target and flanking lines are shown sequentially, stereoacuity thresholds can be predicted from the number of times the configuration is presented. Changes in thresholds can be measured for intervals as small as 5 msec between successive presentations of the relative disparity configuration. The underlying mechanism is modeled well by a first order auto-regression process.

Stereoacuity   Disparity   Stereopsis   Binocular vision   Depth perception

*Department of Molecular and Cell Biology, Neurobiology Division, and Department of Physics, c/o Stanley/Donner ASU, University of California, Berkeley, CA 94720, U.S.A.

INTRODUCTION

Stereoacuity is better for longer observation times (Ogle & Weil, 1958), requires simultaneous viewing of at least two features to achieve low thresholds (Westheimer, 1979), and improves when additional features are added nearby (Kumar & Glaser, 1992). Changes in perceived depth can be induced by nearby features and these changes are significantly smaller when the inducing features are shown continuously (Kumar & Glaser, 1993b). Stereoacuity itself may also depend on whether nearby reference features are displayed continuously or briefly, since it may depend on some of the same mechanisms responsible for induced depth effects. We do not know of any reports that have addressed this issue directly.

Westheimer (1979) reported poor stereoacuity when the comparison and test features were shown sequentially rather than simultaneously, each item appearing only once during a trial. Does performance improve if the comparison and test features are displayed alternately and repeatedly during a trial? If so, disparity information from successive presentations would seem to have a cumulative effect on depth discrimination. There should then be an “integration time” over which such effects accumulate and a time interval between successive stimulus presentations beyond which perceived depth depends only on the last cycle. There could also be circumstances in which the effects of successive presentations depend on the number of stimulus cycles, rather than on their total duration.

We found that stereoacuity is indeed better when reference features are shown briefly and synchronously with test features than when reference features are displayed continuously. Since stereoacuity for even slightly novel stimuli depends strongly on training even for very experienced observers (Kumar & Glaser, 1993a), the participants in this study were given enough practice to attain “stable” asymptotic performance for each condition tested. Although we found the initial stereoacuity to be better for longer stimulus durations as reported by Ogle and Weil (1958), performance improved much more with practice. For very brief presentations, practice lowered stereo thresholds by as much as a factor of 10. Even observers with extensive experience in doing stereoacuity tasks involving long presentation times showed large initial thresholds for short duration stimuli, and those thresholds improved with practice.

When the test and comparison lines were shown alternately for less than 50 msec each, displaying the sequence of these lines repeatedly was found to improve performance considerably so that it approached the best obtained for simultaneous presentation of these lines. Repetition improved performance even when the comparison and the test features themselves were displayed for only 10 msec each within any sequential presentation. For stimulus onset asynchronies greater than 50-75 msec there was a strong perception of apparent motion which interfered with the stereo task and thresholds increased. However, when the comparison features were shown continuously and only the test feature was flashed repeatedly for 10 msec at a time, apparent motion was not perceived regardless of the interval between the flashes. Performance depended on the total number of times the test feature was shown whether the comparison features were shown continuously or simultaneously with the test line. The pattern of improvement in threshold due to repetition is consistent
with the assumption that the responses to multiple presentations are random variables drawn from a normal distribution and are correlated. Our data are fitted very well with only the correlation coefficient as a free parameter. The value of this correlation coefficient depends upon the interval between the repetitions.
METHODS

Stimuli were presented on two Hewlett-Packard vector oscilloscopes (HP-1345) with white P4 phosphor. They were arranged using Polaroid polarizers so that observers could see only one oscilloscope with each eye. The stimuli consisted of three vertical lines 13 min arc long with their midpoints 15 min arc apart and aligned horizontally. The viewing distance was 3 m. The lines appeared bright against a dark background with a luminance of 34 cd/m² at a distance of 3 m as measured through the nominal 6 min arc aperture of the Pritchard 1980B spectrophotometer. The line width was about 1 min arc at this distance. The duration and sequences of presentation of these lines are specified for each experiment separately.

The Pritchard spectrophotometer was equipped with the pulsed-light modification accessory which in conjunction with a high-speed oscilloscope could be used to study the shape and peak value of pulses as short as 180 nsec. When the pulsed light source was triggered repetitively the corresponding waveform could be examined on a fast oscilloscope. The Z-axis voltage that controlled the intensity on the HP 1345 oscilloscopes was also available for examination. The pulsed-light modification and the Z-axis voltage of the HP 1345 oscilloscopes were used to calibrate the temporal properties of the stimuli reported in this paper. Temporal intervals were measured using a software programmable timer/counter that used the waveform provided by a 1 MHz solid state crystal as its input clock.

For all of the experiments the task was to judge whether the middle line was closer or farther than the outer two lines. A single trial occupied 3-3.5 sec beginning with an empty field presented for 200 msec, followed by the three lines presented in a specific sequence for each experiment, and ending with an empty field again for 200 msec. Between trials, a bright square frame 1.07 deg on a side was shown in the center of the field together with its diagonals. When the middle line of the three-line stimulus was shown, its center was at the center of this square. The disparity of the middle line was selected at random for each trial from a set of six disparities, three crossed and three uncrossed, chosen so that the observers’ responses ranged from almost always “closer” to almost always “farther”. For all of the experiments the observers responded as quickly as they deemed comfortable after the presentation of the test line, and were not forced to wait until the stimulus presentation was completed. Error signals were given if the response was “farther” when the middle line had crossed disparity or if the response was “closer” when the disparity was uncrossed. The error signal was a horizontal line 3 min arc long displayed for 25 msec and located 1 deg below the center of the middle vertical line.

The percentage of “closer” responses as a function of disparity was fitted using probit analysis to a psychometric function chosen to be the integral of a Gaussian to compute a mean (the disparity value for which 50% of the responses were “closer”) and a threshold (half the difference between the disparity values corresponding to 25% of the responses being “closer” and to 75% of the responses being “closer”). Each run consisted of 180 trials and at least four runs were conducted to obtain the average of the mean and of the threshold. Additional runs were carried out if there was evidence of training. There were clear indications of improvement with practice for all of the observers tested and for nearly all the experiments reported here. The reported data are averages over all of the runs conducted for each experimental configuration. Error bars in the figures correspond to ±1 SD of the thresholds collected over the many runs.

Data for three observers are reported in this paper, and one of the authors conducted and verified all of the results although his results are not reported here. The three observers were undergraduate students with vision that was normal or corrected to normal. EO was an experienced observer who had participated in previous experiments on stereopsis, but she had never participated in an experiment with observation times < 100 msec. AQ had less experience as an observer in psychophysical experiments and had no experience with observation times < 1000 msec. FK had never participated in any psychophysical experiment before this study.

RESULTS
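As an aside on the data analysis described in the Methods, the mean and threshold definitions can be illustrated with a short Python sketch. It is not the authors' probit routine: it assumes a simplified, unweighted least-squares line through the probit-transformed proportions, whereas a full probit analysis would use iteratively reweighted maximum likelihood. The function name `fit_psychometric` and the demonstration data (generated from a hypothetical observer with mean 1 sec arc and Gaussian SD 12 sec arc) are illustrative assumptions, not values from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probit transform


def fit_psychometric(disparities, p_closer):
    """Fit a cumulative Gaussian to proportions of "closer" responses.

    Simplified stand-in for probit analysis (an assumption of this sketch):
    transform each proportion to a z-score and fit z = (x - mu) / sigma
    by ordinary least squares. Returns (mu, sigma, threshold), where
    threshold is half the distance between the 25% and 75% points.
    """
    # Proportions of exactly 0 or 1 have no finite z-score, so skip them.
    points = [(x, STD_NORMAL.inv_cdf(p))
              for x, p in zip(disparities, p_closer) if 0.0 < p < 1.0]
    if len(points) < 2:
        raise ValueError("need at least two proportions strictly between 0 and 1")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in points)
    slope = sxz / sxx                    # equals 1 / sigma
    intercept = mean_z - slope * mean_x  # equals -mu / sigma
    sigma = 1.0 / slope
    mu = -intercept / slope
    # (x75 - x25) / 2 = sigma * z(0.75), about 0.6745 * sigma
    threshold = sigma * STD_NORMAL.inv_cdf(0.75)
    return mu, sigma, threshold


# Hypothetical noise-free observer: six disparities in sec arc,
# crossed disparities positive as in the text.
demo_disparities = [-22.5, -15.0, -7.5, 7.5, 15.0, 22.5]
demo_observer = NormalDist(mu=1.0, sigma=12.0)
demo_p_closer = [demo_observer.cdf(x) for x in demo_disparities]

mu, sigma, threshold = fit_psychometric(demo_disparities, demo_p_closer)
print(f"mean = {mu:.2f}, sigma = {sigma:.2f}, threshold = {threshold:.2f} sec arc")
```

With this definition the threshold is a fixed fraction (about 0.67) of the Gaussian SD, so thresholds from different runs can be compared directly.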
Experiment 1: effect of observation time

The threshold for depth discrimination as a function of stimulus duration before and after extensive practice was measured. In previous studies, we found that thresholds for depth discrimination improve with practice for inexperienced observers, and that at least a few hundred trials are needed before observers respond confidently for stimulus durations < 100 msec. Even observers with previous experience in stereopsis experiments required some initial trials before they responded confidently to stimuli of < 100 msec duration. For initial threshold measurements involving observation times of 100 msec and less, the six disparities from which the disparity of the middle line was randomly selected were ±37.5, ±75 and ±112.5 sec arc, and for observation times greater than 100 msec, they were ±7.5, ±15 and ±22.5 sec arc. Crossed disparities are defined to be positive and uncrossed, negative. Before any data were collected, observers were shown the stimulus ten times with the middle line at each of the six disparities. The stimulus duration was 5 sec for this initial familiarization. Observers were told that the middle line would always have one of the disparities that they had seen during the familiarization process. With these sets of disparities, thresholds between 5 and 120 sec arc could
FIGURE 1. Stereoacuity thresholds for three observers as a function of stimulus duration (msec) before (open symbols) and after (solid symbols) extensive practice. The stimulus consisted of three vertical lines 13 min arc long, 15 min arc apart and aligned horizontally. The task was to specify whether the line in the middle was closer or farther than the outer two lines. The error bars in all the figures correspond to ±1 SD of the thresholds collected over the many runs.
be measured. Initial thresholds were measured for stimulus durations in the ascending order, 5, 10, 50, 100, 250, 500 and 1000 msec. After the measurement of these thresholds, data were collected again for the same stimulus durations, but in the descending order, beginning with 1000 msec. Observers collected data for each stimulus duration until their threshold for that duration improved to an asymptotic value before they moved to the next shorter stimulus duration. Durations of < 100 msec required 15-20 runs, or about 3000 trials, for one of the observers to reach stable asymptotic thresholds. Initial and final thresholds for three observers are shown in Fig. 1. Observers FK and AQ responded quite erratically to a 5 msec stimulus during their first run of 180 trials and their data yielded a very poor fit to the psychometric function, indicating thresholds larger than 120 sec arc. This initial run was discarded for these two observers and data plotted in Fig. 1 for that point are the averages of the next two runs. The remaining values are the averages of the first two runs for each value of the stimulus duration. Error bars for the final asymptotic values of threshold are shown in Figs 2 and 3 where these same data are replotted on a magnified scale.

Data were also collected for two of the observers when the line at the left side of the three-line stimulus was not shown at all. The final asymptotic thresholds for this condition are also plotted in Fig. 2 for two of the observers. Thresholds for depth discrimination are higher when there is only one comparison line than when there are two comparison lines. This is consistent with our earlier demonstration that depth discrimination improves as the number of comparison lines increases (Kumar & Glaser, 1992). These asymptotic thresholds for comparing the depth of two lines are similar to those reported by Westheimer and Pettet (1990) and are also shown in Fig. 2 for comparison (observers MWP and SL). While Westheimer and Pettet maintained constant retinal flux for short exposures as required to satisfy the Bunsen-Roscoe-Bloch law, we kept the luminance of the lines constant so that the three lines were very dim for short exposures. For the 5 msec duration the inter-stimulus square appeared very bright compared with the three-line stimulus and seemed to interfere with the percept being measured. We therefore reduced the luminance of the lines in the inter-stimulus square to 6 cd/m² at a distance of 3 m as measured through the nominal 6 min arc aperture. The stimulus used by Westheimer and Pettet was also different; they measured depth discrimination using two uniformly luminous rectangular panels of sides 8 x 11 min arc and separated by a 5 min arc gap. Our thresholds are quite similar to theirs in spite of these differences.

Experiment 2: effect of displaying the comparison lines continuously

In this experiment, the outer two lines were displayed continuously during an entire run and instead of displaying a square between trials, a single 1 min arc dot was shown halfway between the outer two lines. The disappearance of this dot for 200 msec at the beginning of each trial cued the beginning of a trial. Thresholds measured under these conditions were higher than when all three lines were “turned on” and “turned off” simultaneously. Since Expt 1 was carried out over a
FIGURE 2. Stereoacuity thresholds as a function of stimulus duration for two observers when the stimulus had only one comparison line (2 LINES) and when it had two comparison lines (3 LINES). Data of MWP and SL were reported by Westheimer and Pettet (1990), and are shown here for comparison.
period of several months, we were concerned that some systematic bias might have affected the data. To control for this possibility, we interleaved runs during which the outside lines were displayed continuously with runs for which all three lines were turned on and turned off simultaneously for observer FK. The duration of the test feature was the same for consecutive runs. We were also concerned that the two lines displayed continuously with the 1 min arc fixation dot might be too sparse a stimulus to allow good maintenance of convergence. FK and AQ
therefore repeated parts of Expt 1 and the corresponding parts of this experiment with the overhead fluorescent lights turned on. Threshold values for three observers (EO, AQ and FK) are shown in Fig. 3. On the average, a data point for continuously displayed outer lines was computed from 17 runs or 3060 trials to make reasonably sure that further significant improvement in performance was unlikely. It is surprising that stimulus durations of nearly a second are required for the thresholds for the two
FIGURE 3. Stereoacuity thresholds as a function of presentation time (msec) for three observers (EO, AQ and FK) when all three lines in the stimulus were turned on and off at the same time (all 3 lines shown simultaneously; open symbols) and when the outer two comparison lines were shown continuously during a run and only the middle test line was turned on and off for each trial (2 outer lines shown continuously; solid symbols).
FIGURE 4. Stereoacuity thresholds for three observers as a function of onset asynchrony. The outer two comparison lines were shown for 1 sec and the middle test line was shown for 20 msec.
conditions to become equal. For stimulus durations < 50 msec, the thresholds differ by a factor of 3-4. Thresholds are slightly lower when the overhead room lights are on, but thresholds still depend significantly on whether the comparison lines are on continuously or only pulsed on briefly with the test line. The depth perception process yields a threshold in the 5-15 sec arc range for synchronous onset and offset of the three lines for all stimulus durations between 5 and 1000 msec, while for continuously present flanking lines, the thresholds are 3 or 4 times larger for a 10 msec stimulus duration and decrease into the 5-15 sec arc range only for stimulus durations greater than about 500 msec. The stereo depth system seems to respond best to transient and simultaneous relative disparity (solid symbols, Fig. 3) although thresholds for continuous flanking lines finally reach the same low thresholds for extended viewing times. Thus there appear to be both fast and slow mechanisms for attaining low stereo thresholds depending on the temporal properties of the stimulus. The geometry of the stimulus is the same for both cases while the central line is displayed, yet the response is dramatically different. A viable interpretation, suggested by one of the reviewers, is that stereo mechanisms respond best when comparing transient to transient or sustained to sustained elements of the stimulus.
Experiment 3: effect of asynchrony between the onsets of the outer comparison line and of the middle test line

Perhaps the comparison and test lines must turn on or off at about the same time to achieve low stereo thresholds. We investigated this issue by displaying the two outer comparison lines for 1 sec and the middle test line for 20 msec, while varying the time interval between the onset of the comparison lines and of the test line. Eight different onset asynchronies were tested and their designations are given here in msec: (1) the offset of the middle line coincided with the onset of the outer lines, -20 msec; (2) the onset of the three lines was simultaneous, 0 msec; (3)-(5) the onset of the middle line was 250, 500 and 900 msec after the onset of the comparison lines; (6) the offset of the three lines was simultaneous, 980 msec; (7) the offset of the comparison lines and the onset of the middle line was simultaneous, 1000 msec; and (8) the onset of the test line was 100 msec after the offset of comparison lines, 1100 msec.

As shown in Fig. 4 for two observers, the measured thresholds were fairly independent of stimulus onset asynchrony between the comparison and test lines and were comparable to thresholds observed when the two outer lines were shown continuously (see Fig. 3). Thresholds were a little higher when all of the lines were turned off together (zero offset asynchrony), corresponding to a somewhat worse performance. Performance improved again slightly when the middle line was turned on 100 msec after the comparison lines were turned off. The continued presence of the comparison lines after the offset of the test line seems actually to increase the thresholds.

Experiment 4: repetitive alternating display of the outer comparison lines and the middle test line

Westheimer demonstrated that displaying a comparison line alone followed by the test line alone gave higher thresholds for depth discriminations than showing both lines together for the entire trial. We replicated this result and also found that stereoscopic thresholds are significantly lower when the ceiling lights are on allowing observers to see objects other than the stimulus, in agreement with our earlier report that the threshold of a single line is higher if only the stimulus is visible in a dark room (Kumar & Glaser, 1992). The data reported here were obtained in a dark room so that only the stimulus was visible. Under these conditions, when the middle test line is displayed for 500 msec followed by
the two outer comparison lines for 500 msec so that the test line and outer lines are never visible at the same time, the thresholds for all three observers exceeded 120 sec arc. When some room lights were on, however, thresholds lay between 40 and 55 sec arc, which is comparable to the values, 48 and 41 sec arc, reported by Westheimer for two observers under similar conditions.

All of the information required for relative disparity discrimination was available simultaneously in the display only at the moment when the middle test line disappeared and the comparison lines appeared. Does the presence of the lines before or after that instant influence performance? To answer this question, we tested four different timing modes shown in Fig. 5 as A, B, C and D. In mode A, the two outer comparison lines were visible for T msec after which the test line was visible for another T msec. In mode B, the outer two lines were displayed for 10 msec followed by an empty dark field lasting T msec. Then the test line was shown for 10 msec followed by an empty dark field for T msec. In mode C, the outer two lines were shown for 10 msec and followed by the test line for T msec while in mode D, the outer two lines were first shown for T msec followed by the test line for 10 msec. We also repeated sequences A and B over and over again for a total presentation time of about 1 sec. Four cycles taken from a 1 sec sequence of mode A and mode B cycles are shown in Fig. 5 as modes E and F, respectively. In mode G, the outer two lines were shown for 75 msec, then the middle test line was shown for 75 msec, and finally an empty dark field was displayed for T msec. This sequence was shown six times in a trial, of which three are sketched in Fig. 5.

FIGURE 5. Timing modes tested in Expt 4. The stimulus consisted of three lines: a middle test line and two outer comparison lines, 13 min arc long and 15 min arc apart and aligned horizontally. The small duration shown for either the test line or the comparison lines represents 10 msec. Sequences in modes A and B were repeated over and over again for a total presentation time of about 1 sec. Four cycles of this 1 sec sequence of modes A and B are shown as modes E and F. In mode G, the duration of the lines was 75 msec. Six sequences were presented during a trial, but only three sequences of mode G are sketched in the figure.

Thresholds for three observers are plotted as a function of T in Fig. 6. Data for all three observers and for modes A-G were fitted to the function k * exp(cT) and the curves shown are the ones yielding the minimum χ² values. The values of k and c for the three observers are shown in Table 1. Although there are significant quantitative differences among the observers, their qualitative trends are the same. All of the thresholds increase with time, T, for all modes. In mode B, thresholds increase as the duration of the dark empty field between pulses increases. Thresholds for mode C are higher than those for mode
FIGURE 6. Stereoacuity thresholds as a function of the duration T (msec) for timing modes shown in Fig. 5. The data were fitted by the function k * exp(cT) and the curves shown are the ones yielding the minimum χ² values. Values of k and c for the three observers are shown in Table 1.
B for all of the observers suggesting that the extended presence of the test line tends to weaken the effect of the relative disparity from the briefly displayed original comparison lines more than an empty dark field of the same duration. This is consistent with the significant difference between the results of modes C and D. Increases in thresholds of mode D are significantly lower than either mode B or C, which indicates that it is the continued presence of the test line by itself, the last feature shown, which tends to weaken the effect of the relative disparity. For large T values results of mode D should approach those given by Expt 3
for onset asynchrony of 1000 msec. Observer EO’s thresholds at T = 100 msec for mode D are significantly higher than her thresholds measured in Expt 3. This suggests that EO’s results might demonstrate a non-monotonicity as T increases for mode D. However, observer ATQ’s thresholds for these two cases are quite similar to each other and do not suggest a non-monotonic trend. The unpublished results of one of the authors for these cases are very similar to those of ATQ in that his thresholds at T = 100 msec for mode D are about the same as those measured for onset asynchrony of 1000 msec in Expt 3. Observer EO was not available to verify whether the postulated non-monotonicity actually exists or whether her high thresholds measured at T = 100 msec and perhaps at T = 75 msec were anomalous. In our judgment the rest of the data do not support statistically the existence of non-monotonicity. Regardless of this question, there is little doubt that the thresholds for mode C increase significantly faster than those for mode D. Repeated exposure to the sequences of modes A and B lowers thresholds considerably as can be seen by comparing thresholds obtained for modes A and B with those of modes E and F, respectively. Differences in thresholds between modes E and F are insignificant for T less than 100 msec. Repeated display of the basic cycle seems able to compensate for the threshold increase observed in modes A and B as T increases from 5 to 75 msec. For T > 75 msec, thresholds for E, F, and G begin to rise. Part of the reason for the increase may be that for T greater than about 50-75 msec there is a strong perception of apparent motion which interferes with depth judgments. The stimulus was identical for T = 0 in mode F, and for T = 75 msec in mode E. In mode G the single sequence of mode A at T = 75 msec is repeated six times with an empty dark field interposed between each sequence. 
How long must the empty dark field extend to prevent improvement in performance due to repetition of the sequence? The time interval for which the threshold in mode G is equal to the threshold at T = 75 msec in mode A is about 150 msec for EO and AQ and about 200 msec for FK. Perhaps this interval would be somewhat longer if there were no interference with depth judgment from the
TABLE 1. Values of the parameters k and c in the equation k * exp(cT) fitted to the stereoacuity thresholds for timing modes shown in Fig. 5. The curves obtained are shown in Fig. 6.

            EO                              AQ                              FK
Stimulus    k            c                  k            c                  k            c
A           11.3 ± 1.3   0.018 ± 0.003      11.8 ± 0.9   0.019 ± 0.003      12.3 ± 1.1   0.016 ± 0.002
B            7.2 ± 0.9   0.024 ± 0.003      12.6 ± 0.9   0.024 ± 0.003      12.2 ± 0.4   0.022 ± 0.001
C            9.3 ± 0.8   0.025 ± 0.002       8.6 ± 0.9   0.037 ± 0.005       9.2 ± 0.5   0.034 ± 0.006
D           13.0 ± 1.6   0.0089 ± 0.001     14.0 ± 1.2   0.011 ± 0.002      13.0 ± 1.5   0.0077 ± 0.001
E            3.7 ± 0.4   0.014 ± 0.002       8.0 ± 0.7   0.013 ± 0.002       4.3 ± 0.3   0.010 ± 0.002
F            2.6 ± 0.4   0.021 ± 0.002       4.2 ± 0.3   0.022 ± 0.003       3.2 ± 0.2   0.017 ± 0.002
G            9.4 ± 1.2   0.009 ± 0.001      14.7 ± 1.6   0.009 ± 0.001      14.2 ± 1.9   0.005 ± 0.001
strong perception of apparent motion for intervals > 50-75 msec in mode G.
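The matching interval discussed above can also be estimated from the fitted exponential curves rather than read off the plotted data. The Python sketch below solves k_G * exp(c_G * T) = k_A * exp(c_A * 75) for T, using the Table 1 central values for observer EO in modes A and G. It ignores the quoted parameter uncertainties, and the function names are illustrative assumptions. The curve-based estimate comes out at the same order as the roughly 150 msec quoted in the text, not identical to it.

```python
import math


def threshold(k, c, t_msec):
    """Fitted stereoacuity threshold k * exp(c * T), in sec arc."""
    return k * math.exp(c * t_msec)


def matching_interval(k_ref, c_ref, t_ref, k, c):
    """Interval T at which k * exp(c * T) equals the reference curve at t_ref."""
    target = threshold(k_ref, c_ref, t_ref)
    return math.log(target / k) / c


# Central values of (k, c) for observer EO, taken from Table 1.
# Uncertainties are ignored in this sketch.
EO_MODE_A = (11.3, 0.018)
EO_MODE_G = (9.4, 0.009)

t_match = matching_interval(*EO_MODE_A, 75.0, *EO_MODE_G)
print(f"mode A threshold at T = 75 msec: {threshold(*EO_MODE_A, 75.0):.1f} sec arc")
print(f"mode G dark interval giving the same threshold: {t_match:.0f} msec")
```

Because c is small for mode G, modest errors in k or c shift the matching interval by tens of msec, so agreement with the quoted value to within that margin is all that should be expected.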
Experiment 5: do thresholds for perception of depth using repetitive stereo stimuli depend on the duration of each trial or the number of cycles it contains?

In Expt 4, thresholds decreased when sequences were repeated for 1 sec. In Expt 5, we investigated the rate of this improvement, but instead of fixing the total duration time to be 1 sec, we measured thresholds for different numbers of cycles for modes E and F. For mode E, T was 10, 50 and 75 msec (Fig. 7) for multiple presentations and for mode F, T was 10 and 50 msec (Fig. 8). Most of the reduction in the thresholds occurs during the first four cycles regardless of the value of T and hence the improvement appears to depend on the number of times the sequence is shown rather than on the total presentation time. The stereoacuity threshold was virtually independent of T beyond 4-6 cycles for all three observers, although the total presentation time for 4 cycles was 80 msec for T = 10 msec and 600 msec for T = 75 msec. The data were fitted by the equation

\[ \mathrm{Th}[n] = \mathrm{Th}[1] \, \sqrt{\frac{1 + a}{(1 - a)\,n + 2a}} \qquad (1) \]

where Th[1] is the measured threshold for 1 cycle, n is the number of cycles shown, Th[n] is the threshold for n cycles, and the parameter a is chosen to give the minimum χ² error. The values of a are shown in Fig. 9. The assumptions and derivation leading to this equation are sketched in the Appendix.

FIGURE 7. Stereoacuity thresholds for different numbers of cycles for mode E (see Fig. 5) for two observers. Values of T tested were 10, 50 and 75 msec. The data were fitted by equation (1), and the values of the correlation parameter a are shown in Fig. 9.

FIGURE 8. Stereoacuity thresholds for different numbers of cycles for mode F (see Fig. 5) for three observers. Values of T tested were 10 and 50 msec. The data were fitted by equation (1) and the values of the correlation parameter a are shown in Fig. 9.

Experiment 6: repetitive flashing of the test line while the outer lines are shown continuously
In Expt 2 we showed that stereoacuity thresholds for a test line were significantly higher when the comparison lines were shown continuously than when the three lines appeared and disappeared simultaneously. This difference was measurable even when the exposure time for the test line was 1 sec. We interpreted this to suggest that there are both fast and slow processes for depth perception. If the middle test line were to be flashed briefly many times, would the threshold depend on the duration of the entire sequence, the integrated time the test line is actually displayed, or the number of times it is shown? We investigated this question by displaying the outer two comparison lines continuously during each run as before while the middle test line was flashed repeatedly, 10 msec at a time. The interval between presentations of the middle line was varied in this experiment although it was held constant during each run. The intervals used were 5, 10, 20, 50, 100, 150, 200, 300, 400, 500 and 1000 msec. For the 5, 10, and 20 msec intervals, the observers did not see the middle line flicker, but for larger intervals the flicker was very obvious. The observers did not perceive apparent motion for any of the interstimulus intervals tested. In Expt 4 stereoacuity
thresholds were found to increase as the asynchrony between the test line and the outer comparison lines increased beyond 50-75 msec, and this increase was accompanied by the perception of apparent motion. In this experiment stereo thresholds do not increase even for interstimulus intervals as large as 500 msec, as shown in Fig. 10. This is consistent with our previous suggestion that the perception of apparent motion might interfere with depth judgment (Kumar & Glaser, 1993b). Improvement in performance with multiple pulses appears to reach an asymptote for intervals between 100 and 200 msec. The data were fitted to equation (1) and the resulting values of the parameter a are shown in Fig. 9. Although observers cannot see flicker in a rapidly flashing test line, their thresholds are surprisingly smaller than for a steadily displayed test line for the same total presentation time. Total presentation time is not a good predictor of performance. The total time the stimulus was shown when the test line was flashed n times is
FIGURE 10. The outer two comparison lines were displayed continuously. The middle test line was flashed, 10 msec at a time, repeatedly for a fixed number of times in a trial. Stereoacuity thresholds for three observers are shown as a function of the number of times the test line was flashed and as a function of the interval between the pulses of the test line.
FIGURE 9. Values of the correlation coefficient a determined by fitting the data shown in Figs 7 (stimulus E; Fig. 5), 8 (stimulus F; Fig. 5), and 10 (see Fig. 10 legend) by equation (1). T (msec) specifies the appropriate time for the different modes: the duration of the test line or that of the outer two lines for stimulus E, and the interval between the test line shown for 10 msec and the outer two lines also shown for 10 msec for stimulus F. In the stimulus corresponding to Fig. 10, T is the interval between the pulses of the test line, flashed 10 msec at a time while the outer two lines were displayed continuously. The lines shown are the minimum χ² fits to the data shown in Fig. 10 by the equation $a = k_0 + k_1 \log(T)$. The values of $(k_0, k_1)$ determined for the three observers are: (1.41 ± 0.06, -0.82 ± 0.03) for EO, (1.23 ± 0.07, -0.71 ± 0.03) for AQ and (0.70 ± 0.11, -0.74 ± 0.06) for FK.
10n + T(n - 1) msec, where T is the interval in msec between flashes of the middle line. For n = 5 and T = 20 msec the total stimulus presentation time was 130 msec. The data for T = 5, 10 and 20 msec have been replotted as a function of the total time that the stimulus was shown (Fig. 11). Data from Fig. 3 (open symbols) have also been replotted in Fig. 11 for the case in which the outer two comparison lines were shown continuously and the test line steadily during the entire presentation time. The improvement in performance with multiple pulses would appear even greater if the comparison were made as a function of the time, 10n msec for the target line flashed n times, that the target line was actually shown during a trial. Although it is possible that the stereo mechanisms respond best when comparing transient to transient and sustained to sustained elements of the stimulus, multiple presentations of the transient elements result in significant and rapid improvement in stereoacuity when comparing transient to sustained elements of the stimulus. However, the best stereo thresholds for a given presentation time were obtained when the comparison and test features had synchronous onsets and offsets (Fig. 3; solid symbols).
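The arithmetic behind the two time measures discussed above (total stimulus time versus time the test line is actually visible) can be sketched in a few lines of Python. The function names are hypothetical; the 10 msec pulse duration and the n = 5, T = 20 msec example come from the text.

```python
def total_presentation_time(n_pulses, interval_msec, pulse_msec=10):
    """Total time from first onset to last offset of the flashed test line.

    n pulses of pulse_msec each are separated by (n - 1) intervals.
    """
    return pulse_msec * n_pulses + interval_msec * (n_pulses - 1)


def visible_time(n_pulses, pulse_msec=10):
    """Time the test line is actually displayed: pulse_msec per pulse."""
    return pulse_msec * n_pulses


# Worked example from the text: n = 5 pulses with T = 20 msec between them.
example_total = total_presentation_time(5, 20)   # 130 msec
example_visible = visible_time(5)                # 50 msec
```

In this example the test line is visible for well under half of the total presentation time, which is why the advantage of pulsing looks larger still when thresholds are compared on visible time.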
FIGURE 11. The data for 5, 10 and 20 msec intervals between pulses of the test line shown in Fig. 10, and the data shown in Fig. 3 when the outer two lines are displayed continuously, replotted as a function of the total time the stimulus was displayed in a trial. The total time is the presentation time for the data from Fig. 3, and can be expressed as 10n + T(n - 1), where n is the number of pulses of the test line and T is the duration between these pulses in msec, for the data from Fig. 10. The associated error bars have not been shown but are given in Fig. 3 and Fig. 10.
DISCUSSION
Even an observer who had participated in psychophysical experiments for a long time, but had no prior experience in making depth judgments for very briefly presented stimuli, showed very poor initial performance (Fig. 1; observer EO). For inexperienced observers, initial stereoacuity thresholds for stimulus presentation times of 500 msec were at least three times larger than the eventual asymptotic threshold after extensive practice. After extensive practice, the stereoacuity thresholds varied little with presentation times for values between 5 and 100 msec and gradually decreased by less than a factor of two as presentation times were increased to 1 sec. We do not know the reason for this slight improvement, but perhaps it will be explained eventually in terms of the consequences of involuntary binocular eye movements. Saccadic motion of the stimulus on the retina might have an effect similar to the multiple presentations in Expt 4. The large improvement in stereoacuity thresholds after practice is not understood. The performance of even practiced observers was significantly better when the test line and comparison lines were displayed briefly and simultaneously than when the comparison lines were shown continuously (Fig. 3). Although we did not
monitor eye positions, we doubt that the difference in performance is due to inadequate fixation. The difference in performance between the two conditions persists even for viewing times of 1 sec, and even when the overhead room lights are on so that the visually rich environment of the room provides more than adequate stimulus for maintaining convergence for the well-practiced and experienced observers. The relative disparity signal provided by the comparison lines when they are shown continuously is apparently not as effective for depth judgment as when these lines are turned on synchronously with the test line. Are the transient aspects of the features especially effective in determining depth? We examined this question in Expt 3. The results (Fig. 4) were inconclusive in light of the results of Expt 4 (modes: A, B, C and D), indicating that the extended presence of the outer two comparison lines in Expt 3 might actually disrupt the judgment of depth of the test line sufficiently to mask any effect of simultaneous onset or offset of these lines. The thresholds obtained in Expt 3 were less than those obtained in Expt 2 but larger than the asymptotic thresholds obtained in Expt 1 (compare presentation time of 20 msec in Fig. 3 and onset asynchrony of zero in Fig. 4). However, Expt 6 suggests that the transient part of the signal is indeed used more effectively in relative disparity processing.
Experiment 6 showed that performance was better when the test line was flashed repeatedly than when the test line was presented continuously for the same total length of time, even when the interval between flashes was too short for the observers to see any flicker (Fig. 11). Stereoacuity improves with each additional flash, and the psychophysical data are very well fitted by an equation with a single parameter. This equation is based on the hypothesis that the stereo signal is strengthened as if it depended on correlations among the responses to different pulses of the stimulus. The correlation among these responses changes with the duration between the pulses (Fig. 9). The advantage of multiple pulses of a briefly presented test line over a steadily presented test line is evident for durations between pulses as small as 5 msec (Fig. 11). There is an apparent saturation effect for the correlation factor a for longer durations between pulses, about 150-200 msec for EO, and about 400 msec for AQ (Fig. 9). Although the successive responses are negatively correlated for these long intervals and perhaps beyond, the observers do not gain any additional advantage from the multiple responses. For extremely long durations between pulses, it is reasonable to expect that there would be no improvement in performance over that measured for a single pulse or, equivalently, that a should approach +1 for very large T values. The postulated increase in a beyond the T values tested is probably constrained by the decay of some form of response memory. For a = 0 the responses are independent of each other, and this occurs at about T = 50 msec for EO and AQ and at about T = 10 msec for FK, suggesting a reasonably fast underlying mechanism. Beyond these times the successive responses show an increasing negative correlation, perhaps indicating an increasing confidence in the response of a single pulse.
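The intervals at which a = 0 can be checked against the fitted lines reported in the Fig. 9 legend, a = k0 + k1 log(T). The sketch below is illustrative only: the function names are hypothetical, and the use of a base-10 logarithm is an assumption (the legend does not state the base), adopted because it reproduces the quoted values of roughly 50 msec and 10 msec.

```python
import math

# (k0, k1) pairs quoted in the Fig. 9 legend for a = k0 + k1 * log(T), T in msec.
FIG9_FITS = {"EO": (1.41, -0.82), "AQ": (1.23, -0.71), "FK": (0.70, -0.74)}


def correlation_a(t_msec, k0, k1):
    """Fitted correlation parameter a at interval T.

    ASSUMPTION: the logarithm is base 10; the source does not state the base.
    """
    return k0 + k1 * math.log10(t_msec)


def zero_crossing_msec(k0, k1):
    """Interval T at which the fitted line gives a = 0 (independent responses)."""
    return 10.0 ** (-k0 / k1)


crossings = {obs: zero_crossing_msec(k0, k1) for obs, (k0, k1) in FIG9_FITS.items()}
```

Under the base-10 assumption the crossings fall near 50 msec for EO and AQ and just under 10 msec for FK, in line with the values stated in the text; a natural logarithm would instead give crossings of only a few msec.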
The form of the covariance matrix suggested by the data may arise from a first order auto-regressive process of the form $X_n = aX_{n-1} + e_n$, where $X_n$ is the nth element of the stochastic internal variable used to make the perceptual judgment and $e_n$ is a random variable with appropriate mean and variance and uncorrelated with $X_{n-1}$. The covariance matrix for this process is identical to the one we used for obtaining equation (1). This process is a discrete first order linear system (Oppenheim, Willsky & Young, 1983) and its impulse response function is known to be $a^n$. Since a was measured for different intervals between pulses (see Fig. 9), the impulse response may be transformed to a function of time. On doing so we were unable to achieve an approximation to a unique impulse response function independent of a; different values of a give very different functions of time. This suggests that it might not be possible to implement this model as a unique impulse response function of time.

In summary, the results of this study are:
(1) Stereoacuity thresholds for experienced observers are relatively constant for presentation times less than 50-100 msec and are less than twice those measured for presentation times of 1 sec.
(2) Stereoacuity thresholds were about 2-3 times larger when the comparison features were shown continuously during a run than when the comparison and test features were turned on and off synchronously for presentation times of up to 1 sec.
(3) When the comparison and test lines are presented alternately in sequence, thresholds increase exponentially with the time interval between presentations of the comparison and the test features, or with the duration of the feature shown last.
(4) When the test and comparison features are not shown simultaneously but sequentially, the total duration of the stimulus is not a good predictor of stereoacuity performance. Stereoacuity thresholds are better predicted by the number of times a relative disparity configuration is presented than by the total exposure time of the stimulus. The underlying mechanism is modeled well by a first order auto-regression, AR(1), process.

REFERENCES
Fukunaga, K. (1990). Introduction to statistical pattern recognition. New York: Academic Press.
Kumar, T. & Glaser, D. A. (1992). Depth discrimination of a line is improved by adding other nearby lines. Vision Research, 32, 1667-1676.
Kumar, T. & Glaser, D. A. (1993a). Initial performance, learning and observer variability for hyperacuity tasks. Vision Research, 33, 2287-2300.
Kumar, T. & Glaser, D. A. (1993b). Temporal aspects of depth contrast. Vision Research, 33, 947-958.
Ogle, K. N. & Weil, M. P. (1958). Stereoscopic vision and the duration of the stimulus. Archives of Ophthalmology, 59, 4-17.
Oppenheim, A. V., Willsky, A. S. & Young, I. T. (1983). Signals and Systems. Englewood Cliffs, N.J.: Prentice-Hall.
Westheimer, G. (1979). Cooperative neural processes involved in stereoscopic acuity. Experimental Brain Research, 26, 585-597.
Westheimer, G. & Pettet, M. W. (1990). Contrast and duration of exposure differentially affect vernier and stereoscopic acuity. Proceedings of the Royal Society of London B, 241, 42-46.
Acknowledgement: This work was supported in part by the U.S. Office of Naval Research, contract No. N00014-85-K-0692, and grant No. N00014-90-J-1251.
APPENDIX
The stimuli are either class 1, the test line is closer to the observer than the comparison lines, or class 2, the test line is farther than the comparison lines. Following the tradition of applying signal detection theory to psychophysics, let x be some internal variable used to classify the response into either of these two classes. The variable x is considered to be stochastic and is drawn from either the distribution corresponding to class 1 or that corresponding to class 2. The Bayes decision rule for minimum error (Fukunaga, 1990) may be used to determine whether x belongs to class 1 or class 2. This is done by maximizing the likelihood ratio, or equivalently, minimizing the negative of the logarithm of the likelihood ratio. The likelihood ratio is the ratio of the conditional density functions of the two classes: p(x | class 1)/p(x | class 2) when the a priori probabilities of showing a stimulus from class 1 or class 2 are equal. We assume that when the stimulus is pulsed multiple times this gives rise to an internal vector X that is used to determine whether the stimulus shown was from class 1 or class 2. Each component of the vector results from a corresponding pulse. We further assume that for X, the conditional density functions of the
two classes are N(0, C) for class 1 and N(M, C) for class 2, where N(A, C) is a normal distribution with the expectation values $E[X] = A$ and $E[(X - A)(X - A)^T] = C$, where $(\cdot)^T$ is the transpose of the enclosed vector. Explicit expressions for p(X | class 1) and p(X | class 2) are

$$p(X \mid \text{class 1}) = \frac{1}{(2\pi)^{n/2}\,|C|^{1/2}} \exp\left[-\tfrac{1}{2} X^T C^{-1} X\right]$$

and

$$p(X \mid \text{class 2}) = \frac{1}{(2\pi)^{n/2}\,|C|^{1/2}} \exp\left[-\tfrac{1}{2}(X - M)^T C^{-1}(X - M)\right].$$

The negative of the logarithm of the likelihood ratio is then

$$h(X) = M^T C^{-1}\left(X - \frac{M}{2}\right)$$

and the decision rule is that if h(X) is negative, X resulted from a stimulus of class 1, and if h(X) is positive then the stimulus was from class 2. The error, $e_1$, in classifying stimuli from class 1 can be explicitly calculated by integrating the density function of h(X):

$$e_1 = \int_0^{\infty} p(h(X) \mid \text{class 1})\,dh.$$

This can be done since h(X) is a one dimensional normal random variable, because X is a normally distributed random vector, and p(h(X) | class 1) can be explicitly solved by calculating the mean and variance of h(X):

$$E[h(X) \mid \text{class 1}] = M^T C^{-1}\left(E[X \mid \text{class 1}] - \frac{M}{2}\right) = -\tfrac{1}{2} M^T C^{-1} M = -m$$

and

$$E[(h(X) + m)^2 \mid \text{class 1}] = M^T C^{-1} M = 2m.$$

Then

$$e_1 = \int_{m/\sqrt{2m}}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-s^2/2}\,ds = 1 - F\left(\frac{m}{\sqrt{2m}}\right),$$

where F(x) is the normal error function,

$$F(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-s^2/2}\,ds$$

and at threshold $e_1 = 0.25$ or μ ≈ 0.91, where the value of m at threshold is μ. For the familiar one dimensional case M = m and C = σ², which yields $M^T C^{-1} M = (m^2/\sigma^2) = 1.82$ or μ = 1.35σ, or in terms of the perhaps more familiar notation of detectability: d' = μ/σ = 1.35 for 75% correct classification. Assuming that X is a stationary process so that its moments are translationally invariant, then for the n-dimensional case all n elements of M are equal to one another, let us say m, and all the n diagonal elements of C are also equal to one another, let us say σ². If all the elements of C were specified in units of σ² then

$$M^T C^{-1} M = \frac{m^2}{\sigma^2} \sum_{i=1}^{n}\sum_{j=1}^{n} (C^{-1})_{ij}$$

where $(\cdot)_{ij}$ is the ijth element of the enclosed matrix. At threshold this gives

$$\mu = \frac{1.35\,\sigma}{\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n} (C^{-1})_{ij}}}.$$

In this formalism this is equivalent to stating that the threshold when the stimulus is shown n times is

$$\mathrm{TR}[n] = \frac{1}{\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{n} (C^{-1})_{ij}}}$$

times the threshold when the stimulus is shown only once. We refer to this expression as the threshold ratio (TR[n]) in this paper, where n specifies the dimension of C. In arriving at this result, the assumptions that have been made are (1) that corresponding to the stimulus shown there is an internal variable X that is used to classify the stimulus into one of two classes; (2) X is normally distributed; (3) X is stationary; and (4) Bayes rule for minimum error is used to classify X. To proceed further we have to specify the covariance matrix, C. If we assume covariance stationarity, that is, $C_{ij}$, the ijth element of C, is the same for a given value of |i - j| for any value of i and j, only (n - 1) independent variables are required to specify the n × n dimensional covariance matrix. C is given explicitly for n = 1-5 below:

$$\begin{pmatrix} 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & a \\ a & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & a & b \\ a & 1 & a \\ b & a & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & a & b & c \\ a & 1 & a & b \\ b & a & 1 & a \\ c & b & a & 1 \end{pmatrix},\quad
\begin{pmatrix} 1 & a & b & c & d \\ a & 1 & a & b & c \\ b & a & 1 & a & b \\ c & b & a & 1 & a \\ d & c & b & a & 1 \end{pmatrix}$$

where all the elements have been specified in units of σ². It is also reasonable to expect that for fixed spatial and temporal parameters of a single sequence of the stimulus the correlations between different elements of X do not change as the number of times the stimulus is shown increases, that is, the variables a-d in the above matrices have the same value as the dimension of C increases. The threshold ratios can be computed for these five matrices and are:

$$\mathrm{TR}[1] = 1$$

$$\mathrm{TR}[2] = \sqrt{\frac{1 + a}{2}}$$

$$\mathrm{TR}[3] = \sqrt{\frac{1 - 2a^2 + b}{3 - 4a + b}}$$

$$\mathrm{TR}[4] = \sqrt{\frac{1 + a - (a + b)^2 + (1 + a)c}{2(2 - a - 2b) + 2c}}$$

and

$$\mathrm{TR}[5] = \sqrt{\frac{1 - 3a^2 + b + 4a^2 b - 2b^2 - 2b^3 - 2ac + 4abc - c^2 + (1 - 2a^2 + b)d}{5 - 8a - a^2 - b + 12ab - 8b^2 - 4c + 2ac + 4bc - c^2 + (3 - 4a + b)d}}$$

Variables a-d may be computed for the data shown in Figs 7, 8 and 10 for the different observers. TR[2] gives a value for a using n = 2 data, and then using this value of a, TR[3] can be manipulated to provide a value for b. With these values of a and b, manipulation of TR[4] provides an estimate of c, which can then be used to compute an estimate of d from TR[5]. The variables b-d were plotted vs the corresponding estimate of a, and the plots clearly indicated a functional relation. Fitting to a polynomial we determined that $b = a^2$, $c = a^3$ and $d = a^4$. These results suggest that the general form of the covariance matrix is such that $C_{ij} = a^{|i-j|}$, where $a^0 = 1$. The explicit expression for the inverse of this matrix is known and is

$$C^{-1} = \frac{1}{1 - a^2}
\begin{pmatrix}
1 & -a & 0 & \cdots & 0 \\
-a & 1 + a^2 & -a & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & -a & 1 + a^2 & -a \\
0 & \cdots & 0 & -a & 1
\end{pmatrix}$$

from which the expression for TR[n] may be solved to give

$$\mathrm{TR}[n] = \sqrt{\frac{1 + a}{(1 - a)\,n + 2a}}.$$

This equation was fitted to the data to obtain the best estimate of a in the sense of minimizing χ². The estimates of a are given in Fig. 9. The parameter a specifies how two successive observations co-vary, $a = E[(X_n - m)(X_{n+1} - m)]$ for class 2 and $E[X_n X_{n+1}]$ for class 1, and as a changes the joint probability distribution of $X_n$ and $X_{n+1}$ obviously changes. Examples of the change in the distributions for class 1 and class 2 for a = 0.3, 0, and -0.3 are sketched in Fig. A1. The distribution for class 1 is centered at the origin and the distribution for class 2 is centered at $X_n = m$ and $X_{n+1} = m$. For a = 0, $X_n$ and $X_{n+1}$ are independent and the distributions are circularly symmetric. The variances of the Gaussian probability distributions for $X_n$ and $X_{n+1}$ are equal. This is the situation for "generic" probability summation, which leads to the well known case of

$$\mathrm{TR}[n] = \frac{1}{\sqrt{n}}.$$

For non-zero values of a the joint probability distributions acquire elliptical cross-sections. Although $X_n$ and $X_{n+1}$ are not independent of each other, $(X_n + X_{n+1})$ and $(X_n - X_{n+1})$ are, since $E[(X_n + X_{n+1})(X_n - X_{n+1})] = 0$. The variance of the Normal probability distribution for obtaining $(X_n + X_{n+1})$ is $2(1 + a)\sigma^2$ and that for obtaining $(X_n - X_{n+1})$ is $2(1 - a)\sigma^2$. For positive a values the major axes of the ellipses are aligned along the 45 deg axis, while for negative a values the minor axes of the ellipses are aligned along that axis. For a given m and σ, separating class 1 from class 2 becomes easier as a decreases in value. As a approaches 1, TR[n] approaches 1, indicating that little advantage is gained from multiple observations, and as a approaches -1, TR[n] approaches 0 for any value of n greater than 1.

FIGURE A1. Panels: a = +0.3, a = 0, a = -0.3.
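The closed form for TR[n] and the threshold criterion d' = 1.35 derived in this Appendix can be checked numerically. The Python sketch below is an illustration written under stated assumptions, not the authors' procedure: the function names are hypothetical, and a small Gaussian-elimination solver is used in place of the explicit tridiagonal inverse, so that the covariance $C_{ij} = a^{|i-j|}$ is the only ingredient taken from the text.

```python
import math
from statistics import NormalDist


def ar1_covariance(n, a):
    """Covariance matrix with C[i][j] = a**|i - j| (in units of sigma^2)."""
    return [[a ** abs(i - j) for j in range(n)] for i in range(n)]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def tr_numeric(n, a):
    """TR[n] = 1 / sqrt(sum_ij (C^-1)_ij), computed from C directly."""
    x = solve(ar1_covariance(n, a), [1.0] * n)  # x = C^-1 * (1, ..., 1)
    return 1.0 / math.sqrt(sum(x))              # sum(x) = sum of all (C^-1)_ij


def tr_closed_form(n, a):
    """Closed form: sqrt((1 + a) / ((1 - a) * n + 2 * a))."""
    return math.sqrt((1.0 + a) / ((1.0 - a) * n + 2.0 * a))


# Threshold criterion: e1 = 0.25 means sqrt(m / 2) equals the 75th percentile
# z of the standard normal, so M^T C^-1 M = 2m = 4 z^2 and d' = 2 z.
z75 = NormalDist().inv_cdf(0.75)
d_prime = 2.0 * z75          # approximately 1.35
quadratic_form = 4.0 * z75 ** 2  # approximately 1.82
```

Comparing `tr_numeric` with `tr_closed_form` over a range of a and n is a direct test that the sum of the elements of the tridiagonal inverse reduces to ((1 - a)n + 2a)/(1 + a).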