Vision Research 39 (1999) 3855 – 3872 www.elsevier.com/locate/visres
Pattern detection in the presence of maskers that differ in spatial phase and temporal offset: threshold measurements and a model John M. Foley *, Chien-Chung Chen Department of Psychology, University of California, Santa Barbara, CA 93106-9660, USA Received 13 May 1998; received in revised form 2 February 1999
Abstract Four experiments are described in which brief Gabor patterns are detected in the presence of full-field gratings or Gabor patterns that are superimposed in space, but vary in spatial phase and temporal offset (SOA). E1: Threshold versus masker contrast (TvC) functions were determined for relative phases of 0, 90, 180 and 270° at SOA=0. For 0° relative phase, TvC functions decrease (facilitation) and then increase (masking) as contrast increases. For 90°, there is little or no facilitation and thresholds increase with masker contrast. For 180°, the form of the TvC function varies with observer and conditions. E2: Like E1, except that maskers are Gabor patterns. TvC functions are similar in form to those for full-field maskers, but there is less masking. E3: Forward masking. TvC functions were determined for relative phases of 0, 90, and 180° at SOA= −33 ms. The forms of the TvC functions for 0 and 180° are reversed relative to those at SOA =0. E4: TvP (threshold versus phase) functions were determined for SOA’s of −100, −67, −33, 0 and 33 ms at a constant masker contrast of 0.063. Masking occurs at all relative phases. For simultaneous and backward masking, the threshold is minimum for a relative phase of 0 and maximum at 180°. For forward masking, the form of the function is inverted. A model of pattern masking and facilitation (Foley, J. M. (1994a) Journal of the Optical Society of America A, 11, 1710–1719) is extended to account for masker phase and SOA effects. The model assumes four mechanisms tuned to phases 90° apart, and divisively inhibited by stimuli of all phases. Performance depends on the detection strategy of the observer. © 1999 Elsevier Science Ltd. All rights reserved. Keywords: Pattern; Detection; Masking; Phase; SOA
1. Introduction Most studies of pattern vision have employed absolute threshold paradigms. Much has been learned from them, but by their nature they do not provide information about supra-threshold vision. Pattern masking paradigms (e.g. Campbell & Kulikowski, 1966; Nachmias & Sansbury, 1974; Stromeyer & Klein, 1974; Legge & Foley, 1980; Wilson, McFarlane & Phillips, 1983; Ross & Speed, 1991; Foley, 1994a,b) have the potential of providing this. To do so, they must produce measurements that are reliable and sufficiently precise to provide the basis for developing and testing models of masking effects. Although there have been some inconsistencies in the results of masking studies, this paradigm gives promise of making an important
* Corresponding author. Fax: +1-805-893-4303. E-mail address:
[email protected] (J.M. Foley)
contribution to the understanding of pattern vision. On the basis of early measurements, Legge and Foley (1980) proposed a model of simultaneous pattern masking. At the core of the model were mechanisms that summed stimulation linearly over a receptive field, then transformed that sum with a nonlinear S-shaped transform. Recent experiments, particularly experiments that combine two maskers, show that model to be untenable (Foley, 1994a,b). Variations of the Legge and Foley model which substitute other transforms from receptive field excitation to mechanism response are also excluded. A new model was proposed by Foley (1994a). It is based on the biological discovery that cells in the visual cortex of cats and monkeys receive, in addition to an input from a receptive field, a second input, which is very broadly tuned to stimulus dimensions and acts on the response in an approximately divisive way (e.g. Albrecht & Hamilton, 1982; Bonds, 1989; Albrecht & Geisler, 1991; Heeger, 1991). This model has been
0042-6989/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved. PII: S0042-6989(99)00104-2
J.M. Foley, C.-C. Chen / Vision Research 39 (1999) 3855–3872
shown to account for the effect of masker orientation and the effect of two maskers at different orientations on the threshold of the target pattern (Foley, 1994a). It has also been shown to account for the effects of temporal frequency (Boynton, 1994) and of pattern color (Chen, Foley & Brainard, 1997) on pattern masking. Here we examine the effect of masker spatial phase and temporal offset on the target contrast threshold. The results are inconsistent with the Legge and Foley model, but we show that the Foley (1994a) model can be extended to account for the effects of these variables. The principal previous studies of the effect of relative phase in simultaneous pattern masking are the following: 1. Kulikowski (1976) studied contrast discrimination using sinewave gratings as target and masker. The target grating was either in phase (contrast increment) or 180° out of phase (contrast decrement) with the masker. He found that threshold decreased and then increased as masker contrast increased in the in-phase case (a dipper-shaped threshold versus masker contrast (TvC) function) and he found a monotonic increasing function in the out of phase case. The two functions came together at high contrast. 2. Lawton and Tyler (1994) studied the effect of relative phase (0 and 90°) on the detection of a target grating that was superimposed on a longer duration background grating. The test stimulus was centered in time relative to the masker. The spatial frequency (1 or 7 c/deg) and the time course (gradual or abrupt) of the target were also varied. The purpose of this latter manipulation was to favor detection by parvo- or magnocellular pathways. They found no significant differences between the 0 and 90° relative phase conditions for either temporal regime at either spatial frequency. Very little facilitation was found; none, in some conditions. TvC functions at 7 c/deg increased more rapidly at high contrast than those for 1 c/deg. 3. 
Bowen and Wilson (1994) measured thresholds for 30 ms duration D6 patterns on 420 ms sinusoidal grating maskers. Masker contrast was 0.25. Targets were either in phase or 180° out of phase with maskers. Stimulus onset asynchrony (SOA) was varied. At SOA = 0, the out-of-phase masker produced a larger threshold elevation. Both thresholds decreased with increasing SOA, but the in-phase threshold decreased faster so that the functions cross at about 50 ms. This experiment shows that when the masker stays on, the SOA interacts with relative phase in determining the threshold. 4. Bowen (1995) measured TvC functions for D6 patterns masked by sinewave gratings. SOA was 0, but the masker stayed on after the 30 ms target went
off. For target and masker in phase he obtained dipper-shaped TvC functions. For target and masker 180° out of phase, thresholds increased monotonically with mask contrast. 5. Yang and Makous (1995) reported results from a somewhat similar paradigm in which a background grating was left on continuously and the target grating had a gradual temporal onset and offset (0.5 cycle at 0.5 Hz). They measured TvC functions at 0, 90 and 180° relative phases. They present data for one observer showing that thresholds at 90° show no facilitation and are substantially higher than thresholds at 0° (a result different from that obtained by Lawton and Tyler). At 180°, thresholds rise rapidly at low masker contrast, then drop suddenly to approximately the 0° function, then rise with it. There are two principal differences in the results of these studies. First, when the masker phase is 180° relative to the target, the TvC function sometimes increases monotonically (Kulikowski, Bowen) and sometimes increases, decreases and increases again as masker contrast increases (Yang and Makous). Second, when the masker is at 90° relative to the target, it sometimes produces no effect (Lawton and Tyler) and sometimes produces more masking than an in-phase masker (Yang and Makous). Although there are some differences in method among these studies, they do not suggest any obvious explanation for the differences in results. Since most of these studies employ paradigms in which conditions for forward, simultaneous and backward masking occur within each trial, as well as pattern adaptation, the analysis of them may prove to be complex. There have been two principal studies of phase effects in forward masking: 1. The most extensive study is one by Georgeson (1988). He examined the effects of masker spatial phase and SOA in forward masking. SOA (stimulus onset asynchrony) is the onset time of the masker minus the onset time of the target, so it is negative for forward masking. 
Georgeson’s stimuli were vertical sinusoidal gratings of 1 c/deg. He measured threshold versus masker contrast (TvC) functions at SOA’s of 0 to −150 ms for a target in phase with the masker using 20 ms pulses (his Figs. 2 and 3). He found that facilitation decreased as the masker occurred earlier in time and that for temporal offsets > 50 ms facilitation did not occur. Masking did occur at longer temporal offsets, decreasing as offset increased. There was very little masking at 100 ms and none at 150 ms. He also measured threshold versus relative phase (TvP) functions for a constant masker contrast of 0.16 at SOA’s of −20 to −140 ms (Figs. 4 and 5). He found that for SOA’s of −20 and −50 ms, masking was maximum at
relative phases of 0 and 180° and minimum at 90 and 270°. The amplitude of this modulation decreased as SOA increased and there was essentially no modulation for SOA’s > 70 ms. Masking occurred, however, at SOA’s as large as −140 ms, although its magnitude decreased as temporal offset increased. Georgeson also compared TvC functions for maskers at 0 and 90° relative phase for SOA’s of −20 and −80 ms (his Fig. 10a). He found greater masking with the 0° maskers than the 90° maskers, consistent with his TvP functions. Georgeson also measured TvP functions at masker contrasts of 0.01 or 0.005. There was maximum facilitation at 0° and slight masking at 180° at −20 ms SOA; at −50 ms this relation reversed. Georgeson also measured TvP functions for forward dichoptic masking (SOA = −20 ms) and TvC functions for both dichoptic and monocular forward masking for 0 and 90° relative phase at SOA’s of −20 and −80 ms. There were no phase effects in dichoptic masking and the slope of the TvC function was less at the longer SOA. For monocular forward masking at 0° relative phase and 20 ms durations, a dipper-shaped function was obtained. At 90° relative phase and SOA = −20 ms, Georgeson again found a dipper-shaped TvC function. This extensive study shows that the combined effects of spatial phase and temporal offset are complex. 2. Bowen (1997) measured TvC functions for forward masking by a 500 ms grating. The target was a D6 that came on immediately after the masker offset. Here the TvC functions were similar to those in Bowen (1995) except the 180° out of phase condition produced the dipper shaped TvC function and the in-phase condition produced a monotonic increasing TvC function. These two studies are in agreement in showing that in forward masking the forms of the TvC functions for 0 phase and 180° phase are reversed, although Georgeson showed that this reversal does not take place at very short SOA’s. 
Other studies have focussed on the effect of SOA on masking with target and masker either in phase or in random phase relations. Gorea (1987) measured threshold as a function of SOA at a masker contrast of 0.2 and random phase. Georgeson and Georgeson (1987) measured thresholds with masker and target in phase as a function of SOA at two masker contrasts, masker threshold and 1.5 log units above masker threshold. These studies show that forward masking extends to SOA’s of more than 100 ms at low spatial frequencies and longer at high spatial frequencies. Backward masking extends over a shorter time period. In several cases these studies show that masking is less at an SOA of 0 than at adjacent SOA’s. With their masker at threshold, Georgeson and Georgeson found facilitation only for SOA’s near 0. Georgeson and
Georgeson also measured TvC functions for forward and backward masking with SOA’s of − 50 and +50 ms. They found masking, but not facilitation at these SOA’s. In summary, the literature contains inconsistent results as to the form of the TvC function in simultaneous masking for relative phases of 90 and 180° and whether there is any effect of a change in relative phase from 0 to 90°. TvP functions have been measured only for forward masking. Many of the studies use relatively long duration maskers. Our first goal in the present study is to extend our knowledge of these effects. We use briefly pulsed patterns to examine the TvC function for relative phases of 0, 90, 180 and 270° in both simultaneous and forward masking. We also determine TvP functions for SOA’s from − 100 to + 33 ms. Our second goal is to determine if a model developed for simultaneous masking (Foley, 1994a) can be extended and developed to account for relative phase and temporal offset effects. Our study is closest to the study of Georgeson (1988). All stimuli are brief pulses. We add to his conditions the measurement of spatial phase effects in simultaneous and backward masking as well as forward masking. We measure complete TvC functions at a few relative phases as well as TvP functions at a single contrast within the masking range. Our study consists of four experiments: 1. TvC functions for simultaneous masking of Gabor targets by full-field grating maskers at different relative phases, SOA= 0, spatial frequency= 1 and 2 c/deg, and duration = 33 ms. The effect of relative phase on TvC functions for short pulses has not previously been reported. 2. TvC functions for simultaneous masking of Gabor targets on Gabor maskers at different relative phases, SOA= 0, spatial frequency= 1 c/deg. Here we want to determine what difference, if any, it makes if the masker is the same size as the target. Previous studies of phase effects have used full-field maskers. 3. 
TvC functions for forward masking of Gabor targets on full-field maskers at different relative phases, SOA = −33 ms, spatial frequency = 1 c/deg. Here the question is what happens to the form of TvC functions when the masker comes before the target. Georgeson’s work indicates that the form changes considerably. 4. TvP functions for forward, simultaneous and backward masking of Gabor targets on full-field grating maskers, masker contrast = 0.063, SOA = −100 to 33 ms, spatial frequency = 1 c/deg. This experiment extends Georgeson’s work on TvP functions to simultaneous and backward masking.
2. Method
2.1. Apparatus The stimuli were generated using a computer graphics system that consisted of an AST 386/20 computer, a Truevision ATVISTA graphics board with 2 MB video memory, a contrast mixer and attenuator circuit, and two video monitors (Sony, model CPD-1304). Truevision Stage graphics software was used for image generation and control. The masker was generated on one monitor and the target on the other, and they were combined by a beam splitter. Images of the fixation field, the masker field and the target field were computed and stored on the graphics board. Each of these images was 512 ×400 pixels and its intensity was specified by an 8 bit number which was an index to a lookup table. This made it possible to change contrast quickly by changing the lookup table. The frame rate was 60 Hz. The methods of contrast control described by Watson, Nielson, Poirson, Fitzhugh, Bilson et al. (1986) were adapted to our system and to the masking paradigm. Target and masker waveforms were stored in separate segments of graphics memory. Their contrasts were controlled independently by lookup tables and could be further attenuated by an analog circuit to produce low contrasts without loss of waveform definition. The lookup tables had the dual role of controlling contrast and correcting for the nonlinear relation between voltage and screen intensity.
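The dual role of the lookup tables described above can be illustrated with a minimal sketch. It assumes a simple power-law (gamma) relation between normalized voltage and screen luminance; the gamma value of 2.2, the peak luminance of twice the mean, and the function names `build_lut` and `michelson` are illustrative assumptions and are not taken from the apparatus description. Only the 8-bit table size and the 30 cd/m² mean luminance come from the text.

```python
def build_lut(contrast, mean_lum=30.0, gamma=2.2, size=256):
    """Return (voltages, luminances) for an 8-bit lookup table.

    Index i encodes a waveform value w in [-1, 1]. The table sets the
    stimulus contrast and inverts an ASSUMED power-law monitor
    nonlinearity: luminance = max_lum * voltage ** gamma.
    """
    max_lum = 2.0 * mean_lum  # assumed peak luminance, so contrast 1 is reachable
    voltages, luminances = [], []
    for i in range(size):
        w = 2.0 * i / (size - 1) - 1.0             # waveform value in [-1, 1]
        lum = mean_lum * (1.0 + contrast * w)      # desired luminance
        voltages.append((lum / max_lum) ** (1.0 / gamma))  # gamma-corrected drive
        luminances.append(lum)
    return voltages, luminances


def michelson(lums):
    """Michelson contrast of a set of luminances."""
    return (max(lums) - min(lums)) / (max(lums) + min(lums))
```

Changing only the `contrast` argument rebuilds the table without touching the stored waveform, which is the property that allows contrast to be changed quickly from one presentation to the next.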
2.2. Stimuli The fixation field was uniform except for a small dark fixation point at the center. This field had a luminance of 30 cd/m². The target patterns were Gaussian windowed sinewave gratings (Gabor patterns) centered on the fixation point and in cosine phase with it. The maskers were either full field gratings or Gabor patterns. Target and masker always had the same spatial frequency, either 1 or 2 c/deg, and the 1/e half width (space constant) of the Gabor patterns was the reciprocal of the spatial frequency (1 and 0.5°, respectively). Both target and masker had durations of two frames (33 ms). Contrast for sinusoidal gratings was defined as the Michelson contrast; contrast for Gabor patterns was defined as the Michelson contrast of the underlying sinewave prior to attenuation by the Gaussian window. All contrasts are expressed in dB re 1, where 1 dB is 1/20 of a log unit of contrast. Viewing distance was 162 cm and the visual angle subtended by the stimulus field was 7° horizontal by 5° vertical. The maskers always occurred within a constant spatial window centered on the fixation point and their spatial phase was varied within the window relative to the fixation point. Temporal offset was specified in terms of
stimulus onset asynchrony (SOA) which refers to the time of onset of the masker minus the time of onset of the target, so that a negative SOA refers to forward masking.
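The stimulus definitions above can be made concrete with a short sketch. It covers the dB convention (1 dB is 1/20 of a log unit), a one-dimensional horizontal profile of a Gabor pattern whose 1/e half width is the reciprocal of the spatial frequency, and the SOA sign convention at a 60 Hz frame rate. The function names and the sign convention chosen for the phase argument are assumptions made for illustration.

```python
import math

FRAME_RATE_HZ = 60.0  # frame rate given in the apparatus description


def contrast_to_db(contrast):
    """Contrast in dB re 1: 20 * log10(contrast)."""
    return 20.0 * math.log10(contrast)


def db_to_contrast(db):
    """Inverse of contrast_to_db."""
    return 10.0 ** (db / 20.0)


def gabor_profile(x_deg, contrast, sf_cpd=1.0, phase_deg=0.0):
    """Contrast modulation of a Gabor pattern at horizontal position x_deg.

    The Gaussian envelope falls to 1/e at x = 1 / sf_cpd (the space constant).
    phase_deg = 0 gives cosine phase relative to the fixation point.
    """
    space_constant = 1.0 / sf_cpd
    envelope = math.exp(-((x_deg / space_constant) ** 2))
    carrier = math.cos(2.0 * math.pi * sf_cpd * x_deg + math.radians(phase_deg))
    return contrast * envelope * carrier


def soa_ms(masker_onset_frame, target_onset_frame):
    """SOA = masker onset minus target onset; negative means forward masking."""
    return (masker_onset_frame - target_onset_frame) * 1000.0 / FRAME_RATE_HZ
```

Under these definitions a masker contrast of 0.063 corresponds to about −24 dB, and a masker that starts two frames before the target has an SOA of about −33 ms.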
2.3. Procedure The observer fixated on the fixation point throughout each trial sequence. A two-alternative temporal forced-choice paradigm was used to determine target contrast thresholds. On each trial the target was presented in either the first or the second of two 33 ms observation intervals with a 1266 ms interval between them. The masker was presented in both intervals. The target interval was determined randomly with the probability of each interval being 0.5. The time intervals during which the target might be presented were indicated by a tone. The observer responded by pushing a lever forward or back to indicate that the target was in the first or second interval, respectively. The response was followed by a high or a low tone indicating correct or incorrect. The QUEST procedure (Watson & Pelli, 1983) was used to adjust the contrast so as to seek the contrast corresponding to a probability correct of 0.90. This procedure provides an estimate of this contrast which we will refer to as the target contrast threshold. The QUEST sequence was terminated after 40 trials, or 50 trials if there were no errors on the last 20 trials. An outlier test was performed (Rousseeuw, 1991) and measurements that exceeded the outlier criterion were excluded from analysis; 13 out of more than 1400 measurements were excluded, fewer than 1%. In the TvC experiments the measurements were blocked with respect to masker phase and in random order with respect to masker contrast. There was only one phase relation in each session. In the TvP experiments the measurements were blocked with respect to SOA, but in random order with respect to relative phase. Three to ten measurements were made in each condition with more measurements being made at higher masker contrasts where the variance of the measurements was greater. There were four observers. Two of them were the authors and the other two were naive with respect to the purpose of the experiment. 
All had visual acuity of 20/20 or better, with or without correction, and no visual problems. CCC, JYS and JMF were highly experienced in masking experiments; AHS was not.
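The threshold criterion used in the procedure (the contrast giving a probability correct of 0.90 in a two-alternative forced-choice task) can be illustrated with a small sketch. It assumes a Weibull psychometric function with a guessing rate of 0.5; the Weibull form, the slope value `beta = 3.5`, and the function names are assumptions for illustration and are not stated in the procedure. The sketch does not implement QUEST itself.

```python
import math


def p_correct(contrast, alpha, beta=3.5):
    """ASSUMED Weibull psychometric function for 2AFC (guess rate 0.5)."""
    return 1.0 - 0.5 * math.exp(-((contrast / alpha) ** beta))


def threshold_contrast(alpha, beta=3.5, target=0.90):
    """Contrast at which p_correct equals the target probability.

    Solves 1 - 0.5 * exp(-(c / alpha) ** beta) = target for c.
    """
    return alpha * (-math.log(2.0 * (1.0 - target))) ** (1.0 / beta)
```

With a criterion of 0.90 the threshold lies somewhat above the Weibull scale parameter `alpha`, which itself corresponds to about 82% correct under this assumed form.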
2.4. Results 2.4.1. Experiment 1: TvC functions for simultaneous masking of a Gabor target by a full-field grating masker at different relative phases Targets were Gabor patterns of 1 or 2 c/deg and maskers were full-field gratings having the same spatial
frequency as the target. TvC functions were measured for two observers at each of the two frequencies. There were four relative phases, 0, 90, 180 and 270°, although JMF made measurements only at 0 and 180°. There were some differences in the functions at 90 and 270°, but these were small and not consistent across observers. Measurements at these two phases were averaged for the purpose of presenting the results. The results are shown in Fig. 1. In this and the other experiments standard deviations were generally between 1 and 2 dB and tended to increase slightly with masker contrast. Mean standard error in experiment 1 was 0.77 dB. When relative phase is 0°, TvC functions for all four observers have the familiar dipper shape. When relative phase is 90 or 270°, there appears to be 1–2 dB of facilitation at very low masker contrasts; at higher contrasts the threshold increases with masker contrast, and the threshold at 90° is usually higher than the threshold at 0° phase. When the relative phase is 180°,
the TvC function takes several different forms. In the simplest case (JYS) the threshold increases monotonically with masker contrast. At the other extreme (JMF), the 180° TvC function rises at low masker contrasts, then decreases abruptly to just below the 0° function, then increases, approximately paralleling the 0° function. In the other two cases a threshold decrease occurs at a higher masker contrast, and in one case (CCC) the threshold does not decrease as low as the 0° function. In all figures, the smooth curves correspond to the predictions of a model that will be described below.
Fig. 1. TvC functions for detection of a Gabor target masked by a full-field grating of the same spatial frequency at different spatial phases relative to the target. SOA = 0. Duration = 33 ms. Top: 2 c/deg; bottom: 1 c/deg. Thresholds at 90 and 270° were averaged and are labeled 90°. The smooth curves correspond to the best fit of our model. The parameters of this fit are given in Table 1.
2.4.2. Experiment 2: TvC functions for simultaneous masking of a Gabor target by a Gabor masker at different relative phases Target and masker were Gabor patterns that were identical except for contrast and the relative phase of target and masker. Their center frequency was 1 c/deg. Here the question was what effect, if any, restricting the size of the masker to the same size as the stimulus would have on performance. The results are shown for two observers in Fig. 2. Mean standard error was 1.03 dB. Here the forms of the TvC functions are similar to those of experiment 1 except that in the 180° condition the threshold drops abruptly to or slightly below the threshold at 0° for both observers. This drop occurs at a masker contrast which is 4–5 dB higher than the absolute threshold of the masker. In this experiment the threshold at 90° is consistently higher than that at 0° at all masker contrasts. For the one observer that had the same spatial frequency in experiments 1 and 2 (JMF) there is more facilitation and less masking in the Gabor-on-Gabor case (experiment 2); the difference in masking is about 4 dB. So decreasing the spatial extent of the masker decreases its masking effect. This confirms a result of Foley (1994a), and it suggests that the spatial region from which a target can be masked is larger than the target.
Fig. 2. TvC functions for detection of a Gabor target masked by a Gabor masker with the same center frequency and space constant, but differing in spatial phase relative to the target. SOA = 0. Duration = 33 ms. Spatial frequency = 1 c/deg. Thresholds at 90 and 270° were averaged and are labeled 90°. JMF did not make measurements at 90 and 270°. The smooth curves correspond to the best fit of our model. The parameters of this fit are given in Table 1. The open triangles correspond to measurements made in a supplementary experiment in which the observer indicated which interval had the higher contrast.
2.4.3. Experiment 3: TvC functions for forward masking of a Gabor target by a full-field grating masker at different relative phases Here masker onset was 33 ms before target onset (SOA = −33 ms). Since the masker duration was 33 ms, the masker offset was just before target onset. Targets were Gabor patterns of 1 c/deg and maskers were full-field gratings. TvC functions were measured for two observers. There were four relative phases, 0, 90, 180 and 270°. The results are shown in Fig. 3. Mean standard error was 0.72 dB. The 0° function here is like the 180° function in experiment 1 and vice-versa, that is, the masker at 0° relative phase masks the most here and the masker at 180°, least. Both facilitation and masking are smaller in magnitude than in the simultaneous case. For both observers there are dips in the TvC functions at masker contrasts of −30 to −20 dB. This worsens the fit of the model and may represent detection by a second set of mechanisms that are the most sensitive in this contrast range.
Fig. 3. Forward masking. TvC functions for detection of a Gabor target masked by a full-field grating of the same spatial frequency at different spatial phases relative to the target. SOA = −33 ms. Duration = 33 ms. Spatial frequency = 1 c/deg. Thresholds at 90 and 270° were averaged and are labeled 90°. JMF did not make measurements at 90 and 270°. The smooth curves correspond to the best fit of our model. The parameters of this fit are given in Table 1.
Fig. 4. Threshold versus relative spatial phase (TvP) functions at different values of SOA. Masker contrast = 0.063 (−24 dB re 1). Duration = 33 ms. Spatial frequency = 1 c/deg. The horizontal line corresponds to the absolute threshold of the target. The smooth curves correspond to the best fit of our model. The parameters of this fit are given in Table 1.
2.4.4. Experiment 4: threshold versus phase functions for a constant masker contrast at different SOA’s Targets were Gabor patterns of 1 c/deg and maskers were full-field gratings. The masker contrast was 0.063 (−24 dB re 1). There were five values of SOA in the range −100 to +33 ms. A negative value here denotes forward masking (the masker onset before the target) and the positive value denotes backward masking (the masker onset after the target). There were eight values of relative phase between −180 and +135°. Measurements were blocked by SOA so that only a single SOA occurred in one session, but the different phases were presented in random order. The results are shown for two observers in Fig. 4. Mean standard error was 0.62 dB. The dashed line at the bottom corresponds to the absolute threshold
for the target. Thresholds vary systematically with relative phase. Masking occurs at all relative phases for all five SOA’s; there is no null phase for masking. There is also an effect of SOA on mean threshold. In simultaneous masking (SOA = 0) the threshold is lowest when target and masker are in-phase and increases smoothly to a maximum at about 180° relative phase. In forward masking, TvP functions are inverted. Masking is maximum at 0° and decreases to a minimum at 180°. In backward masking the two observers produced different functions. For JMF the TvP function has the same form as in simultaneous masking; for AHS the threshold reaches a maximum at 45 or 90° and then decreases for larger phase differences. AHS’s results at SOA = −33 ms are consistent with her results in experiment 3. At the masker contrast used here (−24 dB) masking is greatest at 0° and least at 180° relative phase. At 0° relative phase forward maskers at SOA’s of −33 and −67 ms mask more than a simultaneous masker. The results of our four experiments agree with some of the results in the literature and disagree with others. Each of our experiments shows very clear phase effects, including differences between 0 and 90° relative phase. This differs from the result of Lawton and Tyler, but agrees with the results of the other studies. Experimental conditions, as noted, were not identical. Our results agree with those of Georgeson’s (1988) forward masking experiments in showing that in forward masking there is a maximum in the TvP functions at 0° relative phase, but they disagree with his results in that he found a minimum at 90° and a second maximum at 180°, while we find a monotonic decrease in threshold from 0 to 180° in forward masking. With respect to the form of the TvC function at 180°, one of the forms that we found agrees well with the results of Yang and Makous (1995); the others do not. 
Previous studies of the effect of SOA when target and masker are in phase have often found a dip in the threshold at 0 SOA. We found a decrease relative to forward masking, but not relative to backward masking. We will return to these differences in Section 3.
2.5. Model Foley (1994a) proposed a model of facilitation and masking and showed that it describes the results of experiments in which the orientation of the masker is varied relative to the target and experiments in which there are two maskers with different orientations. The model has been shown to describe results of experiments in which patterns vary in temporal frequency (Boynton, 1994; Boynton & Foley, 1999), color (Chen et al., 1997), and position uncertainty (Foley & Schwarz, 1998). The central elements of the model are the units that respond to patterns. These are referred to as pattern
vision mechanisms. One of them is illustrated in Fig. 5. These mechanisms have two types of inputs. The first, shown coming in from the bottom, is an input produced by applying a linear receptive-field-like operator to the stimulus pattern. We will refer to this operator as the receptive field of the mechanism and to its output as the excitatory input to the mechanism. The second input, shown coming in from the left is an inhibitory input. The inputs combine to determine the response in the way shown here. The internal parameters p, q and Z as well as the excitatory and inhibitory sensitivities of each stimulus component are estimated from experimental data. The excitatory exponent, p, is generally greater than 2 and the inhibitory exponent, q, is less than p. Since the inhibitory input acts in an approximately divisive way, it is referred to as a divisive inhibitory input. There are many such mechanisms tuned to different orientations, spatial frequencies, and other pattern dimensions. The observer’s behavioral response in a masking task is determined by computing the absolute value of the difference between the mechanism response to masker plus target and the response to masker alone for each mechanism. The model allows for more than one mechanism response to be used in determining the behavioral response. For all of the mechanisms that are used, the absolute values of the response differences are summed nonlinearly, using the Quick (1974) rule, to produce the detection variable. At
Fig. 5. Schematic illustration of a model of the human pattern vision mechanisms. (Foley, 1994a,b, model 3).
threshold the value of the detection variable is assumed to be 1. The Foley (1994a) model does not specify the origin of the divisive inhibitory signal. It simply assigns to each component of the stimulus (e.g. masker or target) an excitatory and an inhibitory sensitivity whose values are parameters of the model that are estimated by experiment. The component excitations are summed linearly and the component inhibitions are summed nonlinearly to produce the two net inputs to the detecting mechanism. Since the model assumes that each component of the stimulus contributes independently to net inhibition, it is not appropriate to situations in which the components interact as they do when the phase difference is large. Then mutual optical cancellation occurs and the independent effects assumption is untenable. The phase difference beyond which mutual cancellation occurs is greater than 90° and its value depends on the two contrasts. In the extreme case, when one component is 180° out of phase with the other and they are the same in contrast and spatial form, they cancel completely. To correct this limitation on the 1994 model, we created a more explicit version of the model which specifies the mechanisms that mediate performance in the experiments of this study and parameterizes the model by assigning sensitivities to these mechanisms rather than to stimulus components as in the earlier model. Here the mutual cancellation of stimulus components is taken into account in computing the excitatory and divisive inhibitory inputs to the mechanisms. This model is illustrated in Fig. 6 and is stated completely in Appendix A. This model of the effect of relative phase and temporal offset specifies the receptive fields of four mechanisms. The fields coincide in space but are tuned to sinewave patterns that have the same spatial frequency but differ in phase by steps of 90°. There is both biological and psychophysical evidence for mechanisms tuned to phases at 90° intervals. 
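The structure of the four-mechanism model described in this section can be sketched in code. This is a minimal illustration under stated assumptions and not the model as stated in Appendix A: it assumes that each mechanism's excitation is a half-wave rectified projection of the summed stimulus onto the mechanism's preferred phase, that the response has the form E^p / (sum of E^q over mechanisms + Z), and that Quick pooling uses an exponent of 4. All parameter values and function names are hypothetical, temporal offset is ignored, and the parameters Cd and b are omitted. Summing the target and masker as phasors before computing excitation is what lets mutual cancellation at large phase differences enter the computation.

```python
import math

PHASES_DEG = (0.0, 90.0, 180.0, 270.0)  # preferred phases of the four mechanisms


def net_phasor(c_masker, phase_masker_deg, c_target, phase_target_deg):
    """Sum two same-frequency sinusoids as phasors; returns (x, y)."""
    pm = math.radians(phase_masker_deg)
    pt = math.radians(phase_target_deg)
    x = c_masker * math.cos(pm) + c_target * math.cos(pt)
    y = c_masker * math.sin(pm) + c_target * math.sin(pt)
    return x, y


def responses(x, y, sens=100.0, p=2.4, q=2.0, z=1.0):
    """Responses of the four mechanisms (illustrative parameter values)."""
    # ASSUMED: half-wave rectified projection onto each preferred phase.
    exc = [max(0.0, sens * (x * math.cos(math.radians(d))
                            + y * math.sin(math.radians(d))))
           for d in PHASES_DEG]
    # Common divisive inhibitory term pooled over all four mechanisms.
    inhibition = sum(e ** q for e in exc)
    return [e ** p / (inhibition + z) for e in exc]


def detection_variable(c_masker, phase_masker_deg, c_target, phase_target_deg,
                       beta=4.0):
    """Quick pooling of |R(masker + target) - R(masker)| over mechanisms."""
    with_target = responses(*net_phasor(c_masker, phase_masker_deg,
                                        c_target, phase_target_deg))
    masker_only = responses(*net_phasor(c_masker, phase_masker_deg, 0.0, 0.0))
    return sum(abs(a - b) ** beta
               for a, b in zip(with_target, masker_only)) ** (1.0 / beta)
```

In a sketch of this kind, a predicted threshold would be the target contrast at which `detection_variable` reaches 1, which could be located by a one-dimensional search over target contrast.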
Hubel and Wiesel (1962) found cortical cells in the cat and the monkey (Hubel & Wiesel, 1968), some of which had receptive fields with even symmetry and others with odd symmetry. Pollen and Ronner (1981) found that the phase response of adjacent simple cells in the cat tends to differ by approximately 90°. Field and Nachmias (1984) showed that four mechanisms tuned to 0, 90, 180 and 270° were sufficient to account for their results on the discrimination of phase relations between a fundamental and a second harmonic. Morrone and Burr (1988) used mechanisms differing in phase by 90° in a model of pattern detection and identification. Several models of motion perception incorporate mechanisms tuned to phases 90° apart, as does Heeger’s (1992) model of cat simple cells and the Teo and Heeger (1994) and Watson and Solomon (1997) models of pattern masking. There
J.M. Foley, C.-C. Chen / Vision Research 39 (1999) 3855–3872
Fig. 6. Schematic illustration of the model used in this study. There are four mechanisms tuned to phases 90° apart. A common divisive inhibitory signal is derived from the excitation of all four receptive fields. Differential responses of the four mechanisms to target plus masker and masker alone are pooled nonlinearly to determine a detection variable which equals 1 at detection threshold.
have been relatively few quantitative tests of the adequacy of these models. We found that four mechanisms tuned to phases 90° apart are sufficient to account well for the results of the present experiments. We recognize that there are other mechanisms tuned to other orientations, spatial frequencies, and positions. However, we assume that those mechanisms do not contribute to the detection of our targets. Hence, they are not shown here. All four mechanisms have the same values of the internal parameters, p, q and Z. The model computes the excitation of each of the four mechanisms to the patterns that are presented. The inhibitory term for each mechanism is the sum of the excitations of all the mechanisms each raised to the power q. This is similar to the way that the Heeger (1992) model of cortical cells computes the inhibitory term except that his model raises excitation and inhibition to the same power. As described in Appendix A, the model has five parameters in addition to the mechanism sensitivity parameters. These are the internal mechanism parameters, p, q and Z, which are assumed to be the same for all four mechanisms, and Cd and b, which specify the masker contrast above which the out-of-phase (180°)
mechanism is used in the detection of the in-phase target and the weight given to the out-of-phase mechanism in computing the detection variable. There is an excitatory and a divisive inhibitory sensitivity for each pattern component (target or masker). These are assumed to be equal across the four mechanisms, except that the divisive inhibitory sensitivity of the 90 and 270° mechanisms may differ by a factor, a, from the sensitivity of the 0 and 180° mechanisms. Excitatory and divisive inhibitory sensitivities to the masker vary as independent functions of SOA, so when SOA varies there is a pair of sensitivities for each SOA value. The excitatory sensitivity to the target, SEt, was always fixed at 100. It is essential to fix one of the parameters in order to get a unique set of parameter values for each data set. This is because the response function (Eq. (19) in Appendix A) is a ratio and any parameter set which produces the same ratio will fit the data equally well. When the model is applied to families of TvC functions for a single target and maskers of different relative phases there are nine parameters: p, q, Z, Cd and b, plus the sensitivity parameters: a, SIt, SEm and SIm. In experiment 3 where the target and the
masker had the same Gabor form, SEm = SEt and SIm = SIt and there were seven parameters. In practice we found that we could fix some of the parameters based on a qualitative examination of the data. When the 180° function was monotonically increasing, we could fix b = 0. Here Cd had no effect and we fixed it equal to 1. When the 180° TvC function dropped to the 0° function, b = 1. In fitting the TvP functions we found that the fits were relatively insensitive to several of the parameters, so we fixed these to values derived from the fits to the TvC functions. Here, since SOA varies there are ten masker sensitivities in place of the two in the TvC experiments, so 17 parameters are required to fit the TvP data. The model was fitted to the data using a routine that finds parameter values that minimize the Sum of Squared Error (SSE) between the measurements and the values predicted by the model. The routine employs the methods of Powell and Brent and uses code found in Press, Flannery, Teukolsky and Vetterling (1986). Our procedure was as follows: for each data set, we first found parameter values by trial and error that gave a rough fit to the data. Then we did 30 least squares fits, starting each time with a different set of parameter values which were sampled randomly from distributions centered on the parameters of the rough fit. We then took the best of the 30 fits as the overall best fit. In every case there were at least several fits that were very similar in RMSE and parameter values to the best fit. The smooth curves in the figures correspond to the best fits of the model to the data. Data for each observer in each experiment were fitted separately. The fits are summarized in Table 1. Here the number of free parameters refers to the parameters that were not fixed in advance. Fixed parameters are labeled as such. In all ten data sets the fits are reasonably good. The mean RMSE across the ten fits is 1.12 dB.
This is 1.44 times the mean standard error of the measurements which is 0.78 dB. There are no large systematic differences between the model and the measurements. Since JMF was an observer in all four experiments, we did a second fit of his data, fitting the results of all four experiments simultaneously. The target was the same Gabor pattern in all four experiments, and we assumed that the mechanisms that detected it were the same. Thus, in making the fit we constrained the six parameters p, q, Z, SEt, SIt, and a to be the same across the four experiments. Excitatory and divisive inhibitory sensitivities to the full-field masker were allowed to vary with SOA (five values) and different values were allowed for the Gabor masker. So there were 17 free parameters in the overall fit (12 masker sensitivities, one target sensitivity (SEt was fixed), and p, q, Z and a). On the basis of a qualitative examination of the data we fixed Cd = 0.02 and b= 1 in experiments 1 and 2, and b = 0 in experiments 3 and 4. So 20 parameters were used to make the overall fit.
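The multi-start least-squares procedure described above (30 fits from random starting points centered on a rough fit, keeping the best) can be sketched as follows. This is a minimal illustration, not the authors' code: the original routine used the Powell and Brent methods from Press et al. (1986), whereas this sketch substitutes a simple coordinate search so that it needs only the standard library. The names `sse`, `pattern_search`, `multistart_fit` and `line`, the toy straight-line model, the data, and the jitter width are all assumptions introduced for illustration.

```python
import random

def sse(params, model, xs, ys):
    """Sum of squared errors between measurements and model predictions."""
    return sum((model(x, params) - y) ** 2 for x, y in zip(xs, ys))

def pattern_search(loss, start, step=0.5, tol=1e-8, max_iter=100000):
    """Derivative-free coordinate search (a simple stand-in for Powell's method)."""
    best = list(start)
    best_loss = loss(best)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                trial_loss = loss(trial)
                if trial_loss < best_loss:
                    best, best_loss = trial, trial_loss
                    improved = True
        if not improved:
            step *= 0.5  # no coordinate move helped: refine the step size
    return best, best_loss

def multistart_fit(model, xs, ys, rough, n_starts=30, jitter=0.3, seed=0):
    """Fit from n_starts random starts centered on a rough fit; keep the best."""
    rng = random.Random(seed)

    def loss(params):
        return sse(params, model, xs, ys)

    fits = []
    for _ in range(n_starts):
        # Sample each starting value from a distribution centered on the rough fit.
        start = [r + rng.gauss(0.0, jitter * abs(r) + 1e-3) for r in rough]
        fits.append(pattern_search(loss, start))
    return min(fits, key=lambda fit: fit[1])

def line(x, params):
    """Toy model used only for this illustration: y = slope * x + intercept."""
    slope, intercept = params
    return slope * x + intercept

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # hypothetical data generated from y = 2x + 1
best_params, best_sse = multistart_fit(line, xs, ys, rough=[1.5, 0.5])
rmse = (best_sse / len(xs)) ** 0.5
```

Running several fits from different starting points and comparing their RMSE and parameter values, as the text describes, is a common guard against stopping in a local minimum of the error surface.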
The parameters of the best overall fit to JMF’s data are given in Table 2. The RMSE of the fit was 1.89. This is a reasonably good fit, given that there were seven TvC functions and five TvP functions being fitted simultaneously here. Here a=1.34. There are no large systematic deviations between the model and the measurements. Overall, the model does a reasonably good job of describing the qualitative as well as quantitative aspects of the results and individual differences in performance. Differences in the form of the TvC function for 180° are accounted for by: (1) whether or not the responses of the 180° mechanism are used to determine the threshold; (2) the masker contrast above which they are used; and (3) the weight given to them relative to the responses of the 0° mechanism. We think that this part of the model reflects detection strategy on the part of the observer, although it is not yet clear how much control the observer has over this. To examine whether instructions can select a different detection strategy we did a supplementary experiment. We used the Gabor-on-Gabor paradigm with the same stimuli as experiment 2. Everything was the same as experiment 2 except the instructions. Here the instructions were to indicate the interval with the highest contrast. A response was scored as correct and the correct feedback signal given only if the target plus masker contrast was greater than the masker contrast. Here the model predicts that detection will be mediated by the in-phase mechanism and that the TvC function will increase monotonically with masker contrast. Both observers produced this result. The measurements are shown as open triangles in Fig. 2a and b. The result shows that a change in detection strategy can be produced by a change in instructions. It does not follow that we can select any of the possible detection strategies in this way. 
It is interesting that the responses of the 180° mechanism are often not used even though using them would improve performance. Four factors seem to influence their use. They tend to be used: (1) by experienced observers; (2) at the higher masker contrasts; (3) when target and masker have the same spatial profile; and (4) when relative phase is blocked over measurements. In our results there is no case in which responses of the out-of-phase mechanism are used at the lowest contrasts at which they could reduce thresholds. Why does the system not take advantage of the responses of the out-of-phase mechanism? Three possibilities are the following: (1) When target and masker are in phase this mechanism does not respond and a habit of ignoring it may develop (factor 1). (2) Phenomenologically, when target and masker are in phase, the stimulus that contains the target always has the higher contrast and the observer may come to rely on this cue. These two possible explanations involve detection strategy errors
on the part of the observer. It is plausible that they might be overcome by experience with feedback. (3) When target and masker are out of phase the stimulus that contains the target will have lower contrast. The observer has to detect and use relative phase to tell apart the two conditions. Perhaps it is difficult and sometimes impossible to do this. All four factors mentioned above may help observers to take account of relative phase in making their detection judgements. Fig. 7 shows the excitatory and inhibitory sensitivities of the mechanisms to the masker at the time of target detection as a function of the onset of the target re the masker (this is the negative of the SOA). These were obtained by fitting the model to the data of
Table 1. Fit summary and parameter values estimated by fitting the model to the data

Experiment 1: Gabor target; full-field masker; simultaneous masking

| Parameter | JYS (2 c/deg) | CCC (2 c/deg) | AHS (1 c/deg) | JMF (1 c/deg) |
|---|---|---|---|---|
| Number of data points | 31^a | 33 | 33 | 22 |
| Number of free parameters^b | 6 | 8 | 9 | 7 |
| RMSE (dB) | 1.25 | 0.97 | 1.15 | 1.05 |
| SEt (fixed) | 100 | 100 | 100 | 100 |
| SIt | 80.60 | 99.65 | 91.23 | 41.31 |
| SEm | 187.82 | 246.33 | 144.79 | 185.30 |
| SIm | 204.80 | 177.84 | 91.23 | 128.79 |
| a | 1 (fixed) | 1 (fixed) | 1.50 | 1.00 (fixed) |
| p | 2.66 | 2.26 | 2.23 | 2.36 |
| q | 2.25 | 1.72 | 1.55 | 2.20 |
| Z | 21.00 | 4.60 | 1.17 | 2.11 |
| Cd | 1 (fixed) | 0.06 | 0.04 | 0.02 |
| b | 0 (fixed) | 0.18 | 0.02 | 1 (fixed) |

Experiment 2: Gabor target; Gabor masker; simultaneous masking (1 c/deg)

| Parameter | JMF | CCC |
|---|---|---|
| Number of data points | 33 | 33 |
| Number of free parameters | 5 | 6 |
| RMSE (dB) | 1.66 | 1.75 |
| SEt (fixed) | 100.00 | 100.00 |
| SIt | 55.73 | 77.58 |
| SEm | 100.00 | 100.00 |
| SIm (locked = SIt) | 55.73 | 77.58 |
| a | 1.45 | 1.50 |
| p | 2.37 | 3.71 |
| q | 2.04 | 3.33 |
| Z | 1.39 | 3.12 |
| Cd | 0.02 | 0.02 |
| b | 1 (fixed) | 1 (fixed) |

Experiment 3: Gabor target; full-field masker; SOA = −33 ms (1 c/deg)

| Parameter | AHS | JMF |
|---|---|---|
| Number of data points | 33 | 22 |
| Number of free parameters | 6 | 6 |
| RMSE (dB) | 0.84 | 1.23 |
| SEt (fixed) | 100.00 | 100.00 |
| SIt | 169.07 | 144.96 |
| SEm | −4.33 | −11.63 |
| SIm | 476.12 | 258.35 |
| a | 1.00 (fixed) | 1.00 (fixed) |
| p | 2.75 | 3.34 |
| q | 1.11 | 1.84 |
| Z | 4.69 | 4.32 |
| Cd | 1 (fixed) | 1 (fixed) |
| b | 0 (fixed) | 0 (fixed) |

Experiment 4: Gabor target; full-field masker; seven phases; five SOA's (1 c/deg)

| Parameter | JMF | AHS |
|---|---|---|
| Number of data points | 48 | 48 |
| Number of free parameters | 13 | 13 |
| RMSE (dB) | 0.46 | 0.84 |
| SEt (fixed) | 100.00 | 100.00 |
| SIt (fixed) | 60 | 60 |
| Masker sensitivities SEm (one value per SOA) | 7.01, −5.74, 9.14, 78.81, 42.56 | 11.47, 10.83, 3.33, 38.85, 9.12 |
| Masker sensitivities SIm (one value per SOA) | 37.75, 83.74, 137.71, 132.49, 47.75 | 55.59, 90.77, 190.70, 167.20, 58.79 |
| a (fixed) | 1.00 | 1.00 |
| p (fixed) | 2.60 | 2.60 |
| q (fixed) | 2.00 | 2.00 |
| Z | 1.43 | 3.78 |
| Cd (fixed) | 1 | 1 |
| b (fixed) | 0 | 0 |

^a There were no valid measurements in two conditions.
^b The number of free parameters shown is the number that were allowed to vary in making the fits. For most data sets there were other parameters whose values were determined by a qualitative examination of the data.
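Table 2. Fit summary and parameter values for all four experiments fitted simultaneously for observer JMF

| Quantity | Value |
|---|---|
| Number of data points | 125 |
| Number of free parameters | 17 |
| RMSE (dB) | 1.89 |
| SEt (fixed) | 100.00 |
| SIt | 47.73 |
| a | 1.34 |
| p | 2.15 |
| q | 1.88 |
| Z | 1.74 |

Masker sensitivities

| Masker | SOA (ms) | SEm | SIm |
|---|---|---|---|
| Full-field | −100 | 6.06 | 36.02 |
| Full-field | −67 | −3.44 | 58.87 |
| Full-field | −33 | −5.43 | 91.86 |
| Full-field | 0 | 140.27 | 115.80 |
| Full-field | 33 | 42.00 | 34.72 |
| Gabor | 0 | 163.40 | 166.79 |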
experiment 4. Excitatory sensitivity is a biphasic function of time, that is it increases, then decreases, then increases again; divisive inhibitory sensitivity is a monophasic function of time. Although biphasic, excitatory sensitivity is negative only at one SOA for JMF. Models of temporal impulse response functions for mechanisms tuned to low spatial frequencies are positive for a few ms and then become negative (Watson, 1986). Our excitatory sensitivity functions have the same form, except that they are shifted upward so that most of the values are positive. Although the TvP function is inverted in forward masking for all SOA’s, the sign of the excitatory sensitivity remains positive when the amplitude of threshold modulation with phase is small. The monophasic nature of the inhibitory function cannot be considered to be established by these results, because other parameters sets which yield almost as good a fit have one or more negative sensitivities. On the other hand, early in the model development we found that models in which the temporal modulation of excitatory and divisive inhibitory sensitivity has the same form are qualitatively and quantitatively inadequate to describe our data. JMF detected the same target in experiments 1, 2, 3 and 4. AHS detected the same target in experiments 1, 3 and 4. Here we would expect the same mechanism to detect the target in the different experiments. Thus, in the separate fits to the data of the different experiments (Table 1) the parameters SEt, SIt, p, q and Z should be approximately the same. For JMF there is reasonably good agreement across experiments 1, 2
and 4, except that excitatory sensitivity to the masker is less in experiment 4. AHS shows the same difference and the values of p and q are also quite different for her in experiment 3 (forward masking) than in experiments 1 and 4. If q does decrease in forward masking this will require a modification in the model. However, it would be premature to conclude that this happens on the basis of this one data set. Boynton (1994) fitted TvC functions for a range of SOA’s with the same values of p, q and Z. We would expect the parameter, a, to be constant across experiments for the same observer. In fact it is very near 1 for seven of the ten data sets and near 1.5 in the other three each of which was produced by an observer for whom the parameter was near 1 in other data sets. The bimodal nature of the distribution suggests that it may have some other basis than random variation, but we do not know what it is. Some of our within observer differences may be due to practice effects as the experiments were done over a period of several months during which the observers also participated in other experiments.
Fig. 7. Excitatory and inhibitory sensitivity to the masker as a function of onset time of the masker relative to onset time of the target ( −SOA). Values were estimated by fitting the model to the data.
3. Discussion
First we will compare our results with other results in the literature. We do find an effect of relative phase in masking. This is in agreement with Georgeson (1988), Bowen and Wilson (1994), Bowen (1995) and Yang and Makous (1995). It is not in agreement with Lawton and Tyler (1994), who found no difference between 0 and 90° relative phase. Their conditions are different from ours in that their target comes on and goes off during the masker, but the other studies cited are similar to theirs in this respect. With respect to the form of TvC functions, we found the familiar dipper-shaped form in the in-phase, 0 SOA case. At 90 and 270° we found a small amount of facilitation in several cases. This is in agreement with Georgeson, but not with Yang and Makous. At 180° we found quite a bit of individual variation. Some functions rise monotonically. Others rise and then drop sharply to about the level of the 0° function, then rise in parallel with it. The drop occurs at different masker contrasts in different cases. Sometimes thresholds do not drop as low as the 0° thresholds before beginning to rise. In forward masking our monotonically increasing function at 0° agrees with Georgeson (1988, Fig. 2) and our dipper-shaped function at 180° agrees with Bowen (1997). In general, the form of our TvC functions is in agreement with others, except that we found more variation in the form of the 180° function. Our model accounts well for the form of these TvC functions, including the individual differences, except for the small amount of facilitation that sometimes occurs at 90°. Our TvP functions at 0 SOA and masker contrast −24 dB have a minimum at 0° and rise to a maximum at 180°. This is consistent with a 180° TvC function that has not dropped at this contrast. As noted above, we found that the 180° TvC function had dropped at this contrast in some cases, but not others.
In forward masking we found inverted TvP functions that have a maximum at 0° and decrease to a minimum at 180°. Georgeson (1988) obtained a different form of TvP function in forward masking. He also found a maximum at 0°, but his functions drop to a minimum at about 90° and then rise to a maximum at 180° (Georgeson, 1988, Figs. 4 and 5). We did not obtain minimum masking at 90° in any of our data sets. Georgeson’s data cannot be explained by a version of our model that employs both the 0° mechanism and the 180° mechanism to detect, because that model predicts minima at 0 and 180° with a maximum in the vicinity of 90° at all SOA’s (interestingly, AHS shows this type of TvP at SOA= + 33 ms.) Georgeson’s results can be explained by our model if only the 0° mechanism is used in detection and it is very strongly inhibited by the mechanism tuned to 180°. However,
the principal problem here is to understand the difference in the results of the two studies.
3.1. Other interpretations of spatial phase and SOA effects
On the basis of his results, Georgeson (1988) proposed that there are two processes involved in forward masking. One of them is responsible for facilitation and the other for masking. This anticipated Foley's (1994) model which described two processes quantitatively and incorporated them in a model of facilitation and masking. Georgeson incorporated his idea into a model of forward masking based on detection by units that are sensitive to the direction of motion. He simulated this model and showed that it is qualitatively consistent with his results. Our model does not have a motion unit stage and in that sense it is simpler than Georgeson's model. Nevertheless, it accounts well for our data and it is at least qualitatively consistent with much of his data. Further, our experiment 4 shows directly that sensitivity to phase does not change with time (except for a sign change), as it would for a motion sensitive mechanism. Yang and Makous (1995) incorporate a model of phase effects into their more general model of masking (Yang, Qi & Makous, 1995). According to this model any stimulus component, including a uniform background, produces excitation that spreads in spatial frequency around that component. The threshold for a target grating is a power function of the excitation at the frequency of the grating minus a subthreshold summation term, which grows at very low contrasts and goes to zero as contrast increases. It is this subtractive term that produces facilitation at low contrasts. Their model is phase insensitive. The relative phase of target and masker has an effect only insofar as it influences the amplitude of the combined stimulus.
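How relative phase influences the amplitude of a combined stimulus can be illustrated with a short sketch. It relies only on the standard result that two sinusoids of the same frequency, with contrasts Cm and Ct and relative phase θ, sum to a sinusoid of amplitude sqrt(Cm² + Ct² + 2·Cm·Ct·cos θ). The second function computes the target contrast needed to change the amplitude by a fixed amount; that fixed amount (`delta`) and both function names are assumptions made for illustration, and the code is not an implementation of the Yang and Makous model itself.

```python
import math

def combined_amplitude(cm, ct, theta_deg):
    """Amplitude of masker + target: two same-frequency sinusoids theta_deg apart."""
    theta = math.radians(theta_deg)
    power = cm * cm + ct * ct + 2.0 * cm * ct * math.cos(theta)
    return math.sqrt(max(0.0, power))  # clamp tiny negative rounding errors

def target_contrast_for_difference(cm, delta, theta_deg):
    """Smallest target contrast that changes the combined amplitude by delta.

    delta is a hypothetical fixed amplitude-difference criterion. Both an
    increment (to cm + delta) and a decrement (to cm - delta) are considered.
    """
    c = math.cos(math.radians(theta_deg))
    # Increment branch: solve ct^2 + 2*cm*c*ct + cm^2 = (cm + delta)^2 for ct > 0.
    candidates = [-cm * c + math.sqrt(cm * cm * c * c + 2.0 * cm * delta + delta * delta)]
    # Decrement branch: solve ct^2 + 2*cm*c*ct + cm^2 = (cm - delta)^2.
    # A positive root exists only when the target opposes the masker (c < 0).
    disc = cm * cm * c * c - 2.0 * cm * delta + delta * delta
    if c < 0.0 and delta <= cm and disc >= 0.0:
        candidates.append(-cm * c - math.sqrt(disc))
    return min(ct for ct in candidates if ct > 0.0)

# Example with a masker contrast of 0.063 and an illustrative criterion of 0.01.
thresholds = {phase: target_contrast_for_difference(0.063, 0.01, phase)
              for phase in (0, 90, 180)}
```

Under these assumptions the required target contrast is smallest at 0° and 180° and larger at 90°, because a quadrature target changes the combined amplitude least per unit contrast.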
Thus, as the phase difference between target and masker increases, the threshold will increase by an amount sufficient to maintain a constant amplitude difference between the masker plus target and the masker alone. When the masker and target are sufficiently out of phase that adding the target to the masker produces a contrast less than that of the masker alone, the system detects the decrement in the response to target plus masker relative to masker alone. This decrement threshold depends on excitation at the target frequency in the same way that the increment threshold does. Thus the model predicts phase effects without any additional parameters. The TvC function for the 0 relative phase case is determined by the parameters of the model; functions for all other cases are derived from this assuming a constant contrast discrimination threshold. The model gives a good account of the functions of the one observer that Yang and Makous present.
Fig. 8. TvC functions for JMF from E2. Smooth curves through the data for 0° phase are derived from our model. The smooth curves for the other two phases are derived from Yang and Makous' model of the effect of masker phase.
Although none of our experiments employ the sine on sine masking paradigm that they used, their model would seem to apply to our experiment 2 where target and masker were Gabor patterns of the same size. We used our model to fit the 0° TvC and then used Yang and Makous' model to predict the effects of changes in masker phase. Yang and Makous predict these changes without any additional parameters. These predictions and the results of experiment 2 for JMF are shown in Fig. 8. There are two inconsistencies between the predictions and the data. First, in the 180° condition, the model threshold drops at a lower masker contrast than the measured threshold. Second, in the 90° condition, the model threshold is consistently higher than the measured threshold. The 90° TvC for CCC comes close to the predictions of the Yang and Makous model, but the same discrepancy is found at 180°. Thus, there are systematic qualitative errors in the predictions of the Yang and Makous model here. Our model needs two parameters to account for the changes in the TvC function with masker phase in this case (a and Cd). The Yang and Makous model also predicts the form of the TvP functions. At SOA = 0 and a masker contrast above threshold, their predicted TvP functions have a minimum at 0°, rise to a maximum at 90° and decrease to a minimum at 180°. Our TvP functions are quite different. However, our model predicts similar TvP functions if both the 0 and 180° mechanisms are used to detect the target. The Yang and Makous model says nothing about forward or backward masking, but presumably it could be extended to these paradigms. Bowen interprets his results as indicating that pathways tuned to 0 and 180° phases interact in determining thresholds. Our results and model are consistent
with this and show more generally that the pathways that detect patterns of any phase interact with those that detect any other phase. The model assumes that this pathway interaction has the form of divisive inhibition. Bowen interprets his results as indicating pathway isolation in that a 0° phase target is always detected by a pathway that is excited by this target and a 180° phase target is always detected by a pathway that is excited by a 180° target. This is consistent with his results, but not with ours. When the TvC function has the form shown in Fig. 2, our model interprets this as detection by the in-phase mechanism at low contrast and by the out-of-phase mechanism at high contrast. The model that we fitted to our measurements is more complex than we expected. The complexity arises from our attempt to account for the large differences in performance between individuals in the same experiment and within individuals in different tasks. We attribute these differences to differences in the way that the threshold depends on mechanism responses. The dependence that we propose is the simplest that we found that could account for our results. The processes probably are more complex than those in the model. Our experiments are not sufficient to test different models as to how these processes might work. In summary, we have determined the form of TvC functions for simultaneous masking with briefly pulsed maskers at different phases relative to the target. For 0° relative phase this function is dipper-shaped, as has frequently been shown. At 90 and 270° relative phase, the function is monotonic, increasing and accelerating as masker contrast increases with a small amount of facilitation at low masker contrasts in some cases. A masker at 90 or 270° masks more than a masker at 0° relative phase. There is good agreement across observers in the form of these functions.
When the relative phase is 180°, however, there is considerable variation in the form of the TvC function with condition and observer. TvC functions for forward masking have the same general form as those for simultaneous masking, except that the 0° function in simultaneous masking and the 180° function in forward masking are similar and vice versa. We have determined the form of the TvP function for a single masker contrast at five values of SOA. In simultaneous and backward masking, it has a minimum at 0° and a maximum at 180°. In forward masking this form is inverted. We have presented a model that describes all these results well with relatively few parameters. Variations in the form of functions are attributed to whether the observer uses the out-of-phase mechanism in this task, at what masker contrasts it is used, and how much weight it is given in determining the threshold.
Acknowledgements

This project was partially supported by grants from the US Public Health Service (EY 07201) from the National Eye Institute and the University of California. Some of the results and an early version of the model were presented at the 1994 annual meeting of the Association for Research in Vision and Ophthalmology (Foley, 1994b). Other results and an improved version of the model were presented at the 1995 European Conference on Visual Perception and the 1997 meeting of the Optical Society of America. We thank Jerome Tietz for assistance with the experiments, our observers, AHS and JYS, for the care with which they carried out their task, and two anonymous reviewers for helpful comments on the manuscript.

Appendix A. Model of the effect of masker spatial phase and temporal offset in pattern masking

This model is designed to describe the results of experiments in which the target is a Gabor pattern in cosine phase with the fixation point and the masker is either a Gabor pattern or a full-field grating that varies in phase relative to the target. The model can be generalized to other stimuli, but we do not consider the general case here.

A.1. Symbols

$C_t$, $C_m$: contrasts of target and masker
$t(x,y)$, $m(x,y)$: luminance modulation produced by target and masker
$f_t$, $f_m$, $f_r$: center spatial frequency of target, masker and of receptive field spatial frequency sensitivity function
$\sigma_t$, $\sigma_m$: 1/e space constants of target and masker
$\theta_m$: spatial phase of masker re target (and fixation point)
$S_{E0}$, $S_{E90}$, $S_{E180}$, $S_{E270}$: excitatory sensitivity parameters of the four mechanisms (these were assumed to be equal)
$S_{I0}$, $S_{I90}$, $S_{I180}$, $S_{I270}$: inhibitory sensitivity parameters of the mechanisms (these sensitivities were equated for 0 and 180 and for 90 and 270. Sensitivities at 90 and 270 equalled those at 0 and 180 times a constant, $a$)
$a$: factor by which sensitivity at 90 and 270 differs from that at 0 and 180
$t$: stimulus onset asynchrony (SOA)
$S_{Em}(t)$: excitatory sensitivity of the mechanisms to masker at SOA = $t$
$S_{Im}(t)$: inhibitory sensitivity of the mechanisms to masker at SOA = $t$
$S_{Et}$: excitatory sensitivity of the mechanisms to target
$S_{It}$: inhibitory sensitivity of the mechanisms to target
$j$: index for mechanisms
$E'_j$: excitation of mechanism $j$
$E_j$: halfwave rectified excitation of mechanism $j$
$I'$: sum of divisive inhibitory inputs (same for all four mechanisms)
$Z$: mechanism parameter. Constant in masking experiments. Equal for all four mechanisms
$C_d$: masker contrast above which the 180° mechanism is used in detection
$b$: the weight given to the out-of-phase mechanism in computing the detection variable, $D$
$D$: the detection variable. At threshold $D = 1$

A.2. Specification of the stimuli

The Gaussian window of the target is given by:

$$G_t(x,y) = \exp\!\left(-(y/\sigma_t)^2\right)\exp\!\left(-(x/\sigma_t)^2\right)$$

The Gaussian window of the masker is given by:

$$G_m(x,y) = \exp\!\left(-(y/\sigma_m)^2\right)\exp\!\left(-(x/\sigma_m)^2\right)$$

We assume that the mechanisms produce no response to the mean luminance, so only the modulation produced by the target and masker is considered. The modulation produced by the target is:

$$t(x,y) = C_t\,G_t(x,y)\cos(2\pi f_t x) \tag{1}$$

The modulation produced by the masker is:

$$m(x,y) = C_m\,G_m(x,y)\cos(2\pi f_m x + \theta_m) \tag{2}$$

Using the theorem for the cosine of the sum of two angles, this can be rewritten:

$$m(x,y) = G_m(x,y)\left[C_m\cos(\theta_m)\cos(2\pi f_m x) - C_m\sin(\theta_m)\sin(2\pi f_m x)\right] \tag{3}$$

The masker plus the target is:

$$mt(x,y) = C_m\,G_m(x,y)\left[\cos(\theta_m)\cos(2\pi f_m x) - \sin(\theta_m)\sin(2\pi f_m x)\right] + C_t\,G_t(x,y)\cos(2\pi f_t x) \tag{4}$$
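The stimulus definitions in Eqs. (1)-(4) can be checked numerically with a short sketch. It confirms that the expanded form of the masker in Eq. (3) equals the direct form in Eq. (2), and that a masker 180° out of phase with a target of the same contrast, frequency and window cancels it in Eq. (4). All function names and the numerical values (contrasts, frequency, space constant, sampling grid) are assumptions chosen for illustration.

```python
import math

def gaussian_window(x, y, sigma):
    """Gaussian window with 1/e space constant sigma (section A.2)."""
    return math.exp(-(y / sigma) ** 2) * math.exp(-(x / sigma) ** 2)

def target(x, y, ct, ft, sigma_t):
    """Eq. (1): target modulation in cosine phase."""
    return ct * gaussian_window(x, y, sigma_t) * math.cos(2 * math.pi * ft * x)

def masker(x, y, cm, fm, theta_m, sigma_m):
    """Eq. (2): masker modulation with spatial phase theta_m (radians)."""
    return cm * gaussian_window(x, y, sigma_m) * math.cos(2 * math.pi * fm * x + theta_m)

def masker_expanded(x, y, cm, fm, theta_m, sigma_m):
    """Eq. (3): the same masker written with the cosine-sum identity."""
    carrier = (cm * math.cos(theta_m) * math.cos(2 * math.pi * fm * x)
               - cm * math.sin(theta_m) * math.sin(2 * math.pi * fm * x))
    return gaussian_window(x, y, sigma_m) * carrier

def masker_plus_target(x, y, cm, ct, f, theta_m, sigma):
    """Eq. (4), here with equal frequency and window for target and masker."""
    return masker_expanded(x, y, cm, f, theta_m, sigma) + target(x, y, ct, f, sigma)

# Sample a small grid of positions (arbitrary units).
grid = [(-1.0 + 0.1 * i, -1.0 + 0.1 * j) for i in range(21) for j in range(21)]

# Eq. (2) and Eq. (3) agree at every sampled position.
max_identity_error = max(
    abs(masker(x, y, 0.2, 1.0, math.radians(70), 0.5)
        - masker_expanded(x, y, 0.2, 1.0, math.radians(70), 0.5))
    for x, y in grid)

# Equal contrast, 180 deg relative phase: the two components cancel.
max_cancel_residual = max(
    abs(masker_plus_target(x, y, 0.2, 0.2, 1.0, math.pi, 0.5)) for x, y in grid)
```

The cancellation case is the extreme that motivates computing mechanism inputs from the combined stimulus rather than from independent components.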
A.3. Model Assume that there are four mechanisms with receptive fields centered on the fixation point. Each of these
receptive fields is assumed to have a linear spatial-temporal sensitivity function. The spatial sensitivity functions are windowed cosines with spatial phases of 0, 90, 180 and 270° relative to the fixation point. We specify the window only as an even function of x and y, as our predictions are independent of its exact form. We do not specify how sensitivity varies as a function of time, but instead allow this function to be determined experimentally. We define the sensitivity of a mechanism at any time t as that number which when multiplied by the contrast of the stimulus will give the response to the stimulus at that time, i.e. $S(t) = R(t)/C$. We assume that performance is determined by the response of the mechanisms at some time after the target onset. At this time the mechanism response is assumed to be sampled by the detection process. This time is not specified and it is possible that it is not fixed relative to the time of target presentation. All the sensitivities that we will consider are sensitivities at that time. The sensitivity to the masker at that time depends on SOA, represented here by t. Hence, sensitivities of the mechanisms to the masker are expressed as functions of t. The spatial-temporal sensitivity functions of the mechanisms are given by:

$$\begin{aligned}
S_0(x,y,t) &= S_0(t)\,W(x,y)\cos(2\pi f_r x)\\
S_{90}(x,y,t) &= S_{90}(t)\,W(x,y)\sin(2\pi f_r x)\\
S_{180}(x,y,t) &= S_{180}(t)\,W(x,y)\,(-\cos(2\pi f_r x))\\
S_{270}(x,y,t) &= S_{270}(t)\,W(x,y)\,(-\sin(2\pi f_r x))
\end{aligned} \tag{6}$$

where t is SOA and W(x,y) is assumed to be an even function of x and the same for all four mechanisms. Thus the spatial sensitivity functions are windowed cosines that have spatial phases of 0, 90, 180 and 270° re the fixation point. We do not specify this window function. A change in the window function has the same effect as multiplying all sensitivities by a constant. Mechanism sensitivity is space-time separable. Although expressions for the excitation of the different mechanisms by our stimulus patterns are somewhat complex, all can be greatly simplified and written as functions of pattern contrast, spatial phase, and time re stimulus onset. We illustrate this by deriving the expression for the excitation of the 0 phase mechanism by the masker alone. This is given by:

$$\begin{aligned}
E'_{0m}(t) &= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} S_0(x,y,t)\,m(x,y)\,dx\,dy\\
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} S_0(t)\,W(x,y)\cos(2\pi f_r x)\,G_m(x,y)\left[C_m\cos(\theta_m)\cos(2\pi f_m x) - C_m\sin(\theta_m)\sin(2\pi f_m x)\right]dx\,dy
\end{aligned}$$

Factoring out the terms that are independent of x:

$$\begin{aligned}
E'_{0m}(t) &= S_0(t)\,C_m\cos(\theta_m)\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} W(x,y)\,G_m(x,y)\cos(2\pi f_r x)\cos(2\pi f_m x)\,dx\,dy\\
&\quad - S_0(t)\,C_m\sin(\theta_m)\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} W(x,y)\,G_m(x,y)\cos(2\pi f_r x)\sin(2\pi f_m x)\,dx\,dy\\
&= S_0(t)\,C_m\cos(\theta_m)\,K_{0mcc} - S_0(t)\,C_m\sin(\theta_m)\,K_{0mcs}
\end{aligned} \tag{7}$$

Although we will write the equations for the general case ($f_m \neq f_r$), we found that a model that assumes $f_m = f_r$ describes the results almost as well as a model that allows $f_r$ to be a free parameter. Consequently, we assume $f_m = f_r$. This eliminates one of the two terms in each excitation function, the term containing the product of the sine and the cosine, since the integral in this term is 0. With this assumption, the masker excitation function may be written:

$$E'_{0m}(t) = C_m\,S_{E0m}(t)\cos(\theta_m), \tag{8}$$
where $S_{E0m}(\tau)$ (the excitatory sensitivity of the mechanism to the masker when it is at phase 0° re the fixation point) is the product of the terms that are independent of masker contrast and phase, i.e. $S_{E0m}(\tau) = S_{0}(\tau)\,K_{0mcc}$. For the other three mechanisms the excitation produced by the masker alone can be shown to be:

$$
E'_{90m}(\tau) = C_m\,S_{E90m}(\tau)\sin(\theta_m), \tag{9}
$$

$$
E'_{180m}(\tau) = C_m\,S_{E180m}(\tau)\,(-\cos(\theta_m)), \tag{10}
$$

$$
E'_{270m}(\tau) = C_m\,S_{E270m}(\tau)\,(-\sin(\theta_m)). \tag{11}
$$
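The simplification from Eq. (7) to Eq. (8) can be checked numerically. The sketch below is illustrative only: the Gaussian window `W`, the Gaussian masker envelope `G` (both even and centred on fixation), the grid size, and all parameter values are assumptions made for the example and are not specified in the text.

```python
import numpy as np


def masker_excitation_0(theta_m, c_m=0.1, s0=1.0, f_r=2.0, f_m=2.0,
                        sigma_w=0.5, sigma_g=0.5, half_width=4.0, n=401):
    """Numerically evaluate Eq. (7) for the 0-deg mechanism on an x-y grid.

    Returns the excitation together with the two integrals K_cc and K_cs.
    The Gaussian window and envelope are hypothetical stand-ins for W and G_m.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    X, Y = np.meshgrid(x, x)
    W = np.exp(-(X**2 + Y**2) / (2 * sigma_w**2))   # assumed even window
    G = np.exp(-(X**2 + Y**2) / (2 * sigma_g**2))   # assumed masker envelope
    carrier = np.cos(2 * np.pi * f_r * X)           # 0-deg mechanism carrier
    # Riemann-sum approximations of the two double integrals in Eq. (7).
    k_cc = np.sum(W * G * carrier * np.cos(2 * np.pi * f_m * X)) * dx * dx
    k_cs = np.sum(W * G * carrier * np.sin(2 * np.pi * f_m * X)) * dx * dx
    e = s0 * c_m * (np.cos(theta_m) * k_cc - np.sin(theta_m) * k_cs)
    return e, k_cc, k_cs


e0, k_cc, k_cs = masker_excitation_0(0.0)
e60, _, _ = masker_excitation_0(np.pi / 3)
print("K_cc =", k_cc, " K_cs =", k_cs)   # the sine-cosine integral is numerically zero
print("E'(60 deg) / E'(0 deg) =", e60 / e0)  # follows cos(theta_m), as in Eq. (8)
```

Under these assumptions the sine-cosine integral vanishes to within rounding error, so the excitation varies with masker phase as $\cos(\theta_m)$ and the remaining constant factor plays the role of the excitatory sensitivity in Eq. (8).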
Since in our experiments the target is presented only at 0 phase, the corresponding equations for excitation by the target alone do not depend on phase. They are:

$$
\begin{aligned}
E'_{0t} &= C_t S_{0t}\\
E'_{90t} &= 0\\
E'_{180t} &= -C_t S_{0t}\\
E'_{270t} &= 0
\end{aligned}
\tag{12}
$$

Since excitation is a linear process, the net excitations produced by target plus masker are the sums of their individual excitations:

$$
\begin{aligned}
E'_{0mt} &= C_t S_{Et} + C_m S_{Em}(\tau)\cos(\theta_m)\\
E'_{90mt} &= C_m S_{Em}(\tau)\sin(\theta_m)\\
E'_{180mt} &= -C_t S_{Et} - C_m S_{Em}(\tau)\cos(\theta_m)\\
E'_{270mt} &= -C_m S_{Em}(\tau)\sin(\theta_m)
\end{aligned}
\tag{13}
$$

Thus, there are two excitatory sensitivity parameters, $S_{Et}$ and $S_{Em}(\tau)$, shared by all four mechanisms (the mechanism index of Eqs. (8)–(12) is dropped); only $S_{Em}(\tau)$ depends on SOA. We assume that the net excitation is halfwave rectified, so that negative excitation is transformed to 0. This transformation can be expressed as:

$$
E = \max(E', 0) \tag{14}
$$
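As a small worked illustration of Eqs. (12)–(14), the following sketch computes the rectified net excitation of the four mechanisms. The function name and the sensitivity values (`s_et`, `s_em`) are hypothetical and chosen only for the example; they are not fitted parameters.

```python
import math


def rectified_excitations(c_t, c_m, theta_m_deg, s_et=100.0, s_em=80.0):
    """Half-wave rectified net excitation of the four phase mechanisms.

    c_t, c_m    : target and masker contrasts
    theta_m_deg : masker phase in degrees (the target is at phase 0)
    s_et, s_em  : illustrative excitatory sensitivities (assumed values)
    """
    th = math.radians(theta_m_deg)
    # Net excitation before rectification, Eq. (13).
    net = {
        0: c_t * s_et + c_m * s_em * math.cos(th),
        90: c_m * s_em * math.sin(th),
        180: -c_t * s_et - c_m * s_em * math.cos(th),
        270: -c_m * s_em * math.sin(th),
    }
    # Half-wave rectification, Eq. (14): negative excitation becomes 0.
    return {phase: max(e, 0.0) for phase, e in net.items()}


print(rectified_excitations(c_t=0.02, c_m=0.063, theta_m_deg=180))
```

With these assumed sensitivities, a 0.063-contrast masker at 180° drives the 0° mechanism to a net excitation of 2 − 5.04 = −3.04, which rectification sets to 0, while the 180° mechanism carries an excitation of 3.04. Adding the target therefore lowers the excitation of the 180° mechanism rather than raising that of the 0° mechanism.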
(Note that an unprimed $E$ is used to represent halfwave rectified excitation.) The response of a mechanism depends both on its net excitation and the total divisive inhibition that it receives. The divisive inhibition, $I$, is derived from all four mechanisms. The component inhibition produced by the masker in each mechanism is:

$$
\begin{aligned}
I'_{0m}(\tau) &= C_m S_{Im}(\tau)\cos(\theta_m)\\
I'_{90m}(\tau) &= a\,C_m S_{Im}(\tau)\sin(\theta_m)\\
I'_{180m}(\tau) &= -C_m S_{Im}(\tau)\cos(\theta_m)\\
I'_{270m}(\tau) &= -a\,C_m S_{Im}(\tau)\sin(\theta_m),
\end{aligned}
\tag{15}
$$

where $a$ is a factor by which sensitivity at 90 and 270° differs from that at 0 and 180°. The component inhibition produced by the target in each mechanism at the time that the response is sampled is:

$$
\begin{aligned}
I'_{0t} &= C_t S_{It}\cos(\theta_m)\\
I'_{90t} &= 0\\
I'_{180t} &= -C_t S_{It}\cos(\theta_m)\\
I'_{270t} &= 0
\end{aligned}
\tag{16}
$$

Divisive inhibition is first summed across stimulus components for each mechanism, then the divisive inhibition produced by each mechanism is half-wave rectified and raised to the power $q$. The inhibitory terms from each mechanism are then summed to give $I$:

$$
I_j = [\max(I'_j, 0)]^{q} \tag{17}
$$

$$
I = \sum_j I_j \tag{18}
$$

The response of a mechanism is:

$$
R_j = E_j/(I + Z), \tag{19}
$$
where $Z$ is a constant in masking experiments. Detection depends on the difference between the response to masker plus target and the response to masker alone in one or more mechanisms. More specifically, the behavioral threshold depends on the value of the detection variable, $D$, where:

$$
D = \Bigl[\sum_j b_j\,\lvert R_{jmt} - R_{jm}\rvert^{4}\Bigr]^{1/4}, \tag{20}
$$

where the sum is taken over the four phase mechanisms and $b_j$ is the weight given to mechanism $j$. At threshold $D = 1$. Which mechanisms contribute to detection varies with conditions, observers and the contrast of the masker. In the case of our experiments only the mechanisms at 0 and 180° can contribute to detection because the other two mechanisms are insensitive to our 0° target. We set the weight of the 0° mechanism to 1. The weight given to the 180° mechanism depends on the masker contrast and the observer. If $C_m \le C_d$, the weight of the 180° mechanism is 0. If $C_m > C_d$, the weight of the 180° mechanism is $b$, where $C_d$ and $b$ are parameters of the model. So in our case, Eq. (20) may be rewritten:

$$
D =
\begin{cases}
\bigl[\lvert R_{0mt} - R_{0m}\rvert^{4} + b\,\lvert R_{180mt} - R_{180m}\rvert^{4}\bigr]^{1/4} & \text{for } C_m > C_d,\\
\lvert R_{0mt} - R_{0m}\rvert & \text{for } C_m \le C_d.
\end{cases}
\tag{21}
$$

Except for the weighting coefficient, $b$, this corresponds to Quick’s (1974) rule for integrating over mechanism responses. The assumption that links mechanism responses to the behavioral threshold (Eq. (21)) reflects the fact that both $D_0$ and $D_{180}$ (the response differences in the 0° and 180° mechanisms) carry information about the presence of the target. The data show, however, that the $D_{180}$ signal is not always used. It is used when the masker contrast exceeds the detection rule criterion, $C_d$, which varies across conditions and across observers. Even when the contrast is above this criterion, the $D_{180}$ signal is not always used optimally, as the weight given to it, $b$, is sometimes less than 1. In fitting the TvP data, we found that the best fits were obtained by assuming that only the 0° mechanism response was used to make the decision. This implies that in this experiment $b = 0$. Since the value of $C_d$ makes no difference here, we arbitrarily fix it equal to 1 (see Table 1, experiment 4).
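The chain from Eqs. (13)–(21) to a predicted threshold can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the fitting procedure used for the data: all parameter values in `PARAMS` are hypothetical, the response follows Eq. (19) as printed, the target's inhibitory phase factor in Eq. (16) is taken as 1 because the target is at phase 0, and the threshold is found by a simple scan followed by bisection.

```python
import math

# Hypothetical parameter values chosen only so that the sketch runs;
# they are NOT the fitted values reported in the paper.
PARAMS = {"s_et": 100.0, "s_em": 100.0,  # excitatory sensitivities (target, masker)
          "s_it": 5.0, "s_im": 5.0,      # inhibitory sensitivities (target, masker)
          "a": 1.0, "q": 2.0, "z": 1.0,  # Eq. (15) factor, Eq. (17) exponent, Eq. (19) constant
          "b": 1.0, "c_d": 0.03}         # weight of the 180-deg mechanism, criterion contrast


def responses(c_t, c_m, theta_m_deg, prm):
    """Responses R_j of the four phase mechanisms, Eqs. (13)-(19)."""
    th = math.radians(theta_m_deg)
    cos_m, sin_m = math.cos(th), math.sin(th)
    e_t, e_m = c_t * prm["s_et"], c_m * prm["s_em"]
    i_t, i_m = c_t * prm["s_it"], c_m * prm["s_im"]
    # Net excitation before rectification, Eq. (13).
    exc = {0: e_t + e_m * cos_m, 90: e_m * sin_m,
           180: -e_t - e_m * cos_m, 270: -e_m * sin_m}
    # Inhibition summed over stimulus components, Eqs. (15)-(16).
    # ASSUMPTION: the target (phase 0) contributes +/- c_t * s_it with phase factor 1.
    inh = {0: i_t + i_m * cos_m, 90: prm["a"] * i_m * sin_m,
           180: -i_t - i_m * cos_m, 270: -prm["a"] * i_m * sin_m}
    total_inh = sum(max(v, 0.0) ** prm["q"] for v in inh.values())  # Eqs. (17)-(18)
    # Rectify excitation (Eq. 14) and divide by inhibition plus Z (Eq. 19).
    return {j: max(e, 0.0) / (total_inh + prm["z"]) for j, e in exc.items()}


def detection_variable(c_t, c_m, theta_m_deg, prm):
    """Detection variable D of Eq. (21)."""
    r_mt = responses(c_t, c_m, theta_m_deg, prm)
    r_m = responses(0.0, c_m, theta_m_deg, prm)
    d0 = abs(r_mt[0] - r_m[0])
    d180 = abs(r_mt[180] - r_m[180])
    if c_m > prm["c_d"]:
        return (d0 ** 4 + prm["b"] * d180 ** 4) ** 0.25
    return d0


def threshold(c_m, theta_m_deg, prm, lo=1e-5, hi=1.0, steps=400, iters=50):
    """Lowest scanned target contrast with D >= 1, refined by bisection."""
    prev = lo
    for k in range(1, steps + 1):
        c = lo * (hi / lo) ** (k / steps)  # log-spaced scan
        if detection_variable(c, c_m, theta_m_deg, prm) >= 1.0:
            below, above = prev, c
            for _ in range(iters):
                mid = 0.5 * (below + above)
                if detection_variable(mid, c_m, theta_m_deg, prm) >= 1.0:
                    above = mid
                else:
                    below = mid
            return above
        prev = c
    return None  # D never reached 1 in the scanned range


only_0 = {**PARAMS, "b": 0.0}  # decision based on the 0-deg mechanism alone
for phase in (0, 90, 180, 270):
    print(phase, threshold(0.063, phase, only_0))
```

The scan before bisection is a design choice: with divisive inhibition, $D$ need not increase monotonically with target contrast, so plain bisection over the whole range could converge on the wrong crossing. With these illustrative values and $b = 0$, the computed threshold for a 0.063-contrast masker is higher at 180° than at 0°; setting $b = 1$ lowers the 180° threshold because the response decrement in the 180° mechanism then contributes to $D$.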
References
Albrecht, D. G., & Geisler, W. S. (1991). Motion selectivity and the contrast-response function of simple cells in the visual cortex. Visual Neuroscience, 7, 531–546.
Albrecht, D. G., & Hamilton, D. B. (1982). Striate cortex of monkey and cat: contrast response function. Journal of Neurophysiology, 48, 217–237.
Bonds, A. B. (1989). Role of inhibition in the specification of orientation selectivity of cells in the cat striate cortex. Visual Neuroscience, 2, 41–55.
Bowen, R. W. (1995). Isolation and interaction of ON and OFF pathways in human vision: pattern polarity effects in contrast discrimination. Vision Research, 35, 2479–2490.
Bowen, R. W. (1997). Isolation and interaction of ON and OFF pathways in human vision: contrast discrimination at pattern offset. Vision Research, 37, 185–198.
Bowen, R. W., & Wilson, H. R. (1994). A two-process analysis of pattern masking. Vision Research, 34, 645–657.
Boynton, G. M. (1994). Temporal sensitivity of human luminance pattern mechanisms determined by masking. Doctoral dissertation, University of California, Santa Barbara.
Boynton, G. M., & Foley, J. M. (1999). Temporal sensitivity of human luminance pattern mechanism determined by masking with temporally modulated stimuli. Vision Research, 39, 1641–1656.
Campbell, F. W., & Kulikowski, J. J. (1966). Orientational selectivity of the human visual system. Journal of Physiology, 187, 437–445.
Chen, C. C., Foley, J. M., & Brainard, D. B. (1997). Detecting chromatic patterns on chromatic pattern pedestals. IS&T/OSA Proceedings. Optics and Imaging in the Information Age, 19–24.
Field, D. J., & Nachmias, J. (1984). Sensitivity to spatial phase. Vision Research, 20, 391–396.
Foley, J. M. (1994a). Human luminance pattern vision mechanisms: masking experiments require a new model. Journal of the Optical Society of America A, 11, 1710–1719.
Foley, J. M. (1994b). Spatial phase sensitivity of human pattern vision mechanisms determined by masking. Investigative Ophthalmology and Visual Science (suppl.), 35, 1900.
Foley, J. M., & Schwarz, W. (1998). Spatial attention: effect of position uncertainty and number of distractor patterns on the threshold versus contrast function for contrast discrimination. Journal of the Optical Society of America A, 15 (in press).
Georgeson, M. A. (1988). Spatial phase dependence and the role of motion detection in monocular and dichoptic forward masking. Vision Research, 28, 1193–1205.
Georgeson, M. A., & Georgeson, J. M. (1987). Facilitation and masking of briefly presented gratings: time-course and contrast dependence. Vision Research, 27, 369–379.
Gorea, A. (1987). Masking efficiency as a function of stimulus onset asynchrony for spatial frequency detection and identification. Spatial Vision, 2, 51–60.
Heeger, D. J. (1991). Nonlinear model of neural responses in cat visual cortex. In M. S. Landy, & J. A. Movshon III, Computational models of visual processing (pp. 120–133). Cambridge, MA: MIT Press.
Heeger, D. J. (1992). Normalization of cell responses in cat striate cortex. Visual Neuroscience, 9, 181–197.
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. Journal of Physiology, 160, 106–154.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215–243.
Kulikowski, J. J. (1976). Effective contrast constancy and the linearity of contrast sensation. Vision Research, 16, 1419–1431.
Lawton, T. B., & Tyler, C. W. (1994). On the role of X and simple-cells in human contrast processing. Vision Research, 34, 659–667.
Legge, G. E., & Foley, J. M. (1980). Contrast masking in human vision. Journal of the Optical Society of America, 70, 1458–1471.
Morrone, M. C., & Burr, D. C. (1988). Feature detection in human vision: a phase dependent energy model. Proceedings of the Royal Society of London, B, 235, 221–245.
Nachmias, J., & Sansbury, R. V. (1974). Grating contrast: discrimination may be better than detection. Vision Research, 14, 1039–1042.
Pollen, D. A., & Ronner, S. F. (1981). Phase relationships between adjacent simple cells in the visual cortex. Science, 212, 1409–1411.
Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. (1986). Numerical recipes: the art of scientific computing. Cambridge: Cambridge University Press.
Quick, R. F. (1974). A vector magnitude model of contrast detection. Kybernetik, 16, 65–67.
Ross, J., & Speed, H. D. (1991). Contrast adaptation and contrast masking in human vision. Proceedings of the Royal Society of London B, 246, 61–69.
Rousseeuw, P. J. (1991). Tutorial to robust statistics. Journal of Chemometrics, 5, 1–20.
Stromeyer, C. F. III, & Klein, S. (1974). Spatial frequency channels in human vision as asymmetric (edge) detectors. Vision Research, 14, 1409–1420.
Teo, P. C., & Heeger, D. J. (1994). Perceptual image distortion. Human Vision, Visual Processing and Digital Display V, IS&T/SPIE’s Symposium on Electronic Imaging: Science and Technology, SPIE Proceeding 2179, 127–141.
Watson, A. B. (1986). Temporal sensitivity. In K. R. Boff, L. Kaufman, & J. P. Thomas, Handbook of perception and human performance, vol. 1. New York: Wiley.
Watson, A. B., Nielson, K. R. K., Poirson, A., Fitzhugh, A., Bilson, A., Nguyen, K., & Ahumada, A. J. Jr. (1986). Use of a raster framebuffer in vision research. Behavior Research Methods and Instrumentation: Computers, 18, 587–594.
Watson, A. B., & Pelli, D. G. (1983). QUEST: a Bayesian adaptive psychometric method. Perception & Psychophysics, 33, 113–120.
Watson, A. B., & Solomon, J. A. (1997). Model of visual contrast gain control and pattern masking. Journal of the Optical Society of America A, 14, 2379–2391.
Wilson, H. R., McFarlane, D. K., & Phillips, G. C. (1983). Spatial frequency tuning of orientation selective units estimated by oblique masking. Vision Research, 23, 873–882.
Yang, J., & Makous, W. (1995). Modeling pedestal experiments with amplitude instead of contrast. Vision Research, 35, 1979–1989.
Yang, J., Qi, X., & Makous, W. (1995). Zero frequency masking and a model of contrast sensitivity. Vision Research, 35, 1965–1978.