Vision Research 46 (2006) 1091–1098 www.elsevier.com/locate/visres
Saliency from orthogonal velocity component in texture segregation
Clara Casco *, Alba Grieco, Enrico Giora, Massimiliano Martinelli
Dipartimento di Psicologia Generale, Università di Padova, Via Venezia 8, 35131 Padova, Italy
Received 13 June 2005; received in revised form 8 September 2005
Abstract

We found that a moving target line, more vertical than the 45 deg-oriented background lines, pops out (d′ = 1.2) although it moves at the same speed as the background elements and although it is invisible in static presentation (d′ = .7). We suggest that the moving more-vertical target is more salient because the motion system responds to the orthogonal velocity component (V⊥ = Δd/Δt sin Θ), which is larger for the more-vertical target than for the distracters. However, motion does not produce a high d′ when the target is more horizontal than the background (d′ = .6). This result is not expected if saliency resulted from the sum of the saliencies of orientation and motion, coded independently, but is instead predicted by visual search asymmetry. A line-length effect on the saliency of the moving target also suggests that V⊥ is extracted from the whole line and that this operation is facilitated by line length in the same way for more-vertical and more-horizontal targets. Altogether, these results demonstrate that speed-based segmentation operating on V⊥ not only affects discrimination of speed and direction of motion, as previously demonstrated, but also accounts for the high saliency of image features that would otherwise prove undetectable on the basis of orientation contrast.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Saliency; Motion; Orientation; Texture
1. Introduction

Since the pioneering work of Wertheimer (1912), considerable effort has been put into the analysis of "common fate" as a motion segmentation mechanism that allows the visual system to segment figures from the background on the basis of common direction and speed, and consequently to increase their saliency. How does motion, which involves a change in physical position over time, account for the increased saliency of the moving image? The common fate phenomenon demonstrates that saliency from motion does not depend simply on motion perception resulting from interpretation imposed on figures perceived as segregated from the background at different locations and times (third-order motion, Lu & Sperling, 2001). Independent of any computation or representation of the spatial locations of features segmented from the ground, saliency from motion can result from computation of motion energy, that
* Corresponding author. Tel.: +39 0498276611; fax: +39 0498276600. E-mail address: [email protected] (C. Casco).
0042-6989/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.visres.2005.09.032
is, on the basis of extraction of the velocity of spatio-temporal variation of light intensity (first-order motion) or contrast (second-order motion) (Adelson & Bergen, 1985; Van Santen & Sperling, 1985; Watson & Ahumada, 1985). The velocity of a moving stimulus in one spatial dimension (i.e., horizontal motion in the fronto-parallel plane) is given by Δd/Δt, the change in position (Δd) divided by the time taken (Δt). When a 2-D stimulus (i.e., an oriented line) moves, the spatio-temporal intensity changes alone are not sufficient to determine the local direction and magnitude of velocity, owing to the aperture problem (Marr & Ullman, 1981; Nakayama & Silverman, 1988), which arises because any motion unit, with a receptive field of limited spatial extent, has access only to the motion component normal to the orientation of a motion contour that extends beyond its limits. Because of the aperture problem, the perceived velocity of a two-dimensional stimulus reflects the component of velocity orthogonal to the line orientation (V⊥), i.e., Δd/Δt modulated by the orientation (Θ) of the 2-D stimulus (the line) in space:
$$V_\perp = \frac{\Delta d}{\Delta t}\,\sin\Theta \quad \text{(orthogonal velocity component)} \tag{1}$$
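As a rough illustration of Eq. (1), the following Python sketch computes the orthogonal velocity component for a line translating at a given speed. The function name `orthogonal_velocity` and the convention that the orientation is measured relative to the direction of motion are assumptions made for this example only.

```python
import math


def orthogonal_velocity(speed_deg_per_s: float, theta_deg: float) -> float:
    """Return V_perp = (delta_d / delta_t) * sin(theta), in deg/s.

    speed_deg_per_s: speed along the direction of displacement (delta_d / delta_t).
    theta_deg: line orientation relative to the direction of motion (assumed convention).
    """
    return speed_deg_per_s * math.sin(math.radians(theta_deg))


if __name__ == "__main__":
    # A line parallel to the motion (0 deg) has no orthogonal component;
    # a line orthogonal to the motion (90 deg) carries the full speed.
    for theta in (0, 36, 45, 54, 90):
        v_perp = orthogonal_velocity(1.0, theta)
        print(f"theta = {theta:2d} deg -> V_perp = {v_perp:.3f} deg/s")
```

The sketch only makes the monotonic dependence on orientation explicit: the closer the line is to orthogonal with respect to its direction of motion, the larger the component available to a local motion unit.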
For example, if the velocity of a line segment moving in the horizontal direction is 1 deg/s and its orientation is 45 deg, then V⊥ will be equal to about .71 deg/s, since V⊥ = V sin(45 deg). For a line orthogonal to the horizontal direction of motion, V⊥ will be equal to 1 deg/s. Several authors (Adelson & Movshon, 1982; Hildreth, 1984; Marr & Ullman, 1981) agree that the local process for motion detection extracts only this well-defined component of velocity (V⊥). If the motion system extracts V⊥, the perceived direction of moving lines should depend on their orientation relative to the direction of motion. This has been observed in psychophysical studies (Loffler & Orbach, 2001; Lorenceau, Shiffrar, Wells, & Castet, 1993). Moreover, since V⊥ decreases as the orientation becomes more similar to the direction of motion, perceived speed should be reduced when lines are tilted towards the direction of motion. Results from Castet, Lorenceau, Shiffrar, and Bonnet (1993b) and Scott-Brown and Heeley (2001) confirm this prediction. One important question is whether a particular combination of orientation and direction of motion affects not only perceived direction and speed but also the saliency of form. Since motion helps to segregate a target from the background, differences in V⊥ could extend to line saliency. Based on this hypothesis, we predict that, given two line targets with the same orientation contrast (ΔΘ) relative to the background lines, the more-vertical target should be more salient than the more-horizontal target, even though the direction of motion and the velocity of target and background elements are the same. This is because V⊥ is larger for the more-vertical than for the more-horizontal target. Our previous findings (Casco & Ganis, 1999; Casco, Caputo, & Grieco, 2001) that a more-vertical target, invisible when stationary, pops out from the background immediately on presentation in apparent motion led to this hypothesis.
However, in these studies other motion cues (either direction or speed) could account for the effect. In this work, we addressed more specifically two questions about the saliency mechanism. First, we asked how general the contribution of V⊥ to target saliency is. In particular, we considered whether saliency depended on V⊥ rather than on the other motion cue, in the direction of displacement, signalled at the line terminators, both for more-vertical and more-horizontal targets (Experiments 2 and 3) and for a target oriented as the background. We also asked whether V⊥ accounts for saliency not only when target and distracters move at the same speed (Experiment 2) but also when distracters are static (Experiment 3). The second question is whether V⊥ results from a local or a global motion process (Experiment 4). To address these issues, we compared the saliency (d′) of one element of a line texture (the target), defined by contrast of orientation (ΔΘ) with respect to the line segments in the background, by contrast of speed (Δd/Δt), by contrast resulting from V⊥ only, and by a combination of these.

2. General method

2.1. Stimuli

Stimuli were generated by a Cambridge Research Systems VSG graphics card with 12-bit luminance resolution and displayed on a γ-corrected Sony Trinitron monitor with a resolution of 1024 × 768 pixels, with square pixels (0.97′), refreshed at 100 Hz. They were viewed freely and binocularly and presented at the centre of the monitor, placed 114 cm from the observer's eyes.

2.2. Frame definition

Each frame contained a background textured region made up of line segments slanted 45 deg clockwise (Fig. 1). Line length was equal to 17.4′ in all experiments, except Experiment 4, where three line lengths were used (17.4, 21.2 and 25.1′). The texture elements were arranged on a 4 × 4 raster subtending 2.2 × 2.2 deg. The spatial position of the elements was randomly deviated from alignment by a slight horizontal and vertical change in position around the raster centre (between 0 and 1.96′). In each trial, the target was a line segment with a different orientation from the background lines, presented within the background line texture in a randomly chosen raster cell. The target differed in orientation from the background lines either clockwise (more-horizontal: "more-h") or counter-clockwise (more-vertical: "more-v"). Target-background orientation contrast (ΔΘ) could be equal to 0, 5, 7 or 9 deg. The luminance of target and background elements was 6 cd/m². Background luminance was 0.33 cd/m². Differences in the luminance of line elements across different orientations due to monitor anisotropy were carefully controlled in two ways. First, the luminance was matched for lines of different orientation through adjustment of the look-up table.
Second, each stimulus condition was viewed with the monitor oriented in two different ways: in half of the experimental sessions the monitor was upright; in the other half it was rotated by 90 deg.

2.3. Frame sequence

In each trial, stimuli consisted of a sequence of two overlapping successive frames. Frame duration and inter-frame interval (IFI) were fixed (33 ms).

2.4. Moving and static stimuli

The target and background elements were either static, with a displacement from frame to frame equal to 0 (Δd = 0), or in apparent motion, with Δd equal to 3.92′ in the horizontal direction in all experiments except Experiment 3, in which the vertical direction was also tested. The more-h target was more parallel to the direction of motion, whereas the more-v target was more orthogonal to the direction of motion, with respect to the background elements. The direction of motion of the background elements was either the same as that of the target or opposite (180 deg), randomly. Thus, on average, half the background elements moved in the same direction as the target. Target velocity, defined as the ratio of target displacement (Δd) to the stimulus onset asynchrony (Δt), was equal to 0.98 deg/s. V⊥ for moving distracters was equal to 0.68 deg/s. V⊥ for targets more-h and more-v than the distracters (for ΔΘ equal to ±9 deg) was equal to 0.56 and 0.80 deg/s, respectively, and proportionally scaled in the other two ΔΘ conditions.

2.5. Task

The observers (n = 11; eight naïve observers and three of the authors), with normal or corrected-to-normal vision, performed a two-alternative forced-choice (2AFC) task in which they had to indicate the presence or absence of a target segregated from the background (Fig. 2).

2.6. Design

Target direction of tilt (more-h vs. more-v), the level of target-background orientation contrast (Experiments 2, 3 and 4) and target presence (present vs. absent) were within-block factors.

2.7. Statistical analysis

Repeated measures ANOVAs were conducted to compare mean d′ values, each obtained by averaging the d′ values obtained in repetitions of the same condition. The sphericity of the data was tested with Mauchly's Test of Sphericity. The sphericity assumption was always supported by this test, and consequently the Greenhouse–Geisser correction of the degrees of freedom was not applied. As the data from the upright monitor and the monitor rotated by 90 deg were not statistically different, they were not treated separately.

Fig. 1. A representation of one frame of the two-frame motion stimulus. The frame contained a textured region made up of 16 line segments slanted 45 deg clockwise, arranged on a 4 × 4 raster subtending 2.2 × 2.2 deg of visual angle. Each line position was deviated from collinearity by randomly modifying both horizontal and vertical positions around the raster centre. The target was the single line element having a different orientation. It was positioned randomly in the raster, except for the outermost rows and columns. Target-to-background orientation difference (target tilt) was equal to 0 in Experiment 1 and varied in Experiments 2 and 3 (0, 5, 7 or 9 deg), either counter-clockwise (more-v) or clockwise (more-h).

3. Experiment 1
In Experiment 1, we measured d′ in detecting the presence of a target segregating on the basis of motion contrast alone. ΔΘ was equal to 0. Δd of the target was equal to 3.92′ (speed equal to 0.98 deg/s). Δd for distracters varied in independent blocks according to three levels: 0.98, 1.96 and 2.94′, corresponding to speeds of 0.25, 0.5 and 0.75 deg/s. Target displacement was rightwards, whereas the displacement for distracters was either rightwards or leftwards, randomly, with equal probability. A block consisted of 120 trials and the target was present in half of the trials, randomly. Mean d′ of each subject (CC, AG, FZ), each performing six blocks (Fig. 3), increased with target-background velocity contrast (from 0.25 to 0.75 deg/s) [F(2,4) = 39.9; p < .003]. The results of the present experiment show that the smallest velocity contrast was below threshold for all subjects (mean d′ equal to 0.28). When background elements move, d′ increases exponentially with velocity contrast. As expected, sensitivity for discriminating velocity in our multi-element displays is much lower than sensitivity for discriminating velocity (5%) with only two stimuli (McKee, 1981).

Fig. 2. The sequence of events in two successive trials. In each trial, stimuli consisted of a sequence of two overlapping frames. Stimulus duration was 33 ms. Inter-frame interval (IFI) was equal to 33 ms. In the two trials represented in the figure, the target is static, more-vertical in the first trial and more-horizontal in the second trial. Trials with a moving target are like those with a static target, with the only difference that the target was displaced by 3.92′ from frame to frame, either horizontally (rightwards) or vertically.

Fig. 3. d′ as a function of contrast velocity (0.25, 0.5, 0.75 deg/s) for individual subjects. The heavy black line fitted to the mean d′ results indicates that sensitivity increases exponentially with velocity contrast (equation: Y = .1107 exp(3.8022 x), R² = .9998).

4. Experiment 2

Here, we tested how d′ depended on ΔΘ (equal to 5, 7 and 9 deg) in static and moving target conditions. In both conditions, Δd for background elements was equal to 3.92′. Δd for the target was 0 when static (only Θ contrast was present) and 3.92′ when moving (both Θ and motion contrast were present). A block consisted of ten repetitions of each target-background angle difference for each direction of tilt, for both target present and target absent conditions.
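For readers who wish to reproduce the sensitivity measure from target-present and target-absent trials, a minimal Python sketch of the conventional signal-detection computation d′ = z(H) − z(F) is given here. The text does not state how d′ was computed or how hit or false-alarm rates of 0 or 1 were handled, so the formula, the log-linear correction and the function name `d_prime` are assumptions made for illustration.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total) is
    assumed here so that rates of exactly 0 or 1 do not give infinite z-scores.
    """
    n_present = hits + misses
    n_absent = false_alarms + correct_rejections
    hit_rate = (hits + 0.5) / (n_present + 1)
    fa_rate = (false_alarms + 0.5) / (n_absent + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    # Hypothetical counts for one block with 60 target-present and 60 target-absent trials
    print(round(d_prime(hits=45, misses=15, false_alarms=20, correct_rejections=40), 2))
```

Under this convention, equal hit and false-alarm rates give d′ = 0, which corresponds to a target that cannot be distinguished from its absence.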
The results are shown in Figs. 4A and B. A repeated measures ANOVA with kinetic target condition (static or in apparent motion) and tilt-direction as factors, executed on mean individual d′ values resulting from 12 repetitions of each orientation level (six with the monitor upright and six with it rotated), revealed a significant effect of tilt-direction [F(1,2) = 91.46, p < .02] and a significant kinetic target condition × tilt-direction interaction [F(1,2) = 240.13, p < .005]. As evident from Fig. 4, the static target was barely detectable at both orientations. Mean d′ in the 5, 7 and 9 deg orientation contrast conditions was equal to .1, .7, .7 with the more-horizontal target and to .03, .8, .7 with the more-vertical target. Mean sensitivity to the moving target was significantly larger than to the static target when more-v (mean d′ equal to .3, .8, 1.2) but not when more-h (mean d′ equal to .2, .4, .6) [p < .01 in post hoc Tukey comparison test]. This indicates that saliency for the moving target is not symmetric with respect to ΔΘ, suggesting that V⊥ rather than Δd/Δt accounts for target saliency. The results of this experiment show that the visual system is sensitive to a very small V⊥ difference between target and background elements (.12 deg/s). This difference is smaller than the smallest velocity difference that allows the target to be discriminated from background elements only on the basis of motion contrast (results of Experiment 1). The sensitivity to V⊥ contrast we found is less than 10%, a value comparable to that for discriminating velocity (5%) with only two stimuli (McKee, 1981). One obvious reason for this is that when, as in Experiment 2, an orientation difference is present, V⊥ differs not only in amplitude but also in direction, and this may render sensitivity to V⊥ contrast very high.
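The size of the V⊥ contrast for each tilt condition can be sketched directly from Eq. (1), using the speed (0.98 deg/s), background orientation (45 deg) and ΔΘ levels given in the text. The function names and the sign convention (positive ΔΘ meaning more vertical) are assumptions of this example, and values computed this way may differ slightly from the rounded values reported in the text.

```python
import math

SPEED = 0.98            # speed of target and distracters in deg/s (from the text)
BACKGROUND_DEG = 45.0   # orientation of the background lines (from the text)


def v_perp(speed: float, theta_deg: float) -> float:
    """Orthogonal velocity component, V_perp = speed * sin(theta)."""
    return speed * math.sin(math.radians(theta_deg))


def v_perp_contrast(delta_theta_deg: float,
                    speed: float = SPEED,
                    background_deg: float = BACKGROUND_DEG) -> tuple[float, float]:
    """Return (target minus background V_perp in deg/s, same difference as a
    fraction of the background V_perp). Positive delta_theta_deg is taken to
    mean a more-vertical target (assumed convention)."""
    background = v_perp(speed, background_deg)
    target = v_perp(speed, background_deg + delta_theta_deg)
    diff = target - background
    return diff, diff / background


if __name__ == "__main__":
    for delta in (-9, -7, -5, 5, 7, 9):
        diff, fraction = v_perp_contrast(delta)
        print(f"delta_theta = {delta:+d} deg: diff = {diff:+.3f} deg/s ({fraction:+.1%})")
```

Because sin Θ is not linear around 45 deg, equal and opposite tilts give V⊥ differences of slightly unequal magnitude, so the sketch is best read as an order-of-magnitude check of the contrast available to the motion system.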
Fig. 4. Individual d′ as a function of orientation difference for target more-h (dotted lines) and more-v (continuous lines), separately for moving (filled circles) and static (unfilled circles) targets. Δd was equal to 3.92′ for background elements and either 3.92′ (A) or 0 (B) for the target.

5. Experiment 3

In Experiment 3, we checked the generality of the tilt effect. First, we tested an alternative interpretation: that, rather than being sensitive to differences in the velocity component perpendicular to the moving line, observers were more sensitive to horizontal motion. Indeed, the more vertical the target line is, the more similar V⊥ is to the velocity in the displacement direction. Second, we asked whether V⊥ accounts for target saliency also when the target pops out because it is the only element that moves (Δd for distracters is 0). Since at the line terminators Δd/Δt itself provides a strong motion cue for segregating the target, the further cue provided by V⊥ might not affect performance. The simple prediction is that, when distracters are static, the tilt effect should disappear if V⊥ is irrelevant, whereas we should still find a tilt effect like that in Experiment 2 if V⊥ is still used. We addressed this last issue by comparing d′ when the target was defined by ΔΘ and motion contrast with d′ when it was defined by motion contrast alone. Within a block, three target conditions were randomly interleaved: target oriented more-horizontally, more-vertically (ΔΘ = ±9 or ±7 deg) and as the distracters (ΔΘ = 0). Δd for background elements was equal to 0 and that for the target was equal to 3.92′, either horizontally or vertically, in independent blocks. Target velocity was equal to .98 deg/s and V⊥ (for ΔΘ = ±9 deg) was equal to .80 deg/s for the more-v target, .68 deg/s for the target tilted as the background and .58 deg/s for the more-h target. Four subjects participated (two naïve observers and two of the authors). The observer's task was to indicate the presence or absence of a moving target amongst non-collinear background elements.

Fig. 5. (A) Individual d′ obtained in the more-h and the more-v tilt conditions for both horizontal (H) and vertical (V) directions of motion. (B) Individual d′ separately for the more-h target, the more-v target and the target oriented as the background, averaged over 7 and 9 deg and over horizontal and vertical directions of motion.

Two repeated measures ANOVAs, one with direction of motion (vertical: "V" vs. horizontal: "H") and tilt-direction (more-h and more-v) as factors and one comparing the more-v, 0 and more-h conditions, were executed on the mean of the four d′ values obtained with two monitor orientations (upright and rotated) and two target orientations (7 and 9 deg). The first ANOVA showed a non-significant effect of direction of motion [F(1,3) = 6.7, p > .05]. Moreover, the increased saliency of the more-v target [F(1,3) = 66.8, p < .005] did not depend on direction of motion [F(1,3) = 1.62, p > .05]. These results, illustrated in Fig. 5A, indicate that although distracters were static, the tilt-direction effect is still present for both directions of motion, suggesting that target saliency was still modulated by V⊥ regardless of whether the target moved horizontally or vertically. Moreover, a tilt effect was also revealed by the second ANOVA [F(2,6) = 17.2, p < .005]. As Fig. 5B shows,
d′ obtained with the target tilted as the distracters (1.74) was larger than that obtained with the more-h target (1.4) and smaller than that obtained with the more-v target (2.27), indicating that V⊥ is a strong cue for motion also when distracters are static, regardless of direction of motion.

6. Experiment 4

Castet et al. (1993b) found a length effect, asymmetric with line orientation, in a speed judgement with lines longer than ours. We assessed whether an asymmetric length effect was present when the subjects had to judge the presence of the line rather than its speed. This was done by varying line length in independent conditions according to three levels: 17.4, 21.2 and 25.1′. Distracters were static and target Δd was equal to 3.92′, rightwards. Seven observers (five naïve observers and two of the authors) participated, each performing the same number of sessions with the monitor upright and rotated. The results are shown in Fig. 6. A repeated measures ANOVA, executed on the average d′, revealed significant effects of kinetic target condition [F(1,6) = 107.1, p < .0005], tilt-direction [F(1,6) = 8.7, p < .03] and length [F(2,12) = 21.2, p < .0005], and significant kinetic condition × tilt-direction [F(1,6) = 23.2, p = .003] and kinetic condition × length [F(2,12) = 3.9, p < .05] interactions. These results replicate those of the previous experiments: d′ was larger when the target was moving and the tilt-direction effect was only present when the target was moving, confirming the suggestion that V⊥ in this condition is a cue for target detection. The new result is that the increase in d′ with line length was significant only with moving targets, indicating that only the moving line was more salient when longer. Note, however, that the length effect is the same for more-h and more-v targets (Burr, 1981). Instead, Castet et al. (1993b) found an asymmetry of the length effect due to line orientation in a speed judgement with lines longer than ours. They interpreted these results as indicative of some global integration of local motion signals along the line. Motion integration has been suggested by Hildreth (1984) and Castet et al. (1993a) as a form of spatial pooling over the population of active neurons. In this model, the pooled signal should be a simple function of the number of units responding, and should therefore depend on line length. We did not find this asymmetry, maybe because it does not manifest itself in this range of lengths. However, we found a length effect, on the basis of which we cannot exclude a global motion process. Instead of resulting from pooling after V⊥ has been computed on small segments of the line, it could result from facilitation of the extraction of V⊥. Longer lines could make the extraction of V⊥ itself easier, in a similar way for more-h and more-v targets (Burr, 1981).

Fig. 6. Mean d′ as a function of line length, separately for moving (filled circles) and static (unfilled circles) targets, and for target direction of tilt more-horizontal (broken lines) and more-vertical (continuous lines) than the background lines.

7. General discussion

To summarize, we found that a moving target line, more vertical than the 45 deg-oriented background lines, is more salient than a static target (larger d′) even if it moves at the same speed as the background elements, indicating that the motion system detects an increase in the orthogonal velocity component (V⊥ = Δd/Δt sin Θ). To confirm that V⊥ is the cue, we have to rule out the obvious alternative explanation that the larger d′ for the target results from the sum of saliency for two dimensions, orientation and motion, which would be possible if the two dimensions were coded independently (Nothdurft, 2000). The results of both Experiments 2 and 3 demonstrate that this is not the case. Indeed, assuming that d′ for the target oriented the same as the distracters (condition with ΔΘ = 0 of Experiment 3) indicates baseline saliency (since V⊥ contrast is the same for target and distracters), if orientation and motion were coded independently, we would expect saliency to increase linearly when a second dimension (a difference in orientation) was added.
We found an increase of d′ for the more-v target, whereas the more-h target is significantly less salient than the target oriented as the background, a result that rules out the independent coding hypothesis. However, the opposite suggestion of dependent coding, as derived from the physiology, cannot be accepted either. Indeed, if all cells responding to both the orientation and the direction of motion of a target were maximally activated by one dimension (i.e., the large motion contrast of the target in Experiment 3), one should expect no or only a little increase in the responses when the target also displays orientation contrast. In agreement with Casco et al. (2001), our suggestion is that the larger saliency of moving with respect to static targets results from V⊥ contrast. This implies a different model of dependent coding, based on an integration process that takes form and motion together and responds to V⊥. Our finding of a line-length effect specific to the moving target conditions, similar to that previously found by other authors (Castet et al., 1993b), suggests indeed that the mechanism analysing form and motion together has global properties. Our results are compatible with a global processing model similar to that suggested by other authors (Casco et al., 2001; Mather, 2000), in which V⊥ is extracted directly from the whole line, in some integration process taking form and motion together. The length effect suggests that, regardless of line orientation, this operation is facilitated by line length, maybe because the result is less noisy when lines are long enough. The existence of a global motion mechanism tuned to large scales of the order of the line length, able to process the motion of the line as a whole, is supported by the motion capture effect described by Ramachandran and Cavanagh (1987). It is interesting to speculate whether there is a physiological correlate of the result that extraction of V⊥ is a powerful cue that accounts for the increased saliency of moving with respect to static images. Single-neuron electrophysiology experiments in the visual cortex of macaque monkeys and cats have identified in V1 "component-motion cells" that respond to the direction of motion orthogonal to local contours (Albright, 1984; Movshon, Adelson, Gizzi, & Newsome, 1986). In higher motion areas, "pattern cells" respond to the direction of motion independently of the orientation of the contours making up the stimulus pattern (Li, Chen, Li, Wang, & Diao, 2001). This suggests that the mechanism of saliency from motion operates at a very early level of processing in the central visual system. We are left with the last problem of explaining why, although d′ increases with line length also when the target is more-h, its detection depends either on orientation contrast (when there are no motion cues, as in Experiment 2) or on Δd/Δt, when defined also by motion contrast (as in Experiment 3), but not on a combination of the two, as detection of the more-vertical target does. This suggests that, although V⊥
is available even when it is smaller than or equal to V⊥ for distracters (see results of Experiment 3), different mechanisms account for target saliency depending on whether V⊥ of the target is smaller or larger than V⊥ of distracters. If the target has the smallest V⊥ (target more-h than distracters), saliency is accounted for either by a mechanism that tracks the target with different orientation (when, as in Experiment 2, target and distracters move at the same speed, as predicted by third-order motion, Lu & Sperling, 2001), or by motion contrast (when, as in Experiment 3, distracters are static). It is interesting to speculate why a V⊥ for the target smaller than V⊥ for distracters does not allow the target to pop out as a V⊥ larger than that of distracters does. Note that the pop-out effect is different from the similarly unexpected pop-out effect found in search for conjunctions of motion and form (McLeod, Driver, & Crisp, 1988), since our target differs from distracters on the basis of simple feature differences (i.e., differences in V⊥), not of conjunctions of features. Therefore, the presence or absence of the pop-out effect, depending on the direction of the target-distracter
difference in V⊥, is rather interpretable on the basis of Duncan and Humphreys' general theory of search (1989), according to which target-distracter similarity is the crucial factor. However, this theory is not sufficient, because our main result is an asymmetry of the effect of V⊥, depending on its direction. Asymmetries in visual search have been found in several feature dimensions (Treisman & Gormican, 1988; Wolfe, 2001). Ivry and Cohen (1992) showed that this asymmetry holds for motion speed. They found that, when searching for a fast target among slow distracters, search performance was minimally affected by the number of distracters. In contrast, reaction time to detect a slow target among fast distracters was slow and linearly related to the number of distracters. Interestingly, their interpretation is that the asymmetry cannot be attributed to differences in temporal frequencies or discriminability. The authors considered the hypothesis that the asymmetry reflected differences in representing the output of spatio-temporal filtering (Adelson & Bergen, 1985). They argued that a set of high-pass speed detectors with different low-speed cut-offs would explain the results, since there will be a class of cells that respond to a fast target and not to the slow distracters, but not a class of cells that respond to the slow target and not to the fast distracters. Indeed, the assumption that speed detectors operate as high-pass filters in the velocity domain is supported by single-cell recordings in MT (Allman, Miezin, & McGuinness, 1985; Maunsell & Van Essen, 1983) showing that motion detectors are high-pass filters for speed. Our results agree with this interpretation, which instead cannot apply to other asymmetries in simple search in which motion is involved (Dick, Ullman, & Sagi, 1987; Driver, McLeod, & Dienes, 1992) but is specific to the way the motion system represents the speed output of spatio-temporal filtering.
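The high-pass detector argument can be made concrete with a toy Python sketch. The cut-off values below are hypothetical and chosen only for illustration, the detectors are idealized as all-or-none units, and the function names are not taken from any of the cited studies; the speeds are the V⊥ values reported in the text for distracters (0.68 deg/s), the more-v target (0.80 deg/s) and the more-h target (0.56 deg/s).

```python
from typing import Sequence, Set

# Hypothetical low-speed cut-offs (deg/s) of four idealized high-pass detectors
CUTOFFS = [0.5, 0.6, 0.7, 0.8]


def responding_detectors(speed: float, cutoffs: Sequence[float]) -> Set[int]:
    """Indices of detectors that respond, i.e. whose cut-off is at or below the speed."""
    return {i for i, cutoff in enumerate(cutoffs) if speed >= cutoff}


def unique_to_target(target_speed: float, distracter_speed: float,
                     cutoffs: Sequence[float]) -> Set[int]:
    """Detectors driven by the target but not by the distracters."""
    return (responding_detectors(target_speed, cutoffs)
            - responding_detectors(distracter_speed, cutoffs))


if __name__ == "__main__":
    # Fast target among slow distracters: some detectors signal the target alone
    print(unique_to_target(0.80, 0.68, CUTOFFS))  # prints {2, 3}
    # Slow target among fast distracters: no detector signals the target alone
    print(unique_to_target(0.56, 0.68, CUTOFFS))  # prints set()
```

In this idealization every detector that responds to the slower stimulus also responds to the faster one, so only the faster stimulus can be signalled by a unique population, which is the asymmetry described above.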
What are the general implications of the finding that saliency can result primarily from V⊥? The common fate principle of Gestalt psychology (i.e., common direction and speed of an array of image intensities) can explain why a stimulus, invisible when static, pops out as soon as it starts moving (Wertheimer, 1912), simply on the basis of computation of local motion signals in one single dimension. However, a more complex motion mechanism is required to explain why saliency for the shape increases when an object starts to move, although the target and background speed and direction are the same. In this case, neither velocity difference nor common fate can account for the increased saliency. We isolated a mechanism different from common fate, which accounts for the increased saliency of moving objects in addition to perceived speed and motion direction.

Acknowledgment

This work was supported by Grants (MIUR 60%, PRIN 2001, 2003 to Clara Casco).
References
Adelson, E. H., & Bergen, J. R. (1985). Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America (A), 2, 284–299.
Adelson, E. H., & Movshon, J. A. (1982). Phenomenal coherence of moving visual patterns. Nature, 300, 523–525.
Albright, T. D. (1984). Direction and orientation selectivity of neurons in visual area MT of the macaque. Journal of Neurophysiology, 52, 1106–1130.
Allman, J., Miezin, F., & McGuinness, E. (1985). Direction- and velocity-specific responses from beyond the classical receptive field in the middle temporal visual area (MT). Perception, 14, 105–126.
Burr, D. C. (1981). Temporal summation of moving images by the human visual system. Proceedings of the Royal Society of London (B), 211, 321–339.
Casco, C., & Ganis, G. (1999). Parallel search for conjunctions with stimuli in apparent motion. Perception, 28, 89–108.
Casco, C., Caputo, G., & Grieco, A. (2001). Discrimination of an orientation difference in dynamic textures. Vision Research, 41, 275–284.
Castet, E., Lorenceau, J., & Bonnet, C. (1993a). The inverse intensity effect is not lost with stimuli in apparent motion. Vision Research, 33, 1697–1708.
Castet, E., Lorenceau, J., Shiffrar, M., & Bonnet, C. (1993b). Perceived speed of moving lines depends on orientation, length, speed and luminance. Vision Research, 33, 1921–1936.
Dick, M., Ullman, S., & Sagi, D. (1987). Parallel and serial processes in motion detection. Science, 237, 400–402.
Driver, J., McLeod, P., & Dienes, Z. (1992). Motion coherence and conjunction search: Implications for guided search theory. Perception and Psychophysics, 51, 79–85.
Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96, 433–458.
Hildreth, E. C. (1984). The computation of the velocity field. Proceedings of the Royal Society of London (B), 221, 189–220.
Ivry, R. B., & Cohen, A. (1992). Asymmetry in visual search for targets defined by differences in movement speed. Journal of Experimental Psychology: Human Perception and Performance, 18, 1045–1057.
Li, B., Chen, Y., Li, B. W., Wang, L. H., & Diao, Y. C. (2001). Pattern and component motion selectivity in cortical area PMLS of the cat. European Journal of Neuroscience, 14, 690–700.
Loffler, G., & Orbach, H. S. (2001). Anisotropy in judging the absolute direction of motion. Vision Research, 41, 3677–3692.
Lorenceau, J., Shiffrar, M., Wells, N., & Castet, E. (1993). Different motion sensitive units are involved in recovering the direction of moving lines. Vision Research, 33, 1207–1217.
Lu, Z. L., & Sperling, G. (2001). Three-systems theory of human visual motion perception: Review and update. Journal of the Optical Society of America (A), 18, 2331–2370.
McKee, S. P. (1981). A local mechanism for differential velocity detection. Vision Research, 21, 491–500.
Marr, D., & Ullman, S. (1981). Directional selectivity and its use in early visual processing. Proceedings of the Royal Society of London (B), 211, 151–180.
Mather, G. (2000). Integration biases in the Ouchi and other visual illusions. Perception, 29, 721–727.
Maunsell, J. H. R., & Van Essen, D. C. (1983). Functional properties of neurons in middle temporal visual area of the macaque monkey: I. Selectivity for stimulus direction, speed, and orientation. Journal of Neurophysiology, 49, 1127–1147.
McLeod, P., Driver, J., & Crisp, J. (1988). Visual search for a conjunction of movement and form is parallel. Nature, 332, 154–155.
Movshon, J. A., Adelson, E. H., Gizzi, M. S., & Newsome, W. T. (1986). The analysis of moving visual patterns. Experimental Brain Research, 11, 117–152.
Nakayama, K., & Silverman, G. H. (1988). The aperture problem-II. Spatial integration of velocity information along contours. Vision Research, 28, 747–753.
Nothdurft, H. C. (2000). Salience from feature contrast: Additivity across dimensions. Vision Research, 40, 1183–1201.
Ramachandran, V. S., & Cavanagh, P. (1987). Motion capture anisotropy. Vision Research, 27, 97–106.
Scott-Brown, K. C., & Heeley, D. W. (2001). The effect of the spatial arrangement of target lines on perceived speed. Vision Research, 41, 1669–1682.
Treisman, A., & Gormican, S. (1988). Feature analysis in early vision: Evidence from search asymmetries. Psychological Review, 95, 15–48.
Van Santen, J. P. H., & Sperling, G. (1985). A temporal covariance model of motion perception. Journal of the Optical Society of America (A), 1, 451–473.
Watson, A. B., & Ahumada, A. J., Jr. (1985). Model of human visual-motion sensing. Journal of the Optical Society of America (A), 2, 322–341.
Wertheimer, M. (1912). Experimentelle Studien über das Sehen von Bewegung. Zeitschrift für Psychologie, 61, 161–265.
Wolfe, J. M. (2001). Asymmetries in visual search: An introduction. Perception and Psychophysics, 63, 381–389.