Vision Res. Vol. 30, No. 10, pp. 1421-1428, 1990
Printed in Great Britain. All rights reserved
0042-6989/90 $3.00 + 0.00
Copyright © 1990 Pergamon Press plc
THE USE OF DIFFERENT FEATURES BY THE MATCHING PROCESS IN SHORT-RANGE MOTION

WILLIAM A. SIMPSON

Department of Psychology, University of Winnipeg, 515 Portage Avenue, Winnipeg, Manitoba, Canada R3B 2E9

(Received 26 June 1989; in revised form 23 March 1990)

Abstract: The perception of the direction of motion in luminance kinematograms breaks down when the displacement exceeds a few element widths (this limit is called Dmax). When kinematograms whose elements differ only in hue are used, motion can be seen but performance declines at smaller displacements. The short-range process, once thought to be only a luminance correlator, is thus able to use hue. If the size of Dmax indicates how well a feature stimulates the motion sensors, hue might be said to have an especially weak input to motion sensors. In order to find the relative potency of luminance and hue as bases of the short-range matching process, Dmax for these features was compared to those obtained for kinematograms whose elements differed in phase (T-phase or L-phase) or orientation (/ or \). The matching process could use all the features tested. Luminance seems to be the preferred basis for motion matching, with hue and phase (a tie) yielding smaller Dmax and orientation the smallest.

Motion    Short-range process    Correspondence problem
INTRODUCTION
Most models of real or “short-range” visual motion sensors take as their input the motion of elements having some luminance contrast with neighbouring areas (for a review see Nakayama, 1985). A favourite stimulus for studying such sensors is the random-dot kinematogram, a two-frame motion display typically composed of 50% white and 50% black elements with the second frame being a displaced version of the first. The perception of the direction of motion breaks down as the displacement increases, hence the name short-range (Braddick, 1974). The theory that there are separate long-range and short-range motion channels is currently under review (see Petersik, 1989); I use the term “short-range motion sensor” only as a shorthand for “those sensors responding to random dot kinematograms”. Originally it was proposed (e.g. Anstis, 1970, p. 1425; Anstis, 1980) that short-range motion sensors did a luminance-based cross-correlation. That is, the sensors use only luminance to match elements in successive frames. However, Cavanagh, Boeglin and Favreau (1985) found that motion can be seen in kinematograms whose elements differed only in hue. There are two ways in which this result can be interpreted: (1) there is one class of short-range motion
sensor which uses both luminance and hue; or (2) there are motion sensors that use luminance and sensors that use hue. The first interpretation is more parsimonious and has received additional support from the finding that chromatic motion can be cancelled by luminance-based motion in the opposite direction (Cavanagh & Favreau, 1985; Cavanagh, Tyler & Favreau, 1984). Gorea and Papathomas (1989) prefer the second interpretation. Cavanagh et al. (1985) found that the distance at which the discrimination of motion direction broke down (Dmax) for hue kinematograms was only about half as large as that for luminance kinematograms. If we make the assumption that a smaller Dmax indicates a poorer stimulation of the motion sensors by the feature in question, then we might say that hue provides an especially weak input to motion sensors. Before jumping to the conclusion that hue is an especially poor feature for motion sensors, however, we would have to see how hue fares in relation to other features. Almost all motion theories are framed in terms of matching features in successive frames on the basis of luminance, but the matching process could use any feature. It might be that hue is not an especially bad feature for motion but that luminance is especially good. If luminance is especially good, then we expect smaller Dmax for all features other
than luminance. If hue is especially bad, we expect larger Dmax for all features other than hue. So far the only features that have been examined in kinematograms are luminance and hue. In this paper I measure Dmax for kinematograms whose elements differ in luminance, hue, orientation, or phase. By phase I mean the relative placement of the line segments with respect to one another in the cell (this use of the term follows that of Julesz & Schumer, 1981, p. 594; and Julesz, 1981, p. 94). Only by looking at performance with other features can we tell the relative effectiveness of luminance and hue as matching features. One row of a kinematogram for each type of feature is shown in Fig. 1. Each cell of the kinematogram contains an element having one of two possible values for the given feature: black or white, red or green, slanted left or right, in T-phase or L-phase. The T vs L pairing is a standard one in the literatures for visual search and texture segmentation (Beck & Ambler, 1973; Duncan & Humphreys, 1989; Julesz & Bergen, 1983). The terminology I will use to describe the stimuli (luminance, hue, orientation, and phase kinematograms) names
the only feature available in each as a basis for matching elements across frames to obtain motion. Other attributes may be present, but are irrelevant for the matching task. Note that the cells are equated for every feature other than the one of interest. For example in the phase kinematogram all cells have the same mean luminance, contain line segments of the same length and orientation, and have the same hue. At a resolution smaller than that of a cell there are of course differences in luminance between the T and L: the distributions of luminance within the cells differ. If in fact the motion sensors cannot correlate the frames using phase and instead must use luminance, then no global motion will be seen. An incoherent set of local motions will result. The top of Fig. 2 shows a small portion of a phase kinematogram with the local and global motion solutions. The local luminance-based solution matches the horizontal and vertical segments within each cell to their counterparts in the same cell in the following frame. The resulting local motions are diagrammed. If phase cannot be used to match elements this local solution will result because the distances between line segments within the cell are shorter than the distance between cells (the minimal distance for matching Ts or Ls). The global phase-based solution matches Ts to Ts and Ls to Ls. It is a global solution because it gives the same motion to the whole field.

The same comments apply to the orientation kinematograms. If a local luminance-based match is performed, the result will be an incoherent mixture of clockwise and counterclockwise swivelling of each line segment. If orientation can be used to match elements, a global motion will be seen. Some might be wondering if the aperture problem would not lead to the perception of a diagonal motion of the oriented line segments. According to the aperture problem, only the motion component normal to the contour can be detected. Interestingly, in the following experiments the endpoints of the line segments effectively constrained the global motion to the horizontal. That is, the “correct” horizontal motion signals arising from the endpoints were enough to over-rule the diagonal motion signals arising from the other points on the segments. Diagonal coherent motion was never seen.

Fig. 1. Each frame of one row of a kinematogram whose elements differ in luminance, hue, orientation, or phase. The real display had 25 rows of 40 elements. Here the second frame is displaced one element to the right. In all cases the elements were bright against a black background; in the hue condition the elements were green or red slashes of the same luminance.

Fig. 2. Top, two frames of a small one-row section of a kinematogram. At the left the kinematogram’s elements differ in phase, at right they differ in orientation. Below each kind of kinematogram is shown a local (middle) and global (bottom) motion solution for the two-frame sequence.

DETERMINATION OF ISOLUMINANCE
Before a comparison of the relative effectiveness of various features as short-range motion stimuli can be made, we have to ensure that the elements in the kinematograms are matched in all respects except in the feature of interest. This is easy for most features except hue. It is rather difficult to have elements of different hue but matched luminance. Briefly, the method used here to determine the isoluminance point is to present red/green kinematograms with a fixed displacement but varying relative brightness between the red and green elements. At isoluminance the motion perception will be impaired. The minimum of the proportion correct motion judgements vs luminance ratio function is thus the isoluminance point.

Method

Subjects. The author (WS) and a naive observer (LB) served as subjects for all experiments. Both had normal or corrected-to-normal vision and were experienced in motion psychophysics.

Stimuli. The stimuli were 2-frame kinematograms in which the elements could take on one of 2 equiprobable levels of hue (red or green). The elements were slashes (/) that were either red (CIE xy-coordinates: 0.60, 0.34) or green (0.34, 0.55) against a black background.
Slashes were used instead of the traditional squares (as used in an ordinary random dot kinematogram) in order to make the stimuli more comparable with the orientation and phase stimuli used in following experiments. Note that when slashes are used the elements have a luminance border with the background. Other researchers have used square elements whose edges abut each other and hence have only hue borders (ignoring any luminance borders introduced by chromatic aberration). The slash’s luminance border with the background cannot be used to make the motion match, however, so hue is the only possible basis of motion correlation. The red element’s luminance contrast with the black background was fixed at 90%. The green element’s luminance contrast with the red elements could take on any of seven values from -80% (red brighter) to 60% (green brighter). Each frame of one row of a kinematogram (the display had 25 rows of 40 elements) is illustrated for each type of feature in Fig. 1. Each element sat in a cell that was 5 min arc wide and high (giving a total display size of 3.33 deg wide by 2.08 deg high). Each frame was displayed for 100 msec and flipped with no ISI. The displacement of the whole display was fixed at four element widths (20 min arc) and was either to the left or to the right. An Amiga computer presented the displays (on an analogue RGB monitor) and recorded the responses. Isoluminance was determined by finding proportion correct direction of motion judgements as a function of red/green contrast. The minimum of this function was taken to be the isoluminance point.

Procedure. On each trial the subject was presented with a display that randomly (with a 50% probability) moved either to the left or to the right. The subject’s task was to indicate which had occurred (by pressing one of two keys). The different red/green contrasts were presented in random order.

Results and discussion
Proportion correct direction of motion judgements as a function of red/green contrast are shown in Fig. 3. When the red and green elements have equal luminance (at a displacement of four elements or 20 min arc), performance drops to chance. This red/green contrast was used to determine the function for hue in the next experiment.
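The rule used above, taking the minimum of the proportion-correct function as the isoluminance point, can be sketched in a few lines of Python. The function name and the five example values are hypothetical and serve only as illustration; they are not the data plotted in Fig. 3, and the experiment itself used seven contrast levels.

```python
def isoluminance_point(contrasts, proportion_correct):
    """Return the red/green contrast at which proportion correct is lowest."""
    if len(contrasts) != len(proportion_correct):
        raise ValueError("need one proportion per contrast level")
    # Index of the minimum of the performance function.
    i_min = min(range(len(contrasts)), key=lambda i: proportion_correct[i])
    return contrasts[i_min]


# Hypothetical values for illustration only; they are NOT the data of Fig. 3.
example_contrasts = [-40, -20, 0, 20, 40]             # percent, green relative to red
example_proportions = [0.95, 0.80, 0.52, 0.78, 0.93]  # proportion correct
example_minimum = isoluminance_point(example_contrasts, example_proportions)
```

With sampled contrasts the estimate can only be as fine as the spacing of the levels tested; interpolating between levels would be a further assumption.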
Fig. 3. Proportion correct direction of motion judgements as a function of red/green contrast (percent) for observers WS and LB. The displacement was fixed at 20 min arc (4 element widths). Each point is based on 120 trials. The minimum of the function was taken to be the isoluminance point.

DIRECTION DISCRIMINATION AS A FUNCTION OF DISPLACEMENT
This experiment compares the relative effectiveness of luminance, hue, phase, and orientation as features for motion computation. Kinematograms whose elements differ in one of these features are presented and the proportion of correct motion judgements is measured as a function of the size of the displacement in successive frames.

Method

Stimuli. The stimuli were 2-frame kinematograms as previously described. Here the elements could take on one of 2 equiprobable levels of luminance, hue, orientation, or phase. Each frame of one row of a kinematogram (the display had 25 rows of 40 elements) is illustrated for each type of feature in Fig. 1. The 2 levels for each feature were chosen to be opposites: for luminance the elements were slashes (line segments oriented at 45 deg) that were black (invisible against the black background) or white, for phase the elements were Ts or Ls, for hue the elements were slashes that were red or green, and for orientation the elements were slashes oriented at ±45 deg (left and right of vertical).
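The construction of such a two-frame display can be sketched as follows. This is a minimal illustration, not the original Amiga program: the function name is hypothetical, the element values are abstract labels rather than drawn shapes, and the wrap-around at the display edge is an assumption, since the text does not say how the vacated columns were filled.

```python
import random

ROWS, COLS = 25, 40  # display size given in the text


def make_kinematogram(values, shift, rng):
    """Return (frame1, frame2); frame2 is frame1 displaced `shift` cells.

    Positive shift is rightward, negative is leftward. Cells that leave one
    edge re-enter at the other (an assumption, not stated in the paper).
    """
    # Each cell takes one of the two feature values with equal probability.
    frame1 = [[rng.choice(values) for _ in range(COLS)] for _ in range(ROWS)]
    # The element at column c in frame 1 appears at column c + shift in frame 2.
    frame2 = [[row[(c - shift) % COLS] for c in range(COLS)] for row in frame1]
    return frame1, frame2


rng = random.Random(0)  # fixed seed so the sketch is repeatable
phase_frames = make_kinematogram(("T", "L"), shift=4, rng=rng)
```

The same function covers the other conditions by changing the labels, for example `("/", "\\")` for orientation or `("red", "green")` for hue.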
The use of T and L as an element pair differing only in phase (the relative position of the vertical and horizontal line segments) is standard in the visual search and texture segmentation literatures. In the orientation condition, the orientations of ±45 deg were chosen since they are physically orthogonal. It was also important that the elements had contours that were not parallel to the direction of motion (horizontal here). The closer to parallel, the weaker the horizontal motion signal due to the aperture problem. The aperture problem dictates that only the component of motion normal to the contour can be detected. If the segments were, for example, horizontal and vertical, the motion signal arising from the horizontal segments would be very weak (only the endpoints would give a signal). The luminance, orientation, and phase kinematograms were viewed in counterbalanced order (3 blocks of 50 trials); the hue kinematograms were viewed later. Each datum point was thus based on 150 trials. The displacement of the whole display was always an integral number of elements and was either to the left or to the right. The displacements were presented in random order. An Amiga computer presented the displays (on an analogue RGB monitor) and recorded the responses.

Results and discussion
Figure 4 shows the psychometric functions for proportion correct direction judgements as a function of displacement for kinematograms whose elements could differ in luminance, hue, phase, or orientation. The point on the function where performance reaches 0.75 is termed Dmax. Dmax and the slope of the psychometric functions were determined by nonlinear regression (simplex method; the model function was 2AFC logistic). The least-squares fitted values (with 95% confidence intervals) for Dmax are given in Table 1. All of the features tested were used for motion detection, although Dmax varied markedly between features. Note that Dmax is much higher for luminance kinematograms than for the other types of display. The Dmax for phase and hue are not statistically different, and are intermediate between luminance and orientation. In finding a smaller Dmax for hue than for luminance, I replicate the results of Cavanagh et al. (1985) and Sato (1988). The lowest Dmax is obtained with orientation kinematograms. In fact, performance collapses with displacements greater than only one element. The result that Dmax is smaller when orientation is used to match elements between frames rather than hue confirms a similar finding by Gorea and Papathomas (1989), who used a quite different kind of stimulus.

The fact that the psychometric functions for hue and phase are the same (Fig. 4) supports Troscianko’s (1987) contention that the poor motion performance seen with hue kinematograms is due to the introduction of a positional uncertainty in the neural representation of isoluminant stimuli. Troscianko’s explanation is this: red/green motion performance is poor because the neural representation has lost the precise location of each red and green element. Although the explanation was originally framed for isoluminant red/green stimuli, it can be applied to the phase stimuli since here the elements are also isoluminant (with each other, not with the background). In the phase stimuli (Ts and Ls) the representation will have lost information about where the horizontal and vertical segments are within a cell. This amounts to it having lost the locations of the Ts and Ls (since they are defined by the relation of horizontal and vertical segments). In both cases, hue and phase, the wrong elements will thus be matched, yielding poor motion performance.

The idea that a positional uncertainty is introduced into the neural representation of isoluminant stimuli could also be used to explain the poor performance with orientation kinematograms. Again, the elements (oriented line segments) are isoluminant with one another. If the location of each element is lost, then the wrong element will be matched and poor motion performance will result. An account that explains the poor performance with isoluminant elements on the basis of an introduction of positional uncertainty into the neural representation can handle the fact that the Dmax for hue, phase and orientation kinematograms are smaller than for luminance. It does not, however, predict the finding that Dmax is smaller for orientation than for phase or hue kinematograms. A simpler interpretation would be that the process underlying short-range motion detection prefers luminance to other features as a basis for matching elements. It can use phase, hue and orientation, although less well than it can use luminance.

Table 1. Fitted values for Dmax (± 95% confidence interval) in min arc

Feature        WS              LB
Luminance      25.98 ± 1.95    30.09 ± 5.22
Phase          15.39 ± 0.14    13.36 ± 2.11
Hue            15.23 ± 1.03    11.25 ± 2.24
Orientation     7.98 ± 3.10     5.66 ± 4.32

Fig. 4. Proportion correct direction judgements as a function of displacement (min arc) for kinematograms whose elements could differ in luminance, hue, phase, or orientation. Each point is based on 150 trials.
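The fitting step described in this section, a 2AFC logistic function whose 0.75 point defines the displacement limit, can be sketched as below. Several things here are assumptions made for illustration: the exact parameterisation of the logistic (the paper gives only its name), the parameter grids, and the use of an exhaustive grid search in place of the simplex method actually used. The data points are noise-free values generated from the model itself, not the observations of Fig. 4, and the sketch does not compute confidence intervals.

```python
import math


def p_correct(d, dmax, slope):
    """2AFC logistic: falls from 1.0 toward chance (0.5); equals 0.75 at d == dmax."""
    return 0.5 + 0.5 / (1.0 + math.exp((d - dmax) / slope))


def fit_dmax(displacements, proportions):
    """Least-squares fit of (dmax, slope) by exhaustive grid search.

    Grid search stands in for the simplex method used in the paper.
    """
    best = None
    for i in range(1, 401):          # dmax candidates: 0.1 .. 40.0 min arc
        dmax = i / 10.0
        for j in range(5, 101):      # slope candidates: 0.5 .. 10.0 min arc
            slope = j / 10.0
            sse = sum((p - p_correct(d, dmax, slope)) ** 2
                      for d, p in zip(displacements, proportions))
            if best is None or sse < best[0]:
                best = (sse, dmax, slope)
    return best[1], best[2]


# Noise-free synthetic points generated from the model (NOT data from Fig. 4).
ds = [5, 10, 15, 20, 25, 30]
ps = [p_correct(d, 15.0, 3.0) for d in ds]
fitted_dmax, fitted_slope = fit_dmax(ds, ps)
```

Because the synthetic points lie exactly on the model curve and the generating parameters sit on the grid, the search recovers them exactly; with real proportions a continuous optimiser such as the simplex method would be the natural choice.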
THE EFFECT OF ELEMENT SHAPE
The Dmax I obtained for luminance kinematograms was about 29 min arc. This value is certainly greater than Braddick’s (1974) proposed 15 min arc limit for displays of this type. To check the possibility that my large Dmax was due to the use of oriented line segments as elements, I obtained displacement psychometric functions using the traditional squares as elements.

Method

Stimuli. The stimuli were kinematograms as described in the previous experiment. The elements were 5 min arc black or white squares with abutting edges; in other words, the stimuli were ordinary random dot kinematograms.

Procedure. As previously described.
Results and discussion

As can be seen in Fig. 5, there is no difference between the functions for oriented line segments and for squares. The 95% confidence intervals for the Dmax fitted to slashes (WS: 25.98 ± 1.95; LB: 30.09 ± 5.22) and squares (WS: 26.20 ± 6.61; LB: 31.34 ± 8.52) overlap. The element shape does not cause the larger Dmax here. Subsequent to Braddick’s (1974) proposal of a 15 min arc limit for the perception of coherent motion in random dot kinematograms, other researchers have found that Dmax varies as a function of the area of the moving field (Baker & Braddick, 1982) and as a function of element size (Cavanagh et al., 1985). Either of these factors may be responsible for the large Dmax. In any case, this experiment shows that my results for luminance kinematograms using slashes are not anomalous.

In Gorea and Papathomas’s terminology (1989) the display with square elements stimulates a non-oriented luminance channel and the slashes display stimulates an oriented luminance channel. Their terminology pays attention to how an element is defined in one frame. Use of this terminology encourages one to think that motion detection with squares and slashes might differ (Gorea & Papathomas, 1989, p. 599). I call both the slashes and the squares displays luminance kinematograms. My terminology stresses what features are available for motion matching between frames. Since the matching here must be based on luminance regardless of element shape (oriented or non-oriented), my terminology correctly suggests that performance should be the same for the two.

DISCRIMINABILITY OF ELEMENTS
One explanation of the rank ordering I found for the efficacy of various features as input to short-range motion sensors is that this is just a rank ordering of the discriminability of the elements. If it is difficult to discriminate the elements it will be difficult to match them correctly in successive frames. Although the elements for each feature were chosen to be polar opposites, it is possible that, for example, it is harder to tell the difference between red and green than it is to tell the difference between black and white. To check this possibility, I measured the proportion correct responses and latency for discriminating pairs of elements (within a feature type).

Method

Stimuli. Elements as described previously were presented in pairs in the centre of the screen, surrounded by a white circle of 15 min arc radius. The elements differed in one trait: either luminance, phase, hue, or orientation (i.e. only one feature type was presented at a time). Each element randomly assumed one of 2 values; e.g. for orientation the possible combinations would be //, /\, \/, and \\. The stimulus pair was presented for 200 msec (the same total display time as used in the motion experiments).

Procedure. The subject’s task was to indicate whether the two stimuli were the same or different. Proportion correct responses and response latency (only for correct responses) were recorded. The different features were presented in counterbalanced order (4 orders × 100 trials).

Results and discussion
Fig. 5. Proportion correct direction judgements as a function of displacement (min arc) for luminance (black/white) kinematograms whose elements were either the traditional adjoining squares or oriented line segments (slashes: /). Each point is based on 150 trials.
Figure 6 shows proportion correct responses plotted against latency (each point is based on 400 trials). Note that the 95% confidence intervals overlap for both measures for all features. I conclude that the features are roughly equated for discriminability, and that they differ in their ability to stimulate short-range motion sensors.
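The overlap criterion applied here, and earlier to the fitted displacement limits for slashes and squares, amounts to a simple interval comparison. The sketch below uses the values reported in the element-shape experiment (estimate ± 95% confidence half-width, in min arc); the function names are hypothetical, and overlap of confidence intervals is only a rough criterion, not a formal test of equality.

```python
def interval(estimate, half_width):
    """Return the (lower, upper) bounds of estimate ± half_width."""
    return (estimate - half_width, estimate + half_width)


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Values reported in the element-shape experiment (min arc).
ws_slashes = interval(25.98, 1.95)
ws_squares = interval(26.20, 6.61)
lb_slashes = interval(30.09, 5.22)
lb_squares = interval(31.34, 8.52)

ws_shapes_agree = intervals_overlap(ws_slashes, ws_squares)
lb_shapes_agree = intervals_overlap(lb_slashes, lb_squares)
```

The same check can be applied to any pair of estimates for which an interval is reported.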
Fig. 6. Proportion correct same/different judgements and response latency for pairs of elements that could differ in luminance, hue, phase, or orientation. The error bars indicate 95% confidence intervals. Each point is based on 400 trials.

GENERAL DISCUSSION

These experiments show that all the features tested (luminance, hue, phase, and orientation) can be used by the short-range process. Not all features yield the same Dmax, however. If we make the assumption that a larger Dmax reveals a more preferred stimulus, then short-range motion sensors prefer to match elements on the basis of luminance, with hue and phase (a tie) being less preferred, and orientation least preferred. Hue turns out to be a cue of middling strength for short-range sensors.

Although I have been following the rest of the literature in interpreting a smaller Dmax as revealing poorer performance, this is not the only possible interpretation. As pointed out to me by D. Regan (personal communication), one can say that a small Dmax simply indicates that the sensor is tuned to slower speeds; a large Dmax indicates broader velocity tuning (Gorea & Papathomas, 1989, also make this interpretation). On this interpretation and sticking with the assumption of one class of short-range motion sensors, we would say that when the matching of elements is based on luminance the sensors are tuned to the widest range of stimulus speeds, when phase or hue are used the velocity tuning is sharper, and when orientation is used the sensors are most sharply tuned to the slowest speeds.

Regardless of how Dmax is interpreted, it is now clear that features other than luminance are used by the short-range motion mechanism. This suggests that it is a very general mechanism which can find correlation in any form. At any rate, it must receive inputs from mechanisms that compute orientation and phase, as well as hue and luminance. By using any feature available the motion sensors enlarge the range of circumstances in which they can function; by using all features simultaneously the amount of error (number of false matches) can be reduced.

The experiments in this paper were not designed to shed light on the theory of separate long-range and short-range motion mechanisms (Braddick, 1974). However, comparison of the results obtained here with kinematograms and Green’s results (1986, 1989) using large displacements of large circular patches may be useful in deciding the issue. Green found that motion was detected equally well when luminance, hue, or orientation was available as the basis for matching. He also found that phase could not be used as a matching feature. These different patterns of results (shown in Table 2) obtained with putative long-range and short-range stimuli could be used to reinforce the case for the existence of two motion mechanisms.

Table 2. Effectiveness of different features as bases of motion matching

Feature        Short-range    Long-range
Luminance      Best           OK
Hue            OK             OK
Phase          OK             Ineffective
Orientation    Worst          OK
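The notion of a matching process that can find correlation over any feature can be illustrated with a toy matcher that treats element values as opaque labels. This is only a sketch under stated assumptions: counting equal labels stands in for a cross-correlation, the function name and the candidate range of shifts are hypothetical, the row wraps around at its edges, and nothing here models the displacement limit measured in the experiments.

```python
import random


def best_shift(frame1, frame2, max_shift):
    """Return the horizontal shift (in cells) giving the most label matches.

    Labels can be any comparable values (luminance, hue, phase, orientation),
    so the matcher itself is indifferent to the feature used.
    """
    n = len(frame1)

    def matches(shift):
        # Count cells whose label reappears `shift` cells away in frame 2.
        return sum(frame1[c] == frame2[(c + shift) % n] for c in range(n))

    return max(range(-max_shift, max_shift + 1), key=matches)


rng = random.Random(0)
row1 = [rng.choice("TL") for _ in range(40)]     # one row of a phase display
row2 = [row1[(c - 3) % 40] for c in range(40)]   # displaced 3 cells rightward
recovered = best_shift(row1, row2, max_shift=6)
```

With only two label values, about half the cells match by chance at any wrong shift, which is one way to picture the false matches mentioned above.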
Acknowledgements: I thank the visitors to my ARVO 198!2 poster, the referees of an earlier draft, and Chieko Murasugi for helpful comments.
REFERENCES

Anstis, S. M. (1970). Phi movement as a subtraction process. Vision Research, 10, 1411-1430.
Anstis, S. M. (1980). The perception of apparent movement. Philosophical Transactions of the Royal Society of London B, 290, 153-168.
Baker, C. L. & Braddick, O. J. (1982). The basis of area and dot number effects in random dot motion perception. Vision Research, 22, 1253-1259.
Beck, J. & Ambler, B. (1973). The effects of concentrated and distributed attention on peripheral acuity. Perception & Psychophysics, 14, 225-230.
Braddick, O. (1974). A short-range process in apparent motion. Vision Research, 14, 519-527.
Cavanagh, P. & Favreau, O. E. (1985). Color and luminance share a common motion pathway. Vision Research, 25, 1595-1601.
Cavanagh, P., Tyler, C. W. & Favreau, O. E. (1984). Perceived velocity of moving chromatic gratings. Journal of the Optical Society of America A, 1, 893-899.
Cavanagh, P., Boeglin, J. & Favreau, O. (1985). Perception of motion in equiluminous kinematograms. Perception, 14, 151-162.
Duncan, J. & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96, 433-458.
Gorea, A. & Papathomas, T. V. (1989). Motion processing by chromatic and achromatic visual pathways. Journal of the Optical Society of America A, 6, 590-602.
Green, M. (1986). What determines correspondence strength in apparent motion? Vision Research, 26, 599-607.
Green, M. (1989). Color correspondence in apparent motion. Perception & Psychophysics, 45, 15-20.
Julesz, B. (1981). Textons, the elements of texture perception, and their interactions. Nature, London, 290, 91-97.
Julesz, B. & Bergen, J. (1983). Textons, the fundamental elements in preattentive vision and perception of textures. Bell System Technical Journal, 62, 1619-1645.
Julesz, B. & Schumer, R. A. (1981). Early visual perception. Annual Review of Psychology, 32, 575-627.
Nakayama, K. (1985). Biological image motion processing: A review. Vision Research, 25, 625-660.
Petersik, J. T. (1989). The two-process distinction in apparent motion. Psychological Bulletin, 106, 107-127.
Sato, T. (1988). Direction discrimination and pattern segregation with isoluminant chromatic random-dot cinematograms. ATR Technical Report no. TR-A-0027.
Troscianko, T. (1987). Perception of random-dot symmetry and apparent movement at and near isoluminance. Vision Research, 27, 547-554.