The effects of motion and stereopsis on three-dimensional visualization

Int. J. Human—Computer Studies (1997) 47, 609—627

GEOFFREY S. HUBONA
Department of Information Systems, Virginia Commonwealth University, P.O. Box 844000, Richmond, VA 23284-4000, U.S.A.

GREGORY W. SHIRAH
National Aeronautics and Space Administration, Goddard Space Flight Center, Code 522, Greenbelt, MD 20771, U.S.A.

DAVID G. FOUT
Century Computing, Inc., Laurel, MD 20707, U.S.A.

(Received 8 October 1996 and accepted in revised form 29 May 1997)

Previous studies have demonstrated that motion cues combined with stereoscopic viewing can enhance the perception of three-dimensional objects displayed on a two-dimensional computer screen. Using a variant of the mental rotation paradigm, subjects view pairs of object images presented on a computer terminal and judge whether the objects are the same or different. The effects of four variables on the accuracy and speed of decision performances are assessed: stereo vs. mono viewing, controlled vs. uncontrolled object motion, cube vs. sphere construction and wire frame vs. solid surface characteristic. Viewing the objects as three-dimensional images results in more accurate and faster decision performances. Furthermore, accuracy improves although response time increases when subjects control the object motion. Subjects are equally accurate comparing wire frame and solid images, although they take longer comparing wire frame images. The cube-based or sphere-based object construction has no impact on either decision accuracy or response time.

© 1997 Academic Press Limited

1. Introduction

During this era of federal downsizing and cost reductions, many spacecraft control centers have been levied with requirements for "lights-out operations", so that software systems can autonomously control the routine operations of spacecraft without human intervention. However, autonomous systems cannot currently handle every conceivable operations anomaly that occurs in complex spacecraft systems. When these anomalies occur, there is a need to notify spacecraft engineers quickly so that they can diagnose the problems. The engineers use diagnostic tools that help them to interpret the system's parameters and to formulate a solution to the problem.

Three-dimensional visualization techniques can help engineers to understand the state of a spacecraft's operations. Ground control systems typically contain many heterogeneous data points that cannot be effectively visualized through existing two-dimensional approaches. A number of three-dimensional visualization prototypes (see World Wide Web URL http://groucho.gsfc.nasa.gov/eve/VR.html) are under development at the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC) to help convey operational parameters in a typical spacecraft ground control system. These visualization prototypes use spatial abstractions, such as solid and wire frame cube-based and sphere-based objects, to represent parameter information. The facility with which human operators can use these high-level visual abstractions to effectively interpret and understand a spacecraft's operational data is one motivation for this research.

The purpose of this research is to investigate the complexities of human operators interpreting and interacting with moving three-dimensional images on a computer screen. Of particular interest are the effects of user-controlled and uncontrolled motion with three-dimensional images and the resulting impact on object recognition performance. Specific objectives of this research are: (a) to determine what effect stereoscopic depth information has on the ability of users to visualize abstract computer-displayed objects; (b) to investigate object recognition performance effects resulting from controlled and uncontrolled motion; and (c) to examine the utility of various object shapes and surface characteristics in facilitating object recognition.

Any two-dimensional representation of a three-dimensional object is inherently ambiguous, so there is a fundamental conflict in any attempt to represent a three-dimensional object on a two-dimensional computer display. For example, consider Figure 1(a). The Necker cube illusion is a classic representation of a three-dimensional object presented in a two-dimensional display. The human perceptual and cognitive subsystems "automatically" try to resolve this ambiguity by "seeing" one of two possible faces of the cube as being "in front". That is, Figure 1(a) can be alternately perceived as (b) or (c). However, the result is often a disconcerting shifting back and forth between the two "front" faces, because three-dimensional depth cues rendered on a two-dimensional display are ambiguous. This is true whether the display is static or moving. However, with moving displays, time compensates for some of the lost and ambiguous spatial information.

FIGURE 1. Necker cube illusion.

Humans visually interact with their environment by: (1) detecting visual information; (2) recognizing the "external source" of the information; and (3) interpreting its significance. Human perception and interpretation of visual information is an extremely complex process, involving several levels of anatomical, neurochemical and psychophysical processing inherent in vision and cognition. The human eye is not a camera-like instrument for recording an image. Rather, it serves as an "optical interface" between the external environment and the neural components of the human visual system, providing the basic visual attributes of form, field, color, motion and depth.

2. Theory and background

The increasingly prevalent use of three-dimensional computer display technology has fostered an increased interest in the use of stereoscopic viewing to convey information. The perception of stereoscopic images is produced by binocular disparity. Because the human eyes are located about 6 cm apart, an object viewed with both eyes within a distance of about 30 m projects a slightly different image onto each retina. The brain fuses these two images and, in the process, is provided with significant depth cues about the relative size, shape, location and orientation of the object. However, we perceive depth and the relative positions of objects in space for even the most distant objects. Thus, stereopsis is not the only mechanism for perceiving depth. The operative visual cues that enable humans to perceive depth have been extensively studied. Often these cues are placed in two categories: (1) primary cues, which include binocular disparity, convergence and accommodation; and (2) secondary cues, which include perspective and elevation, size, texture, shading and shadow, motion, reference frames and others [see Kelsey (1993) for a complete discussion of these primary and secondary depth cues].

The everyday human perceptual experience occurs within a context of nested motions. Moving eyes, as well as moving objects, provide valuable perceptual cues about the environmental and spatial properties of perceived objects. Wallach and O'Connell (1953) demonstrated that people can recover three-dimensional form when viewing two-dimensional images of rotating objects. They constructed three-dimensional wire forms and projected the shadows of these wire forms onto screens. When the wire forms were stationary, viewers reported seeing only two-dimensional configurations of lines in the shadow projections. However, when the wire forms were continuously rotated, viewers could accurately report on the three-dimensional configurations of the wire forms. Wallach and O'Connell labeled this phenomenon the Kinetic Depth Effect, or KDE. The KDE is well documented as a powerful depth cue and has been demonstrated to exist under relatively impoverished conditions, such as when viewing projections of rotating dot patterns (Braunstein, 1976; Todd, 1985).

Proffitt and Kaiser (1991) discuss the minimal stimulus conditions necessary to perceive environmental properties in objects. They maintain that motion is a minimally sufficient condition for perceiving a variety of environmental properties, including three-dimensional form. The motion of an object relative to the viewer is referred to as its displacement. When displaying static objects, the conveyance of displacement can only be achieved through very artificial techniques. However, when displaying moving objects, the natural medium of time provides displacement information for the viewer.

Theory and technique relating to dimensional integrality are also relevant to this study. Specifically, when scientific data contain more than two dimensions, how are they best rendered so as to facilitate comprehension? Suggested techniques include: XY plots with Z represented through intensity or color (Liu & Wickens, 1992) or with surface contours (Van Damme & Van de Grind, 1993); the concurrent use of two, or even three, two-dimensional planar representations (Andre, Wickens, Moorman & Boschelli, 1990; Wickens, Merwin & Lin, 1994); three-dimensional perspective renderings (Carswell, Frankenberger & Bernhard, 1991; Liu & Wickens, 1992); or stereopsis (Gallimore & Brown, 1993; Sollenberger & Milgram, 1993; Brown & Gallimore, 1995; Ware & Franck, 1996). Wickens (1992a, b) and Wickens et al. (1994) argue, on the basis of the proximity compatibility principle, that an additional display dimension (e.g. a three-dimensional display over two planar two-dimensional displays, or an XY plot over two X plots) carries an inherent advantage when multiple sources of data must be integrated.

A number of recent studies have investigated human performance with stereoscopic user interfaces in the following task domains: cockpit situational awareness (Bridges & Reising, 1987; Andre et al., 1990; Reising & Mazur, 1990; Yeh & Silverstein, 1990); the viewing, manipulation and/or recognition of object images (Horst, Rau, LeCocq & Silverman, 1983; Spain & Holzhausen, 1991; Gallimore & Brown, 1993; Wickens et al., 1994; Brown & Gallimore, 1995; Ware & Franck, 1996); relative depth perception (Reinhart, 1990); and medicine (Sollenberger & Milgram, 1993). The use of stereoscopic viewing did not uniformly assist performance in these studies, although many report that stereopsis helps user performance by augmenting certain monoscopic visual cues. In particular, object motion combined with stereoscopic viewing is especially effective in providing depth information about the object.

Shepard and Metzler (1971) first introduced the mental rotation paradigm. Subjects were presented with pairs of perspective drawings of stationary three-dimensional objects consisting of blocks. The task was to determine whether the presented images represented different objects or different angular orientations of the same object. They concluded that subjects formed a cognitive image of one of the objects and mentally rotated it for comparison with the second object. Gallimore and Brown (1993) modified the mental rotation paradigm by enabling subjects to rotate one of the presented objects to assist in their object comparison decisions. They found no significant performance differences in terms of accuracy or response time when block objects were presented as three-dimensional as compared to two-dimensional images: stereo viewing was neither faster nor more accurate. Furthermore, they reported higher error rates and slower response times when the block objects were presented as wire frame as compared to solid objects. In a subsequent study, Brown and Gallimore (1995) again used the mental rotation paradigm with solid and wire frame block objects. Subjects had more difficulty in accurately comparing wire frame as compared to solid objects. However, they reported some performance improvements from stereoscopic viewing and suggested that stereopsis is most effective when other monoscopic depth cues are absent or when the task is visually and perceptually demanding.

Sollenberger and Milgram (1993) investigated the effectiveness of stereoscopic and rotational display techniques with respect to the accuracy of tracing three-dimensional network-like paths. In one experiment, they found that motion alone contributed more to accuracy than did stereoscopic viewing alone, although the combination of motion and stereopsis improved accuracy more than either condition by itself.
In a second experiment they looked at the effect of subject-controlled and uncontrolled object motion, again reporting improved accuracy resulting from both the motion and stereopsis conditions, although the type of motion did not affect performance.


Liu and Wickens (1992) used perspective as a depth cue to present three-dimensional "looking" objects to subjects. Specifically, they presented images consisting of numerous vertical lines projecting from a two-dimensional plane. By adding surface contours to the ends of the vertical lines, they made the image appear three-dimensional. Subjects performed an integration task (i.e. cluster detection) and a focused attention task (i.e. judging the magnitude of similarity between pairs of objects). They reported that the surface contours (i.e. the three-dimensional "look") facilitated performance in the integration task, but impeded performance in the focused attention task. Wickens et al. (1994) had subjects view three-dimensional data sets representing complex surfaces from which discrete points had been sampled. Subjects answered questions that required focus of attention on certain data points or integration across numbers of data points. Similar to Liu and Wickens (1992), Wickens et al. found that three-dimensional representations enhanced performance, but only for integrative questions. Further, they reported that motion provided no performance benefits.

Van Damme and Van de Grind (1993) made a distinction between passive and active motion. The KDE is an example of passive motion, in which the object moves. In contrast, active motion results from the observer moving his or her head relative to the position of the object. Subjects categorized three-dimensional quadratic surfaces with randomly chosen shapes and fixed amounts of curvature into one of eight shape categories. Both passive and active motion improved subjects' identifications of the three-dimensional shapes. However, there was no difference between passive and active motion in facilitating shape identification.

Ware and Franck (1996) conducted two experiments demonstrating that the combination of motion and stereo viewing effectively increases the size of a network graph that can be perceived. Stereo vision alone and motion alone both enhanced path-tracing task performances, but stereo vision combined with motion was even more effective. Although different motion cues were examined, including head-coupled (i.e. active) motion, hand-guided (controlled) motion and automatic (uncontrolled) motion, each type of motion was equally effective in aiding the perception and understanding of the network graphs.

3. Hypotheses

Many recent studies indicate the benefit of stereoscopic viewing in perceiving, recognizing and understanding object shapes (McWhorter, Hodges & Rodriguez, 1991; Sollenberger & Milgram, 1993; Brown & Gallimore, 1995; Ware & Franck, 1996), although some studies do not support the superiority of stereopsis (Gallimore & Brown, 1993) and others suggest that these benefits are task specific (Liu & Wickens, 1992; Wickens et al., 1994). However, stereopsis does provide depth cues about object shape that are otherwise absent. Consequently, the following two hypotheses are proposed:

H1A: Performance accuracy is greater when objects are viewed as stereoscopic images than when objects are viewed as monoscopic images.

H1B: Response time is shorter when objects are viewed as stereoscopic images than when objects are viewed as monoscopic images.

Although the preponderance of evidence indicates that motion facilitates object recognition performance (Wallach & O'Connell, 1953; Braunstein, 1976; Todd, 1985; Gallimore & Brown, 1993; Sollenberger & Milgram, 1993), evidence suggests that it does not matter whether the object or the observer is moving (Van Damme & Van de Grind, 1993). Further, although there is some evidence that the rate of motion affects the ability to recognize the object (Sollenberger & Milgram, 1993), there is no evidence that motion controlled by the observer is superior to uncontrolled motion (Ware & Franck, 1996). The relative advantages of controlled vs. uncontrolled motion have practical importance for building three-dimensional visual interfaces: if the user cannot effectively improve object perception and understanding by controlling the motion of the object, then the user should not be given this control. Commensurate with previous studies indicating that the type of motion is inconsequential, the following two hypotheses are proposed:

H2A: There is no difference in performance accuracy when object motion is controlled or uncontrolled.

H2B: There is no difference in response time when object motion is controlled or uncontrolled.

There is evidence that three-dimensional wire frame objects are more difficult to perceive and interpret than are solid, opaque three-dimensional objects because of the "see through" characteristic of wire frame (Sanford, Barfield & Foley, 1987; Brown & Gallimore, 1995). In a three-dimensional representation, the presence of "hidden features" (sometimes referred to as interposition) has been demonstrated to facilitate the identification and recognition of object shape (Gallimore & Brown, 1993). Thus, the following two hypotheses are proposed:

H3A: Performance accuracy is greater with solid objects than with wire frame objects.

H3B: Response time is shorter with solid objects than with wire frame objects.

There has been very little research specifically investigating how different abstract shapes affect three-dimensional object recognition; a notable exception is Van Damme and Van de Grind (1993). In terms of abstract three-dimensional objects, cubes (Shepard & Metzler, 1971; Mariani & Lougher, 1992; Gallimore & Brown, 1993; Brown & Gallimore, 1995) and lines (Liu & Wickens, 1992; Wickens et al., 1994) or networks (Sollenberger & Milgram, 1993; Ware & Franck, 1996) have been used. The physiological basis of visual detection offers some justification for the use of cube- and line-based shapes in this type of research. Specifically, the horizontal and vertical line detection features inherent in the human perceptual system (Kelsey, 1993) might suggest that lines and cubes are more readily perceived than are contoured object shapes. However, curved shapes have been used in three-dimensional visualization research (Liu & Wickens, 1992; Van Damme & Van de Grind, 1993) and, perhaps more importantly, many contemporary scientific data visualization systems are based on representations of curved shapes and objects. Thus, there is justification to further investigate curved object shapes. However, there is no salient theoretical basis to suspect a priori performance advantages for either cube-based or sphere-based objects. Consequently, the following two hypotheses are proposed:

H4A: There is no difference in performance accuracy when viewing cube-based objects as compared to sphere-based objects.

H4B: There is no difference in response time when viewing cube-based objects as compared to sphere-based objects.


4. Method

4.1. EXPERIMENTAL TASK

A variant of the mental rotation paradigm, first introduced by Shepard and Metzler (1971) and used in similar studies (Gallimore & Brown, 1993; Brown & Gallimore, 1995), was used in this research. Subjects were presented with pairs of object images. Their task was to determine whether the two images in each pair represented the same or different objects. The image pairs were presented on the left and right halves of a computer screen. Figures 2—5 show representative examples of the four types of solid and wire frame, cubical and spherical object images. Figure 2 also depicts the experimental user interface as seen by the subjects, with "start", "same" and "different" buttons embedded.

One half of the presented image pairs were the same, and the other half were different. In each pair, the left image was always stationary and the right image was always in motion. In one half of the trials, subjects could control the motion of the right object by rotating it about its center point in any direction through 360°. In the remaining trials, the right object rotated automatically (i.e. uncontrolled motion) in one direction about the center point at a fixed rate of approximately 18° per second.

The computer hardware used in these tests included a Silicon Graphics (SGI) Indigo2 Extreme equipped with CrystalEyes stereoscopic glasses. Figure 6 displays one of the researchers wearing the stereoscopic glasses while viewing the test software on the SGI. The test software was developed in the C++ programming language using SGI's Open Inventor graphics toolkit. All images were developed and presented using a perspective projection.
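The paper does not reproduce the C++/Open Inventor test software. As a purely illustrative sketch of the two motion conditions described above (every name below is an editorial invention, not part of the original program), the following Python fragment derives the comparison object's orientation either from elapsed time, for uncontrolled motion at 18° per second, or from accumulated user input, for controlled motion.

```python
import numpy as np

UNCONTROLLED_DEG_PER_SEC = 18.0  # fixed rotation rate reported in Section 4.1


def rotation_about_y(angle_deg: float) -> np.ndarray:
    """3x3 rotation matrix about a vertical axis (the axis choice is illustrative)."""
    a = np.radians(angle_deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


def uncontrolled_angle(elapsed_s: float) -> float:
    """Uncontrolled condition: the comparison object turns at a constant rate,
    so a full revolution takes 360 / 18 = 20 s."""
    return (UNCONTROLLED_DEG_PER_SEC * elapsed_s) % 360.0


def controlled_angle(current_deg: float, drag_deg: float) -> float:
    """Controlled condition: the subject's input (e.g. a mouse drag, in degrees)
    is accumulated, allowing rotation in either direction through 360 degrees."""
    return (current_deg + drag_deg) % 360.0


# Example: after 5 s of uncontrolled motion the object has turned 90 degrees.
print(uncontrolled_angle(5.0))                          # 90.0
print(rotation_about_y(uncontrolled_angle(5.0)).round(3))
```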

FIGURE 2. Solid cube-based object image pair with user interface.


FIGURE 3. Wire frame cube-based object image pair.

FIGURE 4. Solid sphere-based object image pair.

4.2. EXPERIMENTAL DESIGN

The experiment used a counterbalanced 2×2×2×2 within-subjects design, manipulating four independent variables: viewing mode (stereo, mono); type of motion (controlled, uncontrolled); object surface characteristic (wire frame, solid); and object shape characteristic (cube, sphere). The dependent variables were error rate (i.e. the percentage of incorrect responses) and response time (measured in milliseconds). Repeated measures of the dependent variables were automatically recorded by the test software.

FIGURE 5. Wire frame sphere-based object image pair.

FIGURE 6. Experimental apparatus.

Each subject participated in 16 practice trials representing every combination of the independent variables. Subjects were informed that these trials were practice for the purpose of becoming familiar with the experimental procedures. Following the practice trials, each subject engaged in 4 sets of 52 trials each, for a total of 208 observations per subject. Each of the 208 image pairs was unique. Each subject was presented with the same 208 image pairs, although the presentation order varied. Subjects were permitted to rest briefly between each of the four experimental trial sets. The total time for each subject to complete all practice and experimental trials varied from 45 to 70 min.
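To make the factorial structure concrete, the sketch below (Python; every name is an editorial illustration rather than part of the original test software) enumerates the 16 cells of the 2×2×2×2 design and builds one subject's presentation order. It assumes the 208 unique image pairs were distributed evenly over the cells (13 per cell), and simple per-subject shuffling stands in for the counterbalancing scheme, which the paper does not describe in detail.

```python
import itertools
import random

# The four within-subjects factors from Section 4.2.
FACTORS = {
    "viewing": ("stereo", "mono"),
    "motion": ("controlled", "uncontrolled"),
    "surface": ("wire frame", "solid"),
    "shape": ("cube", "sphere"),
}

# The 16 cells of the 2x2x2x2 design (also the 16 practice-trial conditions).
CELLS = [dict(zip(FACTORS, levels)) for levels in itertools.product(*FACTORS.values())]
assert len(CELLS) == 16


def chunks(seq, size):
    """Split a sequence into consecutive pieces of the given size."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def subject_blocks(image_pairs, seed):
    """Assign 208 unique image pairs to cells (13 per cell, assumed balanced),
    shuffle the presentation order for one subject, and return 4 blocks of 52."""
    assert len(image_pairs) == 208
    rng = random.Random(seed)
    trials = [dict(cell, pair=pair)
              for cell, cell_pairs in zip(CELLS, chunks(image_pairs, 13))
              for pair in cell_pairs]
    rng.shuffle(trials)
    return chunks(trials, 52)


blocks = subject_blocks(list(range(208)), seed=1)
print(len(blocks), [len(b) for b in blocks])  # 4 blocks of 52 trials each
```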

4.3. SUBJECTS

The subject population consisted of 14 female and 17 male employees and contractors of the Goddard Space Flight Center who participated voluntarily. Subjects with corrected vision wore their eyeglasses or contact lenses underneath the stereoscopic viewing glasses. All subjects held professional occupations, including engineers, computer programmers and computer scientists. Each subject completed a preliminary questionnaire soliciting demographic information. The mean age of the subjects was 34.97 years, with 16.48 mean years of education, 17.03 mean years of computer experience and 12.65 mean years of professional work experience.

5. Results

Table 1 displays the means and standard deviations of error rates and response times by viewing mode (stereo, mono), type of motion (controlled, uncontrolled), surface characteristic (wire frame, solid) and object shape (cube, sphere). The mean error rate across all experimental trials was 11.04%. The mean overall response time per image pair was 13.01 s.

TABLE 1
Means and standard deviations of error rates and response times (standard deviations in parentheses)

Independent variable        Level           Error rate (%)    Response time (s)
Viewing mode                Stereoscopic    8.28 (27.56)      12.26 (7.82)
                            Monoscopic      13.80 (34.50)     13.75 (10.99)
Type of motion              Controlled      9.27 (29.01)      13.37 (10.87)
                            Uncontrolled    12.81 (33.43)     12.64 (8.05)
Surface characteristic      Wire frame      10.45 (30.60)     13.67 (11.32)
                            Solid           11.63 (32.07)     12.34 (7.36)
Shape characteristic        Cube            10.16 (30.21)     13.20 (10.90)
                            Sphere          12.00 (32.50)     12.80 (7.88)

The data were analysed by fitting a repeated measures multivariate analysis of variance (MANOVA) model to the 6448 experimental observations. The MANOVA model tested each of the four main effects (viewing mode, type of motion, surface characteristic and object shape) on the error rate and response time dependent variables. There were significant differences (at the 95% confidence level) in the mean values of the dependent variables (error rate and response time) as a function of three main effects: viewing mode, type of motion and surface characteristic. The effect of object shape (i.e. cube or sphere) was not significant with respect to the performance measures.

Viewing mode significantly affected subjects' image comparison performances in the omnibus MANOVA model (F(2,6411) = 44.57; p = 0.0001). Subjects viewing image pairs in stereo made fewer errors (the mean stereo error rate is 8.28%; F(1,6412) = 51.54; p = 0.0001) than did subjects viewing the image pairs in mono (the mean mono error rate is 13.8%). Hypothesis H1A is supported.

Moreover, subjects viewing stereoscopic images made their decisions more quickly (the mean stereo response time per image pair is 12.26 s; F(1,6412) = 43.95; p = 0.0001) than did subjects viewing monoscopic images (the mean mono response time per image pair is 13.75 s). Hypothesis H1B is also supported.

The type of motion also significantly affected subjects' comparison performances in the omnibus MANOVA model (F(2,6411) = 16.94; p = 0.0001). Subjects controlling the motion of the right-hand object image, also called the "comparison object" (recall that the left image was always stationary), were more accurate (the mean controlled motion error rate is 9.27%; F(1,6412) = 21.14; p = 0.0001) than were subjects who could not control this motion (recall that in these cases the comparison object rotated at a fixed rate; the mean uncontrolled motion error rate is 12.81%). Hypothesis H2A is not supported by the data. Furthermore, subjects controlling the motion took longer to make their comparison decisions (the mean controlled motion response time is 13.37 s; F(1,6412) = 10.44; p = 0.0012) than did the subjects who could not control this motion (the mean uncontrolled motion response time is 12.64 s). Hypothesis H2B is also not supported by the data.

The surface characteristic of the images (wire frame or solid) had a significant effect on comparison performances in the omnibus MANOVA model (F(2,6411) = 16.59; p = 0.0001). However, there was no significant difference in the error rates (F(1,6412) = 0.85; p = 0.3571) of subjects viewing the wire frame as compared to the solid objects; the results are inconclusive with respect to hypothesis H3A. Comparison performances were significantly affected by surface characteristic in terms of response time (F(1,6412) = 31.43; p = 0.0001). Subjects viewing wire frame images exhibited a longer mean response time of 13.67 s compared to subjects viewing solid images, with a mean response time of 12.34 s. Hypothesis H3B is supported by the data.

The appearance of the object (cubical or spherical) did not have a significant effect on comparison performances in the omnibus MANOVA model (F(2,6411) = 2.18; p = 0.1129). Therefore, one cannot conclude that object shape had an impact on error rate or response time. Hypotheses H4A and H4B are not refuted by the data.

Table 2 presents a summary of the empirical results and the implications for the hypotheses. Table 3 presents mean error rates and response times factored into the 16 viewing/motion/surface/shape combinations. These data are presented in order to further discuss the effects and interactions of the independent variables on object comparison performances.

TABLE 2
Hypotheses and results summary

Hypothesis   Stimulus                                        Expected outcome                   Result                     Significance
H1A          Viewing mode (stereoscopic vs. monoscopic)      Stereo more accurate               Supported for errors       p = 0.0001
H1B          Viewing mode (stereoscopic vs. monoscopic)      Stereo faster                      Supported for time         p = 0.0001
H2A          Type of motion (controlled vs. uncontrolled)    No difference in accuracy          Not supported for errors   p = 0.0001
H2B          Type of motion (controlled vs. uncontrolled)    No difference in response time     Not supported for time     p = 0.0012
H3A          Surface characteristic (wire frame vs. solid)   Fewer errors with solid            Inconclusive for errors    p = 0.3571
H3B          Surface characteristic (wire frame vs. solid)   Shorter response time with solid   Supported for time         p = 0.0001
H4A          Shape characteristic (cube vs. sphere)          No difference in accuracy          Supported for errors       p = 0.1129
H4B          Shape characteristic (cube vs. sphere)          No difference in response time     Supported for time         p = 0.1129

TABLE 3
Means and standard deviations of error rates (in %) and response times (in s) by viewing/motion/surface/shape condition combinations (standard deviations in parentheses)

Stereoscopic viewing
                    Controlled motion                Uncontrolled motion
                    Error rate      Response time    Error rate      Response time
Wire frame          4.84 (21.47)    12.61 (8.69)     8.56 (28.00)    12.40 (7.93)
Solid               8.31 (27.62)    12.26 (7.84)     11.41 (31.82)   11.77 (6.67)
Cube                4.66 (21.09)    12.44 (8.44)     8.12 (27.34)    12.09 (7.60)
Sphere              8.65 (28.12)    12.43 (8.11)     12.00 (32.52)   12.08 (7.03)

Monoscopic viewing
                    Controlled motion                Uncontrolled motion
                    Error rate      Response time    Error rate      Response time
Wire frame          13.40 (34.09)   15.80 (16.31)    15.01 (35.74)   13.87 (10.04)
Solid               10.55 (30.73)   12.81 (7.85)     16.25 (36.92)   12.53 (6.98)
Cube                12.54 (33.14)   14.90 (15.74)    15.29 (36.01)   13.36 (9.71)
Sphere              11.35 (31.75)   13.66 (8.79)     16.00 (36.68)   13.02 (7.39)
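The paper reports a repeated-measures MANOVA over the 31 × 208 = 6448 trial-level observations. As an editorial sketch only, and not the authors' procedure, a comparable analysis could be set up today along the following lines; the file and column names are hypothetical, and the simple formulas below ignore the within-subject structure that the original model accounted for.

```python
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.multivariate.manova import MANOVA
from statsmodels.stats.anova import anova_lm

# Hypothetical trial-level data: one row per trial (31 subjects x 208 trials = 6448 rows),
# with columns: subject, error (0/1), rt_s, viewing, motion, surface, shape.
df = pd.read_csv("trials.csv")

# Omnibus multivariate test of the four main effects on the two correlated
# dependent variables (error rate and response time), without a subject term.
mv = MANOVA.from_formula("error + rt_s ~ viewing + motion + surface + shape", data=df)
print(mv.mv_test())

# Univariate follow-up for one dependent variable, using type III sums of
# squares as the paper notes (sum-to-zero contrasts keep type III sensible).
rt_model = ols("rt_s ~ C(viewing, Sum) * C(surface, Sum)", data=df).fit()
print(anova_lm(rt_model, typ=3))
```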

6. Discussion

Hypotheses H1A and H1B, regarding the performance benefits of stereo viewing, were supported. Stereo viewing resulted in both improved accuracy and shorter response times in accomplishing the object comparison task. Similar results have been reported in several recent studies (Brown & Gallimore, 1995; Sollenberger & Milgram, 1993; Ware & Franck, 1996; Wickens et al., 1994), although there have been contradictory findings. Curiously, Gallimore and Brown (1993) found no performance benefits from stereo viewing, in terms of accuracy or speed, in an object comparison test that used the mental rotation paradigm. They offered two possible explanations for the absence of stereo performance benefits. Since motion depth cues were available in every trial, they speculated that (p. 375): "dominating depth information afforded by motion was sufficient, such that subjects were able to adequately perform the given task". In addition, they suggested that subjects adopted a feature-by-feature comparison strategy in conjunction with the controlled motion, so that the cue of stereopsis was an unnecessary aid to performance. Brown and Gallimore (1995) suggested that stereopsis is most effective when other depth cues are not available or when the task is visually and perceptually demanding. Sollenberger and Milgram (1993) observed that stereo viewing helped performance, but that stereo viewing combined with motion provided the maximum benefit. However, they noted that (p. 498): "stereoscopic displays may prove to be better than rotational displays for tasks involving visualization of relative spatial locations of objects or manipulation and navigation of objects in three-dimensional space". Ware and Franck (1996) also reported singular (and combined) benefits of stereo and motion, but reported that motion contributed more to accomplishing the task than stereo viewing.

In this study, stereo viewing was clearly beneficial, regardless of whether the motion of the object was controlled or uncontrolled, whether the object surface was wire frame or solid, or whether the object shape was cubical or spherical (see Table 3). Stereo viewing particularly assisted the accuracy of object comparisons when the motion was controlled by the subjects and the object was a wire frame (mean error rate of 4.84%) or a cube (mean error rate of 4.66%). Furthermore, stereo viewing was most helpful in reducing the response time for solid objects that rotated at a constant rate (i.e. uncontrolled motion).

Kelsey (1993) cites Neisser (1967) in discussing a particular model of the human visual system, that of preattentive and attentive processing of textons, that might help explain the performance benefits of stereo viewing. He noted that the speed of visual perception implies a preattentive level of processing early in the perceptual process, followed by subsequent and more focused attentive processing. Preattentive processing is automatic and very fast, such that visual stimuli are processed in parallel. Certain features called "textons", such as color, line orientation and line intersections, are detected at the preattentive level. Texton differences enable rapid recognition of a target, to be followed by eye-head movement to bring the target into foveal vision for more focused attentive processing. Attentive processing is more deliberate, requiring conscious cognitive comparisons and the use of memory. The introduction of stereo viewing, particularly for wire frame images, makes the image comparison task easier by shifting more of the perceptual features from the attentive to the preattentive level. Specifically, line orientations and intersections that are seen preattentively at the same visual depth in mono are seen preattentively at different visual depths in stereo. Thus, stereo viewing shifts some of the burden of comparing the objects from the more deliberate and conscious attentive level, which requires the expenditure of mental effort, onto the more automatic and faster preattentive perceptual level, and object comparisons are facilitated.

The two rejected hypotheses (i.e. hypotheses H2A and H2B), which stated that the type of motion does not affect comparison performances, yield perhaps the most significant findings of this study. Previous studies indicate that motion is a powerful depth cue, particularly when combined with stereoscopic viewing. There have been numerous studies demonstrating that motion facilitates object recognition (Wallach & O'Connell, 1953; Braunstein, 1976; Todd, 1985; Gallimore & Brown, 1993; Sollenberger & Milgram, 1993). However, it is unresolved whether the type of motion makes a difference in this regard. Several studies suggest that motion is a powerful depth cue no matter how the motion occurs. For example, Van Damme and Van de Grind (1993) found no difference in performance tied to whether the object or the observer was moving. Ware and Franck (1996) reported that performance was the same whether the observer controlled the object motion or simply observed continuous, uncontrolled motion. Intuitively, one might expect that controlled object motion would foster superior comparison performances, particularly when a feature-by-feature comparison strategy is utilized, because controlled motion enables the observer to rotate the comparison object into position to compare specific features.

The results of this study show that comparison accuracy was significantly improved when object motion was controlled, but response times were longer when object motion was controlled. Controlling the motion of the comparison object improved the quality (i.e. accuracy) of the decision, but at the expense of taking significantly more time to reach that decision. This latter finding is especially surprising because the objects in the uncontrolled motion trials rotated continuously at the relatively slow rate of 18° per second, thus requiring about 20 s for a complete rotation. Upon completing the experiment, many subjects commented that they felt this slow rate of automatic rotation compelled them to wait for at least one complete rotation (i.e. about 20 s) before reaching a decision. Nevertheless, the mean response time for the uncontrolled motion trials was 12.64 s, whereas the mean response time for the controlled motion trials was 13.37 s (the mean response time across all experimental trials was 13.01 s). Further, many subjects strongly expressed the opinion that they were both more accurate and faster when they controlled the motion. They were more accurate, on average, but they took longer. So the type of motion did affect comparison performances. This finding is at variance with the findings of previous studies (Van Damme & Van de Grind, 1993; Ware & Franck, 1996). Why did the type of motion affect performance in this study?
Many of the subjects were observed using a feature-by-feature comparison strategy, similar to that reported by Gallimore and Brown (1993). More specifically, they used a feature-by-feature negation strategy: they would meticulously compare the shapes of the two objects looking for features that did not match. Many subjects said they felt it was easier to confirm that any two objects were different than to confirm that they were alike. When they controlled the motion of the comparison object, they were better able to execute this strategy because they could precisely calibrate the angular orientations of the two objects. However, calibrating the angular orientations of the more complex objects often took some time. When they could control the comparison object motion, they were less sensitive to the time because they believed they could definitively match the two objects (or determine they were mismatched, if different) and therefore arrive at the correct decision. In contrast, when they could not control the comparison object motion, they would "give up" after a while and make an educated guess, because they realized they could not rotate the two objects into exactly the same orientation. Thus, they modified their feature-by-feature negation strategy and hurried a response. Consequently, they were less accurate, but faster to respond, when they could not control the motion of the comparison object.

What is the implication about the desirability of controlled and uncontrolled object motion for the three-dimensional visual interfaces under development for NASA satellite mission control operations? The features (stereo vs. mono; wire frame vs. solid; cube vs. sphere) of the experimental treatments were derived from actual data visualization prototypes under development. It is generally accepted that the effective, interactive use of information presented via computer technology is largely a function of the "fit", or symmetry, among: (1) human information processing characteristics; (2) the nature of the task to be performed; and (3) the manner of presenting the information in the computer interface (Vessey, 1991; Vessey & Galletta, 1991). With respect to the specific task of matching object images in a mental rotation paradigm, the implication is that decision quality (i.e. accuracy) is improved by controlled object motion, although the efficiency (i.e. speed) of making the decisions may suffer. If this object matching task can be generalized to the domain of interpreting satellite operations data presented through similar three-dimensional visual interfaces, then the implication is that operators may be able to interpret system anomalies more accurately, thereby improving the overall quality of spacecraft control.

Some specific topics on the utility of motion cues that might be useful to study further include: (1) optimal rates of motion; (2) more finely grained tests of the effectiveness of different types of motion (i.e. object motion, observer motion, head-coupled motion and different varieties of controlled and uncontrolled motion); and (3) the combination of motion with depth cues other than stereoscopic viewing. In this latter regard, the power of object shadows to convey depth information is well documented. A problem that has impeded research and development in the use of shadowing techniques to convey three-dimensional information is that rendering realistic shadows for moving, concave objects in a real-time display is computationally very expensive. However, the processing power of commonly available desktop computers has increased dramatically in recent years. Thus, these hardware advances can now support practical and more focused research on the utility of effective shadowing techniques in promoting three-dimensional visualization.
The main effect of surface characteristic (wire frame, solid) significantly affected response time (i.e. hypothesis H3B was supported), but not error rate (i.e. the data were inconclusive with respect to hypothesis H3A). Specifically, the mean object comparison error rate from trials with wire frame objects was not significantly different from that of solid objects. However, subjects were significantly slower to respond to the wire frame object pairs than to the solid object pairs. Previous studies have also reported performance difficulties with wire frame objects relative to solid objects (Sanford et al., 1987; Gallimore & Brown, 1993; Brown & Gallimore, 1995). Gallimore and Brown (1993) suggested that the presence of hidden or obscured features, which are apparent with solid objects but not with wire frame objects, enabled subjects to perceive and interpret the shape of the solid objects more easily.

In this study, stereo viewing facilitated object comparisons for wire frame images more than for solid images. There was a statistically significant interaction of viewing mode and surface characteristic on image comparison performances in the omnibus MANOVA model (F(2,6411) = 9.61; p = 0.0001). (Note that in reporting the MANOVA main effects, type III sums of squares have been used; type III sums of squares account for any related interaction influence before main effects are computed.) This interaction effect was significant with respect to both error rate (F(1,6412) = 6.66; p = 0.0099) and response time (F(1,6412) = 13.84; p = 0.0002). Figures 7 and 8 graphically present these interactions. Figure 7 illustrates the differential impact of stereo and mono viewing on the accuracy of comparing wire frame and solid objects. In the monoscopic viewing mode, error rates are slightly higher when viewing wire frame images. However, in the stereoscopic viewing mode, the accuracy of comparing wire frame objects improves dramatically and is superior to solid object comparison accuracy.

FIGURE 7. Viewing mode × surface characteristic interaction on error rate.

FIGURE 8. Viewing mode × surface characteristic interaction on response time.

In short, Figure 7 suggests that the introduction of stereo viewing improves the accuracy of comparing wire frame objects to a greater extent than that of comparing solid objects. Similarly, Figure 8 shows that object comparison response time improves to a greater extent for wire frame objects than for solid objects when shifting from monoscopic to stereoscopic viewing. Considered together, Figures 7 and 8 clearly suggest that stereoscopic viewing improves the performance of comparing wire frame images more than solid images. This observation belatedly fulfills the expectation of Gallimore and Brown (1993) that was unconfirmed in their own experimental data.

There was no difference in comparison performances tied to the object shape (cube, sphere); thus, hypotheses H4A and H4B could not be rejected. However, separate ANOVA analyses indicate that subjects compared cube-based objects more accurately than they compared sphere-based objects. (MANOVA is the appropriate statistical technique for these experimental data because the dependent variables are correlated.) An examination of the comparison results in Table 3 indicates that, under the stereo viewing condition, subjects were much more accurate (but not faster) comparing cubes than spheres. However, in the mono viewing condition, there was no advantage to comparing cubes.

Kelsey (1993) describes feature detection models of human vision that suggest there are specific visual nerves that only fire with particular stimuli, such as when excited by a particular color, a certain shape or form, a vertical line, an angled line and so forth. The presence of feature detectors in the human visual system is well documented. Developed through the process of evolution, these "visual filters" presumably serve some adaptive purpose in object/scene recognition performance.

In these models, "feature detectors" winnow the flood of information detected by the eyes so as to select certain features. The visual field is made up of individual features that are important to recognition and interpretation. Thus, to enhance human performance while using computers, the focus should be on implementing these easily recognizable features in visual displays. It is known that there are retinal cells that specifically detect horizontal and vertical lines (Kelsey, 1993). This could explain why humans might be able to perceive straight and angled lines more readily than curved lines. Thus, cube-based images might be more readily perceived and interpreted than sphere-based images.

This research was sponsored by Code 522.2 of the Goddard Space Flight Center (GSFC) of the National Aeronautics and Space Administration (NASA) in Greenbelt, Maryland, USA.

References

ANDRE, A. D., WICKENS, C. D., MOORMAN, L. & BOSCHELLI, M. (1990). SID International Symposium Digest of Technical Papers, 21, 347—350.
BRAUNSTEIN, M. L. (1976). Depth Perception through Motion. New York: Academic Press.
BRIDGES, A. L. & REISING, J. M. (1987). Three-dimensional stereographic pictorial visual interfaces and display systems in flight simulation. SPIE Proceedings, 761, 102—109.
BROWN, M. E. & GALLIMORE, J. J. (1995). Visualization of three-dimensional structure during computer-aided design. International Journal of Human—Computer Interaction, 7, 37—56.
CARSWELL, C. M., FRANKENBERGER, S. & BERNHARD, D. (1991). Graphing in depth: perspectives on the use of three-dimensional graphs to represent lower-dimensional data. Behaviour & Information Technology, 10, 459—474.
GALLIMORE, J. J. & BROWN, M. E. (1993). Visualization of 3-D computer-aided design objects. International Journal of Human—Computer Interaction, 5, 361—382.
HORST, R. L., RAU, P. S., LECOCQ, A. D. & SILVERMAN, E. B. (1983). Studies in three-dimensional viewing in teleoperated systems. Proceedings of the 31st Conference on Remote Systems Technology, 1, 30—34.
KELSEY, C. A. (1993). Detection of visual information. In W. R. HENDEE & P. WELLS, Eds. The Perception of Visual Information, pp. 30—51. New York: Springer.
LIU, Y. & WICKENS, C. D. (1992). Use of computer graphics and cluster analysis in aiding relational judgement. Human Factors, 34, 165—178.
MCWHORTER, S. W., HODGES, L. F. & RODRIGUEZ, W. E. (1991). Comparison of 3D display formats for CAD applications. Technical Report GIT-GVU-91-04. Atlanta: Georgia Institute of Technology, Graphics, Visualization and Usability Center.
MARIANI, J. A. & LOUGHER, R. (1992). TripleSpace: an experiment in a 3D graphical interface to a binary relational database. Interacting with Computers, 4, 147—162.
NEISSER, U. (1967). Cognitive Psychology. New York: Appleton-Century-Crofts.
PROFFITT, D. R. & KAISER, M. K. (1991). Perceiving environmental properties from motion information: minimal conditions. In S. ELLIS, Ed. Pictorial Communication in Virtual and Real Environments, pp. 47—60. Bristol, PA: Taylor & Francis.
REINHART, W. F. (1990). Effects of depth cues in depth judgments using a field-sequential stereoscopic CRT display. Unpublished doctoral dissertation, Virginia Polytechnic Institute and State University, Blacksburg, VA.
REISING, J. M. & MAZUR, K. M. (1990). 3-D displays for the cockpit: where they pay off. SPIE Proceedings, 1256, 35—43.
SANFORD, J., BARFIELD, W. & FOLEY, J. (1987). Empirical studies of interactive computer graphics: perceptual and cognitive issues. Proceedings of the Human Factors Society 31st Annual Meeting, 2, 519—523.
SHEPARD, R. N. & METZLER, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701—703.


SOLLENBERGER, R. L. & MILGRAM, P. (1993). Effects of stereoscopic and rotational displays in a three-dimensional path-tracing task. Human Factors, 35, 483—499.
SPAIN, E. H. & HOLZHAUSEN, K. P. (1991). Stereoscopic versus orthogonal view displays for performance of a remote manipulation task. SPIE Proceedings, 1457, 103—110.
TODD, J. T. (1985). Perception of structure from motion: is projective correspondence of moving elements a necessary condition? Journal of Experimental Psychology: Human Perception & Performance, 11, 689—710.
VAN DAMME, W. J. M. & VAN DE GRIND, W. A. (1993). Active vision and the identification of three-dimensional shape. Vision Research, 11, 1581—1587.
VESSEY, I. (1991). The paradigm of cognitive fit: an information processing analysis of the table versus graphs controversy. Decision Sciences, 22, 219—240.
VESSEY, I. & GALLETTA, D. (1991). Cognitive fit: an empirical study of information acquisition. Information Systems Research, 2, 63—84.
WALLACH, H. & O'CONNELL, D. H. (1953). The kinetic depth effect. Journal of Experimental Psychology, 45, 205—217.
WARE, C. & FRANCK, G. (1996). Evaluating stereo and motion cues for visualizing information nets in three dimensions. ACM Transactions on Graphics, 15, 121—140.
WICKENS, C. D. (1992a). The proximity compatibility principle: its psychological foundation and its relevance to display design. Technical Report ARL-92-5/NASA-92-3. Savoy, IL: University of Illinois Institute of Aviation, Aviation Research Lab.
WICKENS, C. D. (1992b). Virtual reality and education. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pp. 842—847.
WICKENS, C. D., MERWIN, D. H. & LIN, E. L. (1994). Implications of graphics enhancements for the visualization of scientific data: dimensional integrality, stereopsis, motion, and mesh. Human Factors, 36, 44—61.
YEH, Y. & SILVERSTEIN, L. D. (1990). Visual performance with monoscopic and stereoscopic presentations of identical three-dimensional visual tasks. SID International Symposium Digest of Technical Papers, 21, 359—362.

Paper accepted for publication by Associate Editor, Prof. P. Barker.
