Optimizing Virtual Superimpositions: User–centered Design for a UAR Supported Smart Home System

Anne Wegerich ∗,∗∗ Jeronimo Dzaack ∗∗ Matthias Rötting ∗∗

∗ Research Training Group prometei, Technische Universität Berlin, Germany (e-mail: [email protected]).
∗∗ Department of Human-Machine-Systems, Technische Universität Berlin, Germany (e-mail: {jdz,mro}@mms.tu-berlin.de).
Abstract: Using Ubiquitous Augmented Reality (UAR) technologies, information systems take the next step towards gathering and presenting virtual information everywhere in 3D space. But UAR visualization in particular poses new challenges for the users of such systems. This article presents a first evaluation concerning the incorporation of user-centered design into the development of UAR systems for a smart home scenario. Furthermore, we focus on spatial AR (sAR) displays, which do not constrain the user's movements. With the described experiment we evaluated two extracted visualization parameters of sAR information presentation (realism and redundancy of the virtually displayed information) with the help of projected indoor navigation aids (usable in smart homes). To this end, we tested different forms of superimposed maps and arrows. The results show objective and subjective tendencies and significant effects in favor of schematic and more redundant projective AR presentations.

Keywords: Ubiquitous Augmented Reality, User Interfaces, Visualization, Smart Home, Navigation Systems, Performance Evaluation

1. INTRODUCTION

In our modern living and working environments computing systems are present everywhere. They take over several tasks and provide information selected by a collection of technologies (e.g. RFID, grid computing). In recent years there has been a shift from personal computing devices, where a "person and the machine [are] staring uneasily at each other across the desktop, [to] ubiquitous computing [where] technology recedes in the background of our lives" (Weiser (1991)). To this end, ubiquitous computing (UbiComp) systems combine several technologies to create an integrated service environment and allow contextual awareness, adaptation and personalization of computing devices. In combination with Augmented Reality (AR) methods, ubiquitous computing is extended into 3D space. AR allows adding virtual, computer-generated information to physical real-world environments in real time by merging or augmenting physical elements with virtual elements. In UbiComp environments this enables humans to interact with these digitally enriched surroundings. Thus, in Ubiquitous Augmented Reality (UAR) the way humans gather, process and share information is changed by new interaction concepts and methods that support human-machine interaction in three-dimensional space, not only for perception but also for manipulation tasks. In this context several research questions arise concerning the perception and presentation of virtual information and its manipulation, taking into account human factors and environmental contexts.
Therefore we designed a real-world scenario of a smart kitchen environment called the Ubiquitous Augmented Kitchen (UARKit). The UARKit is equipped with several information panels (e.g. displays, projection screens and holographic projection screens) for augmentation. Within this environment a collection of sensors enables a ubiquitous computing system to track the user, his actions, and the contextual state of the environment (see Fig. 1). The goal of the information system is to adapt AR visualizations automatically to the user's perception, based on predefined knowledge about cognitive processes and real-time context data.
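To make this adaptation step concrete, the following minimal Python sketch shows how sensed context might be mapped to presentation parameters. All names (ContextState, choose_visualization, the parameter keys) are illustrative assumptions, not the actual UARKit implementation.

```python
from dataclasses import dataclass

@dataclass
class ContextState:
    user_position: tuple   # (x, y) in room coordinates, from the tracking sensors
    user_activity: str     # e.g. "cooking", "searching"
    ambient_light: float   # illuminance in lux, from a light sensor

def choose_visualization(ctx: ContextState) -> dict:
    """Map the sensed context to presentation parameters for the panels."""
    params = {"panel": "nearest", "contrast": "normal"}
    if ctx.ambient_light > 500:            # bright kitchen: raise projection contrast
        params["contrast"] = "high"
    if ctx.user_activity == "searching":   # navigation task: show a directional aid
        params["content"] = "navigation_aid"
    else:                                  # otherwise support the cooking task
        params["content"] = "cooking_step_detail"
    return params
```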
Fig. 1. Smart Kitchen Project UARKit with some (intended) display devices and exemplary projections.

Focusing on the applicability of UAR technologies, we state that these technologies need to be non-intrusive,
human-centric and transparent. Thus, we investigate human-centered design of interaction methods in real-world and 3D smart environments. This research comprises three topics: (1) presentation, (2) perception and (3) manipulation of virtual information in 3D space. In our project we combine psychological knowledge of human information processing and perception with engineering knowledge regarding the design and the possibilities of technical systems.

This article presents research concerning the first aspect: the presentation of virtual information in UbiComp systems on information panels. It focuses on the question of how virtual information needs to be presented to provide the best possible user support. Our hypotheses and ideas are mainly based on the investigations of Bertin (1983) and his parameters for discrete data visualization in two-dimensional spaces, as well as on visualization models for parameter dependencies (e.g. Pfitzner et al. (2003)). These works state several rules that support data presentation and provide a modular framework incorporating them. In our research we extend this framework with new parameters that are necessary to realize AR-specific data presentation in 3D spaces.

2. VISUALIZATION ADAPTATION FOR AUGMENTED REALITY

The question of how AR information should be visualized to support the user has so far been a neglected problem, but it is a very important one because of the special nature of AR information presentation: the information as a whole is formed by a largely uncontrollable real part and a virtual superimposition that has to be unambiguous. Thus, one main issue is how to adapt displayed information to the perceptual abilities and limitations of humans and to the specific contextual environment. Therefore, we investigate different virtual information visualizations in experiments. With these experiments we follow the theories of Bertin and Pfitzner to extract AR-specific visualization parameters based on different data types, in order to determine the best possible form of information presentation for an AR application. Within the project UARKit we develop several user-adapted visualizations to support the correct interpretation of the superimposed data. The data types mainly concern navigational aids, details of cooking steps, and warnings.
2.1 User–centered design for Spatial Augmented Reality

For the user-centered development of AR visualizations in smart home environments two problems have to be taken into account: (1) physical constraints and (2) information presentation. The first concerns the physical constraints of current AR display technologies. In a smart home context, handheld systems (e.g. PDAs, mobile phones) are not applicable because the user has to hold the device in his field of view. Recent studies showed that head-mounted displays are not feasible for users in everyday life because of their weight, poor comfort, associated cables, and low resolution (Hwang et al. (2006); Jeon and Kim (2008)). Hence, we use spatial AR (sAR), which incorporates video- and optical-see-through displays such as head-up displays (HUDs) and projective AR (Bimber and Raskar (2005)). These technologies only allow world-related superimpositions, which are less precise with respect to the user's angular field. But they do not constrain the user's movements, and therefore they are more acceptable in the presented context.

The second problem relates to the configuration and location of the virtual superimposition within the (video) image presented to the user. A lot of research has been done on tracking and registration methods with high accuracy and on rendering algorithms for highly realistic and aesthetic visualizations. But in most applications the presented objects are just the best visualizations the current AR system is mathematically able to provide. Until now, hardly any effort has been spent on investigating how virtual information should be adapted to a user to support the correct interpretation of the virtual data. There are no specifications on how to make clear which real-world object a superimposition refers to (and which it does not), or what is to be done with it. Thus, the information as a whole is often ambiguous for the user (see Fig. 2).

To investigate exploratively which parameters are applicable to improve the perception of AR information, we first conducted an experiment involving AR indoor navigational aids. The AR-specific visualization parameters tested here are the realism and the redundancy of the virtual information. Realism means how realistically the virtual part of the information is visualized. This can be varied through aspects such as textures, shapes and the perspective of the superimposition. The redundancy of virtual data is the amount of information that is repeated from reality by the virtual information (e.g. shape, color). Both parameters are important resources for AR visualizations, and it is expected that they have an effect on user performance.

Fig. 2. Example of an AR application with ambiguous points of interest (unclear meaning and orientation)
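To structure the experiment, the two parameters can be written down as an explicit parameter space. The following Python sketch is illustrative only; the level names and their mapping to the conditions described below are our assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class Realism(Enum):
    SCHEMATIC = 1   # abstract shapes, flat colors
    TEXTURED = 2    # realistic textures and perspective

class Redundancy(Enum):
    LOW = 1         # direction only (e.g. an arrow)
    HIGH = 2        # repeats features of the environment (e.g. a room map)

@dataclass(frozen=True)
class SuperimpositionStyle:
    realism: Realism
    redundancy: Redundancy
    animated: bool

# e.g. the animated textured-map condition of the experiment below:
textured_map = SuperimpositionStyle(Realism.TEXTURED, Redundancy.HIGH, animated=True)
```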
2.2 AR indoor navigation aids

Until now, only isolated rules exist for optimizing AR visualizations with respect to human perception (e.g. Drascic and Milgram (1996); Walther-Franks and Malaka (2008); Laramee and Ware (2002)). But they have not yet been summarized, and for our purpose they are not concrete enough or not independent of the display device they were proposed for. The investigations often focus on specific applications in environments like cockpits (e.g. Aragon and Hearst (2005)) or on general visualization dependencies for display types (often restricted to HUDs, e.g. Fischer et al. (1980);
Crawford and Neal (2006)). A general approach or visualization model for AR visualizations (in smart environments) is missing.

However, besides the goal of optimizing visualization with respect to the two parameters, the evaluation has to respect existing constraints from research on displaying navigation aids. These are mainly based on findings related to perception, attention, and memory, as well as on research treating navigation as a process of problem solving. In addition to light conditions, colors, and gestalt laws, one important part of visual perception for AR concerns the perception of the spatiality and dimensionality (depth cues) of the information and the information space (Furmanski et al. (2002); Jurgens et al. (2006); Swan et al. (2006)). But for world-related projective AR superimpositions no egocentric depth judgements have to be made. Thus, we only need to follow the corresponding aspects of spatiality perception (related to the virtual object) for projective AR. This means, e.g., that when displaying a navigational aid (for indoor applications) at one perceived spatial depth, all parts of the virtual augmentation have to lie in the same spatial depth (Atchley et al. (1997); He and Nakayama (1995)). Furthermore, the information has to have a realistic dimension or dimensional adaptation (Aragon and Hearst (2005)). Neuroscientists state that the more realistic virtual information is, the more it is perceived and processed like information in the real world (Baumgartner et al. (2006)). Since we propose realism as an important variable parameter of AR visualization, we want to evaluate whether more realistic virtual information supports sufficient immersion. The projected aids in the experiment therefore appeared at only one distance from the user and were shown in a 3D perspective.

There are only a few findings concerning the amount of redundancy in a visualization. Actually, there is no evidence that more redundancy makes information clearer or less clear, apart from the known confusing effect of clutter (Crawford and Neal (2006)). Hence, with this study we additionally wanted to find out whether specified redundancy is also an important parameter influencing user performance. In our application it was evaluated with regard to the content or to formal characteristics of the environmental content.

Constraints from navigation research are also relevant for indoor navigation. To find an object in a local environment, no overview knowledge or knowledge about the location of the target is needed. Often this small search space does not require movements of the user. Hence, it is sufficient for the user to have information in his field of view (FOV) that shows the right direction (like an arrow), even if the target is not visible (Steck and Mallot (2000)). For global navigation, movements are necessary because the search area ranges over a large environment such as buildings or city districts. Therefore, overviews and knowledge about landmarks and routes are needed (e.g. as in a map) to decide on the optimal direction when arriving at a (known) turning point (Steck and Mallot (2000)). We assume that the requirements for indoor navigation lie right between these two principles. Movements are needed, but the search
space is very small and easy to survey, so orientation is easy. Thus, findings from navigation research are not negligible because of the trade-off between local and global navigation constraints.

To find out which AR visualization improves the understanding and performance of the user, we evaluated different visualization parameters exploratively. In a first step we conducted a navigational aid experiment concerning the parameters realism and redundancy, which have not yet been tested with respect to the constraints summarized above.

3. EVALUATION OF SAR VISUALIZATIONS FOR INDOOR NAVIGATION

In this section we describe the experiment concerning the navigational type of information in the UARKit project. Given the constraints for the visualization of navigational aids in indoor settings, there are many possibilities to display an sAR aid. To reduce the complexity of the experiment we selected three relevant visualizations with the help of a short prior questionnaire. The expected results of the evaluation, based on the reviewed research, were then condensed into five hypotheses, which also test established visualization parameters (animation and graphical vs. textual forms). This serves to test the assumption that the basic 2D visualization metaphors are also transferable to AR applications. The presented hypotheses are alternative hypotheses; the corresponding null hypotheses state that there is no effect of the respective independent variable.

• H1: Search times will be shorter when there is a directional cue than when the only navigational aid is the target marker.
• H2: A graphical visualization of the navigational information leads to greater usability (shorter search time, fewer errors, higher ratings) than a textual cue (established 2D visualization parameter).
• H3: Varied redundancy of the graphical visualization has an influence on usability.
• H4: In case of a redundant visualization, more realism leads to greater user satisfaction (higher ratings).
• H5: Animation has an influence on usability (established 2D visualization parameter).

The influence of visualization type (H3) and animation (H5) is not further specified, since there is no literature on the influence of these factors in single-room indoor navigation. Therefore, the investigation of redundancy and animation influences conducted in this study has an explorative character. All other effects are expected to concur with results from previous studies and can hence be formulated as directed hypotheses.

3.1 Method

The evaluation was carried out by means of a usability test involving 40 participants (20 female) in a within-subject design. Participants were frequent computer users but generally not familiar with digital navigational support. They rated their navigational skills as average to good and were mostly used to finding their way around unfamiliar
environments. Most subjects were between 21 and 40 years old. The great majority (> 70%) held a degree qualifying them for university studies or above.

The experiment required the participants to locate several objects in a medium-sized room (4 × 5 m) using the set of navigational aids described in the previous section. Three graphical visualizations, each in a static and an animated version, were compared to a textual cue and a control condition without a navigational aid. A target marker was presented in all cases, including the control condition. The resulting 3×2+1 design is presented in Figs. 3 and 4. The animated arrow moves in the direction of the searched object. The animated part of the map consisted of a moving starting point flying in a beeline to the target, without showing the best route. Furthermore, all visualizations convey the same amount of content information.
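The resulting condition set can be enumerated as in the following sketch; the condition labels are our own.

```python
from itertools import product

GRAPHICAL = ["arrow", "schematic_map", "textured_map"]
MOTION = ["static", "animated"]

conditions = [f"{g}_{m}" for g, m in product(GRAPHICAL, MOTION)]  # the 3 x 2 part
conditions.append("text")     # textual cue (the "+1")
conditions.append("control")  # target marker only, no directional cue

assert len(conditions) == 8   # 5 trials each -> 40 searches per participant
```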
At the beginning of each search, a photograph of the target was shown to the participant. Having recognized the target, the subject was asked to give a verbal signal. This signal cued the test supervisor to manually switch the projection from the target photo to the navigational cue, while at the same time starting the time measurement. During the search, the supervisor recorded all errors according to their classification. The predefined error classes consisted of direction errors, doubting or reassurance, not recognizing the target marker, and not finding an object or finding the wrong object. When the target had been found, the subject was asked to point to the object and give another signal so that the supervisor could stop the timer. This procedure made it possible to register the point of target discovery from the participant's point of view.

The room was equipped with AR projection systems. Shelves with a closed back side were positioned so that they separated the room into four quadrants, as illustrated in Figure 5. One of these quadrants represented the starting point where the search objects and the directional cues were presented to the participants.
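The manual measurement protocol could be captured by a small trial logger such as the following sketch; the class, the error labels and the millisecond unit are assumptions for illustration.

```python
import time

ERROR_CLASSES = {"direction", "doubt_or_reassurance",
                 "marker_not_recognized", "wrong_or_no_object"}

class TrialLog:
    def __init__(self, participant: int, condition: str):
        self.participant, self.condition = participant, condition
        self.errors, self.t0 = [], None

    def start(self):                   # first verbal signal: photo -> cue, timer on
        self.t0 = time.monotonic()

    def record_error(self, error_class: str):
        assert error_class in ERROR_CLASSES
        self.errors.append(error_class)

    def stop(self) -> float:           # second signal: target found, timer off
        return (time.monotonic() - self.t0) * 1000.0   # search time in ms
```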
Fig. 3. Test design, excluding the control condition (no hint, visualized with a red X).
Fig. 4. Examples of the presented navigation support (arrow, schematic and textured map).

The influence of the factors visualization type (realism, redundancy) and animation on the usability of the navigational aid was measured using search time, error frequency, error type and user rating (questionnaire) as dependent variables. Search time gives information on efficiency. Error frequency and type represent the effectiveness of the navigational aid. User satisfaction is characterized by the ratings. The experiment consisted of five trials for each condition (including the control condition with no hint), adding up to 40 searches per participant. The 40 targets were the same for all participants, but the visualization types were assigned randomly to the objects to rule out an influence of the object location on visualization performance. During the experiment, the subject was left alone in the test room to exclude outside influences. To recognize search errors, video cameras inside the test room provided live images of the starting point and each target region. These video images were also recorded in case individual searches needed to be reviewed during data analysis.
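Such a randomized assignment could look like the following sketch, reusing the hypothetical `conditions` list from the design sketch above.

```python
import random

def assign_conditions(targets, conditions, trials=5, seed=None):
    """Randomly map each target to one condition (five targets per condition),
    so that object location cannot systematically favor one visualization."""
    rng = random.Random(seed)
    pool = conditions * trials          # 8 conditions x 5 trials = 40 slots
    rng.shuffle(pool)
    return dict(zip(targets, pool))

# One fresh assignment per participant; the target names are placeholders.
assignment = assign_conditions([f"object_{i}" for i in range(40)], conditions)
```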
Fig. 5. Setting of the test environment. Numbers 1 to 3 show the three target regions (corners with min. 2 shelves; arrows show the open side of each shelf).

The open sides of the shelves in the remaining three quadrants could not be seen from the starting point. They made up the target regions 1 to 3. To control the influence of confounding variables due to the test setup, the distance from the starting point to each of the target regions was kept equal. The possible navigation paths are represented by the green lines in Figure 5. Moreover, the distribution and type of the target objects in each region were randomized in vertical and horizontal position. After the experiment, participants were asked to grade the helpfulness, exactness, comprehensibility and agreeable appearance of each visualization of the navigational information on a scale from 1 (best) to 5 (worst). Moreover, the questionnaire collected information about age, gender, education, profession and navigational skills of the test subjects.
3.2 Results

The first step in analyzing the experimental data was to compare the times and error protocols to discover possible measurement errors. For extraordinarily long or short search times with no error recording, the videos were consulted to find out whether there had been a mistake by the supervisor or whether the irregularity could be explained by the subject's behavior. Time values affected by supervisor mistakes were replaced with the average search time of all participants for that particular trial and search condition. The ratings also had to be preprocessed because there were some missing values. These gaps were filled by inserting the appropriate average of all valid ratings for the respective condition. Additionally, a value for the overall rating was obtained by averaging the helpfulness, exactness, comprehensibility and agreeable appearance ratings for each navigational aid. Since there are significant positive correlations between all individual ratings of each navigational aid, this overall rating was used to measure user satisfaction in the subsequent analysis.

The different factors of usability were first evaluated separately in analyses of variance (ANOVAs), one each for times and ratings. The level of significance for all cases lies at 5% (p = 0.05). The analysis of the error rate has not yet been completed, but several tendencies can already be perceived in the two other dependent variables. The effect of gender was taken into account, but no significant difference in usability measures between men and women could be found. The other between-subject factors such as age and orientation abilities could not be included in the ANOVAs because they could not be divided into sub-groups of equal sizes.

Search Times. In the analysis of the search times, a strong learning effect was found, which manifests in considerably shorter times for the repeated trials of each tested condition. It was therefore decided to regard the first trial of each condition as a training phase and to include only the four remaining searches in the analysis. Comparing the mean search times for the different search conditions (Figure 6), users perform significantly better (F = 14.682; df = 4.347; p < .001; η² = .279) when provided with a directional cue than with no cue (control condition). Moreover, the mean time of all graphical navigation aids (9164 ms) is significantly shorter than that of the textual cue (12554 ms).
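The preprocessing described above can be sketched in pandas as follows; the file name and column names are assumptions, since the raw protocols are not part of the paper.

```python
import pandas as pd

df = pd.read_csv("search_times.csv")  # participant, condition, trial, time_ms, ...

# 1) Times recorded with a supervisor mistake are replaced by the mean of all
#    participants for that condition and trial.
df["time_ms"] = df["time_ms"].mask(df["supervisor_mistake"])
df["time_ms"] = df["time_ms"].fillna(
    df.groupby(["condition", "trial"])["time_ms"].transform("mean"))

# 2) Treat the first trial of each condition as a training phase and drop it.
df = df[df["trial"] > 1]

# 3) Impute missing ratings with the condition mean, then average the four
#    scales into one overall satisfaction rating.
rating_cols = ["helpfulness", "exactness", "comprehensibility", "appearance"]
for c in rating_cols:
    df[c] = df[c].fillna(df.groupby("condition")[c].transform("mean"))
df["overall_rating"] = df[rating_cols].mean(axis=1)
```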
Fig. 6. Comparison of search times in the control condition and with textual and graphical navigation aids.

The factor visualization type does not have a significant effect on search time, as no significant difference can be found between the mean search times of the arrow, schematic map and textured map conditions. The effect of animation on search times was likewise non-significant; nor was there an interaction effect of the two factors.

User overall ratings. The one-way ANOVA including visualization type as the only factor shows no difference between text and graphics in the overall user ratings. There is, however, a significant difference within the graphical visualizations, which was further investigated in the two-way ANOVA. This analysis shows a significant effect of the factor animation (F = 5.941; df = 1; p = .020; η² = .135) and a tendency towards an interaction effect (F = 3.309; df = 1.506; p = .057; η² = .080). As expected (a well-known effect), animated navigation aids were rated better (1.95) than static visualizations (2.19). But the interaction diagram in Figure 7 shows that the preference for animated arrows is higher than for animated map visualizations.

Differentiated user ratings. The effects in the overall rating could not be found in all of the individual ratings contributing to the overall grade. Conducting one-way ANOVAs for all individual ratings, the exactness rating presented the clearest result, with several significant differences between the navigational aids (p = .010; η² = .119). As illustrated in Figure 8, the animated arrow was rated significantly better than the static version, the animated schematic map, and both textured map versions. This diagram also shows that text was not rated better than the graphical visualizations. A positive effect of animation could be found in a two-way ANOVA for the exactness and comprehensibility ratings. Furthermore, perceived exactness and agreeable appearance showed a significant interaction effect between animation and visualization, also evident in the overall rating (see Figure 8).
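The reported two-way repeated-measures ANOVA on the overall ratings could be reproduced along the lines of the following sketch; we use the third-party pingouin package here as one possible tool, which is an assumption rather than the authors' actual analysis software.

```python
import pingouin as pg  # third-party statistics package (an assumed choice)

# Build one mean overall rating per participant and cell from the `df` of the
# preprocessing sketch; 'visualization' and 'animation' are hypothetical
# factor columns covering the six graphical conditions.
cells = (df.dropna(subset=["visualization", "animation"])
           .groupby(["participant", "visualization", "animation"],
                    as_index=False)["overall_rating"].mean())

aov = pg.rm_anova(data=cells, dv="overall_rating",
                  within=["visualization", "animation"],
                  subject="participant", detailed=True)
print(aov)  # per-source F, degrees of freedom, p values and effect sizes
```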
Fig. 7. Interaction effect between visualization type and animation in overall user rating.
Fig. 8. Differences in perceived exactness.

4. CONCLUSION AND OUTLOOK

Our evaluation of AR navigation aids is based on the measurement of search times, which comprise several distinct stages (understanding, walking, etc.). Hence, the results are widely distributed and may not reveal very small differences in the comprehension of the AR information. But for our smart home application area this is not critical. The goal is to achieve the best possible acceptance and performance of the user for the whole search task.

In conclusion, we propose the following suggestions for AR-based indoor navigation support. The incorporated but non-AR-specific visualization parameters should follow 2D desktop presentation adaptations. This means graphical solutions are better than text, and animated images are processed better than static ones. If the acceptance of the user is less important than performance, there are no constraints on the use of more or less redundancy or realism. Otherwise, the navigation aid should be a redundant map, which additionally should avoid a realistic presentation of information.

In future work we focus on other parameters of AR visualization for different applications and data types that are relevant in a smart home setting. The goal is to incorporate our findings into an expert system to automate the decision for an optimal AR information presentation in the UARKit in real time.

ACKNOWLEDGEMENTS

We want to thank the DFG, the Department of Human-Machine Systems Berlin, and prometei for their support.

REFERENCES

Aragon, C.R. and Hearst, M.A. (2005). Improving aviation safety with information visualization: a flight simulation study. In CHI '05: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 441–450. ACM.
Atchley, P., Kramer, A.F., and Andersen, G.J. (1997). Spatial cuing in a stereoscopic display: Evidence for a "depth-aware" attentional focus. Psychonomic Bulletin and Review, 4, 524–529.
Baumgartner, T., Valko, L., Esslen, M., and Jäncke, L. (2006). Neural correlate of spatial presence in an
arousing and noninteractive virtual reality: An EEG and psychophysiology study. CyberPsychology & Behavior, 9(1), 30–45.
Bertin, J. (1983). Semiology of Graphics. University of Wisconsin Press, Madison, WI.
Bimber, O. and Raskar, R. (2005). Spatial Augmented Reality: Merging Real and Virtual Worlds. A K Peters, Ltd.
Crawford, J. and Neal, A. (2006). A review of the perceptual and cognitive issues associated with the use of head-up displays in commercial aviation. The International Journal of Aviation Psychology, 16(1), 1–19.
Drascic, D. and Milgram, P. (1996). Perceptual issues in augmented reality. In Stereoscopic Displays and Virtual Reality Systems III, Proceedings of SPIE, 2653, 123–134.
Fischer, E., Haines, R., and Price, T. (1980). NASA Technical Paper 1711, NASA Ames Research Center.
Furmanski, C., Azuma, R., and Daily, M. (2002). Augmented-reality visualizations guided by cognition: Perceptual heuristics for combining visible and obscured information. In ISMAR '02: Proceedings of the 1st International Symposium on Mixed and Augmented Reality, 215. IEEE Computer Society, Washington, DC, USA.
He, Z.J. and Nakayama, K. (1995). Visual attention to surfaces in three-dimensional space. Proceedings of the National Academy of Sciences of the United States of America, 92(24), 11155–11159.
Hwang, J., Jung, J., and Kim, G.J. (2006). Manipulation of field of view for hand-held virtual reality. In Advances in Artificial Reality and Tele-Existence, 1204–1211. LNCS, Springer, Berlin/Heidelberg.
Jeon, S. and Kim, G. (2008). Providing a wide field of view for effective interaction in desktop tangible augmented reality. 3–10.
Jurgens, V., Cockburn, A., and Billinghurst, M. (2006). Depth cues for augmented reality stakeout. In CHINZ '06: Proceedings of the 7th ACM SIGCHI New Zealand Chapter's International Conference on Computer-Human Interaction, 117–124. ACM, New York, NY, USA.
Laramee, R. and Ware, C. (2002). Rivalry and interference with a head-mounted display. ACM Trans. Comput.-Hum. Interact., 9(3), 238–251.
Pfitzner, D., Hobbs, V., and Powers, D. (2003). A unified taxonomic framework for information visualization. In APVis '03: Proceedings of the Asia-Pacific Symposium on Information Visualisation, 57–66. Australian Computer Society, Inc., Darlinghurst, Australia.
Steck, S.D. and Mallot, H.A. (2000). The role of global and local landmarks in virtual environment navigation. Presence: Teleoperators and Virtual Environments, 9(1), 69–83.
Swan, J.E. II, Livingston, M.A., Smallman, H.S., Brown, D., Baillot, Y., Gabbard, J.L., and Hix, D. (2006). A perceptual matching technique for depth judgments in optical, see-through augmented reality. In VR '06: Proceedings of the IEEE Conference on Virtual Reality, 19–26. IEEE Computer Society, Washington, DC, USA.
Walther-Franks, B. and Malaka, R. (2008). Evaluation of an augmented photograph-based pedestrian navigation system. In SG '08: Proceedings of the 9th International Symposium on Smart Graphics, 94–105. Springer-Verlag, Berlin, Heidelberg.
Weiser, M. (1991). The computer for the 21st century. Scientific American, 265(3), 66–75.