Interacting with Computers 23 (2011) 268–278
Evaluation of motion-based interaction for mobile devices: A case study on image browsing

Sunghoon Yim (a), Sungkil Lee (b), Seungmoon Choi (a,*)

(a) Haptics and Virtual Reality Laboratory, Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH), Hyoja-dong, Nam-gu, Pohang, Gyungbuk 790-784, Republic of Korea
(b) School of Information and Communication Engineering, Sungkyunkwan University, Suwon 440-746, Republic of Korea
* Corresponding author.
Article info
Article history: Received 28 May 2010; Received in revised form 5 April 2011; Accepted 17 April 2011; Available online 28 April 2011.
Keywords: Mobile device; Sensing-based interaction; Motion sensing; Image browsing

Abstract
This article evaluates the usability of motion sensing-based interaction on a mobile platform using image browsing as a representative task. Three types of interfaces, a physical button interface, a motion-sensing interface using a high-precision commercial 3D motion tracker, and a motion-sensing interface using an in-house low-cost 3D motion tracker, are compared in terms of task performance and subjective preference. Participants were provided with prolonged training over 20 days, in order to compensate for the participants' unfamiliarity with the motion-sensing interfaces. Experimental results showed that the participants' task performance and subjective preference for the two motion-sensing interfaces were initially low, but they rapidly improved with training and soon approached the level of the button interface. Furthermore, a recall test, which was conducted 4 weeks later, demonstrated that the usability gains were well retained in spite of the long time gap between uses. Overall, these findings highlight the potential of motion-based interaction as an intuitive interface for mobile devices.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction

Mobile devices, including cellular phones, PDAs, and portable media players, have become indispensable in our daily lives. In addition to their original purposes, they include a number of additional functions, e.g., music players, photo galleries, and daily planners. The compact physical size of mobile devices, however, imposes inherent limits on intuitive and effective interaction with classic user interfaces such as buttons or a touch pad. For instance, the reduced set of keys on a mobile device causes ambiguities that can lead to an increase in the keystrokes needed to perform a text entry task (MacKenzie and Soukoreff, 2002). The information displayed on a touch-sensitive screen is often obscured by the user's finger, hampering the user's ability to capture visual information and select desired menus promptly (Hinckley, 2002).
To improve mobile device user interfaces, sensing-based interaction is an attractive option and is considered a key research area. Recent sensors of high performance and small size make it feasible to identify the user's actions and surrounding environment, even on mobile platforms (Zhai and Bellotti, 2005). A motion-sensing interface, which is one type of sensing-based interaction, refers to an interaction device and scheme responding to the user's
motion and/or gestures. Motion-sensing interfaces allow natural and intuitive interaction with the mobile device and can also enhance device usability by enlarging its workspace (Yee, 2003; Mehra et al., 2006; Rohs et al., 2007). These benefits can account for the users' preference for motion-sensing interfaces when carrying out tasks such as image browsing or map navigation, as reported in previous studies (Cho et al., 2007; Rohs et al., 2007).
To date, the true viability of motion-sensing interfaces has not been clearly elucidated because of the relatively small number of relevant studies. Recently, a motion-sensing interface on a mobile device was shown to be inferior to a button interface in terms of task completion time (Cho et al., 2007), although the participants generally favored the motion-sensing interface. This study, however, did not reveal the underlying reasons behind the result. For example, the relatively low motion-sensing capability of a mobile device, including low accuracy and stability, can adversely affect task performance. The familiarity of users with an interface is another critical factor; the vast majority of users are already very experienced with button interfaces, but not with motion-sensing interfaces. The usability of a motion-sensing interface is contingent upon these factors, and they need to be adequately controlled for any fair comparison of task performance.
The study we report in this paper assesses the usability of motion-based interaction on a mobile platform in comparison with a physical button interface, concentrating on task performance. In the experimental design, we explicitly took into account the tracking performance of a mobile platform and user familiarity with an
interface as external factors to neutralize. The questions we had in mind were:
1. Can motion-based interaction replace button-based interaction?
2. Is the current technical performance of motion sensing high enough with regard to task performance?
3. To what extent can training improve task performance, and how long does it take?
We expect answers to the above questions to assist interface designers in building effective mobile interaction systems based on motion sensing.
As an experimental task, we adopted image browsing as it is a common function in most mobile devices. The growing need to efficiently manage large image databases fosters the use of a motion-sensing interface that enables continuous sensing and natural interaction. Our investigation used a custom-made motion-sensing interface that could measure three degree-of-freedom (DOF) rotations. The mapping from rotation angles to motion commands was hybrid; short-distance panning was position-controlled and long-distance scrolling was rate-controlled. To consider the effect of implementation fidelity, we employed another motion-sensing system using a cutting-edge commercial 3D tracker (IS900, Intersense Inc.), which is used for immersive virtual reality (VR) applications. We compared these motion-sensing interfaces with a conventional physical button interface. To compensate for the differences in interface familiarity, we trained participants for 23 days until their task performance reached saturation. We paid attention to task completion times using the three interfaces, before and after training, in addition to their evolution over the training period. Data for the subjective usability of each interface, such as ease of use, learnability, preference, naturalness, and fun, were also collected. Lastly, we conducted a recall test 4 weeks after the training in order to examine the degree of skill retention. Further details are presented in the rest of this paper.

2. Background and related work

The concept of a motion-sensing interface for mobile platforms was pioneered by Rekimoto (1996) using a 6-DOF magnetic tracker, and a number of research projects have followed since then. We present a concise review in this section.

2.1. Motion-sensing interfaces for mobile devices

On mobile platforms, motion-sensing interfaces have been used as a substitute for traditional buttons and a touch pad. Using several sensors, the movement of the device relative to the surrounding environment can be localized and tracked. In particular, a camera and an accelerometer are common standard components in recent mobile devices.
A camera is usually used to capture translational motion. This movement is generally mapped to the pointer displacement as an alternative to a mouse or a touch screen. Since most mobile devices have a single camera, the movement is typically tracked using feature-based algorithms such as optical flow and marker tracking (Haro et al., 2005; Wang et al., 2006; Hannuksela et al., 2007; Rohs et al., 2007; Rohs and Essl, 2007), instead of homography induced from multiple views. However, feature-based tracking can suffer from a lack of robust features or error accumulation (Barron et al., 1994), leading to the need for more sensors for accurate tracking. Marker-based tracking can offer better robustness through the use of predefined patterns, but this greatly sacrifices flexibility in usage scenarios.
An accelerometer is widely used to measure the rotation of a mobile device. Measuring 3-DOF acceleration including gravity acceleration allows estimating two absolute tilt angles along the axes perpendicular to the earth's gravity vector (Tuck, 2007). For GUIs (Graphical User Interfaces), the axial tilt angles can be mapped to scroll velocity (rate control) or directly to a cursor position (position control) (Crossan et al., 2008), for example to augment button-only devices. In addition, the velocity control of scrolling can be leveraged to enrich data presentation (e.g., speed-dependent automatic zooming) (Igarashi and Hinckley, 2000; Eslambolchilar and Murray-Smith, 2004; Jones et al., 2005).
Motion-sensing interfaces have been applied to many tasks, including menu selection (Rekimoto, 1996; Oakley and O'Modhrain, 2005; Wang et al., 2006; Oakley and Park, 2009), document scrolling (Rekimoto, 1996; Harrison et al., 1998; Eslambolchilar and Murray-Smith, 2008), image browsing (Bartlett, 2000; Haro et al., 2005; Cho et al., 2007), information navigation (Rohs et al., 2007; Rohs and Essl, 2007), and literal keyboard input (Partridge et al., 2002). Motion-sensing interfaces can also provide better usability by enlarging the workspace of a pen-based touch interface (Yee, 2003) or by aiding the selection of characters mapped onto one key (Wigdor and Balakrishnan, 2003).
In order to achieve robust 3-DOF rotation tracking, our motion-sensing interface utilizes both a camera and an accelerometer while compensating for their individual limitations. The 2-DOF rotation of a device is mapped to the displacement in an image space for precise position control, and the other 1-DOF rotation is mapped to the scrolling speed for faster image browsing, as detailed further in Section 3.

2.2. Comparative evaluations of motion-sensing interfaces for mobile devices

Compared with the variety of applications using motion-sensing technology, only a few studies have evaluated the usability of motion-sensing interfaces in comparison with conventional interfaces. However, these studies presented somewhat inconsistent evidence on the advantages of motion-sensing interfaces. For example, Wang et al. (2006) assessed task performance using their motion-sensing interface based on camera tracking (named TinyMotion). A pointing task using the interface resulted in relatively low performance in terms of Fitts's law (Fitts, 1954). In addition, their interface exhibited inferior task performance to a button interface for menu selection. They noted that a low camera sampling rate (12 fps) was largely responsible for the low task performance. More recently, Cho et al. (2007) designed a tilt-sensing-only motion interface for a mobile photo browser. Its cursor dynamics (the mapping from tilt angle to scroll velocity) was tuned as an adaptive rule. This interface was empirically compared with the button interface and the touch-sensitive interface of an iPod (Apple Inc.). The button interface showed the best task performance, followed by the tilt-sensing motion interface. In terms of subjective preference, the tilt-sensing interface obtained a very high score. On the other hand, Rohs et al. (2007) investigated the utility of a mobile device in map navigation. They compared the traditional panning interface controlled by a joystick-like keypad with two motion sensing-based interfaces, the magic lens (Bier et al., 1993) and the dynamic peephole (Yee, 2003). The motion sensing-based interfaces outperformed the panning interface. Hwang et al.
(2006) compared the task performance of a button interface with that of a motion sensing-based interface for 3D selection and navigation tasks in handheld VR. Their results suggested that the motion-sensing interface was more effective than the button interface for the 3D tasks. These two studies disclosed some quantitative benefits of motion-sensing interfaces, in contrast to the other two
previous studies (Wang et al., 2006; Cho et al., 2007), which leaned toward conventional interfaces.
The previous studies described above differed in many aspects, including experimental task and configuration, interface type, motion tracking performance, and the degree of familiarity and habituation. This makes it difficult to elicit the comparative advantages of motion-sensing interfaces from these studies, and this difficulty was the primary motivation for the present study.
3. Design and implementation of motion-sensing interface

Our motion-sensing interface aims at providing intuitive and natural interaction when the user browses images displayed on a mobile device.
Fig. 1. Layout for image browsing using a motion-sensing interface.
One of the efficient image layouts for this situation is wrapping images onto a 3D cylinder centered at the user (Fitzmaurice et al., 1993); images are placed using a "donut" metaphor around the user, as illustrated in Fig. 1. To accommodate a large number of images, images are stacked vertically in multiple layers within a range the user can easily reach. The radial position of an image is defined as θ + 2πn, where θ ∈ [0, 2π) is the display position and the integer n is synchronized with the user's number of horizontal rotations; if the user rotates n times in the body yaw direction (the sign of n determines the rotation direction), the user sees the image with position θ + 2πn displayed at angle θ. In theory, this representation allows an infinite number of images to be placed on the cylindrical layout. In practice, however, the number of images should be limited to take into account the user's comfort.
The user moves a mobile device to browse images, and measured device movements are mapped to the displacement of a position cursor in the image space. In general, two kinds of mappings are used. In position control, the device movement is directly mapped to the cursor displacement. This straightforward relation has been used in the majority of cursor positioning interfaces, such as the mouse, the touch-sensitive screen, and the dynamic peephole on a planar surface (Yee, 2003). However, to increase its footprint, position control often requires the use of clutching techniques (Hinckley et al., 1994; Hinckley, 2002), where the user resets the cursor position at the boundary of a physical workspace. An alternative is rate control, where the device movement (e.g., tilt angle) determines the cursor movement velocity (Harrison et al., 1998; Bartlett, 2000; Eslambolchilar and Murray-Smith, 2008; Cho et al., 2007). While rate control is free from the clutching issue, overshoot is a common problem and precise position control is difficult (Poupyrev et al., 2002). In our motion-sensing interfaces, we use a hybrid scheme combining position and rate control; a local view is accurately controlled using position control, while a rapid movement to another view (i.e., scrolling) is facilitated using rate control. Fig. 2 illustrates this control scheme.
Fig. 2. Mapping between the user’s motion in the physical space and browsing operations in the image space. The yaw and pitch rotations are translated into horizontal and vertical movements, respectively, for position control. The roll angle is translated into horizontal movements for rate control.
For implementation, we need a device capable of 3-DOF rotation sensing: 2 DOF for horizontal/vertical panning (position control) and 1 DOF for horizontal scrolling (rate control). Our motion-sensing interface obtains the 3-DOF rotations using a camera (Logitech Quickcam for Notebooks) and a 3-axis linear accelerometer (MMA7620, Freescale Inc.). The measurement of the 2-DOF rotation using the accelerometer is straightforward (Tuck, 2007). For vision sensing, we use the optical flow of Lucas-Kanade distinct features (Jianbo and Tomasi, 1994; Bouguet, 1999). The optical flow algorithm can track the relative motion of the camera without predefined markers; therefore, it is suitable for a mobile, self-contained tracking system. We regard the displacements of the optical flow as pitch and yaw rotations, since the mobile device motion is pivoted at the user's torso in our image layout. Note that the optical flow algorithm, which uses a single view, is unable to accurately distinguish rotations from translational movements. Thus, our implementation also interprets translational movements as rotations.
We defined a proportional mapping from the sensor data to the cursor movements in the image space as follows. In contrast to position control, rate control requires an absolute degree of rotation. Hence, we first map the tilt angle (roll) measured by the accelerometer to the horizontal scroll velocity. For position control, the horizontal rotation (yaw) estimated using the camera is mapped to the horizontal panning in the image space. For the pitch, both the camera and the accelerometer can track vertical movements. In a pilot study, we examined whether a combination or a single sensor was better; the users preferred using the camera only, as it could produce smooth responses to both device translation and rotation. As a result, we map the pitch, which is estimated from the camera, to the vertical panning in the image space. Table 1 summarizes these mapping rules.
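To make the mapping concrete, the following is a minimal per-frame update sketch in C++; it is not the authors' implementation. It assumes that the optical-flow displacement (in pixels) and the accelerometer readings are already available, and all gain values, the dead zone, the axis conventions, and the pixel-based layout arithmetic are illustrative assumptions.

// Sketch of the hybrid position/rate mapping of Table 1 (illustrative only).
// Assumptions: flowDx/flowDy are the average optical-flow displacements per
// frame (surrogates for yaw/pitch), ay/az are accelerometer readings used to
// estimate roll, and dt is the frame period in seconds.
#include <cmath>

struct Cursor { double x = 0.0, y = 0.0; };  // cursor position in the image space (pixels)

// Roll estimated from gravity; one common convention (axis signs are assumed).
double rollFromAccel(double ay, double az) { return std::atan2(ay, az); }

void updateCursor(Cursor& c, double flowDx, double flowDy,
                  double ay, double az, double dt) {
    const double kPan     = 0.5;    // image-space pixels per pixel of flow (assumed gain)
    const double kScroll  = 800.0;  // scroll speed in pixels/s per radian of roll (assumed gain)
    const double deadZone = 0.05;   // radians; ignore small unintended tilts (assumed)

    // Position control: camera-estimated yaw/pitch move the cursor directly.
    c.x += kPan * flowDx;            // horizontal panning (yaw)
    c.y += kPan * flowDy;            // vertical panning (pitch)

    // Rate control: the absolute roll angle sets the horizontal scroll velocity.
    const double roll = rollFromAccel(ay, az);
    if (std::fabs(roll) > deadZone)
        c.x += kScroll * roll * dt;  // long-distance scrolling

    // Cylindrical ("donut") layout: if the horizontal unit is pixels (an
    // assumption), 24 columns of 200-pixel images give a circumference of
    // 24 * 200 = 4800 pixels. The display angle is then
    // theta = 2*pi * fmod(c.x, 4800) / 4800, and the number of completed
    // horizontal turns n = floor(c.x / 4800) selects which image (radial
    // position theta + 2*pi*n) appears at that angle.
}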
4. Evaluation methods

We designed and performed a nearly two-month-long user experiment to quantitatively compare the motion-sensing interfaces and a button interface with respect to task performance in image browsing on a mobile platform. The experiment consisted of two parts: training and a recall test. The participants were trained until their performance improvements became saturated over a period of prolonged practice (23 days). In the recall test conducted 4 weeks later, we examined the degree of the participants' skill retention. Several important items of subjective usability were also collected. We describe the methods and procedures used for the evaluation in this section and present the experimental results in the next section.
4.1. Participants

Eighteen paid graduate and undergraduate students (15 males and 3 females) who had normal or corrected-to-normal vision took part in the experiment. The participants' demographic information is summarized in Table 2, where the participants are grouped by interface type (a between-subject factor of the experiment; see Section 4.4 for details). Their ages ranged from 20 to 27 years, with an average of 23.6 years. All participants owned mobile phones. We selected young participants since they are usually more willing to learn new functions on their mobile phones. Most participants had prior experience in managing images using a button interface on a mobile phone or a PDA, but no participant had experience with a motion-sensing interface. Six participants reported that they had been managing more than 50 images in their mobile phones; these participants were evenly distributed across the three participant groups for balancing.

4.2. Apparatus

Visual stimuli were presented to participants at an 800 × 600 display resolution using an ultra mobile PC (UMPC; Sony VAIO VGN U71P; 1.1 GHz CPU, 512 MB RAM) with a 5-in. LCD display (see Fig. 3). The UMPC was chosen to guarantee a sufficient sampling rate (e.g., 30 fps) for the camera, since a low sampling rate was the most important factor causing low task performance in previous research (Wang et al., 2006); such a high sampling rate was not easily achievable on a cellular phone or a PDA. To minimize the effect of the relatively heavy weight of the UMPC (about 600 g) on task performance, we designed each daily session to be shorter than 30 min. A program for the experiment was written in C++ with OpenGL on Microsoft Windows XP.
Three types of interfaces were used in the experiment: (1) a mobile button interface (BI), (2) the mobile motion-sensing interface (MMI) detailed in Section 3, and (3) an immobile motion-sensing interface (IMI) using a high-end commercial tracker. Fig. 3 shows the sensors used for each interface, and Fig. 4 depicts the UMPC button layout used for the interfaces. A selection button was used for all three interfaces. BI used four built-in buttons of the UMPC for left/right/up/down movements. When the participant pressed a directional button, the cursor moved discretely to the center of the corresponding adjacent image; holding a button down repeated this movement every 1/20 s. This type of discrete browsing is standard and effective for button-based interfaces because images on mobile devices are usually evenly spaced.
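As an illustration of this discrete browsing scheme, the following is a minimal sketch, not the authors' code, of cursor stepping with auto-repeat at the 20-Hz rate listed in Table 3; the horizontal wrap-around of the columns and the clamping of the rows are assumptions based on the cylindrical layout of Section 3.

// Illustrative sketch of BI's discrete navigation (not the authors' code).
enum class Dir { Left, Right, Up, Down };

struct GridCursor {
    int col = 0, row = 0;
    static constexpr int kCols = 24, kRows = 3;   // 24 columns x 3 levels of images
};

// One discrete step to the adjacent image. Columns wrap around the cylinder
// (assumption); rows are clamped to the three levels (assumption).
void step(GridCursor& g, Dir d) {
    switch (d) {
    case Dir::Left:  g.col = (g.col + GridCursor::kCols - 1) % GridCursor::kCols; break;
    case Dir::Right: g.col = (g.col + 1) % GridCursor::kCols;                     break;
    case Dir::Up:    if (g.row > 0)                     --g.row;                   break;
    case Dir::Down:  if (g.row < GridCursor::kRows - 1) ++g.row;                  break;
    }
}

// Called every 1/20 s while a directional button is held down.
void onRepeatTick(GridCursor& g, Dir held) { step(g, held); }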
Table 1. Mapping from input motion to sensor data and image browsing operations (see also Fig. 2).

Sensor        | Input motion                | Mapped motion in sensor | Movement in image space
Camera        | Pitch and vertical movement | Pitch                   | Vertical panning (position control)
Camera        | Yaw and horizontal movement | Yaw                     | Horizontal panning (position control)
Accelerometer | Roll                        | Roll                    | Horizontal scrolling (rate control)
Accelerometer | Pitch                       | Not used                | –
Table 2. Demographic information of the participants.

Interface | Number     | Gender              | Age               | Persons managing >50 images in a phone
BI        | 6 persons  | 5 males, 1 female   | 21–27 (avg. 23.5) | 2 persons
MMI       | 6 persons  | 5 males, 1 female   | 20–26 (avg. 23.5) | 2 persons
IMI       | 6 persons  | 5 males, 1 female   | 22–26 (avg. 23.8) | 2 persons
Total     | 18 persons | 15 males, 3 females | 20–27 (avg. 23.6) | 6 persons
Fig. 3. Devices and sensors used in the experiment: (a) UMPC, (b) mobile motion-sensing interface (MMI) with its camera and accelerometer, (c) immobile motion-sensing interface (IMI), and (d) transmitter of IMI.
Fig. 4. Mapping of buttons to different functions for (a) BI (selection and up/down/left/right buttons around the screen) and (b) MMI and IMI (selection and reset buttons).
IMI used a high-precision 6-DOF tracker (IS900, Intersense Inc.) instead of our in-house motion sensors. This 6-DOF tracker uses ultrasonic and inertial sensors. Its precision at a static pose is 2–3 mm for translations and 0.5–1° for rotations. The update rate of this tracker was set to 60 Hz. A tracking receiver was attached to the UMPC, and its position and orientation were received via wireless LAN from a tracker server (see the bottom row of Fig. 3). The 6-DOF motion data from IMI were transformed and tuned to the best of our ability in order to make its motion response similar to the response of MMI. For this, translations in the horizontal and vertical directions were converted to pitch and yaw angles by a linear mapping with carefully tuned gains. These pitch and yaw angles were compared with the pitch and yaw angles acquired directly from IMI, and the larger values were used for image browsing. Regarding the roll, only the roll angle measured by IMI was used. This procedure made the responses of IMI almost identical to those of MMI. The interface characteristics are summarized in Table 3.
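The conversion just described can be summarized in a short sketch. This is not the authors' code: the gain values are illustrative assumptions, and "larger" is interpreted here as larger in magnitude.

// Illustrative sketch of reducing IMI's 6-DOF pose to MMI-like rotation input.
#include <cmath>

struct Pose6DOF {                    // pose reported by the IS900 tracker
    double x, y, z;                  // translation (m)
    double yaw, pitch, roll;         // orientation (rad)
};

struct MotionInput { double yaw, pitch, roll; };  // rotations consumed by the browser

MotionInput imiToMotionInput(const Pose6DOF& p) {
    const double kYawPerMeter   = 2.0;   // pseudo-yaw (rad) per meter of horizontal translation (assumed)
    const double kPitchPerMeter = 2.0;   // pseudo-pitch (rad) per meter of vertical translation (assumed)

    const double yawFromTranslation   = kYawPerMeter   * p.x;
    const double pitchFromTranslation = kPitchPerMeter * p.y;

    MotionInput m;
    // Use whichever signal is larger in magnitude, keeping its sign.
    m.yaw   = (std::fabs(yawFromTranslation)   > std::fabs(p.yaw))   ? yawFromTranslation   : p.yaw;
    m.pitch = (std::fabs(pitchFromTranslation) > std::fabs(p.pitch)) ? pitchFromTranslation : p.pitch;
    m.roll  = p.roll;                 // roll is taken directly from the tracker
    return m;
}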
Table 3. Summary of the interfaces used in the experiment.

Interface | Sensors                            | Input motion              | Update rate (Hz)
BI        | Physical buttons (Fig. 3a)         | Pushing buttons (Fig. 4a) | 20
MMI       | Camera and accelerometer (Fig. 3b) | Device rotation (Fig. 2)  | 30
IMI       | IS900 commercial tracker (Fig. 3c) | Device rotation (Fig. 2)  | 60
4.3. Stimuli

Seventy-two (24 columns × 3 levels) images were randomly allocated in a cylindrical shape, as described in Fig. 1. The images were related to food, tableware, and flatware. As shown in Fig. 5, a 3 × 3 tile of images was displayed on the screen at a time. The ID of each image was specified at the bottom of the image, and each image including its ID was laid in a 200 × 200 pixel rectangle (26 × 26 mm²). The thumbnail of the target image to be found was placed in the top right corner.
Fig. 5. Visual stimuli presented to the participant: (a) a screenshot and (b) an illustration of the configuration (translucent mini map with cursors, thumbnail of the target, and translucent cross-hair).
The cursors for the current view and the target image were translucently represented on a mini map. An additional cross-hair was fixed at the screen center to help participants quickly perceive the center image.

4.4. Design

The experiment used a two-factor mixed design. The between-subject factor was the interface type with three levels (BI, MMI, and IMI), and the within-subject factor was the training period with 24 levels (23 days of training and 1 day for a recall test). During the training period, performance was measured on each day. A 4-week intermission followed, and then the recall test was performed as a one-day session.
Six participants were assigned to each interface type. They practiced two tasks repeatedly using the assigned interface. The two tasks were distinguished by whether or not the location of a target image was disclosed to the participant. The task with a known location induced the participant to focus on handling the interface itself with reduced mental load. To carry out the other task, with an unknown location, the participant had to compare the target and current images while navigating inside the image space. We denote the former task as Locate and the latter as Search. Actual image browsing also includes more general tasks (e.g., property-oriented searches); we chose these two tasks because they allow clear definitions of task completion.
The participants were trained until their performance improvement reached a plateau. Based on our pilot study, we selected 23 days as an adequate training period. During training, we examined the task performance of each participant on a daily basis and confirmed that all participants exhibited clear saturation patterns. A recall test was conducted exactly 4 weeks after the last (23rd) training day.
On each day, the two tasks were alternated three times (16 trials each). For counterbalancing, half of the participants performed task Locate before task Search, and the others performed the tasks in the reverse order (48 trials per task and day). Task performance was measured in terms of the task completion time, averaged over the 48 trials. In addition to task performance, we measured subjective usability ratings: ease of use, ease of learning, preference, intuitiveness, naturalness, and fun to use. Each usability item was evaluated on a 7-point Likert scale. Table 4 shows the questionnaire used in the experiment (translated from the original written in the authors' mother language, Korean).

4.5. Procedure

Each trial proceeded as follows. For task Locate, a target image was first shown to the participant.
Table 4. Questionnaire used to rate the subjective usability.

No. | Item             | Question                    | Scale
1   | Ease of use      | How easy was it to use?     | 1 ... 4 ... 7
2   | Ease of learning | How easy was it to learn?   | 1 ... 4 ... 7
3   | Preference       | How much do you prefer it?  | 1 ... 4 ... 7
4   | Naturalness      | How natural was it to use?  | 1 ... 4 ... 7
5   | Fun              | How fun was it to use?      | 1 ... 4 ... 7
After memorizing the target image, the participant pressed the selection button, and then the position of the target image was displayed on a thumbnail map. When the participant was ready, s/he pressed the selection button again to begin the search. The initial view was set to the leftmost position of the whole image set. The participant navigated inside the image space to find the target image using the assigned interface (BI, MMI, or IMI). The participant was instructed to press the selection button again as soon as s/he found the target image, which ended the trial. The time taken between the last two selection button presses was recorded as the task completion time. During the search, the up button was used to reposition the screen view to the initial view in MMI and IMI. Fig. 6 illustrates these sequential steps. For task Search, all the steps were the same except that the thumbnail map was not displayed on the screen and the target image location was not shown on the translucent mini map.
We collected the usability ratings only on the first and last days of training and after the recall test. On these 3 days, the participants were asked to rate each usability item on the printed questionnaire. The emphasis was on observing how the ratings had changed after prolonged training and after a long recess.

5. Evaluation results and discussion

The results of the usability experiment are presented in this section, along with a discussion.

5.1. Task performance

Average task completion times and standard errors for the two factors, Interface and Day, are summarized in Fig. 7. Only the data of the first and last training days and the recall test are shown here; full data over the entire training period are detailed in Section 5.2. For Interface, the task completion time of BI was shorter than those of MMI and IMI for both tasks, Locate and Search.
Fig. 6. Experiment procedures. (a) A target image is presented to a participant. (b) The location of the target image is shown in the mini map for task Locate, or a blank screen is shown for task Search. (c) The task begins with an initial view.
Fig. 7. Average task completion times (s) for each interface and each task, measured on the first and last training days (days 1 and 23) and in the recall test (day 51); the vertical error bars indicate standard errors. (a) Locate: BI/MMI/IMI = 2.57/4.54/4.90 s on day 1, 1.57/2.43/2.22 s on day 23, and 1.63/2.46/2.32 s on day 51. (b) Search: BI/MMI/IMI = 6.84/7.95/8.93 s on day 1, 3.13/4.16/3.69 s on day 23, and 3.37/4.48/3.75 s on day 51.
The time difference between MMI and IMI appeared marginal for task Locate, implying that the implementation fidelities of the two motion-sensing interfaces were comparable for this task. In contrast, for task Search, IMI seemed to have better performance than MMI. For Day, the results of all the interfaces exhibited remarkable differences between the 1st day and the 23rd day (before and after training), while the performance gains were well retained even after the 28-day intermission (the 51st day).
We performed a two-factor mixed-model ANOVA to understand the statistical significance of the above results. The number of participants per cell of the between-subject factor Interface was inevitably small (six) because of the long experimental period. Thus, we first tested the normality of our data with the Shapiro–Wilk W test (Shapiro and Wilk, 1965). All the measurements regarding the interfaces, days, and tasks were normally distributed (p > 0.05)
without transformation, except the single cell of task Locate, MMI, and day 1 (W = 0.7433, p = 0.0171). Therefore, we performed a natural log-transformation of the measurements to improve normality. All of the transformed measurements passed the normality test, and they were used for the ANOVA.
With respect to task Locate, both main effects of Interface and Day were statistically significant at the 0.05 significance level (F(2, 15) = 30.17, p < 0.0001 and F(2, 15) = 69.37, p < 0.0001), and no interaction was observed between the two main effects. The differences between the individual levels of Interface (BI, MMI, and IMI) and Day (days 1, 23, and 51) were examined using Tukey's HSD test. Day 23 and day 51 were in the same statistical group, and day 1 differed statistically from the others. This indicates that the training from day 1 to day 23 improved the task performance, and it is also evident that the skills learned from the training persisted over the 4 weeks. The two motion-sensing interfaces formed a single statistical group, apart from the other group of BI. One noteworthy observation is that, in spite of the considerable performance improvements of the motion group, it still exhibited task completion times slower than BI by 0.65 s in the recall test.
In terms of task Search, both main effects of Interface and Day were statistically significant (F(2, 15) = 5.88, p = 0.0130 and F(1, 15) = 175.11, p < 0.0001, respectively), and again no interaction effect was observed. Tukey's HSD test showed that the statistical groupings were exactly the same as those of task Locate, implying that the overall results of task Search had tendencies similar to those of task Locate.

5.2. Non-linear regression with the exponential function

To further analyze the trends of task performance improvement, we looked into the data of each interface with respect to the training period. The performance curves commonly exhibited a nonlinear asymptotic pattern; the task completion times decreased with training and then saturated to some extent, as shown in Fig. 8. To model this behavior, we used the exponential law of practice (Heathcote et al., 2000). Compared with the common power law of practice (Card et al., 1978), this exponential model is better suited to cases that involve pre-experimental practice; in our experiment, the participants were already experienced with the button interface. The following exponential function was used for regression:
E(T_N) = A + B e^(−a(N−1)),    (1)
where T_N is a random variable representing the task completion time on training day N, A is the task completion time at saturation, and B and a characterize the training pattern.
Fig. 8. Exponential regression of task performance over training for (a) task Locate and (b) task Search. The vertical bars indicate standard errors. The horizontal and vertical dotted lines represent the days and task completion times for 5% saturation, respectively.
B incorporates the effect of prior practice, and a determines the learning rate (Heathcote et al., 2000). We considered 5% from the asymptotic limit (A + 0.05B) as a level of reasonable saturation, which leads to a saturation point N_sp = 1 − ln(0.05)/a.
Model fitting results are provided in Fig. 8. For task Locate, the task completion time decreased more steeply for the motion-sensing interfaces (MMI and IMI) than for the button interface (BI). The a values for BI, MMI, and IMI were 0.26, 0.50, and 0.55, respectively (a higher a means a steeper decrease), and the r² values for the aggregate averages of the six participants were 0.94, 0.92, and 0.97, respectively. The two motion-sensing interfaces showed a similar training tendency in terms of task performance: initially, their task completion times were about twice that of BI, but they improved more quickly. The participants were already experienced with the button-based interface, and this seems to have caused the slower performance convergence of BI. Once task performance improvement saturated at about day 10, the performance differences between the interfaces were maintained during the rest of training. The N_sp values for BI, MMI, and IMI were 12.52, 6.99, and 6.45 days, respectively. These results suggest that the participants were sufficiently trained and that the task performance improvements were well saturated.
For task Search, the task completion times showed the same pattern. The only noticeable difference was that task Search took longer than task Locate, which was expected. This appears to be responsible for the faster performance improvements of all the interfaces in task Search than in task Locate. For BI, MMI, and IMI, the respective values were a = 0.42, 0.35, and 0.62, r² = 0.96, 0.90, and 0.98, and N_sp = 8.13, 9.56, and 5.83 days. The trends were maintained afterward, as with task Locate.
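As a quick arithmetic check of the saturation-point formula (this check is not part of the original article), substituting the reported learning rates for task Locate into N_sp = 1 − ln(0.05)/a, with −ln(0.05) ≈ 2.996, gives

  BI:  N_sp = 1 + 2.996/0.26 ≈ 12.52 days
  MMI: N_sp = 1 + 2.996/0.50 ≈ 6.99 days
  IMI: N_sp = 1 + 2.996/0.55 ≈ 6.45 days

which reproduces the N_sp values quoted above.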
5.3. Fitts' law evaluation

The experimental results showed that the task completion time significantly decreased with repeated training. However, the average task completion time quantifies the performance of an input system in a task-dependent manner. Although a target is displayed through the device, our tasks can be modeled using Fitts's law (Fitts, 1954) because they are analogous to target pointing tasks (Rohs and Oulasvirta, 2008). Fitts's law is a standard tool for studying the pointing performance of computer interfaces and allows cross-device comparisons (MacKenzie, 1992). In our case, Fitts's law cannot be applied to BI, as it used discrete steps for navigation. In addition, task Search required additional mental load for searching and matching. Thus, we used only the data of MMI and IMI for task Locate in this analysis. The basic Fitts's law model expresses the movement time MT as:
MT = a + b · ID,    (2)
ID = log2(D/W + 1),    (3)
where ID is the index of difficulty, D is the distance from an origin to a target, and W is the minimum width of the target. The coefficients a and b characterize the performance of an interface and are useful for comparing interfaces.
Data for this analysis were taken from the main experiment. W was constant (200 pixels), and as D varied from trial to trial, ID depended solely on the D values. Each trial corresponded to a target pointing task for a randomly chosen D. Note that our experiment used 2D coordinates to represent a target position, and this may require using 2D Fitts's law in a stricter analysis (MacKenzie and Buxton, 1992). In most cases, however, the vertical movement distance was rather negligible compared to the horizontal movement distance. Therefore, we regarded D as the horizontal movement distance in the trial and applied 1D Fitts's law.
The data of each day had 288 observations (6 subjects × 48 trials). The number of wrong target selections was less than 10 trials (3%) for all the interfaces. Some data had ID values lower than 1 (i.e., the target was in the initial view), and outliers lying outside two standard deviations also existed. All these instances were excluded from the analysis, which resulted in 230 points on average for each interface. This data set was further divided into 23 groups by ID value (1.58–4.52 bits).
Analysis results are shown in Fig. 9 along with the a and b parameters of Fitts's law. The index of performance IP (= 1/b) (Fitts, 1954; Zhai, 2004) measures the performance of an interface. Overall, the IPs of MMI and IMI were 1.03 and 0.88 bits/s on day 1, 1.60 and 1.77 bits/s on day 23, and 1.74 and 2.10 bits/s on day 51, respectively. The IP also improved significantly after training and stayed at a similar level in the recall test.

5.4. Subjective usability

We also recorded the usability ratings of each interface three times (on the first and last days of training and in the recall test), so as to observe how the subjective evaluations changed after training. Average ratings are summarized in Fig. 10, where the vertical error bars represent standard errors. Overall, on the first day, BI surpassed the motion-sensing interfaces in all aspects except for fun, whereas MMI and IMI resulted in similar scores.
Fig. 9. Fitts's law results of the motion-sensing interfaces: (a) MMI and (b) IMI. The fitted models were MT = 0.700 + 0.972·ID (R² = 0.77), MT = 0.313 + 0.624·ID (R² = 0.91), and MT = 0.555 + 0.574·ID (R² = 0.82) for MMI on days 1, 23, and 51, respectively, and MT = 0.792 + 1.14·ID (R² = 0.63), MT = 0.318 + 0.564·ID (R² = 0.88), and MT = 0.678 + 0.476·ID (R² = 0.85) for IMI on days 1, 23, and 51, respectively.
After the training, the usability ratings of all the interfaces became quite similar, except for fun. Specifically, ease of use and ease of learning for the motion-sensing interfaces improved considerably, approaching those of BI.
These trends were well reflected in the statistical analysis, where the Kruskal–Wallis non-parametric test (Kruskal and Wallis, 1952; Montgomery and Montgomery, 1997) was applied to the data of each day. Statistical significance was observed in ease of use (χ²(0.05,2) = 9.4570, p = 0.0088), ease of learning (χ²(0.05,2) = 10.7164, p = 0.0047), and fun (χ²(0.05,2) = 11.9407, p = 0.0026) on day 1, in only fun (χ²(0.05,2) = 6.9195, p = 0.0314) on day 23, and in ease of use (χ²(0.05,2) = 9.2083, p = 0.0100) and fun (χ²(0.05,2) = 7.9514, p = 0.0188) on day 51. These results indicate that, at first glance, the participants considered the motion-sensing interfaces to be conceptually natural and more interesting than the button interface, but they immediately realized that the motion-sensing interfaces were not easy to use and learn. This explains the low preference scores of the motion-sensing interfaces on day 1. However, the motion-sensing interfaces became sufficiently familiar to the participants after the prolonged training, and they finally gained preference scores comparable to that of the button interface.

5.5. Discussion

The experimental results showed that the task performance of each of the two motion-sensing interfaces was initially much inferior to that of the button interface but approached the same level once sufficient training was provided. After task performance saturation, an approximately one-second difference in the task completion time still existed between the button and motion-sensing interfaces for both tasks.
Fig. 10. Average ratings of subjective usability (ease of use, ease of learning, preference, naturalness, intuitiveness, and fun) for each interface, measured on (a) the first and (b) the last days of training (days 1 and 23) and (c) in the recall test (day 51). The error bars indicate standard errors.
Whereas this gap is not negligible in daily tasks, it is notable that the differences at the initial phase were greatly reduced with training. To our knowledge, this is the first demonstration of the advantage of prolonged training for task performance on a motion-sensing interface. The training period necessary for canceling the differences in pre-experimental experience and acquiring similar task skill levels turned out to be relatively short; in our experiment, it took only 7–10 days of 15–30 min daily training. This suggests that motion-sensing interfaces may be a viable solution for a wide range of interaction tasks when such training is available. We note that the participants of this study were young, and an appropriate training period may depend on the user's age.
Task Search demonstrated greater task performance improvements than task Locate for all the interfaces. To perform task Locate, the participant tried to move the cursor as fast as s/he could, and the movement pattern was similar to a simple point-to-point movement. On the other hand, the prevalent strategy for task Search was to maintain an appropriate velocity level to recognize a target image during navigation. Therefore, task Search was mentally more demanding, and its overall performance was lower than that of task Locate.
In the recall test, the motor skills for task Locate acquired through prolonged daily use were shown to be well retained even after a 4-week rest period. This is consistent with previous studies on visual search and motor skills in older people (Ball et al., 1988; Smith et al., 2005). For task Search, which requires more cognitive skills, the recall performance was slightly degraded after the rest period. This suggests that whereas the interaction scheme of a motion-sensing device itself can be learned easily and retained, learning it together with a more complex mental strategy may require more practice.
The input performance of our motion-sensing interfaces was examined using Fitts's law, and these results can be compared with other motion-sensing interfaces: the TinyMotion in Wang et al. (2006) and the dynamic peephole in Yee (2003). As the evaluations of the existing interfaces did not include prolonged training, we used the first-day training data for comparison. On training day 1, (a, b) of the Fitts's law model was (0.700, 0.972) for MMI and (0.792, 1.14) for IMI. These values are similar to (0.458, 1.112) of the TinyMotion (Wang et al., 2006), indicating that precise selection was not easy with these interfaces. When compared to (0.636, 0.881) of the dynamic peephole (originally proposed in Yee (2003) and implemented and analyzed in Rohs and Oulasvirta (2008)), our interfaces exhibited inferior performance. This might be due to the use of marker-based tracking in Rohs and Oulasvirta (2008), in contrast to the markerless tracking of our interfaces. Note that after training, (a, b) improved significantly to (0.313, 0.624) for MMI and (0.318, 0.564) for IMI.
Another noteworthy result was the similar scores of the two motion-sensing interfaces (MMI and IMI) in both the task completion time and the Fitts's law analysis, despite the fact that IMI had better accuracy and a higher frame rate (60 fps) than MMI (30 fps). This finding implies that motion tracking fidelity ceases to contribute to improving task performance and/or usability once the fidelity exceeds a certain threshold. Hence, motion-sensing interface designers can focus on other issues, such as the design of interaction schemes and an effective GUI layout space, rather than polishing tracking performance beyond the threshold. We expect that the actual value of the tracking performance threshold would depend on many factors, such as the camera specification, motion tracking algorithm, interaction scheme, and device form factor.
In spite of the prolonged training, our motion-based interfaces were unable to catch up with the button interface in terms of task performance, which in part contradicts two previous studies (Hwang et al., 2006; Rohs et al., 2007). We infer that this is mainly due to the difference in navigation patterns: our button interface, as well as the interface of Cho et al. (2007), was used to control discrete transitions, while in Hwang et al. (2006) and Rohs et al. (2007), interaction keys were mapped to continuous navigation. The use of discrete transitions can considerably facilitate navigation, in particular in environments where items are evenly spaced, as reported in Cho et al. (2007). Another lesson we learned is that the effectiveness of the motion-sensing interface may be contingent upon the application. For instance, the previous studies argued that a motion-sensing interface is quite useful in applications where continuous navigation is indispensable.
Applications whose target environments are too large to be displayed within the device screen, such as map navigation (Rohs et al., 2007) or 3D virtual environments (Hwang et al., 2006), correspond to these cases. In the present study, however, discrete navigation sufficed for browsing a set of small images, and the button interface outperformed the motion-sensing interfaces.
A promising alternative to a pure motion-sensing interface is a hybrid system that combines a motion-sensing interface and a button interface. One such interface described previously was a combination of the dynamic peephole and a pen-based interface (Yee, 2003): the dynamic peephole was used to extend the workspace, while the pen-based interface was responsible for pinpointing targets. Similar roles can be assigned to a motion-sensing interface and a button interface; tilt-sensing rate control can facilitate long-distance navigation, and the button interface can be in charge of instant selection within a narrow range. This combination could replace the typical pattern in the button interface of holding a button down for an extended period. Comparisons with such hybrid interfaces are encouraged as future research issues.

6. Conclusions

Using a motion-sensing interface on mobile platforms is a promising option to compensate for the limitations of current button-based interfaces, enabled by recent sensor technology for mobile platforms and users' interest in sensing-based interaction. Understanding the benefits of motion-sensing interfaces can lead to more effective designs for motion-based interfaces and interaction schemes. In this article, we have empirically investigated the efficacy of motion-sensing interfaces in terms of task performance and subjective usability by comparing two motion-sensing interfaces with a button interface, with image browsing as the representative task. The important findings can be summarized as follows:
1. The motion-sensing interface can exhibit obviously lower task performance than the button interface, but with training its performance can approach, while still falling behind, that of the button interface. The time required for training can be moderate (15–30 min per day for about 6–10 days in our experiment).
2. The skills for motion-sensing interfaces can be well retained even after a long intermission.
3. Motion-tracking performance, once it exceeds a certain threshold level, is no longer relevant to the task performance of the motion-sensing interface. In our study, 30-fps vision sensing was sufficient.
4. Experienced users give higher subjective usability ratings to the button interface for most items. Appropriate training can raise the ratings of the motion-sensing interface so that they are comparable to those of the button interface. The motion-sensing interface is always regarded as more fun, regardless of training level.
In addition, our experience suggests that an approach that combines both motion-sensing and button interfaces has high potential for effective mobile device interaction. For example, motion sensing can be used for quick long-distance constant-speed navigation, while buttons control local detail. We plan to design and examine such hybrid interaction systems in the future.

Acknowledgements

This work was supported in part by an NRL program 20100018454 and a BRL program 2010-0019523 from NRF and by an ITRC program NIPA-2010-C1090-1011-0008, all funded by the Korean government.

References

Ball, K.K., Beard, B.L., Roenker, D.L., Miller, R.L., Griggs, D.S., 1988. Age and visual search: expanding the useful field of view. Journal of the Optical Society of America A 5 (12), 2210–2219.
Barron, J.L., Fleet, D.J., Beauchemin, S.S., 1994. Performance of optical flow techniques. International Journal of Computer Vision 12 (1), 43–77.
Bartlett, J.F., 2000. Rock 'n' Scroll is here to stay. IEEE Computer Graphics and Applications 20 (3), 40–45.
Bier, E.A., Stone, M.C., Pier, K., Buxton, W., DeRose, T.D., 1993. Toolglass and magic lenses: the see-through interface. In: Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques. SIGGRAPH ’93. ACM, pp. 73–80. Bouguet, J.Y., 1999. Pyramidal Implementation of the Lucas Kanade Feature Tracker Description of the Algorithm. Tech. Rep., Intel Corporation. Card, S.K., English, William K., Burr, B.J., 1978. Evaluation of mouse, rate-controlled isometric joystick, step keys, and text keys for text selection on a CRT. Ergonomics 21 (8), 601–613. Cho, S., Murray-Smith, R., Kim, Y., 2007. Multi-context photo browsing on mobile devices based on tilt dynamics. In: Proceedings of Mobile HCI, pp. 190–197. Crossan, A., Williamson, J., Brewster, S., Murray-Smith, R., 2008. Wrist rotation for interaction in mobile contexts. In: Proceedings of Mobile HCI, pp. 435–438. Eslambolchilar, P., Murray-Smith, R., 2004. Tilt-based automatic zooming and scaling in mobile devices – a state-space implementation. In: Proceedings of Mobile HCI, pp. 120–131. Eslambolchilar, P., Murray-Smith, R., 2008. Control centric approach in designing scrolling and zooming user interfaces. International Journal of Human– Computer Studies 66 (12), 838–856. Fitts, P.M., 1954. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology 47 (6), 381–391. Fitzmaurice, G.W., Zhai, S., Chignell, M.H., 1993. Virtual reality for palmtop computers. ACM Transactions on Information Systems 11 (3), 197–218. Hannuksela, J., Sangi, P., Heikkilä, J., 2007. Vision-based motion estimation for interaction with mobile devices. Computer Vision and Image Understanding 108 (1–2), 188–195. Haro, A., Mori, K., Capin, T., Wilkinson, S., 2005. Mobile camera-based user interaction. In: Proceedings of the Computer Vision in Human–Computer Interaction, pp. 79–89. Harrison, B.L., Fishkin, K.P., Gujar, A., Mochon, C., Want, R., 1998. Squeeze me, hold me, tilt me! an exploration of manipulative user interfaces. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 17–24. Heathcote, A., Brown, S., Mewhort, D., 2000. The power law repealed: the case for an exponential law of practice. Psychonomic Bulletin & Review 7 (2), 185–207. Hinckley, K., 2002. Input technologies and techniques. In: Jacko, J.A., Sears, A. (Eds.), Handbook of Human–Computer Interaction. Lawrence Erlbaum Associates Inc., pp. 161–176. Hinckley, K., Pausch, R., Goble, J.C., Kassell, N.F., 1994. A survey of design issues in spatial input. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 213–222. Hwang, J., Jung, J., Kim, G.J., 2006. Hand-held virtual reality: a feasibility study. In: Proceedings of the ACM Symposium on Virtual Reality Software and Technology, pp. 356–363. Igarashi, T., Hinckley, K., 2000. Speed-dependent automatic zooming for browsing large documents. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 139–148. Jianbo, S., Tomasi, C., 1994. Good features to track. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 593–600. Jones, S., Jones, M., Marsden, G., Patel, D., Cockburn, A., 2005. An evaluation of integrated zooming and scrolling on small screens. International Journal of Human–Computer Studies 63 (3), 271–303. Kruskal, W.H., Wallis, W.A., 1952. Use of ranks in one-criterion variance analysis. 
Journal of the American Statistical Association 47 (260), 583–621.
MacKenzie, I.S., 1992. Fitts’ law as a research and design tool in human–computer interaction. Human–Computer Interaction 7, 91–139. MacKenzie, I.S., Buxton, W., 1992. Extending Fitts’ law to two-dimensional tasks. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 219–226. MacKenzie, I.S., Soukoreff, R.W., 2002. Text entry for mobile computing: models and methods, theory and practice. Human–Computer Interaction 17 (2), 147–198. Mehra, S., Werkhoven, P., Worring, M., 2006. Navigating on handheld displays: dynamic versus static peephole navigation. ACM Transactions on Computer– Human Interaction (TOCHI) 13, 448–457. Montgomery, D., Montgomery, D., 1997. Design and Analysis of Experiments. Wiley, New York. Oakley, I., O’Modhrain, S., 2005. Tilt to scroll: evaluating a motion based vibrotactile mobile interface. In: Proceedings of World Haptics Conference, pp. 40–49. Oakley, I., Park, J., 2009. Motion marking menus: an eyes-free approach to motion input for handheld devices. International Journal of Human–Computer Studies 67 (6), 515–532. Partridge, K., Chatterjee, S., Sazawal, V., Borriello, G., Want, R., 2002. TiltType: accelerometer-supported text entry for very small devices. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 201–204. Poupyrev, I., Maruyama, S., Rekimoto, J., 2002. Ambient touch: designing tactile interfaces for handheld devices. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 51–60. Rekimoto, J., 1996. Tilting operations for small screen interfaces. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 167–168. Rohs, M., Essl, G., 2007. Sensing-based interaction for information navigation on handheld displays. In: Proceedings of Mobile HCI. ACM, pp. 387–394. Rohs, M., Oulasvirta, A., 2008. Target acquisition with camera phones when used as magic lenses. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, pp. 1409–1418. Rohs, M., Schöning, J., Raubal, M., Essl, G., Krüger, A., 2007. Map navigation with mobile devices: virtual versus physical movement with and without visual context. In: Proceedings of the 9th International Conference on Multimodal Interfaces. ICMI ’07. ACM, pp. 146–153. Shapiro, S.S., Wilk, M.B., 1965. An analysis of variance test for normality (complete samples). Biometrika 3 (52). Smith, C., Walton, A., Loveland, A., Umberger, G., Kryscio, R., Gash, D., 2005. Memories that last in old age: motor skill learning and memory preservation. Neurobiology of Aging 26 (6), 883–890. Tuck, K., 2007. Tilt sensing using linear accelerometers. Tech. Rep. AN3461, Freescale Semiconductor. Wang, J., Zhai, S., Canny, J., 2006. Camera phone based motion sensing: interaction techniques, applications and performance study. In: Proceedings of the ACM Symposium on User Interface Software and Technology. pp. 101–110. Wigdor, D., Balakrishnan, R., 2003. TiltText: using tilt for text input to mobile phones. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 81–90. Yee, K.-P., 2003. Peephole displays: pen interaction on spatially aware handheld computers. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1–8. Zhai, S., 2004. Characterizing computer input with fitts’ law parameters – the information and non-information aspects of pointing. International Journal of Human–Computer Studies 61 (6), 791–809. Zhai, S., Bellotti, V., 2005. 
Introduction to sensing-based interaction. ACM Transactions on Computer–Human Interaction (TOCHI) 12, 1–2.