A classification of web browsing on mobile devices


Author's Accepted Manuscript

A classification of web browsing on mobile devices A. Roudaki, J. Kong, N. Yu

www.elsevier.com/locate/jvlc

PII: S1045-926X(14)00150-5
DOI: http://dx.doi.org/10.1016/j.jvlc.2014.11.010
Reference: YJVLC683

To appear in: Journal of Visual Languages and Computing

Received date: 21 November 2013
Revised date: 29 October 2014
Accepted date: 21 November 2014

Cite this article as: A. Roudaki, J. Kong, N. Yu, A classification of web browsing on mobile devices, Journal of Visual Languages and Computing, http://dx.doi.org/10.1016/j.jvlc.2014.11.010

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A Classification of Web Browsing on Mobile Devices

A. Roudaki, J. Kong, N. Yu
{amin.roudaki, jun.kong, nan.yu}@ndsu.edu
North Dakota State University

Abstract

The usage of mobile devices is growing fast, and different approaches have been proposed to improve the user experience of Web browsing on them. This paper comprehensively reviews and compares various techniques for Web browsing on mobile devices. We classify these approaches into three categories: platform-specific mobile design, Web page restructuring, and zooming based interaction. The platform-specific mobile design technique includes various mobile markup languages and interactive tools that help Web designers manually create a mobile optimized Web site. The Web page restructuring technique intelligently adapts a desktop Web page for Web browsing on mobile devices. This technique first discovers closely related information (i.e., page segmentation) and then generates an adaptive layout that fits mobile devices. The zooming based interaction displays a desktop Web page within a mobile screen as an overview, which allows a user to zoom in on a specific portion for detailed reading. Based on the classification, we compare those browsing techniques from six perspectives: development cost, consistency with the desktop version, dynamic content support, hardware requirements, display optimization, and user operation. We further discuss eight popular commercial mobile browsers. A user study was conducted to evaluate different browsing techniques. This paper presents the state of the art in Web browsing on mobile devices and compares different approaches. Based on the comparison and the user study, we propose future developments in Web browsing on mobile devices.

1. Introduction

The usage of mobile devices is increasing very fast. According to a report published by comScore, Inc., “for the three-month average period ending in September 2012, 234 million users in U.S. age 13 and older used mobile devices.”¹ Furthermore, 52.6% of U.S. mobile subscribers used Web browsers on their mobile devices, up 2.4% from the preceding 3-month period. A mobile device connected to a wireless network makes it convenient to access information from anywhere at any time. However, due to their small screens and different input methods, mobile devices have not been particularly user-friendly for Web browsing. A user study conducted by the Nielsen Norman Group concludes: “In user testing, Web site use on mobile devices got very low scores, especially when users accessed "full" sites that weren't designed for mobile.”²

In order to improve the usability of Web browsing on mobile devices, many Web sites have released mobile versions, which are specifically designed for mobile devices. With the development of mobile optimized Web sites, successful experiences have been summarized as guidelines, such as Mobile Web Best Practices [55] and Opera’s Mobile Web Optimization Guide [44]. Though the number of mobile Web pages is growing fast, desktop Web pages still dominate the Internet. For example, “98% of small-business Web sites are not mobile optimized.”³ Different approaches have been proposed to improve the usability of browsing a desktop Web page on a mobile device, such as automatic adaptation or zooming based interaction. This paper provides a comprehensive discussion of Web browsing techniques on mobile devices and classifies those techniques into three categories:

¹ http://www.comscore.com/Insights/Press_Releases/2012/11/comScore_Reports_September_2012_U.S._Mobile_Subscriber_Market_Share
² http://www.useit.com/alertbox/mobile-usability.html
³ http://www.prweb.com/releases/vSplash/SMBDigtalScape/prweb9907736.htm

• Platform-specific Mobile Design. This category includes mobile markup languages and various graphical authoring tools that facilitate the creation of a mobile Web site. It requires mobile designers to invest a significant amount of manual effort in implementing a mobile Web site that is specifically optimized for the distinct features of mobile devices.
• Web Page Restructuring. Web page restructuring supports browsing a desktop Web page on a mobile device by adapting the original layout. It first segments a desktop Web page into semantically related information blocks (i.e., page segmentation) and then provides a semantics-directed adaptation.
• Zooming based Interaction. Zooming based interaction avoids adaptation and provides an alternative way to browse desktop Web pages on mobile devices. It displays a desktop Web page within a mobile screen as an overview and provides a mobile-friendly navigation facility to efficiently zoom in on any portion for detailed reading.

Based on the above classification, we propose six measures to compare those three techniques: development cost, consistency with desktop pages, dynamic content support, hardware requirements, display optimization, and user operation. In order to describe the state of the art in Web browsing on mobile devices, we further describe eight popular commercial mobile browsers and compare them according to the above measures. A user study was conducted to evaluate the browsing techniques implemented in those browsers. Based on the comparison and the user study, we offer insights into future developments in Web browsing on mobile devices.

2. A Classification of Web Browsing on Mobile Devices

With the fast development of mobile devices, many approaches have been proposed to improve the user experience of Web browsing on mobile devices. One direction is to design a mobile optimized Web site from scratch, and the other is to reuse existing desktop Web pages for Web browsing on mobile devices. In order to provide a mobile-friendly experience on desktop Web pages, some approaches intelligently restructure the original representation to fit mobile devices, while others provide a mobile-friendly navigation facility (such as gesture based zooming) without changing the original layout. As presented in Figure 1, we classify the Web browsing approaches on mobile devices into three categories.

[Figure 1. A taxonomy of browsing approaches on mobile devices. Web browsing on mobile devices divides into platform-specific mobile design, Web page restructuring, and zooming based interaction; the latter two reuse an existing desktop Web site.]

In a systematic review, one critical facet is to determine the inclusion or exclusion of an approach. Since mobile markup languages and authoring tools form the foundation of designing a mobile Web site, the first category covers the publications about mobile markup languages and authoring tools. The key to Web page restructuring is to automatically generate semantics-oriented layouts suitable for mobile devices. Therefore, the second category includes various techniques for page segmentation (i.e., identifying semantically related information) and adaptive layout generation. The critical issue in zooming based interaction is to quickly identify the information of interest on a small screen. Accordingly, the last category contains approaches that improve the readability of a thumbnail overview.

• Platform-specific Mobile Design. This category includes mobile markup languages and interactive tools that support authoring a mobile Web site. Mobile markup languages specifically consider the unique features of a mobile device, such as the downloading speed and input methods. However, it is time consuming to create a new mobile Web site from scratch. Furthermore, keeping two versions, one for mobile devices and the other for desktops, may cause a consistency issue.
• Web Page Restructuring. Web page restructuring automatically extracts useful information from a desktop Web page and adjusts the page to a mobile presentation accordingly. In general, information restructuring proceeds in two steps: page segmentation and adaptive layout generation. Page segmentation intelligently discovers and groups semantically related information. Based on the page segmentation, the second step produces an adaptive layout, which places semantically related information in proximity and minimizes the number of operations. Due to the diversity and complexity of HTML specifications, it is challenging to design a generic page segmentation algorithm that is applicable to different Web sites. Another issue is the lack of adaptations that end users can customize. Most existing approaches are tied to one pre-defined adaptation style (such as a single-column presentation). This “one style fits all” strategy cannot offer universal usability. Even leaving users’ personal preferences aside, one fixed style cannot cater for the proliferation of diverse mobile devices and different browsing situations.
• Zooming based Interaction. The zooming based interaction takes a scaled-down image of a desktop Web page as an overview, which presents the overall structure of the Web page. Based on the overview, a user can change his/her focus and zoom in on a specific portion. Zooming based browsing takes advantage of a capacitive multi-touch screen that supports touch-based gestures (such as pinching and panning). However, the small screen of a mobile device can reduce the readability of an overview. This is especially true when a user has not seen the desktop Web site before. Consequently, the user may take a while to identify the information of interest in the overview. Various techniques have been developed to improve the readability of a small overview. For example, page segmentation combined with text summarization has been proposed to increase the visibility of each block in an overview.

In summary, the platform-specific mobile design optimizes the user experience of Web browsing on mobile devices, though it requires significant manual effort for each individual Web site. Web page restructuring depends on discovering the information organization underlying a Web page, which is challenging due to the diversity of HTML usage. In the zooming based interaction, it is critical to increase the visibility of an overview. The following sections first present an empirical study of users’ perceptions of mobile browsing techniques and then compare the techniques in detail.

3. Users’ Preferences

We conducted a survey study to evaluate users’ preferences and experiences and to understand their perceptions of the different browsing techniques provided by commercial mobile browsers.

3.1. Method

We carried out an online survey in which students saw either descriptions or pictures of various browsing techniques and then provided their evaluations and preferences. We used Qualtrics to launch the survey online (https://jfe.qualtrics.com/preview/SV_5BV1Y4rFQxNWzNH?Preview=Survey&BrandID=ndstate).

Sample. Participants were invited through emails, and we required participants to have at least some experience with smart phones. A total of 129 participants successfully completed the study. About 59% identified themselves as male and 41% as female. The majority of the participants described themselves as Caucasian (84.3%), followed by Asian (6.3%), Hispanic (3.9%), mixed race (3.1%), and African-American (2.4%). Participants’ ages ranged from 18 to 42 years (M = 21.4, SD = 3.0). The majority of the participants were college students, and they were given a small amount of extra credit in a class they were taking as an incentive to join this study.

Procedure and measures. Once participants agreed to participate after reading the consent form, they answered a series of questions regarding their prior experiences with Web browsing on mobile devices. Then, on each page of the survey, they were asked to look at graphics or descriptions portraying different browsing techniques on smart phones and to provide their evaluations. Four groups of browsing techniques were included in this survey study: 1) interactive functions (i.e., using dropdown menus to make selections, clicking buttons to navigate, scrolling up and down on a webpage, and typing and inputting information); 2) dynamic content and multimedia (seven types: text, video, audio, graphics, 3D graphics, interactive maps, and mobile games); 3) page rendering (three types: full adaptation, partial adaptation, and zooming); and 4) interactive gestures (eight types: tapping, double tapping, dragging, flicking, pinching, spreading, pressing, pressing and tapping). Because some of the browsing techniques can be difficult for participants to differentiate, we created graphical examples to visually illustrate the three types of page rendering and the eight types of interactive gestures. We also provided descriptions of the interactive gestures to explain the subtle differences: 1) tapping (tap once on one object); 2) double tapping (rapidly tap twice on one object); 3) dragging (move a fingertip over the surface without losing contact); 4) pressing (touch an object for a period); 5) stretching (move two fingers apart); 6) pinching (move two fingers closer together); 7) flicking (quickly brush the surface with a fingertip); and 8) pressing and tapping (press the surface with one finger and briefly touch the surface with a second finger).

Frequency of Use. We measured participants’ frequency of use on a 1-5 scale (1 = never, 5 = all of the time). For example, we asked participants to indicate how often they have used each of the five types of interactive functions, how often they were exposed to the seven types of dynamic content, and how often they have encountered the three modes of page rendering.

Users’ Preference. In addition to frequency of use, our study evaluated the degree to which users prefer to use each of the five types of interactive functions. We also examined users’ preferences for each of the three modes of page rendering. The preference scale ranges from 1 (least preferred) to 5 (most preferred).

Ease of Use. The measure of ease of use examines the extent to which users feel a given browsing technique (e.g., an interactive function or an interactive gesture) can be easily used. For example, we asked participants to evaluate each of the eight types of gestures by answering the question “when using the gesture to achieve your interactive goals with your smartphone, is it very easy or very difficult?” We used a 5-point scale for the measure of ease of use (1 = very difficult, 5 = very easy). At the end of the survey, we asked participants to report their gender, age, and race.

3.2. Results

Our sample included participants who have used various brands of smart phones, such as iPhone, HTC Droid, Google Nexus, Samsung Galaxy, and Motorola. The length of smart phone usage ranged from less than a year to 11 years (M = 2.8, SD = 1.7). Questions are organized into three categories: dynamic content and multimedia, page rendering, and user interface and interaction.

Dynamic Web Content and Multimedia. A one-way repeated-measures ANOVA with Bonferroni post-hoc analysis shows that text-based content (frequency: M = 4.6, SD = 0.5; preference: M = 4.3, SD = 0.9) was used the most frequently [Wilks’ Λ = .25, F(6, 120) = 60.2, p < .001, partial η² = .75] and was rated the most preferred content format [Wilks’ Λ = .39, F(6, 121) = 31.37, p < .001, partial η² = .61] when compared with all other types of content.
Videos and graphics were also used more often and were more preferred than audio, 3D graphics, interactive maps, and mobile games (p < .05). 3D graphics (frequency: M = 2.5, SD = 1.3; preference: M = 2.6, SD = 1.2) were used the least frequently and rated the least preferred among all (p < .05). Based on the results in Table 1, we can conclude that individuals prefer simple content. This observation is consistent with the design principle “less is more”. Text-based representation is especially suitable for mobile devices since it utilizes a small screen efficiently and consumes little bandwidth and few resources. Furthermore, text is a desirable media format to support universal usability since it can be supported even by basic smart phone models, can be accessed by a broad group of users (e.g., a blind user can access text content through a screen reader), and does not require a high-speed Internet connection.

Though users’ preference for simplicity in Web browsing is evident in this study, traditional multimedia content (i.e., video and graphics) generally supplements text to increase the engagement of Web browsing. With the fast development of hardware, some advanced multimedia content (such as 3D graphics), which was only available on desktops in the past, can now be presented on mobile devices. However, users’ preference for such content is low, which may be caused by the complex operations it requires on a mobile device. It is challenging to take advantage of the newest hardware to provide rich and interactive multimedia content while keeping operations user-friendly on a mobile device. In summary, this study suggests that users do not expect Web content to be overly dynamic; rather, they prefer simple and more traditional content formats.

Table 1. Evaluations for dynamic content and multimedia

| Content | Frequency of Use | Preference |
|---|---|---|
| 1. Text | 4.6 a (0.5) | 4.3 a (0.9) |
| 2. Video | 3.8 b (0.9) | 3.8 b (1.0) |
| 3. Audio | 3.5 c (1.1) | 3.5 c (1.1) |
| 4. Graphics | 4.1 b (0.9) | 3.8 b (1.0) |
| 5. 3D Graphics | 2.5 d (1.3) | 2.6 d (1.2) |
| 6. Interactive Maps | 3.2 c (1.1) | 3.3 c (1.1) |
| 7. Mobile Games | 3.1 c (1.3) | 3.1 c (1.4) |

Note. Frequency of use was measured with a scale from 1 (never) to 5 (always). Preference was measured with a scale from 1 (least preferred) to 5 (most preferred). Numbers are means and standard deviations (in parentheses). Within each column, means with different subscripts differ at p < .05. One-way repeated measures ANOVA: Frequency of Use: Wilks’ Λ = .25, F(6, 120) = 60.2, p < .001, partial η² = .75; Preference: Wilks’ Λ = .39, F(6, 121) = 31.37, p < .001, partial η² = .61.
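The Bonferroni post-hoc analysis reported for the content formats controls the family-wise error rate when many pairs of means are compared. The following is a minimal sketch of that correction, assuming that every pair among the seven content formats was tested at a family-wise level of .05; the paper does not state the number of comparisons, and the function name `bonferroni_threshold` is hypothetical.

```python
from itertools import combinations

# The seven content formats evaluated in the survey.
CONTENT_FORMATS = [
    "text", "video", "audio", "graphics",
    "3D graphics", "interactive maps", "mobile games",
]


def bonferroni_threshold(levels, family_alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison alpha).

    Assumption: all unordered pairs of levels are compared, and the
    family-wise alpha is split evenly across them.
    """
    n_pairs = len(list(combinations(levels, 2)))
    return n_pairs, family_alpha / n_pairs


n_pairs, per_test_alpha = bonferroni_threshold(CONTENT_FORMATS)
print(n_pairs, round(per_test_alpha, 5))  # prints: 21 0.00238
```

Under this assumption, a pairwise difference would count as significant only if its uncorrected p-value fell below roughly .0024.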

Page Rendering. On a mobile device, a desktop page can be rendered as a thumbnail in which users zoom in on a specific portion for detailed reading (i.e., zooming-based rendering), or it can be fully redesigned to adapt to a mobile device (i.e., a mobile page or full adaptation). Our findings revealed that users encountered zooming-based rendering (M = 3.9, SD = 0.8) more often than pages with full adaptation (M = 3.4, SD = 0.8) or partial adaptation (M = 3.3, SD = 0.8), Wilks’ Λ = .67, F(2, 125) = 30.3, p < .001, partial η² = .33. The frequency of use in Table 2 implies that desktop pages still dominate the Web even though the number of mobile pages is increasing dramatically. Our results show that full adaptation (M = 3.5, SD = 1.1) is less favored than zooming-based rendering (M = 3.9, SD = 0.9), Wilks’ Λ = .87, F(2, 125) = 9.7, p < .001, partial η² = .13, even though full adaptation is intended to optimize the layout for mobile devices. The lower preference for a mobile page may be caused by the inconsistency between a mobile page and a desktop page, since users generally get their first impression of the layout and structure of the content from the desktop page. On the other hand, a large high-definition multi-touch screen is essential to provide user-friendly zooming-based browsing. The above observation also implies that mobile Web design lags behind desktop Web design and lacks consensus on mobile design guidelines (see Table 2).

Table 2. Evaluations for page rendering

| Page Rendering | Frequency of Use | Preference |
|---|---|---|
| 1. Full Adaptation | 3.4 a (0.8) | 3.5 a (1.1) |
| 2. Zooming | 3.9 b (0.8) | 3.9 b (0.9) |
| 3. Partial Adaptation | 3.3 a (0.8) | 3.9 b (0.9) |

Note. Frequency of use was measured with a scale from 1 (never) to 5 (always). Preference was measured with a scale from 1 (least preferred) to 5 (most preferred). Numbers are means and standard deviations (in parentheses). Within each column, means with different subscripts differ at p < .01. One-way repeated measures ANOVA: Frequency of Use: Wilks’ Λ = .67, F(2, 125) = 30.3, p < .001, partial η² = .33; Preference: Wilks’ Λ = .87, F(2, 125) = 9.7, p < .001, partial η² = .13.
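The statistics reported in the note to Table 2 are related to one another. For a single within-subject factor tested with the multivariate approach, partial η² equals 1 − Λ, and the F statistic equals ((1 − Λ)/Λ) · (df_error/df_effect). The sketch below recomputes both quantities from the rounded Λ values, so small deviations from the reported F values are expected; the function names are illustrative and do not come from the paper.

```python
def partial_eta_squared(wilks_lambda):
    """Partial eta squared for a one-factor multivariate test (s = 1)."""
    return 1.0 - wilks_lambda


def f_from_wilks(wilks_lambda, df_effect, df_error):
    """Exact F transformation of Wilks' lambda when s = 1."""
    return (1.0 - wilks_lambda) / wilks_lambda * (df_error / df_effect)


# Values copied from the note to Table 2:
# (lambda, df_effect, df_error, reported F, reported partial eta squared).
reported = {
    "frequency of use": (0.67, 2, 125, 30.3, 0.33),
    "preference": (0.87, 2, 125, 9.7, 0.13),
}

for measure, (lam, df1, df2, f_reported, eta_reported) in reported.items():
    eta = round(partial_eta_squared(lam), 2)
    f_value = round(f_from_wilks(lam, df1, df2), 1)
    print(f"{measure}: eta^2 = {eta} (reported {eta_reported}), "
          f"F = {f_value} (reported {f_reported})")
```

The recomputed effect sizes match the reported ones, and the recomputed F values agree with the reported ones to within the error introduced by rounding Λ to two decimals.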

User Interface and Interaction. Our findings revealed that scrolling up and down on a webpage (M = 4.5, SD = 0.7) and typing in information (M = 4.4, SD = 0.8) were the two functions used the most frequently among the four types of interactive functions [Wilks’ Λ = .35, F(4, 125) = 58.5, p < .001, partial η² = .65]. Additionally, scrolling (M = 4.5, SD = 0.7) was rated the easiest function among the four [Wilks’ Λ = .47, F(4, 124) = 35.3, p < .001, partial η² = .53]. The four interactive functions we included (see Table 3) are common ones that users encounter when browsing Web pages. Even though all of these functions require subtle finger movements and control, users reported different experiences with them. Scrolling up and down through a swipe gesture was rated the easiest and was used the most frequently (see Table 3). This observation indicates that fast finger movement (such as a swipe) is favored over precise finger positioning (such as selecting a menu, clicking a button, or inputting information), which may cause a fat finger error.

Table 3. Evaluations for Interactive Functions

| Interactive Functions | Frequency of Use | Ease of Use |
|---|---|---|
| 1. Using drop-down menus to make selections | 3.6 a (1.1) | 3.9 a (1.0) |
| 2. Clicking buttons to navigate | 3.8 a (1.2) | 4.2 a (0.9) |
| 3. Scrolling up and down on a webpage | 4.5 b (0.7) | 4.5 b (0.7) |
| 4. Typing and inputting information | 4.4 b (0.8) | 4.1 a (1.0) |

Note. Frequency of use was measured with a scale from 1 (never) to 5 (always). Ease of use was measured with a scale from 1 (very difficult) to 5 (very easy). Numbers are means and standard deviations (in parentheses). Within each column, means with different subscripts differ at p < .001. One-way repeated measures ANOVA: Frequency of Use: Wilks’ Λ = .35, F(4, 125) = 58.5, p < .001, partial η² = .65; Ease of Use: Wilks’ Λ = .47, F(4, 124) = 35.3, p < .001, partial η² = .53.

For the eight common touch-based gestures that we evaluated, we found that users rated them differently in terms of ease of use [Wilks’ Λ = .39, F(7, 117) = 26.2, p < .001, partial η² = .61]. Tapping (M = 4.8, SD = 0.5) is considered the easiest gesture. Double tapping (M = 4.4, SD = 0.8), dragging (M = 4.4, SD = 0.8), pressing (M = 4.4, SD = 0.8), stretching (M = 4.2, SD = 0.9), and pinching (M = 4.1, SD = 0.9) form the second tier of gestures that were easy to perform. Flicking (M = 4.0, SD = 1.0) was rated a little more difficult than double tapping and dragging. Pressing and tapping (M = 3.4, SD = 1.1) is considered the most difficult gesture. Looking at the means (see Table 4), all gestures were rated at 4.0 or above on the 5-point scale except for pressing and tapping. This shows that, in general, users were quite familiar with standard gestures (i.e., gestures with one finger or the standard zooming gestures, pinch and stretch).

Table 4. Evaluations for Interactive Gestures

| Interactive Gesture | Ease of Use |
|---|---|
| 1. Tap | 4.8 a (0.5) |
| 2. Double Tap | 4.4 b (0.8) |
| 3. Drag | 4.4 b (0.8) |
| 4. Flick | 4.0 c (1.0) |
| 5. Pinch | 4.1 bc (0.9) |
| 6. Stretch | 4.2 bc (0.9) |
| 7. Press | 4.4 bc (0.8) |
| 8. Press and Tap | 3.4 e (1.1) |

Note. Ease of use was measured with a scale from 1 (very difficult) to 5 (very easy). Numbers are means and standard deviations (in parentheses). Within each column, means with different subscripts differ at p < .05. One-way repeated measures ANOVA: Ease of Use: Wilks’ Λ = .39, F(7, 117) = 26.2, p < .001, partial η² = .61.

4. Web Browsing on Mobile Devices

This section discusses the different approaches in each category in detail.

4.1. Platform-specific mobile design

The platform-specific mobile design refers to approaches that create a mobile Web site based on mobile markup languages or conversion rules. Considering the distinct features of Web browsing on mobile devices, mobile markup languages have been proposed to specifically support mobile optimized Web sites. For example, the Handheld Device Markup Language (HDML) [54], originally developed in 1996 by Unwired Planet, is optimized for the following factors: device memory capability, wireless network connection and drop time, device display and control characteristics, and transfer speeds. Some mobile device manufacturers developed their own mobile markup languages, such as Nokia’s Tagged Text Markup Language (TTML) and Ericsson’s proprietary markup language. In order to provide a common standard across different mobile devices, the Wireless Application Protocol (WAP) forum proposed the Wireless Markup Language (WML), which was developed based on HDML, Nokia’s Tagged Text Markup Language, and Ericsson’s proprietary markup language. WML is now an international standard and has been adopted by the WAP. Based on WML, Bruijn et al. [18] proposed a system, called the Rapid Serial Visual Presentation (RSVP) Browser, to support Web browsing on mobile devices. In this system, a Web designer manually creates a set of sub-pages (i.e., cards), each of which presents a story implemented through WML.

However, markup languages that specifically target mobile devices are no longer used in mobile Web sites. Recently, HTML 5, which supports cross-platform development, has attracted more attention. It introduces many features to support mobile devices. For example, the offline support allows a mobile user to complete his/her task without being affected by a connectivity interruption. In addition, HTML 5 has strong support for multimedia content. In summary, Table 5 compares three mobile markup languages from four perspectives.

Table 5. Comparison of mobile markup languages

| Language | Dynamic & multimedia content | Hardware | Adaptation and Display Optimization | User Operation |
|---|---|---|---|---|
| HDML [54] | Text and image, selection and data entry | Support small displays, limited compute resources, and low bandwidth | Information is structured into one or more cards | Keypad and soft keys |
| WML | Navigational support, data input, hyperlinks, text and image presentation, and forms | Support small displays, limited compute resources, and low bandwidth | Information is structured into one or more cards | Keypad; map frequently used Internet functions to specific keys |
| HTML5 | Support multimedia and graphical content without resorting to plugins and APIs | Support offline web applications; combine with JavaScript to access hardware components | Combine with CSS to adapt content for different devices | Drag and drop; 2D drawing; touch and mouse |

In order to support various browsing situations, different authoring tools [16, 37, 64] have been proposed to specify presentations that adapt to dynamically changing viewing conditions. Constraint-based approaches [16, 37] are capable of dynamically generating multimedia presentations that adapt to changes in media content, display environment, and user intention. Zhang et al. [64] proposed a grammatical method to support adaptive layouts by specifying high-level structural and spatial relations among multimedia objects through a graph grammar. The Responsive Web Design (RWD) technique, which aims to improve readability and user friendliness based on the features of the target device, uses CSS3 media queries to implement different styles based on device characteristics. RWD requires a designer to specify the target layout based on the screen size and device capabilities (e.g., the media formats supported by the target device). Then, CSS3 media queries apply an appropriate style to Web pages based on the following four criteria: 1) the width and height of the browser, 2) the width and height of the screen, 3) the orientation (i.e., landscape or portrait), and 4) the screen resolution. For example, if the screen is not wide enough, the display of a Web page is automatically converted from a two-column layout to a single-column one. RWD is especially useful for supporting multiple devices. In summary, HTML 5 and CSS3 provide a solid foundation for authoring a mobile Web site, and they include some distinct features to optimize the user experience of Web browsing on mobile devices. For example, CSS3 can limit the display of images to reduce the data transfer on a wireless network. Furthermore, the adaptation capability in CSS3 may potentially reduce the development cost by automatically determining an optimized layout based on the hardware and the mobile browser.
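The way media queries select a style from the four criteria listed above can be illustrated with a small model. The following is a minimal sketch, not an implementation of CSS3 itself: the `Device` and `StyleRule` classes, the 768-pixel breakpoint, and the layout names are all assumptions made for illustration, and only lower-bound conditions are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    browser_width: int    # viewport width in CSS pixels
    browser_height: int   # viewport height in CSS pixels
    screen_width: int
    screen_height: int
    resolution_dpi: int

    @property
    def orientation(self) -> str:
        # A viewport counts as landscape only when it is wider than it is tall.
        return "landscape" if self.browser_width > self.browser_height else "portrait"


@dataclass(frozen=True)
class StyleRule:
    """A hypothetical stand-in for one media query and the style it selects."""
    layout: str
    min_browser_width: int = 0
    min_browser_height: int = 0
    min_screen_width: int = 0
    min_screen_height: int = 0
    orientation: Optional[str] = None   # None means "any orientation"
    min_resolution_dpi: int = 0

    def matches(self, device: Device) -> bool:
        return (
            device.browser_width >= self.min_browser_width
            and device.browser_height >= self.min_browser_height
            and device.screen_width >= self.min_screen_width
            and device.screen_height >= self.min_screen_height
            and device.resolution_dpi >= self.min_resolution_dpi
            and self.orientation in (None, device.orientation)
        )


# Rules are tried in order; the last one has no conditions and acts as a fallback.
RULES = [
    StyleRule("two-column", min_browser_width=768),  # assumed breakpoint
    StyleRule("single-column"),
]


def choose_layout(device: Device, rules=RULES) -> str:
    for rule in rules:
        if rule.matches(device):
            return rule.layout
    raise LookupError("no style rule matched the device")


phone = Device(360, 640, 360, 640, 320)
desktop = Device(1280, 720, 1920, 1080, 96)
print(choose_layout(phone), choose_layout(desktop))  # prints: single-column two-column
```

In this model the narrow phone viewport falls through to the single-column fallback, which mirrors the two-column to single-column conversion described in the text.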
Instead of designing a mobile Web site from scratch, some approaches use conversion rules to transform a Web page from one presentation to another. Sun et al. [51] developed a system to support online transactions on mobile devices. Through an automata-based process model, this system enables a Web designer to design a set of conversion rules, which ensure that only transaction-related information is delivered to a mobile user. In addition, it supports data entry through speech. Nichols et al. [41] implemented a client-server based system (called Highlight) that supports rapid prototyping and development of mobile applications. This approach embeds a fully functional browser in a proxy server that transforms and clips content according to conversion rules. The transformed content is then delivered to mobile users. This approach supports dynamic content (e.g., dynamic HTML and Ajax) to some extent. Highlight was further extended to support interactive design [42]. This interactive authoring tool allows a Web designer to specify the desired interactions and content from an existing desktop Web page, and then it generates a mobile Web page that meets the Web designer’s specifications. Chen et al. [23] used XML to manually specify the partition of an HTML Web page. Based on the partitioning, a natural language processing technique is used to summarize the main content of each block. This approach intelligently lays out the summary of each block as an overview based on the screen size. From the overview, a user can click on one block, which is enlarged to display its entirety, while the sizes of the other blocks are reduced accordingly. This approach supports a context+focus display on a small screen.

In addition to Web page authoring tools, some frameworks [27, 40] have been proposed to convert a desktop-based interface to a mobile interface. sConCAF (Siemens CONtext aware Content Assembling Framework) [27] allows application designers to model a user interface at a high level and then automatically generates a user interface for multiple devices based on the interface model. The Merlion system [40] reuses the logic of a desktop application in its mobile counterpart by defining a mapping between the mobile and desktop applications. First, a designer creates a mobile user interface and defines the mapping between the desktop graphical user interface and the mobile one. Then, the system executes the desktop application on a proxy server, and the mobile application sends the user interaction to the proxy server, which performs the corresponding command and returns the updates to the mobile user. In summary, in the platform-specific mobile design, different approaches were developed to reduce the development cost, as summarized in Table 6.

Table 6. The comparison of different approaches to reduce the development cost

| Approach | Reduce the development cost |
|---|---|
| [23] | Introduces a scalable document representation, a binary slicing tree, that helps information providers develop adaptive Web content. |
| [27] | Automatically generates a user interface for multiple devices based on a high-level model. |
| [40] | Defines a mapping between mobile and desktop applications. |
| [41, 42] | Presents the Highlight system, which enables rapid prototyping and deployment of mobile Web applications created from existing Web sites. |
| [51] | Uses an automata-based process model to automatically extract transaction-related information. |

4.2. Web Page Restructuring

The Web page restructuring technique aims to provide a generic system that automatically adapts contents from a desktop presentation to a mobile one without manual effort for each individual Web site. Content adaptation bridges the gap between device capabilities and content formats [1]. In automatic content adaptation, one challenge is to group semantically related information, since the HTML specification of a Web page does not reveal how its information is organized, and Web designers may use HTML in completely different ways. Another challenge is to produce an adaptive presentation that fits the unique features of mobile devices. To address these challenges, various approaches, such as text summarization, data extraction, image reduction and block recognition, have been applied to the adaptation process. This subsection reviews different approaches in the category of Web page restructuring.

In the early 2000s, Internet speed was limited, and mobile devices (such as PDAs) were constrained by their computing capacity and small screens. Many mobile devices did not even support color images. Web page restructuring approaches proposed at that time aimed to reduce bandwidth usage and optimize the utilization of a small screen. The Digestor system [13] is a seminal work that applies a heuristic planning algorithm and a set of heuristic rules (such as reducing the image size, discovering and highlighting the most important header, or eliding sentences) to adapt a Web page to achieve the best appearance for a given screen size. However, this approach does not support tables and applets. The Power Browser [19] removes images and white space to save screen space. The WEST browser [15] eliminates JavaScript, image maps and frames to maximize the usage of the limited resources, and realizes a context+focus visualization in a limited display environment through text reduction. This approach divides an HTML Web page into several small blocks of a fixed size, called cards, and provides three display modes (i.e., thumbnail view, keyword view and link view). Users can select one display mode and focus on one card displayed in the central area while other cards are displayed in the border areas.

Later, heuristic approaches [20, 21, 30, 38] analyzed HTML structural tags (such as Table) to partition a Web page. Kaasinen et al. [30] automatically converted HTML structures to WML specifications. Buyukkokten et al. [20, 21] recognized semantic textual units based on HTML tags, e.g., the tag P indicates a boundary between two semantic textual blocks. This method, however, only handles text and does not support graphics. SmartView [38] focuses on partitioning table elements.

Due to the complexity and diversity of HTML DOM structures, some researchers proposed visual analysis to implement page segmentation. Yang et al. [59] evaluated the visual similarities of HTML contents, detected patterns of visual similarity, and then derived a hierarchical information organization of an HTML page. Chen et al. [24] divided a Web page into several high level information blocks according to their sizes and locations, and then identified explicit and implicit separators inside each high level block. CMo [17] utilizes the geometrical alignment of frames to segment a Web page. Paterno et al. [46] split the presentation of a desktop page by calculating the cost (e.g., the number of pixels) of each information object. If the original Web page is too large to be displayed on a mobile screen, this approach iteratively replaces some information blocks with links, which serve as an overview, until the size is reduced to fit the small screen. This work was later extended to allow users to configure the adaptation process, giving them more control over the cost calculation and the adaptation results [47].

Hybrid analysis has received increasing attention. The Vision-based Page Segmentation (VIPS) [22, 61] utilizes various visual cues together with structural information in the DOM structure to partition a Web page at the semantic level. Hattori et al. [29] calculated the strength of connections between content elements based on the structural depth of HTML tags and analyzed the layout to segment a page. This method, however, cannot handle separate Web page components, such as headers and footers. Ahmadi and Kong [2, 3] used a set of heuristic rules from the perspectives of the DOM structure and the visual layout to divide an HTML Web page into several subpages, each of which includes closely related contents and is suitable for display on a small screen. This approach supports the automatic generation of a table of contents to facilitate navigation between subpages. Table 7 summarizes different page segmentation techniques.

Table 7. Page segmentation techniques

| Page segmentation | Approach | Technique |
|---|---|---|
| DOM structure based analysis | [13] | Use a heuristic planning algorithm and a set of heuristic rules for page adaptation |
| DOM structure based analysis | [15] | Eliminate JavaScript, image maps and frames, and support a context+focus visualization through text reduction |
| DOM structure based analysis | [19] | Remove images and white spaces |
| DOM structure based analysis | [20, 21] | Recognize coherent blocks of text in a Web page based on HTML tags and adapt them to small screen displays |
| DOM structure based analysis | [30] | Automatically convert HTML-based contents to WML |
| Visual analysis | [59] | Calculate visual similarities of HTML contents, detect frequent patterns of visual similarities and group items based on those patterns |
| Visual analysis | [24] | Iteratively partition each content block into smaller ones based on locations, sizes and visual separators |
| Visual analysis | [17] | Utilize the geometrical alignment of frames to segment a Web page |
| Visual analysis | [46, 47] | Split a Web page by comparing the required cost (e.g., the number of pixels) for displaying concrete contents with the cost sustainable by a mobile device |
| Hybrid analysis | [22, 61] | Analyze both the DOM structure and visual cues (e.g., size and color) to partition a Web page |
| Hybrid analysis | [29] | Conduct preliminary segmentation based on the layout of a Web page and then calculate the content-distance based on the structural depth of HTML tags |
| Hybrid analysis | [2, 3] | Develop a set of heuristic rules from the perspectives of the DOM structure and the visual layout to segment a page |
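To make the cost-based splitting attributed to Paterno et al. [46] more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each information block carries a single pixel cost, that a link standing in for a block has a fixed cost (the hypothetical constant LINK_COST), and that the most expensive block is replaced first; the reviewed work does not state the replacement order, so that choice is purely illustrative.

```python
from dataclasses import dataclass

LINK_COST = 20  # hypothetical pixel cost of a link that stands in for a block


@dataclass
class Block:
    title: str
    cost: int              # e.g., number of pixels needed to display the block
    as_link: bool = False  # True once the block is replaced by a link

    def display_cost(self) -> int:
        return LINK_COST if self.as_link else self.cost


def fit_to_screen(blocks: list[Block], budget: int) -> list[Block]:
    """Replace blocks with links until the total cost fits the budget.

    Assumption: the most expensive remaining block is replaced first.
    """
    while sum(b.display_cost() for b in blocks) > budget:
        candidates = [b for b in blocks if not b.as_link and b.cost > LINK_COST]
        if not candidates:
            break  # nothing left to shrink; the page still does not fit
        max(candidates, key=lambda b: b.cost).as_link = True
    return blocks


page = [Block("news", 900), Block("menu", 150), Block("footer", 80)]
adapted = fit_to_screen(page, budget=400)
```

In this example only the "news" block is turned into a link, because replacing it alone already brings the total cost under the budget; the resulting links play the role of the overview mentioned in the text.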

Instead of segmenting a whole Web page, some approaches extract from a Web page a list of structured records, which indicate the main content. Ashraf et al. [10] applied a clustering technique to divide the raw data in a Web page into clusters, which are further refined to eliminate irrelevant information. The evaluation of this approach shows good performance. Recently, Kong et al. [33] applied graph grammars to interpret a Web interface and extract structured information from Web pages.

Based on the hierarchical information organization underlying a Web page, a desktop Web page has to be adapted to fit the distinct features of a mobile device. One optimization is to reduce the overall bandwidth usage by prioritizing information blocks, so that the most important or relevant information is displayed to users first. PROTEUS [4] automatically personalizes and adapts Web pages based on a user’s previous behaviors. According to the user’s browsing history, PROTEUS highlights or removes some information pieces from the original Web page. For example, PROTEUS highlights a link if a user frequently clicks on it. This approach needs some training before it can predict users’ preferences. Yin et al. [60] calculated the importance of an information object based on a set of heuristic rules (such as sizes, text lengths, similar content, width/height ratio and physical offset). In CMo [17], after a user clicks on a link, the system identifies the context of the link and presents the most relevant block in the destination page that follows this link. Otterbacher et al. [45] proposed a summarization method that produces a hierarchical structure of sentences based on their importance. An adaptive presentation is optimized by delivering only the most representative sentence to a mobile user at first. However, this method cannot handle multimedia contents, such as graphs. COSA [36] uses the downloading time as the major criterion to prioritize information objects: the longer an object takes to download, the less important it is. Information pieces are classified into four categories, i.e., Text, Banner, Picture & Animation, and Video & Audio, in descending order of importance. Within each category, the importance is measured in terms of size.
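The prioritization described for COSA [36] can be illustrated with a small sketch. This is not COSA's actual code: the category keys, the tuple layout and the function name prioritize are hypothetical, and the rule that a smaller object ranks higher within a category is an assumption inferred from the statement that a longer downloading time means less importance.

```python
# Categories in descending order of importance, as listed in the text.
CATEGORY_RANK = {
    "text": 0,
    "banner": 1,
    "picture_animation": 2,
    "video_audio": 3,
}


def prioritize(objects):
    """Order (name, category, size_in_bytes) tuples for delivery.

    Assumption: within a category, a smaller object downloads faster
    and is therefore treated as more important.
    """
    return sorted(objects, key=lambda obj: (CATEGORY_RANK[obj[1]], obj[2]))


objects = [
    ("clip", "video_audio", 500_000),
    ("logo", "banner", 8_000),
    ("article", "text", 12_000),
    ("photo", "picture_animation", 90_000),
    ("caption", "text", 300),
]
ordered = [name for name, _, _ in prioritize(objects)]
```

Sorting on a (category rank, size) pair keeps the two criteria separate, so the category order always dominates and size only breaks ties inside a category.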

Another optimization is to enable mobile users to efficiently navigate through a large amount of information on a small screen. Due to the lack of a traditional keyboard and mouse, it is necessary to minimize the number of user operations. One common strategy is to display information vertically in a single column to avoid horizontal scrolling. Automatic scrolling [6, 9] has been proposed to further reduce the number of user operations. Based on page segmentation, some approaches [2, 3, 46] generate an overview that serves as a table of contents. The overview, represented as a list of hyperlinks, can reduce the number of user operations by directing users to a specific topic. Some approaches emphasize adapting data with a special format, such as HTML tables. For example, Tajima and Ohnishi [52] proposed a novel interaction technique to efficiently browse large HTML tables on a mobile device. This approach displays an HTML table in one of three rendering modes, i.e., normal mode, record mode, and cell mode. Potla et al. [48] reorganized an HTML table into two different styles, i.e., a single narrow layout and a multi-page layout.

Personalized adaptation by end users is desirable to improve the browsing experience [39]. The concept of Collapse-to-Zoom [12] was introduced to eliminate irrelevant blocks (such as menus or advertisements) through natural gestures. Collapsing blocks makes the remaining blocks expand to a larger size for more detailed information and reduces the loading time. PageTailor [14] allows users to remove irrelevant blocks, resize page elements and adjust the layout. The customization can even be applied to different Web pages that share the same template. Both approaches support customization at the client side, and a user can save his/her customization, which speeds up browsing when the user revisits the same Web page. Table 8 summarizes different adaptation techniques for mobile devices.

Table 8. Adaptation for mobile devices

| Approach | Adaptation for mobile devices |
|---|---|
| [4] | Highlight or remove some information pieces based on a user’s browsing history |
| [60] | Rank the importance of content objects based on a set of heuristic rules (e.g., size, text length, similar content) and only deliver important information to mobile users |
| [17] | Identify the context of a link clicked by a user and present the most relevant block in the destination page |
| [45] | Produce a hierarchical structure of sentences based on their importance and deliver the most representative sentence to a mobile user in the beginning |
| [36] | Display contents by their importance, which is evaluated based on the downloading time |
| [12] | Introduce the concept of Collapse-to-Zoom to eliminate irrelevant blocks through natural gestures |
| [14] | Allow users to remove irrelevant blocks, resize page elements and adjust the layout |
| [6, 9] | Automatic scrolling to reduce the number of user operations |
| [2, 3, 46] | Summarize a table of contents to direct a user to a specific topic |

4.3. Zooming Based Interaction

Instead of adapting the original layout, an alternative solution is to use the original layout as an overview and provide an efficient zooming facility for detailed reading. Though an overview provides the overall structure of a Web page, detailed information in the overview is not readable. Therefore, it is critical to improve the readability of an overview. WebThumb [56] provides basic panning and zooming capabilities on the thumbnail of a desktop Web page. In particular, this approach introduces a “picking mode” to facilitate information identification from an overview: when a user clicks on a specific point in the thumbnail, the contents enclosed in the smallest tag that covers the selected position are displayed in a popup window at a readable size. This approach also provides a “text mode” to efficiently read a text paragraph: when a user taps on a block of text, this mode displays one word at a time, which supports detailed reading without consuming much space. WebThumb works best when a user has browsed a similar desktop Web page before. In MiniMap [49], a page overview is overlaid transparently on top of the viewport, so users can read detailed information while understanding the overall page structure. The presentation in the viewport is reformatted to accommodate as much information as possible and to minimize scrolling operations.

Considering that a user is unlikely to have seen a similar desktop Web page before, Woodruff et al. [57] proposed a keyword enhanced thumbnail. This approach extracts from a Web page a list of keywords, which have a transparent background and are displayed at a readable size close to the locations where they occur in the original page. Summary Thumbnail [34] displays a fragment of readable text in the region of each text block. It preserves the original layout, while allowing users to quickly differentiate text blocks. When a user zooms in on a specific text block, the text fragment is replaced with the complete text. Arase et al. [5, 7, 9] annotated each information block with a textual description in an overview, which is combined with automatic scrolling. Compared with simple thumbnails, text-enhanced thumbnails (such as Summary Thumbnail) improve the browsing efficiency [34, 57, 58] by highlighting the most important information in an overview to enhance the readability.

In order to reduce the number of navigation operations, some approaches [11, 24, 38] combine page segmentation with zooming based interaction, so that a user can directly navigate to a recognized information block. SmartView [38] partitions an HTML Web page into several logical units according to HTML table tags. An overview highlights the recognized information blocks, and a user can select a specific block for detailed reading. In a detailed view, the layout of an information block is optimized to avoid horizontal scrolling by resizing images, rearranging contents and re-flowing paragraphs. Chen et al. [24] organized information in two levels: an overview in the first level and detailed information in the second level. When a user clicks on an information block in the overview, there are two modes for presenting detailed information. The first mode displays the selected block in a new mobile Web page, while the second mode automatically positions the selected block at the center. Baluja [11] implemented segmentation-based zooming for mobile devices that do not have a touch screen. This approach searches for the best segmentation through a recursive algorithm based on entropy. After page segmentation, it maps the most important blocks (at most 9) to a keypad. When a user presses a key, the viewport displays the corresponding block. Chiu et al. [25] applied image processing techniques to segment the screenshot of a document (i.e., a bitmap image). Based on the page segmentation, when a user moves the viewport around a document, the system automatically adjusts the zooming factor based on the size of the focused region. Xiao et al. [58] adjusted the VIPS algorithm [22] to construct an SP-tree, in which an internal node consists of a textually enhanced thumbnail image with hyperlinks, and a leaf node is an information block extracted from the original Web page. The hierarchical SP-tree provides progressive browsing from the top level overview to the detailed contents in a leaf node.
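The “picking mode” of WebThumb [56] selects the smallest tag that covers the clicked position. A minimal sketch of such a hit test is given below; it is an illustration under stated assumptions, not WebThumb's implementation. The Element class and its bounding-box fields are hypothetical, coordinates are assumed to be already mapped from the thumbnail back to page space, and "smallest" is interpreted as the smallest bounding-box area.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    tag: str
    x: int
    y: int
    width: int
    height: int
    children: list["Element"] = field(default_factory=list)

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def area(self) -> int:
        return self.width * self.height


def pick(root: Element, px: int, py: int):
    """Return the smallest-area element covering (px, py), or None."""
    best = None
    stack = [root]
    while stack:
        node = stack.pop()
        # Visit every element: a child may overflow its parent's box.
        stack.extend(node.children)
        if node.contains(px, py) and (best is None or node.area() < best.area()):
            best = node
    return best


paragraph = Element("p", 20, 50, 400, 100)
image = Element("img", 500, 50, 300, 200)
header = Element("div", 0, 0, 1000, 300, [paragraph, image])
body = Element("body", 0, 0, 1000, 2000, [header])
```

A click inside the paragraph's box returns the paragraph itself, while a click in the empty part of the header falls back to the enclosing div, which matches the behaviour described for the popup window.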

Table 9 summarizes different techniques in the zooming based interaction.

Table 9. The comparison of zooming-based interaction techniques

| Approach | Technique |
|---|---|
| [56] | Targeted panning and zooming on the thumbnail of a Web page; a text mode efficiently reads a text paragraph |
| [49] | A page overview is overlaid transparently on top of the viewport |
| [57] | Provide a keyword enhanced thumbnail |
| [34] | Display a fragment of readable texts in the region of each text block |
| [5, 7, 9] | Annotate each information block with a textual description in an overview |
| [38] | Provide a document thumbnail made of several logic units, and a detailed view of a selected logic unit |
| [24] | A thumbnail representation provides a global view and index to a set of sub-pages for detailed information |
| [11] | Present the thumbnail image of a Web page and allow a user to simply press a single key to zoom into a region |
| [25] | Automatically adjust the zoom factor of the regions visible in the viewport |
| [58] | A Web page is transformed into a set of subpages organized in a tree hierarchy, where an internal node consists of a textually enhanced thumbnail image with hyperlinks and a leaf node is a block extracted from the original Web page |
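The keypad mapping used in segmentation-based zooming [11] can be sketched as follows. The sketch assumes that page segmentation has already produced blocks with a numeric importance score and a top-left position; the dictionary layout, the function name map_blocks_to_keys and the reading-order key assignment are all hypothetical choices made for illustration, and the entropy-based segmentation itself is not reproduced.

```python
def map_blocks_to_keys(blocks, max_keys=9):
    """Map the most important blocks to keypad keys "1".."9".

    blocks: list of dicts with "name", "importance", "x" and "y".
    Assumption: keys are assigned in reading order (top to bottom,
    then left to right) among the selected blocks.
    """
    top = sorted(blocks, key=lambda b: b["importance"], reverse=True)[:max_keys]
    in_reading_order = sorted(top, key=lambda b: (b["y"], b["x"]))
    return {str(key): block["name"]
            for key, block in enumerate(in_reading_order, start=1)}


# Eleven hypothetical blocks laid out on a three-column grid.
blocks = [
    {"name": f"b{i}", "importance": i, "x": (i % 3) * 100, "y": (i // 3) * 100}
    for i in range(11)
]
keymap = map_blocks_to_keys(blocks)
```

Pressing a key then amounts to looking up keymap and moving the viewport to the corresponding block; the two least important blocks in this example receive no key because only nine keys are available.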

5. Comparing Techniques for Web Browsing on Mobile Devices

Based on the classification, we compare the three browsing techniques from six perspectives, i.e., development cost, consistency with desktop pages, dynamic content support, hardware requirements, display optimization and user operation. The development cost refers to the time needed to design or adapt a Web page. Since consistency is an important issue in human-computer interaction, we compare the consistency between the layout of a mobile page and its corresponding desktop version. Due to the high demand for dynamic contents, it is important to compare the three browsing techniques according to their support for dynamic contents. Finally, mobile devices have several limitations, such as limited capabilities [1, 31], different input methods [62] or a small screen [35]. Therefore, we evaluate how each technique overcomes those limitations from the perspectives of hardware requirements, display optimization and user operation.

5.1. Development Cost

Platform-specific mobile design needs significant manual effort to create a mobile optimized Web site from scratch. In order to speed up the development process, some approaches, such as [23, 41, 42, 51], provide a graphical user interface to convert a desktop presentation to a mobile one. The human involvement in the design process can achieve the highest usability. However, the development process can be costly. In addition, consistency between the desktop and mobile versions must be maintained. In contrast, the Web page restructuring and the zooming based interaction can in general be applied to different Web pages without further cost to optimize a specific Web site. However, the page restructuring has to address the page segmentation issue, and the zooming based interaction needs to improve the readability of an overview on a small screen.

5.2. Consistency with Desktop Pages

Based on the distinct features of mobile devices, the platform-specific mobile design and Web page restructuring techniques optimize a mobile presentation, which results in a layout different from the desktop version. In particular, the mobile presentation in general simplifies the desktop version by eliminating unimportant contents. On the other hand, the zooming based interaction uses the original layout of a desktop page as an overview, which helps a user transfer his/her previous browsing experience from a desktop Web site to a mobile one. Therefore, zooming based interaction can achieve the best user experience when the user has browsed a similar desktop Web page before.

5.3. Dynamic Content Supports

Dynamic contents improve the interactivity of Web browsing. However, the Web page restructuring in general does not fully support dynamic contents (e.g., JavaScript), due to the modification of the HTML source code:
• JavaScript may be removed in the restructuring process;
• A page is divided into smaller slices, and each slice is presented individually or in a different frame. A JavaScript function cannot access functions located in other slices.
Consequently, the scripts will not function correctly in a mobile page. In contrast, the platform-specific mobile design and the zooming based interaction support dynamic contents well.

5.4. Hardware Requirements

Adzic et al. [1] summarized three locations to perform content adaptation, i.e., the server side, the client side or an intermediary proxy server. Since the page segmentation in the page restructuring is computationally expensive, it is desirable to perform the page segmentation on a dedicated server. Then, the adaptation can take place at the client side based on the device capability. Furthermore, the server can filter out unnecessary information, which reduces the amount of data transferred over the network and shortens the downloading time. In platform-specific mobile design, a mobile Web page is already manually optimized by a professional, and thus has the lowest hardware requirement. In the zooming based interaction, some operations (such as text extraction) need a relatively high computing capacity, so its hardware requirement lies between those of the other two techniques.

Mobile hardware is developing fast, but more powerful computing consumes more energy. In the platform-specific mobile design, Web designers consider different factors, including the energy consumption, when laying out a mobile Web page and thus create a green design. When browsing desktop-based pages, the Web page restructuring is computationally complex due to page segmentation and thus consumes more energy than the zooming based interaction. The removal of multimedia contents in page adaptation can lower the energy consumption, but it also reduces the interactivity.

5.5. Display Optimization

The Web page restructuring technique adapts the layout of a desktop Web page to better fit small screen devices, such as changing radio buttons to a dropdown list [46] to save screen space. Those adaptations significantly increase the effectiveness of information visualization on a small screen. In the zooming based interaction, text summarization is an efficient way to improve the readability of an overview on a small screen. With human intervention, the platform-specific mobile design optimizes the layout of a mobile page.
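As an illustration of the radio-button-to-dropdown adaptation mentioned above [46], the following sketch rewrites a small form with Python's standard xml.etree.ElementTree module. It is not the reviewed system's implementation and rests on several simplifying assumptions: the markup is well-formed XML, the radio inputs are direct children of the form, and the text following each input serves as its label. The function name radios_to_select is hypothetical.

```python
import xml.etree.ElementTree as ET


def radios_to_select(form: ET.Element) -> None:
    """Replace each group of radio inputs in `form` with one <select>.

    Assumptions: radio inputs are direct children of the form, and the
    text that follows an input (its tail) is the visible label.
    """
    groups = {}
    for child in list(form):
        if child.tag == "input" and child.get("type") == "radio":
            groups.setdefault(child.get("name"), []).append(child)

    for name, radios in groups.items():
        select = ET.Element("select", {"name": name or ""})
        for radio in radios:
            value = radio.get("value", "")
            option = ET.SubElement(select, "option", {"value": value})
            option.text = (radio.tail or "").strip() or value
            if radio.get("checked") is not None:
                option.set("selected", "selected")
        # Put the dropdown where the first radio button used to be.
        position = list(form).index(radios[0])
        for radio in radios:
            form.remove(radio)  # also drops the label text stored in .tail
        form.insert(position, select)


form = ET.fromstring(
    '<form>'
    '<input type="radio" name="size" value="s"/>Small'
    '<input type="radio" name="size" value="l" checked="checked"/>Large'
    '<input type="submit" value="OK"/>'
    '</form>'
)
radios_to_select(form)
adapted_markup = ET.tostring(form, encoding="unicode")
```

The dropdown occupies a single line regardless of the number of choices, which is the screen-space saving the text refers to; real desktop pages are rarely well-formed XML, so a production system would need a tolerant HTML parser instead.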

Mobile phones have different screen sizes. Therefore, it is desirable to automatically adjust a mobile Web page based on the actual screen size. In the platform-specific mobile design, designers need to consider various screen sizes and define adjustable layouts accordingly. At run time, information blocks can be resized to fit a specific screen size. In the Web page restructuring technique, the vision based page segmentation can partition a page according to the screen size so that each subpage is suitable for display on a specific mobile screen. The zooming based interaction achieves a better user experience on larger mobile screens, since a thumbnail overview is more readable on a large screen.

5.6. User Operation

Due to a different input method and a small screen, it is desirable to reduce the number of operations in Web browsing on mobile devices, e.g., through page pre-fetching or automatic text copying [65]. In the Web page restructuring, the layout is adapted to limit scrolling to one direction, for example through a single column layout or text wrapping. In the zooming based interaction and platform-specific mobile design, natural user interaction has been applied to support efficient navigation on a multi-touch screen. Since a user may browse on mobile devices under different scenarios (e.g., surfing the Internet while walking or shopping), one-handed interaction has attracted increasing attention recently. For example, the OPA Browser [6, 8] maps different functionalities to the numeric keys of a mobile phone so that a user can fully control the browsing with only one hand.

Table 10. Compare browsing techniques on mobile devices

| Perspective | Platform-specific Mobile Design | Web Page Restructuring | Zooming based Interaction |
|---|---|---|---|
| Development Cost | Independently optimize each mobile Web site | No further cost to optimize each individual site | No further cost to optimize each individual site |
| Consistency with the Desktop Version | A layout is different from the desktop version, and is optimized to fit mobile devices | The adaptation and restructuring make the layout different from the original desktop Web page | The overview is consistent with the original layout |
| Dynamic Content Support | Supported | JavaScript is not supported well due to restructuring | Supported |
| Hardware Requirements | Low | High due to page segmentation. A proxy server may be needed. | In between |
| Display Optimization | Layout is optimized for mobile devices. Authoring tools were developed to adapt content to different devices, such as constraint-based approaches [16, 37], grammar-based approach [64] and CSS. | Resize images and restructure the overall organization | Text enhanced thumbnail |
| User Operation | Sliding, clicking or scrolling | Limit scrolling to one direction | Zooming and scrolling |

In summary, Table 10 compares the three browsing techniques.

6. The Comparison of Mobile Browsers

Commercial mobile browsers represent the state of the art in Web browsing on mobile devices. Based on market share, we select and compare eight popular mobile browsers, each running on a mobile platform, as presented in Table 11. According to the measures discussed in Section 5, we compare those browsers from three perspectives, i.e., dynamic and multimedia content support, page rendering, and user interface and interaction (i.e., display optimization and user operation), as presented in Table 12.

Table 11. The list of mobile browsers
OS Browser Android 4.0 stock browser 1 Chrome Android 4.0 2 Firefox 18 3 Safari 4 iOS 6 Opera Mobile 12 5 Blackberry 10 Blackberry 10 6 Nokia Belle FP2 Nokia S60 7 Internet Explorer 10 Windows Phone 8 8

Multimedia and dynamic contents bring a pleasant experience when users are surfing the Internet. JavaScript is broadly supported, while Flash is only supported by the Android stock browser and Firefox, and no browser in Table 11 supports Silverlight. Compared with HTML4, HTML5 is extended with new syntactic features to support multimedia contents, including the new