DangerNeighbor attack: Information leakage via postMessage mechanism in HTML5


Accepted Manuscript

Authors: Chong Guan, Kun Sun, Lingguang Lei, Pingjian Wang, Yuewu Wang, Wei Chen

PII: S0167-4048(18)30848-4
DOI: https://doi.org/10.1016/j.cose.2018.09.010
Reference: COSE 1402

To appear in: Computers & Security

Received date: 28 July 2018
Revised date: 25 September 2018
Accepted date: 29 September 2018

Please cite this article as: Chong Guan, Kun Sun, Lingguang Lei, Pingjian Wang, Yuewu Wang, Wei Chen, DangerNeighbor Attack: Information Leakage via postMessage Mechanism in HTML5, Computers & Security (2018), doi: https://doi.org/10.1016/j.cose.2018.09.010

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Chong Guan, University of Chinese Academy of Sciences
Kun Sun, George Mason University
Lingguang Lei, University of Chinese Academy of Sciences (corresponding author)
Pingjian Wang, University of Chinese Academy of Sciences
Yuewu Wang, University of Chinese Academy of Sciences
Wei Chen, Nanjing University of Posts and Telecommunications

Abstract: The postMessage mechanism in HTML5 enables webpages with different origins to communicate with each other on a hosting webpage. When the hosting webpage contains multiple receiver functions from different origins, all receiver functions can receive any message sent to this webpage. If one receiver function is malicious, it can deliberately eavesdrop on all messages sent to the hosting webpage, which creates a risk of information leakage. In this paper, we perform a systematic study of this new type of information leakage threat, named the DangerNeighbor attack, which eavesdrops on messages sent through postMessage by inserting a malicious receiver function into the hosting page. We implement two proof-of-concept prototypes of DangerNeighbor attacks, using a malicious third-party service provider and a malicious browser extension, respectively. To evaluate the feasibility of the DangerNeighbor attack, we study the Alexa top 5,000 websites and 1,200 Chrome extensions, and our analysis results verify the wide existence of the postMessage vulnerability in the wild. In particular, we perform a case study of the DangerNeighbor attack against the OAuth access token. We find that 39.61% of websites using Facebook OAuth and 23.38% of websites using Google OAuth may leak users’ private information. Even worse, an attacker can successfully log in to 11 vulnerable websites with the compromised OAuth access token. Finally, we propose two countermeasures to thwart DangerNeighbor attacks.

Index Terms: postMessage, Information Leakage, OAuth

I. INTRODUCTION

The Same Origin Policy (SOP) [1] is a critical security mechanism in web application design that isolates scripts and thus prevents malicious scripts on one webpage from accessing sensitive data on other webpages. Though SOP is effective in protecting modern web applications, it is sometimes too restrictive for the communication requirements of real-world websites. For instance, most popular websites need to exchange information with third-party service providers such as advertisement, social recommendation, and performance measurement services, but those necessary communications are blocked by SOP. To address this limitation, the fifth version of HTML (HTML5) introduces the postMessage mechanism to enable web content to be exchanged among different origins. The postMessage method allows plain text messages to be sent from one domain to another. According to the design of the postMessage mechanism, messages are delivered to a webpage or, more precisely, to the window object corresponding to the webpage with the specified origin property. When there are multiple receiver functions in the webpage, the message is broadcast to each receiver function.

However, along with its usability and convenience, the postMessage mechanism also raises new security risks [2], [3]. In particular, when the hosting page includes a piece of script from a malicious service provider, the malicious script can seize all privileges of the hosting page to steal sensitive information [4], [5], [6].

In this paper, we develop a new attack, called the DangerNeighbor attack, that misuses the postMessage mechanism to steal sensitive information from a hosting webpage. It is based on the observation that when a hosting page receives messages via the postMessage mechanism from third-party providers (i.e., multiple senders), every receiver function in the hosting page is capable of receiving all the messages destined for the hosting page. Therefore, when a malicious receiver function keeps eavesdropping on the messages sent to the hosting page, all sensitive information in those messages is leaked.

We confirm the existence of DangerNeighbor attacks by implementing two proof-of-concept prototypes, using a malicious third-party service provider and a malicious browser extension, respectively. The first method exploits a gap between design and practice: by design, the receiver function is expected to be implemented by the receiver itself, so the mechanism does not distinguish among receiver functions when delivering a message; in the real world, however, the receiver function is usually implemented in imported JavaScript provided by a third party. Thus, a malicious third-party service provider can launch a DangerNeighbor attack to stealthily gather sensitive user information. In the second method, since a browser extension has the privilege to eavesdrop on the messages, attackers can develop a malicious browser extension and upload it to the public browser extension store for later download by victims. We verify that such malicious browser extensions can easily pass the extension store’s security check.

To evaluate the impact of DangerNeighbor attacks in the wild, we perform a large-scale empirical study of the receiver functions and the messages sent through the postMessage mechanism in webpages. We develop a data collection JavaScript to collect the messages and receiver functions, and then inject it into the homepages of the top 5,000 Alexa [7] sites and into 1,200 randomly chosen Chrome extensions. The experimental results show that it is common for hosting pages to contain multiple receiver functions from different service providers. After clustering the collected message data into a number of categories, we identify several kinds of sensitive information, including userID, frameID, videoID, and OAuth access tokens, which can be leveraged to identify a user and track user behavior.

We conduct a case study on the potential leakage of OAuth access tokens through the DangerNeighbor attack. OAuth is a standard secure authorization protocol that allows Internet users to authorize websites or applications to access their information on other websites without separately providing passwords [8]. The access token is the key data in the OAuth protocol and represents the authorization of a specific application to access specific parts of a user’s data. OAuth has been widely used to implement single sign-on (SSO) services on social networks such as Facebook, Google, and Twitter. Our experiments show that the access token can be eavesdropped on and stolen by DangerNeighbor attacks. We find that 39.61% of websites with Facebook OAuth and 23.38% of websites with Google OAuth among the top 2,000 Alexa websites are vulnerable: attackers can collect users’ private information and, even worse, use the stolen access token to log in to the relying party websites.

We propose two countermeasures to defeat the DangerNeighbor attack. The first is a lightweight defense mechanism based on a JavaScript function wrapper technique. The basic idea is to wrap each receiver function with an extra layer of security checking on the message origin. Only when the message origin can be found in an origin-RecvSource mapping table is the original receiver function invoked to perform its normal tasks. As an alternative defense, we recommend revising the postMessage APIs with a new postMessage method. To remain compatible with the old version of the postMessage mechanism, we introduce a “receiver ID” property into each receiver function to indicate which receiver functions may receive a given message. We implement a prototype of the new postMessage API method on Chromium and verify its effectiveness in defending against DangerNeighbor attacks.

In summary, we make the following major contributions:

• We develop a new DangerNeighbor attack by misusing the HTML5 postMessage mechanism. It enables attackers to eavesdrop on all the messages sent to the hosting page via the postMessage mechanism and thus steal the sensitive information in those messages. We implement two proof-of-concept prototypes of the DangerNeighbor attack using two different attack vectors.
• We perform a large-scale empirical study to evaluate the impact of DangerNeighbor attacks in the real world. The results show that a high percentage of popular websites may suffer from DangerNeighbor attacks.
• We carry out a case study on the compromise of OAuth access tokens using DangerNeighbor attacks. Our experimental results show that 39.61% of websites relying on Facebook OAuth and 23.38% of websites relying on Google OAuth among the top 2,000 Alexa websites are vulnerable to DangerNeighbor attacks that steal their access tokens and leak sensitive information.
• We propose a lightweight countermeasure to defend against DangerNeighbor attacks. Alternatively, we recommend revising the postMessage APIs to thwart the DangerNeighbor attack.

Fig. 1: The Frame Structure of a postMessage Example

The remainder of the paper is organized as follows. Section II introduces some background knowledge, and Section III describes the adversary model. We present the DangerNeighbor attack in Section IV. Section V presents a large-scale evaluation of DangerNeighbor vulnerabilities in the real world. We carry out a case study on compromising OAuth access tokens with DangerNeighbor attacks in Section VI. Section VII provides two countermeasures against DangerNeighbor attacks. We describe the related work in Section IX and conclude the paper in Section X.
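The first countermeasure summarized in the introduction (wrapping each receiver function with an origin check against an origin-RecvSource mapping table) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the table layout, the names `originTable`, `wrapReceiver`, and `FakeMessageEvent`, and the example URLs are all hypothetical, and a plain `EventTarget` stands in for the hosting page's window so that the sketch also runs outside a browser.

```javascript
// Hypothetical origin-RecvSource mapping table: for each sender origin, the
// set of receiver scripts (identified by RecvSource) allowed to handle it.
const originTable = new Map([
  ["http://www.bob.com", new Set(["http://www.bob.com/receiver.js"])],
]);

// Wrap a receiver so that it only runs when the message origin is mapped to
// the receiver's own RecvSource; every other message is silently dropped.
function wrapReceiver(receiver, recvSource) {
  return function guardedReceiver(event) {
    const allowed = originTable.get(event.origin);
    if (allowed && allowed.has(recvSource)) {
      return receiver(event);
    }
    return undefined;
  };
}

// Minimal stand-in for a browser MessageEvent (assumption for this sketch).
class FakeMessageEvent extends Event {
  constructor(data, origin) {
    super("message");
    this.data = data;
    this.origin = origin;
  }
}

// A plain EventTarget plays the role of the hosting page's window.
const page = new EventTarget();
const seenByBob = [];
const seenByNeighbor = [];

page.addEventListener(
  "message",
  wrapReceiver((e) => seenByBob.push(e.data), "http://www.bob.com/receiver.js")
);
page.addEventListener(
  "message",
  wrapReceiver((e) => seenByNeighbor.push(e.data), "http://neighbor.example/receiver.js")
);

page.dispatchEvent(new FakeMessageEvent("token=abc", "http://www.bob.com"));
```

In this sketch only the receiver whose RecvSource is mapped to the sender's origin sees the message. How the RecvSource of each registered receiver would be determined in a real page is not shown here.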





II. BACKGROUND

A. postMessage in HTML5

The Same-Origin Policy (SOP) can successfully separate mutually distrusting web contents within the same web browser via origin-based compartmentalization [1]. However, there are cases that legitimately need to cross the same-origin boundary. For instance, an advertisement agent may want to display its advertisement based on the requirements of the hosting webpage. Thus, HTML5 introduces the postMessage mechanism to establish a communication channel between different origins by passing messages from one origin to another.

The implementation of postMessage involves two major parties: the receiver and the sender. Figure 1 demonstrates the frame structure of a postMessage example whose source code is shown in Listing 1, where Bob.html is the sender and Alice.html is the receiver. Alice.html imports the content of Bob.html as a subframe (line 2) and implements a receiver function msgReceiver (lines 5-11). Though embedded in Alice.html, the JavaScript code loaded from Bob.html belongs to a different origin. As a sender, Bob.html invokes the postMessage() method to send the message “Hi Alice.” to Alice.html (line 18). The receiver is a window object [9] in the webpage, and the entire webpage (i.e., an HTML document) corresponds to one window object. An additional window object is created whenever a frame is introduced through an <iframe> tag. In Listing 1, window.parent (line 18) specifies Alice.html as the receiver, since Alice.html is the parent of Bob.html.

 1  //Alice.html
 2  <iframe src="http://www.bob.com/Bob.html">
 3  </iframe>
 4  <script>
 5  function msgReceiver(event) {
 6    if (event.origin == "http://www.bob.com") {
 7      console.log("Bob says:" + event.data);
 8      event.source.postMessage("Hi, Bob.",
 9        event.origin);
10    }
11  }
12  window.addEventListener("message",
13    msgReceiver, false);
14  </script>
15
16  //Bob.html
17  <script>
18  window.parent.postMessage("Hi Alice.", "http://www.alice.com");
19  </script>

Listing 1: A Usage Example of postMessage

Since a message could be hijacked through navigation attacks [3], an origin check is enforced before the message is dispatched to a window. An origin is defined as the combination of URI scheme, host name, and port number. For example, “http://www.alice.com” defines the origin of Alice.html. When the postMessage method is invoked, it triggers a MessageEvent to be dispatched to Alice.html. The MessageEvent sets its data property to the message being sent (i.e., “Hi Alice.”), its origin property to the origin of the sender (i.e., “http://www.bob.com”), and its source property to the sender’s window object; these correspond to event.data (line 7), event.origin (line 6), and event.source (line 8), respectively, in Listing 1. The receiver can then respond to the sender through the event.origin and event.source properties, as shown in lines 8-9.

B. postMessage Usage in the Real World

In the real world, since postMessage enables cross-origin communication, it is usually used between hosting websites and third-party service providers. As shown in Figure 2, the common scenario involves three roles: the hosting page server, the third-party service provider server, and the user (the browser). The user requests a hosting page from the hosting page server. In the hosting page, the sender code is imported through the iframe tag, while the receiver code is imported via the HTML script tag. Therefore, the receiver is in the same origin and the same window as the hosting page, so it can receive and handle the messages sent to the hosting page. In contrast, the sender is included in a different window with a different origin. Though executed in different origins, both pieces of code are supplied by the provider. We use RecvSource to denote the src property of the receiver (e.g., “provider/receiver.js” in Figure 2).

The postMessage method is also used in browser extensions. A browser extension is a small web application developed in JavaScript to extend the functionality of the browser. Injecting JavaScript code into the context of webpages is one of the basic goals of a browser extension. Usually the injected JavaScript can be executed only in a special environment called an isolated world [10], [11], [12]. The injected JavaScript code can access the DOM [13] of the page, but it cannot access any local JavaScript variables or functions created by the page. The postMessage method opens a channel that enables communication between the injected JavaScript and the page’s local JavaScript.

Fig. 2: The Usage Pattern of postMessage

III. ADVERSARY MODEL

We assume that the attacker only has the ability to add postMessage receiver functions to the target webpage, and that within the receiver function the attacker forwards all messages to a server under the attacker’s control. The victim’s browser and computer are not compromised, and the hosting websites are benign. The goal of the attacker is to compromise the messages sent via postMessage. We mainly consider two kinds of attackers: the third-party service provider and the browser extension. We also discuss the scenario in which an attacker adds a receiver function through an XSS vulnerability. We are not concerned with attackers that can inject arbitrary malicious JavaScript code, for the following reasons. A third-party service may be used by any customer. Since all of its JavaScript code is served as source code, any customer can inspect the code for security, performance, or compatibility, and a service provider will not risk its reputation by shipping overtly malicious code. Likewise, a browser extension needs to pass the security check of the browser vendor before it is published, and an extension containing overtly malicious code will fail that check.

Injecting only a postMessage receiver function is stealthier than injecting general malicious code. There is no clear boundary between a malicious receiver function and a benign one. The only difference is that a malicious receiver function collects sensitive messages that do not belong to it. To draw this distinction, the following problems need to be solved:

1) How to decide which messages are sensitive?
2) How to figure out to whom a message belongs?
3) How to trace a message until it is sent to a malicious server?

It is difficult to solve any of these problems. Even if a “malicious” receiver function is singled out, it cannot be conclusively identified as malicious.

Some developers may not intend to implement a malicious receiver function; they may simply fail to filter out some messages through carelessness. If a service provider is caught injecting a malicious receiver function, it can argue that this is merely a vulnerability caused by carelessness.

IV. DANGERNEIGHBOR ATTACK

We first discuss the attack goal and the fundamental attack approach of DangerNeighbor attacks, and then enumerate potential attack vectors for implementing them.

A. Attack Goal

The goal of an adversary is to stealthily eavesdrop on the sensitive messages sent from other webpages by misusing the unprotected postMessage mechanism. We assume that the victim’s browser and the software stack below the browser are benign. However, the adversary has the ability to add postMessage receiver functions to the targeted hosting webpage. Since the malicious receiver function only passively eavesdrops on messages, and even benign receiver functions may sometimes accept messages that are not destined for them because of careless implementation, this type of stealthy attack is difficult to detect. We assume that after capturing the sensitive data, the adversary has other channels to transmit all captured information to a server under its control; these channels are out of the scope of this paper.

B. Attack Approach

According to the design of the postMessage mechanism, messages are delivered to a webpage or, more specifically, to the window object corresponding to the webpage with the specified origin property. When there are multiple receiver functions in the webpage, the message is broadcast to each receiver function. Therefore, if the adversary can inject a malicious receiver function into a webpage, it can eavesdrop on all messages broadcast to this webpage. As shown in Figure 3, the hosting page imports frame1, frame2, receiver function1, receiver function2, and a malicious receiver function. Since all three receiver functions are executed in the origin of the hosting page, the malicious receiver function can collect the messages sent from sender1 in Frame1 and sender2 in Frame2.

The HTML5 standard suggests that a receiver function should check the origin of each message and discard messages from unknown senders. However, since origin checking is not strictly enforced and many receivers fail to perform semantically correct origin checks [14], the receivers in the hosting page can still be misused in DangerNeighbor attacks. The threat of DangerNeighbor attacks has not been well studied in the real world.

Fig. 3: DangerNeighbor Attack (hosting page http://www.host.com containing Frame1 with sender1, Frame2 with sender2, Receiver Function1, Receiver Function2, and a Malicious Receiver Function, all reached via postMessage)

C. Attack Vectors

One prerequisite for DangerNeighbor attacks is to inject a malicious receiver function into the targeted hosting page. We identify three major attack vectors for injecting malicious receiver functions, namely, malicious service providers, malicious browser extensions, and online attacks (e.g., Cross-Site Scripting (XSS) attacks).

Malicious Service Providers. To enrich their content, modern websites usually integrate content from multiple third-party service providers. For example, news websites often add a weather forecast from a third-party provider by inserting the related script tag into the hosting page. However, an adversary can disguise itself as a legitimate service provider and provide malicious receiver functions to be included in the victim website’s hosting page. Since a malicious receiver function only passively eavesdrops on messages, it leaves no other attack traces and thus can evade state-of-the-art malware detection.

Malicious Browser Extension. A browser extension is a small web application developed in JavaScript to extend the functionality of the browser. Since injecting JavaScript code into a webpage is one of the basic functions of a browser extension, a malicious extension can easily inject a malicious receiver function to conduct a DangerNeighbor attack. Most users install browser extensions from the official extension stores [15], which are counted on to perform strict security checking and filter out extensions with malicious JavaScript code [16], [17]. However, since the malicious receiver functions added by an extension perform no attack actions other than omitting the origin check on messages, attackers can craft a malicious browser extension (e.g., a Chrome extension) that easily passes the Chrome security checking. We crafted a toy Chrome extension with a malicious receiver function and confirmed that it can pass the security checking of the Chrome extension store¹.

¹ We unpublished it before any user could install it.

Online Attacks. Online attacks can misuse web vulnerabilities to inject malicious JavaScript code into the targeted webpage; they include the Cross-Site Scripting (XSS) attack, the stale IP-address-based or domain-name-based inclusion attack [5], and the man-in-the-middle attack (malicious proxy). These online attacks can benefit from the stealthiness of the DangerNeighbor attack by injecting only a malicious receiver function, thereby bypassing malicious code detection. For example, a runtime monitor installed in the browser could be bypassed, since it does not treat the messages sent via postMessage as highly sensitive data [6], [18]. In addition, the attacker can use a fixed payload (i.e., injecting a malicious receiver function) on a number of different websites, which removes the burden of customizing a specific attack payload for each webpage.
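The broadcast behavior that the attack approach relies on can be illustrated with a minimal sketch. It is not the authors' prototype: a plain `EventTarget` stands in for the hosting page's window so the code also runs outside a browser, and the class `FakeMessageEvent`, the `log` object, and the `frame1.example` / `frame2.example` origins are hypothetical names introduced only for this illustration.

```javascript
// Minimal stand-in for a browser MessageEvent (assumption for this sketch).
class FakeMessageEvent extends Event {
  constructor(data, origin) {
    super("message");
    this.data = data;
    this.origin = origin;
  }
}

// A plain EventTarget plays the role of the hosting page's window object.
const hostingPage = new EventTarget();
const log = { receiver1: [], receiver2: [], eavesdropper: [] };

// Benign receiver function 1: only accepts messages from its own sender.
hostingPage.addEventListener("message", (event) => {
  if (event.origin === "http://frame1.example") log.receiver1.push(event.data);
});

// Benign receiver function 2: only accepts messages from its own sender.
hostingPage.addEventListener("message", (event) => {
  if (event.origin === "http://frame2.example") log.receiver2.push(event.data);
});

// Malicious receiver function: performs no origin check and records everything.
hostingPage.addEventListener("message", (event) => {
  log.eavesdropper.push({ origin: event.origin, data: event.data });
});

// Two senders post to the hosting page; every registered receiver is invoked.
hostingPage.dispatchEvent(new FakeMessageEvent("userID=42", "http://frame1.example"));
hostingPage.dispatchEvent(new FakeMessageEvent("access_token=abc", "http://frame2.example"));
```

Each benign receiver ends up with only the message from its own sender, while the receiver without an origin check records both messages, which is the essence of the eavesdropping described above.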

V. DANGERNEIGHBOR ATTACK IN THE WILD

Though DangerNeighbor attacks have been identified, their real impact on the security of web applications has not been well studied. Therefore, we perform a measurement study to evaluate DangerNeighbor vulnerabilities in the real world. We first collect the messages and receiver functions from the Alexa top 5,000 websites and 1,200 randomly selected Chrome extensions². Then we investigate how the postMessage mechanism is used by analyzing the collected receiver functions and messages. Finally, we analyze whether the receiver functions perform the message origin checking suggested in the HTML5 standard.

² The data used in this paper was collected in April 2017.

Fig. 4: Top 20 Websites Containing the Largest Number of RecvSources in their Homepages (series: RecvSource Number and Receiver Function Number per website)

A. Data Collection

We develop a small piece of JavaScript, named the collection JavaScript, to collect both receiver functions and messages. The collection JavaScript is injected into the context of the target webpage and runs before any other code. It wraps the addEventListener function, which is responsible for adding postMessage receiver functions. Thus, when a new receiver function is added, its source code can be captured. The collection JavaScript also adds a new postMessage receiver function of its own to collect the messages.

There are two options for injecting the collection JavaScript into the target webpage. First, we implement a toolkit based on a webview component [19] to automatically inject the collection JavaScript into webpages. This toolkit loads the homepages of the top 5,000 Alexa websites and injects the collection JavaScript before they are loaded in the webview. Thereafter, the receiver functions and messages can be collected through the collection JavaScript once the homepages are opened.

Second, we inject the collection JavaScript through a Chrome extension. There are two types of JavaScript in a Chrome extension, i.e., extension JavaScripts and ContentScripts. The former are executed with the full privilege of the extension, and the latter are injected into webpages and executed in the context of those webpages with limited extension privilege. The postMessage method is usually not used for communication among the extension’s own JavaScripts (including extension JavaScripts and ContentScripts), since Chrome extensions have their own message communication mechanism [20]. However, communication between the ContentScripts and the page’s local JavaScripts still needs the postMessage mechanism, since the page’s local JavaScripts do not support the extension’s message mechanism. To study the receiver functions and messages in browser extensions, we randomly choose 1,200 Chrome extensions from the Chrome Web Store [15]. Each extension has a manifest file [21] that predefines a list of webpages into which the associated ContentScripts should be injected. We inject our collection JavaScript as the first ContentScript by editing the extensions’ source code. Thereafter, the collection JavaScript automatically collects the receiver functions and messages once the predefined webpages are opened.
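The two mechanisms of the collection JavaScript described for the data collection (wrapping addEventListener to capture receiver source code, and registering its own receiver to record messages) might look roughly like the sketch below. It is an assumption-laden illustration rather than the authors' tool: `installCollector`, `capturedReceivers`, `capturedMessages`, and `FakeMessageEvent` are hypothetical names, and a plain `EventTarget` stands in for the page's window so the sketch also runs outside a browser.

```javascript
// Hypothetical stores for what the collector observes.
const capturedReceivers = []; // source code of registered "message" listeners
const capturedMessages = [];  // { origin, data } of every delivered message

// Install the collector on a target (the window object in a real page).
function installCollector(target) {
  const originalAdd = target.addEventListener;

  // Wrap addEventListener so each newly added "message" receiver is recorded.
  target.addEventListener = function (type, listener, options) {
    if (type === "message" && typeof listener === "function") {
      capturedReceivers.push(listener.toString());
    }
    return originalAdd.call(this, type, listener, options);
  };

  // Register the collector's own receiver via the original function, so it
  // is not recorded as one of the page's receivers.
  originalAdd.call(target, "message", (event) => {
    capturedMessages.push({ origin: event.origin, data: event.data });
  });
}

// Minimal stand-in for a browser MessageEvent (assumption for this sketch).
class FakeMessageEvent extends Event {
  constructor(data, origin) {
    super("message");
    this.data = data;
    this.origin = origin;
  }
}

// Demo: the collector is installed before any page script runs.
const page = new EventTarget();
installCollector(page);

function msgReceiver(event) {
  return event.data;
}
page.addEventListener("message", msgReceiver);
page.dispatchEvent(new FakeMessageEvent("Hi Alice.", "http://www.bob.com"));
```

The sketch only records listeners passed as functions; listener objects with a handleEvent method, or receivers assigned through an onmessage property, would need additional handling.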

Fig. 5: Top 20 RecvSources the Mostly Used by Websites B. postMessage Usages We investigate how postMessage is used in both webpages and Chrome extensions by analyzing the collected receiver functions and messages. First, we show the results on receiver functions and messages collected from the Alexa top 5,000 websites. In total, 2,987 websites’ homepages contain receiver functions, and we collect 20,865 receiver functions coming from 1,928 different RecvSources. We observe that 2,250 websites introduce multiple receiver functions and 845 websites import receiver functions from more than one RecvSources. Due to the DangerNeighbor attacks, as long as one RecvSource is malicious, the containing websites may suffer from sensitive data leakage. Therefore, the number of different RecvSources contained in one website may reflect the security risk of the website to the DangerNeighbor attacks. In average, we see 2.78 RecvSources among the websites containing receiver functions. Figure 4 shows the top 20 websites whose homepages containing the largest number of RecvSources. Figure 5 shows the top 20 RecvSources that are mostly used by the websites. It could infer the potential attack range once one popular RecvSource becomes malicious. The most popular

ACCEPTED MANUSCRIPT 6

abandon the ones from unknown senders. Son et al. [14] have well studied the origin checking behaviors in the webpages and proved that many webpages performed origin checks incorrectly or did not check at all. Here, we focus on analyzing the message checking behaviors in Chrome extensions. Among the 52 receiver functions found on 54 browser extensions that utilize the postMessage mechanism, we find that only 5 receiver functions perform the suggested origin checking. Compared to the webpage scenario where the postMessage is primarily used for communications among several specific webpages and thus the origin is usually fixed or obeys some simple rules, the ContentScripts in the extensions are broadly injected into various types of webpages. Therefore, it is difficult to generate a common rule for conducting the origin checking. Besides the origin checking, we identify two other types of checking methods in the extensions, namely, source checking and content checking. First, the source checking is achieved by checking the reference of the sender’s window object. Source checking is useful in the extension scenario, since the postMessage is primarily used for communication between the ContentScripts and local page scripts, and these two entities usually belong to the same window object (i.e., with the same source). Therefore, the source checking could be easily achieved by checking if the sender’s source (i.e., event.source) equals to the current one (i.e., window). In the 52 receiver functions, source checking appears 16 times. Second, the content checking is implemented through checking the content of the messages. For example, most receiver functions only accept the messages in json [25] format and containing certain keys. Content checking is commonly used in the receiver functions, in addition to the origin checking or the source checking. According to our experiments, the content checking is used 31 times in the 52 receiver functions. 
In summary, we can see that the existing security checking on receiver functions is not sufficient to defeat DangerNeighbor attacks. Furthermore, it is not reliable to identify malicious receiver functions by investigating their security checking behavior, since the malicious ones may perform security checking with a wild card such as “*” to match everything in the checking rules. In Section VII, we propose two countermeasures that can effectively defeat DangerNeighbor attacks.

C. Security Checking in Browser Extensions

A. OAuth-based SSO System

Since the goal of DangerNeighbor attacks is to steal sensitive data by eavesdropping on the messages, it is most likely that the malicious receiver functions do accept all messages without checking. However, the HTML5 standard suggests the benign receiver functions to check the origin of messages and

OAuth is a standard secure authorization protocol that allows Internet users to authorize websites/applications to access users’ information on other websites without giving them confidential passwords [8]. It is commonly used to implement Single Sign On (SSO) systems. One SSO system


RecvSource is http://pagead2.googlesyndication.com, whose receiver functions appear 8,070 times in 1,040 websites among the Alexa top 5,000 websites. We collect 68,281 messages from 2,415 different websites. Since it is difficult to manually inspect all those messages for potential information leakage, we adopt two techniques to help identify information leakage in a semi-automatic manner. First, we use the term frequency-inverse document frequency (tf-idf) model [22] to vectorize the messages. By counting the word frequency in the message and the word frequency in all messages, we can calculate a tf-idf value to measure the importance of each word in one message. Next, we cluster the vectorized message data into different categories using the k-means algorithm [23]. The clustering operation gives each message a label to identify the category of the message. The messages are clustered into 394 categories. After manually checking the 394 different categories, we find that the messages can be divided into two groups, i.e., request data and response data. The request data is usually used to invoke a function or trigger certain event in the receiver’s webpages (e.g., getAdParams, resize height etc.), while the response data conversely contains the result data corresponding to the requests (e.g., the webpage status, ACK message etc.). We find two types of sensitive information, i.e., ID and token. IDs include various identification information such as userID, videoID, frameID and callbackID, which may leak user’s privacy data. For example, a malicious provider may derive the user’s advertisement or video browsing history according to user ID information. Even worse, some messages contain various temporary tokens or keys, which may lead to more serious privacy leakage. We will give a detailed case study on OAuth access token leakage in Section VI. 
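The tf-idf vectorization step described above can be sketched as follows. The exact tf-idf variant used in the study is not stated, so this sketch assumes the common form tf = count / message length and idf = ln(N / df); the tokenizer, the function names, and the sample messages are illustrative assumptions. The k-means clustering step is not shown.

```javascript
// Split a message into lowercase alphanumeric tokens (assumed tokenizer).
function tokenize(message) {
  return message.toLowerCase().match(/[a-z0-9_]+/g) || [];
}

// Return one Map(word -> tf-idf weight) per message.
function tfidf(messages) {
  const docs = messages.map(tokenize);

  // Document frequency: in how many messages each word appears.
  const df = new Map();
  for (const words of docs) {
    for (const w of new Set(words)) df.set(w, (df.get(w) || 0) + 1);
  }

  return docs.map((words) => {
    // Term counts within this message.
    const counts = new Map();
    for (const w of words) counts.set(w, (counts.get(w) || 0) + 1);

    // Weight = term frequency * inverse document frequency.
    const weights = new Map();
    for (const [w, count] of counts) {
      const tf = count / words.length;
      const idf = Math.log(docs.length / df.get(w));
      weights.set(w, tf * idf);
    }
    return weights;
  });
}

// Made-up sample messages, loosely modeled on the request data described above.
const sampleMessages = [
  '{"cmd":"getAdParams","frameID":"f1"}',
  '{"cmd":"resize","height":300}',
  '{"cmd":"login","access_token":"abc123"}',
];
const weights = tfidf(sampleMessages);
console.log(weights[2]);
```

In this toy input the word "cmd" occurs in every message and therefore gets weight zero, while "access_token" occurs in only one message and gets a positive weight, which is the property that lets rare, potentially sensitive fields stand out.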
Compared with the data collected in July 2015 [24], we find that the postMessage mechanism is now used more frequently in webpages. In the Alexa top 5,000 webpages, the number of receiver functions has more than doubled (from 7,807 to 20,865) and the number of messages has more than quadrupled (from 15,457 to 68,281). The number of potentially vulnerable websites has increased from 544 to 845. Second, we show the results on receiver functions and messages collected from the Chrome extensions. Among the 1,200 extensions, 54 utilize the postMessage method. In total, 52 receiver functions and 6,370 unique messages are collected from the 54 extensions. Three extensions contain more than one receiver, and five receivers are reused in different extensions. Similar to the webpage scenario, two types of sensitive information, i.e., ID and token, are involved in the messages. For example, we found a so-called strNativeKey in one message, which represents a temporary key, as well as ID information such as userID and groupID.

VI. A CASE STUDY ON COMPROMISING OAUTH TOKENS

Based on the observation that the postMessage method is commonly used in implementations of the OAuth protocol [26], we perform a case study on compromising OAuth tokens using DangerNeighbor attacks, where a stolen access token not only leaks users’ private information but also allows the attacker to log in to the relying party’s websites.


[Figures 6 and 7 appear here. Fig. 6 labels: 1. Authorization Request, 2. Login Credential, 3-5. Authorization Code, 6-7. Access Token, 8. User Identity, exchanged among the User, the RP, and the idP’s Authorization Server and Resource Server. Fig. 7 components: Authorization Server, Proxy Frame (from idP), JavaScript SDK receiver function (from idP), and local JavaScript of the RP, connected by HTTP, postMessage (access token), and a function call.]

Fig. 7: postMessage in OAuth-based SSO System

contains three main parties: the user, the identity provider (idP), and the relying party (RP), where the RPs are the real applications/websites to be accessed by the users and the idPs are responsible for providing the authentication services for the RPs. Usually, the idP runs both an authorization server and a resource server. Figure 6 shows a high-level work flow of an SSO system based on the OAuth protocol [27]. When the RP receives an access request from the user, it first sends an authorization request to the user. Then, after the user inputs the login credentials (e.g., username and password) to the idP, the authorization server of the idP verifies the user’s credentials and returns an authorization code. Next, the user forwards this authorization code to the RP, which can obtain an access token from the authorization server using the authorization code provided by the user. Finally, the RP sends the access token to the resource server, obtains the user’s identity information, and allows the user to log in using that identity. In the OAuth-based SSO system, the access token is sensitive and should be protected in transit and in storage, since it represents the authorization of a specific application to access a specific user’s data. Normally, only the application itself, the authorization server, and the resource server may have access to the access token. The application should protect the access token from being stolen by other applications on the same device.

JavaScript SDK-based solutions, but not in the REST API-based solutions. Figure 7 shows the usage pattern of postMessage in the OAuth SSO system implemented via the JavaScript SDK. The three components, i.e., the authorization server, the proxy frame, and the JavaScript SDK, are all provided by the idP. Since the RP imports the JavaScript SDK through an HTML script tag, the SDK has the same origin as the RP’s webpage. The included JavaScript SDK code implements a receiver function and imports the proxy frame via an iframe tag. The proxy frame works as an intermediary that communicates with the authorization server (idP) via the HTTP protocol and relays the access token to the JavaScript SDK (RP) via the postMessage mechanism.
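The following sketch simulates, outside a browser, why this usage pattern exposes the access token: every receiver function registered on the RP page sees the message that the proxy frame posts. FakeWindow is a hypothetical stand-in for a window object, its postMessage is synchronous, and its third parameter (the sender origin, which a real browser derives itself) exists only for the simulation. The origins and the token value are made up.

```javascript
// Minimal stand-in for a browser window: it stores "message" listeners and
// delivers every accepted message to all of them.
class FakeWindow {
  constructor(origin) {
    this.origin = origin;
    this.listeners = [];
  }
  addEventListener(type, listener) {
    if (type === "message") this.listeners.push(listener);
  }
  // Simplified postMessage: only the target origin is checked.
  // senderOrigin is a simulation-only parameter.
  postMessage(data, targetOrigin, senderOrigin) {
    if (targetOrigin !== "*" && targetOrigin !== this.origin) return;
    const event = { data, origin: senderOrigin, source: null };
    for (const listener of this.listeners) listener(event);
  }
}

const rpPage = new FakeWindow("https://rp.example");
const sdkReceived = [];
const eavesdropped = [];

// Receiver function registered by the idP's JavaScript SDK.
rpPage.addEventListener("message", (event) => {
  if (event.origin === "https://idp.example") sdkReceived.push(event.data);
});

// Receiver function registered by another third-party script on the same page.
rpPage.addEventListener("message", (event) => eavesdropped.push(event.data));

// The proxy frame relays the access token to the RP page.
rpPage.postMessage(
  { access_token: "TOKEN-123" },
  "https://rp.example",
  "https://idp.example"
);

console.log(sdkReceived.length, eavesdropped.length);
```

Even though the sender restricts the target origin and the SDK checks the sender origin, the second receiver function still obtains the token, because both receivers live in the same origin.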


Fig. 6: Work Flow of the OAuth-based SSO System

B. postMessage in OAuth-based SSO System

In OAuth-based SSO systems, postMessage is one major method to transfer the access token between the idP and the RP. We study the OAuth implementations of the top 10 popular global identity providers (idPs) [28], including facebook.com, twitter.com, qq.com, google.com, yahoo.com, sina.com.cn, openId, vkontakte.ru, weibo.com, and linkedin.com. All idPs provide Representational State Transfer (REST) [29] APIs for the RPs to implement the OAuth SSO system. Facebook, Google, and LinkedIn also provide an additional JavaScript SDK to help RPs implement the OAuth SSO system [30], [31], [32]. We observe that the postMessage mechanism is used in all three

C. DangerNeighbor Attacks on OAuth-based SSO Systems

We perform a case study of DangerNeighbor attacks on real-world OAuth-based SSO systems and verify that attackers can successfully steal access tokens through these attacks. In addition, we evaluate the potential consequences if the access tokens are stolen. To facilitate identifying DangerNeighbor vulnerabilities, we develop a Chrome extension that automatically injects a postMessage receiver function into the top 2,000 Alexa websites and then opens them one by one in the Chrome browser. After we manually log in to the websites with the supported SSO services, our Chrome extension collects the received messages and sends them to a remote server. We can then check whether any access tokens are leaked in the received messages. To evaluate the impact of stolen OAuth access tokens, we develop an extension to Fiddler [33], a web debugging proxy for browsers. Using this Fiddler extension, we first analyze what sensitive information might be disclosed with the stolen access tokens. Next, we verify whether the stolen access tokens could be misused by the attackers to disguise themselves as the token-associated real user and log in to the related RPs.

1) DangerNeighbor Vulnerabilities: Among the top 2,000 Alexa websites, we find 462 websites using the Facebook OAuth-based SSO system, 325 websites using the Google OAuth-based SSO system, and 42 websites using the LinkedIn OAuth-based


SSO system. In the following, we refer to the websites using Facebook OAuth, Google OAuth, and LinkedIn OAuth as Facebook RPs, Google RPs, and LinkedIn RPs, respectively. Among these RPs, 183 (39.61%) Facebook RPs, 76 (23.38%) Google RPs, and 1 (2.38%) LinkedIn RP send access tokens via the postMessage method and are therefore vulnerable to DangerNeighbor attacks. All other idPs, for which we did not find access token disclosure, are implemented through the REST APIs, which do not use postMessage. As discussed in Section VI-B, postMessage is mainly used in the JavaScript SDK-based OAuth solutions for access token transmission. However, compared to the REST APIs, the JavaScript SDK is easier for developers to use due to its simple APIs. In addition, the JavaScript SDK solution is recommended by popular idPs such as Facebook. Since it can be anticipated that more idPs may provide a JavaScript SDK in the future, it is critical to prevent DangerNeighbor vulnerabilities in the JavaScript SDK-based OAuth solutions. Figure 8 illustrates the statistics of access token disclosure on RPs using the Facebook and Google OAuth SSO services. On average, there are 9.15 vulnerable Facebook RPs and 3.8 vulnerable Google RPs in every 100 websites. We do not see a dramatic security improvement against the DangerNeighbor vulnerability as the websites’ rankings improve. In other words, the DangerNeighbor vulnerability is still not well known, even to developers from the most famous Internet companies.

2) Potential Attack Consequences: We observe two types of severe attack consequences if the OAuth access tokens are stolen. First, the attackers could illegally access the token-associated users’ sensitive data on the idPs. Second, the attackers may disguise themselves as the token-associated user to log in to the related RPs.

Private Information Leakage on idPs. Each access token is associated with a permission list that the idPs check to constrain what resource information can be accessed with the token. Thus, when the access token is stolen, the attacker inherits the associated permissions to access the victim user’s private information. Our experiments show that the RPs apply for access permissions on the idPs’ resources through a scope parameter in a URL request sent to the idPs. Since our Fiddler extension can intercept all requests and responses transmitted between the RPs and idPs, we can extract the permissions associated with the access tokens. Here, we focus on analyzing the permissions associated with the Facebook RPs and Google RPs, since there is only 1 vulnerable LinkedIn RP found in our experiments. Table I and Table II show the permissions requested by Facebook RPs and Google RPs, respectively. The first column lists the name of the permission, and the second column indicates the number of RPs applying the permission. The third column shows the number of vulnerable RPs applying it. The fourth and fifth columns give the top Alexa ranking among the RPs applying the permission and among the vulnerable RPs applying it, respectively. According to Table I, 42 different permissions are used by Facebook RPs, which cover 93.33% of the 45 permissions provided by Facebook [34], and 32 of them (71.11% of the 45 permissions) are requested by the vulnerable RPs. The public_profile, email, user_friends, and user_birthday permissions are the four most widely requested by the vulnerable RPs and thus are the most likely to be leaked. In particular, the public_profile permission is associated with user information such as userID, full name, age range, and gender, and based on the agreement between Facebook and the users, it is set to public by default. We also find that some RPs (including some with high Alexa rankings) request a large number of permissions and would expose all of them if attacked. For example, 35 RPs request more than 7 permissions, and 16 of them are vulnerable to the DangerNeighbor attack. In total, 27 different permissions will be leaked if these RPs are attacked. The permissions associated with Google RPs are listed in Table II, which shows that 57.89% of the permissions are requested by vulnerable RPs.

Permission                  RPNum  VulRPNum  Top    VulTop
public_profile              462    174       13     15
email                       455    170       13     15
user_birthday               134    52        13     44
user_friends                123    61        44     44
user_location               67     27        50     107
publish_actions             62     31        50     122
user_likes                  46     23        44     44
user_hometown               32     12        50     107
user_photos                 25     15        57     238
user_education_history      21     8         107    107
user_about_me               17     5         44     44
user_work_history           17     6         97     107
user_status                 9      5         17     238
user_website                9      1         227    1,143
publish_stream              9      3         17     413
user_relationships          8      4         227    808
user_posts                  7      1         57     1,616
read_stream                 6      1         614    1,198
offline_access              5      2         17     808
read_custom_friendlists     4      0         390    -
user_religion_politics      4      2         624    1,051
user_relationship_details   4      2         805    1,051
manage_pages                3      1         359    1,616
user_interests              3      1         258    760
user_videos                 3      0         258    -
user_tagged_places          3      1         430    1,051
publish_pages               2      1         441    1,616
status_update               2      1         413    413
friends_likes               2      1         866    1,980
user_activities             2      1         1,051  1,051
user_actions.books          2      2         827    827
user_actions.music          2      1         795    1,051
user_events                 1      0         441    -
user_actions.fitness        1      1         1,051  1,051
read_page_mailboxes         1      0         441    -
user_managed_groups         1      0         441    -
read_friendlists            1      1         1,786  1,786
friends_activities          1      0         866    -
basic_info                  1      0         1,367  -
read_insights               1      0         441    -
ads_management              1      0         441    -
pages_manage_cta            1      0         441    -

Permission: the name of the permission. RPNum: number of RPs applying the permission. VulRPNum: number of vulnerable RPs applying the permission. Top: top ranking of the RP applying the permission. VulTop: top ranking of the vulnerable RP applying the permission (“-” when no vulnerable RP applies it).

TABLE I: Permissions Requested by Facebook RPs
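As an illustration of how requested permissions can be read from the scope parameter of an intercepted authorization request, the sketch below uses the standard URL API. It is not the Fiddler extension used in the study; the function name, the example URLs, and the assumption that scopes are separated by commas or whitespace are all hypothetical.

```javascript
// Extract the list of requested permissions from the "scope" query parameter
// of an authorization request URL. Separator handling is an assumption:
// scopes are split on commas and whitespace.
function extractScopes(requestUrl) {
  const scope = new URL(requestUrl).searchParams.get("scope");
  if (!scope) return [];
  return scope.split(/[\s,]+/).filter((s) => s.length > 0);
}

// Made-up example requests (hosts and client IDs are placeholders).
const commaSeparated =
  "https://idp.example/dialog/oauth?client_id=123&scope=public_profile,email,user_friends";
const spaceSeparated =
  "https://idp.example/o/oauth2/auth?client_id=456&scope=email%20profile";

console.log(extractScopes(commaSeparated));
console.log(extractScopes(spaceSeparated));
```

Counting the extracted names per RP, and separately for RPs found to leak tokens, would yield per-permission statistics of the kind reported in the RPNum and VulRPNum columns.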


[Figure 8: two bar charts of the number of RPs (y-axis) per Alexa ranking interval of 100, from 0-100 to 1900-2000 (x-axis), each split into “No Access Token in Message” and “With Access Token in Message”. (a) Websites Using Facebook OAuth. (b) Websites Using Google OAuth.]

Fig. 8: Access Token Disclosure on Websites Utilizing Facebook and Google OAuth Services

Permission                                                RPNum  VulRPNum  Top    VulTop
email                                                     151    34        43     107
https://www.googleapis.com/auth/userinfo.email            136    32        23     53
https://www.googleapis.com/auth/plus.login                118    45        53     53
profile                                                   112    30        43     107
https://www.googleapis.com/auth/userinfo.profile          85     11        23     122
openid                                                    39     14        50     122
https://www.googleapis.com/auth/plus.me                   38     10        146    183
https://www.googleapis.com/auth/plus.profile.emails.read  36     10        166    166
https://www.google.com/m8/feeds/                          13     0         496    -
https://www.googleapis.com/auth/contacts.readonly         4      1         725    833
https://www.google.com/m8/feeds                           2      1         107    107
https://www.google.com/reader/api/                        1      0         1,419  -
https://www.googleapis.com/auth/plus.profiles.write       1      0         1,770  -
https://www.googleapis.com/auth/analytics.readonly        1      0         1,512  -
https://mail.google.com/                                  1      0         1,114  -
https://www.googleapis.com/auth/drive.metadata.readonly   1      0         1,845  -
https://picasaweb.google.com/data/                        1      0         1,884  -
https://www.googleapis.com/auth/user.phonenumbers.read    1      1         959    959
http://www.google.com/m8/feeds/                           1      0         850    -

TABLE II: Permissions Requested by Google RPs


The email and profile permissions are also the most widely requested by Google RPs. However, the compatibility of the Google OAuth service is not as good as that of Facebook, since some permissions have different names in different versions of the Google OAuth service. For example, “email” and “https://www.googleapis.com/auth/userinfo.email” in Table II actually represent the same permission. Comparing Table I and Table II, we find that less sensitive information is leaked through Google RPs. There may be two primary reasons. First, the Google OAuth service is not as popular as the Facebook OAuth service. Second, Google is not a social network website holding as much user information as Facebook.

Disguise Attack. We use the term Disguise Attack for attacks in which the attacker disguises as the token-associated user to log in to the related RPs. To verify the existing RPs’ capability of defeating the disguise attack, we extend the Fiddler extension to log in to those RPs by replacing the normal access tokens with the stolen ones. In total, we find 11 vulnerable RPs suffering from the disguise attack, including 9 Facebook RPs and 2 Google RPs. In other words, the attacker can log in and operate as the token-associated user on those RPs. Normally, the login state can be sustained until the token expires, e.g., about 5,000 seconds for a Facebook token. If the RP sets a cookie for follow-up login authentication, the attacker can store the cookie and thus extend the login state to as long as several months. The 11 vulnerable RPs include various types of websites, such as social network, portal, and photo management websites. For example, one RP is the home page of a screenshot management tool that lets users manage their screenshots and share them to social networks. Two RPs are sports websites, on which the user can browse sports news, buy tickets for competitions, and set an address, birthday, and favorite sport. Other vulnerable RPs include five online video websites, one digital product selling website, one price comparison website, and one real estate selling/buying website. From these websites, the attacker can obtain sensitive information such as browsing history, favorite lists, and interests. We find two main reasons that the disguise attack may fail on vulnerable RPs. First, multiple authentication factors are combined for user authentication on some RPs. For example, an RP verifies both the IP address and the access token when performing user authentication. Second, some RPs utilize other credential data rather than access tokens to authenticate the


VII. COUNTERMEASURES

A. Checking Origin via JavaScript Function Wrapper


As discussed in Section II-B, when postMessage is used for communication between different webpages, the receiver functions and the corresponding sender code are usually provided by the same service provider. Therefore, there usually exists a dedicated mapping between the origin of the message and the RecvSource of the receiver function. Based on this observation, we propose a lightweight JavaScript-based defense solution. The basic idea is to enforce, on each receiver function, an additional check of the message’s origin against the receiver function’s RecvSource, and to accept only the messages whose origin matches the receiver function’s RecvSource. As illustrated in Listing 2, the additional check is achieved by wrapping the addEventListener function with a piece of JavaScript code, which enforces the mandatory check in the receiver functions each time the addEventListener function is invoked to attach a new receiver function. The JavaScript function wrapper [36] is a technique to extend or augment the behavior of certain functions without breaking the existing functionality when working with JavaScript libraries and widgets. The wrapper takes an existing function reference and maps it to a new reference. The original function is then overwritten with a new one that performs some extra actions before executing the original function. This defense solution requires the hosting website to introduce the wrapper shown in Listing 2 and to ensure that it is executed before any other JavaScript code, so that all invocations of the addEventListener function are directed to the wrapper. It also requires the hosting website to maintain an origin-RecvSource mapping table that declares the mapping between the message’s origin and the receiver function’s RecvSource. The addEventListener wrapper in Listing 2 works as follows. Lines 3 to 22 define a new addEventListener function that overrides the original one.
In the new addEventListener, the wrapper checks whether a postMessage receiver function is being attached (Line 7) and, if so, continues processing,


 1  (function(old_eventListener){
 2    //Override the addEventListener function
 3    window.addEventListener = function(type,
 4        listener, usecapture){
 5      var RecvSrcMap = {};
 6      //If it is receiver attaching event
 7      if(/message/i.test(type)) {
 8        //Get the RecvSource
 9        RecvSrcMap[listener] = getRecvSrc();
10        //Override the receiver function
11        newreceiver = function(event) {
12          //Check the Origin-RecvSrc map
13          var allow = checktable(RecvSrcMap[listener], event.origin);
14          //Invoke original receiver
15          if (allow) listener(event);
16        };
17        //Attach the newreceiver
18        return old_eventListener.apply(this,[type, newreceiver, usecapture]);
19      }
20      //Other events
21      return old_eventListener.apply(this,[type, listener, usecapture]);
22    }
23  })(window.addEventListener);


After a systematic study of the DangerNeighbor attacks, we propose two countermeasures to defeat this type of attack. The root cause of the DangerNeighbor attack is a design flaw of the postMessage mechanism, which assumes that all the JavaScript code inside one origin can be trusted. However, this assumption is incorrect. Therefore, the first solution targets malicious service providers by using the JavaScript function wrapper technique [36] to enforce, on each receiver function, a check of the message’s origin against the receiver function’s RecvSource. It is lightweight and can be easily deployed on hosting websites. The second solution is more general: it extends the postMessage APIs in the browsers with an additional declaration of the designated receiver function when the sender sends a message.

Listing 2: addEventListener Wrapper


users. According to [35], there are three kinds of data which can be used as SSO credentials: the access token, authorization code, and user profile. When the other two credentials are used for user authentication, disguise attack will fail.


or else it directly attaches the listener (Line 21). For a receiver function attaching event, it obtains the RecvSource of the receiver function (Line 9), overrides the original receiver function with a new one (Lines 11-16), and attaches the new receiver function (Line 18). The new receiver function checks whether the RecvSource maps to the origin of the message (Line 13) and invokes the original receiver function if there is a match (Line 15). One challenge in Listing 2 is how to get the RecvSource of the receiver function. We solve it by throwing an exception and catching it immediately in the addEventListener function, so that we can extract the RecvSource of the receiver function from the stack information included in the exception object. Note that we must ensure that the security checking is enforced and cannot be bypassed. First, we must avoid an incomplete wrap of the addEventListener function. Although a developer might define an alias for the addEventListener function, we execute the addEventListener wrapper before any other JavaScript code, so no alias can be created for the original addEventListener method. In addition, though all types of DOM objects can invoke the addEventListener method to attach a receiver function, only receiver functions attached to window objects can be invoked when receiving a message. Therefore, we only need to wrap window.addEventListener. In the Chrome browser, a receiver function attached to the prototype of the window object (i.e., window.__proto__) can also be invoked, so window.__proto__.addEventListener is wrapped too. Second, since the origin-RecvSource mapping table is stored

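[Figure 9 shows the messages exchanged between Sender (S) and Receiver (R), each written as (content, target): (“init”, “*”) from S to R; (“R_ID, R_AuthenticationInfo”, “*”) from R to S; (“S_ID, S_AuthenticationInfo”, “R_ID”) from S to R; followed by (“Messages”, “S_ID”) and (“Messages”, “R_ID”).]

Fig. 9: Protocol to Exchange the Receiver IDs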

mechanism, we introduce a “receiver ID” property into the “event” parameter of the receiver function (Line 5 Listing 1), so that the receiver function can obtain its own receiver ID. For the postMessage API, the “receiver ID” parameter is attached to the end of origin parameter. For example, if the origin is “http://www.demo.com” and the receiver ID is 7, the new origin parameter will be “http://www.demo.com/7”.
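A minimal sketch of how such an extended origin parameter could be split back into the origin and the receiver ID is shown below. The function name parseTarget is hypothetical, and treating a target without an ID as a broadcast to all receiver functions is an assumption made here for backward compatibility; the prototype implementation inside Chromium is not shown in the paper.

```javascript
// Split an extended target such as "http://www.demo.com/7" into the origin
// and the receiver ID. A missing ID is treated as "*" (all receivers),
// which is an assumption for compatibility with the existing API.
function parseTarget(target) {
  if (target === "*") return { origin: "*", receiverId: "*" };
  const url = new URL(target);
  const id = url.pathname.replace(/^\/+/, "");
  return { origin: url.origin, receiverId: id === "" ? "*" : id };
}

console.log(parseTarget("http://www.demo.com/7"));
console.log(parseTarget("http://www.demo.com"));
```

Encoding the receiver ID in the path keeps the call signature of postMessage unchanged, so existing callers that pass a plain origin continue to work.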


as a dictionary inside the checktable() function (Line 13 in Listing 2) and is used to check the mapping between the message’s origin and the receiver function’s RecvSource, attackers can poison the root prototype to disturb the check. All JavaScript objects inherit basic properties and methods from the Object prototype. If the attacker can insert a forged mapping as a field of the Object prototype, e.g., Object.prototype[‘http://www.evil.com’] = [‘http://www.alice.com’], this field is inherited by the origin-RecvSource mapping table and enables the receiver functions from ‘http://www.evil.com’ to obtain the messages from ‘http://www.alice.com’. Fortunately, JavaScript provides the ability to check whether a property is inherited from the prototype, so we can defeat this attack by inspecting whether an origin-RecvSource mapping item in the table is inherited from the prototype. Third, since some built-in functions and objects [37] are used in our solution, if these functions are hijacked before they are invoked, the attackers could modify the return values and disable our defense. For example, the built-in function String.match is used in the getRecvSrc() method (Line 9 in Listing 2) to obtain the RecvSource of the receiver function, and it is executed when addEventListener is called. The attackers can hijack the String.match method and return a fake RecvSource in order to disable the enforced checking. We can defeat this attack by storing the references of all related built-in objects and functions as local variables and using these references to invoke them. As such, the original built-in functions or objects can be invoked even if the attackers wrap and hijack them. Lastly, JavaScript provides a delete operator to delete properties, so when the delete operator is applied to the built-in functions, it can delete any wrappers and restore the original ones. If the attacker applies the delete operator to window.addEventListener, our defense would be disabled. Although use of the delete operator cannot be totally prevented, it is easy to detect, and it makes DangerNeighbor attacks less stealthy. As a solution, we treat use of the delete operator as a high-risk behavior and record it for further analysis.
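The wrapper idea together with two of the hardening steps above (the own-property check against prototype poisoning and the early capture of a built-in) can be sketched as follows. This is not the paper's implementation: the page object is a stand-in for window, the RecvSource lookup is stubbed by a hypothetical callback instead of stack inspection, and the table contents and origins are made up. Following the poisoning example above, the table is keyed by RecvSource and lists the origins each one may receive from.

```javascript
// Capture the built-in before any untrusted script can replace it.
const hasOwn = Function.prototype.call.bind(Object.prototype.hasOwnProperty);

// Hypothetical mapping table: RecvSource -> origins it may receive from.
const table = { "https://sdk.idp.example": ["https://idp.example"] };

// Accept only entries that are own properties of the table, so that
// mappings injected through Object.prototype are ignored.
function checktable(recvSource, origin) {
  if (!hasOwn(table, recvSource)) return false;
  const allowed = table[recvSource];
  for (let i = 0; i < allowed.length; i++) {
    if (allowed[i] === origin) return true;
  }
  return false;
}

// Wrap target.addEventListener so every "message" receiver is guarded.
// getRecvSrc is a stub for the stack-based lookup described in the text.
function installWrapper(target, getRecvSrc) {
  const original = target.addEventListener;
  target.addEventListener = function (type, listener, useCapture) {
    if (type !== "message") return original.call(this, type, listener, useCapture);
    const recvSource = getRecvSrc(); // recorded when the receiver is attached
    const guarded = function (event) {
      if (checktable(recvSource, event.origin)) listener(event);
    };
    return original.call(this, type, guarded, useCapture);
  };
}

// Stand-in page object (not a real window).
const page = {
  listeners: [],
  addEventListener(type, listener) { this.listeners.push(listener); },
  dispatch(event) { this.listeners.forEach((l) => l(event)); },
};
let currentScript = "";
installWrapper(page, () => currentScript);

const got = { sdk: [], evil: [] };
currentScript = "https://sdk.idp.example";
page.addEventListener("message", (e) => got.sdk.push(e.data));
currentScript = "https://evil.example";
page.addEventListener("message", (e) => got.evil.push(e.data));

// The attacker poisons the root prototype with a forged mapping.
Object.prototype["https://evil.example"] = ["https://idp.example"];
page.dispatch({ origin: "https://idp.example", data: "TOKEN-123" });
delete Object.prototype["https://evil.example"]; // clean up the demo

console.log(got.sdk.length, got.evil.length);
```

The sketch covers only the prototype-poisoning and built-in-capture points. It does not address aliasing of addEventListener, the window prototype in Chrome, or the delete operator, all of which a deployed wrapper would also have to handle.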

The main problem exploited by the DangerNeighbor attacks is that the postMessage mechanism allows all receiver functions in an origin to receive all messages sent to that origin. If we can constrain the target of the messages to a designated receiver function, the DangerNeighbor attacks can be blocked. Based on this observation, we propose a more general solution that modifies the postMessage mechanism in the browser so that a message carries additional information on the designated receiver function when it is sent. Specifically, the browser generates and maintains a receiver ID for each receiver function, provides an interface for the receiver function to obtain its own receiver ID, retrofits the postMessage API (i.e., postMessage()) to include an additional “receiver ID” parameter, and ensures that all messages are only sent to the designated receiver functions. To remain compatible with the old version of the postMessage

The detailed control flow is as follows. First, the sender sends an “init” message with the receiver ID set to “*”, which is broadcast to all receiver functions. Then, the corresponding receiver function obtains its receiver ID (i.e., R_ID) from the “event” parameter and sends it back to the sender along with the authentication information (i.e., R_AuthenticationInfo). When the sender receives the response, it authenticates the identity of the receiver, extracts the R_ID, and sends its own receiver ID (i.e., S_ID) along with its authentication information (i.e., S_AuthenticationInfo) to the R_ID. Finally, both the sender and the receiver know each other’s receiver ID and can communicate without worrying about privacy leakage. We implement a prototype of the new postMessage mechanism on Chromium v57.0.2971.0 (64-bit), which introduces about 200 source lines of code (SLOC).


B. Modifying postMessage APIs

In the beginning, the sender in another origin cannot know the receiver ID generated by the browser, and the receiver function does not know its own receiver ID until it is triggered. Therefore, we design a protocol to exchange the receiver IDs between the sender and the receiver. According to Section II-B, both the receiver and sender JavaScript code are controlled by the service provider. Therefore, the protocol can be implemented by the service provider inside the sender and receiver JavaScript code. As illustrated in Figure 9, R_ID and S_ID represent the receiver IDs of the receiver functions inside the receiver and sender code, respectively, which are generated by the browser and can be obtained through the “event” parameter in the receiver functions. R_AuthenticationInfo and S_AuthenticationInfo correspond to the authentication information preset by the service provider to authenticate the identities of the receiver and sender. To prevent the authentication information from being stolen by attackers, the service provider should dynamically generate R_AuthenticationInfo and S_AuthenticationInfo (e.g., based on the IP address) each time the receiver and sender JavaScript code is downloaded (i.e., when the webpage containing the JavaScript code is loaded).
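The receiver-ID exchange described in this section can be simulated outside a browser as follows. The Page class is a hypothetical stand-in for the modified browser (it assigns receiver IDs and delivers a message only to the designated ID, with "*" meaning broadcast). The authentication values are fixed strings purely for illustration, whereas the paper calls for dynamically generated ones, and origin handling is omitted.

```javascript
// Stand-in for the modified browser: each receiver function gets an ID, and
// a message reaches only the designated receiver ("*" = all receivers).
class Page {
  constructor() {
    this.receivers = new Map();
    this.nextId = 1;
  }
  addReceiver(fn) {
    this.receivers.set(String(this.nextId++), fn);
  }
  post(data, receiverId) {
    for (const [id, fn] of this.receivers) {
      // The receiver learns its own ID only through the event object.
      if (receiverId === "*" || receiverId === id) fn({ data, receiverId: id });
    }
  }
}

const senderPage = new Page();   // e.g., the service provider's frame
const receiverPage = new Page(); // the hosting page with several receivers
const R_AUTH = "r-secret";       // placeholder for R_AuthenticationInfo
const S_AUTH = "s-secret";       // placeholder for S_AuthenticationInfo
const log = { receiver: [], spy: [] };
let rId = null; // R_ID as learned by the sender
let sId = null; // S_ID as learned by the receiver

// Eavesdropping receiver function on the hosting page.
receiverPage.addReceiver((e) => log.spy.push(e.data));

// Legitimate receiver: answers "init", then records S_ID, then takes data.
receiverPage.addReceiver((e) => {
  if (e.data === "init") {
    senderPage.post({ id: e.receiverId, auth: R_AUTH }, "*");
  } else if (e.data.auth === S_AUTH) {
    sId = e.data.id;
  } else {
    log.receiver.push(e.data);
  }
});

// Receiver function inside the sender code: authenticates R and replies.
senderPage.addReceiver((e) => {
  if (e.data.auth === R_AUTH) {
    rId = e.data.id;
    receiverPage.post({ id: e.receiverId, auth: S_AUTH }, rId);
  }
});

receiverPage.post("init", "*");                        // step 1: broadcast
receiverPage.post({ access_token: "TOKEN-123" }, rId); // targeted message

console.log(rId, sId, log.receiver.length, log.spy.length);
```

In this simulation the eavesdropping receiver observes only the broadcast "init" message. Everything sent after the handshake is addressed to R_ID and is never delivered to it.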


VIII. DISCUSSION

This is “Discussion” section.

C. Malicious Browser Extensions


D. Privacy Analysis

The protection of the privacy in the web environment is a major issue in web security. Jang et al. [45] found 43 instances of privacy-violating information flows in the Alexa top sites. Gervais et al. [46] provided a quantitative methodology for evaluating users’ web-search privacy. Acquisti et al. [47] investigated individual privacy valuations in a series of experiments informed by theories from behavioral economics and decision research. Olejnik et al. [48] performed a privacy analysis of Cookie matching and real-time bidding and measured the leakage of users’ browsing histories due to the usage of these mechanisms. Our DangerNeighbor attack is a new attack to steal user’s privacy data. We show that the messages delivered by postMessage indeed carry user’s privacy data.


IX. RELATED WORK

A. postMessage in Web Browser

Barth et al. [3] conducted a comprehensive study of cross-frame communication in Web browsers and demonstrated attacks on the confidentiality of messages sent via postMessage under certain frame navigation policies, including the descendant policy. Due to the security concerns raised by their research, the HTML5 standard committee added an origin parameter to the postMessage API. However, we discovered that simply adding an origin parameter cannot totally defeat attacks that misuse the postMessage mechanism. In this paper, we suggest adding a receiver ID parameter in addition to the origin parameter. Hanna et al. [26] analyzed the usage of postMessage in Facebook Connect and Google Friend Connect, and showed how incomplete origin checks and guessable random tokens may compromise message integrity and confidentiality. Son et al. [14] found numerous legitimate scripts that used postMessage incorrectly and could be exploited because of flawed origin checks. All these studies assumed that the attacker is a postMessage sender. In contrast, the DangerNeighbor attacks assume that the attacker is a postMessage receiver, and it is more difficult to detect such a stealthy attack that only passively eavesdrops on messages.

Browser extensions are one attack vector that attackers can misuse to inject malicious JavaScript code [43]. Carlini et al. [44] performed a security review of 100 Chrome extensions and found 70 vulnerabilities across 40 extensions. To defeat those attacks, Kapravelos et al. [17] presented Hulk, a dynamic analysis system that detects malicious behavior in browser extensions. Jagpal et al. [16] exposed widespread efforts by criminals to abuse the Chrome Web Store as a platform for distributing malicious extensions and presented a detection system called WebEval. In this paper, we introduce a new passive attack that can be implemented with a malicious browser extension. We suggest limiting the usage of postMessage in browser extensions.


B. JavaScript Inclusion

We find that attackers may use malicious third-party service providers to launch DangerNeighbor attacks, since the third-party JavaScript code from the service provider runs in the same origin as the hosting page. A number of research works show that JavaScript inclusion may introduce vulnerabilities into a website. Nikiforakis et al. [5] performed a large-scale measurement study of JavaScript inclusions and concluded that even the top Internet sites may trust remote service providers that can be easily compromised by attackers. Lekies et al. [4] systematically investigated the security issues caused by dynamic JavaScript inclusion. Xing et al. [38] presented InteGuard to offer security protection against vulnerable web API integrations. ScriptInspector [6] is a modified browser that can intercept, record, and check third-party script accesses to critical resources against security policies. A JavaScript function wrapper is one practical method to enforce extra isolation for untrusted JavaScript code. Phung et al. [36] presented a method to transform JavaScript code into self-protecting code based on function wrappers. Magazinius et al. [39] extended Phung’s work with a systematic way to avoid the identified vulnerabilities. Meyerovich et al. [40] proposed object views as a user-level mechanism for fine-grained JavaScript object sharing. Google’s Caja project [41] and Facebook’s FBJS project [42] adopted JavaScript function wrappers to provide security checks. Our lightweight JavaScript defense applies a specific constraint to the postMessage method using a JavaScript function wrapper.
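As a rough illustration of the function-wrapper idea, the sketch below wraps the listener-registration function of a simulated window so that only approved receiver functions can subscribe to "message" events. It is one possible wrapper written for illustration and should not be read as the exact defense proposed in this paper; `guardMessageListeners`, `makeFakeWindow`, `deliver`, and the `provider.example` origin are hypothetical names.

```javascript
// Hypothetical wrapper: replaces target.addEventListener with a version
// that refuses "message" receivers that are not on an allow list.
function guardMessageListeners(target, allowedReceivers) {
  const original = target.addEventListener.bind(target);
  target.addEventListener = function (type, fn, ...rest) {
    if (type === "message" && !allowedReceivers.has(fn)) {
      return false; // unapproved receiver is not registered
    }
    original(type, fn, ...rest);
    return true;
  };
}

// Simulated window that models only "message" receivers (assumption:
// this stands in for a real page purely to keep the sketch runnable).
function makeFakeWindow() {
  const receivers = [];
  return {
    addEventListener(type, fn) {
      if (type === "message") receivers.push(fn);
    },
    deliver(data, origin) {
      receivers.forEach((fn) => fn({ data, origin }));
    },
  };
}

const win = makeFakeWindow();
const trustedLog = [];
const snoopLog = [];
const trustedReceiver = (e) => trustedLog.push(e.data);
const snoopReceiver = (e) => snoopLog.push(e.data);

// The wrapper is installed before any other script registers a receiver.
guardMessageListeners(win, new Set([trustedReceiver]));
const trustedOk = win.addEventListener("message", trustedReceiver);
const snoopOk = win.addEventListener("message", snoopReceiver);

win.deliver("token=abc123", "https://provider.example");
```

In this simulation the trusted receiver is registered and receives the message, while the unapproved receiver is rejected and sees nothing. In a real page, such a wrapper would have to be installed before any untrusted script runs and would also need to cover the `window.onmessage` property, which offers a second way to register a receiver.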

E. SOP in Smart Phone System

The Same-Origin Policy (SOP) is the principal security policy enforced by the web browser. However, the postMessage mechanism provides a legitimate way to break the SOP [3], [14], [26]. When the SOP is applied on smartphones, new problems emerge. Wang et al. [49] reported the first systematic study on mobile cross-origin risk. Georgiev et al. [50] analyzed the software stack created by hybrid frameworks and demonstrated that it does not properly compose the access control policies governing web code and local code. A hybrid application [51] is a native mobile application that incorporates Web-based technologies such as JavaScript and HTML. The frameworks used to develop hybrid applications are called hybrid frameworks. Jin et al. [52] also studied hybrid applications and found a new form of code injection attack, which shares the fundamental cause of Cross-Site Scripting (XSS) attacks but has more channels for injecting code in hybrid applications than XSS does.

F. OAuth Security

It is critical to protect the security of OAuth, particularly the OAuth access tokens. Formal methods have been used to prove the security of the OAuth protocol [53], [54], [55]. Moreover, real-world implementations of the OAuth protocol have attracted a lot of attention [35], [56]. Zhou et al. [57] designed and implemented an automatic vulnerability checker named SSOScan for applications using Facebook Single Sign-On APIs. Hanna et al. [26] studied the real-world usage of postMessage in Facebook Connect and Google Friend Connect. Both works demonstrated that misuse of postMessage in OAuth implementations could leak the OAuth access token. Our work shows that even when the developer uses postMessage according to the official specification, the access token may still be leaked.

X. CONCLUSION

The postMessage mechanism in HTML5 enables cross-origin communication, which makes communication between different origins convenient. However, it also introduces new security problems. In this paper, we identify a new DangerNeighbor attack against the HTML5 postMessage mechanism. This passive attack enables the attacker to eavesdrop on the messages sent via postMessage. We develop two prototypes of the DangerNeighbor attack using two different attack vectors. We also develop a toolset to evaluate the threat of the DangerNeighbor attack in the wild. The toolset helps us collect the messages and receiver functions in the Alexa top 5,000 sites and 1,200 randomly chosen Chrome extensions. A case study is conducted on how to compromise OAuth access tokens through the DangerNeighbor attack. The experimental results confirm that the DangerNeighbor attack could compromise OAuth and steal access tokens from vulnerable websites. In total, 39.61% of the websites using Facebook OAuth and 23.38% of the websites using Google OAuth among the top 2,000 Alexa websites are vulnerable. Finally, we propose two defense methods to defeat the DangerNeighbor attack.

ACKNOWLEDGMENT

We would like to thank our anonymous reviewers for their valuable comments and suggestions. This work is supported by the National Key Research and Development Program of China under Grant No. 2016YFB0800102, the U.S. ONR grants N00014-16-1-3214 and N00014-16-1-3216, the NSF grant CNS-1815650, the National Natural Science Foundation of China under Grant No. 61802398, and the National Cryptography Development Fund under Award No. MMJJ20180222.

REFERENCES

[1] “Same origin policy,” http://www.w3.org/Security/wiki/Same_Origin_Policy.
[2] “XSS, ‘cross-site scripting’. OWASP,” https://www.owasp.org/index.php/XSS.
[3] A. Barth, C. Jackson, and J. C. Mitchell, “Securing frame communication in browsers,” in Proceedings of the 17th USENIX Security Symposium, San Jose, CA, USA, July 28-August 1, 2008, pp. 17–30. [Online]. Available: http://www.usenix.org/events/sec08/tech/full_papers/barth/barth.pdf
[4] S. Lekies, B. Stock, M. Wentzel, and M. Johns, “The unexpected dangers of dynamic javascript,” in 24th USENIX Security Symposium, Washington, D.C., USA, August 12-14, 2015, pp. 723–735. [Online]. Available: https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/lekies
[5] N. Nikiforakis, L. Invernizzi, A. Kapravelos, S. V. Acker, W. Joosen, C. Kruegel, F. Piessens, and G. Vigna, “You are what you include: large-scale evaluation of remote javascript inclusions,” in the ACM Conference on Computer and Communications Security (CCS), Raleigh, NC, USA, October 16-18, 2012, pp. 736–747. [Online]. Available: http://doi.acm.org/10.1145/2382196.2382274
[6] Y. Zhou and D. Evans, “Understanding and monitoring embedded web scripts,” in IEEE Symposium on Security and Privacy (S&P), San Jose, CA, USA, May 17-21, 2015, pp. 850–865. [Online]. Available: http://dx.doi.org/10.1109/SP.2015.57
[7] “Alexa, top sites on the web,” http://www.alexa.com/topsites.
[8] “OAuth,” http://oauth.net/.
[9] “window dom object,” https://developer.mozilla.org/en-US/docs/Web/API/Window.
[10] “Content scripts in chrome extension,” https://developer.chrome.com/extensions/content_scripts.
[11] “Content scripts in firefox extension,” https://developer.mozilla.org/en-US/Add-ons/SDK/Guides/Content_Scripts.
[12] “Injecting scripts in safari extension,” https://developer.apple.com/library/content/documentation/Tools/Conceptual/SafariExtensionGuide/InjectingScripts/InjectingScripts.html.
[13] “w3c, document object model (dom),” https://www.w3.org/DOM/.
[14] S. Son and V. Shmatikov, “The postman always rings twice: Attacking and defending postmessage in HTML5 websites,” in 20th Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 24-27, 2013.
[15] “Chrome web store,” https://chrome.google.com/webstore/category/extensions.
[16] N. Jagpal, E. Dingle, J. Gravel, P. Mavrommatis, N. Provos, M. A. Rajab, and K. Thomas, “Trends and lessons from three years fighting malicious extensions,” in 24th USENIX Security Symposium, Washington, D.C., USA, August 12-14, 2015, pp. 579–593. [Online]. Available: https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/jagpal
[17] A. Kapravelos, C. Grier, N. Chachra, C. Kruegel, G. Vigna, and V. Paxson, “Hulk: Eliciting malicious behavior in browser extensions,” in 23rd USENIX Security Symposium, San Diego, CA, USA, August 20-22, 2014, pp. 641–654. [Online]. Available: https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/kapravelos
[18] L. Bauer, S. Cai, L. Jia, T. Passaro, M. Stroucken, and Y. Tian, “Run-time monitoring and formal analysis of information flows in chromium,” in 22nd Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 8-11, 2015. [Online]. Available: http://www.internetsociety.org/doc/run-time-monitoring-and-formal-analysis-information-flows-chromium
[19] “Wkwebview,” https://developer.apple.com/documentation/webkit/wkwebview.
[20] “Message passing in chrome extension,” https://developer.chrome.com/extensions/messaging.
[21] “Chrome extension manifest file,” https://developer.chrome.com/extensions/manifest.
[22] “Tf-Idf: term frequency and inverse document frequency,” https://en.wikipedia.org/wiki/Tf-idf.
[23] “A python module for machine learning built on scipy and distributed under the 3-clause bsd license,” http://scikit-learn.org.
[24] C. Guan, K. Sun, Z. Wang, and W. T. Zhu, “Privacy breach by exploiting postmessage in HTML5: identification, evaluation, and countermeasure,” in Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security (AsiaCCS), Xi’an, China, May 30 - June 3, 2016, pp. 629–640. [Online]. Available: http://doi.acm.org/10.1145/2897845.2897901
[25] “Javascript object notation,” http://www.w3schools.com/json/.
[26] S. Hanna, R. Shin, D. Akhawe, A. Boehm, P. Saxena, and D. Song, “The emperor’s new apis: On the (in)secure usage of new client-side primitives,” in Proceedings of the Web, vol. 2, 2010.
[27] “Oauth 2.0,” https://tools.ietf.org/html/rfc6749.
[28] A. Vapen, N. Carlsson, A. Mahanti, and N. Shahmehri, “Third-party identity management usage on the web,” in Proceedings of the 15th International Conference Passive and Active Measurement (PAM), Los Angeles, CA, USA, March 10-11, 2014, pp. 151–162. [Online]. Available: http://dx.doi.org/10.1007/978-3-319-04918-2_15
[29] “Representational state transfer (rest),” https://en.wikipedia.org/wiki/Representational_state_transfer.
[30] “Facebook JavaScript SDK,” https://developers.facebook.com/docs/javascript.
[31] “Google OAuth2.0,” https://developers.google.com/identity/protocols/OAuth2.
[32] “Sign In with LinkedIn,” https://developer.linkedin.com/docs/signin-with-linkedin.
[33] “Fiddler, a web debugging proxy,” http://www.telerik.com/fiddler.
[34] “Permissions reference - facebook login,” https://developers.facebook.com/docs/facebook-login/permissions.

[35] S. Sun and K. Beznosov, “The devil is in the (implementation) details: an empirical analysis of oauth SSO systems,” in the ACM Conference on Computer and Communications Security (CCS), Raleigh, NC, USA, October 16-18, 2012, pp. 378–390. [Online]. Available: http://doi.acm.org/10.1145/2382196.2382238
[36] P. H. Phung, D. Sands, and A. Chudnov, “Lightweight self-protecting javascript,” in Proceedings of the ACM Symposium on Information, Computer and Communications Security (ASIACCS), Sydney, Australia, March 10-12, 2009, pp. 47–60.
[37] “Javascript built-in objects,” https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects.
[38] L. Xing, Y. Chen, X. Wang, and S. Chen, “Integuard: Toward automatic protection of third-party web service integrations,” in 20th Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 24-27, 2013.
[39] J. Magazinius, P. H. Phung, and D. Sands, “Safe wrappers and sane policies for self protecting javascript,” in Information Security Technology for Applications - 15th Nordic Conference on Secure IT Systems, NordSec, Espoo, Finland, Revised Selected Papers, October 27-29, 2010, pp. 239–255.
[40] L. A. Meyerovich, A. P. Felt, and M. S. Miller, “Object views: fine-grained sharing in browsers,” in Proceedings of the 19th International Conference on World Wide Web (WWW), Raleigh, North Carolina, USA, April 26-30, 2010, pp. 721–730.
[41] “Google caja,” https://developers.google.com/caja/.
[42] “Facebook fbjs,” https://github.com/facebook/fbjs.
[43] L. Liu, X. Zhang, G. Yan, and S. Chen, “Chrome extensions: Threat analysis and countermeasures,” in Proceedings of the 19th Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 5-8, 2012. [Online]. Available: http://www.internetsociety.org/chrome-extensions-threat-analysis-and-countermeasures
[44] N. Carlini, A. P. Felt, and D. Wagner, “An evaluation of the google chrome extension security architecture,” in Proceedings of the 21st USENIX Security Symposium, Bellevue, WA, USA, August 8-10, 2012, pp. 97–111. [Online]. Available: https://www.usenix.org/conference/usenixsecurity12/technical-sessions/presentation/carlini
[45] D. Jang, R. Jhala, S. Lerner, and H. Shacham, “An empirical study of privacy-violating information flows in javascript web applications,” in Proceedings of the 17th ACM Conference on Computer and Communications Security (CCS), Chicago, Illinois, USA, October 4-8, 2010, pp. 270–283. [Online]. Available: http://doi.acm.org/10.1145/1866307.1866339
[46] A. Gervais, R. Shokri, A. Singla, S. Capkun, and V. Lenders, “Quantifying web-search privacy,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), Scottsdale, AZ, USA, November 3-7, 2014, pp. 966–977. [Online]. Available: http://doi.acm.org/10.1145/2660267.2660367
[47] A. Acquisti, L. K. John, and G. Loewenstein, “What is privacy worth?” The Journal of Legal Studies, vol. 42, no. 2, pp. 249–274, 2013.
[48] L. Olejnik, M. Tran, and C. Castelluccia, “Selling off user privacy at auction,” in 21st Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 23-26, 2014. [Online]. Available: http://www.internetsociety.org/doc/selling-privacy-auction
[49] R. Wang, L. Xing, X. Wang, and S. Chen, “Unauthorized origin crossing on mobile platforms: threats and mitigation,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), Berlin, Germany, November 4-8, 2013, pp. 635–646.
[50] M. Georgiev, S. Jana, and V. Shmatikov, “Breaking and fixing origin-based access control in hybrid web/mobile application frameworks,” in 21st Annual Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 23-26, 2014.
[51] “Hybrid frameworks,” https://developer.jboss.org/wiki/GetStartedWithHybridApplicationFrameworks?sscc=t.
[52] X. Jin, X. Hu, K. Ying, W. Du, H. Yin, and G. N. Peri, “Code injection attacks on html5-based mobile apps: Characterization, detection and mitigation,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), Scottsdale, AZ, USA, November 3-7, 2014, pp. 66–77. [Online]. Available: http://doi.acm.org/10.1145/2660267.2660275
[53] S. Pai, Y. Sharma, S. Kumar, R. M. Pai, and S. Singh, “Formal verification of oauth 2.0 using alloy framework,” in International Conference on Communication Systems and Network Technologies (CSNT). IEEE, 2011, pp. 655–659.
[54] S. Chari, C. S. Jutla, and A. Roy, “Universally composable security analysis of oauth v2.0,” IACR Cryptology ePrint Archive, vol. 2011, p. 526, 2011.
[55] C. Bansal, K. Bhargavan, A. Delignat-Lavaud, and S. Maffeis, “Discovering concrete attacks on website authorization by formal analysis,” Journal of Computer Security, vol. 22, no. 4, pp. 601–657, 2014.
[56] R. Wang, S. Chen, and X. Wang, “Signing me onto your accounts through facebook and google: A traffic-guided security study of commercially deployed single-sign-on web services,” in Proceedings of IEEE Symposium on Security and Privacy (S&P). IEEE, 2012, pp. 365–379.
[57] Y. Zhou and D. Evans, “Ssoscan: Automated testing of web applications for single sign-on vulnerabilities,” in Proceedings of the 23rd USENIX Security Symposium, San Diego, CA, USA, August 20-22, 2014, pp. 495–510. [Online]. Available: https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/zhou


Chong Guan received the B.Sc. degree from the School of Information Science and Technology, University of Science and Technology of China, in 2011. He is currently pursuing the Ph.D. degree with the Institute of Information Engineering, University of Chinese Academy of Sciences, Beijing, China. He was a Visiting Scholar with the Center for Secure Information Systems, George Mason University, VA, USA, from 2016 to 2017, where he was advised by Dr. K. Sun.


Kun Sun is an Associate Professor in the Department of Information Sciences and Technology at George Mason University. He is also the director of Sun Security Laboratory. Kun Sun received his Ph.D. from the Department of Computer Science at North Carolina State University in 2016. His research focuses on systems and network security. The main thrusts of his research include moving target defense, cyber deception, trusted computing, Linux container security, forensic analysis of virtual machines, security of the Internet of Things (IoT), side channel attacks, system resource isolation and access control, and password management. He has published over 70 technical papers in security conferences and journals including IEEE S&P, ACM CCS, NDSS, IEEE DSN, ESORICS, ACSAC, IEEE TDSC, and IEEE TIFS, and two of his papers won the Best Paper Award.


Lingguang Lei received the Ph.D. degree in computer science from the University of Chinese Academy of Sciences (UCAS) in 2013. She joined the Institute of Information Engineering, CAS, as an Assistant Professor in July 2013. She worked as a visiting scholar at the College of William and Mary from October 25, 2015 to October 24, 2016, and at George Mason University from October 25, 2016 to August 31, 2017. Her research interest is in the field of information security, including mobile system security, container security, network security, and information system security evaluation.


Pingjian Wang received the Ph.D. degree from Graduate University, Chinese Academy of Sciences (CAS), in 2011. From 2011 to 2013, he worked as a postdoctoral fellow at the University of Chinese Academy of Sciences (UCAS). In 2014, he joined the Institute of Information Engineering, CAS. He is currently a Senior Engineer of the Institute of Information Engineering at CAS. He has authored 8 refereed technical papers. His research interest is in the field of information security, including mobile system security, Internet of Things security, and privacy protection.


Yuewu Wang received the Ph.D. degree from Graduate University, Chinese Academy of Sciences (CAS), in 2008. From 2008 to 2011, he was an Assistant Professor with Data Assurance and Communication Security, CAS. In 2012, he joined the Institute of Information Engineering, CAS, along with Data Assurance and Communication Security. He is currently a Professor of the Institute of Information Engineering at CAS. He has authored 20 refereed technical papers. His research interest is in the field of information security, including mobile system security, network attack simulation, and information system security evaluation.


Wei Chen received the Ph.D. degree from Wuhan University in 2005. He is currently a Professor at the Nanjing University of Posts and Telecommunications. His research interest is in the field of information security, including Web security, IoT security, and mobile system security.
