Chapter 42

Privacy on the Internet

Marco Cremonini, Università degli Studi di Milano

Chiara Braghin, Università degli Studi di Milano

Claudio Agostino Ardagna, Università degli Studi di Milano

1. PRIVACY IN THE DIGITAL SOCIETY

Privacy in today's digital society is one of the most debated and controversial topics. Many different opinions about what privacy actually is and how it can be preserved have been expressed, yet no clear-cut border has been drawn that must not be crossed if privacy is to be safeguarded.

The Origins, the Debate

As often happens when a debate heats up, the extremes speak loudest; regarding privacy, the extremes are those who advocate banning the disclosure of any personal information whatsoever and those who claim that all personal information is already out there, and that privacy is therefore dead. Supporters of the wide deployment and use of anonymizing technologies are perhaps the best representatives of the first extreme. However, these are just the extremes; in reality, privacy in the digital society is a fluid concept that such radical positions cannot fully contain. It is a fact that even those supporting full anonymity recognize that there are several limitations, either technical or functional, to its adoption. On the other side, even the most skeptical cannot avoid dealing with privacy issues, whether because of laws and norms or because of common sense. Sun Microsystems, for example, actively supports privacy protection and is a member of the Online Privacy Alliance, an industry coalition that fosters the protection of individuals' privacy online. Looking at the origins of the concept of privacy, Aristotle's distinction between the public sphere of politics and the private sphere of the family is often considered the root.

Much later, the philosophical and anthropological debate around these two spheres of an individual's life evolved. John Stuart Mill, in his essay On Liberty, introduced the distinction between the realm of governmental authority and the realm of self-regulation. Anthropologists such as Margaret Mead have demonstrated how the need for privacy is innate in different cultures, which protect it through concealment or seclusion or by restricting access to secret ceremonies. More pragmatically, back in 1890, the concept of privacy was expressed by Samuel Warren and Louis Brandeis (later a U.S. Supreme Court Justice), who defined privacy as "the right to be let alone."[1] This straightforward definition represented for decades the reference for any normative and operational consideration of privacy and its derivative issues and, before the advent of the digital society, a realistically enforceable ultimate goal. The Internet has changed the landscape, because the very concept of being let alone while interconnected becomes fuzzy and fluid. In 1948, privacy gained the status of a fundamental right of any individual, being explicitly mentioned in the United Nations Universal Declaration of Human Rights (Article 12): "No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honor and reputation. Everyone has the right to the protection of the law against such interference or attacks."[2] However, although privacy has been recognized as a fundamental right of each individual, the Universal Declaration of Human Rights does not explicitly define what privacy is, except for relating it to possible interference or attacks. Regarding the digital society, less rigorously but effectively in practical terms, in July 1993 The New Yorker published a brilliant cartoon by Peter Steiner that has since been cited and reproduced dozens of times to refer to the supposed intrinsic level of privacy (here in the sense of anonymity or the hiding of personal traits) that can be achieved by carrying out social relations over the Internet.


That famous cartoon shows a dog typing on a computer keyboard and saying to another: "On the Internet, nobody knows you're a dog."[3] The Internet, at least at the very beginning of its history, was not perceived as threatening individuals' privacy; rather, it was seen as increasing it, sometimes too much, since it could easily let people disguise themselves in the course of personal relationships. Today that belief may look naïve, given the rise of threats to individual privacy that have accompanied the diffusion of the digital society. Nevertheless, there is still truth in that cartoon because, whereas privacy is much weaker on the Internet than in real space, concealing a person's identity and personal traits is technically even easier. Both aspects concur and should be considered. A commonly used definition of privacy is by Alan Westin: "Privacy is the claim of individuals, groups and institutions to determine for themselves, when, how and to what extent information about them is communicated to others."[4] The definition is rather generic, since it does not specify what kind of personal information is to be considered. In the digital society, in particular in the context of network communication, the information can be the identity of the sender, the identity of the receiver, the identities of both, the fact that a communication is ongoing, the host/cell/location of the participants taking part in the communication, and so on. In order to address these different aspects of privacy, some terms are often used in place of privacy. In [5,6] the differences among anonymity, unobservability, and unlinkability are pointed out. In the digital society scenario, anonymity is defined as the state of not being identifiable, unobservability as the state of being indistinguishable, and unlinkability as the impossibility of correlating two or more actions, items, or pieces of information. Privacy, however defined and valued, is a tangible state of life that must be attainable in both the physical and the digital society. The reason that privacy behaves differently in the two realms, the physical and the digital, has been widely debated too, and many of the critical factors that make a difference between the two realms, such as the impact of technology and the Internet, have been spelled out clearly. However, given the threats and safeguards that technologies make possible, it often remains unclear what the goal of preserving privacy over the Internet should be, given that extreme positions are deemed unacceptable. Lessig, in his book Free Culture,[7] provided an excellent explanation of the difference between privacy in the physical and in the digital world:


"The highly inefficient architecture of real space means we all enjoy a fairly robust amount of privacy. That privacy is guaranteed to us by friction. Not by law [. . .] and in many places, not by norms [. . .] but instead, by the costs that friction imposes on anyone who would want to spy. [. . .] Enter the Internet, where the cost of tracking browsing in particular has become quite tiny. [. . .] The friction has disappeared, and hence any 'privacy' protected by the friction disappears, too." Thus, privacy can be seen as the friction that reduces the spread of personal information and makes it more difficult and economically inconvenient to gain access to it. The merit of this definition is to put privacy into a relative perspective, one that excludes the extremes advocating either no friction at all or so much friction as to stop the flow of information. It also reconciles privacy with security, since both aim at setting an acceptable level of protection while allowing the development of the digital society and economy, rather than at an ideal state of perfect security and privacy.

Privacy Threats

Threats to individual privacy have drawn broad public attention since July 2003, when the California Security Breach Notification Law [8] went into effect. This law was the first to force state government agencies, companies, and nonprofit organizations that conduct business in California to notify California customers if personally identifiable information (PII) stored unencrypted in digital archives was, or is reasonably believed to have been, acquired by an unauthorized person. The premise for this law was the rise of identity theft, the conventional expression used to refer to the illicit impersonation carried out by fraudsters who use other people's PII to complete electronic transactions and purchases. The California Security Breach Notification Law lists as PII: Social Security number, driver's license number, California Identification Card number, bank account number, credit- or debit-card number, and security codes, access codes, or passwords that would permit access to an individual's financial account (see checklist, "An Agenda for Action for Protecting One's Identity"). By requiring immediate notification to the PII owners, the law aims to avoid direct consequences such as financial losses and derivative consequences such as the burden of restoring an individual's own credit history. Starting on January 1, 2008, the data security breach notification law in California also applies to medical information and health insurance data. Besides the benefits to consumers, this law has been the trigger for similar laws in the United States (today, the majority of U.S. states have one) and has permitted the flourishing of regular statistics about privacy breaches, once almost absent. Privacy threats and analyses are now widely debated, and research focused on privacy problems has become one of the most important areas.


An Agenda for Action for Protecting One's Identity

You'll want to protect the privacy of your personal information while you're online. Here's a checklist of some of the most important things you can do to protect your identity and prevent others from easily getting your personal information (Check All Tasks Completed):

_____1. Check a site's privacy policy before you enter any personal information, and know how it will be used.
_____2. Make sure you have a secure Internet connection, by checking for the unbroken key or closed lock icon in your browser, before you enter any personal information onto a webpage.
_____3. Only give a credit card number when buying something.
_____4. Register your credit cards with your card provider's online security services, such as Verified by Visa and MasterCard SecureCode.
_____5. Use just one credit card for online purchases; if possible, use an account with a low spending limit or small available balance.
_____6. Don't use a debit card for your online purchases. Credit cards are better because bank-provided security guarantees apply to credit cards, so an unauthorized charge is limited to $50.
_____7. Don't select the "remember my password" option when registering online.
_____8. Change your passwords every 60 to 90 days, and don't use personal information as your password; instead use a string of at least five letters, numbers, and punctuation marks.
_____9. Don't store your passwords near your computer or in your purse or wallet.
_____10. Don't give more information than a site requires.
_____11. Keep your anti-virus software up to date to reduce the risk of malicious code running on your PC.
_____12. Don't go online unless you have a personal firewall enabled, adding a layer of protection to your PC by stopping unknown connections to it.
_____13. Don't reply directly to e-mail messages asking for personal information.
_____14. Type web addresses directly into your web browser instead of clicking on e-mail links.
_____15. Get anti-virus and anti-spam filtering software and keep it up to date by using its automatic update feature, if your service provider or employer doesn't provide it for you.
_____16. Check out online retailers' ratings at BizRate and the Better Business Bureau before buying.

The DataLossDB, maintained by the Open Security Foundation [9], publishes one of the most complete references for privacy breaches and data losses, recording incidents involving data losses from 2003 to date. Looking at the largest incidents, the magnitude of some breaches is astonishing: in 2009, about 130 million records were lost by Heartland Payment Systems, USA; in 2007, about 94 million records were hacked at TJX stores in the United States; in 2011, the target was Sony Corp. and its 77 million customer records. Many other incidents in the tens-of-millions range are recorded and have made headlines in the press, involving all sorts of confidential information managed by very different kinds of organizations. Most of the incidents have been the consequence of hacking from outside the corporate network, but with notable exceptions. In 2004, an employee of America Online Inc. stole 92 million email addresses and sold them to spammers; in 2006, a computer containing about 26 million personal records of the U.S. Department of Veterans Affairs was stolen; and in 2007, two CDs were lost containing the entire HM Revenue and Customs (UK) child benefit database (about 25 million records) and 7 million banking details. Similarly, lost tape backups or other storage media containing millions of personal records were the cause of severe data loss incidents in 2008 at T-Mobile (a Deutsche Telekom company), at LaSalle Bank, USA, and at GS Caltex in South Korea.

It is interesting to note that criminals going after huge archives of personal data are not a phenomenon that appeared with the advent of the Internet and modern interconnected digital networks. In 1984, hackers accessed a credit-reporting database at TRW Inc., likely managed on mainframe systems, containing 90 million records, and in 1986, documents containing about 16 million vital records of Canadian taxpayers were stolen from Toronto's District Taxation Center. While these incidents are the most notable, the phenomenon is distributed over the whole spectrum of breach sizes. Hundreds of privacy breaches on the order of a few thousand records lost are reported, and all categories of organizations are affected: public agencies, universities, banks and financial institutions, manufacturing and retail companies, and so on. To this end, it is interesting to quote the authors of the 2011 Data Breach Investigations Report by the Verizon RISK team [10]: "2010 exhibited a much more even distribution. The main factor in this shift is the lack of "mega-breaches" in our combined caseload. Many incidents involving the compromise of multi-millions of records (or more) in the last few years occurred in financial institutions. Without one or two of these skewing the results, things naturally balance out a bit more. Another factor to consider is that criminals seemed to gain interest in stealing data other than payment cards. Account takeovers, theft of IP and other sensitive data, stolen authentication credentials, botnet activity, etc. (which are typically less mega-breachable) affected firms at increased rates in 2010".


This is a precious warning not to focus excessively on the "mega-breaches" that get the headlines as the sole indicator of the status of privacy on the Internet. Even in 2010, when huge breaches did not happen (2011 was different, as we illustrated), privacy threats and incidents soared in number, though not in the number of records stolen. Therefore, the threats are real and still very much alive, even for small-to-medium firms and organizations. Again from the DataLossDB, we have an overview of the incidence of data breaches by breach type, business type, and vector. With respect to breach type, the main ones are: 19% due to hacking; 15% due to stolen laptops; 11% due to malicious Web services; and 11% due to fraud. A plethora of other causes, from the disposal of documents, media, and computers to lost or missing storage media, malware, and email, are responsible for almost 40% of all breaches. With respect to business type, incidents are distributed as follows: 49% affect businesses; 19% governments; 16% medical institutions; and 16% education. Finally, the vectors mainly exploited to conduct privacy and data breaches are: 55% of incidents originate from outside an organization; 39% from inside; and 6% unknown. It is interesting to note that, for incidents originating from inside an organization, the majority are accidental rather than intentional. This fact points out the relevant role of mistakes, disorganization, mismanagement, and all other accidental causes that may pose severe threats to data privacy.

2. THE ECONOMICS OF PRIVACY

The existence of strong economic factors that influence the way privacy is managed, breached, or even traded off has long been recognized [11,12]. However, it was with the expansion of the online economy, in the 1990s and 2000s, that privacy and the economy became more and more entangled. Many studies have investigated, from different perspectives and approaches, the relation between the two. A comprehensive survey of works that analyze the economic aspects of privacy can be found in [13]. Two issues among the many have gained most of the attention: assessing the value of privacy, and examining to what extent privacy and business can coexist or are inevitably in conflict with each other. For both issues the debate is still open, and no ultimate conclusion has been reached yet.


Privacy and Business

The relationship between privacy and business has been examined from several angles, considering which incentives could be effective for integrating privacy with business processes and, conversely, which disincentives make business motivations prevail over privacy. Froomkin [14] analyzed what he called "privacy-destroying technologies" developed by governments and businesses. Examples of such technologies are collections of transactional data, automated surveillance in public places, biometric technologies, and the tracking of mobile devices and positioning systems. To further aggravate the impact on privacy of each of these technologies, their combination and integration result in a cumulative and reinforcing effect. On this premise, Froomkin introduces the role that legal responses may play in limiting this apparently unavoidable "death of privacy." Odlyzko [15,16] is a leading author who holds a pessimistic view of the future of privacy, calling the problem of guaranteeing privacy "unsolvable" because of price discrimination pressures on the market. His argument is based on the observation that markets as a whole, and especially Internet-based markets, have strong incentives to price discriminate, that is, to charge varying prices when there are no cost justifications for the differences. This practice, which has its roots long before the advent of the Internet and the modern economy (one of the most illustrative examples is 19th-century railroad pricing), provides relevant economic benefits to vendors and, from a purely economic viewpoint, to the efficiency of the economy. In general, charging different prices to different segments of the customer base permits vendors to complete transactions that would not take place otherwise. On the other hand, the public has often opposed overt price discrimination practices, perceiving them as unfair. For this reason, many less evident price discrimination practices are in place today, among which bundling is one of the most recurrent. The privacy of actual and prospective customers is threatened by such economic pressures toward price discrimination because the more the customer base can be segmented, and thus known in the greatest detail, the better the efficiency achieved by vendors. The Internet-based market has provided a new boost to such practices and to the acquisition of personal information and knowledge of customer habits. More pragmatically, some analyses have pointed out that, given current privacy concerns, an explicitly fair management of customers' privacy may become a positive competitive factor [17]. Similarly, Hui et al. [18] have identified seven types of benefits that Internet businesses can provide to consumers in exchange for their personal information.


Privacy and the Web

With respect to the often difficult relation between business goals and privacy requirements, special attention should be given to Web-based business and to the many tools and mechanisms that have been developed to improve knowledge about customers, their habits, and their preferences, in order to offer them purchase suggestions and personalized services. The drawback of such business goals is the spread of profiling and the inevitable erosion of customers' and users' privacy. Despite the hype that from time to time surrounds popular search engines like Google or digital social networks like Facebook for their practices of collecting huge amounts of information about users accessing their services, detailed and comprehensive analyses of data collection and management activity on the Internet are still few and often focused on one specific, though relevant, company. One of the most informative analyses was published by Krishnamurthy and Wills at the World Wide Web Conference of 2009 [19]. In their work, they reported the results of a study spanning several years and focused on the technical ways by which third-party aggregators acquire data and on the depth of the user-related information acquired. The choice of third-party aggregators is useful because, whereas the websites and services collecting user data are countless, third-party aggregators are few. The most common technique adopted to track users is the cookie, which, in short, is a piece of text containing information about the user's access to a web service, often encoded in an opaque format, that a web server sends to a client; the client stores it locally through its web browser and sends it back to the server on subsequent connections to the same server. JavaScript is the other main technique for tracking users: being code executed on the client side, scripts have access to information stored by the browser. The two mechanisms allow first parties (in the case of web-based services like ecommerce sites) and third-party aggregators to record much behavioral information which, together with physical information like IP addresses, makes it possible to identify users as they surf through the catalogue of an ecommerce site, to correlate accesses to different sites and, in general, to provide web analytics information for traffic measurement, user profiling, and geolocation identification. From the cited study, one notable result is that the top-10 third-party aggregators serve up to 80% of the first-party servers analyzed. Interestingly, the authors observe that, owing to rising concerns about privacy, modern web browsers now provide means to block third-party JavaScript and cookies; but, as expected, the arms race between user profiling and privacy never settles down, and tracking techniques are evolving to bypass those new limitations. In particular, since first-party scripts and cookies are still usually allowed, being the key to an improved user experience and a variety of functionality, third-party aggregators are more and more "hiding" their content in first-party servers. Data examined by the study confirm this, exhibiting a striking growth of first-party cookies, JavaScript, and hidden third-party objects in the timeframe considered. Therefore, it is safe to conclude that, despite new tools and features aimed at privacy protection, blocking third-party content used for tracking users is extremely difficult without severely impairing the user experience.
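As a minimal illustration of the cookie mechanism just described, the following sketch assigns an opaque identifier on a browser's first visit and recognizes it on every subsequent request, using only Python's standard library. The cookie name, port, and response format are illustrative assumptions of ours, not details taken from the study.

```python
# Minimal sketch of cookie-based tracking; the cookie name "visitor_id"
# and the port are illustrative assumptions, not from the cited study.
import uuid
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, HTTPServer

class TrackingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        cookies = SimpleCookie(self.headers.get("Cookie", ""))
        if "visitor_id" in cookies:
            # Returning visitor: the opaque identifier links this request
            # to every previous one made by the same browser.
            visitor = cookies["visitor_id"].value
        else:
            # First visit: mint an opaque identifier and ask the browser
            # to store it and send it back on all future requests.
            visitor = uuid.uuid4().hex
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Set-Cookie", f"visitor_id={visitor}; Path=/")
        self.end_headers()
        self.wfile.write(f"hello, visitor {visitor}\n".encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), TrackingHandler).serve_forever()
```

A third-party aggregator achieves the same effect when its content is embedded in many first-party pages: the same identifier is then observed across all the sites that embed it, which is precisely what allows cross-site correlation.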


Considering the limitations that affect technical means of protecting privacy, it is natural to turn attention to regulations and guidelines as well. One example of guidelines is the "Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)" published by the National Institute of Standards and Technology (NIST) in 2010 [20]. Its aim is "to assist Federal agencies in protecting the confidentiality of personally identifiable information (PII) in information systems". It is inspired by the Fair Information Practices, the principles underlying most privacy laws and privacy best practices. This NIST publication consists of a comprehensive set of guidelines, merging considerations about information security and privacy, mostly focused on preventing fraud or massive data breaches. With regard to profiling and tracking, however, the NIST guidelines seem ineffective. The case of the regulation newly proposed by the European Union [21] is different. Being still a proposal, it may be subject to changes and amendments, and its adoption may be delayed; therefore only preliminary considerations are possible. However, the overall rationale seems clear. First, the regulation is aimed at establishing uniform legislation among all European member states, resolving the current heterogeneity of approaches that leads to different practices and privacy protection levels. Then, it mandates a notification procedure in case of a data breach, similar in spirit to the legislation of many US states. Finally, and most pertinent to the issues discussed in this section, it remarkably reinforces the opt-in approach (meaning that, by default, personal information cannot be acquired and managed by first and third parties unless the user explicitly expresses his or her consent, as opposed to the opt-out approach generally adopted in the USA) and imposes strong management requirements on first and third parties. More particularly, the proposed regulation is driven by the assumption that personal data belong, and remain fully belonging, to data owners even when collected by first and third parties.


As a consequence, the regulation has several prescriptions about the extent of knowledge that data owners should be given by the organizations collecting their data, about the full right to inspect, modify, and delete personal data stored (and even transmitted to others) by those organizations, and even prescriptions about the provisioning of a European-based support service by data collectors. Therefore, the new EU regulation, if approved in its current form, could sensibly interfere with the practices adopted by data aggregators and first parties for tracking and profiling users. However, given its status, it is still too early to draw firm conclusions about its effectiveness and real impact on the arms race between business and privacy requirements.

3. PRIVACY-ENHANCING TECHNOLOGIES

Technical improvements in Web and location technologies have fostered the development of online applications that use the private information of users (including the physical positions of individuals) to offer enhanced services. The increasing amount of available personal data and the decreasing cost of data storage and processing make it technically possible and economically justifiable to gather and analyze large amounts of data. In this context, users are much more concerned about their privacy, which has been recognized as one of the main reasons preventing users from using the Internet to access online services. A number of useful privacy-enhancing technologies (PETs) have been developed for dealing with privacy issues, and previous works on privacy protection have covered a wide variety of topics [22-26]. In this section, we discuss the privacy protection problem in three different contexts. We start by giving an overview of languages and approaches for the specification of access control policies in distributed networks; we then discuss languages and solutions for the definition of privacy-aware access control policies and privacy preferences in distributed networks; finally, we describe the problem of protecting privacy in mobile and pervasive environments, and we give an overview of solutions for preserving the location privacy of users from different perspectives [27].

Access Control Models and Languages

Access control systems have been introduced for regulating and protecting access to resources and data owned by parties. Originally, access control models and languages were defined for centralized and closed environments: users were assumed to be known by the system, and to be identified and authenticated before any authorization process could start. The advent of distributed environments has changed the overall scenario, making the above assumptions untenable and traditional access control inapplicable. In an open scenario, in fact, users are usually not known to the server a priori and, therefore, identification and authentication are not always possible or wanted.


Suitable access control models and languages need to depart from identification and authentication, and to provide a solution that fits the requirements of an open scenario. Recent works have focused on the definition of attribute-based and credential-based access control, where authorization policies specify the set of attributes and credentials that a user must possess, or the conditions over credentials and attributes that a user needs to satisfy, to access a resource. In this context, credential-based languages refer to those solutions that use properties of the users, included in credentials signed by third parties, to evaluate access control policies. In general, credentials are characterized by a type (e.g., Identity Card), a unique identifier, and an issuer (e.g., the US government), and certify a set of properties of the user (FirstName = John, LastName = Doe). Similarly, attribute-based languages refer to those solutions that evaluate policies using attributes that are self-certified by the owner, without a signature by a third party. Several different models and languages have been defined, prescribing access decisions on the basis of properties that the requesting party may have. First, logic-based languages [28-31] were proposed with the main goal of providing highly expressive solutions allowing the definition of policies that adapt well to the dynamics and complexity of distributed environments. However, although highly expressive and powerful, logic languages are difficult to adopt in practice, especially in a distributed scenario where simplicity, interoperability, and ease of use are fundamental requirements [32-34]. In addition to frameworks and solutions supporting logic-based access control policies, different automated trust negotiation approaches [35-37] have been developed to establish trust between unknown parties. A trust negotiation is a bidirectional exchange of requests and responses between two parties with the aim of establishing trust by incremental data disclosure. Trust negotiation research has investigated algorithms and protocols that (i) allow a party to select the credentials to be released to the counterpart and (ii) protect sensitive information in credentials and/or policies during the negotiation process. In general, trust negotiation protocols suffer from low performance and high overhead, and are therefore not suitable for many distributed use cases. The research community has therefore focused on the definition of access control models and languages that support the requirements of distributed environments and address the shortcomings of logic-based languages and trust negotiation approaches. The eXtensible Access Control Markup Language (XACML) [38], the result of a standardization effort by OASIS, provides an attribute-based access control solution that balances expressivity and simplicity, is flexible and extensible, and integrates well with open environments.


XACML proposes an XML-based language to express and interchange access control policies, defines an architecture for the evaluation of policies, and specifies a communication protocol for message interchange. The main features of XACML are: (1) policy combination, a method for combining policies on the same resource independently specified by different entities; (2) combining algorithms, different algorithms representing ways of combining multiple decisions into a single decision; (3) attribute-based restrictions, the definition of policies based on properties associated with subjects and resources rather than on their identities; (4) multiple subjects, the definition of more than one subject relevant to a decision request; (5) policy distribution, whereby policies can be defined by different parties and enforced at different enforcement points; (6) implementation independence, an abstraction layer that isolates the policy writer from the implementation details; and (7) obligations, a method for specifying the actions that must be fulfilled in conjunction with policy enforcement. Focusing on the language, XACML is based on a model providing a formal representation of policies. Each XACML policy has a Policy or PolicySet element as root, which in turn may contain other Policy or PolicySet elements. Each Policy element is composed of a Target that determines the policy's applicability to a request, a set of Rule elements corresponding to positive authorizations (i.e., with attribute effect = permit) or negative authorizations (i.e., with attribute effect = deny), a set of Obligation elements specifying actions to be performed during the enforcement of an access decision, and a rule combining algorithm (attribute RuleCombiningAlgId) that establishes how conflicting decisions taken by different rules are reconciled into a single decision (i.e., deny overrides, permit overrides, first applicable, only one applicable). Today, XACML is considered the de facto standard for attribute-based access control in distributed environments and is adopted in many different scenarios. In addition, although the XACML language is not specifically designed for managing privacy, it has represented a relevant innovation in the field of access control policies and has set the basis for the definition of many authorization languages with enhanced functionalities, including support for anonymity, trust negotiation, credentials, complex conditions (recursion, delegation, chains of trust), and dialog management [39-41].
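To give a concrete flavor of rule evaluation and rule combining, the following minimal sketch models XACML-style attribute-based rules and a deny-overrides combining algorithm. The data structures, attribute names, and example rules are our own simplified illustration, not the actual XACML schema or the API of a real policy engine.

```python
# Simplified sketch of attribute-based rules with a deny-overrides
# combining algorithm, in the spirit of XACML; the structures below are
# illustrative, not the real XACML schema or a real engine's API.
from typing import Callable, Dict, List, Tuple

Request = Dict[str, str]                       # attribute name -> value
Rule = Tuple[Callable[[Request], bool], str]   # (condition, effect)

def deny_overrides(rules: List[Rule], request: Request) -> str:
    """Combine rule decisions: any applicable Deny wins over any Permit."""
    decision = "NotApplicable"
    for condition, effect in rules:
        if condition(request):          # the rule's Target/Condition matches
            if effect == "Deny":
                return "Deny"           # deny overrides everything else
            decision = "Permit"
    return decision

# Example policy: physicians may read records, but nothing labeled
# "restricted" may be read by anyone.
rules: List[Rule] = [
    (lambda r: r.get("role") == "physician" and r.get("action") == "read",
     "Permit"),
    (lambda r: r.get("label") == "restricted" and r.get("action") == "read",
     "Deny"),
]

print(deny_overrides(rules, {"role": "physician", "action": "read",
                             "label": "restricted"}))  # -> Deny
print(deny_overrides(rules, {"role": "physician", "action": "read",
                             "label": "public"}))      # -> Permit
```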

Languages for Privacy-Aware Access Control and Privacy Preferences

The importance gained by privacy requirements has brought with it the definition of access control models enriched with the ability to support privacy requirements. These enhanced access control models encompass two privacy aspects:


guaranteeing the desired level of privacy of information exchanged between different parties by controlling access to services and resources, and controlling all secondary uses of the information disclosed for the purpose of access control enforcement. Users requiring access to a server application then need to protect access to their personal data by specifying and evaluating privacy policies. The most important proposal in this field is the Platform for Privacy Preferences Project (P3P), a World Wide Web Consortium (W3C) project aimed at protecting the privacy of users by addressing their need to assess whether the privacy practices adopted by a service provider comply with their privacy requirements. The goal of P3P is twofold: (i) to allow Web sites to state their data-collection practices in a standardized, machine-readable way, and (ii) to provide users with a means to understand which data will be collected and how those data will be used. To this end, P3P allows Web sites to declare their privacy practices in a standard, machine-readable XML format known as a P3P policy. A P3P policy contains the specification of the data it protects, the data recipients allowed to access the private data, the consequences of data release, the purposes of data collection, the data retention policy, and dispute resolution mechanisms. Supporting privacy preferences and policies in Web-based transactions allows users to automatically understand and match server practices against their privacy preferences. Thus, users do not need to read the privacy policy at every site they interact with, yet they are always aware of the server's practices in data handling. The corresponding language that allows users to specify their preferences as a set of preference rules is called "A P3P Preference Exchange Language" (APPEL) [42]. APPEL can be used by user agents to reach automated or semi-automated decisions regarding the acceptability of privacy policies from P3P-enabled Web sites. Unfortunately, experience with P3P and APPEL has shown that users can explicitly specify only what is unacceptable in a policy, and that the APPEL syntax is cumbersome and error prone for users.
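As an illustration of this preference-matching step, the sketch below checks a site's declared data-collection practices against a user's preferences, expressed APPEL-style as what is unacceptable. The policy fields, purposes, and rule format are simplified assumptions of ours, not the real P3P vocabulary or APPEL syntax.

```python
# Minimal sketch of client-side preference matching in the spirit of
# P3P/APPEL; the vocabulary below is an illustrative simplification,
# not the real P3P data schema or APPEL rule syntax.
from typing import Dict, List, Set, Tuple

# A site's declared practices: purpose of collection -> data items collected.
site_policy: Dict[str, Set[str]] = {
    "current-transaction": {"name", "shipping-address"},
    "marketing": {"email", "purchase-history"},
}

# User preferences stated, APPEL-style, as what is UNacceptable.
blocked: List[Tuple[str, str]] = [
    ("marketing", "email"),              # never release email for marketing
    ("marketing", "purchase-history"),
]

def acceptable(policy: Dict[str, Set[str]],
               blocked: List[Tuple[str, str]]) -> bool:
    """Reject the site if any declared (purpose, data item) pair is blocked."""
    return not any(item in policy.get(purpose, set())
                   for purpose, item in blocked)

print(acceptable(site_policy, blocked))  # -> False: warn the user or abort
```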

Other approaches have focused on the definition of access control frameworks that integrate both policy evaluation and privacy functionalities. A solution that introduced a privacy-aware access control system was defined by Ardagna et al. [43]. This framework allows the integration, evaluation, and enforcement of policies regulating access to services/data and the release of personally identifiable information, and provides a mechanism to define constraints on the secondary use of personal data for the protection of users' privacy. In particular, the following types of privacy policies have been specified:

● Access control policies. They govern access to, and release of, services/data managed by the party (as in traditional access control).


● Release policies. They govern the release of properties/credentials/personally identifiable information (PII) of the party and specify under which conditions this information can be disclosed.
● Data handling policies. They define how personal information will be (or should be) dealt with at the receiving parties.

An important feature of this framework is its support for requests for certified data, issued and signed by trusted authorities, and for uncertified data, signed by the owner itself. It also allows the definition of conditions that can be satisfied by means of zero-knowledge proofs [44] or based on the physical position of the users [45]. Most of the research on security and privacy has focused on the server side of the problem, while symmetric approaches have been used and implemented on the client side to protect the privacy of users (privacy preference definition based on policies). In the last few years, however, some solutions for privacy protection that strictly focus on clients' needs have been defined.

Privacy in Mobile Environments

The concept of location privacy can be defined as the right of individuals to decide how, when, and for which purposes their location information may be released to other parties. The lack of location privacy protection could be exploited by adversaries to perform various attacks [46]:


● Unsolicited advertising, when the location of a user could be exploited, without her consent, to provide advertisements for products and services available near the user's position.
● Physical attacks or harassment, when the location of a user could allow criminals to carry out physical assaults on specific individuals.
● User profiling and tracking, when the location of a user could be used to infer other sensitive information, such as state of health, personal habits, or professional duties, by correlating visited places or paths.
● Political, religious, or sexual persecution and discrimination, when the location of a user could be used to reduce the freedom of individuals, and mobile technologies are used to identify and persecute opponents.
● Denial of service, when the location of a user could motivate an access denial to services under some circumstances.

A further complicating factor is that location privacy can assume several meanings and introduce different requirements, depending on the scenario in which the users are moving and on the services the users are interacting with.


The following categories of location privacy can then be identified:


● Identity privacy protects the identities of the users associated with or inferable from location information. To this end, protection techniques aim at minimizing the disclosure of data that can let an attacker infer a user's identity. Identity privacy is suitable in application contexts that do not require the identification of users to provide a service.
● Position privacy protects the positions of individual users by perturbing the corresponding information and decreasing the accuracy of location information. Position privacy is suitable for environments where users' identities are required for successful service provisioning. A technique that most solutions exploit, either explicitly or implicitly, consists of reducing the accuracy by scaling a location to a coarser granularity (from meters to hundreds of meters, from a city block to the whole town, and so on), as in the sketch after this list.
● Path privacy protects the privacy of information associated with users' movements, such as the path followed while traveling or walking in an urban area. Several location-based services (personal navigation systems) could be exploited to subvert path privacy or to illicitly track users.
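As a minimal illustration of the scaling technique mentioned under position privacy, the following sketch snaps a point location to the center of a coarser grid cell. The cell sizes and the sample coordinate are illustrative assumptions, not parameters of any specific system.

```python
# Minimal sketch of position privacy by spatial generalization: a point
# location is reported as the center of a coarser grid cell; cell sizes
# and the sample coordinate are illustrative assumptions.
def generalize(lat: float, lon: float, cell_deg: float) -> tuple:
    """Snap a coordinate to the center of its (cell_deg x cell_deg) cell."""
    glat = (lat // cell_deg) * cell_deg + cell_deg / 2
    glon = (lon // cell_deg) * cell_deg + cell_deg / 2
    return (round(glat, 6), round(glon, 6))

exact = (45.464203, 9.189982)        # a point in Milan
print(generalize(*exact, 0.001))     # ~100 m cell: approx (45.4645, 9.1895)
print(generalize(*exact, 0.01))      # ~1 km cell:  approx (45.465, 9.185)
print(generalize(*exact, 0.1))       # ~10 km cell: approx (45.45, 9.15)
```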

Since location privacy definitions and requirements differ depending on the scenario, no single technique is able to address the requirements of all location privacy categories. Therefore, the research community, focusing on providing solutions for the protection of the location privacy of users, has defined techniques that can be divided into three main classes: anonymity-based, obfuscation-based, and policy-based techniques. These classes of techniques partially overlap in scope and could potentially cover requirements coming from one or more of the categories of location privacy. It is easy to see that anonymity-based and obfuscation-based techniques can be considered dual. Anonymity-based techniques have been primarily defined to protect identity privacy and are not suitable for protecting position privacy, whereas obfuscation-based techniques are well suited for position protection and not appropriate for identity protection. Anonymity-based and obfuscation-based techniques could both also be exploited for protecting path privacy. Policy-based techniques are in general suitable for all location privacy categories, although they are often difficult for end users to understand and manage. Among these classes of techniques, current research on location privacy has mainly focused on supporting anonymity and partial identities.


Beresford and Stajano [47] propose a method, called mix zones, which uses an anonymity service based on an infrastructure that delays and reorders messages from subscribers. Within a mix zone (an area where a user cannot be tracked), a user is anonymous, meaning that the identities of all users coexisting in the same zone are mixed and become indiscernible. Other works are based on the concept of k-anonymity [48] applied to location privacy. Bettini et al. [49] design a framework able to evaluate the risk of disseminating sensitive location-based information. Their proposal puts forward the idea that the geolocalized history of the requests submitted by a user can be considered a quasi-identifier that can be used to discover sensitive information about the user. Gruteser and Grunwald [50] develop a middleware architecture and an adaptive algorithm to adjust the resolution of location information, in the spatial or temporal dimensions, to comply with users' anonymity requirements. To this purpose, the authors introduced the concept of spatial cloaking. Spatial cloaking guarantees k-anonymity by enlarging the area where a user is located to an area containing k indistinguishable users. Cornelius et al. [51] describe AnonySense, a privacy-aware architecture that implements complex applications using collaborative and opportunistic sensing by mobile devices. The proposed solution protects the privacy of the involved users by submitting sensing tasks to mobile devices in an anonymous way, and by collecting anonymized (but verified) sensed data. Chow et al. [52] provide a spatial cloaking solution for P2P environments and protocols, based on information sharing, historical locations, and cloaked area adjustment schemes. Anonymity has also been exploited to protect the path privacy of users. Chow and Mokbel [53] provide a survey of the most advanced techniques for privacy protection in continuous location-based services and trajectory data publication. Alternatively, when the user identity is required for location-based service provisioning, obfuscation-based techniques have been deployed. Obfuscation is the process of degrading the quality of location information for privacy reasons. An important issue is to manage the trade-off between individual needs for high-quality information services and for location privacy. Ardagna et al. [54] define an approach based on obfuscation for protecting the location privacy of users against malicious adversaries. The proposed solution is based on a metric called relevance, which models the level of location privacy and balances location privacy with the accuracy needed for the provision of reliable location-based services. The authors introduce different obfuscation-based techniques aimed at preserving location privacy by artificially perturbing location information (modeled as planar, circular areas). These techniques, which can be used alone or in combination, degrade the accuracy of location information by (1) enlarging the radius, (2) reducing the radius, or (3) shifting the center of the measured location.
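The following sketch illustrates the three operators just listed on a location modeled as a circular area. The data structure and parameters are our own illustration under that model, not the actual implementation of Ardagna et al. [54].

```python
# Minimal sketch of the three obfuscation operators (enlarge radius,
# reduce radius, shift center) on a circular location area; all
# parameters are illustrative, not from the cited implementation.
import math
import random
from dataclasses import dataclass

@dataclass
class Area:
    x: float      # center (e.g., meters in a local reference system)
    y: float
    r: float      # radius: the measurement accuracy of the location

def enlarge(a: Area, factor: float) -> Area:
    # Coarser area: the user is somewhere in a bigger circle.
    return Area(a.x, a.y, a.r * factor)

def reduce_radius(a: Area, factor: float) -> Area:
    # Smaller-than-measured area: feigns higher accuracy than real,
    # so the returned area only partially covers the true position.
    return Area(a.x, a.y, a.r / factor)

def shift(a: Area, max_offset: float) -> Area:
    # Move the center by a random offset, keeping the radius.
    angle = random.uniform(0, 2 * math.pi)
    d = random.uniform(0, max_offset)
    return Area(a.x + d * math.cos(angle), a.y + d * math.sin(angle), a.r)

measured = Area(0.0, 0.0, 50.0)           # 50 m measurement accuracy
print(enlarge(measured, 4))               # radius becomes 200 m
print(shift(enlarge(measured, 4), 100))   # operators used in combination
```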


The robustness of the obfuscation techniques is tested against attackers with different knowledge and capabilities, to verify their suitability in a real environment. Hashem and Kulik [55] present a decentralized approach to privacy in wireless ad hoc networks that mixes the concepts of k-anonymity and obfuscation. First of all, users obfuscate their precise locations; then they anonymize their requests by transforming the obfuscated area into a k-anonymous area that contains the obfuscated areas of k-1 other users. After creating the anonymous area, an algorithm selects a query requester with near-uniform randomness, thus ensuring sender anonymity. Finally, policy-based techniques are based on the notion of privacy policies and are suitable for all the categories of location privacy. In particular, privacy policies define restrictions that must be enforced when the location of users is used by or released to external parties. As an example, the IETF Geopriv working group [56] addresses privacy and security issues related to the disclosure of location information over the Internet. Its main goal is to define an environment supporting both location information and policy data.

4. NETWORK ANONYMITY

In recent years, the Internet has become an essential part of our daily activities; thus, interest in security and privacy issues on the Internet has grown exponentially. In particular, in such a distributed environment, privacy should also imply anonymity: a person shopping online may not want her visits to be tracked, the sending of an email should keep the identities of the sender and the recipient hidden from observers, and so on. That is, when surfing the Web, users want to keep secret not only the information they exchange but also the fact that they are exchanging information, and with whom. This problem has to do with traffic analysis, and it requires ad hoc solutions. Traffic analysis is the process of intercepting and examining messages to deduce information from patterns in communication. It can be performed even when messages are encrypted and cannot be decrypted. In general, the greater the number of messages observed, or even intercepted and stored, the more can be inferred from the traffic. The problem cannot be solved just by encrypting the header of a packet or the payload: in the first case, the packet could still be tracked as it moves through the network; the second measure is ineffective as well, since it would still be possible to identify who is talking to whom. In this section, we first describe the onion routing protocol [57-59], one of the better-known approaches that is not application-oriented. We then provide an overview of other techniques for ensuring anonymity and privacy over networks, and we discuss the problem of mail anonymity.


Onion Routing and TOR

Onion routing is intended to provide real-time bidirectional anonymous connections that are resistant to both eavesdropping and traffic analysis in a way that is transparent to applications. That is, if Alice and Bob communicate over a public network by means of onion routing, they are guaranteed that the content of the message remains confidential and that no external observer or internal node is able to infer that they are communicating. Onion routing works beneath the application layer, replacing socket connections with anonymous connections and requiring no change to proxy-aware Internet services or applications. It was originally implemented on Sun Solaris 2.4 in 1997, including proxies for Web browsing (HTTP), remote logins (rlogin), email (SMTP), and file transfer (FTP), and it now runs on most common operating systems. It consists of a fixed infrastructure of onion routers, where each router has a long-standing socket connection to a set of neighboring ones. Only a few routers, called onion router proxies, know the whole infrastructure topology. In onion routing, instead of making socket connections directly to a responding machine, initiating applications make a socket connection to an onion routing proxy that builds an anonymous connection through several other onion routers to the destination. In this way, the onion routing network allows the connection between the initiator and the responder to remain anonymous. Although the protocol is called onion routing, the routing that occurs during the anonymous connection is at the application layer of the protocol stack, not at the IP layer. However, the underlying IP network determines the route that data actually travel between individual onion routers. Given the onion router infrastructure, the onion routing protocol works in three phases:

● Anonymous connection setup
● Communication through the anonymous connection
● Anonymous connection destruction

During the first phase, the initiator application, instead of connecting directly with the destination machine, opens a socket connection with an onion routing proxy (which may reside on the same machine, on a remote machine, or on a firewall machine). The proxy first establishes a path to the destination in the onion router infrastructure, then sends an onion to the first router on the path. The onion is a layered data structure in which each layer (public-key encrypted) is intended for a particular onion router and contains (1) the identity of the next onion router on the path to be followed by the anonymous connection; (2) the expiration time of the onion; and (3) a key seed to be used to generate the keys that encode the data sent through the anonymous connection in both directions.
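The layering can be sketched in a few lines of code. The toy below uses symmetric Fernet keys in place of the protocol's public-key layers and omits expiration times and key seeds; the router names and layer format are illustrative assumptions, not the actual onion format.

```python
# Toy onion construction and peeling; symmetric Fernet keys stand in
# for the protocol's public-key layers, and expiration times and key
# seeds are omitted. Requires the third-party "cryptography" package.
import json
from cryptography.fernet import Fernet

path = ["R1", "R2", "R3"]                        # route chosen by the proxy
keys = {r: Fernet.generate_key() for r in path}  # one layer key per router

def build_onion(path, payload: bytes) -> bytes:
    """Wrap the payload in one layer per router; each layer names the next hop."""
    next_hops = path[1:] + ["DEST"]
    onion = payload
    for router, next_hop in reversed(list(zip(path, next_hops))):
        layer = {"next": next_hop,
                 "data": Fernet(keys[router]).encrypt(onion).decode()}
        onion = json.dumps(layer).encode()
    return onion

def peel(router: str, onion: bytes):
    """A router removes its own layer, learning only the next hop."""
    layer = json.loads(onion)
    return layer["next"], Fernet(keys[router]).decrypt(layer["data"].encode())

onion = build_onion(path, b"GET /index.html")
hop, onion = peel("R1", onion)   # R1 learns only that R2 is next
hop, onion = peel("R2", onion)   # R2 learns only that R3 is next
hop, onion = peel("R3", onion)   # hop == "DEST"; onion is the plaintext
print(hop, onion)                # DEST b'GET /index.html'
```

Note how each router can decrypt only its own layer, which is exactly why an intermediate node learns nothing beyond its predecessor and successor.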


The onion is sent along the path established by the proxy: an onion router that receives an onion peels off its layer, identifies the next hop, records in a table the key seed, the expiration time, the identifiers of the incoming and outgoing connections, and the keys to be applied, pads the onion, and sends it to the next onion router. Since the innermost layer contains the name of the destination machine, the last router on the path acts as the destination proxy and opens a socket connection with the destination machine. Note that only the intended onion router is able to peel off the layer addressed to it. In this way, each intermediate onion router knows (and can communicate with) only the previous and the next-hop routers. Moreover, it is not capable of understanding the content of the inner layers of the onion. The router, like any external observer, cannot know a priori the length of the path, since the onion size is kept constant: each intermediate router is obliged to add padding to the onion corresponding to the fixed-size layer it removed. Once the anonymous connection is established, data can be sent in both directions. The onion proxy receives data from the initiator application, breaks it into fixed-size packets, and adds a layer of encryption for each onion router on the path, using the keys specified in the onion. As data packets travel through the anonymous connection, each intermediate onion router removes one layer of encryption. The last router on the path sends the plaintext to the destination through the socket connection that was opened during the setup phase. This encryption layering occurs in the reverse order when data is sent backward from the destination machine to the initiator application. In this case, the initiator proxy, which knows both the keys and the path, decrypts each layer and sends the plaintext to the application over its socket connection with it. As with the onion, data passed along the anonymous connection appears different to each intermediate router and external observer, so it cannot be tracked; moreover, compromised onion routers cannot cooperate to correlate the data streams they see. When the initiator application decides to close the socket connection with the proxy, the proxy sends a destroy message along the anonymous connection, and each router removes the entry of its table relative to that connection. The onion routing protocol has several advantages. First, the most trusted element of the onion routing infrastructure is the initiator proxy, which knows the network topology and decides the path used by the anonymous connection. If the proxy is moved onto the initiator machine, the trusted part is under the full control of the initiator. Second, the total cryptographic overhead is the same as for link encryption but, whereas in link encryption one corrupted router is enough to disclose all the data, in onion routing routers cannot cooperate to correlate the little they know and disclose the information.


Third, since an onion has an expiration time, replay attacks are not possible. Finally, if anonymity is also desired, then all identifying information must additionally be removed from the data stream before being sent over the anonymous connection. However, onion routing is not completely invulnerable to traffic analysis attacks: if a huge number of messages between routers were recorded and usage patterns analyzed, it would be possible to make a close guess about the routing, that is, about the initiator and the responder as well. Moreover, the topology of the onion router infrastructure must be static and known a priori by at least one onion router proxy, which makes the protocol poorly adaptive to node or router failures. Tor [60] is the second-generation onion routing. It addresses some of the limitations highlighted earlier, providing a reasonable tradeoff among anonymity, usability, and efficiency. In particular, it provides perfect forward secrecy, and it does not require a proxy for each supported application protocol. Tor is also an effective circumvention tool, that is, a tool to bypass Internet filtering in order to access content blocked by governments, workplaces, or schools. All circumvention tools use the same core method to bypass network filtering: they proxy connections through third-party sites that are not themselves filtered. A report on the usage of these tools is given in [61]. One of its most interesting results is the small usage of circumvention tools relative to the number of filtering countries, which the authors explain by the fact that users are probably not aware that such tools exist, or are not able to find them.

Network Anonymity Services

Some other approaches offer possibilities for providing anonymity and privacy, but they are still vulnerable to some types of attacks. For instance, many of these approaches are designed for World Wide Web access only; being protocol-specific, they may require further development to be used with other applications or Internet services, depending on the communication protocols used in those systems. David Chaum [62] introduced the idea of mix networks in 1981 (subsequently extended, as in [63]) to enable unobservable communication between users of the Internet: it provides both sender and receiver anonymity. Mixes are intermediate nodes that may reorder, delay, and pad incoming messages to complicate traffic analysis. A mix node stores a certain number of the incoming messages it receives and sends them to the next mix node in a random order. Thus, messages are modified and reordered in such a way that it is nearly impossible to correlate an incoming message with an outgoing one. Messages are sent through a series of mix nodes and encrypted with mix keys.

749

nodes and encrypted with mix keys. If participants exclusively use mixes for sending messages to each other, their communication relations will be unobservable, even if the attacker records all network connections. Also, without additional information, the receiver does not have any clue about the identity of the message’s sender. As in onion routing, each mix node knows only the previous and next node in a received message’s route. Hence, unless the route only goes through a single node, compromising a mix node does not enable an attacker to violate either the sender nor the recipient privacy. Mix networks are not really efficient, since a mix needs to receive a large group of messages before forwarding them, thus delaying network traffic. However, onion routing has many analogies with this approach and an onion router can be seen as a real-time Chaum mix. Reiter and Rubin [64] proposed an alternative to mixes, called crowds, a system to make only browsing anonymous: it aims at hiding from Web servers and thirdparties information about either the user or the information she retrieves. This is obtained by preventing a Web server from learning any information linked to the user, such as the IP address or domain name, the page that referred the user to its site, or the user’s computing platform. The approach is based on the idea of “blending into a crowd,” that is, hiding one’s actions within the actions of many others. Before making any request, a user joins a crowd of other users. Then, when the user submits a request, it is forwarded to the final destination with probability p and to some other member of the crowd with probability 1-p. When the request is eventually submitted, the end server cannot identify its true initiator. Even crowd members cannot identify the initiator of the request, since the initiator is indistinguishable from a member of the crowd that simply passed on a request from another.
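As a rough illustration of mix batching, the following toy node collects messages and flushes them in shuffled order. It is a sketch only: real mixes also decrypt one encryption layer, pad messages, and apply delaying strategies, all omitted here, and the class and names are invented for this example.

```python
# Toy mix node: batch incoming messages, shuffle, then flush.
# Real mixes also peel an encryption layer and pad messages.
import random

class MixNode:
    def __init__(self, batch_size: int = 4):
        self.batch_size = batch_size
        self.pool = []

    def receive(self, message):
        """Buffer a message; emit the whole batch once the pool is full."""
        self.pool.append(message)
        if len(self.pool) < self.batch_size:
            return []  # hold traffic until the batch fills up
        batch, self.pool = self.pool, []
        random.shuffle(batch)  # break input/output ordering correlation
        return batch
```

Similarly, the Crowds coin flip can be sketched in a few lines, assuming a fixed submission probability p; the member names and function are hypothetical, and real Crowds members (jondos) relay encrypted traffic and keep the path open for replies.

```python
# Toy Crowds forwarding: at each member, submit with probability p,
# otherwise relay to another randomly chosen crowd member.
import random

P_SUBMIT = 0.5  # probability p of forwarding to the final destination

def route_request(crowd):
    """Return the relay path; only the last member contacts the server."""
    path = [random.choice(crowd)]       # the initiator picks a first hop
    while random.random() >= P_SUBMIT:  # with probability 1-p, keep relaying
        path.append(random.choice(crowd))
    return path                         # path[-1] submits to the server

print(route_request(["alice", "bob", "carol", "dave"]))
```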

Anonymous Remailers

An anonymous remailer is a system that provides sender anonymity for electronic mail. The basic idea is that a mail server receives messages and then forwards them without revealing where they originally came from. There are different types of anonymous remailer servers. A type-0 remailer (or pseudonymous remailer) removes the sender's address, assigns a random pseudonym to the sender, and sends the message to the intended recipient; in this way, the recipient may send a message back to the sender. The server keeps a table matching the pseudonyms to the senders' real email addresses. Such a remailer is vulnerable to traffic analysis; moreover, if the server is compromised and an attacker obtains the matching table, all the senders are revealed, as the sketch below makes evident.
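The pseudonym table is the scheme's single point of failure. The following toy sketch, with hypothetical names and addresses and no actual mail handling, shows why: whoever reads the table can deanonymize every sender.

```python
# Toy type-0 remailer: strip the sender, assign a pseudonym, forward.
# Hypothetical addresses; a real remailer speaks SMTP, not dictionaries.
import secrets

pseudonym_table = {}  # pseudonym -> real address: the critical secret

def remail(sender: str, recipient: str, body: str) -> dict:
    """Replace the sender address with a fresh pseudonym and forward."""
    alias = f"anon{secrets.token_hex(4)}@remailer.example"
    pseudonym_table[alias] = sender  # kept so replies can be routed back
    return {"from": alias, "to": recipient, "body": body}

msg = remail("alice@example.org", "bob@example.net", "hello")
print(msg["from"])      # e.g., anon5f3a9c2e@remailer.example
print(pseudonym_table)  # compromising this table reveals all senders
```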


Type-I remailers (or cypherpunk remailers) were developed to deal with the problems highlighted above. The basic idea is the same: the remailer receives the message, removes the sender's address, and then sends it to the recipient, but with some changes: (i) the message may be encrypted with the remailer's public key; (ii) the remailer does not keep any log that could be used to identify senders; and (iii) the message is not sent directly to the recipient but through a chain of remailers, so that a single remailer does not know both the sender and the recipient (see the sketch below). The drawbacks are that it is not possible to reply to the message and that these remailers are still vulnerable to some kinds of attacks. For this reason, Type-II (or Mixmaster) and Type-III (or Mixminion) remailers have been proposed, but they are not widely used in practice, since they require specially customized software to send mail.
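A minimal sketch of the chaining idea follows, assuming the third-party PyNaCl package for public-key encryption; the message format, field names, and three-hop chain are invented for illustration, and real cypherpunk remailers use their own header conventions over email.

```python
# Illustrative type-I chaining: one public-key layer per remailer, so each
# hop learns only the next hop. Assumes the third-party PyNaCl package.
import json
from nacl.public import PrivateKey, SealedBox

chain = [PrivateKey.generate() for _ in range(3)]  # hypothetical remailers

def wrap(chain, recipient: str, body: str) -> bytes:
    """Encrypt innermost layer first; layer i names only hop i+1."""
    blob = json.dumps({"to": recipient, "body": body}).encode()
    for i in reversed(range(len(chain))):
        layer = {"next_hop": i + 1, "payload": blob.hex()}
        blob = SealedBox(chain[i].public_key).encrypt(json.dumps(layer).encode())
    return blob

blob = wrap(chain, "bob@example.net", "hello")
# Remailer 0 peels its layer: it learns the next hop, not the recipient.
layer = json.loads(SealedBox(chain[0]).decrypt(blob))
print(layer["next_hop"])  # -> 1
```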


5. SUMMARY

In this chapter we have discussed privacy from different viewpoints, from historical to technological. The very nature of the concept of privacy requires such an enlarged perspective, because it often appears indefinite, being constrained into the tradeoff between the undeniable need of protecting personal information and the evident utility, in many contexts, of the availability of the same information. The digital society and the global interconnected infrastructure have eased the access to and spreading of personal information; therefore, developing technical means and defining norms and fair usage procedures for privacy protection are now more demanding than in the past. Economic aspects have been introduced, since they are likely to strongly influence the way privacy is actually managed and protected; in this area, research has provided useful insights about the incentives and disincentives toward better privacy. We presented some of the more advanced solutions that research has developed to date, either for anonymizing stored data, hiding sensitive information in artificially inaccurate clusters, or introducing third parties and middleware in charge of managing online transactions and services in a privacy-aware fashion. Location privacy is a topic that has gained importance in recent years with the advent of mobile devices and is worth specific consideration. Furthermore, the important issue of anonymity on the Internet has been investigated: letting individuals surf the Web, access online services, and interact with remote parties anonymously has been the goal of many efforts for years, and some important technologies and tools are available and gaining popularity.

To conclude, whereas privacy on the Internet and in the digital society does not look to be in good shape, the heightened sensitivity of individuals to its erosion, the many scientific and technological efforts to introduce novel solutions, and a better knowledge of the problem with the help of fresh data all contribute to stimulating the need for better protection and fairer use of personal information. For this reason, it is likely that Internet privacy will remain an important topic in the years to come and that more innovations toward better management of privacy issues will emerge.

Finally, let's move on to the real interactive part of this chapter: review questions/exercises, hands-on projects, case projects, and an optional team case project. The answers and/or solutions by chapter can be found in the Online Instructor's Solutions Manual.

CHAPTER REVIEW QUESTIONS/EXERCISES

True/False

1. True or False? Privacy in today's digital society is one of the most debated and controversial topics.
2. True or False? As it often happens when a debate heats up, the extremes speak louder and, about privacy, the extremes are those that advocate the ban of the disclosure of whatever personal information and those that say that all personal information is already out there, therefore privacy is alive.
3. True or False? Threats to individual privacy have become publicly appalling since July 2012, when the California Security Breach Notification Law [8] went into effect.
4. True or False? The existence of strong economic factors that influence the way privacy is managed, breached, or even traded off has not been recognized.
5. True or False? The relationship between privacy and business has been examined from several angles by considering which incentives could be ineffective for integrating privacy with business processes and, instead, which disincentives make business motivations prevail over privacy.

Multiple Choice

1. With respect to the often difficult relation between business goals and privacy requirements, special attention should be given to _______ business and the many tools and mechanisms that have been developed to improve the knowledge about customers, their habits, and preferences in order to offer them purchase suggestions and personalized services.
A. Privacy-enhancing technology
B. Location technology
C. Web-based
D. Technical improvement
E. Web Technology

2. Technical improvements of _______ and location technologies have fostered the development of online applications that use the private information of users (including the physical position of individuals) to offer enhanced services.
A. Privacy-enhancing technology
B. Location technology
C. Web-based
D. Web
E. Web Technology

3. What systems have been introduced for regulating and protecting access to resources and data owned by parties?
A. Access control
B. XACML
C. XML-based language
D. Certification Authority
E. Security

4. The importance gained by privacy requirements has brought with it the definition of _____________ that are enriched with the ability to support privacy requirements.
A. Access control models
B. Languages
C. Privacy-Aware Access Control
D. Privacy preferences
E. Taps

5. What governs the access/release of services/data managed by the party (as in traditional access control)?
A. Release policies
B. Access control policies
C. Data handling policies
D. Intellectual property
E. Social engineering

EXERCISE

Problem

What should I know about privacy policies?

Hands-On Projects

Project

How can I protect my privacy when shopping online?

Case Projects

Problem

How can I prevent Web sites from sharing my Web browsing habits?


Optional Team Case Project

Problem

How can I prevent my computer from keeping a history of where I browse?

REFERENCES

[1] S.D. Warren, L.D. Brandeis, The right to privacy, Harv. Law Rev. IV (5) (1890).
[2] United Nations, Universal Declaration of Human Rights, <www.un.org/Overview/rights.html>, 1948.
[3] P. Steiner, On the internet, nobody knows you're a dog, Cartoonbank, The New Yorker, <www.cartoonbank.com/item/22230>, 1993.
[4] A. Westin, Privacy and Freedom, New York, 1987.
[5] A. Pfitzmann, M. Waidner, Networks without user observability - design options, Proceedings of the Workshop on the Theory and Application of Cryptographic Techniques on Advances in Cryptology (EuroCrypt '85), vol. 219, LNCS, Springer, Linz, Austria, 1986.
[6] A. Pfitzmann, M. Köhntopp, Anonymity, unobservability, and pseudonymity - a proposal for terminology, Designing Privacy Enhancing Technologies, Springer, Berlin, 2001.
[7] L. Lessig, Free Culture, Penguin Group, 2003, <www.free-culture.cc/>.
[8] California Security Breach Notification Law, Bill Number: SB 1386, <http://info.sen.ca.gov/pub/01-02/bill/sen/sb_1351-1400/sb_1386_bill_20020926_chaptered.html>, February 2002.
[9] DataLossDB, <http://datalossdb.org/>, 2012.
[10] Data Breach Investigations Report, the Verizon RISK team, <http://www.verizonbusiness.com/Products/security/dbir/>, 2011.
[11] J. Hirshleifer, The private and social value of information and the reward to inventive activity, Am. Econ. Rev. 61 (1971) 561-574.
[12] R.A. Posner, The economics of privacy, Am. Econ. Rev. 71 (2) (1981) 405-409.
[13] K.L. Hui, I.P.L. Png, Economics of privacy, in: T. Hendershott (Ed.), Handbooks in Information Systems, vol. 1, Elsevier, 2006.
[14] A.M. Froomkin, The death of privacy?, 52 Stanford Law Rev. (2000) 1461-1469.
[15] A.M. Odlyzko, Privacy, economics, and price discrimination on the internet, in: N. Sadeh (Ed.), Proceedings of the Fifth International Conference on Electronic Commerce (ICEC 2003), ACM, 2003, pp. 355-366.
[16] A.M. Odlyzko, Privacy and the clandestine evolution of e-commerce, Proceedings of the Ninth International Conference on Electronic Commerce (ICEC 2007), ACM, 2007.
[17] M. Brown, R. Muchira, Investigating the relationship between internet privacy concerns and online purchase behavior, J. Electron. Comm. Res. 5 (1) (2004) 62-70.
[18] K.L. Hui, B.C.Y. Tan, C.Y. Goh, Online information disclosure: motivators and measurements, ACM Trans. Internet Technol. 6 (4) (2006) 415-441.
[19] B. Krishnamurthy, C.E. Wills, Privacy diffusion on the Web: a longitudinal perspective, Proceedings of WWW 2009, Madrid, Spain, April 20-24, 2009.
[20] E. McCallister, T. Grance, K.A. Scarfone, NIST Special Publication 800-122, Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), <http://www.nist.gov/manuscript-publication-search.cfm?pub_id=904990>, April 2010.
[21] European Commission, Protection of personal data, <http://ec.europa.eu/justice/newsroom/data-protection/news/120125_en.htm>, 2012.
[22] R. Chandramouli, Privacy protection of enterprise information through inference analysis, Proceedings of the IEEE Sixth International Workshop on Policies for Distributed Systems and Networks (POLICY 2005), Stockholm, Sweden, 2005, pp. 47-56.
[23] L.F. Cranor, Web Privacy with P3P, O'Reilly & Associates, 2002.
[24] G. Karjoth, M. Schunter, A privacy policy model for enterprises, Proceedings of the Fifteenth IEEE Computer Security Foundations Workshop, Cape Breton, Nova Scotia, 2002.
[25] B. Thuraisingham, Privacy constraint processing in a privacy-enhanced database management system, Data Knowl. Eng. 55 (2) (2005) 159-188.
[26] M. Youssef, V. Atluri, N.R. Adam, Preserving mobile customer privacy: an access control system for moving objects and customer profiles, Proceedings of the Sixth International Conference on Mobile Data Management (MDM 2005), Ayia Napa, Cyprus, 2005, pp. 67-76.
[27] P. Samarati, Protecting respondents' identities in microdata release, IEEE Transactions on Knowledge and Data Engineering (TKDE) 13 (6) (November-December 2001).
[28] P. Bonatti, P. Samarati, A unified framework for regulating access and information release on the Web, J. Comput. Secur. 10 (3) (2002) 241-272.
[29] K. Irwin, T. Yu, Preventing attribute information leakage in automated trust negotiation, Proceedings of the Twelfth ACM Conference on Computer and Communications Security (CCS 2005), Alexandria, VA, USA, November 2005.
[30] S. Jajodia, P. Samarati, M. Sapino, V. Subrahmanian, Flexible support for multiple access control policies, ACM Trans. Database Syst. 26 (2) (June 2001) 214-260.
[31] M. Winslett, N. Ching, V. Jones, I. Slepchin, Assuring security and privacy for digital library transactions on the web: client and server security policies, Proceedings of the Fourth International Forum on Research and Technology Advances in Digital Libraries.
[32] World Wide Web Consortium (W3C), Platform for Privacy Preferences (P3P) Project, <www.w3.org/TR/P3P/>, 2002.
[33] P. Ashley, S. Hada, G. Karjoth, M. Schunter, E-P3P privacy policies and privacy authorization, Proceedings of the ACM Workshop on Privacy in the Electronic Society (WPES 2002), Washington, 2002, pp. 103-109.
[34] P. Ashley, S. Hada, G. Karjoth, C. Powers, M. Schunter, Enterprise Privacy Authorization Language (EPAL 1.1), <www.zurich.ibm.com/security/enterprise-privacy/epal>, 2003.
[35] K. Seamons, M. Winslett, T. Yu, Limiting the disclosure of access control policies during automated trust negotiation, Proceedings of the Network and Distributed System Security Symposium (NDSS 2001), San Diego, CA, USA, April 2001.
[36] T. Yu, M. Winslett, K. Seamons, Supporting structured credentials and sensitive policies through interoperable strategies for automated trust negotiation, ACM Trans. Inf. Syst. Secur. (TISSEC) 6 (1) (February 2003) 1-42.
[37] P. Bonatti, D. Olmedilla, Driving and monitoring provisional trust negotiation with metapolicies, Proceedings of the IEEE Sixth International Workshop on Policies for Distributed Systems and Networks (POLICY 2005), Stockholm, Sweden, June 2005.
[38] eXtensible Access Control Markup Language (XACML) Version 2.0, <http://docs.oasis-open.org/xacml/2.0/access_control-xacml-2.0-core-spec-os.pdf>, February 2005.
[39] V. Cheng, P. Hung, D. Chiu, Enabling web services policy negotiation with privacy preserved using XACML, Proceedings of the Fortieth Hawaii International Conference on System Sciences (HICSS 2007), Hawaii, USA, January 2007.
[40] D. Haidar, N. Cuppens, F. Cuppens, H. Debar, XeNA: an access negotiation framework using XACML, Ann. Telecomm. 64 (1-2) (January 2009).
[41] D. Chadwick, S. Otenko, T. Nguyen, Adding support to XACML for dynamic delegation of authority in multiple domains, Proceedings of the Tenth Conference on Communications and Multimedia Security (CMS 2006), Heraklion, Crete, Greece, October 2006.
[42] C. Ardagna, S. De Capitani di Vimercati, S. Paraboschi, E. Pedrini, P. Samarati, M. Verdicchio, Expressive and deployable access control in open Web service applications, IEEE Trans. Serv. Comput. (TSC) 4 (2) (April-June 2011) 86-109.
[43] World Wide Web Consortium (W3C), A P3P Preference Exchange Language 1.0 (APPEL 1.0), <www.w3.org/TR/P3P-preferences/>, 2002.
[44] C.A. Ardagna, M. Cremonini, S. De Capitani di Vimercati, P. Samarati, A privacy-aware access control system, J. Comput. Secur. 16 (4) (2008) 369-392.
[45] J. Camenisch, E. Van Herreweghen, Design and implementation of the idemix anonymous credential system, Proceedings of the Ninth ACM Conference on Computer and Communications Security (CCS 2002), Washington, 2002, pp. 21-30.
[46] C.A. Ardagna, M. Cremonini, E. Damiani, S. De Capitani di Vimercati, P. Samarati, Supporting location-based conditions in access control policies, Proceedings of the ACM Symposium on Information, Computer and Communications Security (ASIACCS '06), Taipei, 2006, pp. 212-222.
[47] A.R. Beresford, F. Stajano, Mix zones: user privacy in location-aware services, Proceedings of the Second IEEE Annual Conference on Pervasive Computing and Communications Workshops (PERCOMW '04), Orlando, 2004, pp. 127-131.
[48] P. Samarati, L. Sweeney, Generalizing data to provide anonymity when disclosing information (abstract), Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS '98), ACM, New York, NY, USA, 1998.
[49] C. Bettini, X.S. Wang, S. Jajodia, Protecting privacy against location-based personal identification, Proceedings of the Second VLDB Workshop on Secure Data Management (SDM '05), Trondheim, Norway, 2005, pp. 185-199.
[50] M. Gruteser, D. Grunwald, Anonymous usage of location-based services through spatial and temporal cloaking, Proceedings of the First International Conference on Mobile Systems, Applications, and Services (MobiSys), San Francisco, 2003, pp. 31-42.
[51] C. Cornelius, A. Kapadia, D. Kotz, D. Peebles, M. Shin, N. Triandopoulos, AnonySense: privacy-aware people-centric sensing, Proceedings of the Sixth International Conference on Mobile Systems, Applications, and Services (MobiSys 2008), Breckenridge, CO, USA, June 2008.
[52] C.-Y. Chow, M.F. Mokbel, X. Liu, Spatial cloaking for anonymous location-based services in mobile peer-to-peer environments, GeoInformatica 15 (2011) 351-380.
[53] C.-Y. Chow, M.F. Mokbel, Trajectory privacy in location-based services and data publication, ACM SIGKDD Explorations Newslett. 13 (1) (June 2011) 19-29.
[54] C.A. Ardagna, M. Cremonini, S. De Capitani di Vimercati, P. Samarati, An obfuscation-based approach for protecting location privacy, IEEE Transactions on Dependable and Secure Computing (TDSC) 8 (1) (January-February 2011) 13-27.
[55] T. Hashem, L. Kulik, Safeguarding location privacy in wireless ad-hoc networks, Proceedings of the Ninth International Conference on Ubiquitous Computing (UbiComp 2007), Innsbruck, Austria, September 2007.
[56] Geographic Location/Privacy (geopriv), <www.ietf.org/html.charters/geopriv-charter.html>, September 2006.
[57] D. Goldschlag, M. Reed, P. Syverson, Hiding routing information, in: R. Anderson (Ed.), Information Hiding: First International Workshop (vol. 1174 of Lecture Notes in Computer Science), Springer-Verlag, 1999, pp. 137-150.
[58] D. Goldschlag, M. Reed, P. Syverson, Onion routing for anonymous and private internet connections, Commun. ACM 42 (2) (1999) 39-41.
[59] M. Reed, P. Syverson, D. Goldschlag, Anonymous connections and onion routing, IEEE J. Sel. Areas Commun. 16 (4) (1998) 482-494.
[60] R. Dingledine, N. Mathewson, P. Syverson, Tor: the second-generation onion router, Proceedings of the Thirteenth USENIX Security Symposium, San Diego, 2004.
[61] H. Roberts, E. Zuckerman, J. York, R. Faris, J. Palfrey, Circumvention Tool Usage Report, Berkman Center for Internet & Society, October 2010.
[62] D. Chaum, Untraceable electronic mail, return addresses, and digital pseudonyms, Commun. ACM 24 (2) (1981) 84-88.
[63] O. Berthold, H. Federrath, S. Köpsell, Web MIXes: a system for anonymous and unobservable Internet access, in: H. Federrath (Ed.), Anonymity 2000 (vol. 2009 of Lecture Notes in Computer Science), Springer-Verlag, 2000, pp. 115-119.
[64] M. Reiter, A. Rubin, Anonymous web transactions with crowds, Commun. ACM 42 (2) (1999) 32-48.