Background knowledge based privacy metric model for online social networks

The Journal of China Universities of Posts and Telecommunications April 2014, 21(2): 75–82 www.sciencedirect.com/science/journal/10058885 http://jcup...



http://jcupt.xsw.bupt.cn

CHENG Cheng, ZHANG Chun-hong, JI Yang
School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract Data from online social networks (OSNs) are currently collected by third parties for various purposes. One problem in such practices is how to measure the privacy breach so that users can be assured. Recent work on OSN privacy mainly focuses on privacy-preserving data publishing, whereas work on privacy metrics is not systematic and mainly addresses traditional datasets. Compared with traditional datasets, the attribute types in OSNs are more diverse and the tuples are related to each other. Retweets and comments give OSN data a pronounced graph character. Furthermore, open application programming interfaces (APIs) and low registration barriers make OSNs an open environment, in which adversaries can obtain background knowledge more easily. This paper analyzes the background knowledge in OSNs and discusses its characteristics in detail. A privacy metric model for OSN background knowledge, based on kernel regression, is then proposed. In particular, the model takes joint attributes and link knowledge into consideration, and the effect of different data distributions is discussed. A real-world dataset from weibo.com is used for evaluation. The results demonstrate that the proposed privacy metric algorithm is effective for OSN privacy evaluation: its prediction error is 30% lower than that of the kernel-based method of Ref. [9], which assumes independent attributes.
Keywords privacy metric, social network, background knowledge, kernel regression

1 Introduction

The data of online social networks (OSNs), including user profiles, preferences, link attributes and service data, are increasingly collected by governments, corporations and research institutes for data analysis. OSN vendors are also willing to publish these data in an aggregated or anonymous form to assist the design and evaluation of their services. This trend is a double-edged sword, and it puts the privacy problem on the agenda. Sufficient privacy preservation is needed to ensure that users' sensitive information is not breached. A privacy exposure might lead to undesirable social effects, such as disclosing personal preferences or uncovering social relations that were intended to be hidden. On the other hand, the more complete the published data, the better the data analysis that can be done.

In OSNs, the data contain both individual attributes and service data generated by interactions between users. They are modeled by graphs in which nodes represent individuals and edges represent relationships. Thus, the data distributions in OSNs are much more diverse and the data are always related to each other. Moreover, OSNs are open environments in which adversaries can easily obtain background knowledge. Therefore, the privacy challenges in OSNs are much more serious than those in traditional datasets.

To protect user privacy, the currently popular method is to anonymize the original data so that adversaries cannot reconstruct the data to connect the traces of a single individual. A popular privacy metric, k-anonymity, was developed in Ref. [1] and further extended to l-diversity [2], t-closeness, etc. Although published data are often aggregated or anonymized, in that the true identities and interactions of users have been replaced or partially deleted by certain anonymity mechanisms, the privacy concern remains in OSNs because these mechanisms do not consider the data characteristics of OSNs discussed above.

The scope of privacy is a psychological notion, which differs across social networks and individuals. In this article, in order to give a general model for measuring the privacy leakage in anonymous OSN data, non-parametric kernel regression [3] is used to integrate multiple attributes in OSNs. The contributions are three-fold:
1) The focus is on anonymous OSN data that could be published to third parties. We formulate the specific background knowledge in anonymous OSNs, compared with that in traditional datasets, which could cause a privacy breach in anonymous data publishing.
2) Armed with the privacy metric, we investigate the privacy protection challenges in anonymous OSNs and introduce a formal privacy metric model based on kernel regression. The model focuses on data distribution and correlation, and abstracts away from procedural distinctions such as whether the data are available in bulk or obtained by crawling the network.
3) Both theoretically and experimentally, it is demonstrated that the multivariate kernel can be exploited by adversaries to significantly improve the accuracy of predicting sensitive attributes. Based on this prediction, user privacy would be breached with high probability in OSNs even when protected by traditional anonymity algorithms.

Received date: 11-07-2013
Corresponding author: CHENG Cheng
DOI: 10.1016/S1005-8885(14)60289-2

2 Related work

Privacy requirements such as k-anonymity [1] and l-diversity [2] are designed to thwart attacks that attempt to identify individuals in the data and to discover their sensitive information. However, they are designed for traditional datasets, and privacy metrics for OSNs have not been well studied.

Background knowledge poses significant challenges for defining privacy for anonymous data. Most existing privacy metrics assume that adversaries do not have much background knowledge. The background knowledge referred to in Refs. [4–8] was not measured or quantified. Li et al. [9] proposed a kernel-based estimate of the adversary's background knowledge, which is relevant to the present work, but its attribute independence assumption is often violated in OSNs. Agrawal et al. [10] proposed that anonymity has fundamental protection limits in two dimensions, namely cryptographic message protection and attacker distribution. Martin et al. [11] argued that, in practice, the data publisher does not know what background knowledge the attacker possesses. Thus, a formal study of worst-case background knowledge was initiated in Ref. [11], which provided a polynomial-time algorithm to measure the amount of disclosure of sensitive information in the worst case. Du et al. [12] measured the privacy leakage in anonymous data publishing based on the maximum entropy principle. Their article treated all the conditional probabilities $P(A_S \mid Q)$ as unknown variables and the background knowledge as constraints on these variables, where $A_S$ stands for the sensitive attributes and $Q$ for the quasi-identifiers. The above works on privacy metrics focused on traditional datasets and did not discuss the background knowledge in OSNs.

Only a few studies have integrated privacy attacks into OSN data publishing through privacy quantification. Narayanan et al. [4] surveyed real-world examples of social network data release in five categories: academic and government data mining, advertising, third-party applications, aggregation, and other data-release scenarios. Most of them released more information than needed for certain attacks. Zhou et al. [5] anonymized OSN data based on neighborhood structural comparison, and Masoumzadeh et al. [13] discussed preserving structural properties in edge-perturbing anonymization techniques; neither took the effects of the adversary's background knowledge into consideration. Ma et al. [14] considered two scenarios of background knowledge attack, the passive adversary and the active adversary. They assumed a tuple-independent property and did not model the relationships between individuals. In addition, Ma et al. [14] considered only the background knowledge that can be mined from the data to be released, whereas in practice the adversary could also acquire knowledge in other ways in OSNs. Moreover, the existing works are not generic enough to cover the variety of privacy requirements in OSNs.

Recently, an increasing amount of work has focused on privacy protection [15–18], which indicates the importance of privacy in future information and communication technology (ICT). Building on the work mentioned above, this paper considers the privacy problem from another aspect: a general model for measuring the privacy leakage in anonymous OSN data by taking background knowledge into consideration. In particular, the characteristics of OSNs, such as joint attributes, data distribution and directed links, are considered.

3 Background knowledge definition and modeling

In this section, the OSN data types and their representation are discussed, and the background knowledge in social networks is introduced.

3.1 Data types

Traditional datasets often consist of unique identifiers (ID), quasi-identifiers (QI) and sensitive attributes (SA). A unique identifier is the user's ID or anything else that can uniquely identify a specific entity. A quasi-identifier is a set of non-sensitive attributes whose combination can be mapped to some unique identifier. Sensitive attributes are considered to be the private information of a protected object, so the main task in anonymous data publishing is to ensure that sensitive attributes cannot be linked to a unique identifier.

Social networks often consist of vertices, edges, and information associated with each vertex and edge. Compared with traditional datasets, the elements in OSNs are more complicated [4]. A vertex (V), e.g. a user ID, includes a person's static and unique identity information. Edges (E), vertex attributes ($A_V$) and edge attributes ($A_E$) are quasi-identifiers, which we assume to be non-sensitive but which could be used to estimate personal identity. Data of this kind, such as user profiles, friend links and interaction data, can be obtained easily from OSN APIs. A sensitive attribute ($A_S$) is considered private information of an individual, which should be protected from attack. In OSNs, $A_S$ might differ between individuals, which should be considered in anonymity algorithm design.

3.2 Background knowledge

Definition 1 Background knowledge: the additional information that an adversary has obtained besides the released data.

3.2.1 Overview of background knowledge

It is well known that hidden information mined from the released data can, with high probability, induce privacy leakage. The background knowledge in OSNs can take many different forms [12]. It can be classified into three categories: knowledge about data, knowledge about individuals, and knowledge about the mechanism or algorithm used for anonymization. Knowledge about data consists of the data distribution and the knowledge mined from the published anonymous data. For example, elderly users are less active after 9 o'clock at night. Knowledge about individuals consists not only of link relations and user profiles, but also of the topics or groups in which users have participated. Knowledge about the anonymity algorithm means that adversaries know the parameters and other details of the anonymity algorithm in use. Furthermore, not only accurate knowledge but also uncertain background knowledge can induce privacy leakage. For example, an adversary may know that Alice has cancer with some probability in [0.2, 0.4] [9].

Background knowledge is always used to locate a target individual. In general, the relation between background knowledge $B$ and the target individual's sensitive attribute $A_S$ can be divided into the following three principles, which were proposed in Ref. [19].
1) $B \Rightarrow \neg A_S$. The knowledge is negatively associated with the sensitive attribute. For example, from the knowledge that the target individual is 70 years old, we could infer that his active time on the social network cannot be 2 o'clock in the morning, which can be represented as $\{\text{Age}=70\} \Rightarrow \neg\{\text{ActiveTime}=2\}$.
2) $B \Rightarrow A_S$. The knowledge is positively associated with the sensitive attribute. For example, an individual whose occupation is computer engineer has a high probability of being near-sighted, which is $\{\text{Occupation}=\text{computer\_engineer}\} \Rightarrow \{\text{Disease}=\text{myopia}\}$.
3) The set $B_1 = \{b_1, b_2, \ldots\}$ is a family with the same value, and $B_1 \subset B$. If $b \in B_1$ and $b \Rightarrow A_S$, then $B_1 \Rightarrow A_S$. The knowledge about the sensitive attribute has a cluster feature. For example, several individuals can be clustered into the same group in an OSN; once one of them is associated (negatively or positively) with a sensitive attribute, the others in the group have a high probability of being associated with it as well.

3.2.2 Background knowledge in OSNs

Since OSNs are often open environments, background knowledge is more likely to be obtained. Compared with traditional datasets, behaviors such as follow, retweet and comment give OSNs a pronounced graph character. Meanwhile, attributes in OSNs are more diverse: not only gender and age, but also real-time location and terminal type are considered user attributes. The relevance between attributes and user activity becomes more complex. Thus, the scope of the adversary's background knowledge is extended. We supplement the following two cases for OSNs.
1) Background knowledge about joint attributes
There are multi-dimensional attributes in OSNs, as discussed in Sect. 3.1, which are correlated with each other to some extent. For convenience, previous work always assumed that the attributes are independent. The relation between the sensitive value and the background knowledge about joint attributes can be modeled as $P(A_S \mid B_i, B_j)$, in which $B_i, B_j \subset B$ are two sets of attributes.
2) Link knowledge
Link knowledge, such as the directed links between two individuals, is associated with a sensitive value. Let $B_{ab}$ represent the link knowledge of two individuals a and b. The relation between link knowledge and the sensitive value can then be modeled as $B_{ab} \Rightarrow A_S$ or $B_{ab} \Rightarrow \neg A_S$. For example, $\{\text{Link}(\text{Alice},\text{Bob})=1 \;\&\&\; \text{ActiveTime}(\text{Bob})=2\} \Rightarrow \{\text{ActiveTime}(\text{Alice})=2\}$ means that if Alice and Bob are linked to each other and Bob is always active at 2 a.m. on the OSN, then Alice has a high probability of being active at 2 a.m.

4 Privacy metric model in social network

In this section, the authors propose a methodology for measuring privacy leakage in social networks based on background knowledge. Building on the work described in Ref. [9], the authors first introduce kernel regression to measure the knowledge that an adversary could have. Compared with Ref. [9], which assumed that user attributes are independent, the model has three characteristics:
1) It focuses on background knowledge that is consistent with the original data.
2) It studies the joint attributes in OSNs and takes the data distribution into account.
3) It discusses the effect of link knowledge on the privacy metric.

4.1 Preliminaries

A social network $G(V, E)$ is a set of nodes $\{v_i \mid v_i \in V, 1 \le i \le n\}$ connected by a set of links $\{e_{ij} \mid e_{ij} \in E, 1 \le i, j \le n\}$. For each $v_i \in V$, a tuple $t_i$ $(1 \le i \le n)$ represents all attributes of an individual, including vertex attributes ($A_V$), link attributes ($A_E$) and sensitive attributes ($A_S$). The attributes are varied; for example, the gender, age and province of a person are vertex attributes. Let d, m and g denote the number of dimensions of the vertex, sensitive and link attributes, respectively. Therefore, $t[A_V] = (t[A_{V_1}], t[A_{V_2}], \ldots, t[A_{V_d}])$ and $t[A_S] = (t[A_{S_1}], t[A_{S_2}], \ldots, t[A_{S_m}])$, in which $t[A_V]$ and $t[A_S]$ are $1 \times d$ and $1 \times m$ vectors, respectively, and $t[A_{E_j}]$ $(1 \le j \le g)$ is an $n \times n$ matrix representing the link attribute values between users on the jth link attribute. For example, $t_{ab}[A_{E_j}] = 1$ is the value between two individuals a and b on the jth link attribute.

An individual could have multi-dimensional sensitive attributes, but only a one-dimensional sensitive attribute is considered in the following discussion; the model can also be extended to multi-dimensional sensitive attributes. The authors also introduce the domain D to represent all possible values of $A_V$, $A_E$ and $A_S$. Our privacy metric model focuses on what types of data are released and how the data are sanitized [20] (generalization, distortion, bucketization, etc.). The definition of privacy breach is given below; it is also the evaluation criterion of the privacy metric model.

Definition 2 Privacy breach: set two parameters $\alpha$ and $\beta$ $(0 \le \alpha, \beta \le 1, \alpha < \beta)$, which can be defined by individuals. Let $T^*$ be the released data and $B = \{b_1, b_2, \ldots, b_n\} \in D$ be the background knowledge, in which each tuple $b_j$ $(1 \le j \le n)$ represents the background knowledge related to individual j. The function $P_{pri}: \{B[A_V], B[A_E]\} \to B[A_S]$ is the prior belief of an adversary, which depends on the background knowledge. Let $p_1$ denote the probability that an individual can be identified in the released data. Similarly, let $p_2$ denote the probability that the prediction $P_{pri}(T^*)$ based on the released data is correct. Thus, if

$$\left.\begin{aligned} p_1 &< \alpha \\ p_1 p_2 + \left[1 - (1 - p_1)(1 - p_2)\right] &> \beta \end{aligned}\right\};\quad 0 \le \alpha, \beta \le 1,\ \alpha < \beta \qquad (1)$$

holds, the privacy is breached by background knowledge.
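As an illustration only, the two conditions of Eq. (1) can be checked for one individual with a few lines of Python. This is a minimal sketch and not the authors' Matlab implementation; the function name `is_privacy_breached` and its argument names are hypothetical, and p1 and p2 are assumed to have been estimated beforehand.

```python
def is_privacy_breached(p1: float, p2: float, alpha: float, beta: float) -> bool:
    """Evaluate the two conditions of Eq. (1) for one individual.

    p1: probability that the individual is identified in the released data.
    p2: probability that the adversary's prediction is correct.
    alpha, beta: user-defined thresholds with 0 <= alpha < beta <= 1.
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0 and alpha < beta):
        raise ValueError("need 0 <= alpha, beta <= 1 and alpha < beta")
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("p1 and p2 must be probabilities")
    # Left-hand side of the second condition, written exactly as in Eq. (1).
    combined = p1 * p2 + (1.0 - (1.0 - p1) * (1.0 - p2))
    return p1 < alpha and combined > beta
```

For example, with hypothetical values p1 = 0.05, p2 = 0.9, alpha = 0.1 and beta = 0.6, both conditions hold and the function reports a breach.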


4.2 Privacy metric model via multivariate kernel regression

Kernel regression is a non-parametric technique in statistics for estimating the conditional expectation of a random variable. Given a dataset, the kernel regression method tries to find the underlying function that best fits the data at the data points. The work in Ref. [9] has discussed the advantages of the kernel regression method for approximating the function $P_{pri}$. We extend the kernel estimation framework to model multivariate background knowledge. Given a d-dimensional vector $q = (q_1, q_2, \ldots, q_d) \in D[A_V]$, we have the regression equation $P_{pri}(q) = E(t[A_S] \mid q_1, q_2, \ldots, q_d)$. In recent works, the components of q are usually assumed to be independent. Thus, using the Nadaraya-Watson kernel weighted average [3], the prediction at q is estimated as

$$\hat{P}_{pri}(q) = \frac{\sum_{b_j \in B} b_j[A_S] \prod_{1 \le i \le d} K_h(q_i, b_j[A_{V_i}])}{\sum_{b_j \in B} \prod_{1 \le i \le d} K_h(q_i, b_j[A_{V_i}])} \qquad (2)$$

where $K(\cdot)$ is a generalized product kernel that admits both continuous and categorical data, and h is a bandwidth that matches the distribution of $D[A_{V_i}]$. B and $b_j$ are defined in Definition 2, and each $b_j$ corresponds to a sensitive value $b_j[A_S]$. The sensitive value of tuple $b_j$ is smoothed by the kernel function. Note that the denominator normalizes the distribution. Since the choice of the kernel function $K(\cdot)$ is not as important as the choice of the bandwidth h [3], we use the popular Gaussian kernel function in our approach:

$$K_h(q, b_j[A_{V_i}]) = \frac{1}{\sqrt{2\pi}\,h} \exp\left[-\frac{1}{2}\left(\frac{q - b_j[A_{V_i}]}{h}\right)^2\right] \qquad (3)$$

1) Joint attributes based privacy metric

In this section, the authors adopt the covariance to compute the relevance between two attributes. Let $A_{V_i}$ and $A_{V_j}$ represent two attributes, and let $X_i$ and $X_j$ be the corresponding attribute vectors, each of which has N values. Then the covariance is given by

$$C[A_{V_i}, A_{V_j}] = \frac{\sum_{k=1}^{N} (X_{ik} - \bar{X}_i)(X_{jk} - \bar{X}_j)}{N - 1} \qquad (4)$$

Considering the differences in data distribution, each attribute should be associated with its own kernel bandwidth, which we call the local bandwidth, so that the original set yields a bandwidth matrix. The local bandwidth is estimated as follows [21]:

$$\mu_i = \left\{\frac{\hat{f}(q_i)}{\exp\left(\frac{1}{d}\sum_{k=1}^{d} \lg \hat{f}(q_k)\right)}\right\}^{-\alpha};\quad 0 < \alpha \le 1 \qquad (5)$$

in which $\alpha$ is the regulation parameter and $\hat{f}(q)$ is the sampling density, which can be computed by

$$\hat{f}(q) = \frac{1}{h|B|} \sum_{b_j \in B} K\left(\frac{q - b_j[A_V]}{h}\right) \qquad (6)$$

Note that the bandwidth matrix would be diagonal if the attributes were independent. Considering the joint attributes, we refine the bandwidth matrix by introducing the global bandwidth h and the local bandwidth $\mu_i$ $(1 \le i \le d)$, which represents the local sampling density of an attribute. Thus, the bandwidth matrix can be defined as $H = h\mu_i C^{-1/2}$ [21].
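The local bandwidth of Eqs. (5) and (6) can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes NumPy, a Gaussian pilot kernel, that lg denotes the natural logarithm (so that the denominator of Eq. (5) is a geometric mean), and that the density of each component q_i is estimated from the corresponding column of the background data. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def pilot_density(q, samples, h):
    """Eq. (6): kernel density estimate at q from one attribute's background values."""
    samples = np.asarray(samples, dtype=float)
    return gaussian_kernel((q - samples) / h).sum() / (h * samples.size)

def local_bandwidths(q, background, h, alpha=0.5):
    """Eq. (5): mu_i = (f(q_i) / geometric mean of f(q_k)) ** (-alpha).

    q          : length-d query vector
    background : (n, d) array whose column i holds the values b_j[A_Vi]
    h          : global bandwidth used for the pilot density
    alpha      : regulation parameter, 0 < alpha <= 1
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    q = np.asarray(q, dtype=float)
    background = np.asarray(background, dtype=float)
    dens = np.array([pilot_density(q[i], background[:, i], h)
                     for i in range(q.size)])
    # Guard against log(0) when every kernel weight underflows.
    dens = np.maximum(dens, np.finfo(float).tiny)
    geo_mean = np.exp(np.mean(np.log(dens)))  # assumption: lg is the natural log
    return (dens / geo_mean) ** (-alpha)
```

Under these assumptions the factors multiply to one, so an attribute sampled densely around the query receives a narrower kernel (mu_i below 1) and a sparsely sampled attribute a wider one.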

The global bandwidth can be computed directly by cross-validation [3]. Therefore, the Gaussian kernel function is modified by the introduced local bandwidth and covariance matrix:

$$K_H(q_i, b_j[A_{V_i}]) = \frac{\sqrt{\det C}}{\sqrt{2\pi}\,h\mu_i} \exp\left[-\frac{(q_i - b_j[A_{V_i}])\,C\,(q_i - b_j[A_{V_i}])^T}{2h^2\mu_i^2}\right] \qquad (7)$$

Accordingly, the privacy metric model can adapt to the joint attributes.

2) Link knowledge based privacy metric

In the above privacy metric model, only the vertex knowledge $B[A_V]$ of an adversary is considered.

However, the link knowledge $B[A_E]$ has a high probability of identifying an individual uniquely. In this section, the authors take link knowledge into consideration and estimate its effects on privacy leakage. Each individual corresponds to a vector $t_i[A_E]$, which represents the integrated link knowledge between individual i and the other $(n - 1)$ individuals, and can be described as $t_i[A_E] = \sum_{j=1}^{n} r_{ij}\upsilon_j$ $(i \ne j)$. In this expression, $r_{ij}$ is the integrated link knowledge value between i and j, and $\upsilon_j$ represents the direction vector from i to j.

Based on the kernel regression model, the task for link knowledge becomes how to calculate the distance between two individuals on link attributes. Since an OSN is large-scale and can contain up to billions of individuals, two compared individuals rarely have common linked neighbors. Thus, the Euclidean distance cannot be used directly to compute the link attribute distance. The Euclidean norm, however, is often adopted to represent a vector by a single value. In this article, the authors therefore integrate the link attributes of an individual into a constant by the Euclidean norm, so that $L_i = \|t_i[A_E]\| = \sqrt{r_{i1}^2 + r_{i2}^2 + \cdots + r_{in}^2}$ is used to represent the integrated link attributes of individual i. Therefore, link knowledge can be introduced into the privacy metric model of Eq. (7) as a new variable.
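To make the estimator concrete, the sketch below computes the link norm L_i and appends it as an extra column before applying the Nadaraya-Watson average. It is an illustration under stated assumptions, not the authors' code: it uses NumPy and the product Gaussian kernel of Eqs. (2) and (3), i.e. the independent-attribute form rather than the matrix kernel of Eq. (7), and all function names, attribute values and bandwidths are hypothetical.

```python
import numpy as np

def gaussian_kernel_h(q, b, h):
    """Eq. (3): one-dimensional Gaussian kernel with bandwidth h."""
    return np.exp(-0.5 * ((q - b) / h) ** 2) / (np.sqrt(2.0 * np.pi) * h)

def link_norm(link_rows):
    """L_i = ||t_i[A_E]||: Euclidean norm of each row of link values r_ij."""
    return np.linalg.norm(np.asarray(link_rows, dtype=float), axis=1)

def predict_sensitive(q, bg_attrs, bg_sensitive, h):
    """Eq. (2): Nadaraya-Watson average with a product kernel over the columns.

    q            : length-d query (vertex attributes, optionally plus L_i)
    bg_attrs     : (n, d) background tuples b_j, same column order as q
    bg_sensitive : length-n sensitive values b_j[A_S]
    h            : scalar or length-d array of per-attribute bandwidths
    """
    q = np.asarray(q, dtype=float)
    bg_attrs = np.asarray(bg_attrs, dtype=float)
    bg_sensitive = np.asarray(bg_sensitive, dtype=float)
    # One weight per background tuple: product of the per-attribute kernels.
    weights = gaussian_kernel_h(q, bg_attrs, h).prod(axis=1)
    total = weights.sum()
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase h")
    return float(weights @ bg_sensitive / total)

# Hypothetical toy data: columns are age and gender, plus the link norm.
attrs = np.array([[20.0, 1.0], [22.0, 1.0], [65.0, 0.0]])
links = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
features = np.column_stack([attrs, link_norm(links)])
active_hour = np.array([23.0, 22.0, 9.0])
estimate = predict_sensitive(features[0], features, active_hour,
                             h=np.array([5.0, 0.5, 0.5]))
```

In this toy example the query coincides with the first tuple, so the estimate is dominated by the two young users and lies between their active hours of 22 and 23.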

5 Experiment

The main goal of the experiments is to study the effect of background knowledge in OSNs. In particular, the work focuses on joint attributes and link knowledge. To estimate the privacy breach resulting from background knowledge, the effect is compared using three utility metrics:
1) Prediction error when mining the joint attributes and the local bandwidth.
2) Prediction error when introducing the link attributes.
3) The number of individuals whose privacy is breached in k-anonymity data publishing.

The experimental data come from Weibo.com, the most famous Chinese OSN website. We use eight user attributes, including user links, from Weibo.com, as shown in Table 1, where the sensitive attribute is the user's active time. Tuples with missing values are eliminated, leaving 580 valid tuples in total. All algorithms are implemented in Matlab. The correlation coefficients of the attributes in Table 1 are calculated; 10 attribute pairs have a correlation coefficient greater than 0.1, and the largest correlation coefficient in the dataset is 0.47.
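The screening of correlated attribute pairs described above could be reproduced along the following lines. This is only a sketch in Python with NumPy, whereas the paper's experiments were run in Matlab; it assumes Pearson correlation, compares the absolute value of the coefficient with the threshold, and uses a hypothetical function name and toy data.

```python
import numpy as np

def correlated_pairs(data, names, threshold=0.1):
    """Return (name_a, name_b, r) for attribute pairs with |r| above threshold.

    data  : (n, d) array, one row per tuple and one column per attribute
    names : length-d list of attribute names, in column order
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)          # (d, d) Pearson matrix
    rows, cols = np.triu_indices(len(names), k=1)   # each unordered pair once
    return [(names[i], names[j], float(corr[i, j]))
            for i, j in zip(rows, cols)
            if abs(corr[i, j]) > threshold]

# Hypothetical toy data: "a" and "b" are proportional, "c" is uncorrelated.
toy = np.array([[1.0, 2.0, 1.0],
                [2.0, 4.0, -1.0],
                [3.0, 6.0, -1.0],
                [4.0, 8.0, 1.0]])
pairs = correlated_pairs(toy, ["a", "b", "c"])
```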

5.1 Effects of joint attributes in privacy metric

The authors use the prediction error to measure the effects of joint attributes on the privacy metric. Let R be the set of tuples to be predicted. For each tuple $r \in R$, the actual sensitive value of r in the original table is denoted $v_{org}(r)$. We estimate the value for r using the adversary's background knowledge, which is denoted as the prediction value $v_{pre}(r)$. The prediction error is defined as

$$\varepsilon = \frac{1}{|R|} \sum_{r \in R} \frac{\left|v_{pre}(r) - v_{org}(r)\right|}{v_{org}(r)} \qquad (8)$$
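Eq. (8) is a mean relative error, and a minimal sketch of it is given below. The code assumes NumPy, takes the numerator as an absolute difference, and assumes positive original values (as with the active-time attribute); the function name is hypothetical.

```python
import numpy as np

def prediction_error(v_pre, v_org):
    """Eq. (8): mean of |v_pre(r) - v_org(r)| / v_org(r) over the tuples in R."""
    v_pre = np.asarray(v_pre, dtype=float)
    v_org = np.asarray(v_org, dtype=float)
    if v_pre.shape != v_org.shape or v_org.size == 0:
        raise ValueError("v_pre and v_org must be non-empty and the same shape")
    if np.any(v_org == 0):
        raise ValueError("relative error is undefined where v_org is zero")
    return float(np.mean(np.abs(v_pre - v_org) / v_org))
```

For instance, predictions of 11 and 18 against original values of 10 and 20 are each off by 10%, so the function returns 0.1.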

Prediction errors reflect how well the association between the quasi-identifiers and the sensitive attribute is preserved. The effects of joint attributes and local bandwidth under different global bandwidths are examined, and Fig. 1 shows the results. The proposed method is compared with the traditional method of Ref. [9], which assumes that the variables are independent. Two groups of variables from Weibo.com are used: the first group contains variables with smaller correlation coefficients, from 0.01 to 0.10, and the second group contains variables with larger correlation coefficients, from 0.3 to 0.5. As the global bandwidth grows, the prediction error becomes stable. The prediction errors of the traditional method in the two groups are almost the same. With the proposed method, however, the group with larger correlation coefficients is predicted more accurately: its prediction error is 0.04 lower than that of the group with smaller correlation. Meanwhile, the result shows that our method, by taking joint attributes into account, is more accurate as a privacy metric, with a prediction error 30% lower than that of the work in Ref. [9].

Table 1 Experimental data

No.  Attributes             Type         Domain  Bandwidth
1    Gender                 Categorical  2       0.2967
2    Status count           Numeric      7461    0.0089
3    Favorites count        Numeric      928     0.0019
4    Province               Categorical  24      0.0018
5    Account creation time  Numeric      3       0.1466
6    Followers (links)      Numeric      5208    0.0060
7    Friends (links)        Numeric      1166    0.0258
8    Active time            Sensitive    50      0.0098

Fig. 1 Experiment on correlation coefficient

5.2 Effects of link knowledge in privacy metric

In this section, the authors simplify the link knowledge in Weibo.com to three values: follower only, friend only, or mutual friends. Building on the result of Sect. 5.1, the comparison dimension of the privacy metric is extended to link knowledge. Fig. 2 shows the result of introducing link knowledge into the privacy metric. The prediction error increases as the global bandwidth grows. Compared with using joint attributes alone, the prediction error after introducing link knowledge is slightly smaller, depending on the selection of the global bandwidth. The maximum difference is 0.035, and the effect is better when the global bandwidth is less than 0.015 or greater than 0.050, which is related to the data distribution of the link knowledge. Furthermore, the curve obtained with link knowledge rises more sharply.

Fig. 2 Effects of link knowledge

5.3 Privacy breach in k-anonymity

Among the existing privacy models, k-anonymity is the most popular and the most basic. A dataset satisfies k-anonymity if the equivalence classes of tuples induced by the anonymization are all of size k or greater. To examine the effects of our model on anonymous data, k-anonymity is chosen as the anonymous data publishing model.

Definition 3 The matching k-anonymity requirement: the matching k-anonymity requirement guarantees that an adversary with background knowledge can learn the sensitive value of an individual from a released set with a probability of at most 1/k.

Since the experiment on privacy breach is based on k-anonymity, the privacy breach parameter is $\alpha = 1/k$. The global bandwidth is fixed at $h = 0.05$, with $\alpha = 0.001$ and $\beta = 0.6$ (see Definition 2). Fig. 3 shows the number of individuals whose privacy has been breached. There are 80 test individuals in total, and at most about 58 individuals have their privacy leaked under our method, which is more serious than in the work that assumes individuals are independent.
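The equivalence-class condition of k-anonymity described above can be checked with a short sketch. It is illustrative only: rows are assumed to be dictionaries, the attribute names are hypothetical, and the check covers only the class-size condition, not the matching k-anonymity requirement of Definition 3.

```python
from collections import Counter

def smallest_equivalence_class(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    if not rows:
        raise ValueError("rows must not be empty")
    classes = Counter(tuple(row[name] for name in quasi_identifiers)
                      for row in rows)
    return min(classes.values())

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k tuples."""
    return smallest_equivalence_class(rows, quasi_identifiers) >= k

# Hypothetical released tuples: two quasi-identifiers and one sensitive value.
released = [
    {"gender": "F", "province": "Beijing", "active_time": 23},
    {"gender": "F", "province": "Beijing", "active_time": 9},
    {"gender": "M", "province": "Hebei", "active_time": 2},
]
```

Here the third tuple forms an equivalence class of size one, so the toy table is not 2-anonymous until that tuple is generalized or suppressed.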

Fig. 3 Privacy breach in k-anonymity

6 Conclusions

By discussing the data types and background knowledge of OSNs, the authors propose a general privacy metric model. The privacy leakage of OSN data is measured based on multivariate kernel regression, which has proved efficient in high-dimensional regression. To account for the effect of joint attributes and link knowledge on privacy leakage, the covariance matrix and the Euclidean norm are adopted to extend the kernel regression. Meanwhile, the introduction of the local bandwidth improves the accuracy of the privacy metric in OSNs. Several directions for future research exist, including privacy-preserving data publishing for OSNs and personalized privacy protection.

Acknowledgements The work was supported by the Social Network Based Cloud Service Technology for TV Content and Application (202BAH41F03).

References
1. Sweeney L. K-anonymity: a model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-based Systems, 2002, 10(5): 557−570
2. Machanavajjhala A, Kifer D, Gehrke J, et al. L-diversity: privacy beyond k-anonymity. Proceedings of the International Conference on Data Engineering (ICDE’06), Apr 3−7, 2006, Atlanta, GA, USA. Piscataway, NJ, USA: IEEE, 2006: 24
3. Ruppert D, Wand M P. Multivariate locally weighted least squares regression. Annals of Statistics, 1994, 22(3): 1346−1370
4. Narayanan A, Shmatikov V. De-anonymizing social networks. Proceedings of the IEEE 30th Symposium on Security and Privacy, May 17−20, 2009, Berkeley, CA, USA. Piscataway, NJ, USA: IEEE, 2009: 173−187
5. Zhou B, Pei J. Preserving privacy in social networks against neighborhood attacks. Proceedings of the IEEE 24th International Conference on Data Engineering (ICDE’08), Apr 7−12, 2008, Cancun, Mexico. Piscataway, NJ, USA: IEEE, 2008: 506−515
6. Rechert K, Wohlgemuth S, Echizen I, et al. User centric privacy in mobile communication scenarios. Proceedings of the IEEE/IPSJ 11th International Symposium on Applications and the Internet (SAINT’11), Jul 18−21, 2011, Munich, Germany. Piscataway, NJ, USA: IEEE, 2011: 202−207
7. Machanavajjhala A, Kifer D, Abowd J, et al. Privacy: theory meets practice on the map. Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE’08), Apr 7−12, 2008, Cancun, Mexico. Piscataway, NJ, USA: IEEE, 2008: 277−286
8. Li T C, Li N H, Zhang J. Modeling and integrating background knowledge in data anonymization. Proceedings of the 25th IEEE International Conference on Data Engineering (ICDE’09), Mar 29−Apr 2, 2009, Shanghai, China. Piscataway, NJ, USA: IEEE, 2009: 6−17
9. Atzmueller M, Puppe F. A methodological view on knowledge-intensive subgroup discovery. Proceedings of the 15th International Conference on Managing Knowledge in a World of Networks (EKAW’06), Oct 2−6, 2006, Podebrady, Czech Republic. 2006: 318−325
10. Agrawal D, Kesdogan D. Measuring anonymity: the disclosure attack. IEEE Security and Privacy, 2003, 1(6): 27−34
11. Martin D, Kifer D, Machanavajjhala A, et al. Worst-case background knowledge for privacy-preserving data publishing. Proceedings of the 23rd IEEE International Conference on Data Engineering (ICDE’07), Apr 15−20, 2007, Istanbul, Turkey. Piscataway, NJ, USA: IEEE, 2007: 126−135
12. Du W L, Teng Z X, Zhu Z T. Privacy-maxent: integrating background knowledge in privacy quantification. Proceedings of the 34th ACM SIGMOD International Conference on Management of Data (SIGMOD’08), Jun 9−12, 2008, Vancouver, Canada. New York, NY, USA: ACM, 2008: 459−472
13. Masoumzadeh A, Joshi J. Preserving structural properties in edge-perturbing anonymization techniques for social networks. IEEE Transactions on Dependable and Secure Computing, 2012, 9(6): 877−889
14. Ma C, Yau D, Yip N, et al. Privacy vulnerability of published anonymous mobility traces. IEEE/ACM Transactions on Networking, 2012, 21(3): 720−733
15. Mahmoud M, Shen X M. A cloud-based scheme for protecting source-location privacy against hotspot-locating attack in wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems, 2012, 23(10): 1805−1818
16. Barker S, Genovese V. Access control with privacy enhancements a unified approach. IEEE Transactions on Dependable and Secure Computing, 2012, 9(5): 670−683
17. Ghinita G, Kalnis P, Tao Y F. Anonymous publication of sensitive transactional data. IEEE Transactions on Knowledge and Data Engineering, 2011, 23(2): 161−174
18. Li Y P, Chen M H. Enabling multilevel trust in privacy preserving data mining. IEEE Transactions on Knowledge and Data Engineering, 2012, 24(9): 1598−1612
19. Li T C, Li N H. Injector: Mining background knowledge for data anonymization. Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE’08), Apr 7−12, 2008, Cancun, Mexico. Piscataway, NJ, USA: IEEE, 2009: 446−455
20. Agrawal R, Srikant R. Privacy-preserving data mining. Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’00), May 16−18, 2000, Dallas, TX, USA. New York, NY, USA: ACM, 2000: 439−462
21. Wand M, Jones M. Kernel smoothing. London, UK: Chapman & Hall/CRC, 1995

(Editor: WANG Xu-ying)
