6th IFAC Workshop on Distributed Estimation and Control in 6th 6th IFAC IFAC Workshop Workshop on Distributed Distributed Estimation Estimation and and Control Control in in Networked Systemson 6th IFAC Workshop on Distributed Estimation and Control in Networked Networked Systems Systems September 8-9, 2016. Tokyo, Japan Available online at www.sciencedirect.com Networked Systems September September 8-9, 8-9, 2016. 2016. Tokyo, Tokyo, Japan Japan September 8-9, 2016. Tokyo, Japan
ScienceDirect
IFAC-PapersOnLine 49-22 (2016) 043–048 Privacy-Constrained Communication Privacy-Constrained Communication Privacy-Constrained∗ Communication ∗
Farhad Farokhi ∗ Girish Nair ∗ ∗ Girish Nair ∗ Farhad Farokhi Farhad Farokhi ∗ Girish Nair ∗ Farhad Farokhi Girish Nair ∗ Department of Electrical and Electronic Engineering, ∗ ∗ Department of Electrical and Electronic Engineering, of and Engineering, University of Melbourne, VIC 3010, Australia ∗ Department Department of Electrical ElectricalParkville, and Electronic Electronic Engineering, University of Melbourne, Parkville, VIC 3010, Australia University of Melbourne, Parkville, VIC 3010, Australia (e-mail: {ffarokhi,gnair}@unimelb.edu.au) University of Melbourne, Parkville, VIC 3010, Australia (e-mail: {ffarokhi,gnair}@unimelb.edu.au) (e-mail: {ffarokhi,gnair}@unimelb.edu.au) (e-mail: {ffarokhi,gnair}@unimelb.edu.au) Abstract: A game is introduced to study the effect of privacy in strategic communication Abstract: A game is introduced to study the effect privacy in strategic communication Abstract: A to the of privacy in communication between a well-informed sender and a receiver. The of receiver wants to accurately estimate Abstract: A game game is is introduced introduced to study study the effect effect of privacy in strategic strategic communication a well-informed sender and aa receiver. The receiver wants to accurately estimate between a well-informed sender and receiver. The receiver wants to accurately estimate abetween random variable. The sender, however, wants to communicate a message that balances between a well-informed sender and a receiver. The receiver wants to accurately estimateaaa a random variable. The sender, however, wants to communicate aa message that balances a random variable. The sender, however, wants to communicate message that balances trade-off between providing an accurate measurement and minimizing the amount of leakeda a randombetween variable.providing The sender, however,measurement wants to communicate a message that balances trade-off accurate and minimizing the amount of leaked trade-off between providing an accurate and minimizing the amount of leaked private information, which isan assumed to measurement be correlated with the to-be-estimated variable. The trade-off between providing an accurate measurement and minimizing the amount of leaked private information, which is assumed to be correlated with the to-be-estimated variable. The private information, which is assumed to be correlated with the to-be-estimated variable. The mutual information between the transmitted message and the private information is used as private information, which is the assumed to be correlated with the to-be-estimated variable. Theaaa mutual information between transmitted message and the private information is used as mutual information between the transmitted message and the private information is used as measure of the amount of leaked information. An equilibrium is constructed and its properties mutual information between the transmitted message and the private information is used as a measure of the amount of leaked information. An equilibrium is constructed and its properties measure of the amount of leaked information. An equilibrium is constructed and its properties are investigated. measure of the amount of leaked information. An equilibrium is constructed and its properties are investigated. are investigated. are2016, investigated. © IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Keywords: Data privacy; Information Theory; Communication Systems; Statistical inference Keywords: Data privacy; privacy; Information Theory; Theory; Communication Systems; Systems; Statistical inference inference Keywords: Keywords: Data Data privacy; Information Information Theory; Communication Communication Systems; Statistical Statistical inference 1. INTRODUCTION Friedman and Schuster, 2010; Huang et al., 2014; Le Ny 1. INTRODUCTION INTRODUCTION Friedman and Schuster, 2010; Huang et al., 2014; Le Ny 1. Friedman and Schuster, 2010; Huang et al., 2014; Ny and Pappas, Those studies rely noise, 1. INTRODUCTION Friedman and2014). Schuster, 2010; Huang et on al.,adding 2014; Le Le Ny and Pappas, 2014). Those studies rely on adding noise, and Pappas, 2014). Those studies rely on adding noise, typically Laplace noises, to guarantee the privacy of the Participatory and crowd-sensing technologies rely on hon- and Pappas, 2014). Those studies rely on adding noise, typically noises, to guarantee the privacy of the typically Laplace noises, to the of the by Laplace making the outcome less sensitive to local Participatory and crowd-sensing technologies rely on on honParticipatory crowd-sensing technologies rely est data fromand recruited users to generate estimates of users typically Laplace noises, to guarantee guarantee the privacy privacy of pathe by making the outcome less sensitive to local paParticipatory and crowd-sensing technologies rely on honhonusers by making the outcome less sensitive to local parameter variations. Various studies were devoted to finding est data data from from recruited users to generate estimates of users est recruited users to generate estimates of variables, such as traffic condition. Providing accurate users by making the outcome less sensitive to local parameter variations. Various studies were devoted to finding est data from recruited users to generate estimates of rameter variations. Various were to noise distribution in differential privacy (Soriavariables, such as users trafficcan condition. Providing accurate variables, such as traffic condition. Providing accurate information by the undermine their privacy. For “optimal” rameter variations. Various studies studies were devoted devoted to finding finding noise distribution in differential privacy (Soriavariables, such as users trafficcan condition. Providing accurate “optimal” noise distribution in differential privacy (SoriaComas and Domingo-Ferrer, 2013; Geng and Viswanath, information by the the users can undermine their privacy. For “optimal” information by undermine their privacy. For instance, a road user that provides her start and finish “optimal” noise distribution in differential privacy (SoriaComas and Domingo-Ferrer, 2013; Geng and Viswanath, information by the users can undermine their privacy. For Comas and Domingo-Ferrer, 2013; Geng and Viswanath, 2014) or other variants such as Lipschitz privacy (Koufoinstance, a road user that provides her start and finish instance, aa road user that provides her start and finish points as well as travel time to a participatory-sensing app Comas and Domingo-Ferrer, 2013; Geng and Viswanath, 2014) or other variants such as Lipschitz privacy (Koufoinstance, road user that provides her start and finish 2014) or other variants such as Lipschitz privacy (Koufopoints as well well as travel travel time toquality participatory-sensing app giannis etother al., 2015). Contrary these studies, study points as time aa app can significantly improve theto of the traffic estimavariants such as to Lipschitz privacywe (Koufopoints as well as as travel time a participatory-sensing participatory-sensing app 2014) giannisoret et al., al., 2015). 2015). Contrary to these studies, we study can significantly significantly improve thetoquality quality of the thelife. traffic estimagiannis Contrary to these we study privacy-constrained communication usingstudies, game theory by can improve the of traffic estimation at the expense of exposing her private Therefore, giannis et al., 2015). Contrary to these studies, we study can significantly improve the quality of thelife. traffic estima- privacy-constrained communication using game theory by tioncan at the the expense ofproviding exposing her private private life. Therefore, privacy-constrained communication using game theory by explicitly modelling the conflict of interest between the tion at expense of exposing her Therefore, she benefit from “false” information, not to privacy-constrained communication using game theorythe by tion at the expense ofproviding exposing “false” her private life. Therefore, explicitly modelling the conflict of interest between she can benefit from providing “false” information, not to explicitly modelling the conflict of interest between the senders and the receiver stemming from the privacy conshe can benefit from information, not to deceive the system but to protect her privacy. The amount explicitly modelling the conflict of interest between the she can benefit from providing “false” information, not to senders and the receiver stemming from the privacy condeceive the system but to protect her privacy. The amount senders and the receiver stemming from the privacy conFurther, we showstemming that in the case continuous deceive the but to protect privacy. amount of the deviation from the truth is her determined by the value straint. and the receiver from theof privacy condeceive the system system butthe to truth protect her privacy. The The amount straint. Further, show that the case of continuous of the the deviation from the truth isthe determined by To the value senders straint. Further, we show that in the case of continuous random variables we even when thein emphasis on the privacy of deviation from is determined by the value privacy that can vary across population. better straint. Further, we show that in the case of continuous of the deviation from the truth is determined by the value random variables even when the emphasis on the privacy of privacy that can vary across the population. To better random variables even when the emphasis on the privacy grows, the sender either does not add noise to the transof privacy vary the To understand this can effect, weacross use a game-theoretic framework variables even when the emphasis onto the privacy of privacy that that can vary across the population. population.framework To better better random the sender either not add noise the transunderstand this effect, we use aa game-theoretic game-theoretic framework grows, the sender either does not noise to transmitted messages the does intensity of the noise does not understand effect, use to model thethis conflict ofwe interest and to study the effect of grows, grows, the sender or either does not add add noise to the the transunderstand this effect, we use a game-theoretic framework mitted messages or the intensity of the noise does not to model the conflict of interest and to study the effect of mitted messages or the intensity of the noise does not grow with the value of privacy, which are not the case to model the conflict of interest and to study the effect of privacy in strategic communication. mitted messages or the intensity of the noise does not to modelin conflictcommunication. of interest and to study the effect of grow with the value of privacy, which are not the case privacy inthe strategic communication. grow with the value of privacy, which are not the case in the differential privacy literature. The interesting fact privacy strategic grow with the value of privacy, which are not the case privacy in strategic communication. in differential privacy literature. The interesting fact We develop a model in which the receiver is interested that in the the differential privacy literature. The interesting fact sender does not add noise was in (Akyol thethe differential privacy literature. Theproved interesting fact Weestimating develop aa amodel model in which which theToreceiver receiver is interested interested that the sender does not add noise was proved in (Akyol We develop in the is in random variable. this aim, it asks a in that the sender does not add noise was proved in (Akyol et al., 2015) for scalar random variables. However, the We develop a model in which the receiver is interested that the sender does not add noise was proved in (Akyol in estimating a random variable. To this aim, it asks a in estimating a random variable. To this aim, it asks a better-informed sender tovariable. provideTo a this measurement. The et al., 2015) for scalar random variables. et al., 2015) for scalar random variables. However, the above-mentioned observations regarding theHowever, intensitythe of in estimating a random aim, it asks a better-informed sender to provide a measurement. The et al., 2015) for scalar random variables. However, the better-informed provide aa measurement. sender wants to sender find a to trade-off between her desireThe to above-mentioned observations regarding the intensity of above-mentioned observations regarding the intensity of the noise are valid irrespective of the dimensions of the better-informed sender to provide measurement. The sender wants wants to find find measurement a trade-off trade-off between between her desire desire to above-mentioned observations regarding the intensity of sender to a her to provide an accurate of the variable while the noise are valid irrespective of the dimensions of the the noise are valid irrespective of the dimensions of the random variables. sender wants to find a trade-off between her desire to provide an accurate measurement of the variable while the noise are valid irrespective of the dimensions of the provide an accurate measurement of the variable while minimizing the amount of leaked private information, random variables. variables. provide an accurate measurement ofprivate the variable while random minimizing the amount amount of leaked leaked information, random variables. theory literature, wiretap channels have the information minimizing the of information, which is potentially correlated with private that variable or its In minimizing the amount of leaked private information, In the information theory literature, channels have which is potentially correlated with that variable or its In the information theory literature, wiretap channels have been studied heavily dating from wiretap the pioneering which is potentially correlated with that variable or its measurement. We assume that the sender has access the studied information theory literature, wiretap channels work have which is potentially correlated that variable or its In been heavily dating from the pioneering work measurement. We assume assume thatwith the sender has access access been studied heavily dating from the pioneering work in (Wyner, 1975). In these problems, the sender wishes measurement. We that the sender has to a possibly noisy measurement of the variable and a been studied heavily dating from the pioneering work measurement. We assume that the sender has access in (Wyner, 1975). In these problems, the sender wishes to a possibly noisy measurement of the variable and a in (Wyner, 1975). In these problems, the sender wishes devise encoding to create the a secure to aa possibly noisy the and perfect measurement of her privateof information. We useaa to in (Wyner, 1975). Inschemes these problems, senderchannel wishes to possibly noisy measurement measurement ofinformation. the variable variableWe anduse devise encoding schemes create aa secure channel perfect measurement of her her private private information. We use to to devise encoding schemes to create secure channel for communicating with the to receiver while hiding her perfect measurement of mutual information between the communicated message to devise encoding schemes to create a secure channel perfect measurement of her private information. We use for communicating with the receiver while hiding her mutual information between the communicated message for communicating with the receiver while hiding her data from an eavesdropper. This is a secrecy problem. mutual information between the communicated message and the private information to capture the amount of the for communicating with the This receiver hiding her mutual information betweento communicated message data from an eavesdropper. is aa while secrecy problem. and the theinformation. private information information tothe capture the amount amount of the the data from an eavesdropper. This is secrecy problem. In contrast, other studies have considered the privacy and private capture the of leaked We present a numerical algorithm for data from another eavesdropper. Thisconsidered is a secrecy problem. and theinformation. private information to capture the amount of the In contrast, studies have the privacy leaked information. We present a numerical algorithm for In contrast, other studies have considered the privacy problem in which the masking or equivocation of informaleaked We present a numerical algorithm for finding an equilibrium (i.e., policies from which no one contrast, otherthe studies have considered the privacy leaked information. We (i.e., present a numerical algorithm for In problem in which masking equivocation of informafinding an equilibrium equilibrium (i.e., policies from which no one one problem in which the masking or equivocation of information corresponds to either the or intended primary receiver finding an policies from no has an incentive to unilaterally deviate) of which the presented problem in which the masking or equivocation of informafinding an equilibrium (i.e., policies from which no one tion to either the intended has an anwhen incentive to unilaterally deviate) of the the presented tion corresponds corresponds to intended primary receiver than an eavesdropper et primary al., 2013;receiver Courhas incentive unilaterally deviate) of game the to random variables are discrete. In the rather corresponds to either either the the(Sankar intended primary receiver has incentive unilaterally deviate) of the presented presented rather than an eavesdropper (Sankar et al., 2013; Courgameanwhen when the ato random variables areondiscrete. discrete. In the the tion rather than an eavesdropper (Sankar et al., 2013; Courtade et al., 2012) or a secondary receiver with as much game the random variables are In continuous case, fundamental bound the estimation than an eavesdropper (Sankar et al.,with 2013; Courgame whencase, the aarandom variables areon discrete. In the rather tade et al., 2012) or a secondary receiver as much continuous case, fundamental bound on the estimation tade et al., 2012) or a secondary receiver with as much continuous fundamental bound the estimation error is provided for Gaussian random variables and the information as the primary one (Yamamoto, 1983, 1988). tade et al., 2012) or a secondary receiver with as much continuous case, afor fundamental bound on the estimation error is is provided provided for Gaussian random variables and the the information as the primary one (Yamamoto, 1983, 1988). error Gaussian random variables and information as the primary one (Yamamoto, 1983, 1988). equilibrium over affine policies are shown to achieve Information-theoretic guarantees on the amount of leaked error is provided for Gaussian random variables and the information as the primary one (Yamamoto, 1983, 1988). equilibrium over affine policies policies are showngrows. to achieve achieve the Information-theoretic guarantees the amount ofprivacy leaked equilibrium affine shown to guarantees on the amount leaked bound when over the emphasis on theare privacy private information when utilizingon equilibrium affine policies showngrows. to achieve the the Information-theoretic Information-theoretic guarantees onthe thedifferential amount of ofprivacy leaked bound when over the emphasis emphasis on the theare privacy grows. private information when utilizing the differential bound when the on privacy private information when utilizing the differential privacy framework was given in (Alvim et al., 2011; du Pin Calmon bound when the emphasisinon thepaper privacy grows.in essence, private information when utilizing the differential privacy The problem considered this is close, framework was given in (Alvim et al., 2011; du Pin Calmon framework was given in (Alvim et al., 2011; du Pin Calmon Fawaz,was 2012). was Thethe problem considered in this this paperand is close, close, in essence, essence, and framework givenPrivacy-aware in (Alvim et al.,machine 2011; dulearning Pin Calmon The problem considered in paper is in to idea of differential privacy its application and Fawaz, 2012). Privacy-aware machine was The problem considered in this paperand is close, in essence, discussed and Fawaz, 2012). Privacy-aware machine learning was in (Wainwright et al., 2012), wherelearning the mutual to estimation the idea of of differential privacy and its application to the idea differential privacy its application in and signal processing, e.g. (Dwork, 2008; and Fawaz, 2012). Privacy-aware machine learning was discussed in (Wainwright et al., 2012), where the mutual to the idea of differential privacy and its application discussed in (Wainwright et al., 2012), where the mutual information between the transmitted message and the in estimation and signal processing, e.g. (Dwork, 2008; in estimation and signal processing, e.g. (Dwork, 2008; discussed in (Wainwright et al., 2012), where the mutual information message and the inThe estimation and signal processing, e.g. (Dwork, 2008; original information between the transmitted message and the data between points is the usedtransmitted to measure and constrain work of F. Farokhi was supported by a McKenzie Fellowinformation between the transmitted message and the original data points is used to measure and constrain the The original data points is used to measure and constrain the work of F. Farokhi was supported by a McKenzie Fellowloss of privacy. The contributions of the paper, which sets ship, ARC grant LP130100605, a grant from Melbourne School of The work of F. Farokhi was supported by a McKenzie Fellow The work of F. Farokhi was supported by a McKenzie Felloworiginal data points is used to measure and constrain the loss of privacy. The contributions of the paper, which sets ship, ARC grant LP130100605, a grant from Melbourne School of loss of privacy. The contributions of the paper, which sets it apart from the above-mentioned studies, is that we use Engineering. The work of G. Nair was supported by ARC grants ship, ARC grant LP130100605, a grant from Melbourne School of loss of privacy. The contributions of the paper, which sets ship, ARC grant LP130100605, a grant from Melbourne School of it apart from the above-mentioned studies, is that we use Engineering. The work of G. Nair was supported by ARC grants it DP140100819 and work FT140100527. Engineering. The of G. Nair was supported by ARC grants it apart apart from from the the above-mentioned above-mentioned studies, studies, is is that that we we use use Engineering. of G. Nair was supported by ARC grants DP140100819 and FT140100527. DP140100819The and work FT140100527.
DP140100819 and FT140100527. Copyright © 2016, 2016 IFAC 43 Hosting by Elsevier Ltd. All rights reserved. 2405-8963 © IFAC (International Federation of Automatic Control) Copyright 2016 IFAC 43 Peer review© of International Federation of Automatic Copyright ©under 2016 responsibility IFAC 43 Control. Copyright © 2016 IFAC 43 10.1016/j.ifacol.2016.10.370
2016 IFAC NECSYS 44 September 8-9, 2016. Tokyo, Japan
(Z, W )
S
Y
Farhad Farokhi et al. / IFAC-PapersOnLine 49-22 (2016) 043–048
R
1) In participatory-sensing schemes, the sender’s measurement of the state potentially depends on the way that the sender experiences the underlying process or services. For instance, in traffic estimation, the sender’s measurement is fairly accurate on the route that she has travelled and, thus, an honest revelation of Z provides a window into the life of the commuter. However, the underlying state X is not related to the private information of sender W since she is only an infinitesimal part of the traffic flow. In such a case, we have P{X = x, Z = z, W = w} = P{Z = z|X = x, W = w}P{X = x, W = w} = P{Z = z|X = x, W = w}P{X = x}P{W = w}, where the second equality follows from independence of random variables W and X. 2) In many services, such as buying insurance coverage or participating in polling surveys, an individual should provide an accurate history of her life or beliefs. In these cases, the variable X highly depends on the private information of the sender W (if not equal to it). In such cases,the measurement Z may not contain any error as well.
ˆ X
Fig. 1. Communication structure between the sender S and the receiver R. a strategic game to model the interactions between the sender and the receiver. As we see shortly, there exists an equilibrium in which the players cooperate and thus the problem transforms into a team play; however, there are other equilibria in which the players would not necessarily cooperate. The idea of strategic communication has been studied in the economics literature in the context of cheap-talk games (Crawford and Sobel, 1982) in which well-informed senders communicates with a receiver that makes a decision regarding the society’s welfare. In those games, the sender(s) and the receiver have a clear conflict of interest, which results in potentially dishonest messages. Contrary to those studies, here, the conflict of interest is motivated by the sense of privacy of the sender, which changes the form of the cost functions. Cheap-talk games were recently adapted to investigate privacy in communication and estimation (Farokhi et al., 2015). That study, however, focuses quadratic cost functions and Gaussian random variables. Privacy constrained information processing with an entropy based privacy measure was studied by (Akyol et al., 2015). Contrary to these studies, we consider both discrete and continuous random variables with arbitrary distributions. For the continuous Gaussian random variables, we also present a fundamental bound on the variance of the estimation error at the equilibrium, which is missing from the literature.
The results of this paper rely on the knowledge of that the cross correlation of the random variables X and W . In practice, these information might not be available due to the complexity of the privacy problem. However, the presented results provide valuable insights regarding the nature of the problem and the arising complexities. In what follows, the privacy game is properly defined. 2.1 Receiver
The rest of the paper is organized as follows. In Section 2, the problem formulation for discrete random variables is introduced and the equilibria of the game are constructed. The results are extended to continuous random variables in Section 3. The paper is concluded in Section 4.
ˆ ∈ X using the The receiver constructs its best estimate X ˆ = x conditional distribution P{X ˆ|Y = y} = βxˆy for all (ˆ x, y) ∈ X × Y. The matrix β = (βxˆy )(ˆx,y)∈X ×Y ∈ B is the policy of the receiver with the set of feasible policies defined as B = β :βxˆy ∈ [0, 1],∀(ˆ x, y) ∈ X × Y ∧ βxˆy =1, ∀y ∈ Y .
2. DISCRETE RANDOM VARIABLES
x ˆ∈X
We consider strategic communication between a sender and a receiver as depicted in Fig. 1. The receiver wants to have an accurate measurement of a discrete random variable X ∈ X , where X denotes the set of all the possibilities. To this aim, the receiver deploys a sensor (which is a part of the sender) to provide a measurement of the variable. The measurement is denoted by Z ∈ X . The sender also has another discrete random variable denoted by W ∈ W, which is correlated with X and/or Z. This random variable is the sender’s private information, i.e., it is not known by the receiver. The sender wants to transmit a message Y ∈ Y that contains useful information about the measured variable while minimizing the amount of the leaked private information (note that, because of the correlation between W and X and/or Z, an honest report of Z may shine some light on the realization of W ). Throughout this paper, for notational consistency, we use capital letters to denote the random variables, e.g., X, and small letters to denote a value, e.g., x. Assumption 1. The discrete random variables X, Z, W are distributed according to a joint probability distribution p : X × X × W → [0, 1], i.e., P{X = x, Z = z, W = w} = p(x, z, w) for all (x, z, w) ∈ X × X × W.
The receiver prefers an accurate measurement of the variable X. Therefore, the receiver wants to minimize ˆ with the mapping d : X × the cost function E{d(X, X)} X → R≥0 being a measure of distance between the entries of set X . An example of such a distance is 0, x = x ˆ, d(x, x ˆ) = 1, x = x (1) ˆ. When using the distance mapping in (1), the term ˆ becomes the probability of error at the reE{d(X, X)} ceiver. The results of this paper are valid irrespective of the choice of this mapping. 2.2 Sender The sender constructs its message y ∈ Y according to the conditional probability distribution P{Y = y|Z = z, W = w} = αyzw for all (y, z, w) ∈ Y × X × W. Therefore, the tensor α = (αyzw )(y,z,w)∈Y×X ×W ∈ A denotes the policy of the sender. The set of feasible policies is given by A = α : αyzw ∈ [0, 1], ∀(y, z, w) ∈ Y × X × W ∧ αyzw = 1, ∀(z, w) ∈ X × W .
The conflict of interest between the sender and the receiver can be modelled and analysed as a game. This conflict of interest can manifest itself in the following ways:
y∈Y
44
2016 IFAC NECSYS September 8-9, 2016. Tokyo, Japan
Farhad Farokhi et al. / IFAC-PapersOnLine 49-22 (2016) 043–048
ˆ + I(Y ; W ), The sender wants to minimize E{d(X, X)} with I(Y ; W ) denoting the mutual information between random variables Y and W (Thomas and Cover, 2006), to strike a balance between transmitting useful information about the measured variable and minimizing the amount of the leaked private information. In this setup, the privacy ratio captures the sender’s emphasis on protecting her privacy. For small , the sender provides a fairly honest measurement of the state. However, as increases, the sender provides a less relevant message to avoid revealing her private information through the communicated message.
45
Algorithm 1 The best-response dynamics for learning an equilibrium. Require: α0 ∈ A, β 0 ∈ B 1: for k = 1, 2, . . . do 2: if k is even then 3: αk ∈ arg minα∈A U (α, β k−1 ) 4: β k ← β k−1 5: else 6: β k ∈ arg minβ∈B V (αk−1 , β) 7: αk ← αk−1 8: end if 9: end for
2.3 Equilibria Therefore, we can rewrite the costs of the sender and the receiver, respectively, as U (α, β) = ξ(α, β) + ζ(α) and V (α, β) = ξ(α, β). Now, we can properly define the equilibrium of the game. Definition 1. (Nash Equilibrium): A pair (α∗ , β ∗ ) ∈ A× B constitutes a Nash equilibrium of the privacy game if (α∗ , β ∗ ) ∈ N with
The cost function of the sender is equal ˆ + I(Y ; W ), U (α, β) = E{d(X, X)} where P{Y = y, W = w} I(Y ; W ) = y∈Y w∈W
P{Y = y, W = w} × log (2) P{Y = y}P{W = w} with P{Y = y} = w∈W z∈X x∈X αyzw p(x, z, w), P{W = w} = z∈X x∈X p(x, z, w), and P{Y = y, W = w, Z = z} P{Y = y, W = w} =
N = {(α, β) ∈ A × B | U (α, β) ≤ U (α , β), ∀α ∈ A V (α, β) ≤ V (α, β ), ∀β ∈ B}.
As all signalling games (Sobel, 2012), the privacy game admits a family of trivial equilibria known as babbling equilibria in which the sender’s message is independent of the to-be-estimated variable and the receiver discards sender’s message. Theorem 1. (Babbling Equilibria): Let α∗ ∈ A be ∗ such that αyzw = 1/|Y| for all (y, z, w) ∈ Y × X × W. Further, let β ∗ ∈ B be such that βxˆ∗y = 1 for x ˆ ∈ arg maxx∈X z ∈X w ∈W p(x, w , z ). Then, (α∗ , β ∗ ) constitutes an equilibrium.
z∈X
=
P{Y = y|W = w, Z = z}
z∈X
× P{W = w, Z = z} = αyzw p(x, z, w). z∈X x∈X
Moreover, we have ˆ = ˆ =x E{d(X, X)} d(x, x ˆ)P{X ˆ, X = x}
Proof. The proofs are removed due to space limitations. See (Farokhi and Nair, 2016) for the detailed proofs.
x∈X x ˆ∈X
The messages passed at a babbling equilibrium are meaningless and do not contain any information. In what follows, we propose methods for capturing other equilibria of the game. Theorem 2. Any (α∗ , β ∗ ) ∈ arg min(α,β)∈A×B [ξ(α, β) + ζ(α)]) constitutes an equilibrium of the game.
with ˆ =x ˆ =x P{X ˆ, X = x, Y = y} P{X ˆ, X = x} = =
y∈Y
=
y∈Y
ˆ =x P{X ˆ|X = x, Y = y}P{X = x, Y = y} βxˆy
y∈Y
=
P{Y = y|X = x, Z = z, W = w}
Proof. See (Farokhi and Nair, 2016).
z∈X w∈W
Theorem 2 paves the way for constructing numerical methods to find an equilibrium of the game. This can be done by employing the various numerical optimization methods to minimize the function ξ(α, β) + ζ(α). However, this is a difficult task as this function is not convex (it is only convex in each variable separately and not in both variables simultaneously). Remark 1. Theorem 2 shows that, for at least one equilibrium of the game, the sender and the receiver cooperate and the problem transforms into a team play. However, it should be noted that there are other equilibria, e.g., those captured by Theorem 1, in which the sender and the receiver do not cooperate.
× P{X = x, Z = z, W = w}
βxˆy αyzw p(x, z, w).
y∈Y z∈X w∈W
Following these calculations, we can define mappings ξ : A × B → R and ζ : A → R such that ˆ ξ(α, β) = E{d(X, X)} = d(x, x ˆ)βxˆy αyzw p(x, z, w) x∈X x ˆ∈X y∈Y z∈X w∈W
and ζ(α) = I(Y ; W ) = αyzw p(x, z, w) y∈Y w∈W
z∈X x∈X
z∈X
× log
(
w∈W,z∈X ,x∈X
We can simplify the construction of an equilibrium of the game for the special case where the transmitted message Y and the to-be-estimated variable X span over the same set. Theorem 3. Assume that Y = X . Let β be such that ˆ = y and βxˆ y = 0 if x ˆ = y. Moreover, let α ∈ βxˆ y = 1 if x
αyzw p(x, z, w) . αyzw p(x, z, w))P{W = w}
x∈X
45
2016 IFAC NECSYS 46 September 8-9, 2016. Tokyo, Japan
Farhad Farokhi et al. / IFAC-PapersOnLine 49-22 (2016) 043–048
Algorithm 2 The best-response dynamics for learning an equilibrium. Require: α0 ∈ A, β 0 ∈ B 1: for k = 1, 2, . . . do 2: if k is even then 3: α ∈ arg minα∈A U (α, β k−1 ) 4: if U (αk−1 , β k−1 ) − U (α , β k−1 ) > then 5: αk ← α 6: else 7: αk ← αk−1 8: end if 9: β k ← β k−1 10: else 11: β k ∈ arg minβ∈B V (αk−1 , β) 12: if V (αk−1 , β k−1 ) − V (αk−1 , β ) > then 13: βk ← β 14: else 15: β k ← β k−1 16: end if 17: αk ← αk−1 18: end if 19: end for
ˆ E{d(X, X)}
0.6
0.2
0 10 -2
0.05 0 0.3
0.4
0.5
10 -1
10 0
10 1
10 2
10 -1
10 0
10 1
10 2
I(Y ; W )
0.8 0.6 0.4 0.2 0 10 -2
ˆ and mutual Fig. 2. Expected estimation error E{d(X, X)} information I(Y ; W ) at the extracted equilibrium versus the privacy ratio . Theorem 5. For {(αk , β k )}k∈N generated by Algorithm 2 and all > 0, (αk , β k ) ∈ N for all k ≥ 2 + Ψ(α0 , β 0 )/, where Ψ(α0 , β 0 ) = ξ(α0 , β 0 ) + ζ(α0 ).
arg minα∈A [ξ(α, β ) + ζ(α)]. Then, (α , β ) constitutes an equilibrium of the game.
Proof. See (Farokhi and Nair, 2016).
Proof. See (Farokhi and Nair, 2016). Remark 2. The proof of Theorem 3 reveals that the sender’s policy is the solution of the optimization problem minα∈A E{d(X, Y )} + I(W ; Y ). This problem is equivalent with solving minα∈A:I(W ;Y )≤ϑ E{d(X, Y )} where ϑ is an appropriate function of . Therefore, intuitively, the sender aims at providing an accurate measurement of the state X while bounding the amount of the leaked information.
Example 1. Consider an example with X = W = Y = {1, . . . , 5}. Assume that Z = X, i.e., the sender has access to the perfect measurement of X. Moreover, let (P{X = x, W =w})x∈X ,w∈W 0.14 0.02 0.01 0.01 0.02 0.02 0.14 0.02 0.01 0.01 = 0.01 0.02 0.14 0.02 0.01 . 0.01 0.01 0.02 0.14 0.02 0.02 0.01 0.01 0.02 0.14 This distribution implies that there is a reasonable correlation between X and W . Therefore, for high enough , we expect a bad estimation quality (at the receiver) since, otherwise, the receiver can recover X, which carries a significant amount of information about W . For this example, we use the results of Theorem 3 extract a nontrivial equilibrium for each . Fig. 2 (top) illustrate the estimation ˆ as a function of the privacy ratio . As error E{d(X, X)} we expect, by increasing , the sender puts more emphasis on protecting her privacy rather than providing a useful ˆ measurement to the receiver and, therefore, E{d(X, X)} increases. Fig. 2 (bottom) shows the mutual information I(Y ; W ) as a function of the privacy ratio . Evidently, with increasing , the amount of leaked information about the private information of the sender decreases. In both figures, there seems to be sudden change when the privacy ratio passes the critical value = 0.38. That is, for all < 0.38, the truth-telling seems to be an equilibrium of the game; however, if > 0.38, the sender adds false reports to protects her privacy. ♦
For more general cases, we can use a distributed learning algorithm to recover an equilibrium. An example of such a learning algorithm is the iterative best-response dynamics. Following this, we can construct Algorithm 1 to recover an equilibrium of the game distributedly. To present our results, we need to introduce a more practical notion of equilibrium. Definition 2. (-Nash Equilibrium): For all > 0, a pair (α∗ , β ∗ ) ∈ A × B constitutes an -Nash equilibrium of the privacy game if (α∗ , β ∗ ) ∈ N with N = {(α, β) ∈ A × B | U (α, β) ≤ U (α , β) + , ∀α ∈ A V (α, β) ≤ V (α, β ) + , ∀β ∈ B}.
This notion of equilibrium means that each player cannot gain by more than from unilaterally changing her actions, which is a practical notion if the act of changing her actions has “some cost” for the player. Now, we are ready to prove that Algorithm 1 can extract an -Nash equilibrium. Theorem 4. For {(αk , β k )}k∈N generated by Algorithm 1 and all > 0, there exists K ∈ N such that (αk , β k ) ∈ N for all k ≥ K . Proof. See (Farokhi and Nair, 2016).
0.1 0.4
3. CONTINUOUS RANDOM VARIABLES
In this section, we consider continuous random variables X, Z ∈ X ⊆ Rnx and W ∈ W ⊆ Rnw . The following assumption is made. Assumption 2. The continuous random variables X, Z, W are distributed according to a joint probability density function p : X × X × W → R≥0 , i.e., P{X ∈ X¯ , Z ∈
Theorem 4 shows that Algorithm 1 converges to an -Nash equilibrium, for any > 0, in a finite number of iterations. We can slightly tweak Algorithm 1 to also present bounds on the required number of iterations to extract an -Nash equilibrium. 46
2016 IFAC NECSYS September 8-9, 2016. Tokyo, Japan
Farhad Farokhi et al. / IFAC-PapersOnLine 49-22 (2016) 043–048
¯ W ∈ W} ¯ = Z, ¯ w∈W ¯ p(x, z, w)dwdzdx for all x∈X¯ z∈Z ¯ ∈ W. Lebesgue-measurable sets X¯ , Z¯ ⊆ X and W
where, for any two random variables F and G, VF G = E{F G }. The bound in Theorem 7 shows that even when goes to infinity (i.e., the sender does not care for the estimation error), the error does not grow to VXX , which is not the case in the differential privacy literature (because the intensity of the additive noise typically grows unbounded with more emphasis on the privacy in the differential privacy framework). In (Akyol et al., 2015), it was proved that for the scalar random variables, the additive noise at the equilibrium is indeed zero. However, the abovementioned observations stemming from Theorem 7 are valid irrespective of the dimension of the random variables. Definition 4. (Affine Gaussian Policies): The sender is said to use an affine Gaussian policy if Y = K1 Z + K2 W + N , where N is a Gaussian random noise independent of Z and W . The set of all such policies is denoted by A . Definition 5. (Nash Equilibrium in Affine Gaussian Policies): A pair (α∗ , β ∗ ) constitutes a Nash equilibrium of the privacy game in affine Gaussian policies if (α∗ , β ∗ ) ∈ Naff with Naff = {(α, β) ∈ A × B | U (α, β) ≤ U (α , β), ∀α ∈ A V (α, β) ≤ V (α, β ), ∀β ∈ B}. Theorem 8. There exists an equilibrium in affine Gaussian policies in which the sender employs the policy Y = K1 Z + K2 W + N, and the receiver employs the policy ˆ = (VXZ K + VXW K )Y, X
and
ζ(α) = I(Y ; W ) p(y|w) = p(y, w) log dwdy, p(y)
p(y|w) =
α(y|z , w)p(x , z , w)dx dz ,
1
2
where K1 ∈ arg min − K QK − log(det(I − K P K)), K2 K∈R(nx +nw )×ny 2
α(y|z , w )p(x , z , w )dx dz dw , p(y, w) = p(y|w)p(w) = p(y|w) p(x , z , w)dx dz . p(y) =
s.t.
K EK ≤ I,
The cost function of the sender is equal U (α, β) = ξ(α, β) + ζ(α), and the costs of the receiver is V (α, β) = ξ(α, β). Now, we can properly define the equilibrium of the game. Definition 3. (Nash Equilibrium): A pair (α∗ , β ∗ ) constitutes a Nash equilibrium of the privacy game if (α∗ , β ∗ ) ∈ N with N = {(α, β) ∈ A × B | U (α, β) ≤ U (α , β), ∀α ∈ A V (α, β) ≤ V (α, β ), ∀β ∈ B}.
with
We can prove similar results as in the previous section. Theorem 6. Any (α∗ , β ∗ ) ∈ arg min(α,β) [ξ(α, β) + ζ(α)] constitutes an equilibrium of the game.
Proof. See (Farokhi and Nair, 2016).
V V V V Q = V ZX VXZ V ZX VXW , W X XZ W X XW −1 P = VZW VW W VW Z VZW , VW Z VW W VZZ VZW E= V V WZ
and
VN N = I −
WW
K1 K2
E
K1 . K2
The optimization problem in Theorem 8 is non-convex. However, we can solve this problem explicitly for scalar random variables. Corollary 9. Let ny = nx = nw = 1 and VZW VW X = VZX VW W . There exists an equilibrium in affine Gaussian policies in which the sender employs the policy Y = K1 Z + K2 W, and the receiver employs the policy ˆ = (VXZ K1 + VXW K2 )Y, X
Proof. The proof follows the same line of reasoning as in Theorem 2. In the remainder of the paper, all random variables are jointly Gaussian. Now, we are able to prove a bound on the performance of the receiver. Theorem 7. For jointly Gaussian variables X, Z, W , ˆ ≤VXX − (VXZ − VXW V −1 VW Z ) min E{d(X, X)} (α∗ ,β ∗ )∈N
Proof. See (Farokhi and Nair, 2016).
ˆ ∈ X using The receiver constructs its best estimate X the conditional density function β(·|Y ) : X → R≥0 , i.e., ˆ ∈ X¯ |Y } = P{X β(ˆ x|Y )ˆ x for all Lebesgue-measurable x ˆ∈X¯ sets X¯ ⊆ X . Let us denote the set of all such policies by B. Similarly, the receiver prefers an accurate measurement of the variable X. Therefore, the receiver wants to minimize ˆ the cost function E{d(X, X)}, where d(x, x ˆ) = x − x ˆ22 . (3) The sender constructs its message y ∈ Y ⊆ Rny according to the conditional probability density function α(·|Z, W ) : ¯ W} = X × W → Y, i.e., P{Y ∈ Y|Z, ¯ α(y|Z, W )dy for y∈Y all Lebesgue-measurable sets Y¯ ⊆ Y. Let us denote the set of all such policies by A. The sender aims at minimizing ˆ + I(Y ; W ), with I(Y ; W ) denoting the difE{d(X, X)} ferential mutual information between random variables Y and W . We can (re)define ˆ ξ(α, β) = E{d(X, X)} = β(ˆ x|y)α(y|z, w)p(x, z, w)dˆ xdydwdzdx
where
47
WW
where
−1 −1 × (VZZ − VZW VW W VW Z ) −1 × (VZX − VZW VW W VW X ),
47
1 1 K1 K 2 = V + V ξ 2 + 2V ξ ξ ZZ WW ZW
2016 IFAC NECSYS 48 September 8-9, 2016. Tokyo, Japan
Farhad Farokhi et al. / IFAC-PapersOnLine 49-22 (2016) 043–048
of the IEEE International Symposium on Information Theory Proceedings (ISIT), 189–193. Crawford, V.P. and Sobel, J. (1982). Strategic information transmission. Econometrica: Journal of the Econometric Society, 1431–1451. du Pin Calmon, F. and Fawaz, N. (2012). Privacy against statistical inference. In Proceedings of the 50th Annual Allerton Conference onCommunication, Control, and Computing (Allerton), 1401–1408. Dwork, C. (2008). Differential privacy: A survey of results. In M. Agrawal, D. Du, Z. Duan, and A. Li (eds.), Theory and Applications of Models of Computation, volume 4978 of Lecture Notes in Computer Science, 1– 19. Springer Berlin Heidelberg. Farokhi, F., Sandberg, H., Shames, I., and Cantoni, M. (2015). Quadratic gaussian privacy games. In Proceedings of the IEEE 54th Annual Conference on Decision and Control, 4505–4510. Farokhi, F. and Nair, G. (2016). Privacyconstrained communication. Technical Note. https://dl.dropboxusercontent.com/u/36867745/ NecSys2016Privacy.pdf. Friedman, A. and Schuster, A. (2010). Data mining with differential privacy. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 493–502. Geng, Q. and Viswanath, P. (2014). The optimal mechanism in differential privacy. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), 2371–2375. Huang, Z., Wang, Y., Mitra, S., and Dullerud, G.E. (2014). On the cost of differential privacy in distributed control systems. In Proceedings of the 3rd International Conference on High Confidence Networked Systems, 105–114. Koufogiannis, F., Han, S., and Pappas, G.J. (2015). Optimality of the laplace mechanism in differential privacy. Preprint: arXiv:1504.00065 [cs.CR] http://arxiv.org/abs/1504.00065. Le Ny, J. and Pappas, G.J. (2014). Differentially private filtering. IEEE Transactions on Automatic Control, 59(2), 341–354. Sankar, L., Rajagopalan, S.R., and Poor, H.V. (2013). Utility-privacy tradeoffs in databases: An informationtheoretic approach. IEEE Transactions on Information Forensics and Security, 8(6), 838–852. Sobel, J. (2012). Signaling games. In R.A. Meyers (ed.), Computational Complexity, 2830–2844. Springer New York. Soria-Comas, J. and Domingo-Ferrer, J. (2013). Optimal data-independent noise for differential privacy. Information Sciences, 250, 200–214. Thomas, J.A. and Cover, T.M. (2006). Elements of Information Theory. Wiley Series in Telecommunications and Signal Processing. Wiley-Interscience, 2 edition. Wainwright, M.J., Jordan, M.I., and Duchi, J.C. (2012). Privacy aware learning. In Proceedings of Advances in Neural Information Processing Systems (NIPS), 1430– 1438. Wyner, A.D. (1975). The wire-tap channel. Bell System Technical Journal, The, 54(8), 1355–1387. Yamamoto, H. (1983). A source coding problem for sources with additional outputs to keep secret from the receiver or wiretappers. IEEE Transactions on Information Theory, 29(6), 918–923. Yamamoto, H. (1988). A rate-distortion problem for a communication system with a secondary decoder to be hindered. IEEE Transactions on Information Theory, 34(4), 835–842.
ˆ E{d(X, X)}
0.5 0.45 0.4 0.35 0.3 10 -4
10 -2
10 0
10 2
10 4
10 -2
10 0
10 2
10 4
I(Y ; W )
0.15
0.1
0.05
0 10 -4
ˆ and mutual Fig. 3. Expected estimation error E{d(X, X)} information I(Y ; W ) at the extracted equilibrium versus the privacy ratio . The red line illustrates the bound in Theorem 7. with [1 ξ] denoting an eigenvector of VZZ VZW −1 VZX VXZ VZX VXW VW X VXZ VW X VXW VW Z V W W −1 /2 VZW VW W VW Z VZW − . VW Z VW W VZZ − VZW V −1 VW Z WW
Proof. See (Farokhi and Nair, 2016).
Consider a numerical example with ny = nx = nw = 1 such that VXX = VW W = VXZ = 1, VXW = VZW = 0.4, and VZZ = 1.5. Figure 3 shows the expected estimation ˆ and mutual information I(Y ; W ) at the error E{d(X, X)} extracted equilibrium in affine Gaussian policies versus the privacy ratio . The red line illustrates the bound in Theorem 7. As we can see, the upper bound is achieved with an affine Gaussian policy. Therefore, at least for large , the equilibrium in affine Gaussian policies is also an equilibrium of the game. Interestingly, there is no additive noise at this equilibrium. This was previously observed in (Akyol et al., 2015). 4. CONCLUSION We developed a game-theoretic framework to investigate the effect of privacy in the quality of the measurements provided by a well-informed sender to a receiver. We used a privacy ratio to model the sender’s emphasis on protecting her privacy. Equilibria of the game were constructed. Future work can focus on extending the results to multiple sensors. REFERENCES Akyol, E., Langbort, C., and Basar, T. (2015). Privacy constrained information processing. In Proceedigns of the 54th Annual Conference on Decision and Control, 4511–4516. Alvim, M.S., Andr´es, M.E., Chatzikokolakis, K., and Palamidessi, C. (2011). On the relation between differential privacy and quantitative information flow. In L. Aceto, M. Henzinger, and J. Sgall (eds.), Automata, Languages and Programming, volume 6756 of Lecture Notes in Computer Science, 60–76. Springer Berlin Heidelberg. Courtade, T. et al. (2012). Information masking and amplification: The source coding setting. In Proceedings 48