Lightweight privacy-preserving data aggregation scheme for smart grid metering infrastructure protection


Accepted Manuscript

Ulas Baran BALOGLU, Yakup DEMİR

PII: S1874-5482(17)30110-5
DOI: 10.1016/j.ijcip.2018.04.005
Reference: IJCIP 246

To appear in: International Journal of Critical Infrastructure Protection

Received date: 30 June 2017
Revised date: 23 January 2018
Accepted date: 27 April 2018

Please cite this article as: Ulas Baran BALOGLU, Yakup DEMİR, Lightweight Privacy-Preserving Data Aggregation Scheme for Smart Grid Metering Infrastructure Protection, International Journal of Critical Infrastructure Protection (2018), doi: 10.1016/j.ijcip.2018.04.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Highlights:
• A lightweight privacy-preserving data aggregation scheme for smart grids
• Lossless scheme by using a combination of encryption and perturbation techniques
• Suitable for devices with limited hardware
• Resilient to both filtering and true value attacks
• Case study using Holt-Winters and STL prediction methods is presented

Ulas Baran BALOGLU1*, Yakup DEMİR2

1 Computer Engineering Department, Munzur University, Tunceli, Turkey
2 Electrical and Electronics Engineering Department, Firat University, Elazig, Turkey
* [email protected]

Abstract


The electric industry’s planned shift to smart grid metering infrastructure has raised several concerns, especially about preserving privacy. Various data perturbation and aggregation solutions have been developed to address these concerns. The drawback of these solutions is that a simple random noise scheme cannot protect privacy, while more advanced perturbation techniques may increase the hardware costs of smart metering devices. The proposed data aggregation scheme combines perturbation techniques with cryptosystems in an efficient and lightweight way, so that it is suitable for devices with limited hardware such as smart meters. We investigated the privacy-preserving capabilities of the proposed aggregation scheme with the Holt-Winters and Seasonal Trend Decomposition using Loess prediction methods. The results indicate that the proposed scheme is resilient to both filtering and true value attacks.

Keywords: data perturbation; privacy; security; smart grid protection.


This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

1. Introduction

Recently, there has been significant research on smart grids and their smart components such as electricity meters. The introduction of these new technologies raises privacy concerns because data engineering and data mining techniques can analyze large volumes of private data quickly. The electric power industry has to cooperate with information technologists to adopt cybersecurity into the smart grid to maintain reliability, because reliability requires security [1]. Sender authentication and privacy preservation of consumer data are two major security problems in smart grid communication [2]. Applications and services should be designed to operate efficiently without intruding on the privacy of consumers [3]. Smart meter data can be associated with a consumer’s activities, and such privacy-sensitive household data should not be shared, disclosed or used by any third-party company to profile energy consumption patterns [4].

Smart meters are essential components of a smart grid, and they are usually referred to as the future generation of power measurement systems [5]. Advances in technology have not only started the emergence of smart grids but have also triggered the development of smart metering devices over time. The evolution began with Automatic Meter Reading (AMR) devices, which provide one-way communication. AMR technology has made it easier to read electricity meters, but it is not suitable for implementing smart grid applications. Consequently, the Advanced Metering Infrastructure (AMI), which provides two-way communication, has emerged. AMI meters can be read at much shorter intervals, and they can be communicated with remotely [6]. The historical development of metering devices for electrical grids is shown in Figure 1.

Figure 1. Historical development of metering devices and technologies.

With current communication techniques, smart meters not only monitor the power consumption of all household appliances but also report meter readings at smaller time intervals. Naturally, a smart meter becomes a key requirement for smart grid applications, such as a demand response system, which encourages consumers to reduce demand in order to limit peak amounts [7]. Metering data collected by these applications could be used to generate profiles of consumers even in anonymous environments, and failure to adequately secure personal privacy may threaten customer participation and put the success of smart grid applications at risk. The metering data can reveal the daily routines of a household, such as when the residents sleep, when they go to work, or which brands of appliances are used at what time. This information can be used for a variety of purposes, from burglary to assassination. The information that metering devices read should therefore be used only by smart grid applications and should not be readable or accessible for any other purpose.

This study focuses on developing a lightweight privacy-preserving data aggregation scheme which efficiently utilizes components of a smart grid metering infrastructure. Our goal is to preserve information privacy in smart meters while keeping hardware costs down. Unlike many other privacy-preserving data aggregation schemes, the computational cost of aggregation and privacy preservation is minimal in this study. The required processing power has been minimized to decrease the infrastructure installation cost. Nonetheless, reducing processing power may lead to complexity problems in distributed systems, such as the problem of a limited number of Privacy Preserving Nodes (PPNs) [8]. The proposed approach extends this previous study and introduces a task scheduling layer to cope with this minimization problem. Another contribution of the proposed approach is that each user's data is processed in at least two processing units and the status of the system is monitored continuously by the developed algorithms. This helps the system to prevent data loss in the event of a malfunction in the processing units and also to capture possible errors resulting from smart grid communication. Finally, the proposed scheme does not require the distribution or a model of the data for reconstructing perturbed data, and unlike other studies the reconstruction process is lossless.

The remainder of this paper is organized as follows. Related work is given in Section 2, followed by the details of the proposed privacy-preserving data aggregation scheme and the problem formulation in Section 3. In Section 4, we use two different prediction methods to investigate the privacy-preserving capabilities of the proposed aggregation scheme. The last section presents the concluding remarks.

2. Related Work

Privacy preservation for smart metering infrastructure is an important research problem because meter readings can be used to observe household activity in real time [8]. Meter readings are sensitive data. They may create risks related to profiling or data mining in the future, so they should be kept and processed securely. Nobody wants someone else to have information about their household activities or the schedules of particular home appliances. In a smart metering infrastructure, each meter measures energy consumption and sends it to a utility company at regular intervals, typically every 15 minutes [9]. There should be an external aggregator on the company side to gather the metering data, which represents energy consumption in the form of a time series. The privacy of consumers can be protected by implementing a security architecture which enables aggregation of the collected data [8]. There are different approaches in the literature for this purpose. These approaches can be classified as trusted aggregator, homomorphic cryptosystem, secret sharing, blind signature, differential privacy and trusted dealer [10-12]. Aggregation schemes can be categorized as message appending and mathematical summation [13]. Techniques without aggregation are not feasible because they require smart meters with computational capabilities. It would be expensive to construct, renew or modify such systems, so a reliable and secure aggregation scheme would be a better choice.

The data perturbation process for time series adds noise to hide the true metering values or daily routines of consumers while preserving usability [14]. This process protects user privacy in two different ways: either locally, by randomizing user data before its transmission, or in a centralized manner at an aggregator [15]. Perturbed time-series data is transmitted from smart meters to protect the original usage information. There are different perturbation techniques in the literature, such as adding noise (additive perturbation), k-anonymity, compressing data and geometric transformation [16]. A distributed noise generation procedure can be used to employ differential privacy [17]. However, these methods have difficulties in achieving an effective tradeoff between privacy and utility because of the correlations and high fluctuations of time-series data [18]. Bayes estimation and Principal Component Analysis might also be selected for noise generation and data reconstruction tasks [19]. Removing noise from the data is difficult, and this process may lead to data losses. Another technique is homomorphic secret sharing, a special form of secret sharing in which homomorphic encryption is used to encrypt the secret. Homomorphic encryption allows anyone to encrypt data from a set of messages without knowing the decryption key [20]. Homomorphic schemes that are not semantically secure may also cause security problems [21].

Many of the existing studies in the literature employ computationally expensive operations. These operations are not feasible for smart grid applications, where bandwidth and computation resources are usually limited [22]. In a previous study, finding the minimum number of PPNs was identified as an NP-hard problem [8]. In this study, a task scheduler layer is added to the scheme to eliminate the need for minimum PPN computation. This layer avoids problems due to a low number of processing units and also leads to a more scalable architecture. Unlike previous studies, the calculations to be done by the smart metering devices are kept simple so that their hardware costs do not increase too much. Finally, the proposed aggregation scheme is not concerned only with the privacy of the data in the aggregator. The scheme also achieves a robust architecture by distributing the decryption process to multiple processing units.

3. The Proposed Data Aggregation Scheme

We now describe how smart metering data is processed in the proposed data aggregation scheme. There are four layers in the architecture, as shown in Figure 2. The first layer contains the smart meters, where metering data is collected as time series, perturbed and then transmitted together with encrypted noise information. The second layer contains a task scheduler which is responsible for two processes: aggregating and transmitting the metering data. The metering data is transmitted to at least two separate PPNs to maintain integrity and to increase robustness. Perturbed data and encrypted noise data are aggregated in this layer by appending. Metering data is decrypted and reconstructed in the third layer by the PPNs. The final layer represents the utilities or other services where metering data is used or stored.

Figure 2. Privacy-preserving aggregation scheme for time-series metering data.

Some studies aggregate all metering data and encrypt them together. The main problem with this approach is its weakness against possible hardware failures. A malfunction in one metering device causes the whole aggregation scheme to fail, so these studies have to concentrate on fault tolerance as well as on privacy preservation. The proposed aggregation scheme is robust and resistant to hardware failures in the system. Since the aggregated metering data is sent to more than one processing unit, it is possible to recover the metering data in case of a hardware failure or communication error. Processing units which are unresponsive are automatically taken out of the system, as they can be easily detected by the task scheduler.

3.1. Problem Formulation

The proposed scheme perturbs a consumer’s smart metering data such that the individual data items cannot be estimated with high accuracy by attackers and their trend cannot be used for creating a profile. Suppose that there are N smart metering devices (D1, D2, …, DN) in the smart grid infrastructure. Each device records meter readings at a predetermined time interval t. When t is 15 minutes, the daily time series of each device contains 96 records (24 × 60 / 15). The metering data of a device is represented as below,

$\vec{d} = \{d_1, d_2, \ldots, d_n\}$    (1)

Perturbed data is generated by adding noise to the metering data before transmission. A scaled version of the noise yields better privacy, so the noise data is multiplied by a series of coefficients. Given a series $d_n$ and a perturbation series $p_t$ with zero mean, $E[p_t] = 0$, additive perturbation is defined as calculating the series $x_t$ as the sum of $d_t$ and $p_t$ for every $t \geq 1$ [23]. For wavelet perturbation, wavelet coefficients are multiplied with Gaussian noise, or other coefficients can be preferred to generate different perturbation schemes. In this study, the perturbation is defined by scaled Gaussian noise as follows,

$p(z) = \alpha \, \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(z-\mu)^2 / (2\sigma^2)}$    (2)

where $\alpha$ denotes the coefficient, $\mu$ is the mean value, $\sigma$ is the standard deviation and z represents the grey level. Gaussian noise is time independent, and its generation is not associated with the consumer data. Perturbed smart grid metering data is represented as,

$\vec{x} = \vec{d} + \vec{p}$    (3)
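As a rough illustration of the additive perturbation described above, the following Python sketch adds scaled Gaussian noise to a short reading series. The function name `perturb`, the sample readings, the coefficient values and the seed are assumptions made for this example and are not taken from the scheme's implementation.

```python
import random

def perturb(readings, coefficients, mu=0.0, sigma=1.0, seed=None):
    """Return (perturbed, noise) with x_t = d_t + p_t and p_t = k_t * N(mu, sigma)."""
    rng = random.Random(seed)
    noise = [k * rng.gauss(mu, sigma) for k in coefficients]
    perturbed = [d + p for d, p in zip(readings, noise)]
    return perturbed, noise

# A 15-minute interval gives 24 * 60 / 15 = 96 records per day.
records_per_day = 24 * 60 // 15

readings = [0.42, 0.40, 0.38, 1.25]      # hypothetical kWh values
coefficients = [0.5, 0.5, 2.0, 2.0]      # hypothetical scaling series
perturbed, noise = perturb(readings, coefficients, seed=7)
```

Fixing the seed only makes the example repeatable; a meter would presumably draw fresh noise for every reading.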

There are two problems, related to time-series data reconstruction and to preserving the privacy of the data. When reconstructing the perturbed data with a function $f(\vec{x})$, the approximation error $\varepsilon(t)$ should be zero for all vectors $\vec{x}$.

[Equations (4) and (5) are not recoverable from the source text.]

Transmitted smart metering data consists of both perturbed and unperturbed time-series data. A scaled version of $f_n$ can be added to the metering data to improve privacy, and equation 5 can be rewritten as the following,

[Equation (6) is not recoverable from the source text.]

where $\beta$ denotes the scaling coefficient, which is a random variable. Previous solutions for this problem have to deal with the minimization of the f function, because high approximation errors make it difficult to separate unperturbed time-series data from the perturbed data. In a smart grid environment, no utility wants to charge its consumers less or more than the actual consumption amount, so metering data should be transmitted without any loss. Reconstructing time series from noisy data may cause data losses, so we decided to transmit the data in two parts: the perturbed time-series data and the encrypted noise amount. This information is used by the processing units to reconstruct the original data without loss. A traditional single perturbation technique cannot satisfy this lossless reconstruction without compromising security. In the scheme, the noise is generated with a known function $f_n$ but with unknown parameters, and it is applied with an unknown scaling coefficient. As a result, the noise variance is not small, and the noise values are correlated.

The second problem is defined as protecting a smart meter’s data values in such a way that individual readings or data trends cannot be estimated by statistical models. The minimum metering data leak for a given time series function $f(\vec{x})$ with a noise function $f_n$ is defined as,

[Equation (7) is not recoverable from the source text.]

where L denotes the metering data leak function; as expected, lower values of this function achieve better privacy. The optimal noise value should satisfy two constraints: it should limit the metering data leak, and it should not increase the size of the transmitted data too much. The optimal noise value is application specific, and it should be determined according to the environment parameters.
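The two-part transmission described above can be sketched as follows: a processing unit that holds both the perturbed series and the decrypted noise recovers the readings by subtraction. The sketch assumes that readings and noise are kept as integers (for example watt-hours), which is our assumption rather than something the scheme specifies; with binary floating-point values the subtraction may not return the exact original. The names `reconstruct`, `readings_wh` and `noise_wh` are hypothetical.

```python
def reconstruct(perturbed, noise):
    """Recover d_t = x_t - p_t for every record."""
    if len(perturbed) != len(noise):
        raise ValueError("perturbed data and noise must have the same length")
    return [x - p for x, p in zip(perturbed, noise)]

readings_wh = [420, 400, 380, 1250]   # hypothetical integer readings
noise_wh = [17, -5, 0, 230]           # hypothetical noise, 0 means unperturbed
perturbed_wh = [d + p for d, p in zip(readings_wh, noise_wh)]

recovered = reconstruct(perturbed_wh, noise_wh)

# Counter-example with floats: the round trip is not exact.
float_round_trip = (0.1 + 0.2) - 0.2
```

With integer units the recovered list is identical to the original readings, which is the property billing needs.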


Figure 3. The system model for metering data transmissions of smart meters.

3.2. Encryption Scheme

A Decision Diffie-Hellman (DDH) based scheme is used for the encryption of noise data. Let G be a multiplicative cyclic group of Sophie Germain prime order q with a random generator $g \in G$. Under the DDH assumption, for independently chosen $a, b \in Z_q$, $g^{ab}$ is indistinguishable from a random element of G given $g^a$ and $g^b$. $H_0 : Z \to G$ and $H_1 : Z \to G$ are two hash functions. Array A with m random elements over Z is defined as,

[Equation (8) is not recoverable from the source text.]

For $0 < i \leq m$, the secret key is key_i = (A_i, A_2i) and

[Equation (9) is not recoverable from the source text.]
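To make the group setting concrete, the sketch below builds a toy group of Sophie Germain prime order. It is only an illustration with tiny, insecure numbers (q = 11, p = 23); the helper names are hypothetical, and it does not reproduce the key construction of the scheme.

```python
def is_prime(n):
    """Trial division, adequate for the toy sizes used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def toy_group(q):
    """Return (p, g): p = 2q + 1 and g generating the order-q subgroup mod p."""
    p = 2 * q + 1
    if not (is_prime(q) and is_prime(p)):
        raise ValueError("q must be a Sophie Germain prime")
    g = pow(2, 2, p)  # squares mod p form the subgroup of order q
    return p, g

q = 11
p, g = toy_group(q)
elements = {pow(g, e, p) for e in range(q)}   # all q subgroup elements

a, b = 3, 7  # exponents from Z_q, fixed here so the run is repeatable
g_ab = pow(pow(g, a, p), b, p)
```

A real deployment would need a prime of cryptographic size and randomly drawn exponents; the fixed values here only show the algebra.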

Figure 3 shows the system model for the metering data transmissions of smart meters. Perturbed data and encrypted noise data are sent to the task scheduler as two separate transmissions. It is possible to employ different security schemes for the encryption part, such as Shamir’s Secret Sharing algorithm [24], homomorphic encryption [20,21,25], NTRU [26-27], the Paillier cryptosystem [28-29], elliptic curve cryptography [30], differential privacy [31], or a neural network based multi-key algorithm [32]. The proposed security scheme is preferred because of its low complexity and simplicity. To reduce computational complexity further, no noise is added to the metering data for coefficients which are lower than a predetermined threshold value. This setting may lead to the leakage of only an insignificant number of actual values, as seen in the case study, but it does not give information about the general trend of the series. The algorithm of the perturbation process is given below.

Algorithm 1. Perturbation
Input: Metering data time-series d, perturbed series x, coefficient series k, time interval count t, threshold θ
while (t > 0) do
    if (k < θ) then
        p = 0;
    else
        calculate noise p;
        Transmit(Encrypt(p));
    end if
    x = d + p;
    Transmit(x);
    t--;
end while
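A minimal Python sketch of the perturbation loop is given below. The `encrypt` and `transmit` callables are placeholders supplied by the caller, the noise is drawn as a coefficient times a standard Gaussian sample, and the stand-in "encryption" in the usage lines only tags the value. All of these are assumptions for illustration, not the scheme's actual primitives.

```python
import random

def perturb_and_transmit(d, k, theta, encrypt, transmit, rng=None):
    """For each record, send x_t = d_t + p_t and, when noise is added, the encrypted p_t."""
    rng = rng or random.Random()
    for d_t, k_t in zip(d, k):
        if k_t < theta:
            p_t = 0.0                      # below threshold: no noise, nothing to encrypt
        else:
            p_t = k_t * rng.gauss(0.0, 1.0)
            transmit("noise", encrypt(p_t))
        transmit("data", d_t + p_t)

sent = []
perturb_and_transmit(
    d=[0.42, 0.40, 0.38, 1.25],            # hypothetical readings
    k=[0.1, 2.0, 0.05, 3.0],               # hypothetical coefficients
    theta=1.0,
    encrypt=lambda p: ("enc", p),          # stand-in only, not real encryption
    transmit=lambda kind, payload: sent.append((kind, payload)),
    rng=random.Random(1),
)
```

In this run the first and third readings fall below the threshold, so they are sent unperturbed and no noise message accompanies them.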

The proposed data aggregation scheme transmits the metering data as discrete values, so an attacker or a malicious user may capture only a portion of the data, which cannot be estimated accurately by only removing the noise. An attacker has to capture both the perturbed data and the encrypted noise data, and then also has to identify and decrypt the data. The proposed data aggregation scheme has the following assumptions:

• We assume that the internal hardware of metering devices is not accessible by any attacker. This study concentrates on securing the transmitted data.
• We assume that all the information transmitted by the metering device can be accessed by attackers. There is no safe zone in the communication infrastructure of metering devices.
• We assume that each metering device operates independently without interacting with other metering devices. This assumption makes the system robust, as a failure in one metering device does not affect the other devices.
• Finally, we assume that the processing units and the utility are in a safe zone. Data can also be encrypted at the utility, but protection mechanisms for storage are beyond the scope of this study.

3.3. Task-Assign Algorithm

The task scheduler learns nothing other than the perturbed data and the encrypted parts of the noise. It communicates with smart meters and PPNs, and it assigns PPNs according to the Task-Assign algorithm. The task scheduler waits idly like a server, and it transmits aggregated metering data to suitable processing units when there is data in the metering data queue. It is not possible to retrieve information about the consumption after the aggregation of metering data.

Two different data structures are used in the task scheduler. The first structure is a queue for keeping the received metering data. New meter readings are appended to the end of the queue, and meter readings picked from the front are sent to the processing units. The second structure is a list which keeps the information about processing units. In the list, processing units are stored as ordered pairs. The first entry is the PPN identifier, and the second entry is the allocation counter, which is used to measure available processing power. When a processing unit finishes its job, it transmits a message to the task scheduler, and the allocation counter is incremented. The allocation counter is decremented when the task scheduler assigns a new job to a processing unit. If the allocation counter value is 0, no job is given to that processing unit.

Algorithm 2. Task-Assign
Input: Metering data queue Q
while (Q is not empty) do
    if (front == NIL) then return error
    Data *aggregate = front;
    aggregate.append(aggregate->next);
    front = aggregate->next;
    firstPPN = *PPNPointer;
    while (firstPPN->allocation-counter == 0) do
        PPNPointer++;
        firstPPN = *PPNPointer;
    end while
    PPNPointer++;
    secondPPN = *PPNPointer;
    while (secondPPN->allocation-counter == 0) do
        PPNPointer++;
        secondPPN = *PPNPointer;
    end while
    Transmit(aggregate, firstPPN, secondPPN);
    Delete aggregate;
end while
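The scheduling idea can be sketched in Python as below. The sketch reads the allocation counter as remaining capacity, so units whose counter is 0 are skipped; it pairs two queued readings per aggregate and stops when fewer than two units are available. These choices, the wrap-around pointer and all names (`pick_two`, `task_assign`, the PPN identifiers) are assumptions made for illustration.

```python
from collections import deque

def pick_two(ppns, start):
    """Return indexes of the next two PPNs with a positive allocation counter, or None."""
    picked = []
    for offset in range(len(ppns)):
        idx = (start + offset) % len(ppns)
        if ppns[idx][1] > 0:
            picked.append(idx)
            if len(picked) == 2:
                return picked
    return None

def task_assign(queue, ppns, transmit):
    """Aggregate queued readings by appending and send each aggregate to two PPNs."""
    pointer = 0
    while queue:
        picked = pick_two(ppns, pointer)
        if picked is None:
            break                          # leave the data queued until PPNs report back
        aggregate = [queue.popleft()]
        if queue:
            aggregate.append(queue.popleft())
        for idx in picked:
            ppns[idx][1] -= 1              # one unit of capacity is now in use
        pointer = (picked[1] + 1) % len(ppns)
        transmit(aggregate, ppns[picked[0]][0], ppns[picked[1]][0])

# Ordered pairs: [PPN identifier, allocation counter]
ppns = [["PPN-A", 1], ["PPN-B", 0], ["PPN-C", 2], ["PPN-D", 1]]
queue = deque(["r1", "r2", "r3", "r4"])
sent = []
task_assign(queue, ppns, lambda agg, first, second: sent.append((agg, first, second)))
```

Checking availability before removing anything from the queue keeps readings from being dropped when no pair of units is free.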

3.4. Malfunctioning Processing Unit Problem

A malfunctioning PPN is either not responding or cannot process the metering data according to the proposed scheme. As described in Section 3, the task scheduler can easily detect processing units which are not responding due to a possible hardware problem. A processing unit may also transmit corrupted or wrong data values. Such data does not match the data from the alternative processing unit at the utility. In this case, there are two possible scenarios. First, consider that one processing unit is malfunctioning. This unit can be detected by using the second and third units from the list data structure. If the data transmitted by the second and third units match, then the first unit is marked as malfunctioning. Otherwise, the second unit is marked. Second, consider that both of the processing units are malfunctioning. Similar to the first scenario, these malfunctioning units can be detected by checking the third and fourth units.
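The first scenario can be sketched as a comparison of the reconstructed series returned by three processing units. The function name and the sample series are hypothetical, and the sketch assumes that the third unit's result is trustworthy, which is exactly what the second scenario relaxes.

```python
def find_malfunctioning(first, second, third):
    """Apply the comparison rule: return None, "first" or "second"."""
    if first == second:
        return None            # the two results agree, nothing to mark
    if second == third:
        return "first"         # second and third agree, so the first is marked
    return "second"            # otherwise the second unit is marked

good = [420, 400, 380]         # hypothetical correct reconstruction
bad = [420, 999, 380]          # hypothetical corrupted reconstruction

no_fault = find_malfunctioning(good, good, good)
first_faulty = find_malfunctioning(bad, good, good)
second_faulty = find_malfunctioning(good, bad, good)
```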


3.5. Security Model

An honest-but-curious security model is considered in this study. In this model, all entities follow the protocol honestly and do not learn anything beyond their outputs. However, some entities such as the task scheduler can be considered curious or semi-honest because they collect data from other entities. Entities such as smart meters and PPNs should transmit data without leaking any information about their inputs, and they are considered honest. The model is secure if and only if all entities have knowledge only of the output and gain no new knowledge from the other entities.

The task scheduler is in the middle of the proposed scheme, and it temporarily stores the metering data, so this entity is an appealing target for security attacks. A malicious task scheduler does not necessarily follow the security model and can leak encrypted metering data. Besides, data communication between metering devices and the task scheduler is considered open to attacks from malicious users. The privacy requirement, or security goal, of the proposed scheme is to hide the trend of the consumption and to prevent the disclosure of true meter reading values. In more detail, smart metering data should be kept secure, and no information should be leaked when a subset of the time-series data is captured. To ensure this privacy protection, in the proposed scheme all entities other than the PPNs and the utility can only process and transmit the perturbed metering data.

4. Case Study

A case study has been carried out on different scenarios to test the proposed privacy-preserving scheme. Two different prediction methods are used to reconstruct the perturbed time series. The main advantage of using statistical techniques to demonstrate security attacks is the ability to monitor privacy problems related to malicious users and servers in the infrastructure. Details of these prediction methods and evaluations of the proposed scheme are briefly given in this section.

Prediction with Holt-Winters Method

The Holt-Winters method [33] was developed by Holt and Winters to capture seasonality. The method has two variants: the multiplicative model and the additive model. It is widely used for exponential smoothing and time-series prediction. In its standard form, the additive Holt-Winters method for prediction is expressed as follows,

$\hat{y}_{t+h} = a_t + h\,b_t + s_{t-p+1+(h-1) \bmod p}$    (10)

where $y_t$ is the observation at time t, h is the prediction horizon, and the smoothing terms $a_t$, $b_t$ and $s_t$ are calculated as,

$a_t = \alpha\,(y_t - s_{t-p}) + (1-\alpha)(a_{t-1} + b_{t-1})$    (11)
$b_t = \beta\,(a_t - a_{t-1}) + (1-\beta)\,b_{t-1}$    (12)
$s_t = \gamma\,(y_t - a_t) + (1-\gamma)\,s_{t-p}$    (13)

In these equations, p denotes the period length, and $\alpha$, $\beta$ and $\gamma$ are filter parameters which are decisive in the estimation process. Small values of $\alpha$ give more importance to previous data, and high values give more importance to recent data. Values of $\beta$ closer to 0 give weight to the trend, and level changes become important when $\beta$ is around 1. Finally, high values of $\gamma$ make predictions sensitive to variations.
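The additive Holt-Winters recursion can be sketched in Python as below, using the standard textbook update rules for level, trend and seasonal terms. The initialization (level from the mean of the first period, trend from the difference between the first two period means) is one common choice and an assumption here, and the parameter values in the usage lines are purely illustrative.

```python
def holt_winters_additive(y, p, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive Holt-Winters and period p."""
    if len(y) < 2 * p:
        raise ValueError("need at least two full periods of data")
    level = sum(y[:p]) / p
    trend = (sum(y[p:2 * p]) - sum(y[:p])) / (p * p)
    season = [y[i] - level for i in range(p)]      # s_0 .. s_{p-1}
    for t in range(p, len(y)):
        prev_level = level
        level = alpha * (y[t] - season[t - p]) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season.append(gamma * (y[t] - level) + (1 - gamma) * season[t - p])
    last = len(y) - 1
    return [
        level + h * trend + season[last - p + 1 + (h - 1) % p]
        for h in range(1, horizon + 1)
    ]

# A purely periodic series without trend: the forecast repeats the pattern.
series = [1.0, 2.0, 3.0, 4.0] * 3
forecast = holt_winters_additive(series, p=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
```

On perturbed metering data the recursion has no clean seasonal pattern to lock onto, which is the situation the case study examines.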

Prediction with Seasonal Trend Decomposition using Loess

The Seasonal Trend Decomposition using Loess (STL) is an algorithm that was developed by Robert B. Cleveland, William S. Cleveland, Jean E. McRae and Irma Terpenning [34]. It decomposes a time series into three components: trend, seasonality and remainder. Loess is used to smooth the output. The algorithm is simple, fast and powerful, and it can decompose time series with missing values. In this algorithm, every member of the time-series data is divided as follows,

$y_i = t_i + s_i + r_i$    (14)

where $y_i$ is the i-th member of the series, t denotes trend, s denotes seasonality, r indicates remainder, and i denotes the corresponding index. STL consists of two nested recursive procedures. Seasonality and trend are updated in the inner loop, and weights are calculated when the execution moves to the outer loop.

Figure 4(a) illustrates one day of smart metering data. This data is perturbed according to the proposed scheme and shown in Figure 4(b). Figure 4(c) shows the power spectral densities of both the original data and the perturbed data. The graph shows that the perturbed data is located only at the frequencies where the original data is located. In other words, their spectra are similar, which means that it is impossible to separate the perturbed data and the original data.

Figure 4. Illustration of daily metering data: (a) original data, (b) after perturbation, (c) power spectral densities of both.

In the first two experiments, it is assumed that an attacker is capable of performing privacy-invading inference attacks and that the attacker obtains 50% of the true values of the perturbed data. The Holt-Winters and STL methods are used to estimate the consumption data from this captured data collection. Since the attacker has only 50% of the data, we assume that the attacker preprocesses the data and fills in the unknown values by using two different methods. In the first experiment they were filled with 0, and in the second experiment they were filled with the previous values. The results of these two experiments are shown in Figures 5(a) and 5(b). Since the Holt-Winters method is additive, it cannot make any predictions in the first experiment. The predictions made with STL are not successful, and they do not even reflect the general trend. In the second experiment, the estimation methods perform better, but they still cannot obtain the true data values. The STL method was able to capture a true value at only one point, because some members of the time series with small coefficient values are not perturbed in order to reduce the computational cost, as explained earlier.
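The two preprocessing strategies used by the simulated attacker can be sketched as follows. Missing samples are represented by `None`; the function names, the sample data and the choice of 0 as the value used before the first captured sample are assumptions for illustration.

```python
def fill_zero(captured):
    """Replace every missing sample with 0."""
    return [0.0 if v is None else v for v in captured]

def fill_previous(captured, initial=0.0):
    """Replace every missing sample with the last captured value."""
    filled, last = [], initial
    for v in captured:
        if v is not None:
            last = v
        filled.append(last)
    return filled

captured = [1.2, None, 0.8, None, None, 2.5]   # hypothetical partial capture
zero_filled = fill_zero(captured)
previous_filled = fill_previous(captured)
```

Zero filling introduces artificial drops to 0 between captured samples, whereas previous-value filling keeps the series piecewise constant.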

Figure 5. Illustration of prediction attacks on the captured data: 50% true value leaks with (a) 0-Value insertions, (b) Previous-Value insertions.

In the third experiment, the worst-case scenario is investigated, and it is assumed that the attacker obtains 100% of the true values of the perturbed data. Although the performance of the STL method increases in this scenario, neither method succeeds in obtaining the true values, as shown in Figure 6. The STL method captures a couple of true values, but there is no improvement in the performance of the Holt-Winters method; this time it only generates a smoother output. With parameter tuning or a combination of different methods, an attacker might get a slightly better result than the results shown here. Even in this case, it seems hard for the attacker to capture the true data values. The main reason for this is the spectral similarity of the perturbed data and the original data.

Figure 6. Illustration of prediction attacks to the captured data: 100% true value leaks.

Computational Complexity

In the proposed data aggregation scheme, computational tasks include data aggregation operations, task scheduling, perturbation and encryption-decryption operations. Smart meters and processing units do not need high computational power because the proposed scheme is lightweight and does not demand frequent encryption and decryption operations. Further, it is easy to increase or decrease the number of processing units dynamically when demand changes. This ease of capacity change makes the infrastructure more scalable. Data aggregation and task scheduling operations are similarly lightweight. Table 1 shows that the running time of the proposed scheme increases slowly with the number of smart meters, from 8.21 ms for 10000 meters to 24.02 ms for 30000 meters, and that its running time is about 19% to 31% lower than that of Shamir’s Secret Sharing [24] algorithm, which is itself a fast and lightweight algorithm. From these results, it is observed that the number of users has little effect on the computational cost, which makes the proposed scheme a good candidate for smart metering infrastructure.

Table 1. Running time of the proposed scheme and Shamir’s scheme with a PPN parameter setting of 10.

| Scheme            | Number of Smart Meters | Time (ms) |
|-------------------|------------------------|-----------|
| Proposed Security | 10000                  | 8.21      |
| Proposed Security | 15000                  | 10.70     |
| Proposed Security | 20000                  | 14.33     |
| Proposed Security | 25000                  | 18.84     |
| Proposed Security | 30000                  | 24.02     |
| Shamir’s [24]     | 10000                  | 10.15     |
| Shamir’s [24]     | 15000                  | 13.39     |
| Shamir’s [24]     | 20000                  | 20.86     |
| Shamir’s [24]     | 25000                  | 24.66     |
| Shamir’s [24]     | 30000                  | 30.23     |

Conclusions

In this paper, we constructed and investigated a lightweight data aggregation scheme for smart grid metering infrastructure. The proposed scheme is resilient to both filtering and true value attacks. In the case study, the Holt-Winters and STL prediction methods were employed to show that attackers cannot obtain the metering data accurately. On the other hand, the processing units can reconstruct the same data without loss even though a perturbation is applied. For proper operation, smart grid applications should not experience any performance degradation or information leakage. The proposed scheme is suitable for these applications because the smart meters and the task scheduler perform only lightweight computations, while complex operations are handled by scalable processing units. The performance evaluation demonstrates the computational efficiency of the scheme. To the best of our knowledge, the proposed scheme is the first attempt at a lightweight and lossless solution that combines encryption and perturbation techniques.
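For readers unfamiliar with the baseline used in the running-time comparison of Table 1, the following is a minimal sketch of Shamir’s Secret Sharing [24]: a secret is embedded as the constant term of a random polynomial over a prime field, and any `threshold` shares recover it by Lagrange interpolation at zero. The field prime, the function names, and the parameter names are assumptions chosen for illustration; this is not the implementation that was timed in the evaluation.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an assumption


def split_secret(secret, threshold, shares):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points):
    """Lagrange interpolation at x = 0 over the field of integers mod PRIME."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `split_secret(reading, threshold=3, shares=5)` produces five shares of a meter reading, and `recover_secret` applied to any three of them returns the reading. Each split evaluates a polynomial once per share and each recovery performs a quadratic number of modular multiplications in the number of shares used, which gives a rough sense of the work behind the baseline.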

References

[1] Hawk, C., Kaushiva, A. ‘Cybersecurity and the smarter grid’, The Electricity Journal, 2014, 27(8), pp. 84-95.

[2] Chim, T. W., Yiu, S. M., Hui, L. C. K., Li, V. O. K. ‘Privacy-preserving advance power reservation’, IEEE Communications Magazine, 2012, 50(8), pp. 18-23.

[3] Mármol, F. G., Sorge, C., Ugus, O., Pérez, G. M. ‘Do not snoop my habits: Preserving privacy in the smart grid’, IEEE Communications Magazine, 2012, 50(5), pp. 166-172.

[4] Yang, L., Xue, H., Fengjun, L. ‘Privacy-Preserving Data Sharing in Smart Grid Systems’, IEEE International Conference on Smart Grid Systems, 2014, pp. 878-883.

[5] Sharma, K., Saini, L. M. ‘Performance analysis of smart metering for smart grid: An overview’, Renewable and Sustainable Energy Reviews, 2015, 49, pp. 720-735.

[6] Farhangi, H. ‘The path of the smart grid’, IEEE Power and Energy Magazine, 2010, 8(1), pp. 18-28.

[7] Siano, P. ‘Demand response and smart grids - A survey’, Renewable and Sustainable Energy Reviews, 2014, 30, pp. 461-478.

[8] Rottondi, C., Verticale, G., Capone, A. ‘Privacy-preserving smart metering with multiple data consumers’, Computer Networks, 2013, 57, pp. 1699-1713.

[9] Benhamouda, F., Joye, M., Libert, B. ‘A new framework for privacy-preserving aggregation of time-series data’, ACM Transactions on Information and System Security, 2016, 18(3), pp. 21.

[10] Leontiadis, I., Elkhiyaoui, K., Molva, R. ‘Private and dynamic time-series data aggregation with trust relaxation’, In Proceedings of Cryptology and Network Security: 13th International Conference, CANS 2014, 2014, pp. 305-320.

[11] Lu, R., Liang, X., Li, X., Lin, X., Shen, X. S. ‘Eppa: An efficient and privacy-preserving aggregation scheme for secure smart grid communication’, IEEE Transactions on Parallel and Distributed Systems, 2012, 23(9), pp. 1621-1631.

[12] Chim, T., Yiu, S., Hui, L. C. K., Li, V. K. ‘Pass: Privacy-preserving authentication scheme for smart grid network’, In IEEE International Conference on Smart Grid Communications (SmartGridComm), 2011, pp. 196-201.

[13] Bae, M., Kim, K., Kim, H. ‘Preserving privacy and efficiency in data communication and aggregation for AMI network’, Journal of Network and Computer Applications, 2016, 59, pp. 333-344.

[14] Laforet, F., Buchmann, E., Böhm, K. ‘Individual privacy constraints on time-series data’, Information Systems, 2015, 54, pp. 74-91.


[15] Erdogdu, M. A., Fawaz, N., Montanari, A. ‘Privacy-utility trade-off for time-series with application to smart-meter data’, In Proceedings of the AAAI Conference on Artificial Intelligence, Workshop on Computational Sustainability, AAAI 2015 Workshop, 2015, pp. 32-36.

[16] Hong, S. K., Gurjar, K., Kim, H. S., Moon, Y. S. ‘A survey on privacy preserving time-series data mining’, In 3rd International Conference on Intelligent Computational Systems (ICICS’2013), 2013, pp. 44-48.

[17] Bao, H., Lu, R. ‘A New Differentially Private Data Aggregation with Fault Tolerance for Smart Grid Communications’, IEEE Internet of Things Journal, 2015, 2(3), pp. 248-258.

[18] Yang, X., Ren, X., Lin, J., Yu, W. ‘On Binary Decomposition based Privacy-preserving Aggregation Schemes in Real-time Monitoring Systems’, IEEE Transactions on Parallel and Distributed Systems, 2016, 27(10), pp. 2967-2983.

[19] Huang, Z., Wenliang, D., Chen, B. ‘Deriving Private Information from Randomized Data’, In Proceedings of International Conference on Management of Data, ACM SIGMOD, 2005, pp. 37-48.

[20] Alharbi, K., Lin, X., Shao, J. ‘A Framework for Privacy-Preserving Data Sharing in the Smart Grid’, IEEE/CIC ICCC 2014 Symposium on Privacy and Security in Communications, 2014, pp. 214-219.

[21] Gentry, C. ‘Fully homomorphic encryption using ideal lattices’, STOC’09, ACM, 2009, pp. 169-178.

[22] Jia, W., Zhu, H., Cao, Z., Dong, X., Xiao, C. ‘Human-Factor-Aware Privacy-Preserving Aggregation in Smart Grid’, IEEE Systems Journal, 2014, 8(2), pp. 598-607.

[23] Papadimitriou, S., Feifei, L., Kollios, G., Yu, P. S. ‘Time series compressibility and privacy’, VLDB ’07 Proceedings of the 33rd international conference on Very large data bases, 2007, pp. 459-470.

[24] Shamir, A. ‘How to share a secret’, Communications of the ACM, 1979, 22, pp. 612-613.

[25] Mai, V., Khalil, I. ‘Design and implementation of a secure cloud-based billing model for smart meters as an Internet of things using homomorphic cryptography’, Future Generation Computer Systems, 2017, 72, pp. 327-338.

[26] Nitaj, A. ‘Cryptanalysis of NTRU with two public keys’, International Journal of Network Security, 2014, 16(2), pp. 112-117.

[27] Abdallah, A., Shen, X. ‘Lightweight Security and Privacy Preserving Scheme for Smart Grid Customer-Side Networks’, IEEE Transactions on Smart Grid, 2015, PP(99), pp. 1-1.

[28] Paillier, P. ‘Public-Key Cryptosystems Based on Composite Degree Residuosity Classes’, In Proceedings of Eurocrypt, 1999, pp. 223-238.

[29] Li, H., Lin, X., Yang, H., Liang, X., Lu, R., Shen, X. ‘EPPDR: An Efficient Privacy-Preserving Demand Response Scheme with Adaptive Key Evolution in Smart Grid’, IEEE Transactions on Parallel and Distributed Systems, 2014, 25(8), pp. 2053-2064.

[30] Mahmood, K., Chaudhry, S. A., Naqvi, H., Kumari, S., Li, X., Sangaiah, A. K. (in press) ‘An Elliptic Curve Cryptography based Lightweight Authentication Scheme for Smart Grid Communication’, Future Generation Computer Systems. doi: 10.1016/j.future.2017.05.002


[31] Dwork, C. ‘Differential privacy’, Automata, languages and programming, 2006, 4052, pp. 1-12.

[32] Li, P., Li, J., Huang, Z., Li, T., Gao, C. Z., Yiu, S. M., Chen, K. ‘Multi-key privacy-preserving deep learning in cloud computing’, Future Generation Computer Systems, 2017, 74, pp. 76-85.

[33] Winters, P. R. ‘Forecasting sales by exponentially weighted moving averages’, Management Science, 1960, 6(3), pp. 324-342.

[34] Cleveland, R. B., Cleveland, W. S., McRae, J. E., Terpenning, I. ‘STL: A Seasonal-Trend Decomposition Based on Loess’, Journal of Official Statistics, 1990, 6(1), pp. 3-33.
