Expert Systems With Applications 84 (2017) 242–261
Expert Systems With Applications journal homepage: www.elsevier.com/locate/eswa
A statistical unsupervised method against false data injection attacks: A visualization-based approach
Mostafa Mohammadpourfard a, Ashkan Sami a,∗, Alireza Seifi b
a Department of Computer Science and Engineering, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
b Department of Power and Control, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
Article info
Article history: Received 10 December 2016; Revised 17 April 2017; Accepted 5 May 2017; Available online 8 May 2017
Keywords: Cyber-attacks; False data injection; Visualization; State estimation; Unsupervised learning; Topology changes; Distributed generation; Smart grid
Abstract
To achieve intelligence in the future grid, a highly accurate state estimation is necessary as it is a prerequisite for many key functionalities in the successful operation of the power grid. Recent studies show that a new type of cyber-attack called False Data Injection (FDI) attack can bypass bad data detection mechanisms in the power system state estimation. Existing countermeasures might not be able to manage topology changes and integration of distributed generations because they are designed for a specific system configuration. To address this issue, an unsupervised method to distinguish between attack and normal patterns is proposed in this paper. This method can detect FDI attacks even after topology changes and integration of renewable energy sources. In this method, we assume that injecting false data into the power systems will lead to a deviation in the probability distribution of the state vector from the normal trend. The main phases of the proposed algorithm are: (1) Normalizing the dataset, (2) Adding several statistical measures as the new features to the dataset to quantify the probability distribution of the state vectors, (3) Employing principal component analysis to reduce the dimensionality of the dataset, (4) Visualizing the reduced data for humans and exploiting their creativity to detect attacks, and (5) Locating the attacks using Fuzzy C-means clustering algorithm. The proposed method is tested on both the IEEE 14-bus and IEEE 9-bus systems using real load data from the New York independent system operator with the following attack scenarios: (1) attacks without any topology change, (2) attacks after a contingency, and (3) attacks after integration of distributed generations. Experimental results show that our proposed method is superior to the state-of-the-art classification algorithms in dealing with changes.
In addition, the reduced data which is helpful in distinguishing between attack and normal patterns can be fed into an expert system for further improvement of the security of the power grid. © 2017 Elsevier Ltd. All rights reserved.
1. Introduction
The power grid, with thousands of substations and transmission lines, has become one of the most critical infrastructures in the new era (Yan, Zhu, He, & Sun, 2013). Reliable and continuous delivery of electrical energy by the power grid is fundamental to most aspects of modern society (Zonouz & Haghani, 2013). The smart grid (SG) is an enhancement of the traditional power grid which uses a two-way flow of electricity and information to create an efficient, reliable, automated and distributed energy delivery network (Fang, Misra, Xue, & Yang, 2012). Providing these features requires accurate State Estimation (SE), which is a prerequisite
∗ Corresponding author.
E-mail addresses: [email protected] (M. Mohammadpourfard), [email protected] (A. Sami), seifi@shirazu.ac.ir (A. Seifi).
http://dx.doi.org/10.1016/j.eswa.2017.05.013
0957-4174/© 2017 Elsevier Ltd. All rights reserved.
for many key functionalities in the successful operation of electric power systems (Weng, Negi, Faloutsos, & Ilić, 2016). The weighted least squares (WLS) method is used in most state estimation programs to best fit the measured data from supervisory control and data acquisition (SCADA) or phasor measurement unit (PMU) networks (Zonouz et al., 2014). The result of SE is acceptable when the system model is correct and the variances of random errors/noises are known (Chaojun, Jirutitijaroen, & Motani, 2015). However, grossly erroneous measurements can degrade the accuracy of state estimation results (Grainger & William, 1994), and thus these measurements should be detected and removed from the estimator calculations. This detection is performed by Bad Data Detection (BDD) algorithms. When the WLS method is used, BDD is usually done by processing the measurement residuals. Similar to measurement errors and noise, a False Data Injection (FDI) attack (Liu, Ning, & Reiter, 2011) can affect the SE results. However, in this attack, the adversary alters the readings of SCADA
M. Mohammadpourfard et al. / Expert Systems With Applications 84 (2017) 242–261
and PMU networks in a way that attacks are not detected by traditional BDD algorithms (Ishii & Chakhchoukh, 2015; Mohsenian-Rad & Rahman, 2013; Sedghi & Jonckheere, 2015; Yang et al., 2014). Wrong state estimates can lead to wrong decisions in the control center (CC), which may eventually result in blackouts (Liang, Sankar, & Kosut, 2016; Rampurkar, Pentayya, Mangalvedekar, & Kazi, 2016; Yan et al., 2013).

Table 1
Properties of the proposed method and the state-of-the-art methods for FDI attack detection.
Type | Key contribution | Relevant literature | AL | ALN | DTC | DIRES | DEP (drawback) | RAMS (drawback)
(1) | Forcing the adversary to manipulate more measurements to orchestrate FDI attack by adding more measurements into the estimation process. | (Bobba et al., 2010; Giani et al., 2013) | No | No | No | No | No | No
(1) | Improving the security of the essential measurements by performing additional security mechanisms. | (Yang et al., 2014; Abdallah & Shen, 2016; Bi & Zhang, 2014; Bobba et al., 2010; Vukovic, Sou, Dan, & Sandberg, 2012) | Yes | No | No | No | No | No
(2) | Classification algorithms such as Perceptron, k-Nearest Neighbor and Support Vector Machines are used to predict new observations using a set of training data. | (Esmalifalak, Liu, Nguyen, Zheng, & Han, 2014; Ozay, Esnaola, Vural, Kulkarni, & Poor, 2016) | Yes | Yes | No | No | No | Yes
(2) | Kullback–Leibler distance (KLD) is used to detect attacks. After FDI, the KLD will be larger than normal. Therefore, a KLD threshold is set using historical data to detect attacks. | (Chaojun et al., 2015) | Yes | Yes | No | No | No | Yes
(2) | Inconsistency between the Markov graph of the bus phase angles and the power grid graph can lead to attack detection. It also selects a threshold using historical data to find the attacked nodes. | (Moslemi, Mesbahi, & Velni, 2017; Sedghi & Jonckheere, 2015) | Yes | Yes | No | No | No | Yes
(2) | Kalman filter is used to detect attacks. The probability of attack detection is largely dependent on the value of a precomputed threshold for Euclidean detector. | (Manandhar, Cao, Hu, & Liu, 2014) | Yes | Yes | No | No | No | Yes
(2) | It leverages online information independent of traditional SCADA measurements to identify anomalies. It uses gathered information from load forecasts, generation schedules, and real-time data from existing PMUs to detect anomalies in compromised SCADA measurements. | (Ashok, Govindarasu, & Ajjarapu, 2016) | Yes | To some extent | Yes | To some extent | Yes | No
(2) | Proposes an unsupervised visualization-based method | Proposed Method | Yes | Yes | Yes | Yes | No | No

1.1. Related work
Several methods have been proposed to alleviate FDI attacks. These countermeasures can be classified into two main types:
1) Protection-based
2) Detection-based
Protection-based approaches try to combat FDI attacks by identifying and protecting critical sensors. They usually define a subset of critical measurements and make them more resilient to attacks by encryption, tamper-proof communication systems, etc. (Yang et al., 2014). In contrast, detection-based approaches recognize FDI attacks by developing anomaly detection mechanisms which are based on analysis and modeling of the distribution of historical measurements (Chaojun et al., 2015). Overall, methods of the second type have utilized graph theory, Kalman filter, classification algorithms, statistical threshold testing, etc. For a detailed review of the most related works and a qualitative comparison between them and the proposed method, six properties are investigated for each method as shown in Table 1. These properties are:
1. Attack Localization (AL): This property indicates the ability of the proposed method to determine the location of attacks.
2. Requiring Attacked Measurement Samples (RAMS): This property indicates whether the developed method needs to know the target value of each sample (attack/normal) and needs corrupted measurement samples to accurately predict new observations. This requirement is a drawback since it is hard to find target values in the real world. Moreover, for successful detection, the samples should cover all possible attack scenarios.
3. Applicable for Large Networks (ALN): This property indicates whether the method is applicable to large networks from the computational complexity and cost perspectives.
4. Dealing with Topology Changes (DTC): This property indicates the ability of a method to detect attacks after topology changes (e.g. contingencies).
5. Dealing with Integration of Renewable Energy Sources (DIRES): This property indicates the ability of a method to detect attacks after integration of sustainable energy sources into the power system.
6. Depending on External Platforms (DEP): This property indicates whether the functionality of an attack detection method depends on external devices and on the output of other modules. This dependence is a major drawback since the compromise of other modules will affect the performance of the method.
1.2. Main motivation
As shown in Table 1, one common drawback of the two types is that they have been designed for a specific system configuration (Ashok, Govindarasu, & Ajjarapu, 2016). This means that they have not considered the impact of topology changes or the integration of Renewable Energy Sources (RESs) into the power network. An approach designed for a specific system configuration might fail to detect an attack, or might locate it wrongly, under a different system configuration, because the underlying data distribution and network topology change after topology changes or the integration of sustainable energy sources. Moreover, the assumption of a fixed configuration will no longer hold in the future grid, where frequent topological changes and intermittent generation (wind and solar farms) can lead to remarkable state variation in grid operations (Weng et al., 2016). Table 1 indicates that, in terms of properties, the method proposed in Ashok et al. (2016) is the most similar to our method. Ashok et al. (2016) emphasized the issue of dynamically evolving cyber-threats and variable system configurations. Hence, they proposed a method to detect FDI attacks while considering system reconfigurations (topology changes). It is based on utilizing redundant measurements, which are used to obtain a statistical characterization of the difference between state estimation using SCADA data and forecast-based predictions for detecting FDI attacks. However, some of its assumptions might not be practical in the real world: (1) It is assumed that PMU data is completely secure. To this end, NASPINet (Dagle, 2009), an isolated network with various cyber security mechanisms, has been suggested to withstand tampering while PMU data is sent. However, NASPINet must at least connect to other networks to send PMU data to the Phasor Data Concentrator (PDC).
Moreover, most of the security mechanisms mentioned for NASPINet have not been implemented yet. In addition, several research efforts have focused on the vulnerabilities of PMUs (Ishii & Chakhchoukh, 2015; Mousavian, Valenzuela, & Wang, 2015). (2) Topology processing is performed using only SCADA data. Therefore, the adversary can launch an attack by manipulating SCADA data. More specifically, the attacker can hide contingencies or generate fake contingencies to mislead the topology processor.
1.3. Summary of contributions
In this paper, we propose an unsupervised anomaly detection method to detect cyber-attacks in power systems that are affected by topology changes or RESs. In particular, we present a visualization-based false data injection detection method. While the problem of detecting cyber-attacks through visualization has been widely studied over the past years in the context of conventional information technology systems (Celenk, Conley, Willis, & Graham, 2010; Corchado & Corchado, 2011; Goodall, 2006; Koike, Ohno, & Koizumi, 2005; Luo & Xia, 2014), it has not yet been utilized in the context of the smart grid. When false data are injected into the system, the probability distributions of the state vectors will deviate from the normal ones, allowing the detection of attacks. To quantify the Probability Distribution (PD) of each system state vector, different statistical measures are proposed and calculated. Then we reduce the dimensionality of the newly generated data using principal component analysis (PCA) (Jolliffe, 2002) and visualize the reduced data in 2-D. By visualizing the data, the control center operators can easily gain valuable insight into legitimate data patterns and will be able to highlight the salient points in the data. In addition, the reduced data can be fed into an expert system that evaluates system behavior and raises an alert when the system behaves anomalously.
To locate an attack after detection, we have applied the fuzzy c-means (FCM) algorithm (Bezdek, Ehrlich, & Full, 1984) to the attacked system state vector to cluster the data into attacked system states and normal ones. Our proposed method is a monitoring tool and would be executed based on the grid operator's decision, as in Ashok et al. (2016). The proposed method:
• Does not need additional hardware.
• Detects FDI attacks using only the state vectors. This implies that we detect attacks using a minimal set of features.
• Does not need supervision or an attack model.
• Can detect attacks after topology changes and integration of RESs.
• May detect other types of data integrity attacks since it is an anomaly detection method.
Paper Outline: The remainder of the paper is organized as follows. Section 2 presents a brief overview of the state estimation, bad data detection, and false data injection attacks. Section 3 explains the extracted measures and the employed algorithms and then discusses how we leverage these algorithms and measures to detect and localize attacks. Section 4 explains the steps of building the test systems and shows the test results on the IEEE 14 bus and IEEE 9 bus systems. The conclusion is drawn in Section 5.
2. Background
In this section, state estimation and bad data detection are discussed first. Then the procedure for orchestrating the state-of-the-art false data injection attack is reviewed.
2.1. DC state estimation
The DC power flow model is utilized in this paper. In the DC model, the phase angles of all buses are considered as state variables since it assumes that bus voltage magnitudes are all equal to one. In DC state estimation, the aim is to estimate the phase angles $\theta_i$ using the $m$ observed measurements. The DC state estimation problem can be represented using a linear approximation model as follows:
$$z = Hx + e \quad (1)$$
where $z = [z_1, \ldots, z_m]^T$ is the real-time measurement vector, which contains $m$ active power measurements received from remote meters, and $x = [x_1, \ldots, x_n]^T$ ($m \ge n$) is the system state vector, which consists of the phase angles of all the buses. Since we consider bus 1 as the reference bus ($\theta_1 = 0$), the state vector becomes $x = [x_2, \ldots, x_n]^T$. $H_{m \times n}$ is a constant Jacobian matrix derived from the physical structure of the grid (Abur & Gomez Exposito, 2004). $e = [e_1, \ldots, e_m]^T$ is the vector of measurement errors, which follows a Gaussian distribution with zero mean. Let the covariance matrix of the measurement errors be $R$. The estimate $\hat{x}$ of the system state using WLS state estimation is given by:
$$\hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} z \quad (2)$$
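The WLS estimate in (2) can be illustrated numerically. The following is a minimal sketch rather than the authors' implementation: the 4×2 matrix `H`, the covariance `R`, the state `x_true` and the helper name `wls_estimate` are hypothetical toy choices, and the measurements are taken as noise-free so that the estimate can be checked against the true state.

```python
import numpy as np

def wls_estimate(H, R, z):
    """Weighted least squares: x_hat = (H^T R^-1 H)^-1 H^T R^-1 z."""
    R_inv = np.linalg.inv(R)
    gain = H.T @ R_inv @ H                 # n x n gain matrix
    # Solve the normal equations instead of forming the inverse explicitly.
    return np.linalg.solve(gain, H.T @ R_inv @ z)

# Hypothetical toy system: 4 measurements, 2 state variables.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0]])
R = np.diag([0.01, 0.01, 0.02, 0.02])      # assumed measurement error covariance
x_true = np.array([0.10, -0.05])           # assumed phase angles (rad)
z = H @ x_true                             # noise-free measurements
x_hat = wls_estimate(H, R, z)
```

With noise-free measurements the estimate coincides with `x_true`; with noisy measurements it would be the weighted best fit, where meters with smaller variance in `R` receive more weight.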
2.2. BDD
To detect bad data, the measurement residual $r = z - H\hat{x}$ is calculated. To catch bad measurements, which can originate from faulty sensors, topological errors, etc., the residual is compared with a threshold $\tau$. It is assumed that there is no bad data if the residual does not exceed the threshold, $\|r\| \le \tau$. However, this assumption is breached by a newly introduced attack called false data injection (Liu et al., 2011).
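A residual test of this kind might be sketched as follows. This is an illustration under stated assumptions: the 2-norm of the residual is used as the test statistic, the threshold `tau = 0.1` and the toy matrices are hypothetical, and `has_bad_data` is an illustrative helper name.

```python
import numpy as np

def has_bad_data(z, H, R, tau):
    """Return True when the norm of the WLS residual exceeds the threshold tau."""
    R_inv = np.linalg.inv(R)
    x_hat = np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ z)
    residual = z - H @ x_hat
    return bool(np.linalg.norm(residual) > tau)

# Hypothetical toy system: 4 measurements, 2 state variables.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0]])
R = np.diag([0.01, 0.01, 0.02, 0.02])

z_clean = H @ np.array([0.10, -0.05])   # consistent, noise-free measurements
z_faulty = z_clean.copy()
z_faulty[2] += 1.0                      # gross error on a single meter

clean_flag = has_bad_data(z_clean, H, R, tau=0.1)
faulty_flag = has_bad_data(z_faulty, H, R, tau=0.1)
```

The single corrupted meter is inconsistent with the other measurements, so the residual grows and the test fires, while the consistent measurement set passes.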
2.3. FDI attack
In the FDI attack, an attacker who has knowledge of the system structure $H$ can falsify state estimation results by manipulating multiple measurements at the same time without triggering the BDD alarm. Let $a = [a_1, \ldots, a_m]^T$ be the nonzero attack vector which is added to the original measurement vector $z$. This leads to a new measurement vector $z_a = z + a$. According to Liu et al. (2011), the manipulated measurement $z_a$ will bypass BDD if
$$a = Hc \quad (3)$$
where $c = [c_1, \ldots, c_n]^T$ is the injected error on the system state. Sending the new measurement vector $z_a$ to the state estimator will result in the false estimate $\hat{x}_{bad} = \hat{x} + c$, where $\hat{x}$ is the true estimate of the system state. The FDI attack can bypass BDD because there is no difference between the residual of the attacked measurements and that of the normal ones, as shown below:
$$r_a = z_a - H\hat{x}_{bad} = z + a - H(\hat{x} + c) = z - H\hat{x} + (a - Hc) = z - H\hat{x} = r \quad (4)$$
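The identity in (4) can be checked numerically. The sketch below is illustrative only: the toy `H`, `R`, noise level and injected error `c` are hypothetical, and the 6% change applied to the second state variable merely mirrors the kind of injection discussed in the text.

```python
import numpy as np

def wls(H, R, z):
    """WLS state estimate, as in (2)."""
    R_inv = np.linalg.inv(R)
    return np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ z)

# Hypothetical toy system: 4 measurements, 2 state variables.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0]])
R = np.diag([0.01, 0.01, 0.02, 0.02])
x_true = np.array([0.10, -0.05])

rng = np.random.default_rng(0)
z = H @ x_true + rng.normal(0.0, 0.01, size=4)   # noisy measurements

c = np.array([0.0, -0.06 * x_true[1]])   # injected state error (assumed 6% change)
a = H @ c                                # attack vector a = Hc, as in (3)
z_a = z + a                              # manipulated measurements

x_hat = wls(H, R, z)
x_bad = wls(H, R, z_a)
r = z - H @ x_hat                        # residual without attack
r_a = z_a - H @ x_bad                    # residual under attack
```

Here `a` is nonzero only for the measurements that depend on the second state variable, the estimate shifts by exactly `c`, and the residual is unchanged, so a residual-based test cannot tell the two cases apart.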
To orchestrate an FDI attack that manipulates a certain state variable, the adversary should inject false data into all the measurements that depend on that state variable. More specifically, the minimum number of measurements that must be manipulated to inject false data into the state estimation is equal to the number of nonzero elements of the sparsest column in H. The increased use of information and communication technology in the smart grid has made it easier to perform such illegal modifications by exploiting potential vulnerabilities of the cyber layer (Ntalampiras, 2016a).
3. Our methodology
This section first describes the newly proposed measures and the employed algorithms. Afterwards, we elaborate on our detection approach, which utilizes those measures and algorithms. An overview of the proposed method is illustrated in Fig. 1.
3.1. Newly proposed measures and employed algorithms
1) Description of New Measures: When false data are injected into the system, the PDs of the state vectors will deviate from the normal ones. This implies that the shape of the distribution curves of the attacked state vectors will be different from the normal ones. Therefore, the proposed method first utilizes statistical measures to quantify the PD of the system state vectors ($x = [x_2, \ldots, x_n]^T$) to detect FDI attacks.
Statistical moment selection: To choose measures, we treated the problem as a supervised learning task. To this end, we selected different shape measures and calculated their values for the state vectors. Then, we continued in the supervised framework and turned each measure into a feature. Afterwards, we used Pearson correlation-based feature selection and evaluated the worth of each feature. Finally, several simulations and tests were performed to select the best set of features among the top-ranked ones.
The final set of features includes: 1) second moment; 2) third moment; 3) fourth moment; 4) mean; 5) skewness; 6) kurtosis; and 7) sample variance (Hippel, 2005; Ramsey, Newton, & Harvill, 2002). These features are described in Table 2, where $E$ is the expectation operator, $x$ is the state vector, $\mu$ is the population mean, $n$ is the number of system states, $\sigma$ is the standard deviation and $\bar{x}$ is the sample mean. To show how effective these features are in distinguishing between attack and normal patterns, consider Fig. 2a, which is the histogram of the third moment calculated for the state vectors
before false data injection attacks. To illustrate the impact of an FDI attack on the third moment of the state vector, we simulate a 6% decremental attack on the ninth state variable ($\theta_9$), $c = (0, \ldots, 0.06 * \theta_9, 0, \ldots, 0)$. Fig. 2b shows the histogram of the third moment of the state vectors with false data injection. By comparing the two figures, one can see that FDI attacks affect the value of the third moment, which is a measure of the PD. This is also true for the other measures.
2) Principal Component Analysis (PCA): PCA is a statistical procedure that maps a given unlabeled dataset from its original features onto a new set of axes called principal components (PCs) (Jolliffe, 2002). More specifically, the aim of PCA is to transform a set of m correlated features into a set of n (n ≤ m) features which are not correlated with each other while retaining most characteristics of the data. Since the new uncorrelated features (PCs) are ordered in terms of variance, the first PC retains the most information about the original data and each subsequent PC preserves less information. Because PCA seeks to maximize the variance of each component, it is sensitive to the relative scaling of the original features. Therefore, normalization is usually done before applying PCA to avoid this problem. The steps of PCA are displayed in Algorithm 1.
3) Density-based spatial clustering of applications with noise (DBSCAN): Clustering is a technique to group a dataset in such a way that all objects in the same group are more similar to each other than to those in another group. DBSCAN is a data clustering method. It assumes that clusters are high-density areas separated by points with lower density (Ester, Kriegel, Sander, & Xu, 1996; Kriegel, Kröger, Sander, & Zimek, 2011). It usually clusters a dataset well if the dataset contains clusters with similar densities. It takes two parameters: Eps and MinPts.
Like other clustering algorithms, it needs a method to find nearby data. To this end, Euclidean distance is usually used. The algorithm starts from an arbitrarily selected point which has not been visited yet. Then, the points which lie in the neighborhood of this point, at a distance less than the given Eps, are counted. If the counted points are more than the given MinPts parameter, a cluster is created. Otherwise, the point remains unclustered and is considered as noise. This means that if a point belongs to a cluster, the density in the neighborhood of that point should be high enough. This process is repeated for all points in the data, and data labeled as noise might be recognized as part of a cluster in later steps. It is noteworthy that if an object is a member of a cluster, its Eps-neighborhood is also a member of that cluster. In this paper, we have used Eps = 0.06 and MinPts = 6.
4) Fuzzy c-means: In contrast to non-fuzzy clustering methods, in which an object can be a member of exactly one cluster, in FCM each object can be part of several clusters (Bezdek et al., 1984). This is done by assigning a membership grade to each object. Belonging to a cluster is determined by these membership grades. In other words, each object can belong to several clusters with different degrees of membership. These grades correspond to the distance between the cluster center and the data point. Each data point is assigned to the cluster in which its membership degree is maximal compared to the other clusters. In FCM, the following cost function is minimized iteratively:
$$J = \sum_{i=1}^{n} \sum_{j=1}^{c} \mu_{ij}^{m} \| x_i - c_j \|^2 \quad (5)$$
where $n$ is the number of data points, $c$ is the number of clusters which is equal to two in our problem as we have attacked and normal state variables, $x_i$ is the $i$th data point, $c_j$ is the center of the
Fig. 1. Overview of the proposed method.

Table 2
Statistical measures for quantifying the PD of the system state vectors.
Feature | Formula | Description
Second Moment | $m_2 = E[(x - \mu)^2]$ | It is the population variance. The variance is a measure of how far the data is spread. With FDI attack, the variance will be larger/smaller than the normal mode.
Third Moment | $m_3 = E[(x - \mu)^3]$ | It is a measure of the asymmetry of the PD. When false data are injected, the PD will be more/less asymmetric than the normal mode.
Fourth Moment | $m_4 = E[(x - \mu)^4]$ | It is a measure of peakedness or flatness of the shape of a distribution. When false data are injected into the power system, the distribution curve will be highly peaked/flattened.
Mean | $m_e = \frac{x_1 + \ldots + x_n}{n}$ | It is a measure of central tendency of a PD. When false data are injected, the distribution will be less/more dispersed than normal data distribution.
Skewness | $\gamma_1 = E[(\frac{x - \mu}{\sigma})^3]$ | It is equal to the third standardized moment. The standardized version of a moment is helpful because it makes the measure invariant to scale and variability. This is useful in comparing the shape of variant PDs.
Kurtosis | $\gamma_2 = E[(\frac{x - \mu}{\sigma})^4]$ | It is equal to the fourth standardized moment.
Sample Variance | $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ | In contrast to the population variance, sample variance is divided by the number of the elements in the population minus 1. This helps us to get an unbiased estimation of the state vector variance.
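The seven measures of Table 2 can be computed for a single state vector as sketched below. This is a hedged illustration, not the authors' code: the expectation $E[\cdot]$ is approximated by the average over the entries of the vector, $\mu$ is replaced by that average, and the function name `distribution_features` and the example vector are hypothetical.

```python
import math

def distribution_features(x):
    """Seven shape measures of one state vector x (a list of floats)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    m2 = sum(d ** 2 for d in dev) / n      # second central moment (population variance)
    m3 = sum(d ** 3 for d in dev) / n      # third central moment
    m4 = sum(d ** 4 for d in dev) / n      # fourth central moment
    # Standardized moments; a constant vector (m2 == 0) would need special handling.
    return {
        "second_moment": m2,
        "third_moment": m3,
        "fourth_moment": m4,
        "mean": mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "sample_variance": sum(d ** 2 for d in dev) / (n - 1),
    }

# Hypothetical state vector, used only to exercise the function.
features = distribution_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

Each state vector is thus summarized by seven numbers, which is the 7-dimensional input that the method later reduces to two dimensions.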
Algorithm 1 Principal component analysis.
Input: original dataset $z_{(m,n)}$; % m = number of samples, n = number of features
1 Calculate the mean of each dimension: $\mu_n = \mathrm{mean}(z_{(m,n)})$;
2 Calculate the covariance matrix: $\Sigma = \frac{(z - \mu)^T (z - \mu)}{m - 1}$; % $\mu$ is an m-row matrix in which $\mu_n$ is repeated in each row
3 Find the eigenvalues and eigenvectors of the covariance matrix; % PCs are eigenvectors of the covariance matrix
4 Select the first k eigenvectors, ordered by decreasing eigenvalue, as the k PCs (k ≤ n)
Output: Principal Components (reduced dataset)
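The steps of Algorithm 1 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the covariance is normalized by the number of samples minus one, the data are projected onto the selected eigenvectors to obtain the reduced dataset, and the random 50×7 input and the name `pca_reduce` are hypothetical.

```python
import numpy as np

def pca_reduce(data, k=2):
    """Project data (m samples x n features) onto its first k principal components."""
    centered = data - data.mean(axis=0)                  # step 1: remove feature means
    cov = centered.T @ centered / (data.shape[0] - 1)    # step 2: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)               # step 3: eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:k]                # step 4: keep the k largest
    components = eigvecs[:, order]
    return centered @ components, eigvals[order]

# Hypothetical feature matrix: 50 state vectors, 7 statistical measures each.
rng = np.random.default_rng(1)
feature_matrix = rng.normal(size=(50, 7))
scores, variances = pca_reduce(feature_matrix, k=2)
```

The two columns of `scores` are the coordinates that would be plotted in 2-D, and `variances` holds the variance retained by each of the two components.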
$j$th cluster and is obtained by:
$$c_j = \frac{\sum_{i=1}^{n} \mu_{ij}^{m} x_i}{\sum_{i=1}^{n} \mu_{ij}^{m}} \quad (6)$$
$\mu_{ij}$ is the membership degree of the $i$th data point in the $j$th cluster and is calculated as follows:
$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \frac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{\frac{2}{m-1}}} \quad (7)$$
where $m$ is the fuzziness index. The fuzziness of a cluster is determined by $m$. A large $m$ will lead to smaller membership grades and fuzzier clusters. The usual default value of $m$ which is usually used when there is no domain/experimentation knowledge is
Fig. 2. Impact of false data injection attacks on the third moment.
2. The FCM algorithm starts with a given cluster number c and randomly initialized cluster membership values, $\mu_{ij}$. Then, the cluster centers are calculated and $\mu_{ij}$ is updated based on (7). The cost function is calculated and minimized until a specified number of iterations is reached.
3.2. Proposed method
As illustrated in Fig. 1, first, we use min–max normalization to transform the data into the same scale and then calculate the new measures. These measures are calculated for each system state vector. The values of these features differ with and without an FDI attack, because the PD of the attacked state vector is different from the PD of the normal one. These new features are given to the PCA algorithm as input. PCA is used to reduce the dimensionality of the given data (7 → 2), and then the reduced data is visualized in 2-D. By visualizing the data in 2-D, the grid operators can determine whether an attack has occurred, as the patterns of abnormal and normal data will be different. Through visualization, we take advantage of the analytical capabilities and creativity of the grid operators. Visualization helps to discover data patterns easily and to distinguish between normal data and anomalies. Although an FDI attack is detectable through visualization, it is also possible to detect attacks automatically by dividing the data into attack and normal clusters, so that there is no need for human intervention. To this end, the DBSCAN algorithm is utilized as an alternative verification for evaluating the performance of the proposed method without relying on the capabilities of humans. DBSCAN is used to cluster the reduced data. DBSCAN confirms the performance of the proposed method if it separates the data into attack and normal clusters. To locate an attack automatically, we have utilized outlier detection techniques. An outlier is a point that stands apart from other points and deviates from others.
This means that the attacked state variable will behave as an outlier and will be inconsistent with the other states. We have used the fuzzy c-means (FCM) algorithm to detect the attacked system states. FCM has been used widely in the outlier detection context (Hodge & Austin, 2004).
4. Numerical results
The effectiveness of our detection algorithm is tested using three different case studies. In case A, an FDI attack on a system without any topological change is simulated. To test the ability of the method to handle topology changes, FDI attacks are injected into the system after a contingency (a line outage) in case B. Case C is designed to evaluate the robustness of the proposed method against the integration of distributed generation into the power grid. In case C, we inject FDI attacks into the system after the integration of a wind farm. The test systems used in this paper are based on the IEEE 14 bus and IEEE 9 bus systems, as shown in Fig. 3. To show the effectiveness of the proposed method, we have compared it with different classification algorithms.
4.1. Data preparation
MATPOWER (Zimmerman, Murillo-Sánchez, & Thomas, 2011) is used to run the simulations. Furthermore, to simulate the power system behavior more realistically, the load data used in the test systems is based on the online load profile from the New York independent system operator (NYISO, 2016). This means that real-world load values are used in the simulations. More specifically, there are 11 load regions in the NYISO data, as shown in Fig. 4. The load data is recorded every five minutes (except for some days). The load data used in this paper is for the first week of January 2016 (January 1, 2016 to January 7, 2016). This implies that we have 2045 normal samples. To generate the system state data from the NYISO load pattern, each load bus of the test system is linked with one region of NYISO using the mapping shown in Table 3. Then, the load data is normalized to the initial real load of the corresponding test system. Table 3 presents the load data characteristics. Moreover, as in Ntalampiras (2016b), to mimic the effect of random errors that occur in nature, Gaussian noise with zero mean and a standard deviation of 0.02 is added to the measurements.

Table 3
NYISO load data characteristics.
Region | Bus | Range (MW) | Mean (MW) | SD (MW)
CAPITL | Bus2 | [11.76–21.70] | 16.68 | 2.38 (14.29%)
CENTRL | Bus3 | [51.23–94.20] | 72.85 | 9.54 (13.10%)
DUNWOD | Bus4 | [25.70–47.80] | 35.35 | 5.11 (14.46%)
GENESE | Bus5 | [4.14–7.60] | 5.76 | 0.81 (14.18%)
HUD VL | Bus6 | [5.44–11.20] | 8.26 | 1.27 (15.40%)
LONGIL | Bus9 | [15.39–29.50] | 21.41 | 3.45 (16.15%)
MHK VL | Bus10 | [4.60–9.00] | 6.87 | 1.11 (16.22%)
MILLWD | Bus11 | [1.68–3.50] | 2.48 | 0.35 (14.12%)
N.Y.C. | Bus12 | [3.54–6.10] | 4.71 | 0.72 (15.32%)
NORTH | Bus13 | [9.12–13.50] | 11.07 | 0.89 (8.09%)
WEST | Bus14 | [9.29–14.90] | 11.98 | 1.31 (11.01%)

Fig. 3. Used test systems.
Fig. 4. NYISO map of 11 power grid regions in New York State, USA.

4.2. False data injection attack generation
To attack a specific state variable, all measurements that depend on that state should be replaced with falsified ones. We have simulated attacks on each system state variable $\theta_2, \ldots, \theta_n$. We have also simulated simultaneous attacks on multiple buses. These attacks are orchestrated on system states $\theta_2$, $\theta_7$, $\theta_{13}$ for the IEEE 14 bus system and $\theta_2$, $\theta_4$ for the IEEE 9 bus system. For each attack, it is assumed that the adversary decreases or increases the state variable by 6 percent of its original value. In other words, for each state variable, incremental (106%) and decremental (94%) injection amounts are simulated. 94 percent means that the manipulated state variable is 6% smaller than the true value.
4.3. Attack detection results for the IEEE 14 bus system
4.3.1. Case Study A: detecting attacks without topology changes
In this case, we have injected false data into the measurements between 12 a.m. on January 1, 2016 and 16:30 on that day (first day). Therefore, we have 200 samples for each attack scenario. As mentioned previously, to quantify the PDs of the system state vectors, new indexes are calculated. Fig. 5a and b show
the visualization of the system state data with injection of false data into the state variable θ2 without employing the proposed approach. The axes of these plots are the principal components (PCs) obtained after applying PCA to the state vectors [x2, ..., x14]^T. As we can see, the normal operation data and the attack data are interwoven. After calculating the new measures for each system state vector, PCA is employed to reduce the dimensions of the data (7 → 2), and then the reduced data is visualized. Fig. 5c and d show the related results. These figures indicate that the grid operator can distinguish between the normal behavior of the system and the anomalous behavior. Fig. 6 shows the results for each targeted state variable for the decremental FDI attack (94%). Figures for the other attack scenarios are given in Appendix A.

As mentioned earlier, to show the functionality of the proposed method without relying on human skills, we have utilized the DBSCAN algorithm. DBSCAN clusters the data based on their density without any knowledge of their labels. For our problem, the aim was to cluster the data into normal and attack groups. For comparison, different classification algorithms such as Support Vector Machine (SVM), Multi-layer Perceptron (MLP) and K-Nearest Neighbor (KNN) were tested on this dataset. These algorithms have been used in (Esmalifalak, Liu, Nguyen, Zheng, & Han, 2014; Ozay et al., 2016) to detect FDI attacks. The classification models were built and evaluated using 10-fold cross validation. Table 4 presents the results of DBSCAN on the reduced data (i.e. the data visualized in Fig. 6) along with the results of the classification algorithms. In this table, each row shows one targeted system state variable. Table 5 gives the detection rates of the algorithms based on Table 4. The detection rate is the number of attacks detected by the method divided by the total number of attacks in the dataset.
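The attack scaling of Section 4.2 and the feature step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the seven statistics chosen here (mean, median, standard deviation, skewness, kurtosis, minimum, maximum) are an assumption, because this section only states that seven measures are computed per state vector before PCA. The angle values and the helper names `inject` and `state_vector_features` are hypothetical.

```python
import statistics

def state_vector_features(angles):
    """Seven summary statistics of one state vector (an assumed feature set)."""
    n = len(angles)
    mean = statistics.fmean(angles)
    median = statistics.median(angles)
    std = statistics.pstdev(angles)
    # Standardized third and fourth central moments.
    m3 = sum((a - mean) ** 3 for a in angles) / n
    m4 = sum((a - mean) ** 4 for a in angles) / n
    skewness = m3 / std ** 3 if std > 0 else 0.0
    kurtosis = m4 / std ** 4 if std > 0 else 0.0
    return [mean, median, std, skewness, kurtosis, min(angles), max(angles)]

def inject(angles, index, factor):
    """Scale one state variable, e.g. factor=0.94 for a 6% decremental attack."""
    attacked = list(angles)
    attacked[index] *= factor
    return attacked

# Hypothetical voltage angles (degrees) for theta_2..theta_14; illustrative only.
normal = [-4.98, -12.72, -10.33, -8.78, -14.22, -13.37, -13.36,
          -14.94, -15.10, -14.79, -15.07, -15.16, -16.04]
attacked = inject(normal, index=0, factor=0.94)  # decremental attack on theta_2

features_normal = state_vector_features(normal)
features_attacked = state_vector_features(attacked)
```

Each sample is thereby mapped to a short feature vector that describes the shape of its distribution, and it is these vectors, not the raw angles, that would be passed to PCA in such a pipeline.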
As is clear from Table 4, DBSCAN can detect attacks in most cases. The clustering results generally agree with the visualization (visual separation), with some exceptions. DBSCAN was not successful in detecting attacks on the state variable θ6 with injection amount 94% and on θ3 with injection amount 106%. However, we believe that control center operators can distinguish between the normal and anomalous patterns in the visualizations in Figs. 6e and 7. If we change the values of Eps and MinPts, the detection rate for these scenarios increases, but this setting leads to wrong clustering in other attack scenarios. In other words, the presented detection rate for the proposed method is the minimum value that we can acquire automatically without human involvement. It is noteworthy that other clustering algorithms such as Learning Vector Quantization (LVQ) (Kohonen, 1995), FCM and single-linkage (Murtagh & Contreras, 2012) were also tested, but they showed high false positive rates in some attack scenarios.
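To make the role of Eps and MinPts concrete, the sketch below implements a minimal DBSCAN on 2-D points and computes a detection rate from its labels. It is an illustration under stated assumptions: the point coordinates, `eps=0.15` and `min_pts=3` are hypothetical (the paper's parameter values are not given in this section), and treating the largest cluster as "normal" is an assumed decision rule.

```python
import math
from collections import Counter

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one label per point (-1 = noise, 0.. = cluster id)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster          # former noise becomes a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:         # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical reduced data: a dense "normal" group and a separate "attack" group.
normal_pts = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)]
attack_pts = [(3.0 + 0.1 * i, 3.0 + 0.1 * j) for i in range(3) for j in range(3)]
labels = dbscan(normal_pts + attack_pts, eps=0.15, min_pts=3)

# Assumed rule: the largest cluster is normal; everything else counts as detected.
normal_label = Counter(l for l in labels if l != -1).most_common(1)[0][0]
detected = sum(1 for l in labels[len(normal_pts):] if l != normal_label)
detection_rate = detected / len(attack_pts)
```

With a much larger `eps` the two groups would merge into one cluster and the attack samples would go undetected, which is the kind of parameter sensitivity discussed above.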
Fig. 5. Functionality of the proposed method.

Table 4
Results of applying DBSCAN on PCs – Case A. Entries are detected attacks (detection %).

| Inject | Proposed method, Org × 0.94 | Proposed method, Org × 1.06 | SVM, Org × 0.94 | SVM, Org × 1.06 | MLP, Org × 0.94 | MLP, Org × 1.06 | KNN, Org × 0.94 | KNN, Org × 1.06 |
|---|---|---|---|---|---|---|---|---|
| θ2 | 199 (99.5%) | 199 (99.5%) | 155 (77.5%) | 166 (83%) | 200 (100%) | 200 (100%) | 198 (99%) | 200 (100%) |
| θ3 | 199 (99.5%) | 24 (12%) | 200 (100%) | 199 (99.5%) | 0 (0%) | 200 (100%) | 200 (100%) | 200 (100%) |
| θ4 | 197 (98.5%) | 197 (98.5%) | 200 (100%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) |
| θ5 | 196 (98%) | 198 (99%) | 200 (100%) | 200 (100%) | 0 (0%) | 198 (99%) | 200 (100%) | 200 (100%) |
| θ6 | 115 (57.5%) | 198 (99%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) | 200 (100%) |
| θ7 | 195 (97.5%) | 200 (100%) | 200 (100%) | 200 (100%) | 0 (0%) | 192 (96%) | 200 (100%) | 200 (100%) |
| θ8 | 195 (97.5%) | 200 (100%) | 200 (100%) | 200 (100%) | 0 (0%) | 184 (92%) | 200 (100%) | 200 (100%) |
| θ9 | 196 (98%) | 200 (100%) | 200 (100%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) |
| θ10 | 196 (98%) | 200 (100%) | 200 (100%) | 200 (100%) | 0 (0%) | 194 (97%) | 200 (100%) | 200 (100%) |
| θ11 | 195 (97.5%) | 200 (100%) | 200 (100%) | 200 (100%) | 160 (80%) | 42 (21%) | 200 (100%) | 200 (100%) |
| θ12 | 200 (100%) | 198 (99%) | 200 (100%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) |
| θ13 | 198 (99%) | 198 (99%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) | 200 (100%) |
| θ14 | 194 (97%) | 198 (99%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) | 200 (100%) |
| θ2−7−13 | 200 (100%) | 198 (99%) | 200 (100%) | 200 (100%) | 0 (0%) | 200 (100%) | 200 (100%) | 200 (100%) |
Table 5
Performance of methods – Case A.

| Method | Detection rate |
|---|---|
| SVM | 98.57% |
| MLP | 53.03% |
| KNN | 99.96% |
| Proposed method | 94.33% |
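As a worked check of the detection rate definition, the sketch below pools the proposed method's detected counts from Table 4 over both injection amounts. Pooling all 28 scenarios is an assumption about how the overall figure was obtained; it gives about 94.34%, which agrees with the reported 94.33% up to rounding. The function name `detection_rate` is illustrative.

```python
# Detected-attack counts for the proposed method, copied from Table 4
# (14 attack scenarios per injection amount, 200 attack samples each).
detected_094 = [199, 199, 197, 196, 115, 195, 195, 196, 196, 195, 200, 198, 194, 200]
detected_106 = [199, 24, 197, 198, 198, 200, 200, 200, 200, 200, 198, 198, 198, 198]
SAMPLES_PER_SCENARIO = 200

def detection_rate(detected_counts, samples_per_scenario):
    """Detected attacks divided by the total number of attacks."""
    total = samples_per_scenario * len(detected_counts)
    return sum(detected_counts) / total

rate = detection_rate(detected_094 + detected_106, SAMPLES_PER_SCENARIO)
print(f"{100 * rate:.2f}%")
```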
4.3.2. Case Study B: detecting attacks after topology changes

In this case, to evaluate the robustness of the proposed method in dealing with topology changes, false data are injected into the power system after a contingency. Contingencies can occur for many reasons, such as planned maintenance, cyber-attacks, etc. (Zonouz et al., 2014). An outage of line 2–5 is simulated as the contingency. This line is depicted in Fig. 3a with the red line. For this simulation, it is assumed that the system works well until 1 p.m. on the seventh day. The line outage occurs at this moment and the system works under that contingency until 9 p.m. However, the adversary manipulates the measurements for two hours, from 6 p.m. to 8 p.m., and replaces them with falsified measurements. From 9 p.m. to midnight, the system recovers from the contingency and continues its normal operation. This means we have 24 attack samples for each state variable under that contingency. Fig. 8 shows the data with and without the proposed method for the state variable θ2 for this simulation. Fig. 8a and b show the result of applying PCA to the system state vectors. Fig. 8c and d show the effect of applying the proposed method.
Table 6 presents the results of DBSCAN and the classification algorithms on this dataset. As is clear from the results, the pre-constructed classification models were not successful in detecting and localizing attacks after topology changes, since the underlying data distribution changes and old observations become irrelevant to the new ones. Table 6 also shows that DBSCAN was unsuccessful in detecting most of the attacks on θ3. However, Fig. 9 shows that visualization, relying on the monitoring abilities of the grid operators, can outperform the clustering method in this attack scenario. Table 7 shows the detection rate of each method for this case study.

Fig. 6. Results for decremental FDI attacks.

4.3.3. Case Study C: detecting attacks after integration of renewable energy sources (RESs)

In recent decades, utilization of RESs (wind, solar, etc.) in the smart grid system has grown rapidly (Singh & Sharma, 2017). Incorporation of RESs increases uncertainty and variability. This variability leads to significant state shifts in power system operations. Wind is one of the RESs and is used by wind turbines to generate electricity. In this case study, to test the performance of the proposed method in managing the integration of RESs into the power system, false data are injected into the system after the integration of a wind farm. For this simulation, it is assumed that the system works normally until day 6. On the sixth day, a wind farm is added to the system to provide at most 24 MW of the load on bus 4, as shown in Fig. 3a. The wind farm includes 24 turbines. All wind turbines installed in the system are of the same type, each with a rated power of 1 MW (1000 kW), a rated speed of 14 m/s, a cut-in speed of 4 m/s and a cut-out speed of 25 m/s, as in (Zakariazadeh, Jadid, & Siano, 2015). The real hourly wind speed data were taken from the Wunderground weather forecast website for Niagara Falls, NY (The Weather Company, 2017). The output power of a wind turbine is calculated by (8).
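$$
P(v_f) =
\begin{cases}
0, & 0 \le v_f \le v_{ci} \\
P_{rated} \times \dfrac{v_f - v_{ci}}{v_r - v_{ci}}, & v_{ci} \le v_f \le v_r \\
P_{rated}, & v_r \le v_f \le v_{co} \\
0, & v_{co} \le v_f
\end{cases}
\tag{8}
$$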
where v_f is the predicted wind speed; P_rated is the rated power of the wind turbine; v_ci, v_r and v_co are the cut-in speed, rated speed and cut-out speed of the wind turbine, respectively. The output power of the wind farm is modeled as a negative demand in MATPOWER. For this case, it is assumed that the adversary manipulates the measurements from 6 p.m. to 10 p.m. on the last day (seventh day). This means the attacks are orchestrated one day after installing the wind farm. Therefore, we have 48 attack samples for each attacking scenario. Fig. 10 shows the data with and without the proposed method for the state variable θ2 for this case study. Figs. 10a and b show the effect of applying PCA to the system states. Figs. 10c and d show the effect of applying the proposed method. Table 8 shows the results for this case study. The detection rate of each method is presented in Table 9.
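The power curve of Eq. (8) and the negative-demand modeling can be sketched as below, using the turbine parameters stated in the text (1000 kW rated power, 4 m/s cut-in, 14 m/s rated and 25 m/s cut-out speed, 24 turbines). The function names are hypothetical, and because the intervals in Eq. (8) overlap at their endpoints, the boundary handling here (zero output at exactly the cut-in speed, rated output at exactly the cut-out speed) is an assumption.

```python
def turbine_power_kw(v_f, p_rated=1000.0, v_ci=4.0, v_r=14.0, v_co=25.0):
    """Piecewise power curve of Eq. (8); speeds in m/s, power in kW."""
    if v_f <= v_ci or v_f > v_co:
        return 0.0                                   # below cut-in or above cut-out
    if v_f < v_r:
        return p_rated * (v_f - v_ci) / (v_r - v_ci)  # linear ramp
    return p_rated                                    # rated region

def wind_farm_demand_mw(v_f, n_turbines=24):
    """Farm output expressed as a negative demand in MW."""
    return -n_turbines * turbine_power_kw(v_f) / 1000.0

half_way = turbine_power_kw(9.0)       # midway between cut-in and rated speed
farm_at_rated = wind_farm_demand_mw(14.0)
```

At rated wind speed the farm offsets the full 24 MW, which matches the "at most 24 MW" stated for bus 4.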
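Table 6
Results of applying DBSCAN on PCs – Case B (after a line outage). Entries are detected attacks (detection %).

| Inject | Proposed method, Org × 0.94 | Proposed method, Org × 1.06 | SVM, Org × 0.94 | SVM, Org × 1.06 | MLP, Org × 0.94 | MLP, Org × 1.06 | KNN, Org × 0.94 | KNN, Org × 1.06 |
|---|---|---|---|---|---|---|---|---|
| θ2 | 22 (91.66%) | 19 (79.16%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ3 | 10 (41.66%) | 22 (91.66%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ4 | 21 (87.5%) | 24 (100%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ5 | 24 (100%) | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |
| θ6 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |
| θ7 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |
| θ8 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |
| θ9 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ10 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ11 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ12 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ13 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |
| θ14 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |
| θ2−7−13 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) |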
Table 7
Performance of methods – Case B (after a line outage).

| Method | Detection rate |
|---|---|
| SVM | 7.14% |
| MLP | 32.14% |
| KNN | 0% |
| Proposed method | 96.13% |
Fig. 7. Visualization of FDI attacks to θ3 – Original value × 1.06.

Fig. 8. Functionality of the proposed method in managing topology changes.

Table 8
Results of applying DBSCAN on PCs – Case C (after integration of a wind farm). Entries are detected attacks (detection %).

| Inject | Proposed method, Org × 0.94 | Proposed method, Org × 1.06 | SVM, Org × 0.94 | SVM, Org × 1.06 | MLP, Org × 0.94 | MLP, Org × 1.06 | KNN, Org × 0.94 | KNN, Org × 1.06 |
|---|---|---|---|---|---|---|---|---|
| θ2 | 14 (29.16%) | 48 (100%) | 25 (52.08%) | 0 (0%) | 12 (25%) | 22 (45.83%) | 0 (0%) | 0 (0%) |
| θ3 | 41 (85.41%) | 44 (91.66%) | 27 (56.25%) | 14 (29.16%) | 0 (0%) | 23 (47.91%) | 2 (4.16%) | 0 (0%) |
| θ4 | 17 (35.41%) | 40 (83.33%) | 39 (81.25%) | 48 (100%) | 32 (66.66%) | 0 (0%) | 0 (0%) | 12 (25%) |
| θ5 | 16 (33.33%) | 41 (85.41%) | 27 (56.25%) | 48 (100%) | 0 (0%) | 31 (64.58%) | 0 (0%) | 2 (4.16%) |
| θ6 | 48 (100%) | 46 (95.83%) | 37 (77.08%) | 36 (75%) | 0 (0%) | 48 (100%) | 18 (37.5%) | 16 (33.33%) |
| θ7 | 39 (81.25%) | 48 (100%) | 40 (83.33%) | 40 (83.33%) | 0 (0%) | 48 (100%) | 11 (22.91%) | 15 (31.25%) |
| θ8 | 39 (81.25%) | 48 (100%) | 31 (64.58%) | 31 (64.58%) | 0 (0%) | 48 (100%) | 5 (10.41%) | 5 (10.41%) |
| θ9 | 48 (100%) | 48 (100%) | 48 (100%) | 48 (100%) | 48 (100%) | 0 (0%) | 16 (33.33%) | 16 (33.33%) |
| θ10 | 48 (100%) | 48 (100%) | 44 (91.66%) | 43 (89.58%) | 0 (0%) | 48 (100%) | 10 (20.83%) | 16 (33.33%) |
| θ11 | 41 (85.41%) | 46 (95.83%) | 32 (66.66%) | 36 (75%) | 48 (100%) | 0 (0%) | 11 (22.91%) | 10 (20.83%) |
| θ12 | 45 (93.75%) | 45 (93.75%) | 32 (66.66%) | 36 (75%) | 48 (100%) | 0 (0%) | 10 (20.83%) | 15 (31.25%) |
| θ13 | 46 (95.83%) | 39 (81.25%) | 37 (77.08%) | 40 (83.33%) | 0 (0%) | 43 (89.58%) | 15 (31.25%) | 19 (39.58%) |
| θ14 | 46 (95.83%) | 48 (100%) | 29 (60.41%) | 34 (70.83%) | 0 (0%) | 48 (100%) | 10 (20.83%) | 9 (18.75%) |
| θ2−7−13 | 48 (100%) | 48 (100%) | 48 (100%) | 48 (100%) | 0 (0%) | 48 (100%) | 24 (50%) | 24 (50%) |

Table 9
Performance of methods – Case C (after integration of a wind farm).

| Method | Detection rate |
|---|---|
| SVM | 74.25% |
| MLP | 44.27% |
| KNN | 21.65% |
| Proposed method | 87.27% |

4.4. Attack detection results for the IEEE 9 bus system

4.4.1. Case Study A: detecting attacks without topology changes

In this case, it is assumed that the system continues its normal operation until the seventh day. The adversary attacks the measurements of the last day and injects false data into all of them. This implies that there are 288 attack samples for each attacking scenario. Table 10 presents the results of DBSCAN and the classification models built for this case study. As in Section 4.3.1, the models were built using 10-fold cross validation. Table 11 shows the detection rates calculated from Table 10.

Table 11
Performance of methods – Case A.

| Method | Detection rate |
|---|---|
| SVM | 89.4% |
| MLP | 91.51% |
| KNN | 91.35% |
| Proposed method | 92.88% |

4.4.2. Case Study B: detecting attacks after topology changes

For this simulation, an outage of line 8–9 is used as the only contingency. This line is depicted in Fig. 3b with the red line. It is assumed that there is no contingency until 1 p.m. on the seventh day. At 1 p.m., the line is removed and the system works under this situation until 9 p.m. However, the measurements between 6 p.m. and 8 p.m. are manipulated by the attacker. From 9 p.m. to midnight, the system recovers from the contingency and continues its normal operation. Therefore, there are 24 attack samples. Table 12 presents the results of DBSCAN and of testing the pre-constructed classification models on this dataset. As with the results on the IEEE 14 bus system, the pre-constructed models are not able to predict samples correctly after topology changes. Table 13 shows the detection rate of each method for this case study.

4.4.3. Case Study C: detecting attacks after integration of renewable energy sources (RESs)

For this case, it is assumed that a wind farm with a nominal generation capacity of 25 MW is integrated into the system to provide at most 25 MW of the load on bus 7, as shown in Fig. 3b. This wind farm is introduced to the system on the sixth day. As an attack scenario for this case, the measurements between 6 p.m. and 10 p.m. on the last day are replaced with attacked ones. In other words, the attack is orchestrated one day after installing the wind farm and there are 48 attack samples for each attacking scenario. Table 14 shows the results for this case study. The detection rate of each method is presented in Table 15.

4.5. Locating attacks

We treat the problem of locating the falsified system state, after detecting an attack, as finding an outlier. Therefore, the FCM algorithm, which is widely used in the outlier detection context (Hodge & Austin, 2004), is applied to the attacked system state vector to cluster the system states into attacked and normal states. Table 16 presents the results of locating attacks for the IEEE 14 bus system. The values in this table indicate the number of correctly located attacks. Correct locating means that the manipulated state variable is alone in one cluster and all the other state variables are in another cluster. Fig. 11a shows the cluster membership grades of the state vector when θ12 is maliciously altered by injection amount 94%. As we can see, θ12 is in one cluster and the other state variables [θ2, ..., θ11, θ13, θ14] are in another (second) cluster. θ12 belongs to the first cluster with membership grade 1. This is called correct locating. On the other hand, Fig. 11b shows the cluster membership grades of a state vector that is classified as an unsuccessful locating. It is clear that [θ3, θ12] are in one group and the other state variables are in another cluster. The grid operator can easily notice that θ12 is the attacked state variable, as its membership grade is 1. In contrast, the operator can observe that the membership grade of θ3 (0.56) is not high enough to be classified as an attack. Therefore, FCM is able to locate the manipulated state variable in all attack scenarios, but in some cases it misclassifies a normal state as an attack along with the real manipulated state variable. However, these false alarms can be alleviated by looking at the membership grades. Table 17 shows the test results for locating attacks for the IEEE 9 bus system.
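The locating step can be illustrated with a small two-cluster fuzzy c-means on scalar values. This is a sketch under stated assumptions: the text does not specify which per-state quantity is clustered, so the input here is a hypothetical deviation score per state variable; the fuzzifier `m=2`, the deterministic initial centers and the 0.5 membership threshold are also illustrative choices.

```python
def memberships(values, centers, m=2.0):
    """Fuzzy membership grade of each value in each cluster."""
    grades = []
    exponent = 2.0 / (m - 1.0)
    for x in values:
        d = [abs(x - c) for c in centers]
        if 0.0 in d:                          # value sits exactly on a center
            hits = d.count(0.0)
            grades.append([1.0 / hits if dk == 0.0 else 0.0 for dk in d])
        else:
            grades.append([1.0 / sum((dk / dj) ** exponent for dj in d) for dk in d])
    return grades

def fuzzy_c_means_1d(values, m=2.0, iters=100, tol=1e-9):
    """Two-cluster fuzzy c-means on scalars; returns (centers, membership grades)."""
    centers = [min(values), max(values)]      # deterministic start: low and high
    for _ in range(iters):
        u = memberships(values, centers, m)
        new_centers = [
            sum(u[i][k] ** m * x for i, x in enumerate(values))
            / sum(u[i][k] ** m for i in range(len(values)))
            for k in range(2)
        ]
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(values, centers, m)

# Hypothetical per-state deviation scores for theta_2..theta_14 (theta_12 altered).
names = [f"theta_{k}" for k in range(2, 15)]
deviation = [0.010, 0.020, 0.015, 0.010, 0.020, 0.012, 0.018,
             0.011, 0.016, 0.013, 0.900, 0.014, 0.017]

centers, grades = fuzzy_c_means_1d(deviation)
high = centers.index(max(centers))            # cluster with the larger center
located = [n for n, g in zip(names, grades) if g[high] > 0.5]
```

Inspecting `grades` rather than only the hard assignment mirrors the point made above: a state with a borderline grade (such as 0.56) can be dismissed by the operator, while a grade close to 1 marks the manipulated variable.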
Table 10
Results of applying DBSCAN on PCs – Case A. Entries are detected attacks (detection %).

| Inject | Proposed method, Org × 0.94 | Proposed method, Org × 1.06 | SVM, Org × 0.94 | SVM, Org × 1.06 | MLP, Org × 0.94 | MLP, Org × 1.06 | KNN, Org × 0.94 | KNN, Org × 1.06 |
|---|---|---|---|---|---|---|---|---|
| θ2 | 286 (99.30%) | 286 (99.30%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 286 (99.30%) |
| θ3 | 286 (99.30%) | 286 (99.30%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) |
| θ4 | 232 (80.55%) | 232 (80.55%) | 204 (70.83%) | 190 (65.97%) | 212 (73.61%) | 210 (72.91%) | 186 (64.58%) | 186 (64.58%) |
| θ5 | 218 (75.69%) | 218 (75.69%) | 179 (62.15%) | 175 (60.76%) | 203 (70.48%) | 204 (70.83%) | 218 (75.69%) | 218 (75.69%) |
| θ6 | 286 (99.30%) | 286 (99.30%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) |
| θ7 | 286 (99.30%) | 286 (99.30%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) |
| θ8 | 286 (99.30%) | 286 (99.30%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) |
| θ9 | 241 (83.68%) | 240 (83.33%) | 214 (74.30%) | 217 (75.34%) | 225 (78.12%) | 234 (81.25%) | 239 (82.98%) | 239 (82.98%) |
| θ2−4 | 286 (99.30%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 288 (100%) | 286 (99.30%) | 286 (99.30%) |
Table 12
Results of applying DBSCAN on PCs – Case B (after a line outage). Entries are detected attacks (detection %).

| Inject | Proposed method, Org × 0.94 | Proposed method, Org × 1.06 | SVM, Org × 0.94 | SVM, Org × 1.06 | MLP, Org × 0.94 | MLP, Org × 1.06 | KNN, Org × 0.94 | KNN, Org × 1.06 |
|---|---|---|---|---|---|---|---|---|
| θ2 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) | 24 (100%) | 0 (0%) |
| θ3 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) |
| θ4 | 11 (45.83%) | 17 (70.83%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ5 | 24 (100%) | 23 (95.83%) | 24 (100%) | 23 (95.83%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) |
| θ6 | 24 (100%) | 24 (100%) | 0 (0%) | 24 (100%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) |
| θ7 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 4 (16.66%) |
| θ8 | 24 (100%) | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 24 (100%) | 0 (0%) |
| θ9 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
| θ2−4 | 24 (100%) | 24 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) |
Table 15
Performance of methods – Case C (after integration of a wind farm).

| Method | Detection rate |
|---|---|
| SVM | 47.80% |
| MLP | 7.63% |
| KNN | 42.12% |
| Proposed method | 69.79% |
Fig. 9. FDI attacks to θ3 – After a line outage (94%).

Table 13
Performance of methods – Case B (after a line outage).

| Method | Detection rate |
|---|---|
| SVM | 27.54% |
| MLP | 5.55% |
| KNN | 28.7% |
| Proposed method | 95.13% |

Table 16
Summary of test results for locating attacks – IEEE 14 bus system. Entries are located attacks (%).

| Inject | Original value × 0.94 | Original value × 1.06 |
|---|---|---|
| θ2 | 146 (73%) | 159 (79.5%) |
| θ3 | 200 (100%) | 194 (97%) |
| θ4 | 159 (79.5%) | 189 (94.5%) |
| θ5 | 155 (77.5%) | 190 (95%) |
| θ6 | 144 (72%) | 197 (98.5%) |
| θ7 | 136 (68%) | 197 (98.5%) |
| θ8 | 136 (68%) | 196 (98%) |
| θ9 | 130 (65%) | 199 (99.5%) |
| θ10 | 160 (80%) | 198 (99%) |
| θ11 | 166 (83%) | 199 (99.5%) |
| θ12 | 145 (72.5%) | 195 (97.5%) |
| θ13 | 148 (74%) | 193 (96.5%) |
| θ14 | 137 (68.5%) | 198 (99%) |
| θ2−7−13 | 162 (81%) | 163 (81.5%) |
Table 14
Results of applying DBSCAN on PCs – Case C (after integration of a wind farm). Entries are detected attacks (detection %).

| Inject | Proposed method, Org × 0.94 | Proposed method, Org × 1.06 | SVM, Org × 0.94 | SVM, Org × 1.06 | MLP, Org × 0.94 | MLP, Org × 1.06 | KNN, Org × 0.94 | KNN, Org × 1.06 |
|---|---|---|---|---|---|---|---|---|
| θ2 | 48 (100%) | 46 (95.83%) | 48 (100%) | 48 (100%) | 18 (37.5%) | 0 (0%) | 36 (75%) | 43 (89.58%) |
| θ3 | 37 (77.08%) | 46 (95.83%) | 48 (100%) | 48 (100%) | 0 (0%) | 0 (0%) | 33 (68.75%) | 39 (81.25%) |
| θ4 | 8 (16.66%) | 39 (81.25%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 29 (60.41%) |
| θ5 | 8 (16.66%) | 26 (54.16%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (2.08%) |
| θ6 | 34 (70.83%) | 41 (85.41%) | 25 (52.08%) | 48 (100%) | 0 (0%) | 0 (0%) | 17 (35.41%) | 39 (81.25%) |
| θ7 | 26 (54.16%) | 35 (72.91%) | 48 (100%) | 0 (0%) | 0 (0%) | 0 (0%) | 39 (81.25%) | 0 (0%) |
| θ8 | 33 (68.75%) | 39 (81.25%) | 40 (83.33%) | 48 (100%) | 0 (0%) | 0 (0%) | 39 (81.25%) | 40 (83.33%) |
| θ9 | 0 (0%) | 43 (89.58%) | 4 (8.33%) | 0 (0%) | 48 (100%) | 0 (0%) | 1 (2.08%) | 0 (0%) |
| θ2−4 | 48 (100%) | 46 (95.83%) | 0 (0%) | 8 (16.66%) | 0 (0%) | 0 (0%) | 8 (16.66%) | 0 (0%) |

Fig. 10. Functionality of the proposed method in dealing with integration of RESs.

Fig. 11. Determining the set of under attack nodes. (a) Successful locating of the attacked state variable 12. (b) Unsuccessful locating of the attacked state variable 12. Both panels plot the cluster membership grade against the bus number (2 to 14).

Table 17
Summary of test results for locating attacks – IEEE 9 bus system. Entries are located attacks (%).

| Inject | Original value × 0.94 | Original value × 1.06 |
|---|---|---|
| θ2 | 281 (97.56%) | 279 (96.87%) |
| θ3 | 287 (99.65%) | 282 (97.91%) |
| θ4 | 263 (91.31%) | 265 (92.01%) |
| θ5 | 248 (86.11%) | 261 (90.62%) |
| θ6 | 269 (93.40%) | 283 (98.26%) |
| θ7 | 287 (99.65%) | 287 (99.65%) |
| θ8 | 287 (99.65%) | 288 (100%) |
| θ9 | 251 (87.15%) | 255 (88.54%) |
| θ2−4 | 225 (78.12%) | 223 (77.43%) |

4.6. Discussion

Fig. 12 provides a summary of the numerical results. Experimental results show the superiority of the proposed method over the supervised methods in detecting attacks after topology changes and integration of distributed generation. Note that the presented detection rate for the proposed method is the result of applying DBSCAN to the generated PCs. We believe that the presented detection rate is the minimum value that can be acquired. This is because, in some cases, attack and normal patterns are distinguishable in the visualized mode but the clustering algorithm fails to cluster the data correctly. The reason is that clustering algorithms need parameter tuning, and a fixed value for a parameter might not be proper for all situations, whereas visualization does not require any parameter tuning. However, in the future grid, frequent topological changes and intermittent generation (wind and solar farms) can lead to considerable state variation in grid operations (Weng et al., 2016). Hence, any method deployed against FDI attacks should be able to address dynamically changing system configurations.
4.7. Threats to validity

Two probable threats to the validity of this study are discussed here. The first threat arises when the adversary has been injecting false data into the system for a long time. In other words, there is no clean data when the method is deployed. This threat also applies to other detection-based methods. However, we believe that this is rarely practical in the real world. Moreover, it is worth noting that our proposed method is less vulnerable to this threat than other detection-based methods. This is because the proposed method does not need supervision and adapts itself to the environment, whereas other detection-based methods depend more heavily on prior knowledge and historical data and will not be able to predict new samples correctly if they are developed from attacked measurement samples.

The second threat is the discarding of some PCs in the proposed method. As described previously, only the first two PCs are employed for visualization and the other PCs are discarded. Although the third PC can be used in 3-D visualization, in some cases (datasets), the information in the remaining PCs will be omitted. This information might be helpful in distinguishing between attack and normal patterns. However, the different case studies in this research showed that the proposed method has high accuracy in different situations (attack scenarios) and is also superior to classification algorithms.
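One way to gauge the second threat is to measure how much variance the first two PCs retain. The sketch below is a dependency-free illustration, not part of the paper's method: it estimates the largest eigenvalues of the sample covariance matrix by power iteration with deflation and reports their share of the total variance. The data rows, the function names and the fixed start vector are assumptions made for the example.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def top_eigenvalues(matrix, k, iters=500):
    """Largest k eigenvalues of a symmetric PSD matrix (power iteration + deflation)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    eigenvalues = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n          # simple start vector (a simplification)
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(n)) for i in range(n))
        eigenvalues.append(lam)
        # Deflate: remove the component that was just found.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return eigenvalues

def retained_variance(rows, k=2):
    """Fraction of total variance captured by the first k principal components."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))   # trace = total variance
    return sum(top_eigenvalues(cov, k)) / total

# Hypothetical 3-feature dataset in which the third direction carries little variance.
rows = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.5, 0.0),
        (0.0, -0.5, 0.0), (0.0, 0.0, 0.1), (0.0, 0.0, -0.1)]
ratio = retained_variance(rows, k=2)
```

A ratio close to 1 suggests that little information is lost by plotting only two PCs, whereas a low ratio would indicate that the discarded components may hold patterns relevant to separating attack and normal samples.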
Fig. 12. Detection rates over case studies.
5. Conclusion and future work

In this paper, we proposed a visualization-based anomaly detection method to detect false data injection attacks. We described a detection methodology that leverages the moments of a distribution to quantify the probability distributions of the system state vectors. The dimensionality of the newly generated measures was reduced (from 7 to 2) using principal component analysis, and then the obtained principal components were visualized. Visualization helps to reveal data patterns and to gain valuable insights into legitimate data patterns in order to detect anomalous ones. As a further verification of the performance of the proposed method without relying on the analytical capabilities of humans, DBSCAN was used to cluster the reduced dataset into attack and normal groups. To determine the set of nodes that are under attack, Fuzzy C-means was deployed, and it was successful to some extent. The proposed unsupervised method has been tested using different attack scenarios. Test results show that the developed method can accurately detect and localize most of the attacks. The proposed method is also capable of detecting attacks after contingencies and integration of wind farms. In case study C (incorporation of distributed generation), we found it somewhat difficult to detect false data injection attacks on a few state variables. Detecting these attack scenarios is a clear direction for future research.

In the future, we also wish to evaluate the performance of the proposed method in detecting other types of data integrity attacks in different power system networks. To this end, we first need to classify data integrity attacks and then examine attack and normal patterns using the proposed method. It may be necessary to add some new statistical features to detect these attacks. In addition, we intend to adapt and extend our unsupervised method for detecting attacks in the smart grid distribution network. It is not easy to use supervised learning methods to detect attacks in the distribution grid because of its random nature. There, we aim to understand any potential limitations of the proposed method and try to address them. Carrying out FDI attacks on other processes and evaluating the detection accuracy of the proposed method accordingly is another future research direction. Petrochemical processes and processes in the oil, gas, and food industries are some examples. These processes are usually controlled and monitored by a combination of Supervisory Control and Data Acquisition (SCADA) systems, Distributed Control Systems (DCS), and Process Control Systems (PCS), so detecting FDI attacks on them can be a challenging issue.

Appendix A

The following figures are related to the attack scenarios: (1) incremental – Case A; (2) decremental – Case B; (3) incremental – Case B; (4) decremental – Case C; and (5) incremental – Case C (Figs. 13–17).
Fig. 13. Results for incremental FDI attacks – Case A.
Fig. 14. Results for decremental FDI attacks – Case B.
Fig. 15. Results for incremental FDI attacks – Case B.
Fig. 16. Results for decremental FDI attacks – Case C.
Fig. 17. Results for incremental FDI attacks – Case C.
References

Abdallah, A., & Shen, X. S. (2016). Efficient prevention technique for false data injection attack in smart grid. Paper presented at the IEEE international conference on communications (ICC).
Abur, A., & Gomez Exposito, A. (2004). Power system state estimation: Theory and implementation. New York: CRC press.
Ashok, A., Govindarasu, M., & Ajjarapu, V. (2016). Online detection of stealthy false data injection attacks in power system state estimation. IEEE Transactions on Smart Grid, PP(99), 1–1.
Bezdek, J. C., Ehrlich, R., & Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2), 191–203.
Bi, S., & Zhang, Y. J. (2014). Graphical methods for defense against false-data injection attacks on power system state estimation. IEEE Transactions on Smart Grid, 5(3), 1216–1227.
Bobba, R. B., Rogers, K. M., Wang, Q., Khurana, H., Nahrstedt, K., & Overbye, T. J. (2010). Detecting false data injection attacks on DC state estimation. Paper presented at the workshop on secure control systems (SCS), (April 12).
Celenk, M., Conley, T., Willis, J., & Graham, J. (2010). Predictive network anomaly detection and visualization. IEEE Transactions on Information Forensics and Security, 5(2), 288–299.
Chaojun, G., Jirutitijaroen, P., & Motani, M. (2015). Detecting false data injection attacks in AC state estimation. IEEE Transactions on Smart Grid, 6(5), 2476–2483.
Corchado, E., & Corchado, Á. H. (2011). Neural visualization of network traffic data for intrusion detection. Applied Soft Computing, 11(2), 2042–2056.
Dagle, J. (2009). North American SynchroPhasor initiative - an update of progress. Paper presented at the system sciences, 2009. HICSS ’09. 42nd Hawaii international conference on.
Esmalifalak, M., Liu, L., Nguyen, N., Zheng, R., & Han, Z. (2014). Detecting stealthy false data injection using machine learning in smart grid. IEEE Systems Journal, PP(99), 1–9.
Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Paper presented at the proceedings of the second international conference on knowledge discovery and data mining (KDD-96).
Fang, X., Misra, S., Xue, G., & Yang, D. (2012). Smart grid — the new and improved power grid: A survey. IEEE Communications Surveys & Tutorials, 14(4), 944–980.
Giani, A., Bitar, E., Garcia, M., McQueen, M., Khargonekar, P., & Poolla, K. (2013). Smart grid data integrity attacks. IEEE Transactions on Smart Grid, 4(3), 1244–1253. doi:10.1109/TSG.2013.2245155.
Goodall, J. R. (2006). Visualizing network traffic for intrusion detection. Paper presented at the proceedings from the sixth annual IEEE SMC information assurance workshop.
Grainger, J. J., & William, D. S. (1994). Power system analysis. New York: McGraw-Hill.
Hippel, P. V. (2005). Mean, median, and skew: Correcting a textbook rule. Journal of Statistics Education, 13(2), n2.
Hodge, V., & Austin, J. (2004). A survey of outlier detection methodologies. Artificial Intelligence Review, 22(2), 85–126.
Ishii, Y., & Chakhchoukh, H. (2015). Coordinated cyber-attacks on the measurement function in hybrid state estimation. IEEE Transactions on Power Systems, 30(5), 2487–2497.
Jolliffe, I. (2002 October). Principal component analysis (2nd ed.). New York, USA: Springer.
Kohonen, T. (1995). Learning vector quantization. Self-organizing maps (pp. 175–189). Berlin, Heidelberg: Springer Berlin Heidelberg.
Koike, H., Ohno, K., & Koizumi, K. (2005). Visualizing cyber attacks using IP matrix. Paper presented at the IEEE workshop on visualization for computer security.
Kriegel, H. P., Kröger, P., Sander, J., & Zimek, A. (2011). Density based clustering. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3), 231–240.
Liang, J., Sankar, L., & Kosut, O. (2016). Vulnerability analysis and consequences of false data injection attack on power system state estimation. IEEE Transactions on Power Systems, 31(5), 3864–3872.
Liu, Y., Ning, P., & Reiter, M. K. (2011). False data injection attacks against state estimation in electric power grids. ACM Transactions on Information Systems Security, 14(1), 1–33.
Luo, B., & Xia, J. (2014). A novel intrusion detection system based on feature generation with visualization strategy. Expert Systems with Applications, 41(9), 4139–4147.
Manandhar, K., Cao, X., Hu, F., & Liu, Y. (2014). Detection of faults and attacks including false data injection attack in smart grid using kalman filter. IEEE Transactions on Control of Network Systems, 1(4), 370–379.
Mohsenian-Rad, H., & Rahman, M. A. (2013). False data injection attacks against nonlinear state estimation in smart power grids. Paper presented at the IEEE power & energy society general meeting.
Moslemi, R., Mesbahi, A., & Mohammadpour Velni, J. (2017). A fast, decentralized covariance selection-based approach to detect cyber attacks in smart grids. IEEE Transactions on Smart Grid, PP(99), 1–1. doi:10.1109/TSG.2017.2675960.
Mousavian, S., Valenzuela, J., & Wang, J. (2015). A probabilistic risk mitigation model for cyber-attacks to PMU networks. IEEE Transactions on Power Systems, 30(1), 156–165. doi:10.1109/TPWRS.2014.2320230.
Murtagh, F., & Contreras, P. (2012). Algorithms for hierarchical clustering: An overview. WIREs Data Mining Knowledge Discovery, 2(1), 86–97.
Ntalampiras, S. (2016a). Automatic identification of integrity attacks in cyber-physical systems. Expert Systems with Applications, 58(1), 164–173.
Ntalampiras, S. (2016b). Fault diagnosis for smart grids in pragmatic conditions. IEEE Transactions on Smart Grid, PP(99), 1. doi:10.1109/TSG.2016.2604120.
NYISO. (2016). Load data profile from http://www.nyiso.com.
Ozay, M., Esnaola, I., Vural, F. T., Kulkarni, S. R., & Poor, H. V. (2016). Machine learning methods for attack detection in the smart grid. IEEE Transactions on Neural Networks and Learning Systems, 27(8), 1773–1786.
Rampurkar, V., Pentayya, P., Mangalvedekar, H. A., & Kazi, F. (2016). Cascading failure analysis for indian power grid. IEEE Transactions on Smart Grid, 7(4), 1951–1960.
Ramsey, J., Newton, H., & Harvill, J. (2002). The elements of statistics: With applications to economics and the social sciences. Thomson Learning.
Sedghi, H., & Jonckheere, E. (2015). Statistical structure learning to ensure data integrity in smart grid. IEEE Transactions on Smart Grid, 6(4), 1924–1933.
Singh, B., & Sharma, J. (2017). A review on distributed generation planning. Renewable and Sustainable Energy Reviews, 76, 529–544. http://doi.org/10.1016/j.rser.2017.03.034.
The Weather Company, L. (Producer). (2017). Weather forecast & reports - long range & local. Retrieved from https://www.wunderground.com/.
Vukovic, O., Sou, K. C., Dan, G., & Sandberg, H. (2012). Network-aware mitigation of data integrity attack on power system state estimation. IEEE JSAC, 30(6), 1108–1118.
Weng, Y., Negi, R., Faloutsos, C., & Ilić, M. D. (2016). Robust data-driven state estimation for smart grid. IEEE Transactions on Smart Grid, PP(99), 1–12.
Yan, J., Zhu, Y., He, H., & Sun, Y. (2013). Multi-contingency cascading analysis of smart grid based on self-organizing map. IEEE Transactions on Information Forensics and Security, 8(4), 646–656.
Yang, Q., Yang, J., Yu, W., An, D., Zhang, N., & Zhao, W. (2014). On false data-injection attacks against power system state estimation: modeling and countermeasures. IEEE Transactions on Parallel and Distributed Systems, 25(3), 717–729.
Zakariazadeh, A., Jadid, S., & Siano, P. (2015). Integrated operation of electric vehicles and renewable generation in a smart distribution system. Energy Conversion and Management, 89(1), 99–110.
Zimmerman, R. D., Murillo-Sánchez, C. E., & Thomas, R. J. (2011). MATPOWER: Steady-state operations, planning, and analysis tools. IEEE Trans. Power Syst, 26(1), 12–19.
Zonouz, S., Davis, C. M., Davis, K. R., Berthier, R., Bobba, R. B., & Sanders, W. H. (2014). SOCCA: A security-oriented cyber-physical contingency analysis in power infrastructures. IEEE Transactions on Smart Grid, 5(1), 3–13.
Zonouz, S., & Haghani, P. (2013). Cyber-physical security metric inference in smart grid critical infrastructures based on system administrators’ responsive behavior. Computers & Security, 39, 190–200.