Identifying household electricity consumption patterns: A case study of Kunshan, China



Renewable and Sustainable Energy Reviews 91 (2018) 861–868

Contents lists available at ScienceDirect

Renewable and Sustainable Energy Reviews journal homepage: www.elsevier.com/locate/rser





Ting Yang, Minglun Ren, Kaile Zhou

School of Management, Hefei University of Technology, Hefei 230009, China

ARTICLE INFO

Keywords: Electricity consumption patterns; Load profiling; Smart energy management; Case study; Smart grid

ABSTRACT

A case study of residential electricity consumption patterns mining and abnormal user identification using hierarchical clustering is presented in this paper. First, based on a brief introduction of hierarchical clustering, a process model and the specific steps of electricity consumption patterns mining in smart grid environment are proposed. Then, a case study using the daily electricity consumption data of 300 residential users in an eastern city of China, Kunshan, from November 16, 2014 to December 16, 2014, is presented. Through the implementation of hierarchical clustering, 9 abnormal users and 4 types of monthly electricity consumption patterns are successfully identified. The results show that most residential users in Kunshan city, nearly 81%, have a similar monthly electricity consumption pattern. Their average daily electricity consumption is about 7.73 kWh in the early winter with small fluctuations. Also, their daily electricity consumption is significantly associated with the temperature changes. However, it is worth noting that the special electricity consumption patterns of a small proportion of electricity users cannot be ignored, which is of great significance for the planning, operation, policy formulation and decision-making of smart grid.

Corresponding author. E-mail address: [email protected] (T. Yang).

https://doi.org/10.1016/j.rser.2018.04.037
Received 13 March 2017; Received in revised form 16 October 2017; Accepted 14 April 2018
1364-0321/© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

In the Internet of Things (IoT) environment [1], with the wide application of Radio Frequency Identification (RFID), infrared sensors, Global Positioning System (GPS) and many other advanced sensing devices, data on the state, movement and operation of objects can be collected in near real time. Based on high-speed data processing and efficient data mining, intelligent monitoring, identification, tracking, positioning, control and management can then be achieved in the IoT environment [2–4]. The smart grid [5–7] is an important application of IoT in the electric power sector. It takes advantage of advanced sensing, measuring and control technologies, as well as two-way high-speed communication networks and agile decision-support systems. Thus, the safe, reliable and stable operation of the power system and its cost-effective and environmentally friendly goals can be realized.

Advanced Metering Infrastructure (AMI) [8] is an advanced data collection and processing system of the smart grid. It is composed of smart meters installed on the user side, a metering data management system located in power companies, and the communication system that connects them. The smart grid uses the AMI system to collect, measure, and store the consumers' behavioral data. Analysis and applications can then be conducted with the user-generated data. For example, electricity consumers can be divided into groups with different electricity consumption behavioral characteristics using clustering algorithms [9–11]. The electricity consumption patterns of each group of consumers can then be extracted from the data mining results, which are of great importance for the optimal operation of the power system and the development of marketing strategies [12–14]. In addition, data preprocessing techniques can be used for bad data identification [15], and supervised classification algorithms can be used to assign a given electricity user to the corresponding group [16]. A lot of valuable knowledge can be discovered from the users' electricity consumption data by using advanced data analysis techniques. This knowledge plays an important role in promoting the optimal operation and enhancing the reliability of power systems, and in achieving personalized electricity consumption and smart energy management.

There have been considerable research efforts on electric power load data analysis and consumer segmentation [17–21]. However, most of the existing studies have focused on daily load profiles, and few have paid attention to the monthly electricity consumption patterns of residential users using the daily electricity consumption data within a month. Monthly electricity consumption patterns are different from the daily load patterns extracted from hourly or 15-min load measurements. They are important for medium- and long-term power system prediction and decision-making. In addition, some studies focused on complex data mining methods, which may significantly lower the efficiency of big data processing in a smart grid environment. Moreover, China is the largest developing country and one of the world's largest economies. The residential electricity users in China may have different electricity consumption patterns from the consumers in other countries.

Outliers and noisy data are prevalent in the complex smart grid environment. Abnormal electricity use information may be either real data generated by consumers or misleading data caused by system failures and external environmental influences. It is therefore necessary for power companies to identify the consumers with abnormal electricity use behaviors. Specifically, consumers that have abnormal electricity consumption patterns are regarded as abnormal users in this study. The detection and identification of abnormal users can support many decisions of a power distribution company, including the prediction of power requirement at a specific time and the development of demand side management (DSM) strategies.

The objective of this study is to explore the monthly electricity consumption patterns of residential users and to identify abnormal users using the hierarchical clustering method, with daily electricity consumption data of residential users in Kunshan City, China. The monthly electricity consumption characteristics of different groups are extracted and the abnormal users are identified. The results of this study can be used for many DSM tasks in the smart grid. For instance, the clustering results reveal that different groups of users have different volatility in their electricity consumption profiles. Users that have high volatility in electricity consumption within a period are more likely to respond to price- and incentive-based Demand Response (DR) strategies. The results can also help the power distribution company to develop targeted DR strategies or predict electricity demand, according to the amount, trend and variance of typical electricity consumption profiles.

The remainder of this paper is organized as follows. Section 2 presents a systematic review of related works. Section 3 introduces the hierarchical clustering method, and then proposes a process model and the specific steps for exploring electricity consumption patterns. Results and discussions are presented in Section 4. Finally, Section 5 provides the conclusions.

2. Related works

In this section, we provide a systematic review of the related research works on household electricity consumption patterns from three dimensions, namely data, methods and applications.

2.1. Different data sources for load profiling

In the smart grid environment, many different sources and types of data have been collected and used for identifying consumption patterns and analyzing user behavior. The publicly available Irish smart meter data from the smart metering trial carried out by the Commission for Energy Regulation (CER) in Ireland has been widely used in many relevant studies [22,23]. Rhodes et al. [18] presented a clustering analysis of electricity use data from 103 homes in a city of the USA at seasonally-resolved timescales, and discussed the drivers for variations in their electricity use. To reduce the amount of data stored for each electricity consumer, Carpaneto et al. [21] provided a frequency-domain data definition for consumer classification. High-resolution residential electricity consumption data for detached houses in Sweden were used to fit the probability distributions in the research of Munkhammar et al. [24]. Hopf et al. [25] used electricity consumption time series recorded in 30-min intervals for household classification. Räsänen and Kolehmainen [26] focused on the time series clustering of electricity use load curves using real hourly measured data for 1035 customers during 84 days. Pillai et al. [27] investigated the generation of load profiles using publicly available load and weather data. Based on the annual load profiles, Zhong and Tam [28] studied the extraction of characteristic attributes in frequency domain (CAFD) and the subsequent load classification. Hopf et al. [29] used the annual electricity consumption data over one, three and five years respectively, and the customers' addresses, to infer household electricity consumption characteristics. Kwac et al. [30] investigated household electricity segmentation using 66,434,179 electricity consumption load profiles of Pacific Gas and Electric Company (PG&E) residential consumers at 1 h intervals. Granell et al. [31] explored the impact of temporal resolution on load profile clustering, using a data set with an 8-s sampling rate and emulating lower-resolution data sets. Vercamer et al. [32] aimed to assign new electricity consumers to specific groups using the smart meter measurements in 15-min intervals of 6975 consumers in Belgium.

2.2. Clustering methods for electricity consumption patterns mining

Many clustering algorithms and mathematical models have been used for electricity consumption patterns mining. Fidalgo et al. [17] proposed a new clustering algorithm for load profiling, which related an obtained class profile to the consumers' billing data. Figueiredo et al. [19] proposed an electricity user characterization framework, which included a load profiling module and a classification module. Räsänen et al. [20] proposed a self-organizing maps (SOM) based data mining technique for the analysis of customer load profiles with hourly measured electricity use data of consumers in Finland. SOM was also used by Verdú et al. [33] to filter, classify, and extract patterns from distributor, commercializer, or customer electrical demand databases. For large electricity customers, Tsekouras et al. [34,35] proposed a pattern recognition methodology to determine the representative daily load profiles, and the method was applied to the medium voltage customers of the Greek power system. Notaristefano et al. [36] used symbolic aggregate approximation (SAX) to reduce the features and data size of electrical load consumption data. To overcome the problem that the number of clusters must be predetermined in many clustering algorithms, Granell et al. [37] presented a Bayesian non-parametric model for load profile grouping, and the so-called "Chinese restaurant process" method was used to solve the model based on the Dirichlet-multinomial distribution.

K-means is a popular and efficient clustering method, which has been widely used in load profile grouping. Considering DSM in a smart grid environment in Brazil, Macedo et al. [38] studied the typification of load curves using the k-means clustering method. Al-Wakeel et al. [39] also applied the k-means method to cluster domestic smart meter measurements. A combination of k-means++ and SOM was used for clustering the load curves of a university campus by Panapakidis et al. [40]. Kim et al. [41] developed a k-means based repeated clustering method to estimate the quarter-hourly load profiles of non-AMR customers. Fuzzy c-means (FCM) is also a popular clustering method that has been widely used for electricity profile clustering [42–44]. In addition, some other methods have been used for electricity consumption data analysis, such as Mixture Model Clustering and Markov Models [45], a deep-learning based framework [46], Ant Colony clustering [47], a support vector machine (SVM) based model [48], a feature-selection based supervised learning method [49] and a subspace projection based method [50].

2.3. Applications of electricity consumption patterns

There are also some research efforts focusing on the applications of electricity consumption patterns. Electricity consumption data clustering and pattern recognition can be used to support many decision-making processes of the power system [51–53]. DSM and DR are the most common application areas of electricity load profile grouping [38,54–56]. In addition, Quilumba et al. [57] investigated the improvement of system-level intraday load forecasting accuracy based on the clustering of electricity consumers with similar consumption patterns.


Some other research works also investigated the application of electricity consumption in load forecasting [58–60]. Abreu et al. [61] aimed at identifying habitual behavior by applying pattern recognition techniques to electricity consumption data. Albert and Rajagopal [62] proposed a framework to infer the occupancy status, characterized by magnitude, duration and variability, from the electricity consumption patterns. Orlando et al. [63] studied the sizing of an electric energy generation system based on generated electric load profiles.

Although there have been some research efforts on the data, methods and applications of electricity consumption pattern mining, few have focused on efficient clustering and the intuitive display of clustering results. Also, daily records of electricity consumption within a month have seldom been used in related research works. In addition, the recognition of abnormal users during the clustering process deserves more attention.

3. Methodology

3.1. Hierarchical clustering

Clustering [64–66] is an important topic in data mining, artificial intelligence and statistical machine learning. Through the various clustering algorithms, a given data set can be divided into a certain number of groups, such that the data objects in the same group are as similar as possible, and the data objects in different groups are as dissimilar as possible. Hierarchical clustering is an important clustering method in pattern recognition [67]. It is easy to operate, efficient and practical. Also, the number of clusters need not be predetermined, and the clustering results are easy to interpret. Currently, hierarchical clustering has been widely used in many areas, such as text mining [68], network analysis [69], and gene expression mining [70].

The basic idea of hierarchical clustering is bottom-up aggregation or top-down splitting. For aggregation type hierarchical clustering, each data object is initially regarded as a separate group. Then, according to the pairwise similarities calculated among all the data objects, the two data objects that have the maximum similarity are merged into the same group, and the new group is represented by their mean value. The pairwise similarities between the new group and the other groups are calculated again, and the two most similar groups are merged into one group. This is iterated until all the data objects are clustered into one group. In comparison, the steps of split type hierarchical clustering are the opposite of those of the aggregation type method. Namely, all the data objects together are seen as one group, and the splitting operation is conducted until each data object is in a group of its own. In practice, aggregation type hierarchical clustering is the more commonly used method.
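The aggregation procedure described above can be sketched in a few lines. This is only an illustrative sketch, not the implementation used in the case study: the function name `agglomerate`, the `n_groups` stopping parameter and the toy data are assumptions introduced here, Euclidean distance stands in for "similarity", and each merged group is represented by the mean of its members as the text describes.

```python
import math


def agglomerate(profiles, n_groups=1):
    """Bottom-up clustering: repeatedly merge the two closest groups.

    Returns (groups, merges). `groups` is a list of lists of row indices;
    `merges` records (members_a, members_b, distance) in merge order.
    """
    groups = [[i] for i in range(len(profiles))]       # every object starts alone
    centers = [list(map(float, p)) for p in profiles]  # mean profile of each group
    merges = []
    while len(groups) > max(n_groups, 1):
        # Find the pair of groups whose mean profiles are closest.
        d, a, b = min(
            (math.dist(centers[a], centers[b]), a, b)
            for a in range(len(groups))
            for b in range(a + 1, len(groups))
        )
        merges.append((groups[a], groups[b], d))
        merged = groups[a] + groups[b]
        # Represent the new group by the mean of all its members.
        centers[a] = [
            sum(profiles[i][k] for i in merged) / len(merged)
            for k in range(len(centers[a]))
        ]
        groups[a] = merged
        del groups[b], centers[b]
    return groups, merges


# Hypothetical two-day profiles for five users (not data from the study).
toy = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [8.1, 7.9], [30.0, 30.0]]
groups, merges = agglomerate(toy, n_groups=3)
print(groups)
```

Stopping at `n_groups=1` reproduces the full iteration "until all the data objects are clustered into one group"; the recorded merge distances are what a hierarchy tree would plot as merge heights.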

3.2. Process model

As a kind of IoT application, the smart grid has achieved the integration of energy flow, information flow and business flow. According to the flow of information and the process of data collection, transmission, storage and analysis, there are four layers in the smart grid architecture, namely the data acquisition layer, the data transmission layer, the data center layer, and the data analysis and application layer. The mining of electricity consumption patterns is mainly conducted at the top layer, the data analysis and application layer. This process mainly includes six steps, namely data preprocessing, clustering method selection, parameter settings, data clustering, validation and evaluation, and patterns extraction and representation. The process model of electricity consumption patterns mining is shown in Fig. 1.

Fig. 1. Process model of electricity consumption patterns mining in smart grid.

3.3. Specific steps

The mining of electricity consumption patterns mainly includes the following steps.

(1) Data preprocessing. Due to the deployment of smart meters in the smart grid, electricity use data collection is more efficient, and the data quality has been improved. However, some factors, such as extreme weather, hardware and software failures, and network communication failures, may still lead to noisy data, incomplete data or outliers. Therefore, data preprocessing is required before electricity consumption data are clustered.

(2) Clustering method selection. The efficient analysis and processing of electricity consumption data need advanced data mining techniques. The various clustering methods [71–73] are particularly necessary for the mining of electricity consumption patterns. In the smart grid environment, the scale of electricity use data is large, and the data arrive in near real time. Therefore, selecting an efficient clustering method is important for the mining of electricity consumption patterns.

(3) Parameter settings. It has been demonstrated that some parameters of a clustering method may have a significant effect on the clustering results, such as the initial cluster centers of k-means [74] and the fuzzifier parameter in fuzzy c-means (FCM) [75]. Therefore, determining appropriate parameter values for the selected clustering method is also necessary before clustering.

(4) Data clustering. Using the selected clustering method with the determined parameters, the preprocessed electricity use data can be clustered into several groups. Consumers that have similar electricity consumption patterns are placed in the same group.

(5) Validation and evaluation. For a given data set, different clustering methods may generate different results. Even the same method with different parameters may lead to different clustering results. Therefore, it is necessary to validate the results and to evaluate the partitions after clustering.

(6) Patterns extraction and representation. Different groups of electricity users are obtained after clustering. For each group, the cluster center is usually used to represent the electricity consumption pattern of consumers in that group. The cluster center contains some key characteristics of that user group. The extracted patterns and quantified characteristic indicators are important for the various decisions made in the smart grid.
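The six steps can be strung together as a small pipeline. The sketch below is a hedged illustration under several assumptions that are not in the paper: the function name `mine_patterns`, the preprocessing rule (drop users with missing or negative readings), the use of SciPy's hierarchical clustering routines, and the cophenetic correlation as the validation measure are all choices made here for illustration. The data are synthetic stand-ins.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist


def mine_patterns(daily_kwh, n_groups, method="average", metric="euclidean"):
    """Run steps (1)-(6) on a users x days matrix of daily kWh readings."""
    data = np.asarray(daily_kwh, dtype=float)
    # (1) Preprocessing: keep only users with complete, non-negative readings.
    keep = np.isfinite(data).all(axis=1) & (data >= 0).all(axis=1)
    clean = data[keep]
    # (2)-(3) Method selection and parameter settings arrive as arguments.
    # (4) Clustering: build the hierarchy tree, then cut it into n_groups.
    tree = linkage(clean, method=method, metric=metric)
    labels = fcluster(tree, t=n_groups, criterion="maxclust")
    # (5) Validation: cophenetic correlation between tree and raw distances.
    coph_corr, _ = cophenet(tree, pdist(clean, metric=metric))
    # (6) Pattern extraction: the mean profile of each group is its center.
    centers = {int(g): clean[labels == g].mean(axis=0) for g in np.unique(labels)}
    return keep, labels, float(coph_corr), centers


# Synthetic data: 20 low-use users, 5 high-use users, and one user with a
# missing reading that preprocessing should drop.
rng = np.random.default_rng(0)
low = 8 + rng.normal(0, 0.5, size=(20, 31))
high = 35 + rng.normal(0, 0.5, size=(5, 31))
broken = np.full((1, 31), 8.0)
broken[0, 3] = np.nan
keep, labels, coph_corr, centers = mine_patterns(
    np.vstack([low, high, broken]), n_groups=2
)
print(np.bincount(labels)[1:], round(coph_corr, 3))
```

Passing the method and its parameters as arguments keeps steps (2) and (3) explicit, so a different linkage or distance can be tried without touching the rest of the pipeline.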

4. Results and discussions

4.1. Data

The data used are actual electricity consumption data of 300 residential users collected by the smart meters in Kunshan City, Jiangsu Province, China. The daily electricity consumption data were collected from November 16, 2014 to December 16, 2014. The 31-day electricity use profiles of the 300 customers are shown in Fig. 2. Fig. 2 shows that the electricity consumption characteristics of these users are not the same, and there are some abnormal electricity consumption behaviors and some abnormal users. It is difficult to divide them into appropriate groups and identify all the abnormal users intuitively. Therefore, an advanced data mining method is necessary.

Fig. 2. Monthly electricity use profiles of 300 residential users in Kunshan (x-axis: date; y-axis: electricity consumption in kWh).

4.2. Data preprocessing and abnormal user identification

We use aggregation type hierarchical clustering to analyze the above monthly electricity use profiles. The parameter settings are as follows: the inter-class dissimilarity measure is between-groups linkage, and the dissimilarity measure between data objects is the most commonly used Euclidean distance. The hierarchy tree of the resulting clustering is shown in Fig. 3. From this hierarchy tree, we can find that there are obvious abnormal users in the original data set. The users with the highest abnormality are No. 7 and No. 133 on the far right of the hierarchy tree. Other abnormal users shown in the right half of the tree are No. 135, 141, 1, 131, 53 and 75. The monthly electricity consumption profiles of these eight abnormal users are presented in Fig. 4. As Fig. 4 shows, the electricity consumption of user No. 7 in late November was still at a normal level, but it surged after the start of December. In the first half of December, the highest daily electricity consumption of user No. 7 was 325.56 kWh on December 16, and the lowest was 214.65 kWh on December 10. The abnormality of user No. 133 was that its electricity consumption was always significantly higher than the average level of other users. Users No. 135, 141, 1, 131, 53 and 75 also showed some abnormalities in daily electricity consumption or consumption trend. After the 8 abnormal users are eliminated, the monthly electricity use profiles of the remaining 292 residential users are presented in Fig. 5.

Fig. 4. Monthly electricity use profiles of abnormal users (x-axis: date; y-axis: electricity consumption in kWh).

Fig. 5. Monthly electricity use profiles of 292 residential users in Kunshan (x-axis: date; y-axis: electricity consumption in kWh).
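The paper reads the abnormal users off the hierarchy tree by inspection. One possible way to automate a similar screening is sketched below, under stated assumptions: "between-groups linkage" is taken to correspond to SciPy's `method="average"` (average linkage), and the rule "flag users that end up in very small clusters after cutting the tree" together with the names `flag_abnormal`, `n_cut` and `max_size` are hypothetical choices made here, not the authors' procedure. The profiles are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def flag_abnormal(daily_kwh, n_cut=3, max_size=2):
    """Return indices of users that fall into very small clusters."""
    data = np.asarray(daily_kwh, dtype=float)
    # Average linkage on Euclidean distances between 31-day profiles.
    tree = linkage(data, method="average", metric="euclidean")
    labels = fcluster(tree, t=n_cut, criterion="maxclust")
    sizes = np.bincount(labels)  # sizes[g] = number of users in group g
    return np.flatnonzero(sizes[labels] <= max_size)


# Synthetic profiles: 50 users near 8 kWh/day, one user whose consumption
# surges in the second half of the period, one user who is always far higher.
rng = np.random.default_rng(1)
profiles = 8 + rng.normal(0, 1, size=(50, 31))
profiles[7, 15:] += 240
profiles[33] += 92
abnormal = flag_abnormal(profiles)
print(abnormal)
```

Both thresholds would need tuning on real data; with too fine a cut, small but legitimate groups would also be flagged, which is why a visual check of the tree remains useful.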

Fig. 3. A hierarchy tree of clustering result of 300 users.

4.3. Clustering results

We conduct aggregation type hierarchical clustering with the same parameter settings as in Section 4.2 on the preprocessed 292 electricity users. The resulting hierarchy tree is shown in Fig. 6.

Fig. 6. A hierarchy tree of clustering result of 292 users.

Fig. 6 illustrates that there is a more pronounced discrimination in electricity consumption patterns among these 292 electricity consumers. According to the hierarchy tree shown in Fig. 6, the 292 users can be divided into four groups. The monthly electricity use profiles of each group are given in Fig. 7.

Fig. 7. Monthly electricity user profiles clustering result (panels (a)–(d); x-axis: date; y-axis: electricity consumption in kWh).

The clustering result shown in Fig. 7 demonstrates that there are four different electricity consumption patterns among the 292 residential users in Kunshan. Most users have similar monthly electricity consumption patterns, as shown in Group (c) and Group (d) in Fig. 7. There are only a small number of users that have special electricity consumption patterns, as shown in Group (a) and Group (b) in Fig. 7. This is not surprising, since the similar electricity consumption patterns that most consumers have represent the overall electricity consumption characteristics of residential users in that area. There are 237 users in Group (d), which account for 81.16% of the total. These users represent the normal electricity consumption patterns in Kunshan city. A small proportion of the users are in Group (c), which also represents a group of electricity consumption patterns. The number of consumers that have electricity consumption patterns like those in Group (a) and Group (b) is small. However, more valuable knowledge can usually be discovered from the minority of users. This knowledge is of great significance for the operational management of the power system and the formulation of targeted DR strategies. For example, according to the number of users that have special electricity consumption patterns and the trends of their daily electricity consumption, more effective power requirement prediction for the generation side can be made. Personalized DSM programs can be implemented and more electricity services can be provided on the demand side. Therefore, the optimal operation of the smart grid and the economical management of the electricity market can be better achieved.

4.4. Electricity consumption patterns extraction

To extract and represent the monthly electricity consumption patterns of different user groups, the cluster centers of the four groups are calculated and presented in Fig. 8. The cluster center of each group is the typical electricity consumption profile of users in that group. Since the results of Figs. 7 and 8 were extracted for users located in a distribution network, the information provided by the clustering results is helpful in supporting power requirement prediction for the generation side.

Fig. 8. Typical electricity consumption profiles of each group (panels (a)–(d); x-axis: date; y-axis: electricity consumption in kWh).

Fig. 8 shows that users in the four groups share a characteristic: their daily electricity consumption showed an increasing trend after entering the month of December. To test the impact of temperature on the winter electricity consumption trend of residential users in Kunshan, we collected the temperature changes of Kunshan from November 16, 2014 to December 16, 2014, as shown in Fig. 9.

Fig. 9. Temperature changes of Kunshan from November 16, 2014 to December 16, 2014 [38].

The Pearson correlation coefficient (PCC) [76] is a statistical indicator that measures the correlation between two variables. For two n-dimensional variables $X = \{x_1, x_2, \cdots, x_n\}$ and $Y = \{y_1, y_2, \cdots, y_n\}$, their Pearson correlation coefficient is as follows.

$$
r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}} \tag{1}
$$

The PCC values of the four groups of electricity consumption patterns and the lowest temperature of Kunshan in the same period are presented in Table 1. From the PCC calculation results, the electricity consumption patterns of the four groups are significantly correlated with the lowest temperature of Kunshan. Also, there are significant correlations among the four groups.

Table 1
PCC values of four groups and the lowest temperature in Kunshan.

              Group (a)   Group (b)   Group (c)   Group (d)   Temperature
Group (a)     1           0.790**     0.927**     0.930**     −0.782**
Group (b)     0.790**     1           0.829**     0.793**     −0.742**
Group (c)     0.927**     0.829**     1           0.966**     −0.766**
Group (d)     0.930**     0.793**     0.966**     1           −0.705**
Temperature   −0.782**    −0.742**    −0.776**    −0.705**    1

** Correlation is significant at the 0.01 level (2-tailed).

Kunshan is located in the eastern part of China, and the temperature decreases significantly after entering winter. As a result, the daily electricity consumption of heating appliances also gradually increases. This is one of the main reasons for the increase of daily electricity consumption. It further supports the view that temperature is an important factor in electric load and electricity demand forecasting [77].

There are also significant differences among the electricity consumption patterns of the four groups. In terms of the overall scale of electricity consumption, the consumption of users in Group (a) is generally the highest, at about 30–40 kWh daily. Consumers in Group (d) have the lowest electricity consumption. The average electricity consumption of users in Group (c) lies between those of Group (a) and Group (d), and the volatility of daily electricity consumption of these three groups is relatively low. However, the electricity consumption pattern of users in Group (b) is significantly different from the other three groups. Its volatility in daily electricity use is very large, particularly in the first half of December. In the second half of November, its daily electricity consumption was lower than 10 kWh, while it surged to 85.95 kWh on December 4. Though there is a subsequent decrease, the daily electricity consumption level in December is significantly higher than that in November.

Based on the typical electricity consumption profiles, the electricity consumption patterns of the four groups can be quantitatively measured. The characteristic indicators of the four groups are summarized in Table 2. From the average values (Avg.) in Table 2, we can see that Group (a) has the highest average daily electricity consumption, while the average daily electricity consumption of Group (d) is the lowest. The standard deviation values (Std.) in Table 2 further demonstrate that Group (b) has the highest volatility.

Table 2
Characteristic indicators of four groups of electricity users.

Group   # of users   Proportion (%)   Min. (kWh)   Max. (kWh)   Avg. (kWh)   Std.
(a)     6            2.05             25.34        57.29        36.54        9.54
(b)     1            0.34             6.13         85.95        24.47        21.28
(c)     48           16.44            13.30        33.68        21.70        7.69
(d)     237          81.17            5.82         11.30        7.73         1.77
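As a hedged illustration of how the quantities in Tables 1 and 2 can be computed, the sketch below implements Eq. (1) term by term and derives the characteristic indicators from a typical profile. The function names `pcc` and `indicators` and the five-day series are hypothetical and are not data from the study. The use of the sample standard deviation is also an assumption, since the paper does not state which estimator was used.

```python
import math
import statistics


def pcc(x, y):
    """Pearson correlation coefficient, computed term by term as in Eq. (1)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_xx - sum_x ** 2)
                   * math.sqrt(n * sum_yy - sum_y ** 2))
    return numerator / denominator


def indicators(center, n_users, n_total):
    """Characteristic indicators of one group from its typical profile."""
    return {
        "proportion": round(100 * n_users / n_total, 2),
        "min": min(center),
        "max": max(center),
        "avg": statistics.mean(center),
        "std": statistics.stdev(center),  # sample std; an assumption
    }


# Hypothetical five-day series: temperature falls while consumption rises.
lowest_temp = [10.0, 8.0, 6.0, 4.0, 2.0]
typical_kwh = [7.0, 7.5, 8.1, 8.4, 9.0]
r = pcc(typical_kwh, lowest_temp)
summary = indicators(typical_kwh, n_users=237, n_total=292)
print(round(r, 3), summary["proportion"])
```

A strongly negative `r` corresponds to the pattern reported in Table 1, where consumption rises as the lowest temperature falls. With 237 of 292 users, the proportion evaluates to the 81.16% quoted in Section 4.3.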

5. Conclusions


The electricity consumption patterns of residential users in Kunshan City, China were explored in this study, using an efficient model based on hierarchical clustering. At the same time, abnormal electricity use behaviors and abnormal users were identified. Based on a systematic review of related works, a process model based on hierarchical clustering and the specific steps of electricity consumption patterns mining in the smart grid were proposed. The actual daily electricity consumption data of 300 residential users collected by the smart meters in Kunshan City, China, were used in the case study. Based on the process model and the clustering method, 9 abnormal users were successfully identified. Further, four residential user groups with significantly different monthly electricity consumption patterns were obtained. The results demonstrated that most residential users in Kunshan have a similar electricity consumption pattern, and their electricity consumption trend was significantly correlated with the temperature changes. However, although the proportion of users that have special electricity consumption patterns is relatively small, they cannot be ignored. Their electricity consumption patterns are important for the planning and operation of power systems, as well as for decision-making and the formulation of DSM strategies.

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant nos. 71531008 and 71501056, the Anhui Science and Technology Major Project under Grant no. 17030901024, the Hong Kong Scholars Program under Grant no. 2017-167, and the Foundation for Innovative Research Groups of the National Natural Science Foundation of China under Grant no. 71521001.
