Evaluation of classification methodologies and Features selection from smart meter data


Energy Procedia 142 (2017) 2250–2256

9th International Conference on Applied Energy, ICAE2017, 21-24 August 2017, Cardiff, UK

Maher Azaza*, Fredrik Wallin

Future Energy Center, Department of Energy, Building and Environment, Mälardalens University, Västerås, Sweden

Abstract

The choice of the classification algorithm that maps a feature vector to a known, labelled signature database is an important step toward load identification in non-intrusive load monitoring (NILM). In this paper, we investigate the quality of load recognition when using various smart meter features and the commonly used classification algorithms. A low error rate is observed when using the decision tree (DT), k-nearest neighbours (k-NN) and support vector machine (SVM) classifiers; the error rate ranges between 20 % and 29 %. Among the smart meter features, the current waveform, the active/reactive power and the transient features give the better recognition results when associated with a specific classifier.

© 2017 The Authors. Published by Elsevier Ltd.
Peer-review under responsibility of the scientific committee of the 9th International Conference on Applied Energy.

Keywords: Loads recognition, NILM, Smart meter, Feature selection

* Corresponding author. Tel.: +46 73-6621308. E-mail address: [email protected].

1876-6102. 10.1016/j.egypro.2017.12.626

1. Introduction

NILM is the disaggregation of the total electricity consumption profile of a household into individual load signals. The disaggregation process is non-intrusive: it is carried out without physical power meters on each individual load. Instead, the electrical load signatures are used to identify which type of load was in use at a specific instant or during a specified time frame. NILM provides a better understanding of electric power usage and is a cost-effective tool to strengthen the link between electric utility companies and their customers. It was shown that load-specific feedback can provide annual energy savings of up to 12 % [1]. Moreover, with the large-scale rollout of smart meters, interest in NILM methods has risen rapidly, giving a better understanding of consumer behaviour and enabling more efficient demand-side management. A smart meter can analyse and identify electrical load signatures and forward this information to the utility. Electric load signatures may be used for load monitoring, load diagnostics and power quality control at the low-voltage substation. NILM was first introduced by George Hart [2]. The technique is essentially divided into three stages: feature extraction, event detection, and load identification. The extraction step converts the information in the measured signals (current and voltage) into a set of features (e.g. active power, reactive power, current harmonics) that are used for load recognition and classification by assigning an electrical signature to each load. The detected events are correlated with a load signature database to identify the load status (on or off) and to track appliance usage over a time period. The identification of the load is performed by machine learning algorithms that assign a given pattern to a specific load class. In the literature, different classification algorithms have been used with various appliance electrical signatures, but an evaluation of the different classification algorithms is still lacking. This can be formulated as the following research questions: which classification algorithms are the most efficient in achieving a high recognition rate of load signatures, and which features best represent a given load, so as to save computation time and improve the accuracy of appliance pattern recognition? Various classification methodologies have been proposed in the NILM research field.
In [3], features extracted from active and reactive power were classified using support vector machine (SVM) and k-nearest neighbours (k-NN) methods. It was found that k-NN performs well in terms of accuracy, but a complete validation is still required since the number of appliances used is very small. [4] proposed a low-frequency disaggregation algorithm and compared the performance of two approaches, namely decision tree (DT) classification and dynamic time warping (DTW). [5,6] presented an experimental study on electrical signature identification using the k-NN classification algorithm, showing a good accuracy level of 90 %. [7,8] have shown the applicability of large-margin classifiers, namely artificial neural networks (ANN), SVM and adaptive boosting (AdaBoost), in the load recognition context. [9] carried out load classification by incorporating a Bayes classifier into a self-organizing map (SOM). The reported recognition rate was 100 % in the training phase and over 90 % when using unknown loads for testing. [10] makes use of a Bayes classifier to recognize specific states of a given appliance using (P, Q) features. However, only a handful of loads were used for evaluation. [11,12] have shown the applicability and performance of ANN and hidden Markov models (HMM). However, it was stated that the complexity may increase exponentially as the number of target loads increases, which limits the use of HMM for load identification. ANN has shown good performance but requires exhaustive computation time during the training phase [13]. [14,15] used an SVM classifier on harmonics and low-frequency features. It can be seen from the literature that various classification algorithms have been used for load recognition, with different accuracy rates. A second important point is the strong dependency of classifier performance on the set of load features.
However, a performance comparison of the different classifiers that can be used in the NILM context is still lacking. In this paper, we assess the performance of classification algorithms using different input features extracted from the publicly available smart meter dataset PLAID [16]. First, the set of commonly used features at a high sampling rate is described. The commonly used classification algorithms are then investigated for the appliance classification step. The results highlight the performance of the various classification algorithms for load classification in the NILM context.

2. Residential Appliance characteristics

2.1. Residential appliance types

Despite their variety, residential appliances may be classified into a few elemental load types based on how they consume power in alternating current. Four basic load types may be identified: capacitive, resistive, inductive and non-linear loads. Moreover, residential loads may be categorized by the number of operational states and the power they draw when switched on. The first type of appliance has two specific states, on or off (a so-called on/off load), and has one power level when active. This type generally covers resistive loads such as a water kettle or hair dryer. The second type of residential load is multi-state, having more than one operating state, such as a washing machine or dishwasher. This type of appliance has an operation cycle in which each state may have a specific power level. For both of the aforementioned types, the power variation from one state to another is clearly observable as a step change at a specific time. The third type of load is characterized by an infinite number of operating states (e.g. light dimmers, TVs), since the power consumption varies continuously with no defined step change. Fig.1 shows the power profile of the different load types.





    

 

 







Fig.1 Residential appliances types
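The step changes that distinguish on/off and multi-state loads are what the event detection stage of NILM looks for. The following is a minimal sketch, assuming a uniformly sampled active power series; the function name `detect_events` and the 50 W threshold are illustrative choices, not taken from the paper.

```python
def detect_events(power, threshold=50.0):
    """Return (index, delta) for each sample-to-sample step above threshold.

    power: active power samples in watts (uniform sampling assumed).
    threshold: minimum absolute step, in watts, counted as a state change
               (illustrative value, not taken from the paper).
    """
    events = []
    for k in range(1, len(power)):
        delta = power[k] - power[k - 1]
        if abs(delta) > threshold:
            events.append((k, delta))
    return events


# Synthetic on/off load: switched on at sample 2 and off at sample 5.
profile = [0.0, 0.0, 1500.0, 1500.0, 1500.0, 0.0, 0.0]
events = detect_events(profile)
```

A load of the third type, whose power varies continuously, would typically produce no step above the threshold, which is why such loads are harder to track with this kind of detector.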

2.2. Appliances signatures

Electric load signatures provide unique, device-specific information that can be used to recognize a device among all appliances at the aggregated level. A taxonomy of appliance features is introduced in [3] and [4], where two categories of load signatures are distinguished: steady-state signatures and transient signatures.

2.2.1. Steady state signatures

The steady-state signatures are the set of electrical characteristics extracted by analysing the device power behaviour during steady-state operation, when it is not transitioning between two operational states. Different power quality features can thus be derived, and they can be classified as follows:

(i) Active and reactive power variation: the changes in active power dP and reactive power dQ of a multi-state device may carry useful knowledge about the load characteristics; plotting the active and reactive power in the (P, Q) plane may reveal information about the device type and can thus be used for identification. The challenge when using the dP/dQ features is the close similarity of certain devices, whose dP/dQ features overlap; this affects the recognition accuracy of devices that tend to cluster around the origin, such as the ventilator and the fluorescent lamp, as shown in Fig.2.

Fig.2 Active / Reactive power feature of appliances. Axes: active power P [W] versus reactive power Q [VAR]. Legend: Air Conditioner, Fluorescent Lamp, Ventilator, Fridge, Hair dryer, Stove, Light bulb, Laptop, Microwave, Vacuum cleaner, Washing machine.



Fig. 3. Current waveform feature.



(ii) Current waveform: the feature overlap of certain devices in the PQ diagram may be resolved by adding more electrical characteristics of the appliances. This information can be derived from the current signal, for example the current waveform, which may reflect more detailed information about whether an appliance is resistive, inductive or nonlinear, as depicted in Fig. 3.

(iii) Voltage-current trajectory: the shape of the V-I trajectory can provide useful information for further load characterization. Among the parameters that can be derived from the V-I trajectory are the asymmetry of the V-I curve, the enclosed V-I area and the looping direction. Fig.4 shows the V-I trajectory of different residential loads and the corresponding binary image of the signature.

Fig. 4. Voltage-current trajectory feature. Panels: Microwave, Fridge, Vacuum cleaner, Laptop. Axes: voltage [V] versus current [A].



(iv) Harmonics: the harmonic content features can be obtained using the fast Fourier transform (FFT). The harmonics describe the signal in the frequency domain and can be useful features that add knowledge about the appliance type.
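As an illustration of the harmonic features in (iv), the sketch below evaluates the discrete Fourier transform directly at the harmonic bins; an FFT would return the same values, and the direct sum is used here only to avoid external dependencies. The function name, the synthetic current and the choice of five harmonics are assumptions of this example, not details from the paper.

```python
import cmath
import math


def harmonic_magnitudes(samples, samples_per_cycle, max_order=5):
    """Peak amplitude of harmonics 1..max_order of a sampled waveform.

    The record must span a whole number of fundamental cycles so that each
    harmonic falls exactly on a DFT bin (an assumption of this sketch).
    """
    n = len(samples)
    cycles = n // samples_per_cycle
    if cycles == 0 or n != cycles * samples_per_cycle:
        raise ValueError("record must cover a whole number of cycles")
    magnitudes = []
    for order in range(1, max_order + 1):
        k = order * cycles  # DFT bin index of this harmonic
        acc = sum(x * cmath.exp(-2j * math.pi * k * idx / n)
                  for idx, x in enumerate(samples))
        magnitudes.append(2.0 * abs(acc) / n)
    return magnitudes


# Synthetic current: 2 A fundamental plus a 0.5 A third harmonic, two cycles.
spc = 100
current = [2.0 * math.sin(2 * math.pi * idx / spc)
           + 0.5 * math.sin(2 * math.pi * 3 * idx / spc)
           for idx in range(2 * spc)]
mags = harmonic_magnitudes(current, spc)
```

For this synthetic signal the first and third entries of `mags` recover the 2 A and 0.5 A amplitudes, while the other orders are numerically zero.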

Fig.5. Harmonics feature of different loads.
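The (P, Q) feature and the enclosed V-I area with its looping direction can be computed from sampled voltage and current. The sketch below is one possible way to do so and is not the authors' procedure: Q is estimated with a quarter-cycle voltage delay, which is exact only for a sinusoidal voltage, and the loop area uses the shoelace formula. All names and the synthetic 30 degree lagging load are assumptions.

```python
import math


def active_reactive_power(voltage, current, samples_per_cycle):
    """Return (P, Q) from waveforms that cover whole fundamental cycles.

    P is the mean instantaneous power. Q correlates the current with the
    voltage delayed by a quarter cycle; it is positive for a lagging
    (inductive) current and assumes a sinusoidal voltage.
    """
    n = len(voltage)
    if n == 0 or n != len(current) or n % samples_per_cycle or samples_per_cycle % 4:
        raise ValueError("need whole cycles and samples_per_cycle divisible by 4")
    shift = samples_per_cycle // 4
    p = sum(v * i for v, i in zip(voltage, current)) / n
    q = sum(voltage[(k - shift) % n] * current[k] for k in range(n)) / n
    return p, q


def vi_loop_area(voltage, current):
    """Signed area enclosed by one cycle of the V-I trajectory.

    Shoelace formula with voltage on the horizontal axis; a positive sign
    means the loop is traversed counter-clockwise.
    """
    n = len(voltage)
    twice_area = sum(
        voltage[k] * current[(k + 1) % n] - voltage[(k + 1) % n] * current[k]
        for k in range(n)
    )
    return 0.5 * twice_area


# Synthetic inductive load: the current lags the voltage by 30 degrees.
spc = 100
v_peak, i_peak, phi = 325.0, 10.0, math.pi / 6
v = [v_peak * math.sin(2 * math.pi * k / spc) for k in range(spc)]
i = [i_peak * math.sin(2 * math.pi * k / spc - phi) for k in range(spc)]
p, q = active_reactive_power(v, i, spc)
area = vi_loop_area(v, i)
```

For a purely resistive load the current is in phase with the voltage, so both Q and the enclosed area collapse to zero; this is one reason the V-I shape helps separate load types.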

2.2.2. Transient state signatures

Transient signatures are defined as the nonlinear power transitions between appliance operating states, such as on/off transitions; they require a higher sampling frequency to be detected. A load state transition is mainly characterized by three parameters: the overshoot/undershoot power amplitude when the load is turned on, the rise/fall time and the settling time.

Fig.6. Transient features. Example: settling time of a vacuum cleaner. Axes: time t [s] versus active power P [W]. Annotations: signal, mid cross, settling point, settling time, upper and lower state levels with their boundaries, mid reference.
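The three transient parameters named above can be extracted from a switch-on power trace. The paper does not give exact definitions, so the ones used in this sketch (final value taken from the tail of the record, 10 % to 90 % rise time, a 5 % settling band) are assumptions, as are the function name and the synthetic samples.

```python
def transient_features(power, dt, tolerance=0.05, tail=3):
    """Overshoot, rise time and settling time of a switch-on transient.

    Definitions assumed in this sketch:
      final value   = mean of the last `tail` samples
      overshoot     = (peak - final) / final
      rise time     = time between first reaching 10 % and 90 % of final
      settling time = first instant after which the signal stays within
                      +/- tolerance * final of the final value
    """
    final = sum(power[-tail:]) / tail
    overshoot = (max(power) - final) / final
    i10 = next(k for k, p in enumerate(power) if p >= 0.1 * final)
    i90 = next(k for k, p in enumerate(power) if p >= 0.9 * final)
    rise_time = (i90 - i10) * dt
    band = tolerance * final
    outside = [k for k, p in enumerate(power) if abs(p - final) > band]
    settle_index = outside[-1] + 1 if outside else 0
    return overshoot, rise_time, settle_index * dt


# Synthetic switch-on transient sampled every 0.1 s (values in watts).
samples = [0, 20, 60, 110, 150, 130, 96, 104, 100, 100, 100, 100]
overshoot, rise_time, settling_time = transient_features(samples, dt=0.1)
```

On this synthetic trace the power peaks at 150 W before settling at 100 W, giving a 50 % overshoot, a rise time of 0.2 s and a settling time of 0.6 s.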


3. Features and Classifiers evaluation

Each smart meter feature has been tested with respect to the accuracy of the classifier under evaluation; then a combination of the input features has been assessed to check the hypothesis that NILM classifier performance may be further enhanced if more features are used as inputs. The outcomes of this evaluation may help save effort when processing smart meter features, by ranking the importance of the features and selecting the appropriate post-processing tools for electricity meters in the NILM context. Furthermore, the classification accuracy of each classifier as a function of the input data is evaluated to identify the suitable classifier for a pre-defined set of features. The initial belief is that the more input variables characterizing the load operation are used, the more accurate the load recognition becomes. However, classifier performance does not really depend on the amount of input data or on the number of inputs; it depends on the relevance of this information. Among the available input features, it often turns out that only a few are sufficient to contain the useful information describing the observations of interest. These relevant features may be sufficient to highlight the differences in characteristics and to classify the observations into distinguishable classes. Conversely, some other input variables may fail to distinguish differences in the observations, which may degrade and distort the learning results. That makes these variables detrimental to the accuracy of the classifiers. Other inputs may not affect classifier performance but are unnecessary for the classification; mostly they are redundant or correlated in some way with the relevant variables.
In the NILM context, identifying these variables may provide a better understanding of the key features that are most likely to encode the characteristics of the loads, and helps avoid logging or processing unnecessary features. That makes identifying and ranking the smart meter features paramount for saving computation time and storage.

4. Results and discussion

In this section, different classification algorithms and the commonly used smart meter features are evaluated to identify the algorithms that achieve the highest accuracy in the load classification and identification task. The most commonly used classifiers have been compared across different features that may be extracted from the smart meter readings. Five classifiers have been tested: k-nearest neighbours (k-NN), Naïve Bayes (NB), support vector machine (SVM), artificial neural network (ANN) and decision tree (DT). The performance of these classifiers has been tested with different sets of input features extracted from the smart electricity meter. The commonly extracted features used to recognize and track residential load usage, as discussed in the previous section, are the steady-state features (active/reactive power, current waveform, harmonic content and voltage-current (V-I) trajectory) and the transient features (overshoot/undershoot amplitude, rise/fall time and settling time).

Fig.7. Classification algorithms performance of smart meter features. Vertical axis: recognition error. Feature sets: Active/Reactive power, Current WF, Harmonics, V-I Trajectory, Voltage and Current WF, Transient Features, Combined Features. Classifiers: k-NN, Decision tree, ANN, Naive Bayes, SVM.
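To make the evaluation procedure concrete, the sketch below computes a recognition error for a k-NN classifier on two feature sets. It is a toy illustration under stated assumptions: the samples are invented rather than taken from PLAID, leave-one-out validation is assumed because the paper does not state its validation protocol, and the helper names are hypothetical. It does not reproduce the results of Fig.7.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Majority label among the k training points closest to `query`."""
    order = sorted(range(len(train_x)),
                   key=lambda j: math.dist(train_x[j], query))
    votes = Counter(train_y[j] for j in order[:k])
    return votes.most_common(1)[0][0]


def loo_error(features, labels, k=1):
    """Leave-one-out recognition error: fraction of samples misclassified."""
    wrong = 0
    for idx, (x, y) in enumerate(zip(features, labels)):
        rest_x = features[:idx] + features[idx + 1:]
        rest_y = labels[:idx] + labels[idx + 1:]
        if knn_predict(rest_x, rest_y, x, k) != y:
            wrong += 1
    return wrong / len(features)


# Invented (P [W], Q [VAR]) samples: fan and lamp overlap in P but not in Q.
pq = [(60, 40), (64, 41), (68, 39), (72, 42),          # fan
      (62, 5), (66, 6), (70, 4), (74, 5),              # lamp
      (1990, 9), (2000, 10), (2005, 11), (2010, 12)]   # kettle
labels = ["fan"] * 4 + ["lamp"] * 4 + ["kettle"] * 4

error_p_only = loo_error([(p,) for p, _ in pq], labels)  # active power alone
error_pq = loo_error(pq, labels)                         # active and reactive power
```

With these invented samples, active power alone confuses every fan and lamp sample, whereas adding reactive power separates them. This illustrates the point made in Section 3 that the relevance of a feature matters, not the number of features.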

The results of the evaluation, depicted in Fig.7, show that the k-NN and DT algorithms have the lowest recognition error among the investigated classification algorithms for the various input features. In particular, the DT classifier performs better than the k-NN classifier for almost all the features. The recognition error tends to be minimal when all the features are combined; the low error rate is observed when using the DT, k-NN and SVM classifiers. The error rate ranges between 20 % and 29 %. Further, a similar result may be achieved using only the active/reactive power features with the k-NN classifier, which implies less computation time. A similar result may be observed when transient features are used with both the DT and ANN classifiers. When evaluating which features matter most for a better load recognition rate, it seems that the current waveform, the active/reactive power and the transient features give the better recognition results when associated with a specific classifier. Moreover, combining all the features seems to have an insignificant impact on the classification quality. Note that the recognition error represents the average classification error over all the appliances included in the dataset. The detailed classification result was evaluated using the k-NN classifier with the active/reactive power as input feature, to gain more insight into the recognition error per appliance, as shown in the confusion matrix of Fig.8. Among all the appliances, the air conditioner and the washing machine present the highest recognition errors, 66% and 57%. This could be explained by their complex nonlinear power consumption profiles and multi-state nature. It could also be due to the low number of measurements of those two appliances compared to the rest of the loads.

Confusion matrix (counts). Rows are predicted classes, columns are actual classes. The last column gives the share of each row that is correctly classified, and the last row gives the share of each column that is correctly classified.

Predicted \ Actual     AC     FL     Ve     Fr     HD     St     LB     La     MW     VC     WM  Correct
Air Conditioner        22      0     11      4      3      1      4      0      1      2      4    42.3%
Fluorescent Lamp        0    143      7      1      1      0      1     31      0      0      0    77.7%
Ventilator             19      5     72      3      3      1     16     12      0      0      2    54.1%
Fridge                  9      0      0     16      4      0      0      0      1      0      1    51.6%
Hair dryer              1      0      0      5    133     14      1      0      6      2      4    80.1%
Stove                   0      0      0      0      5     12      0      0      3      0      0    60.0%
Light bulb              5      3     17      5      1      1     90      6      0      0      5    67.7%
Laptop                  0     24      3      1      5      0      2    122      2      0      0    76.7%
Microwave               6      0      0      0      1      4      0      1    125      0      6    87.4%
Vacuum cleaner          1      0      3      2      0      2      0      0      0     33      0    80.5%
Washing machine         3      0      2      1      0      0      0      0      1      1      4    33.3%
Correct             33.3%  81.7%  62.6%  42.1%  85.3%  34.3%  78.9%  70.9%  89.9%  86.8%  15.4%    71.9%

Column key: AC = Air Conditioner, FL = Fluorescent Lamp, Ve = Ventilator, Fr = Fridge, HD = Hair dryer, St = Stove, LB = Light bulb, La = Laptop, MW = Microwave, VC = Vacuum cleaner, WM = Washing machine. Overall: 71.9% correct, 28.1% error.

Fig.8. k-NN recognition performance for household loads.
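The per-appliance rates in Fig.8 follow directly from the counts. The sketch below recomputes them using counts transcribed from the figure, with rows as predicted classes and columns as actual classes. The terms precision and recall are standard terminology rather than the paper's wording, and the function name is an assumption.

```python
# Counts transcribed from Fig.8: rows = predicted class, columns = actual class.
CLASSES = ["Air Conditioner", "Fluorescent Lamp", "Ventilator", "Fridge",
           "Hair dryer", "Stove", "Light bulb", "Laptop", "Microwave",
           "Vacuum cleaner", "Washing machine"]
COUNTS = [
    [22, 0, 11, 4, 3, 1, 4, 0, 1, 2, 4],
    [0, 143, 7, 1, 1, 0, 1, 31, 0, 0, 0],
    [19, 5, 72, 3, 3, 1, 16, 12, 0, 0, 2],
    [9, 0, 0, 16, 4, 0, 0, 0, 1, 0, 1],
    [1, 0, 0, 5, 133, 14, 1, 0, 6, 2, 4],
    [0, 0, 0, 0, 5, 12, 0, 0, 3, 0, 0],
    [5, 3, 17, 5, 1, 1, 90, 6, 0, 0, 5],
    [0, 24, 3, 1, 5, 0, 2, 122, 2, 0, 0],
    [6, 0, 0, 0, 1, 4, 0, 1, 125, 0, 6],
    [1, 0, 3, 2, 0, 2, 0, 0, 0, 33, 0],
    [3, 0, 2, 1, 0, 0, 0, 0, 1, 1, 4],
]


def per_class_rates(counts):
    """Return (precision, recall, accuracy) from a square confusion matrix.

    precision[i]: share of predictions of class i that are correct (row-wise).
    recall[i]:    share of actual class-i samples recognized (column-wise).
    """
    n = len(counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[r][c] for r in range(n)) for c in range(n)]
    precision = [counts[c][c] / row_totals[c] for c in range(n)]
    recall = [counts[c][c] / col_totals[c] for c in range(n)]
    accuracy = sum(counts[c][c] for c in range(n)) / sum(row_totals)
    return precision, recall, accuracy


precision, recall, accuracy = per_class_rates(COUNTS)
for name, pr, rc in zip(CLASSES, precision, recall):
    print(f"{name:17s} precision {pr:6.1%}  recall {rc:6.1%}")
print(f"overall accuracy {accuracy:.1%}")
```

Reading the matrix both row-wise and column-wise is useful here, since a class with few samples, such as the washing machine, can score poorly on both measures.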

5. Conclusion

This paper presented an evaluation of load recognition when using different smart meter features as inputs to the classification algorithm. Various commonly used classification algorithms were investigated. The evaluation results show that the quality of classification does not depend on the number of input features but on the relevance of the features themselves. Among the smart meter features, the current waveform, the active/reactive power and the transient features give the better recognition results when associated with a specific classifier. These features seem to be sufficient to distinguish differences in the load characteristics and to recognize the loads with a lower error rate. The k-NN and DT algorithms have the lowest recognition error among the investigated classification algorithms.

Acknowledgements

The authors would like to thank the Knowledge Foundation (KK-stiftelsen) for the financial support.

References

[1] K. C. Armel, A. Gupta, G. Shrimali, and A. Albert, “Is disaggregation the holy grail of energy efficiency? The case of electricity,” Energy Policy, vol. 52, pp. 213–234, 2013.
[2] G. Hart, “Nonintrusive appliance load monitoring,” Proceedings of the IEEE, vol. 80, no. 12, pp. 1870–1891, 1992.
[3] M. Figueiredo, A. D. Almeida, and B. Ribeiro, “Home electrical signal disaggregation for non-intrusive load monitoring (NILM) systems,” Neurocomputing, vol. 96, pp. 66–73, 2012.
[4] J. Liao, G. Elafoudi, L. Stankovic, and V. Stankovic, “Non-intrusive appliance load monitoring using low-resolution smart meter data,” 2014 IEEE International Conference on Smart Grid Communications (SmartGridComm), 2014.


[5] M. B. Figueiredo, A. D. Almeida, and B. Ribeiro, “An Experimental Study on Electrical Signature Identification of Non-Intrusive Load Monitoring (NILM) Systems,” Adaptive and Natural Computing Algorithms, Lecture Notes in Computer Science, pp. 31–40, 2011.
[6] S. Gupta, M. S. Reynolds, and S. N. Patel, “ElectriSense,” Proceedings of the 12th ACM international conference on Ubiquitous computing (Ubicomp ’10), 2010.
[7] H. Murata and T. Onoda, “Applying Kernel Based Subspace Classification to a Non-Intrusive Monitoring for Household Electric Appliances,” Artificial Neural Networks — ICANN 2001, Lecture Notes in Computer Science, pp. 692–698, 2001.
[8] J. Liang, S. K. K. Ng, G. Kendall, and J. W. M. Cheng, “Load Signature Study—Part II: Disaggregation Framework, Simulation, and Applications,” IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 561–569, 2010.
[9] L. Du, J. A. Restrepo, Y. Yang, R. G. Harley, and T. G. Habetler, “Nonintrusive, Self-Organizing, and Probabilistic Classification and Identification of Plugged-In Electric Loads,” IEEE Transactions on Smart Grid, vol. 4, no. 3, pp. 1371–1380, 2013.
[10] A. Marchiori, D. Hakkarinen, Q. Han, and L. Earle, “Circuit-Level Load Monitoring for Household Energy Management,” IEEE Pervasive Computing, vol. 10, no. 1, pp. 40–48, 2011.
[11] A. G. Ruzzelli, C. Nicolas, A. Schoofs, and G. M. P. O’Hare, “Real-Time Recognition and Profiling of Appliances through a Single Electricity Sensor,” 2010 7th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), 2010.
[12] T. Zia, D. Bruckner, and A. Zaidi, “A hidden Markov model based procedure for identifying household electric loads,” IECON 2011 - 37th Annual Conference of the IEEE Industrial Electronics Society, 2011.
[13] S. Srivastava, J. R. P. Gupta, and M. Gupta, “PSO & neural-network based signature recognition for harmonic source identification,” TENCON 2009 - 2009 IEEE Region 10 Conference, 2009.
[14] T. Kato, H. S. Cho, D. Lee, T. Toyomura, and T. Yamazaki, “Appliance Recognition from Electric Current Signals for Information-Energy Integrated Network in Home Environments,” Ambient Assistive Health and Wellness Management in the Heart of the City, Lecture Notes in Computer Science, pp. 150–157, 2009.
[15] G.-Y. Lin, S.-C. Lee, J. Y.-J. Hsu, and W.-R. Jih, “Applying power meters for appliance recognition on the electric panel,” 2010 5th IEEE Conference on Industrial Electronics and Applications, 2010.
[16] PLAID: the plug load appliance identification dataset. A public dataset of high resolution for load identification research. Plaidplug.com