Hidden Markov Models revealing the household thermal profiling from smart meter data




Energy and Buildings 154 (2017) 127–140


Energy and Buildings journal homepage: www.elsevier.com/locate/enbuild

Anatoli Paul Ulmeanu a,∗, Vlad Stefan Barbu b, Vladimir Tanasiev a, Adrian Badea a,c

a Department of Energy Generation and Use, Polytechnic University of Bucharest, Romania
b Laboratoire de Mathématiques Raphaël Salem, Université de Rouen, France
c Academy of Romanian Scientists (AOSR), Bucharest, Romania

Article info

Article history:
Received 30 December 2016
Received in revised form 18 July 2017
Accepted 16 August 2017
Available online 23 August 2017

Keywords:
Hidden Markov chain
Emission probability matrix
Sequence observation
Building thermal load profile

Abstract

This work describes a methodology based on Hidden Markov Models (HMMs) that are applied for revealing household thermal load profiles which are not available to direct observation. This research is motivated by the necessity of reducing the energy consumption for cooling and heating in residential buildings. Our methodology uses data that is becoming readily available at households – hourly energy consumption records collected from smart electricity meters, as well as hourly outdoor air temperature records. The heat transfer regime, namely the states corresponding to lower or higher building hourly thermal loads related to the outdoor air temperatures, will be considered as the underlying mechanism affecting the generation of observations. We aggregate the observed data to obtain a certain number of clusters. The problem of HMM estimation is addressed and the subsequent HMMs are compared on the basis of information criteria, like Akaike and Bayesian Information Criteria. Our goal is to reveal the dynamic of building thermal load (heating/cooling) under the uncertainties induced by the residents’ behavior. Consequently, we present examples of thermal load profiles generated using our best HMM on a testing facility located in the Polytechnic University of Bucharest campus, namely the UPB’s passive building house.

© 2017 Elsevier B.V. All rights reserved.

∗ Corresponding author at: 313 Splaiul Independentei, EG-EH Buildings, 060042 Bucharest, Romania. E-mail address: [email protected] (A.P. Ulmeanu).
http://dx.doi.org/10.1016/j.enbuild.2017.08.036
0378-7788/© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Considering that half of the European Union (EU) energy consumption comes from heating and cooling spaces, finding a good solution to reduce it without compromising the user’s comfort remains an open challenge [1]. Current solutions designed for the final consumer are based on the following key parameters: efficiency of building spaces, complexity, and scalability of a given solution to multiple building spaces. Because building operation is heterogeneous, a customized solution can be costly and inefficient. One way to address this problem is to implement a more flexible supply and demand scheme, a reduction in the demand and an evaluation of the response mechanisms. The current research is motivated by the necessity of reducing the energy consumption for cooling and heating in residential buildings by using a data-driven model.

In this paper we propose a discrete model based on Hidden Markov Models (HMMs). A strong motivation for implementing HMMs in this direction is given nowadays by the following challenges in the energy sector: a very large amount of data recorded by the smart meters/energy boxes in every building, for every customer, and the need for intelligent data mining techniques. Today, technology automates the mining process, integrates it with the commercial data and presents it in a way that is relevant for business and for energy-saving schemes. Big data analytics is recognized as the main engine of the energy sector, moving it from the well-known periods of missing or poor-quality data into a highly connected world with many data streams and predictive markets. Following the interest for Markov models in the energy sector, the effort on the applications of HMMs in this field was initiated and continued. HMMs provide general-purpose models for univariate and multivariate time series, including seasonal and cyclic time series, as is the case in many applications in the energy field. It is worth mentioning that a predictive controller based on a HMM with its parameters calibrated for the behavior of the building/weather/user comfort may reduce energy demand for heating/cooling in different building construction types. First, we have a discrete, time-dependent series of data on heating or cooling heat losses. The HMM states are generated based on


A.P. Ulmeanu et al. / Energy and Buildings 154 (2017) 127–140

the available process records history. The methodology of thermal profiling for the testing facility under Romanian climate conditions is based on the estimation of the HMM parameters and on the decoding of the most probable state sequence.

In the scientific literature there is a wide range of applications based on HMMs. Jafarzadeh et al. applied HMM and Viterbi algorithms for an hour-ahead wind power prediction for power systems, which showed good correlation between actual and predicted data [2]. So far, the studies related to residential buildings confirmed the use of HMMs in the following fields:

• Behavior characteristics. An important research direction focuses on estimating the probability of the occupants’ presence in the building but also on determining the occupancy profile in residential buildings. The importance of the user occupancy is related to both energy consumption and user behavior. Richardson et al. [3] presented a detailed analysis regarding occupancy in UK households. The authors used Markov chain (MC) techniques to generate synthetic data at a 10-min resolution, using a different scenario for weekdays and weekends, and validated them by using 2000 Time Use Survey (TUS) data. The approach indicated that the model performs very well in terms of producing data with statistical characteristics similar to the original TUS data. Andersen et al. [4] used Markov chains for dynamic modeling of occupants’ presence toward achieving reliable simulation of energy consumption in buildings. In [4] the authors compared a large data survey for domestic building occupancy, collected with a 10-min resolution, to synthetic data generated by Markov-chain techniques. The comparisons indicated that the model performs very well in terms of producing results with statistical characteristics similar to the original data. Bicego et al. [5] investigated whether the electrical energy consumption of a user can be a distinctive behavioral biometric trait, which is a novelty in the literature. Lu et al. [6] used simple sensing technology in order to automatically detect occupancy patterns in a house and used these patterns for saving energy by automatically turning off the Heating, Ventilation & Air Conditioning (HVAC) system of that house. The main tool used in that paper is a Hidden Markov Model with three states: Away, Active and Sleep.

• Load profiles. In the scientific literature the load demand is strongly related to user activities. In [7] the authors combined the behavior characteristics, using a Monte Carlo (MC) method, with aggregated load for developing a methodology for low voltage load models. The electric load profiling analysis was also highlighted by Richardson et al. [8], who combined occupancy patterns with the generated electricity demand of all major appliances found in the domestic environment, with a resolution of 1 min. The approach was based on constructing a Markov-chain occupancy profile and an appliance-activity mapping. The study was validated on 22 dwellings by showing good daily correlation between the approach and collected data [8]. Other relevant research was done by Widén et al. using a model with three states (absent; at home and sleeping; at home and active) associated with nine energy-user activities. The authors converted the occupancy profile to lighting demand, thus obtaining a well-functioning Markov chain model [9]. In [10], the authors were able to build a thermal profile of energy consumption using Hidden Markov Models for demand-response (DR) programs that focus on the temperature-sensitive part of residential electricity demand. A Hidden Markov Model is also used in [10] in order to identify the electric loads.
This problem of energy disaggregation, i.e., the decomposing of the household’s energy consumption into individual appliances in order to improve energy utility efficiency, has been addressed by many researchers (cf., e.g., [11–17]). Different mathematical techniques have been used for this purpose: Factorial Hidden Markov Models [11–13], clustered regression [14], and conditional random fields [15]. Note also the work of Kelly and Knottenbelt [16] on a metadata schema for energy disaggregation and the valuable survey of Zoha et al. [18] on non-intrusive load monitoring methods and techniques for disaggregated energy. In [17], Ghosh et al. are interested in models for consumer demand response estimation in a smart grid. A stochastic model based on Markov chains is introduced; this model differentiates two components of consumers’ demand response behavior: a long-term steady component and a short-term dynamic component. An interesting work is the one of Ardakanian et al. [19], which also addressed the problem of electricity consumption profiles based on a tractable time-series autoregressive model (Periodic Auto Regression with eXogenous variables – PARX) that is able to isolate the effect of external temperature on electricity consumption.

• Forecasting energy consumption. Important scientific literature is dedicated to forecasting the energy consumption based on various stochastic models. Thus, Ardakanian et al. [20] made use of Markov models for predicting the home electricity consumption, while Bondu and Dachraoui [21] used Markov chains combined with co-clustering models in order to simulate individual electricity consumption. State space models are used in [22–24] in order to predict short-term power consumption for smart grids. A significant number of researchers (see, e.g., [17,19–21,25,26]) use Grey Markov models in order to forecast the energy consumption. Another research direction for estimating energy consumption in buildings, presented in [27], is based on a recently developed stochastic model called Conditional Restricted Boltzmann Machine. This method is compared with recurrent Artificial Neural Networks (ANNs) and Hidden Markov Models.
It is worth mentioning that a better accuracy can be obtained in a refined model that uses a series of predictors in cascade [28]. While the effort to obtain such a system is almost n_p times more complicated, where n_p is the number of successive predictors, the time for retraining remains almost the same, since each predictor could run on a separate machine, these devices being parallelized. Genetic algorithms could also be employed to enhance the ANN training for data predictions, but this approach is time-consuming. In [29] there is a detailed comparison between training methods that includes a genetic approach. Finally, let us also cite the monograph of Berk [30] dedicated to stochastic methods for modeling and predicting electricity demand.

The aim of this paper is to highlight the concept of HMM as a strategy to reveal details about the hidden states and to discover the insights they might provide about the energy demand in general, and the household thermal profiling in particular. Our research has been applied to one of the passive houses located in the University Politehnica of Bucharest (UPB) campus. The data acquisition system is described in Section 2. Section 3 defines the problem and the assumptions which will be later used in our research. Section 4 reveals the approach we propose and the algorithm. We leverage this approach in Sections 5 and 6 to more effectively detect the optimal HMMs, together with an estimator of energy demand, based on the concept of effective energy rate. An anomaly detection algorithm based on likelihood estimation is presented from a machine learning perspective in Section 7. In the last section we present the main conclusions, discussion and future work.

2. The data acquisition system

We conduct our research in the Politehnica Passive House, equipped with a monitoring and data acquisition system. The building comprises in the same envelope two houses separated by a common wall, having a total surface of 140 m². From an architectural point of view, the two houses are similar. Each house includes a hall, a living room, a small bathroom, a kitchen, and a technical room on the ground floor, and two bathrooms, two bedrooms, an office and a hall on the first floor. The only difference between the two houses is given by the design of the HVAC systems. Both HVAC systems are based on renewable energy sources like geothermal and solar energy and an earth-to-air heat exchanger. To take advantage of the solar energy, the building has a dedicated architecture, based on a southern orientation (Fig. 1). The building has large windows on the southern side and small windows on the north. One of the houses has an eastern orientation (called “East House”), while the other has a western orientation (called “West House”). The research presented in this paper was performed for the “East House”, especially designed for benchmarks.

Fig. 1. Politehnica passive house: (a) southern side; (b) northern side.

Fig. 2. The HVAC schema for the “East House”. Left: airstreams and the position of the temperature sensors. Right: 1 – EAHE; 2 – condensation tower; 3 – by-pass; 4 – MVHR; 5 – electric radiant panels.

An Earth-to-Air Heat Exchanger (EAHE) solution has been investigated and implemented in the HVAC system operating in the “East House”. The researchers modeled several EAHEs and suggested that the best EAHE design for the “East House” and the local climate conditions would use 39 m of pipe with a diameter of 0.20 m, buried 2 m deep, with up to 200 m³/h air flow rate. A Mechanical Ventilation with Heat Recovery (MVHR) system has also been proposed to supply the fresh air and to remove the exhausted air. The airstreams run through a device called a heat exchanger that allows the outgoing air to pass most of its heat to the incoming air without the two airstreams actually mixing together. The schema for the HVAC system of the “East House” is presented in Fig. 2. The MVHR system also relies on the electrical resistance shown in Fig. 2, because the exchange surfaces can frost up during extremely cold winters. The air

quality within a room is controlled based on a CO2 sensor linked to the ventilation rate of the fresh air introduced into the room. Table 1 summarizes the thermal properties of the building elements that compose the eastern elevation of the house, namely the “East House”. The thermal envelope of the building complies with the testing facility standard with respect to thermal transmittance, U_exterior.walls < 0.15 W/m²/K and U_exterior.windows < 0.8 W/m²/K. The related works of Badea et al. [31,32] are worth mentioning here, as their purpose was to build a mathematical model in Passive House – UPB, based on the analysis of the life-cycle cost. Fourteen types of houses were also analyzed in [33], each house being differentiated by the type of renewable solution used.

The “East House” has a web-based control system that allows meter readings. The system gathered indoor environmental data at certain acquisition rates (Table 2), including ambient temperature, relative humidity (RH) and CO2 concentration in selected locations of the house (Fig. 3). Real-time results were made available online so that we were able to access them remotely. The monitoring campaign lasted 12 months (January 2014–December 2014), to assess the thermal behavior of the house during different seasons. There were certain gaps during the monitoring period due to malfunctioning of the sensors; however, as such long-term monitoring was carried out, these gaps do not compromise the analysis of the overall performance of the house. The technical information about the web-monitoring solution and the description of the wireless sensor network in the Politehnica house can be found in [34–36]. The main requirements of the monitoring system are listed in Table 2.


Table 1
Thermal properties of the building elements [31].

Envelope component      Layer                      Surface [m²]  Thickness [mm]  Thermal conductivity, λ [W/m/K]
Roof                    Plaster                    96            22              0.8
                        Reinforced concrete                      130             1.74
                        Mineral wool                             400             0.04
Exterior walls          Interior plaster           182.52        22              0.8
                        Cellular concrete                        250             0.27
                        Mineral wool                             300             0.04
Ground floor            Parquet                    94.40         22              0.2
                        OSB board                                8               0.13
                        Lightly reinforced mortar                50              1.1
                        EPS high density                         150             0.04
                        Reinforced concrete                      120             1.74
                        XPS polystyrene                          180             0.04
                        Lightly reinforced mortar                50              1.1
Party wall to neighbor  Plaster                    86.72         22              0.8
                        Solid brick                              250             0.27
                        Plaster                                  22              0.8

U-values listed with the table [W/m²/K]: 0.107, 0.122, 0.114.

Element   Component    Surface [m²]  G-value  U-value [W/m²/K]
Windows   Low-E glass  29.87         0.5      0.6
          Frame                      –        0.78
Door      Door         2.19                   0.8

Table 2
The requirements for the monitoring solution.

Requirement       Description
Reliability       A feature which starts automatically and configures the devices available in the network
Scalability       Capability of the solution to process a growing amount of data
Third party app   Access to the monitoring system using web services
Time step         Data collection at specified time rate (1, 5, 10, 15, 30, 60 min)
Maneuverability   Data export in various formats

Fig. 3. Position of sensors shown on ground (left) and first floor (right) of Politehnica house.

The DataLayer solution is used to store the collected data in a common database. A conceptual model was developed for collecting the data (see Fig. 4). The storage model is built on six SQL tables. For each reading session a sessionID and a deviceGroupID are assigned. The sessionID is linked to a table where the date of the reading is stored, while the deviceGroupID is linked to a table where the hardware platform information is stored. All collected data are stored in the DataStore table.

The DataLogger solution is a software module which reads, decodes and describes the collected data from sensors and stores the information in the common database. The data retrieval is made using two separate threads: one for comfort analysis (internal and external temperature, CO2 concentration, relative humidity) and a second thread for consumption analysis. The secured connection with the database is ensured by a separate third thread, as shown in Fig. 5. The trigger builder sets the time to “wake up” and schedules a job that is handled by the code reading the data recorded by the sensors. The electric parameters are read every second, while the comfort parameters are read every 10 min. The electric parameters are first collected as *.csv files on a Raspberry Pi through the SMX software, an open source code. These files are later downloaded, parsed and stored in the common database.

The DataViewer solution is used to export the data in known formats (*.pdf, *.xlsx, *.docx) in an organized way. The exported data can be


Fig. 4. Conceptual model database diagram for monitoring system – Politehnica Passive House.

Fig. 5. In house developed monitoring solution.

reached either through the web portal or through web services, allowing third-party access.

3. Assumptions and preliminaries

A discrete-time Hidden Markov Model is a stochastic process generated by two interrelated probabilistic mechanisms (cf., e.g., Rabiner [37,38], Ephraim and Merhav [38]). These are: (a) an underlying unobserved Markov chain (MC) and (b) a set of random functions, each of them associated with its respective state. At discrete instants of time, the process is assumed to be in a certain state and an observation is generated by the random function corresponding to the current state. The underlying MC then changes its state according to its transition matrix. An observer sees only the output of the random functions associated with each state and cannot directly observe the states of the underlying MC. Consequently, the MC is said to be hidden in the observations. The objective is to estimate the states of the chain, given the observations. To be more specific, a HMM is described by:

1. An unobserved Markov chain $J = (J_n)_{n \in \mathbb{N}}$ with state space $E = \{1, 2, \ldots, M\}$, where $\mathbb{N}$ is the set of nonnegative integers; the behavior of the Markov chain is described by the transition probability matrix $P = (p_{ij})_{i,j \in E}$, where $p_{ij} = P(J_{n+1} = j \mid J_n = i)$, $n \in \mathbb{N}$, and by its initial distribution $\alpha = (\alpha(i))_{i \in E}$, where $\alpha(i) = P(J_0 = i)$.

2. An observed sequence of random variables $Y = (Y_n)_{n \in \mathbb{N}}$ with state space $A = \{1, 2, \ldots, Q\}$. The random variables $(Y_n)_{n \in \mathbb{N}}$ are assumed to be conditionally independent, given $(J_n)_{n \in \mathbb{N}}$. That is to say, the following conditional independence holds true:

$$P(Y_n = a \mid Y_{n-1} = \cdot, \ldots, Y_0 = \cdot, J_n = i, J_{n-1} = \cdot, \ldots, J_0 = \cdot) = P(Y_n = a \mid J_n = i) = R_{i;a} \tag{1}$$

For the given state spaces $E$ and $A$, let us denote by $\mathcal{M}_{E \times A}$ the set of nonnegative 2-dimensional matrices on $E \times A$. So, $R = (R_{i;a};\ i \in E, a \in A) \in \mathcal{M}_{E \times A}$ is the conditional distribution of the chain $Y$, given the unobserved underlying Markov chain $J$. This is called the emission probability matrix.
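The two mechanisms above can be illustrated with a short simulation. This is a minimal sketch, not the authors' code: the function name `simulate_hmm` and all numerical values of the initial distribution, the transition matrix P and the emission matrix R are hypothetical, and states and observation types are indexed from 0 rather than from 1 as in the text.

```python
import numpy as np

def simulate_hmm(alpha, P, R, n, seed=0):
    """Draw n hidden states and n observations from a discrete HMM.

    alpha : initial distribution over the M hidden states
    P     : M x M transition matrix, P[i, j] = P(J_{n+1} = j | J_n = i)
    R     : M x Q emission matrix,  R[i, a] = P(Y_n = a | J_n = i)
    """
    rng = np.random.default_rng(seed)
    M, Q = R.shape
    states = np.empty(n, dtype=int)
    obs = np.empty(n, dtype=int)
    # The first state is drawn from the initial distribution.
    states[0] = rng.choice(M, p=alpha)
    obs[0] = rng.choice(Q, p=R[states[0]])
    for t in range(1, n):
        # The hidden chain moves according to the row of P of the previous state.
        states[t] = rng.choice(M, p=P[states[t - 1]])
        # The observation depends only on the current hidden state, as in Eq. (1).
        obs[t] = rng.choice(Q, p=R[states[t]])
    return states, obs

# Hypothetical 2-state, 2-observation-type model (illustrative values only).
alpha = np.array([0.5, 0.5])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
R = np.array([[0.8, 0.2],
              [0.3, 0.7]])
states, obs = simulate_hmm(alpha, P, R, n=24)
```

An observer would see only `obs`; the array `states` plays the role of the hidden chain that the estimation procedure tries to recover.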

Let us now specify the state spaces of the processes $(Y_n)_{n \in \mathbb{N}}$ and $(J_n)_{n \in \mathbb{N}}$ in our case. First, in our current HMM study the states $E$ of the MC represent different levels of dynamic (hourly) building thermal loads (heating/cooling). As is well known, the choice of the number of states of the hidden process in a HMM is not a problem that has a clear theoretical solution. As we will detail in the next section, we will consider 2, 3, 4 or 5 states for the Markov chain. The choice of the number of states will be based on information criteria. Second, our observed data represents the electric energy (hourly) consumed by the HVAC system together with the outdoor air temperature. In the following section we will show how we can group our data in order to obtain a discrete state space $A$ for the observed process $Y$ such that: (i) the grouping of the states is done in such a way that it captures some important, crucial information in the data; (ii) we get a tractable Hidden Markov Model. We will consider the case with either two, three or four types of observations.

Throughout this paper, we will assume that:

• The HMM is homogeneous with respect to time, that is to say that its transition probabilities do not depend on the time index $n$ and also that the conditional distribution of $Y_n$, given $J_n$, does not depend on $n$.
• The Markov chain $(J_n)_{n \in \mathbb{N}}$ is ergodic.
• The underlying Markov chain is stationary; that means that the initial distribution is the stationary one. In other words, the process under study is assumed to be in an equilibrium position.

Note that a non-stationary HMM is completely determined by its parameters $\theta = (\alpha, P, R)$, while a stationary HMM is completely determined by its parameters $\theta = (P, R)$. Note also that the time parameter is discrete and the state spaces of these two probabilistic mechanisms are both finite and discrete.
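Under the stationarity assumption, the initial distribution is the one that is left unchanged by the transition matrix, which is why a stationary HMM needs only P and R. The following is a minimal sketch of how that distribution could be computed for an ergodic chain; the function name `stationary_distribution` and the example matrix are hypothetical and not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Return pi such that pi @ P = pi and pi sums to one (ergodic chain assumed)."""
    M = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the normalization sum(pi) = 1.
    A = np.vstack([P.T - np.eye(M), np.ones(M)])
    b = np.zeros(M + 1)
    b[-1] = 1.0
    # Least squares solves the (consistent) overdetermined linear system.
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Hypothetical 2-state transition matrix (illustrative values only).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = stationary_distribution(P)
```

For this example matrix the chain spends two thirds of the time in the first state and one third in the second, which can be checked by verifying that `pi @ P` equals `pi`.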
In the following section we will have to solve several problems, namely: estimate the parameter $\theta$, given the observations (via the Baum–Welch algorithm); decode the most probable sequence of states (via the Viterbi algorithm); and choose an optimal model, according to an information criterion (like AIC or BIC). The reader


is also referred to Cappé et al. [39] for an overview of statistical and information-theoretic aspects of HMMs.

3.1. Indoor and outdoor environment parameters of buildings and the data used in our HMM model

One of the largest end-uses of energy is thermal conditioning (HVAC) in response to outdoor temperatures – heating/cooling of premises – which accounts for around 30% of the total energy consumed in Romania. Most of the current debates on energy consumption versus building models encompass the analysis of the actual and theoretical building model performance, i.e. matching of measured and simulated energy consumption. As we have mentioned, the data about actual performance are collected quantitatively: indoor temperature (Ti), outdoor temperature (T), air relative humidity (RH) and concentration of carbon dioxide (CO2). Continuous monitoring of these indoor environmental parameters is conducted for 30–60 days and a Coefficient of Variation of the Root Mean Square Error (CV-RMSE) is usually proposed to evaluate the agreement between the measured and the simulated data. Real weather data are also proposed to be acquired from a weather station in the proximity of the building, instead of using historical data for the area. These kinds of building-calibration approaches are currently used for the testing facilities in order to calibrate the thermal and the ventilation system, as well as the solar shading devices (roughly speaking, the HVAC system). The residents’ behavior could be considered as the most important source of uncertainty during this calibration. In our approach, we consider that this process of calibration has already been done. Our goal is to reveal the dynamics of the building thermal (hourly) load (heating/cooling) underlying the HVAC loads, under the uncertainties induced by the residents’ behavior. The values of building thermal (hourly) loads are not accessible through direct observations.

4. HMM strategy

A central contribution of this paper is a novel method of applying HMMs to achieve more accurate insights about the household thermal profiling. The relevant approaches usually give an estimate of the building thermal load (defined as heating/cooling) in a certain space-time window. The parameters to be considered are: the influence of heat transfer, ventilation, solar radiation, relative humidity, a comfort zone where at least 80% of the residents are satisfied, and temperatures in any adjacent unheated spaces. The uncertainties in the involved parameters are considerable and difficult to determine. Taking this consideration as a starting point, HMMs are applied in this work to fill the gap of unknown thermal hourly load (heating/cooling) levels. The heat transfer regime, namely the states corresponding to lower or higher building hourly thermal loads related to the outdoor air temperature, will be considered as the underlying mechanism affecting the generation of observations. The thermal load (heating/cooling) is considered to be changing with time (spatial variation is not considered) and therefore it might achieve significantly different values. Thus, discrete levels of the underlying thermal loads are hypothesized and sought. The number of these discrete levels generating HVAC thermal loads is also unknown and has to be revealed. By applying different HMMs (in terms of the number of states) and further comparing them, we infer this number as the one corresponding to the best fitted HMM. According to previous indications, there is a causal relation between the thermal load levels and the HVAC thermal loads. For this reason, the observations are classified based on the two-dimensional data (HVAC thermal load; outdoor air temperature).

Fig. 6. Energy consumption characteristics of HVAC system, for two observation types, HMM fitted to the testing facility located in UPB, Romania, for data recorded between January and December 2014 (8760 h). The first observation type is represented in black, the second observation type is represented in cyan. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

As previously mentioned, the observations consist of the electric energy hourly consumed by the HVAC system together with the outdoor air temperature. In order to obtain a simple, tractable and flexible model, we will group these data in such a way that we make sure we have relevant information. More precisely, we will organize the data to obtain a certain number of clusters (we did it for two, three and four clusters). These clusters will represent the state space A of the observed process Y. The number of hidden states has also to be addressed. We will consider several possible hidden states (2, 3, 4 or 5) and choose the number of states that better fits the data.
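The grouping step described above can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic (two artificial groups of hours, not the UPB measurements), the function name `kmeans_labels` is hypothetical, the two features are standardized before clustering (a choice made here, not stated in the paper), and a plain Lloyd iteration is used in place of whatever k-means implementation the authors relied on.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50, seed=0):
    """Assign each row of X to one of k clusters with a basic Lloyd iteration."""
    rng = np.random.default_rng(seed)
    # Standardize so that energy and temperature weigh equally in the distance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Start from k distinct data points chosen at random.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance of every point to every center.
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

# Synthetic hourly records (illustrative only): cold hours with high HVAC
# consumption followed by mild hours with low consumption.
rng = np.random.default_rng(1)
temp = np.concatenate([rng.normal(-5.0, 1.0, 100), rng.normal(20.0, 1.0, 100)])
energy = np.concatenate([rng.normal(3.0, 0.2, 100), rng.normal(0.5, 0.2, 100)])
X = np.column_stack([energy, temp])

# Each hour is mapped to an observation type in {0, ..., Q-1}, here with Q = 2.
y = kmeans_labels(X, k=2)
```

The resulting label sequence `y` is the kind of discrete observation sequence that an HMM with Q observation types can be fitted to.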

4.1. Two observation types

We first consider that the observation space is divided into two domains, as shown in Fig. 6. The classification is based on the k-means clustering method, as these data form clusters around the respective nearest mean values. First, the existence of two hidden states underlying the observations is considered. At this point, we apply the Baum–Welch algorithm in order to solve the estimation problem for the two-state HMM. The fitting procedure is iterative, so initial values have to be chosen. The initial values of the parameters can be chosen in a number of ways: we can choose them from a uniform distribution, incorporate some prior knowledge, or even select them randomly. In each case the parameters need to be initialized with values near those giving the global maximum of the likelihood function; otherwise the algorithm may converge only to a local maximum. Throughout this paper, the initial values were randomly selected many times and the fit with the greatest likelihood was adopted. Although this method does not guarantee a global maximum, it provides some degree of confidence that the global maximum has been reached. Then, we assume the existence of three underlying hidden states and follow the same procedure as for two hidden states. We proceed in the same way for four and five hidden states. In order to select an optimal model among these four models with 2 observation types, we will use an information criterion like Akaike’s Information Criterion (AIC) or the Bayesian Information Criterion (BIC). The best model will be selected by means of the AIC, defined by

$$\mathrm{AIC} = -2 \cdot \operatorname{Log}(L) + 2 \cdot m, \tag{2}$$

where Log(L) is the estimated maximum log-likelihood function and m is the number of the model’s free parameters, or by means of the BIC, defined by

$$\mathrm{BIC} = -2 \cdot \operatorname{Log}(L) + m \cdot \operatorname{Log}(n), \tag{3}$$

where n is the number of observations (in our case, n = 8760). Taking into account the restriction of a MC transition matrix (each row sums to one) and also the restrictions on $R_{i;a}$ (the sum over $a$ is one), it is clear that the number of free parameters is

$$m = M \cdot (M - 1) + M \cdot (Q - 1), \tag{4}$$

where M is the number of the MC states and Q is the number of the observation types (Q = 2 in this case). Note also that for a sequence of observations $y_1, y_2, \ldots, y_n$, the likelihood function of these observations is given by

$$L(y_1, \ldots, y_n; \theta) = \sum_{j_1, \ldots, j_n \in E} P(Y_1 = y_1, J_1 = j_1, \ldots, Y_n = y_n, J_n = j_n) = \sum_{j_1, \ldots, j_n \in E} \alpha(j_1) R_{j_1; y_1} p_{j_1 j_2} R_{j_2; y_2} \cdots p_{j_{n-1} j_n} R_{j_n; y_n}. \tag{5}$$

Table 3
HMM with 2 observation types.

Model         m    Log(L)     AIC        BIC
2 state HMM   4    −5347.67   10,703.36  10,731.67
3 state HMM   9    −1680.10   3378.21    3441.91
4 state HMM   16   −1694.64   3421.28    3534.53
5 state HMM   25   −1674.29   3398.59    3575.54

Iteration h of the step E:

(1) Forward
• Initialization:
  – We have $p^{(h)} = (p_{uv}^{(h)})_{u,v \in E}$, $(R_{u;a}^{(h)})_{u \in E, a \in A}$ (for h = 1 we have to choose them).
  – We take an initial value of $P(J_1 = v \mid y_1, \theta^{(h)})$, $v \in E$ (we can take the initial distribution of $p^{(h)}$).
• For t = 2, …, n we compute:
  – $P(J_t = v \mid y_1^{t-1}, \theta^{(h)}) = \mathrm{fnct}(P(J_{t-1} = v \mid y_1^{t-1}, \theta^{(h)}))$ (predictive equation)
  – $P(J_t = v \mid y_1^{t}, \theta^{(h)}) = \mathrm{fnct}(P(J_t = u \mid y_1^{t-1}, \theta^{(h)}))$ (filtering equation)

(2) Backward
• Initialization:
  – We obtained $P(J_n = v \mid y, \theta^{(h)})$ at the end of the step Forward.
• For t = n, …, 2 we compute (smoothing equation):
  – $P(J_{t-1} = u, J_t = v \mid y, \theta^{(h)}) = \mathrm{fnct}(P(J_t = v \mid y, \theta^{(h)}))$
  – $P(J_{t-1} = u \mid y, \theta^{(h)}) = \mathrm{fnct}(P(J_t = v \mid y, \theta^{(h)}), v \in E)$

Iteration h of the step M: Choose $\theta^{(h+1)} = \arg\max_{\theta \in \Theta} Q(\theta \mid \theta^{(h)})$

$$\alpha(v)^{(h+1)} = \frac{1}{n} \sum_{t=1}^{n} P(J_t = v \mid y, \theta^{(h)}) \tag{6}$$

$p_{uv}^{(h+1)}$

Note that, in order to compute the estimated value of this likelihood, we need to replace  by an estimator, i.e. to replace the initial distribution ˛, the transition probabilities P and the emission probabilities R by the corresponding estimators. The associated formulas are given in Eqs. (6)–(8) below and the values of these estimators in our case are provided below, in Eqs. (9) and (10). In Table 3 we present, for each HMM taken into account, the number m of free parameters, the estimated maximum loglikelihood, the estimated corresponding AIC and BIC information criteria. Note that both criteria AIC and BIC choose the same model, namely the HMM with 2 observation types and 3 hidden states; we will denote this model by M2;3 . The estimation of the parameters is carried out through the Baum–Welch algorithm applied to Hidden Markov Models, the socalled EM algorithm. The idea of the algorithm is as follows: let  be the parameter defining the model and  (h) an iterative value of the parameter; instead of maximizing L(y1 , . . ., yn ;), the likelihood function of the observations y1 , y2 , . . ., yn , we maximize the conditional expectation

(h+1) Rv;a

=

n P(Jt−1 = u, Jt = v | y,  (h) ) t=2 n (h) t=2

P(Jt−1 = u | y, 

t=1

P(Jt = v | y, 

)

n P(Jt = v | y,  (h) )1{Yt =a} t=1 = n (h) )

(7)

(8)

For the model M2;3 , the estimated transition probability matrix Pˆ estimated here by the Baum–Welch algorithm, is the following:

(9) ˆ determined by the solution The emission probability matrix R, of the estimation problem in the EM (Baum–Welch) algorithm, is

(10) The fitted HMM indicates that the first and the second level of the thermal load (heating/cooling) emits observations mainly of the second type, while the third level of the thermal load (heating/cooling) produces observations mainly of the first type.

Q ( |  (h) ) = E(log L(y1 , . . ., yn , j1 , . . ., jn ; ) | y1 , . . ., yn ,  (h) ),

4.2. Three observation types

where L(y1 , . . ., yn , j1 , . . ., jn ;) represents the complete likelihood function of the observations y1 , y2 , . . ., yn and hidden part j1 , j2 , . . ., jn . The iteration h of the algorithm has the following structure: starting from an initial value  (1) and knowing the present value  (h) , compute iteratively Q( |  (h) ) through the so-called E step and update the parameter, by maximizing Q( |  (h) ), through the so-called M step. Thus we get the update  (h+1) of the parameter. It can be shown that the recurrent computation of Q( |  (h) ) is crucially based on three equations, respectively called predictive, filtering and smoothing equation, that allow to recurrently evaluate the probabilities P(Jt = v | y1t−1 ,  (h) ), P(Jt = v | y1t ,  (h) ), P(Jt−1 = u, Jt = v | y,  (h) ) and P(Jt−1 = u | y,  (h) ). Thus, the structure of the EM algorithm is as follows.

We now consider that the observation space is divided into three observation types, as shown in Fig. 7. As previously, the classification is based on the k-means clustering method. As before, we consider several possible numbers of hidden states (2, 3, 4 or 5) and choose the one that best fits the data according to the information criteria (AIC and BIC). In Table 4 we present, for each HMM taken into account, the number m of free parameters, the estimated maximum log-likelihood, and the corresponding AIC and BIC values. Both criteria select the same model, namely the HMM with 3 observation types and 3 hidden states; we denote this model by M3;3.
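As a minimal sketch of this selection step, the following Python snippet recomputes the AIC and BIC of Eqs. (2) and (3) from the log-likelihoods reported in Table 4, with the number of free parameters taken from Eq. (4). The function and variable names are illustrative assumptions and are not part of the original study; Log is taken to be the natural logarithm, which agrees with the tabulated BIC values.

```python
import math


def n_free_parameters(n_states, n_obs_types):
    """Eq. (4): m = M(M - 1) + M(Q - 1)."""
    return n_states * (n_states - 1) + n_states * (n_obs_types - 1)


def aic(log_l, m):
    """Eq. (2): AIC = -2 Log(L) + 2 m."""
    return -2.0 * log_l + 2.0 * m


def bic(log_l, m, n):
    """Eq. (3): BIC = -2 Log(L) + m Log(n), natural logarithm."""
    return -2.0 * log_l + m * math.log(n)


# Maximum log-likelihoods from Table 4 (Q = 3 observation types), keyed by M.
LOG_L = {2: -8955.99, 3: -3082.97, 4: -3082.83, 5: -3097.81}
N_OBS = 8760  # hourly records over one year
Q = 3

scores = {}
for M, log_l in LOG_L.items():
    m = n_free_parameters(M, Q)
    scores[M] = (aic(log_l, m), bic(log_l, m, N_OBS))

# The model with the smallest criterion value is preferred.
best_by_aic = min(scores, key=lambda M: scores[M][0])
best_by_bic = min(scores, key=lambda M: scores[M][1])
print(best_by_aic, best_by_bic)
```

The 3-state and 4-state models have almost identical log-likelihoods, so the penalty term on m decides between them in favour of the smaller model.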

Fig. 7. Energy consumption characteristics of HVAC system, for three observation types, HMM fitted to the testing facility located in UPB, Romania, for data recorded between January and December 2014. The first observation type is represented in purple, the second observation type is represented in orange, and the third is represented in brown. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

Table 4
HMM with 3 observation types.

Model         m    Log(L)      AIC         BIC
2 state HMM   6    −8955.99    17,923.98   17,966.45
3 state HMM   12   −3082.97    6189.94     6274.88
4 state HMM   20   −3082.83    6205.66     6347.22
5 state HMM   30   −3097.81    6255.62     6467.96

The transition probability matrix $\hat{P}$, estimated here by the Baum–Welch algorithm, is the following:

(11)

The emission probability matrix $\hat{R}$, determined by the solution of the estimation problem, is

(12)

In Fig. 8 we can see the diagram of the fitted HMM M3;3, together with the estimated Markov transition probabilities and the emission probabilities. The fitted HMM indicates that the first level of the thermal load (heating/cooling) emits observations of the second type with a probability of 0.9882, the second level of the thermal load (heating/cooling) emits observations of the first type with a probability of 0.9855, whereas the third level of the thermal load (heating/cooling) emits observations of the third type with a probability of 0.9874. Based on these results, the HMM identifies and extracts, with high probability, three seasonal components, i.e. three hidden states, which are otherwise not apparent in the energy demand time series plot itself.

4.3. Four observation types

We now consider that the observation space is divided into four observation types, as shown in Fig. 9. As previously, the classification is based on the k-means clustering method. As before, we consider several possible numbers of hidden states (2, 3, 4 or 5) and choose the one that best fits the data according to the information criteria (AIC and BIC).

Fig. 9. Energy consumption characteristics of HVAC system, for four observation types, HMM fitted to the testing facility located in UPB, Romania, for data recorded between January and December 2014. The first observation type is represented in pink, the second in gray, the third in yellow, and the fourth in magenta. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

Table 5
HMM with 4 observation types.

Model         m    Log(L)        AIC         BIC
2 state HMM   8    −10,964.92    21,945.85   22,002.47
3 state HMM   15   −4637.93      9305.87     9412.04
4 state HMM   24   −4682.28      9412.57     9582.44
5 state HMM   35   −3860.36      7790.73     8038.46

In Table 5 we present, for each HMM taken into account, the number m of free parameters, the estimated maximum log-likelihood, and the corresponding AIC and BIC values. Note that both criteria AIC and BIC choose the same model, namely the HMM with 4 observation types and 3 hidden states; we denote this model by M4;3.

The transition probability matrix $\hat{P}$, estimated here by the Baum–Welch algorithm, is the following:

(13)

The emission probability matrix $\hat{R}$, determined by the solution of the estimation problem, is

(14)

The fitted HMM indicates that the first level of the thermal load (heating/cooling) emits mainly observations of the fourth type, with a probability of 0.9816; the second level of the thermal load (heating/cooling) emits observations of the first type with a probability of 0.689 and of the second type with a probability of 0.311, whereas the third level of the thermal load (heating/cooling) emits mainly observations of the third type, with a probability of 0.9718.

5. Results and optimal model

Following the previous section, it is important to remark that in all three cases we took into account (that is, with 2, 3 or 4 observation types), the best models selected by both criteria, AIC and BIC, are the models with 3 hidden states. In our opinion, this fact is of crucial importance because it tells us that, independently of the classification we consider for the observations, the hidden models we develop are able to capture the same feature of the hidden part, namely that 3 states are relevant. Note also that, for comparing different HMMs in terms of their number of states, a necessary condition is that the models have the same observation space.

Nevertheless, in the cases M2;3 and M3;3 (with 2 and 3 observation types, respectively, and 3 hidden states) we observe in (10) and (12) that the emission probability matrices $\hat{R}$ point to an approximately "one-to-one" correspondence between the hidden states and the observation types. A possible interpretation is the following: the model appears to be ideal in the sense that it reveals almost surely its hidden states given the observations, thus removing its inherent hidden feature. As shown in Table 4, the model M3;3 has the minimum AIC/BIC; consequently, we consider it in the sequel as the "optimum" model.

Fig. 8. Diagram of the fitted HMM M3;3. The first observation type is represented in purple, the second observation type is represented in orange, and the third is represented in brown, as in Fig. 7. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

We also used an 11-fold cross-validation on a rolling basis to evaluate the quality of the fitted HMMs as a function of M, the number of MC states. This method starts with a small subset of data (first month) for training and a small subset of data (second month) for testing/validation. An HMM is fitted to this training dataset and the predictive accuracy is estimated using the validation dataset:

fold 1: training[1], test[2]

Then, the subset of data corresponding to the first and second months is used for training and the third month is used for testing. Again, an HMM is fitted to this training dataset and the predictive accuracy is estimated using the validation/testing dataset:

fold 2: training[1,2], test[3]

The same forecasted data points are then included as part of the next training dataset and subsequent data points are forecasted:

fold 3: training[1,2,3], test[4]
...
fold k: training[1,2,...,k], test[k + 1]
...
fold 11: training[1,2,3,...,10,11], test[12]

Table 6
Performance of the HMMs with different number of states M by comparison to a model with Markov chain, following 11-fold cross-validation.

Model          Normalized $\widehat{\mathrm{AIC}}$ score   Normalized $\widehat{\mathrm{BIC}}$ score
Markov chain   0.7325                 0.7475
M3;2           0.5974                 0.6049
M3;3           0.5884                 0.5972
M3;4           0.5935                 0.6121
M3;5           0.5977                 0.6159

We computed the average of the AIC and BIC scores and proposed them as predictive accuracy indices:

$$\widehat{\mathrm{AIC}} = \frac{1}{11} \sum_{k=1}^{11} \mathrm{AIC}_k, \qquad \widehat{\mathrm{BIC}} = \frac{1}{11} \sum_{k=1}^{11} \mathrm{BIC}_k.$$
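The rolling-origin scheme described above can be sketched as follows. This is an illustration only: `score_fold` is a hypothetical hook standing in for "fit an HMM on the training months and score it on the test month", and the normalization applied to the scores in Table 6 is not specified in the text, so it is not reproduced here.

```python
def rolling_folds(n_months=12):
    """Return [(training_months, test_month), ...]; fold k trains on 1..k, tests on k + 1."""
    return [(list(range(1, k + 1)), k + 1) for k in range(1, n_months)]


def average_score(fold_scores):
    """Mean of the per-fold scores, e.g. (1/11) * sum_k AIC_k."""
    return sum(fold_scores) / len(fold_scores)


def cross_validate(score_fold, n_months=12):
    """Average a caller-supplied score over all rolling folds.

    score_fold(training_months, test_month) -> float is a hypothetical hook;
    in practice it would fit the HMM and return the AIC or BIC of the fold.
    """
    return average_score([score_fold(train, test)
                          for train, test in rolling_folds(n_months)])


folds = rolling_folds()
for k, (train, test) in enumerate(folds, start=1):
    print(f"fold {k}: training{train}, test[{test}]")
```

Because the training window only grows forward in time, no fold is ever scored on data that precede its training months, which is the reason for preferring this scheme to shuffled k-fold splitting on time series.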

For each model, the averaged scores are given in Table 6, for 3 observation types and for M = 2, 3, 4, 5. Not surprisingly, the HMMs in Table 6 have lower predictive uncertainty and lower normalized AIC and BIC scores than the Markov chain model (lower scores correspond to better predictive ability). The normalized AIC decreases from M = 2 to M = 3 and then levels out, while the normalized BIC score reaches its minimum at M = 3. Both scores therefore indicate that M = 3 is a reasonable choice, which supports the number of hidden states selected above.

The solution of the decoding problem, i.e. the most likely sequence of hidden states, can be seen in Figs. 10–21. In these figures, for every time period that we considered (0–120 h, 1400–1496 h and 4000–4096 h), we present, first, three measures related to the model (the variation of the outdoor air temperature, the variation of the HVAC energy consumption and the effective energy response rate in kWh/°C) and, second, the decoded level of the building thermal loads for the optimal model M3;3. In Figs. 10, 14 and 18, the first observation type is represented in purple, the second in orange and the third in brown, as shown in Fig. 8. Also, the first hidden state is represented in red, the second in blue and the third in green, as also shown in Fig. 8.

Fig. 10. The decoded levels of dynamics building thermal loads for the model M3;3, Politehnica house (time interval: 0–120 h, year 2014).

Fig. 11. Variation of the outdoor air temperature, Politehnica house (time interval: 0–120 h, year 2014).

Fig. 12. Variation of HVAC energy consumption, Politehnica house (time interval: 0–120 h, year 2014).

Fig. 13. Effective energy response rate, Politehnica house (time interval: 0–120 h, year 2014).

Fig. 14. The decoded levels of dynamics building thermal loads for the model M3;3, Politehnica house (time interval: 1400–1496 h, year 2014).

Fig. 15. Variation of the outdoor air temperature, Politehnica house (time interval: 1400–1496 h, year 2014).

Fig. 16. Variation of HVAC energy consumption, Politehnica house (time interval: 1400–1496 h, year 2014).

Fig. 17. Effective energy response rate, Politehnica house (time interval: 1400–1496 h, year 2014).

Fig. 18. The decoded levels of dynamics building thermal loads for the model M3;3, Politehnica house (time interval: 4000–4096 h, year 2014). (For interpretation of the references to color in the text, the reader is referred to the web version of the article.)

Fig. 19. Variation of the outdoor air temperature, Politehnica house (time interval: 4000–4096 h, year 2014).

Fig. 20. Variation of HVAC energy consumption, Politehnica house (time interval: 4000–4096 h, year 2014).

Fig. 21. Effective energy response rate, Politehnica house (time interval: 4000–4096 h, year 2014).

We may also illustrate the intuition behind our HMMs by introducing a parameter that estimates the energy response rate. This parameter captures the rate (a) of change in the energy (E) consumed hourly by the HVAC system with the change of the mean hourly outside air temperature T, for a given indoor air temperature $T_i$ (thermal comfort):

$$a = \begin{cases} E/(T - T_i) & \text{for building cooling regime, } T > T_i; \\ E/(T_i - T) & \text{for building heating regime, } T < T_i. \end{cases}$$
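The definition of a translates directly into a few lines of Python. The sketch below is illustrative: the function name, the numerical inputs and the handling of the case T = T_i (which the definition leaves open) are assumptions.

```python
import math


def energy_response_rate(energy_kwh, t_out, t_in):
    """Effective energy response rate a, in kWh/degC.

    energy_kwh: hourly HVAC energy E (kWh)
    t_out:      mean hourly outdoor air temperature T (degC)
    t_in:       indoor air temperature T_i (degC)
    """
    if t_out > t_in:  # cooling regime
        return energy_kwh / (t_out - t_in)
    if t_out < t_in:  # heating regime
        return energy_kwh / (t_in - t_out)
    return math.nan   # assumption: a is treated as undefined when T equals T_i


# Hypothetical hourly record in the heating regime: 1.22 kWh at -10.5 degC outdoors.
a = energy_response_rate(1.22, t_out=-10.5, t_in=20.0)
```

Both branches divide by a positive temperature difference, so a is non-negative whenever the metered energy is non-negative.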

Fig. 22. Histogram of effective energy response factor, Politehnica house (time interval: 1–8760 h, year 2014). (For interpretation of the references to color in the text, the reader is referred to the web version of the article.)

Fig. 23. Disaggregation of the effective response factor, for the model M3;3, Politehnica house (time interval: 1–8760 h, year 2014). (For interpretation of the references to color in the text, the reader is referred to the web version of the article.)

The dynamic of the energy response rate is presented in Figs. 13, 17 and 21. Note that, during our experimental setup at Politehnica Passive House in 2014, the HVAC set point was Ti = 26 ± 0.5 °C for the cooling regime and Ti = 20 ± 0.5 °C for the heating regime.

During the winter season, the HMM 'stays' mainly in the second and third levels of the building thermal load, as shown for a couple of days in January in Figs. 10–13. After that, early spring comes with the HMM expected to 'stay' mainly in the first level of the building thermal load; an example is shown in Figs. 14–17 for a couple of days in February. Finally, during the summer season, the HMM is expected to 'stay' mainly in the second level of the building thermal load; an example for a couple of days in June is presented in Figs. 18–21.
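Decoded state sequences of this kind are commonly obtained with the Viterbi algorithm. The text does not state which decoder was used, so the following is a generic sketch rather than the authors' implementation. The transition matrix, the uniform initial distribution and the off-diagonal emission probabilities are hypothetical placeholders; only the three dominant emission probabilities (0.9882, 0.9855 and 0.9874) are taken from the fitted M3;3 model. States and observation types are indexed from 0.

```python
import math


def _log(x):
    return math.log(x) if x > 0.0 else float("-inf")


def viterbi(obs, alpha, P, R):
    """Most likely hidden-state path for a sequence of observation indices."""
    n_states = len(alpha)
    # delta[v]: best log-probability of any path that ends in state v
    delta = [_log(alpha[v]) + _log(R[v][obs[0]]) for v in range(n_states)]
    back_pointers = []
    for y in obs[1:]:
        prev, delta, pointers = delta, [], []
        for v in range(n_states):
            u_best = max(range(n_states), key=lambda u: prev[u] + _log(P[u][v]))
            pointers.append(u_best)
            delta.append(prev[u_best] + _log(P[u_best][v]) + _log(R[v][y]))
        back_pointers.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda v: delta[v])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


alpha = [1 / 3, 1 / 3, 1 / 3]            # hypothetical initial distribution
P = [[0.90, 0.05, 0.05],                 # hypothetical "sticky" transitions
     [0.05, 0.90, 0.05],
     [0.05, 0.05, 0.90]]
R = [[0.0059, 0.9882, 0.0059],           # state 1 emits mainly type 2
     [0.9855, 0.0073, 0.0072],           # state 2 emits mainly type 1
     [0.0063, 0.0063, 0.9874]]           # state 3 emits mainly type 3

decoded = viterbi([1, 1, 1, 0, 0, 2, 2, 2], alpha, P, R)
```

Working in log space avoids numerical underflow on long sequences such as a full year of hourly records.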

6. Effective energy factor response

We consider that the HMM estimation problem is now solved. A new insight into the dynamic behavior of the energy consumption (HVAC system) is now possible, based on the effective energy factor histogram and its disaggregation. In Fig. 23 we observe that the entire analyzed time interval 1–8760 h reveals an energy response factor with a wide interval of variation, from 0 (HVAC off) to 0.08 kWh/°C. For instance, assuming Ti = 20 °C and an outdoor air temperature T = −10.5 °C (the mean of the last hour records), we have an expected (hourly) HVAC consumption not greater than E = 0.08 kWh/°C × (20 + 10.5) °C = 2.44 kWh.

In the sequel, we consider the optimum model, i.e. the HMM with three states and three observation types (the model denoted by M3;3). Based on the state decoding, we may disaggregate the effective energy factor response, as shown in Fig. 23. Our goal, the identification of the building thermal profile, is thus achieved. As shown in Fig. 22, we can infer the three building thermal profiles; each of them corresponds to a unique hidden state.

A wide range of variation of the effective energy rate (0–0.06 kWh/°C) is observed while the building "stays" in the hidden state 3 (shown in green in Fig. 23). Moreover, the observation type 3 (shown in brown in Fig. 7) is almost surely linked with the hidden state 3. The observation type 3 implies a higher amount of energy consumption, as presented in Fig. 7. A moderate range of variation of the effective energy rate (0–0.04 kWh/°C) is observed while the building "stays" in the hidden state 1 (shown in red in Fig. 23). Also, the observation type 2 (shown in orange in Fig. 7) is most likely linked with the hidden state 1. The observation type 2 implies a moderate amount of energy consumption, as presented in Fig. 7. Finally, a narrow variation of the effective energy rate (0–0.02 kWh/°C) is observed while the building "stays" in the hidden state 2 (shown in blue in Fig. 22). Also, the observation type 1 (shown in purple in Fig. 8) is most likely linked with the hidden state 2. The observation type 1 implies a minimum amount of energy consumption, as presented in Fig. 8.

To summarize, the hidden states 1 and 3 (shown in red and green, respectively, in Figs. 9, 13 and 22) are characteristic of the thermal building profiles during the winter and the early spring/late autumn seasons. The hidden state 2 (in blue in Figs. 18 and 22) is characteristic of the thermal building profile during the summer season. A hidden state is identified as a season variable. We may reason that the set of season variables is unique to every building and mainly depends on weather conditions, users' comfort/behavior and the building's construction type. We trust that we have provided here a methodology showing how the HMM may be used in practice to understand the energy consumption and its components in buildings, mainly driven by the response to outdoor temperature.

7. Anomaly detection in energy consumption data

Fig. 24. The log-likelihood values, ranging from 01 January 2014 through 31 December 2014, Politehnica Passive House, for the dataset and model M3;3. The red circles represent the consumption anomalies identified by the HMM. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

To leverage our work and to support analysts in understanding the energy consumption data, we demonstrate in this section how to highlight consumption anomalies in daily life. Anomaly detection is the first step of the data analysis and is specifically relevant for training forecasting models. There are several sources of anomalies in the energy demand time series for households, such as data processing errors, faulty meter measurements, human errors, naïve disaggregation (stuck meter) and outages. For large volumes of data, an algorithmic process is needed. Our approach determines the probability that a set of data is anomalous. This is done using the presented Hidden Markov Model and a binomial distribution of the residuals. The classifier calculates the maximum likelihood of the data points given the prior clusters based on (5) and uses the likelihood values to distinguish between true consumption and anomalous values (known or unknown sources/causes). The anomaly detection results on the dataset Politehnica Passive House are presented in Fig. 24, with the anomalies found labeled from A to E. The contribution of the proposed Hidden Markov Models is their ability to incorporate domain knowledge in techniques developed to detect anomalies efficiently in energy time series datasets.


Thus, the method developed in this article offers the possibility of obtaining the likelihood of the model given the data, and allows for computing the likelihood within a sliding time window. This makes it possible to detect anomaly patterns in the hourly energy consumption of the building by comparing two successive sliding windows contained in the time period of the study. Hidden Markov Models are able to determine a state sequence from collected data in an unsupervised manner, that is, without the intervention of human experts. However, as we pointed out, the state transitions in the HMMs can have different meanings. Therefore, expert judgment will continue to play a crucial role in achieving better energy efficiency in private and public buildings, while the HMMs could be used as a tool providing an early warning that something is changing in the building energy efficiency status.
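A minimal sketch of the sliding-window idea is given below. The log-likelihood of each window is computed with the scaled forward recursion, which yields the same value as the sum in Eq. (5) at a cost linear in the window length, and a window is flagged when its log-likelihood drops sharply relative to the preceding window. The window width, the threshold, the restart from the initial distribution in every window and all parameter values are assumptions made for illustration; the binomial model of the residuals mentioned above is not reproduced.

```python
import math


def log_likelihood(obs, alpha, P, R):
    """Log of the likelihood in Eq. (5), via the scaled forward recursion."""
    n_states = len(alpha)
    total, filt = 0.0, None
    for y in obs:
        if filt is None:
            joint = [alpha[v] * R[v][y] for v in range(n_states)]
        else:
            joint = [sum(filt[u] * P[u][v] for u in range(n_states)) * R[v][y]
                     for v in range(n_states)]
        scale = sum(joint)  # P(y_t | y_1, ..., y_{t-1})
        if scale == 0.0:
            return float("-inf")
        total += math.log(scale)
        filt = [x / scale for x in joint]
    return total


def window_scores(obs, alpha, P, R, width=24, step=24):
    """(start index, log-likelihood) for each sliding window."""
    return [(s, log_likelihood(obs[s:s + width], alpha, P, R))
            for s in range(0, len(obs) - width + 1, step)]


def flag_drops(scores, max_drop):
    """Start indices of windows whose log-likelihood falls by more than max_drop."""
    return [start for (_, prev), (start, cur) in zip(scores, scores[1:])
            if prev - cur > max_drop]


# Hypothetical parameters (not the fitted matrices of the paper).
alpha = [1 / 3, 1 / 3, 1 / 3]
P = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]
R = [[0.0059, 0.9882, 0.0059], [0.9855, 0.0073, 0.0072], [0.0063, 0.0063, 0.9874]]

# Two regular days, one erratic day, one regular day (synthetic observation types).
series = [1] * 48 + [0, 2] * 12 + [1] * 24
scores = window_scores(series, alpha, P, R)
flagged = flag_drops(scores, max_drop=20.0)
```

In this synthetic series only the window starting at hour 48 is flagged, since the rapid alternation between observation types is poorly explained by a model whose hidden states tend to persist.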

The research work of Vladimir Tanasiev and Adrian Badea has been funded by the project no. 47/2014, Partnerships in priority areas – PN II, carried out by Minister of National Education and Research – Executive Agency for Higher Education, Research, Development and Innovation. The research work of Vlad Stefan Barbu was partially supported by the projects XTerM-Complex Systems, Territorial Intelligence and Mobility (2014–2018) and MOUSTIC-Random Models and Statistical, Informatics and Combinatorics Tools (2016–2019), within the Large Scale Research Networks from the Region of Normandy, France. The primary data used in this article has been collected and provided by the Passive House – UPB research team. The authors would like to thank the anonymous reviewers for their valuable comments.

8. Conclusions

References

This article presents a real application of HMM in the building sector. We used the collected data of the Politehnica house to better understand how a stochastic model can bring understanding to energy consumption of the test building and so to buildings in general. Hourly energy signatures can create more robust energy consumption benchmarks and provide additional insight into energy demand patterns compared to daily, monthly or weekly signatures, although requiring large number of records. The classic concept of heating/cooling degree-days relies on the assumption of a linear relationship between consumption and outdoor temperature which is not always the case. The use of hourly consumptions and temperatures in HMM states permits a more fine-grained analysis. After establishing the assumptions of the model, both information criteria AIC and BIC indicated that the model with three HMM states is better suited than the other models. The energy disaggregation presented in Fig. 23 is due to energy consumption of the HVAC only. The model can identify various consumers as long as HMM states are increased through a calibration algorithm. Note that the need to have different models for different seasons determined us to make use of Hidden Markov Models, because of their ability of detecting different homogeneous (seasonal) regions, that is to say, regions that have different behaviors, like Summer–Winter or different occupancy behaviors. As pointed out in Fig. 23 and the comments related to it, the developed HMMs identified different states. Technically speaking, such states are identified as season variables. It is natural to assume that these season variables are unique to every building and depend mainly on weather conditions, user behavior and building construction. As one can argue, in our study there are many factors to play with – as in a real case. 
This is the reason why, as explained before, only a model like a HMM can be used, because neither GAs, nor ANNs do not have this specification to deal with piecewise homogeneous (seasonal) process. Finally, we trust that a controller based on a Hidden Markov Model with the parameters  = (˛, P, R) calibrated for the behavior of the building/weather/user comfort can reduce energy demand for heating and improve the thermal comfort of occupants in different building construction types.

[1] E. Commission, An EU Strategy on Heating and Cooling, COM(2016), vol. 15, 2016. [2] S. Jafarzadeh, S. Fadali, C.Y. Evrenosoglu, H. Livani, Hour-ahead wind power prediction for power systems using Hidden Markov Models and Viterbi Algorithm, IEEE PES General Meeting (2010) 1–6, http://dx.doi.org/10.1109/ PES.2010.5589844. [3] I. Richardson, M. Thomson, D. Infield, A high-resolution domestic building occupancy model for energy demand simulations, Energy Build. 40 (8) (2008) 1560–1566, http://dx.doi.org/10.1016/j.enbuild.2008.02.006. [4] P.D. Andersen, A. Iversen, H. Madsen, C. Rode, Dynamic modeling of presence of occupants using inhomogeneous Markov chains, Energy Build. 69 (2014) 213–223, http://dx.doi.org/10.1016/j.enbuild.2013.10.001. [5] M. Bicego, F. Recchia, A. Farinelli, S.D. Ramchurn, E. Grosso, Behavioural biometrics using electricity load profiles, 2014 22nd International Conference on Pattern Recognition (2014) 1764–1769, http://dx.doi.org/10.1109/ICPR. 2014.310. [6] J. Lu, T. Sookoor, V. Srinivasan, G. Gao, B. Holben, J. Stankovic, E. Field, K. Whitehouse, The smart thermostat: using occupancy sensors to save energy in homes, in: Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, SenSys’10, ACM, New York, NY, USA, 2010, pp. 211–224, http://dx.doi.org/10.1145/1869983.1870005. [7] A.J. Collin, G. Tsagarakis, A.E. Kiprakis, S. McLaughlin, Development of low-voltage load models for the residential load sector, IEEE Trans. Power Syst. 29 (5) (2014) 2180–2188, http://dx.doi.org/10.1109/TPWRS.2014. 2301949. [8] I. Richardson, M. Thomson, D. Infield, C. Clifford, Domestic electricity use: a high-resolution energy demand model, Energy Build. 42 (10) (2010) 1878–1887, http://dx.doi.org/10.1016/j.enbuild.2010.05.023. [9] J. Widén, A.M. Nilsson, E. Wäckelgård, A combined Markov-chain and bottom-up approach to modelling of domestic lighting demand, Energy Build. 41 (10) (2009) 1001–1012, http://dx.doi.org/10.1016/j.enbuild.2009.05.002. 
[10] T. Zia, D. Bruckner, A. Zaidi, A hidden Markov model based procedure for identifying household electric loads, IECON 2011 – 37th Annual Conference of the IEEE Industrial Electronics Society (2011) 3218–3223, http://dx.doi.org/ 10.1109/IECON.2011.6119826. [11] A. Zoha, A. Gluhak, M.A. Imran, S. Rajasegarar, Non-intrusive load monitoring approaches for disaggregated energy sensing: a survey, Sensors 12 (12) (2012) 16838–16866, http://dx.doi.org/10.3390/s121216838. [12] L. Wang, X. Luo, W. Zhang, Unsupervised energy disaggregation with factorial hidden Markov models based on generalized backfitting algorithm, 2013 IEEE International Conference of IEEE Region 10 (TENCON 2013) (2013) 1–4, http://dx.doi.org/10.1109/TENCON.2013.6718469. [13] J.Z. Kolter, T. Jaakkola, Approximate inference in additive factorial HMMs with application to energy disaggregation, Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS-12) (2012) 1472-–1482. [14] H.H. Chen, P.F. Wang, C.T. Sung, Y.R. Yeh, Y.J. Lee, Energy disaggregation via clustered regression models: a case study in the convenience store, 2013 Conference on Technologies and Applications of Artificial Intelligence (2013) 37–42, http://dx.doi.org/10.1109/TAAI.2013.21. [15] P. Heracleous, P. Angkititrakul, N. Kitaoka, K. Takeda, Unsupervised energy disaggregation using conditional random fields, in: IEEE PES Innovative Smart Grid Technologies, Europe, 2014, pp. 1–5, http://dx.doi.org/10.1109/ ISGTEurope.2014.7028933. [16] J. Kelly, W. Knottenbelt, Metadata for energy disaggregation, 2014 IEEE 38th International Computer Software and Applications Conference Workshops (2014) 578–583, http://dx.doi.org/10.1109/COMPSACW.2014.97. [17] S. Ghosh, X.A. Sun, X. Zhang, Consumer profiling for demand response programs in smart grids, IEEE PES Innovative Smart Grid Technologies (2012) 1–6, http://dx.doi.org/10.1109/ISGT-Asia.2012.6303309. [18] A. Zoha, A. Gluhak, M. Nati, M.A. 
Imran, Low-power appliance monitoring using Factorial Hidden Markov Models, 2013 IEEE Eighth International

Acknowledgments The author Anatoli Paul Ulmeanu would like to thank Prof. Nikolaos Limnios, head of the department “Mathématiques Appliquées”, Université de Technologie de Compiègne, France, for his hospitality during the period of research.

140

[19]

[20]

[21]

[22]

[23]

[24]

[25]

[26]

[27]

[28]

A.P. Ulmeanu et al. / Energy and Buildings 154 (2017) 127–140 Conference on Intelligent Sensors, Sensor Networks and Information Processing (2013) 527–532, http://dx.doi.org/10.1109/ISSNIP.2013.6529845. O. Ardakanian, N. Koochakzadeh, R.P. Singh, L. Golab, S. Keshav, Computing electricity consumption profiles from household smart meter data, EDBT/ICDT Workshops (2014) 140–146. O. Ardakanian, S. Keshav, C. Rosenberg, Markovian models for home electricity consumption, in: Proceedings of the 2nd ACM SIGCOMM Workshop on Green Networking, GreenNets’11, ACM, New York, NY, USA, 2011, pp. 31–36, http://dx.doi.org/10.1145/2018536.2018544. A. Bondu, A. Dachraoui, Realistic and very fast simulation of individual electricity consumptions, 2015 International Joint Conference on Neural Networks (IJCNN) (2015) 1–8, http://dx.doi.org/10.1109/IJCNN.2015.7280339. X. Ma, H. Li, S. Djouadi, Stochastic modeling of short-term power consumption for smart grid: a state space approach and real measurement demonstration, 2011 45th Annual Conference on Information Sciences and Systems (2011) 1–5, http://dx.doi.org/10.1109/CISS.2011.5766246. Z. Kang, M. Jin, C.J. Spanos, Modeling of end-use energy profile: an appliance-data-driven stochastic approach, IECON 2014 – 40th Annual Conference of the IEEE Industrial Electronics Society (2014) 5382–5388, http://dx.doi.org/10.1109/IECON.2014.7049322. J. Wang, J. Kang, Y. Sun, D. Liu, Load forecasting based on GM – Markov chain model, 2010 Second Pacific-Asia Conference on Circuits, Communications and System, vol. 1 (2010) 156–158, http://dx.doi.org/10.1109/PACCS.2010. 5627058. H. Zhou, W. Wang, W. Niu, X. Xie, Forecast of residential energy consumption market based on grey Markov chain, 2008 IEEE International Conference on Systems, Man and Cybernetics (2008) 1748–1753, http://dx.doi.org/10.1109/ ICSMC.2008.4811541. N. Dongxiao, W. Yanan, L. Jianqing, X. Cong, W. 
Junfang, Analysis of electricity demand forecasting in Inner Mongolia based on gray Markov model, 2010 International Conference on E-Business and E-Government (2010) 5082–5085, http://dx.doi.org/10.1109/ICEE.2010.1275. E. Mocanu, P.H. Nguyen, M. Gibescu, W.L. Kling, Comparison of machine learning methods for estimating energy consumption in buildings, 2014 International Conference on Probabilistic Methods Applied to Power Systems (PMAPS) (2014) 1–6, http://dx.doi.org/10.1109/PMAPS.2014.6960635. E. Dobrescu, D.I. Nastac, E. Pelinescu, Short-term financial forecasting using ANN adaptive predictors in cascade, Int. J. Process Manag. Benchmarking 4 (4) (2014) 376–405, http://dx.doi.org/10.1504/IJPMB.2014.065519.

[29] A. Costea, I. Nastac, Assessing the predictive performance of ANN classifiers based on different data preprocessing methods, Int. J. Intell. Syst. Account. Finance Manag. 13 (4) (2005) 217–250, http://dx.doi.org/10.1002/isaf.269.
[30] K. Berk, Modeling and Forecasting Electricity Demand, BestMasters, Springer Spektrum, 2015, http://dx.doi.org/10.1007/978-3-658-08669-5.
[31] A. Badea, T. Baracu, C. Dinca, D. Tutica, R. Grigore, M. Anastasiu, A life-cycle cost analysis of the passive house “POLITEHNICA” from Bucharest, Energy Build. 80 (2014) 542–555, http://dx.doi.org/10.1016/j.enbuild.2014.04.044.
[32] T. Baracu, V. Tanasiev, T. Mamut, C. Streche, A. Badea, A transient thermal analysis by thermal networks of the passive house “POLITEHNICA” from Bucharest, Int. J. Sustain. Build. Technol. Urban Dev. 4 (2) (2013) 146–159, http://dx.doi.org/10.1080/2093761X.2013.777682.
[33] C. Ionescu, T. Baracu, G.-E. Vlad, H. Necula, A. Badea, The historical evolution of the energy efficient buildings, Renew. Sustain. Energy Rev. 49 (2015) 243–253, http://dx.doi.org/10.1016/j.rser.2015.04.062.
[34] D.S. Tudose, A. Voinescu, M.T. Petrareanu, A. Bucur, D. Loghin, A. Bostan, N. Tapus, Home automation design using 6lowpan wireless sensor networks, 2011 International Conference on Distributed Computing in Sensor Systems and Workshops (DCOSS) (2011) 1–6, http://dx.doi.org/10.1109/DCOSS.2011.5982181.
[35] A.G. Marin, D.S. Tudose, Energy independent wireless sensor network design, 2015 20th International Conference on Control Systems and Computer Science (2015) 267–272, http://dx.doi.org/10.1109/CSCS.2015.94.
[36] V. Tanasiev, H. Necula, G. Darie, A. Badea, Web service-based monitoring system for smart management of the buildings, 2016 International Conference and Exposition on Electrical and Power Engineering (EPE) (2016) 025–030, http://dx.doi.org/10.1109/ICEPE.2016.7781296.
[37] L.R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE 77 (2) (1989) 257–286, http://dx.doi.org/10.1109/5.18626.
[38] Y. Ephraim, N. Merhav, Hidden Markov processes, IEEE Trans. Inf. Theory 48 (6) (2002) 1518–1569, http://dx.doi.org/10.1109/TIT.2002.1003838.
[39] O. Cappé, E. Moulines, T. Rydén, Inference in Hidden Markov Models, Springer Series in Statistics, Springer-Verlag New York, 2005, http://dx.doi.org/10.1007/0-387-28982-8.