Baseline building energy modeling and localized uncertainty quantification using Gaussian mixture models


Energy and Buildings 65 (2013) 438–447


Abhishek Srivastav a,∗, Ashutosh Tewari a, Bing Dong b

a United Technologies Research Center, 411 Silver Lane, East Hartford, CT 06118, United States
b Department of Mechanical Engineering, University of Texas, One UTSA Circle, San Antonio, TX 78249-0670, United States

Article info

Article history: Received 11 March 2013; received in revised form 14 May 2013; accepted 25 May 2013

Keywords: Baseline building energy modeling; Gaussian Mixture Models; Uncertainty quantification; Retrofit analysis

Abstract

Uncertainty analysis of building energy prediction is critical for characterizing the baseline performance of a building when assessing the impact of energy-saving schemes such as fault detection and diagnosis (FDD) systems, advanced control policies and retrofits. This paper presents a novel approach based on Gaussian Mixture Regression (GMR) for modeling building energy use with parameterized and locally adaptive uncertainty quantification. The choice of GMR is motivated by two key advantages: (1) the number of unique operational patterns of a building can be identified in a data-driven manner using an information-theoretic criterion, and (2) confidence bounds on the baseline prediction are localized and their estimation is integrated with the modeling process itself. The proposed GMR approach is applied to two cases: (1) a one-year synthetic data set generated by the Department of Energy (DoE) reference model for a supermarket in the Chicago climate, and (2) one year of field data for a retail store building located in California. The results from the GMR model are compared with some prevalent multivariate regression models for baseline building energy use.

© 2013 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +1 860 610 7580.
0378-7788/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.enbuild.2013.05.037

Nomenclature

$X$                  random variable (r.v.) for regressors
$x$                  instantiation of $X$
$Y$                  random variable for response variable
$y$                  instantiation of $Y$
$\hat{y}$            estimated value of $y$
$Z$                  joint variable for concatenation $[X, Y]$
$z$                  instantiation of $Z$
$g$                  map of regressors to response variable
$\beta$              parameters of $g$
$F(Y, X; \theta)$    joint distribution of $(Y, X)$ with parameters $\theta$
$E(Y|X; \theta)$     expected value of $Y$ given $X$ with parameters $\theta$
$d$                  number of regressors
$\Phi(z; \theta)$    Gaussian mixture probability density with parameters $\theta$
$\pi$                mixing parameter for Gaussian mixture components
$\phi$               Gaussian (normal) probability density
$\mu$                mean value of Gaussian density $\phi$
$\Sigma$             covariance matrix of Gaussian density $\phi$
$\sigma$             standard deviation of Gaussian density $\phi$
$K$                  number of Gaussian mixture components
$N$                  number of data samples
$M$                  number of free parameters of the model
$\alpha$             confidence value $\in [0, 1]$
erf                  error function

Superscripts
$k$                  $k$th component of Gaussian mixture model

Subscripts
$i$                  $i$th data sample
$Y|X$                corresponding to the density $F(Y|X; \theta)$
$X$                  part of vector corresponding to r.v. $X$
$XX$                 part of matrix corresponding to r.v. $X$ in both rows and columns
$Y$                  part of vector corresponding to r.v. $Y$
$YY$                 part of matrix corresponding to r.v. $Y$ in both rows and columns
$XY$                 part of matrix corresponding to r.v. $X$ in rows and r.v. $Y$ in columns
$YX$                 part of matrix corresponding to r.v. $Y$ in rows and r.v. $X$ in columns

1. Introduction

The total energy consumption of US commercial buildings was 17.43 quads in 2003, which amounts to approximately 18% of total U.S. energy consumption [2]. The Department of Energy (DoE), the International Energy Agency (IEA), the Intergovernmental Panel on Climate Change (IPCC) and other agencies have declared a need for commercial buildings to become 70–80% more energy efficient. Although energy-efficient building technologies are emerging, a key challenge is how to effectively maintain building energy performance over the life-cycle of the building. It is well known that most buildings lose most of their designed energy efficiency shortly after they are commissioned or re-commissioned. As a result, achieving persistent low-energy performance is critical for realizing energy, environmental, and economic goals.

Before any advanced control, fault detection and diagnostics (FDD) or retrofit technology can be applied to improve energy efficiency, a high-fidelity baseline energy performance model is often needed to help understand building operation. For instance, a baseline model developed using pre-retrofit data can be used to estimate energy use in the post-retrofit conditions. Thereafter, the difference between the estimated and the actual energy use can be attributed to the retrofit changes.

There are many approaches to baseline energy modeling. Fels et al. [6] utilized the variable-base degree-day method to estimate residential retrofitting energy use. Kissock [9] developed a regression methodology to measure retrofitting energy use in commercial buildings. Krarti et al. [10] utilized neural networks to estimate energy and demand savings from retrofits of commercial buildings. Dhar et al. [4] generalized the Fourier series approach to model hourly energy use in commercial buildings. In addition, in most practical cases, utility bill data are used because they are widely available and inexpensive to obtain and process [12]. Reddy et al. [12] presented a formal baselining methodology at the whole-building level based on monthly utility bills, taking outdoor dry-bulb temperature as the only model regressor. The International Performance Measurement and Verification Protocol (IPMVP) [5,1] also provides a rigorous approach to developing baseline models for estimating energy savings after retrofits.

As is apparent from the above examples, building energy use models are becoming increasingly complex. However, total building energy usage as a function of external conditions such as dry-bulb temperature and humidity has been adequately captured by linear or piecewise linear data-driven models. This is the primary reason behind the practical success of the univariate change-point method [12] and multivariate linear models [17,8]. In a change-point model, the building performance is partitioned into different operating conditions and a linear model is fit to each of the operating modes [12]. However, identification of these distinct operating modes is based on domain knowledge and building control operation. Non-linear models such as Artificial Neural Networks (ANN) [19,16] have become commonplace in the data-driven building energy modeling literature to model non-linear transitions between linear regimes using a single overall model. Increasingly complex models have been proposed that capture this non-linear behavior, but this comes at the price of computational complexity and a lack of physical understanding of the model itself.

Another shortcoming of prevalent modeling approaches for building energy is their limited ability to quantify uncertainty in predictions. Usually, error estimates are made for the overall model at the global level. This approach not only leads to a conservative estimate of modeling errors, but also smears the uncertainty across the data range. Therefore, locally a model may be much less (or much more) accurate than the global error band estimate suggests. In essence, this amounts to assuming an identical distribution for the prediction error that is independent of the input conditions, which is inconsistent with most real scenarios. A localized quantification of uncertainties that captures the dependence of error on external conditions is necessary to accurately compute the impact of retrofits or new control and operation strategies.

Recently, Subbarrao et al. [14] have correctly pointed out the risks of using global confidence intervals for model predictions and proposed a nearest-neighborhood method to compute local uncertainty estimates. For a given input condition, a set of similar conditions is selected using a distance metric and a chosen cut-off radius; error statistics are then computed based only on this local set of information. This model-less approach to local uncertainty quantification might be inaccurate and unstable due to the sparse population of data in local neighborhoods. Also, the degree of similarity of a set of k nearest neighbors for an input condition may not be comparable across different input conditions, leading to inconsistent error statistics. While increasing the neighborhood size might resolve the stability and accuracy issue, a larger local neighborhood defeats the purpose of localized uncertainty quantification unless local neighborhoods are sufficiently dense in data. Heuristics have been proposed in [14]; however, there is no formal and rigorous way to balance this trade-off and select an optimal neighborhood size.

The problem of confidence interval (CI) estimation in local neighborhoods with poor data density can be tackled using a model-based approach, so as to limit the number of parameters required to be learned. The choice of the model depends on the expected characteristics of the data set and the desired learning and run-time complexity. For instance, in the case of predictive models for buildings, the conditional distribution of model errors is assumed to have a certain parametric form, where the parameters may or may not be functions of the given condition. For example, based on the expected qualitative behavior of the prediction errors, a suitable model (e.g. a Gaussian or a log-normal distribution) can be imposed on the model errors and parameter estimation can be done in the maximum likelihood sense. This is a good approach when the biases incurred by making a model choice are expected to be outweighed by the accuracy gains in error quantification. Using a model-based approach, Heo and Zavala [7] have recently presented a Gaussian Processes (GP) based approach for building energy prediction and localized uncertainty quantification. With a suitable choice of kernel for the data covariance matrix in a GP, non-linear response surfaces can be modeled with few parameters. However, both the learning and run-time complexity of GP-based models scale cubically with the number of training samples used, which can quickly become impractical as the data set grows.

In summary, the prevalent practical techniques in building energy prediction suffer from the following drawbacks: (1) high complexity, in an attempt to create a single model for the overall system, and (2) global uncertainty quantification, leading to conservative and unrealistic error estimates. We propose a regression approach based on Gaussian Mixture Models for building energy prediction and uncertainty quantification that has the following advantages:

1. Integrated response surface modeling and local uncertainty quantification: The conditional probability density of prediction errors is used for the estimation of both the response variable and the associated uncertainty; model parameters are learned from the entire data in the maximum likelihood sense; and a secondary process of localized or global confidence estimation does not need to be performed.
2. Low impact of correlated regressors: Correlated regressors, such as dry-bulb temperature and humidity, lead to ill-conditioned models that result in model parameter estimates that are highly sensitive to noise and changes in the data. Models that do not take into account the dependence between explanatory variables might be less generalizable, and the significance of regressors can be grossly erroneous.
3. Less sensitivity to data sparsity: Being a parametric approach to modeling building performance and quantifying uncertainty, the GMR approach is less sensitive to low data density.
4. Formal model structure selection: The number of modes in the Gaussian Mixture Regression is identified using an information criterion (the Bayesian Information Criterion) that is counter-weighted by the model complexity to choose the optimal number of modes; modes of the GMR model correspond to unique operational patterns of the building under consideration.

2. Technical approach

In regression analysis, we seek a functional map from a set of inputs (regressors), $X$, to a set of outputs (response variables), $Y$. For a multiple-input–single-output system, such as a building's baseline energy model, the output is a scalar function of the inputs, $y = g(x; \beta)$, parameterized by $\beta$. If the functional form of $g$ is known, the parameters $\beta$ can be obtained from observed data by solving an optimization problem. The complexity of parameter estimation depends on the choice of the mapping function. In this paper, we advocate solving the larger problem of estimating the joint probability distribution of the response variable and the regressors. Once an accurate joint distribution, $F(Y, X; \theta)$, is learnt from data, the conditional mean value, $E(Y|X; \theta)$, can be used to map the regressors to the response variable. The conditional mean value can be either linear or non-linear in the parameters $\theta$, depending on the parametric form of the joint probability distribution $F$. If $F(Y, X)$ is a unimodal Gaussian distribution, the conditional mean, $E(Y|X; \theta)$, remains linear in $\theta$ and coincides with the linear regression estimate. However, for real-world scenarios, where the data are typically generated under different operating conditions, a unimodal Gaussian assumption can be quite restrictive.
As argued in Section 1, a piecewise linear model has been shown to be an adequate and robust model for building energy performance predicted based on measured external conditions. Therefore, we propose using a mixture of Gaussian distribution functions to represent the joint probability distribution $F(Y, X)$, where each mode (or mixture component) will represent a locally linear regime of building performance.

2.1. Gaussian Mixture Model (GMM)

Let $(Y, X)$ be represented as a multivariate random variable $Z$ for brevity. Let $z = [y, x_1, \ldots, x_d]^T$, an instantiation of $Z$, be a column vector of length $(d + 1)$, where $d$ is the number of regressors predicting a scalar value $y$. A Gaussian mixture density $\Phi(z; \theta)$, with $K$ components, that describes the probability distribution of $Z$ with parameters $\theta$ can be written as follows:

$$\Phi(z; \theta) = \sum_{k=1}^{K} \pi^k \, \phi^k(z; \mu^k, \Sigma^k) \tag{1}$$

where each Gaussian component $\phi^k$ is parameterized by the mean vector $\mu^k$, of the same length as $z$, and a $(d + 1) \times (d + 1)$ positive definite covariance matrix $\Sigma^k$. The expression of a Gaussian density function is given in Eq. (2).

$$\phi^k(z; \mu^k, \Sigma^k) = \frac{1}{(2\pi)^{(d+1)/2} \, |\Sigma^k|^{1/2}} \exp\left( -\frac{1}{2} (z - \mu^k)^T [\Sigma^k]^{-1} (z - \mu^k) \right) \tag{2}$$

The scalar $\pi^k$ is the non-negative mixing proportion of the $k$th component such that $\sum_{k=1}^{K} \pi^k = 1$. Thus, the parameter set $\theta$ of a GMM comprises $\{\pi^k, \mu^k, \Sigma^k\}$, with $1 \le k \le K$. The estimation of GMM parameters is done in the maximum likelihood setting. The goal in maximum likelihood estimation is to obtain parameter values that maximize the likelihood of observing a given data set. Given $N$ i.i.d. samples $\{z^{(i)}\}_1^N$ in the training data set, the logarithm of the likelihood function is defined as

$$\mathcal{L}(\theta \,|\, \{z^{(i)}\}_1^N) = \sum_{i=1}^{N} \log\left( \Phi(z^{(i)}; \theta) \right) \tag{3}$$

Eq. (3), when maximized under the aforementioned constraints on the covariance matrices ($\Sigma^k$) and the mixing proportions ($\pi^k$), yields the desired solution. This optimization is typically carried out using the Expectation-Maximization (EM) algorithm, which is widely studied and has numerous commercial software implementations available. For this work, we used MATLAB's Statistics Toolbox to learn the Gaussian Mixture Models. Good reviews of the EM algorithm and its variants can be found in [11,18].

2.2. Model selection

A key task in the GMM approach to regression is the identification of the optimal number of components ($K$) to be included in the model. To this end, we propose using the Bayesian Information Criterion (BIC), given in Eq. (4), as a metric for selecting the number of components. In a rigorous study [13], BIC was shown to outperform other methods such as DIC (Deviance Information Criterion), ICL (Integrated Completed Likelihood) and AIC (Akaike Information Criterion) for GMMs in a wide array of application domains.

$$\mathrm{BIC} = -\mathcal{L}(\theta \,|\, \{z^{(i)}\}_1^N) + \frac{M}{2} \log(N) \tag{4}$$

In Eq. (4), $M$ represents the total number of free parameters, which for a GMM with $K$ components can be obtained as

$$M = \underbrace{K(d + 1)}_{\text{mean}} + \underbrace{\tfrac{1}{2} K(d + 1)(d + 2)}_{\text{covariance}} + \underbrace{(K - 1)}_{\text{mixing}} \tag{5}$$

The first term in Eq. (4) is the negative logarithm of the observed data likelihood, the expression for which is given by Eq. (3). The value of $K$ that minimizes BIC is chosen as the number of components for the Gaussian Mixture Model.

2.3. Gaussian Mixture Model for regression

In this section, we describe a GMM-based approach to estimate the response variable (and the estimation error) given observed values of the regressors. The approach is highly efficient, as it involves only the evaluation of a set of algebraic equations. A more detailed treatment of the use of GMMs for regression analysis can be found elsewhere [15]. As mentioned earlier, we are interested in estimating the conditional mean, $E(Y|X; \theta)$, of the response variable $Y$ given the regressors. For a GMM, the conditional density of the response variable is given by Eq. (6).

$$\Phi(y; \theta(x)) = \sum_{k=1}^{K} \pi^k(x) \times \frac{1}{\sqrt{2\pi} \, \sigma^k_{Y|X}(x)} \exp\left( -\frac{1}{2} \left( \frac{y - \mu^k_{Y|X}(x)}{\sigma^k_{Y|X}(x)} \right)^2 \right) \tag{6}$$
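The model-selection rule of Section 2.2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation: the function names are hypothetical, and the log-likelihood values in the usage example are made-up placeholders standing in for the output of an EM fit, not results from the paper.

```python
import math

def gmm_free_parameters(K, d):
    """Free-parameter count M of a K-component GMM over z = [y, x_1..x_d], Eq. (5)."""
    dim = d + 1
    n_mean = K * dim                      # one mean vector per component
    n_cov = K * dim * (dim + 1) // 2      # symmetric (d+1)x(d+1) matrix per component
    n_mix = K - 1                         # mixing proportions sum to one
    return n_mean + n_cov + n_mix

def bic(log_likelihood, K, d, N):
    """BIC in the form of Eq. (4): negative log-likelihood plus (M/2) log N."""
    M = gmm_free_parameters(K, d)
    return -log_likelihood + 0.5 * M * math.log(N)

def select_num_components(log_likelihoods, d, N):
    """Pick the K with the smallest BIC; log_likelihoods maps K -> fitted log-likelihood."""
    return min(log_likelihoods, key=lambda K: bic(log_likelihoods[K], K, d, N))

# Placeholder log-likelihoods (illustrative only), d = 3 regressors, N = 181 samples.
example_loglik = {1: -1500.0, 2: -1380.0, 3: -1340.0, 4: -1290.0, 5: -1285.0}
best_K = select_num_components(example_loglik, d=3, N=181)
```

Note that some libraries define BIC as twice the expression in Eq. (4); the scaling does not change which $K$ attains the minimum.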

As evident, Eq. (6) also represents a univariate GMM parameterized by $\theta(x) = \{\pi^k(x), \mu^k_{Y|X}(x), \sigma^k_{Y|X}(x)\}$. The argument $x$ in $\theta(x)$ signifies that these parameters depend on the given value $x$ of regressors $X$. It is easy to show that the conditional mean and variance of the random variable $Y$ (with respect to the probability density in Eq. (6)) can be written as shown in Eqs. (7) and (8), respectively.

$$\hat{y}(x) = E(Y | X = x) = \sum_{k=1}^{K} \pi^k(x) \, \mu^k_{Y|X}(x) \tag{7}$$

$$\mathrm{Var}(Y) = \sum_{k=1}^{K} \pi^k(x) \left( \sigma^k_{Y|X}(x)^2 + \mu^k_{Y|X}(x)^2 \right) - \left( \sum_{k=1}^{K} \pi^k(x) \, \mu^k_{Y|X}(x) \right)^2 \tag{8}$$
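The mean and variance of a univariate Gaussian mixture, as in Eqs. (7) and (8), can be checked numerically with a short sketch. The function name and the two-component example values below are illustrative assumptions; the inputs stand for the conditional weights, means and standard deviations at one fixed $x$.

```python
def mixture_mean_and_variance(weights, means, sigmas):
    """Mean (Eq. (7)) and variance (Eq. (8)) of a univariate Gaussian mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    # Second moment of the mixture: weighted sum of (variance + squared mean).
    second_moment = sum(w * (s ** 2 + m ** 2)
                        for w, m, s in zip(weights, means, sigmas))
    return mean, second_moment - mean ** 2

# Two equally weighted unit-variance components centered at -1 and +1.
mix_mean, mix_var = mixture_mean_and_variance([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

The variance exceeds that of either component whenever the component means differ, which is what lets the mixture widen its uncertainty in regions where operating modes overlap.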

For an observed set of values $x$ of the regressors $X$, we first obtain the parameters, $\theta(x)$, of the conditional GMM. Thereafter, Eqs. (7) and (8) are used to compute the point estimate and the variance of the response variable, respectively. This process is repeated every time a new set of regressor values is observed. The parameters, $\theta(x)$, of the univariate conditional GMM can be retrieved directly from the parameters, $\theta$, of the multivariate GMM. Before providing the analytical expressions for $\theta(x)$, we introduce some more terms. Based on the regressors and the response variable, the mean vector and the covariance matrix of the $k$th Gaussian component can be partitioned into subcomponents as shown in Eqs. (9) and (10). Using this decomposition, we can define the marginal Gaussian densities of the $k$th component for the response variable and the regressors as $\phi^k_Y(y; \mu^k_Y, \Sigma^k_{YY})$ and $\phi^k_X(x; \mu^k_X, \Sigma^k_{XX})$, respectively.

$$\mu^k = \begin{bmatrix} \mu^k_Y \\ \mu^k_X \end{bmatrix} \quad \text{with sizes} \quad \begin{bmatrix} 1 \times 1 \\ d \times 1 \end{bmatrix} \tag{9}$$

$$\Sigma^k = \begin{bmatrix} \Sigma^k_{YY} & \Sigma^k_{YX} \\ \Sigma^k_{XY} & \Sigma^k_{XX} \end{bmatrix} \quad \text{with sizes} \quad \begin{bmatrix} 1 \times 1 & 1 \times d \\ d \times 1 & d \times d \end{bmatrix} \tag{10}$$

Using the notation above, the equations to obtain $\theta(x) = \{\pi^k(x), \mu^k_{Y|X}(x), \sigma^k_{Y|X}(x)\}$ for an observed value of $x$ are given in Eqs. (11)–(13).

$$\pi^k(x) = \frac{\pi^k \times \phi^k_X(x; \mu^k_X, \Sigma^k_{XX})}{\sum_{i=1}^{K} \pi^i \times \phi^i_X(x; \mu^i_X, \Sigma^i_{XX})} \tag{11}$$

$$\mu^k_{Y|X}(x) = \mu^k_Y + \Sigma^k_{YX} (\Sigma^k_{XX})^{-1} (x - \mu^k_X) \tag{12}$$

$$\left( \sigma^k_{Y|X}(x) \right)^2 = \Sigma^k_{YY} - \Sigma^k_{YX} (\Sigma^k_{XX})^{-1} \Sigma^k_{XY} \tag{13}$$

2.4. GMR confidence intervals

Here we provide equations to compute the confidence intervals (CI) that quantify the uncertainty in the GMR estimates of the response variable. Eq. (14) gives the cumulative distribution function of the response variable, obtained by integrating the conditional density given in Eq. (6) from $-\infty$ to $y$.

$$\Psi(y; \theta(x)) = \sum_{k=1}^{K} \frac{\pi^k(x)}{2} \left[ 1 + \mathrm{erf}\left( \frac{y - \mu^k_{Y|X}(x)}{\sigma^k_{Y|X}(x) \sqrt{2}} \right) \right] \tag{14}$$

For a specified significance level, $\alpha$, the lower and the upper limits of the $100(1 - \alpha)\%$ CI can be obtained by solving Eqs. (15) and (16), respectively, with respect to $y$. Since the error function, erf(), has a continuous derivative, we can use the Newton–Raphson method to solve these equations efficiently in an iterative fashion. The error bars reported in the experimental section are obtained using this approach.

$$\frac{\alpha}{2} = \sum_{k=1}^{K} \frac{\pi^k(x)}{2} \left[ 1 + \mathrm{erf}\left( \frac{y - \mu^k_{Y|X}(x)}{\sigma^k_{Y|X}(x) \sqrt{2}} \right) \right] \tag{15}$$

$$1 - \frac{\alpha}{2} = \sum_{k=1}^{K} \frac{\pi^k(x)}{2} \left[ 1 + \mathrm{erf}\left( \frac{y - \mu^k_{Y|X}(x)}{\sigma^k_{Y|X}(x) \sqrt{2}} \right) \right] \tag{16}$$

In summary, first a Gaussian Mixture Model $\sum_k \pi^k \phi^k(z; \mu^k, \Sigma^k)$ for the response variable $Y$ and the regressors $X$ is learned from the data using the EM algorithm. Then the parameters $\mu^k_Y$, $\mu^k_X$, $\Sigma^k_{YY}$, $\Sigma^k_{XX}$ and $\Sigma^k_{YX}$ for GMM component $k$ are obtained from $\mu^k$ and $\Sigma^k$ using Eqs. (9) and (10). For a given value of $x$, the conditional means $\mu^k_{Y|X}(x)$ and variances $\sigma^k_{Y|X}(x)^2$ for each component $k$ of the GMM are computed using Eqs. (12) and (13), respectively, and the mixing weights of the GMM components, $\pi^k(x)$, are computed using Eq. (11). The expected value of the response variable, $\hat{y}(x)$, is computed using Eq. (7). The local confidence intervals are obtained using Eqs. (15) and (16). The steps involved in the Gaussian Mixture Regression approach are summarized as pseudo-code in Algorithm 2.1.

Algorithm 2.1. Gaussian Mixture Regression $(X, \{\pi^k, \mu^k, \Sigma^k\}_{k=1}^{K})$

Let
    idY = index of the response variable
    idX = indices of the regressors
Obtain (Eqs. (9) and (10))
    $\mu^k_Y \leftarrow \mu^k(\mathrm{idY})$
    $\mu^k_X \leftarrow \mu^k(\mathrm{idX})$
    $\Sigma^k_{YY} \leftarrow \Sigma^k(\mathrm{idY} : \mathrm{idY})$
    $\Sigma^k_{XX} \leftarrow \Sigma^k(\mathrm{idX} : \mathrm{idX})$
    $\Sigma^k_{YX} \leftarrow \Sigma^k(\mathrm{idY} : \mathrm{idX})$, with $\Sigma^k_{XY} = (\Sigma^k_{YX})^T$
Estimate the response variable along with the $100(1 - \alpha)\%$ CI
for $i \leftarrow 1$ to $N$ do
    $x = X^{(i)}$
    $\pi^k(x) = \pi^k \times \phi^k_X(x; \mu^k_X, \Sigma^k_{XX}) \,/\, \sum_{i=1}^{K} \pi^i \times \phi^i_X(x; \mu^i_X, \Sigma^i_{XX})$
    $\mu^k_{Y|X}(x) = \mu^k_Y + \Sigma^k_{YX} (\Sigma^k_{XX})^{-1} (x - \mu^k_X)$
    $\sigma^k_{Y|X}(x) = \left( \Sigma^k_{YY} - \Sigma^k_{YX} (\Sigma^k_{XX})^{-1} \Sigma^k_{XY} \right)^{1/2}$
    $\hat{y}(x) = \sum_{k=1}^{K} \pi^k(x) \, \mu^k_{Y|X}(x)$
    $\hat{y}_{LL}(x) = \arg_y \left\{ \frac{\alpha}{2} - \sum_{k=1}^{K} \frac{\pi^k(x)}{2} \left[ 1 + \mathrm{erf}\left( \frac{y - \mu^k_{Y|X}(x)}{\sigma^k_{Y|X}(x) \sqrt{2}} \right) \right] = 0 \right\}$
    $\hat{y}_{UL}(x) = \arg_y \left\{ 1 - \frac{\alpha}{2} - \sum_{k=1}^{K} \frac{\pi^k(x)}{2} \left[ 1 + \mathrm{erf}\left( \frac{y - \mu^k_{Y|X}(x)}{\sigma^k_{Y|X}(x) \sqrt{2}} \right) \right] = 0 \right\}$
return $\hat{y}, \hat{y}_{LL}, \hat{y}_{UL}$
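The regression and confidence-interval steps of Section 2.4 can be sketched in plain Python for the special case of a single regressor ($d = 1$), where every covariance block is a scalar and no matrix inverse is needed. This is an illustrative sketch, not the authors' code: the dictionary keys, function names and example component values are assumptions, and the interval limits are found by bisection on the mixture CDF rather than the Newton–Raphson iteration mentioned in the text, since bisection needs no starting guess.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def conditional_params(x, components):
    """Conditional weights, means and std devs (Eqs. (11)-(13)) for scalar x.

    Each component is a dict with keys pi, mu_y, mu_x, s_yy, s_yx, s_xx.
    Assumes x lies where at least one component has non-negligible density.
    """
    unnorm = [c["pi"] * normal_pdf(x, c["mu_x"], c["s_xx"]) for c in components]
    total = sum(unnorm)
    weights = [u / total for u in unnorm]
    means = [c["mu_y"] + c["s_yx"] / c["s_xx"] * (x - c["mu_x"]) for c in components]
    sigmas = [math.sqrt(c["s_yy"] - c["s_yx"] ** 2 / c["s_xx"]) for c in components]
    return weights, means, sigmas

def conditional_cdf(y, weights, means, sigmas):
    """Mixture CDF of Eq. (14)."""
    return sum(0.5 * w * (1.0 + math.erf((y - m) / (s * math.sqrt(2.0))))
               for w, m, s in zip(weights, means, sigmas))

def quantile(p, weights, means, sigmas):
    """Solve CDF(y) = p by bisection (swapped in for Newton-Raphson)."""
    lo = min(m - 10.0 * s for m, s in zip(means, sigmas))
    hi = max(m + 10.0 * s for m, s in zip(means, sigmas))
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if conditional_cdf(mid, weights, means, sigmas) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gmr_predict(x, components, alpha=0.05):
    """Point estimate (Eq. (7)) with lower/upper limits (Eqs. (15) and (16))."""
    weights, means, sigmas = conditional_params(x, components)
    y_hat = sum(w * m for w, m in zip(weights, means))
    y_ll = quantile(alpha / 2.0, weights, means, sigmas)
    y_ul = quantile(1.0 - alpha / 2.0, weights, means, sigmas)
    return y_hat, y_ll, y_ul

# Hypothetical two-mode model: a flat low-load regime and a steeper high-load regime.
example_components = [
    {"pi": 0.6, "mu_y": 100.0, "mu_x": 5.0, "s_yy": 25.0, "s_yx": 2.0, "s_xx": 16.0},
    {"pi": 0.4, "mu_y": 900.0, "mu_x": 22.0, "s_yy": 9000.0, "s_yx": 300.0, "s_xx": 20.0},
]
prediction = gmr_predict(18.0, example_components)
```

With more than one regressor, the scalar divisions by `s_xx` become solves against the $d \times d$ block $\Sigma^k_{XX}$, and `normal_pdf` becomes the multivariate density of Eq. (2) evaluated on the regressor block.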

3. Experiment and results

The proposed methodology for modeling building energy use and localized confidence estimation was applied to two data sets: (1) one year of simulation data generated using the DoE reference model for a supermarket in the Chicago climate and (2) one year of field data obtained from a retail store building. For each case, GMR modeling results are compared with a multivariate linear regression (MLR) model for prediction accuracy and ability to quantify uncertainty. In addition, GMR-based localized confidence estimation is compared with the model-less approach [14] applied to the MLR model.

3.1. Simulated data

The simulated data used in this paper are generated from the DoE supermarket reference model for ASHRAE 90.1-2004 [3]. It is simulated as a single-story, six-zone building with a total area of 4181 m². Building envelope thermal properties vary with the climate according to ASHRAE 90.1-2004. In this study, Chicago was selected as the climate location, representative of a cold ASHRAE climate zone. Simulations were carried out for cooling electrical energy consumption only. The weather information is from TMY3 data; the yearly weather profile for Chicago is shown in Fig. 1.

Fig. 1. Yearly temperature and humidity ratio profile for Chicago area. (Panels: OAT in °C and humidity ratio, each with its mean trend.)

For the simulated study, a retrofit energy savings analysis was also performed. A retrofit was simulated to occur mid-year on July 1st, where the window glazing type and internal equipment were retrofitted to the ASHRAE 90.1-2010 standard. The window glazing type therefore changed from the ASHRAE 90.1-2004 type to the ASHRAE 90.1-2010 type: the U-value of the window changed from 0.57 to 0.48 and the SHGC from 0.49 to 0.4. The internal equipment values were also decreased. It must be noted that the window areas are kept the same between the pre- and post-retrofit periods. This set-up therefore generated six months' worth each of pre- and post-retrofit data from simulations. The baseline model developed on pre-retrofit data was used to quantify the energy savings, and the confidence bound around the energy savings, achieved due to the retrofit effort. Energy savings using both the MLR and GMR based baseline models were computed and compared.
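The savings calculation described above can be sketched as follows, taking daily savings as the baseline prediction minus the measured post-retrofit use. The function name and the example numbers are hypothetical, and summing the per-day interval limits to bound the total is a simplifying assumption made here for illustration (it ignores any cancellation of errors across days); the text does not specify how the total bounds are aggregated.

```python
def savings_with_bounds(y_hat, y_ll, y_ul, y_measured):
    """Daily and total savings with bounds from baseline confidence limits.

    y_hat, y_ll, y_ul: baseline prediction and its lower/upper limits per day.
    y_measured: measured post-retrofit energy use per day.
    """
    daily = [p - m for p, m in zip(y_hat, y_measured)]
    daily_lower = [lo - m for lo, m in zip(y_ll, y_measured)]
    daily_upper = [up - m for up, m in zip(y_ul, y_measured)]
    return {
        "daily": daily,
        "total": sum(daily),
        # Assumption: total bounds are the sums of the daily bounds.
        "total_lower": sum(daily_lower),
        "total_upper": sum(daily_upper),
    }

# Made-up two-day example (units arbitrary).
result = savings_with_bounds(
    y_hat=[10.0, 12.0], y_ll=[8.0, 11.0], y_ul=[12.0, 13.0], y_measured=[9.0, 10.0]
)
```

A total interval that includes zero, as in this toy example, would mean the savings cannot be distinguished from baseline model uncertainty.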

3.1.1. GMR modeling

Outside dry-bulb air temperature (OAT), outside air humidity ratio (OAHR) and direct solar radiation were selected as the regressors, while cooling electrical energy consumption was chosen as the response variable. The cumulative daily values of energy use and the mean daily values of the three regressors were used to create both the MLR and GMR baseline models for predicting the daily energy use.

To ensure a consistent comparison between Gaussian mixture and multivariate linear regression, the model selection for both cases was based on the Bayesian Information Criterion. Specifically, the choice of the number of components for the GMM was based on the discussion presented in Section 2.2. For the MLR model, the term $M$ in the BIC equation (4) corresponds to the number of coefficients in the model, and the log-likelihood of the observed data was computed as follows:

$$\mathcal{L}_{POLY}(\theta \,|\, \{z^{(i)}\}_1^N) = \sum_{i=1}^{N} \log\left( \mathcal{N}(y_i; \theta^i) \right) \tag{17}$$

where $\mathcal{N}$ represents a univariate normal distribution with parameters $\theta^i = \{\mu^i, \sigma^2\}$, consisting of the mean and the variance. The mean is simply the response value estimated by the MLR model and hence is indexed with a superscript $i$. The variance is computed globally by Eq. (18), and is identical for all data samples.

$$\sigma^2 = \frac{1}{N - M + 1} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 \tag{18}$$

The plot of BIC versus model order is shown in Fig. 2. Model order for the MLR model is the degree of the polynomial used, while for the GMR approach it is the number of components in the mixture model. It can be seen that for the MLR model BIC is minimized by a 3rd degree polynomial, while for the GMR model the minimum BIC occurs for a 4-mode mixture model. For GMR, the 95% confidence interval around the estimated response variable is obtained as discussed in Section 2.4. For the MLR model, since the estimation error is assumed to be i.i.d. with a zero-mean normal distribution, the 95% CI consists of the range $\pm 1.95 \sqrt{\sigma^2}$ around the estimated value $\hat{y}_i$; the error variance $\sigma^2$ is computed using Eq. (18). The method proposed by Subbarrao et al. [14] to obtain the local uncertainty bounds was also implemented for the MLR model and compared with localized CI estimates from the GMR model.

Fig. 2. Model selection for MLR (3rd order) and Gaussian Mixture Regression (4-component mixture). (Panels: (a) GMR, BIC versus number of components; (b) polynomial regression, BIC versus polynomial degree.)

Fig. 3. Multivariate linear regression (MLR) baseline modeling for 3 different regressor sets (a) outside-air temperature (OAT), (b) OAT and outside-air humidity ratio (OAHR) and (c) OAT, OAHR & solar radiation. (For interpretation of the references to color in text, the reader is referred to the web version of the article.)

Table 1 Overall prediction errors for MLR and GMR models for three different regressor sets.

3.1.2. Results

For the MLR model, Fig. 3 shows the measured (red squares) and predicted (blue stars) values of daily energy use against average outside-air temperature (OAT) for three different regressor sets: (a) OAT, (b) OAT and OAHR, and (c) OAT, OAHR and solar radiation. Global confidence bounds computed using 1.954 2 are shown as vertical error bars. The overall prediction error decreases as more explanatory regressors are included. Also, as argued earlier, the 95% confidence bounds are overconservative for low energy use values and too small when energy use is higher. Therefore, while 95% of the observed data does fall within the confidence bounds globally, the uncertainty is smeared across the data range, making the local confidence bounds inconsistent with observations.

Fig. 4 shows a similar plot for the GMR model, with daily measured (red squares) and predicted (blue stars) energy use plotted against daily average OAT for the same three regressor sets. While the trend of lower overall prediction error with more regressors is the same as for the MLR model, the CI is locally adaptive: the error bars around areas of good prediction are significantly tighter than those around areas of low prediction accuracy. The overall R² values for the three regressor sets for both the MLR and GMR models are given in Table 1.

Table 1. R² for the MLR and GMR models, by regressor set.
Regressors    OAT       OAT & OAHR    OAT, OAHR & solar radiation
              0.9467    0.9493        0.9505
              0.8941    0.9213        0.9357

As mentioned in Section 2, the GMR approach solves the larger problem of estimating the joint probability density F(Y, X) from the training data in the maximum-likelihood setting. Therefore the tasks of uncertainty estimation and model prediction, the latter as the conditional mean value E(Y|X), are integrated into one modeling task. Fig. 5 shows contour plots of the joint probability density of the data estimated by the GMR modeling process. Since the GMR model is 4-dimensional, with three regressors and one response variable, 2D contour plots are shown for the six unique pairs of variables. The learned GMM adequately captures the distribution of the training data. Moreover, the correlation among regressors is naturally captured by the GMM; this is evident from the GMM density contours over the scatter plot of OAHR vs. OAT in Fig. 5.

Figs. 6 and 7 show the measured (red squares) and estimated (blue line) daily energy use over one year for the MLR and GMR models, respectively. Data from the first half of the year (January–June, marked) was used to train both models, while predictions were made for the entire data set (January–December). Not only does the GMR model predict significantly more accurately, it also has locally adaptive confidence intervals; the confidence intervals are shown as blue bands around the data for both models.

Uncertainty quantification is necessary to find confidence bounds around the energy savings estimated using the baseline model. The energy savings e_i = ŷ_i − y_i, calculated by comparing the MLR baseline model with the measured data, and the uncertainty around them are shown in Fig. 6. The energy savings show a sharp increase right after the retrofit in July. Both global and localized confidence intervals are shown. Even though the localized confidence intervals, computed using

[Fig. 4 panels: (a) Regressor: OAT; (b) Regressors: OAT, OAHR; (c) Regressors: OAT, OAHR, Solar. x-axis: OAT (°C); y-axis: cooling electric energy (MJ). Legend: predicted w/ 95% CI, measured.]

Fig. 4. GMR based baseline modeling for 3 different regressor sets (a) outside-air temperature (OAT), (b) OAT and outside-air humidity ratio (OAHR) and (c) OAT, OAHR & solar radiation. (For interpretation of the references to color in text, the reader is referred to the web version of the article.)
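The locally adaptive bounds in Fig. 4 come from conditioning the fitted mixture on the regressors. As a minimal sketch, assuming a single scalar regressor and a mixture whose parameters are already fitted, the conditional mean and variance of the response could be computed as follows. The function name `gmr_predict`, the two-component parameters and the 1.96 normal quantile are illustrative assumptions, not values taken from the paper.

```python
import math

def gmr_predict(x, weights, means, covs):
    """Conditional mean and variance of Y given X = x for a 2-D Gaussian mixture.

    weights[k] : mixing weight of component k
    means[k]   : (mu_x, mu_y)
    covs[k]    : (s_xx, s_xy, s_yy), entries of the 2x2 covariance
    """
    resp, cond_means, cond_vars = [], [], []
    for pi_k, (mu_x, mu_y), (s_xx, s_xy, s_yy) in zip(weights, means, covs):
        # Weighted marginal density of x under component k
        dens = math.exp(-0.5 * (x - mu_x) ** 2 / s_xx) / math.sqrt(2 * math.pi * s_xx)
        resp.append(pi_k * dens)
        # Gaussian conditioning within component k
        cond_means.append(mu_y + s_xy / s_xx * (x - mu_x))
        cond_vars.append(s_yy - s_xy ** 2 / s_xx)
    total = sum(resp)  # assumes x is close enough to some component that this is > 0
    resp = [r / total for r in resp]
    mean = sum(r * m for r, m in zip(resp, cond_means))
    second_moment = sum(r * (v + m * m) for r, m, v in zip(resp, cond_means, cond_vars))
    return mean, second_moment - mean * mean

# Hypothetical two-regime example: (OAT in degC, daily energy in MJ)
WEIGHTS = [0.5, 0.5]
MEANS = [(10.0, 500.0), (25.0, 1500.0)]
COVS = [(9.0, 30.0, 400.0), (9.0, 150.0, 10000.0)]

for oat in (10.0, 25.0):
    mean, var = gmr_predict(oat, WEIGHTS, MEANS, COVS)
    half_width = 1.96 * math.sqrt(var)  # normal-approximation 95% half-width (assumption)
    print(f"OAT={oat:.0f}: predicted {mean:.1f} MJ, +/- {half_width:.1f} MJ")
```

Because the component responsibilities depend on x, the variance changes with the operating regime, which is why the resulting bounds are narrow where the data are tight and wide where they are scattered.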


A. Srivastav et al. / Energy and Buildings 65 (2013) 438–447

Fig. 5. Contours of the marginal distributions of the model variables, taken two at a time, with scatter plots of the data (red points). The dependence of the predicted variable (energy) on the regressors and the correlation between regressors are well captured by the 4-component GMR model. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
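The pairwise contours in Fig. 5 are marginals of the 4-dimensional mixture. For a Gaussian mixture, the marginal over a subset of variables keeps the mixing weights and simply selects the matching entries of each component mean and covariance. The sketch below illustrates this for one component; the variable names follow the paper, while `marginal_component` and all numeric values are hypothetical.

```python
from itertools import combinations

VARIABLES = ["OAT", "OAHR", "Solar", "Energy"]

def marginal_component(mean, cov, keep):
    """Marginal of one Gaussian component over the variable indices in `keep`."""
    sub_mean = [mean[i] for i in keep]
    sub_cov = [[cov[i][j] for j in keep] for i in keep]
    return sub_mean, sub_cov

# Hypothetical single component of a 4-D mixture (values are made up)
MEAN = [20.0, 0.008, 150.0, 1200.0]
COV = [
    [9.0, 0.002, 5.0, 120.0],
    [0.002, 1e-6, 0.001, 0.05],
    [5.0, 0.001, 400.0, 300.0],
    [120.0, 0.05, 300.0, 10000.0],
]

# Four variables taken two at a time give six unique pairs
PAIRS = list(combinations(range(len(VARIABLES)), 2))
for i, j in PAIRS:
    sub_mean, sub_cov = marginal_component(MEAN, COV, (i, j))
    print(VARIABLES[i], "vs", VARIABLES[j], sub_mean, sub_cov)
```

Applying the same selection to every component, with unchanged weights, gives the 2-D mixture whose density would be contoured in each panel.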

nearest-neighbor approach [14], are more consistent than the global CIs, they appear noisier, especially in low-energy regions. As mentioned earlier, this instability in the CI results from low data density in the local neighborhood of a given point.

Fig. 7 shows the energy savings e_i = ŷ_i − y_i and their uncertainty for the GMR baseline model. Confidence intervals in this case are tighter in regions of good prediction (low energy use) and adapt locally to different conditions based on the uncertainty present in the training data. The GMR confidence intervals are also less noisy and adapt more smoothly to changes in outside conditions than the nearest-neighbor-based CI presented earlier.

For the MLR baseline model, the total energy savings for the post-retrofit period are in the range −9.3 to +133.5 MJ using global CI estimates, and between −14.4 and +126.4 MJ using localized CI estimates. For the GMR model, the energy savings were in

[Fig. 6 panels: (top) Polynomial Regression, electric energy (MJ), estimated vs. measured, first 6 months used for model training, pre- and post-retrofit periods marked; (middle) Global Uncertainty Quantification, Ebaseline − Emeas (MJ) with global CI; (bottom) KNN Uncertainty Quantification, Ebaseline − Emeas (MJ) with NN-based local CI. x-axis: Jan–Jan.]

Fig. 6. Polynomial baseline-model prediction compared with measured data pre- and post-retrofit (top); and estimated energy savings based on the global error estimation (middle) and kNN-based localized error estimation (bottom). (For interpretation of the references to color in text, the reader is referred to the web version of the article.)
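To make the contrast between the middle and bottom panels concrete, the sketch below compares a single global half-width with a nearest-neighbor half-width computed from the residuals of the k closest training points. It is a simplified stand-in, not the exact procedure of [14]; the function names, k = 5, the 1.96 quantile and the residual values are all assumptions for illustration.

```python
import math

def global_half_width(residuals, z=1.96):
    """One CI half-width for the whole data range, from the overall RMS residual."""
    rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return z * rms

def knn_half_width(x_query, x_train, residuals, k=5, z=1.96):
    """CI half-width from the RMS residual of the k training points nearest to x_query."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_query))
    nearest = order[:k]
    rms = math.sqrt(sum(residuals[i] ** 2 for i in nearest) / len(nearest))
    return z * rms

# Hypothetical residuals: small errors at low OAT, large errors at high OAT
X_TRAIN = [float(i) for i in range(20)]
RESIDUALS = [(-5.0 if i % 2 == 0 else 5.0) for i in range(10)] + \
            [(-50.0 if i % 2 == 0 else 50.0) for i in range(10)]

print("global:", round(global_half_width(RESIDUALS), 1))
print("local at x=2:", round(knn_half_width(2.0, X_TRAIN, RESIDUALS), 1))
print("local at x=17:", round(knn_half_width(17.0, X_TRAIN, RESIDUALS), 1))
```

The global value sits between the two local ones, so it is too wide where the model fits well and too narrow where it does not. With few neighbors the local estimate also fluctuates from point to point, which is one plausible source of the noise visible in the k-NN panel.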


[Fig. 7 panels: (top) GMR, electric energy (MJ), estimated vs. measured, first 6 months used for model training, pre- and post-retrofit periods marked; (bottom) GMR Uncertainty Quantification, Ebaseline − Emeas (MJ) with GMR-based local CI. x-axis: Jan–Jan.]

Fig. 7. GMR baseline-model prediction compared with measured data pre- and post-retrofit (top); and localized uncertainty quantification based on the GMR baseline model. (For interpretation of the references to color in text, the reader is referred to the web version of the article.)
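The savings curve and its band can be read as a per-period interval: the baseline prediction minus the measurement, widened by the CI half-width of the baseline at that point. A minimal sketch follows, assuming symmetric half-widths are already available from the baseline model; `savings_with_bounds` and the three sample days are hypothetical.

```python
def savings_with_bounds(predicted, half_widths, measured):
    """Per-period savings e_i = yhat_i - y_i with lower and upper bounds."""
    rows = []
    for y_hat, h, y in zip(predicted, half_widths, measured):
        e = y_hat - y
        rows.append((e - h, e, e + h))
    return rows

# Hypothetical post-retrofit days (MJ)
PREDICTED = [1000.0, 1200.0, 800.0]
HALF_WIDTHS = [50.0, 200.0, 30.0]
MEASURED = [900.0, 1100.0, 790.0]

ROWS = savings_with_bounds(PREDICTED, HALF_WIDTHS, MEASURED)
for low, e, high in ROWS:
    # Savings are only distinguishable from zero when the whole interval is positive
    verdict = "savings" if low > 0 else "inconclusive"
    print(f"e = {e:+.0f} MJ, interval [{low:+.0f}, {high:+.0f}] -> {verdict}")
```

The first two days have the same point estimate, but only the one with the tighter baseline interval supports a claim of savings, which is the practical benefit of localized bounds.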

[Fig. 8 panels: (top) MLR global CI with MLR k-NN CI-LB and CI-UB; (bottom) MLR global CI with GMR CI-LB and CI-UB. y-axis: CI around predicted electric energy (MJ), −400 to 400; x-axis: Jan–Dec.]

Fig. 8. Uncertainty quantification comparison: Comparison of global estimate of CI for MLR (blue band) with k-NN based localized CI for MLR (top); and comparison of MLR global CI estimate with GMR based localized CI (bottom). Localized CI upper bound (CI-UB) and lower bound (CI-LB) are shown as red and blue solid lines respectively in both top and bottom plots. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

[Fig. 9 panels: (top) Hourly Electricity Consumption (GMR), R² = 0.91 ± 0.038; (bottom) Hourly Electricity Consumption (MLR), R² = 0.90 ± 0.030. y-axis: cooling electric power (MJ); x-axis: 06/03 to 06/10. Legend: 95% CI, estimated, measured.]

Fig. 9. Hourly model predictions (black line), measured data (squares) and confidence intervals (blue bands) based on GMR (top) and MLR (bottom) based baseline models. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
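The two hourly models are scored by R². The paper does not spell out its formula in this section, so the sketch below assumes the standard coefficient of determination, 1 − SS_res/SS_tot; the function name and sample values are hypothetical.

```python
def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical hourly values (MJ)
MEASURED = [100.0, 120.0, 150.0, 130.0]
PREDICTED = [105.0, 118.0, 145.0, 132.0]
print(round(r_squared(MEASURED, PREDICTED), 3))
```

Since R² summarizes accuracy over the whole period, two models can score almost identically while differing in how well their confidence intervals track local errors.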

[Fig. 10 panels: Outside-air temperature (field data), OAT (°F), with mean trend; Outside-air humidity (field data), humidity (%), with mean trend. x-axis: Jan–Jan.]

Fig. 10. Outside conditions for 2011 for retail store used in field study.

[Fig. 11 panels: (top) GMR (field data), R² = 0.941, GMR-based local CI; (bottom) Polynomial Regression (field data), R² = 0.938, global CI. y-axis: electricity consumption (MW); x-axis: Jan–Jan. Legend: estimated, measured; first 6 months used for model training.]

Fig. 11. Comparative evaluation of GMR (top) and MLR (bottom) baseline models using 2011 field data from a California retail store.

a much tighter range, +5.4 to +102.7 MJ. A comparison of uncertainty quantification using (1) global estimation of the CI, (2) k-NN-based localized CI and (3) GMR-based localized CI is shown in Fig. 8.

To demonstrate the performance of the GMR model at a finer time scale, we compare the GMR and MLR models for hourly prediction of cooling electric energy use. We use the same regressors and response variable as for the daily models. The results for a week's worth of predictions and CI estimates are shown in Fig. 9. As with the GMR model for daily energy use, the localized CIs adapt smoothly to changes in input conditions and appear more consistent with the observed prediction errors than those of the MLR model. The overall prediction performance of the two models is comparable in this case, with R² values of 0.91 and 0.90 for the GMR and MLR models, respectively.

3.2. Field data

The proposed GMR model was used for baseline modeling of a large retail store building located in CA, U.S.A. The total gross floor area of this building is about 143,000 ft² and the sales floor area is about 85,000 ft². The HVAC system of this building includes 28 Roof Top Units (RTUs); a refrigeration system serves food items in multiple display cases and is controlled by a Refrigeration Monitoring and Control System. Other major energy-using subsystems are sales floor lighting, parking lights and plug loads. Field data was collected every five minutes over one year (2011) and included whole-building energy use, subsystem energy use and weather information: outside-air dry-bulb temperature, humidity and direct solar radiation. The weather profile for this location is shown in Fig. 10.

Both GMR and MLR models were trained for this retail store. Data from the first 6 months (January–June) was used for training, and the models were then used to make predictions for the entire year. For the GMR model, 4 Gaussian modes were used, while for the MLR model a polynomial of degree 3 was used based on the BIC criterion, and global confidence intervals were computed for prediction errors. Fig. 11 shows the predicted values (black lines), confidence intervals (blue patches) and measured values (squares) of whole-building energy use for 2011 for the GMR (top) and MLR (bottom) baseline models. The GMR model had an R² value of 0.941, compared with 0.938 for the MLR model. Its predictions were thus only slightly better, but it also provided far more consistent confidence estimates for building energy use than the MLR model.

4. Conclusion

This paper presented a novel approach based on Gaussian Mixture Regression (GMR) for modeling building energy use with parameterized and locally adaptive uncertainty quantification. The proposed GMR approach was applied to two cases: (1) a synthetic data set generated by the DoE reference model for a supermarket and (2) a field data set for a retail store building located in California. The results from the GMR model were compared with multivariate linear regression for response variable prediction, and confidence estimation was compared with a recent technique proposed in [14]. It was shown that the GMR approach performed better in both prediction accuracy and local confidence estimation. The proposed approach was also shown to have the following key advantages: (1) integrated response surface modeling and local uncertainty quantification, (2) low impact of correlated regressors, (3) less sensitivity to data sparsity, and (4) formal model structure selection.

Mixture modeling is not restricted to using Gaussian densities for modeling the distribution in data; for certain scenarios other constituent densities, such as the log-normal, might be more relevant. Also, the linear dependence structure assumed in this work can be relaxed by dependence modeling approaches such as copula functions. These extensions of the current work are proposed to be explored in the domain of building energy performance modeling.

References

[1] ASHRAE, ASHRAE Guideline 14-2002 for Measurement of Energy and Demand Savings, 2002.
[2] CBECS, Commercial Buildings Energy Consumption Survey, 2003.
[3] M. Deru, K. Field, D. Studer, K. Benne, B. Griffith, P. Torcellini, M. Halverson, D. Winiarski, B. Liu, M. Rosenberg, J. Huang, M. Yazdanian, D. Crawley, U.S. Department of Energy Commercial Reference Building Models of the National Building Stock, U.S. Department of Energy, Energy Efficiency and Renewable Energy, Office of Building Technologies, 2010.
[4] A. Dhar, T. Reddy, D. Claridge, A Fourier series model to predict hourly heating and cooling energy use in commercial buildings with outdoor temperature as the only weather variable, Journal of Solar Energy Engineering 121 (1999) 47–53.
[5] EVO, IPMVP Volume I: Concepts and Options for Determining Energy and Water Savings, 2012.
[6] M. Fels, Special issue devoted to measuring energy savings, the Princeton scorekeeping method (PRISM), Energy and Buildings (1986) 5–18.
[7] Y. Heo, V.M. Zavala, Gaussian process modeling for measurement and verification of building energy savings, Energy and Buildings 53 (2012) 7–18.
[8] S. Katipamula, D. Claridge, Use of simplified system models to measure retrofit energy savings, Journal of Solar Energy Engineering 115 (2) (1993) 57–68.
[9] J.K. Kissock, A methodology to measure retrofit energy savings in commercial buildings, Texas A&M University Department of Mechanical Engineering, 1993 (PhD thesis).
[10] M. Krarti, J. Kreider, D. Cohen, P. Curtiss, Prediction of energy saving for building retrofits using neural networks, Journal of Solar Energy Engineering 120 (3) (1998) 47–53.
[11] R. Neal, G.E. Hinton, A view of the EM algorithm that justifies incremental sparse and other variants, in: Learning in Graphical Models, Kluwer Academic Publishers, 1998, pp. 355–368.
[12] T.A. Reddy, N.F. Saman, D.E. Claridge, J.S. Haberl, W.D. Turner, A.T. Chalifoux, Baselining methodology for facility-level monthly energy use - Part 1: Theoretical aspects, ASHRAE Transactions 103 (2) (1997).
[13] R.J. Steele, A.E. Raftery, Performance of Bayesian model selection criteria for Gaussian mixture models, Technical Report 559, University of Washington, Dept. of Statistics, 2009.
[14] K. Subbarrao, Y. Lei, T.A. Reddy, The nearest neighborhood method to improve uncertainty estimates in statistical building models, ASHRAE Transactions 117 (2) (2011).
[15] H.G. Sung, Gaussian mixture regression and classification, Rice University, Houston, TX, 2004 (PhD thesis).
[16] S. Wong, K.K. Wan, T.N. Lam, Artificial neural networks for energy analysis of office buildings with daylighting, Applied Energy 87 (2) (2010) 551–557.
[17] J. Wu, T.A. Reddy, D. Claridge, Statistical modeling of daily energy consumption in commercial buildings using multiple regression and principal component analysis, in: Eighth Symposium of Improving Building Systems in Hot and Humid Climate, Dallas, TX, 1992, pp. 155–164.
[18] L. Xu, M.I. Jordan, On convergence properties of the EM algorithm for Gaussian mixtures, Neural Computation 8 (1995) 129–151.
[19] J. Yang, H. Rivard, R. Zmeureanu, On-line building energy prediction using adaptive artificial neural networks, Energy and Buildings 37 (12) (2005) 1250–1259.