An application of Bayesian Network approach for selecting energy efficient HVAC systems




Journal of Building Engineering 25 (2019) 100796

Contents lists available at ScienceDirect

Journal of Building Engineering journal homepage: www.elsevier.com/locate/jobe


Zhichao Tian a,b,∗, Binghui Si a,b, Xing Shi a,b, Zigeng Fang c

a School of Architecture, Southeast University, Nanjing, China
b Key Laboratory of Urban and Architectural Heritage Conservation, Ministry of Education, China
c The Bartlett School of Construction and Project Management, University College London, London, UK

ARTICLE INFO

Keywords: Machine learning; Bayesian network; Building performance design; HVAC selection

ABSTRACT

The conventional approach to selecting HVAC systems relies on a designer's knowledge and experience, which may lead to flawed decisions. The ever-growing accumulation of building performance data makes HVAC system selection based on machine learning algorithms possible. This study presents an approach in which the Bayesian Network technique is applied to select the most energy efficient primary HVAC systems. The approach is developed on the 2012 Commercial Building Energy Consumption Survey (CBECS) database. The first step clusters a group of buildings similar to the target building, with Euclidean distance used to measure the similarity between a building and the target building. In parallel, a survey is carried out to investigate the major factors that designers consider when selecting the primary HVAC system. The survey results show that climate zone, design cooling/heating load, and principal activity type are the three main factors considered by designers, and that only 13% of designers value building energy consumption. In this study, all factors that could potentially influence the HVAC system's energy consumption are used to construct the directed acyclic graph of the proposed Bayesian Network Classifier. The classifier is trained with data from highly energy efficient buildings after irrational outliers are filtered out. The trained classifier is then applied to select the primary cooling systems for three case buildings. The results indicate that the selected systems coincide with common HVAC design logic. The proposed method offers designers a way to select energy efficient HVAC systems using data from hundreds of highly energy efficient buildings. This study demonstrates the feasibility and capability of data-driven building design.

1. Introduction

1.1. Potential of data-driven building energy efficient design

Buildings consume approximately 40% of the total final energy in developed countries such as the U.S. and the U.K., of which HVAC systems account for approximately 50% [1]. In the past two decades, green building design and certification have become prevailing practices for improving the energy efficiency of building components, especially HVAC systems. In China, several building design regulations and standards have been promulgated to improve the energy efficiency of HVAC systems [2]. Selecting HVAC systems that can maintain the desired environmental conditions is the first and foremost step towards low-energy HVAC operation. To select appropriate HVAC systems, several criteria, including those related to occupant comfort, are traditionally considered: temperature, humidity, air changes per hour, first cost, operating cost, reliability, performance limitations, and available utility sources [3]. Currently, energy simulation is the most popular method for evaluating the performance of different HVAC solutions [4]. However, the gaps between measured data and simulation results jeopardize its credibility [5,6]. In recent years, several green building institutions and governments have built their own large-scale building databases, such as the Building Performance Database built by the U.S. Department of Energy [7], the Commercial Building Energy Consumption Survey (CBECS) database gathered by the U.S. Energy Information Administration [8], the EU Buildings Database built by the European Commission [9], and the green building repository organized by the Building and Construction Authority of Singapore [10]. Information about energy patterns and efficiency measures can now be extracted from massive building performance data to support energy policy-making, city planning, and highly efficient building design [11]. For example, in order to picture the building

Corresponding author. School of Architecture, Southeast University, Nanjing, China. E-mail address: [email protected] (Z. Tian).

https://doi.org/10.1016/j.jobe.2019.100796 Received 18 November 2018; Received in revised form 1 May 2019; Accepted 5 May 2019 Available online 11 May 2019 2352-7102/ © 2019 Elsevier Ltd. All rights reserved.


energy use at a city scale, Shahrokni et al. [12] analyzed the meter data of 8245 multi-apartment residential buildings. They found that if all the buildings were retrofitted to comply with current building standards, the heating energy usage would be reduced by one-third. Their study suggested that public policies should target the buildings with the highest payback potential. Mathew et al. [13] analyzed the usefulness, limitations, and future applications of current building performance databases. Many studies have attempted to build machine learning models for building energy prediction [14–16]. However, very few studies have attempted to apply data-driven methods to building energy efficient design [17,18].

1.2. Related works

Data-driven building energy efficient design normally relies on powerful statistical or machine learning algorithms to derive highly efficient building design solutions. Marasco and Kontokosta [19] developed a user-facing falling rule list (FRL) classifier to predict the eligibility of energy conservation measures given a specific set of building characteristics data. Moreover, the developed linear decision lists allow building stakeholders to easily identify possible energy conservation renovation opportunities by limiting inputs to the most relevant factors. Makasis et al. [20] applied machine learning techniques to determine the maximum amount of thermal energy that can be provided by energy piles. In addition, they validated their models' performance using finite element numerical simulation under different geometries and thermal load distributions.

The Bayesian Network has been applied in many fields to calculate the probability of an event based on prior knowledge of a group of dependent variables that may relate to this event. In the field of HVAC operation, the Bayesian Network has been deployed to identify faults intelligently. Zhao et al. [21,22] proposed a diagnostic Bayesian Network model for air handling unit fault detection. Cai et al. [23] developed a Bayesian Network based multi-source information fusion fault diagnosis methodology. They built two Bayesian Network models, one with sensor data and the other with observed information. In that study, the structure of the Bayesian Network was constructed with a fault layer and a symptom layer. Kim et al. [24] developed a multi-criteria decision-making process for selecting the better HVAC system out of two candidate systems using the Bayesian Markov chain Monte Carlo method. A Bayesian Network approach has also been applied to compare different building designs by estimating the effects of the thermal indoor environment on the mental performance of office workers [25].

Many studies apply data-driven methods to building energy consumption prediction. However, very few of them can be used for HVAC system selection or configuration. Amasyali and El-Gohary [16] reviewed data-driven approaches across a large variety of building energy consumption prediction studies. They pointed out that data-driven approaches typically focus on heating, cooling, lighting, and total energy forecasting. Deng et al. [26] applied six machine learning based regression methods to predict the energy use intensity of US commercial office buildings based on the CBECS 2012 database. Even though HVAC type was one of the attributes used to build the models, the poor prediction accuracy indicated that these models were not suitable for selecting energy efficient HVAC systems. Currently, building energy simulation is the most frequently used method for selecting HVAC systems. Youssef and Krarti [27] developed a simple energy simulation environment to select HVAC systems for residential buildings. Optimization algorithms were also integrated into the energy simulation environment. The results indicated a potential life-cycle cost reduction of 10–25%, but the method can only be applied to residential buildings. Wright et al. [28] applied a multi-objective genetic algorithm to optimize the size of the HVAC systems, trading off energy cost against occupant thermal comfort; the HVAC system was modeled with steady-state component models that are quite simple. Magdalena [29] presented a simplified optimization method for the design of the HVAC system using the daily load profiles of zones. In that study, the optimization variables included the grouping of zones and the number of systems. The energy consumption of the HVAC system was calculated with DOE-2, a detailed energy simulation program. The results indicated that the energy savings of different HVAC system configurations depend on a variety of variables (e.g. building configuration, imposed constraints, types of HVAC systems, and corresponding control strategies).

1.3. Proposed approach

When selecting HVAC systems for a building, HVAC designers may choose a familiar system or adopt a system according to engineering guidance, such as PNNL's recommendations for HVAC equipment of post-1980 buildings [30]. One reason why individuals often make bad decisions is that they rely too heavily on their own subjective experience. Misleading experience here refers to memories of previously encountered situations that appear similar to the current one. Misleading experience contributes to more than half of flawed decisions [31]. Machine learning methods may be able to enhance the robustness of HVAC selection decisions and reduce the influence of misleading personal experience. This study attempts to develop a simple but powerful method to help designers select HVAC primary systems in the early design stage. A Bayesian Network Classifier is developed to recommend energy efficient HVAC systems using the CBECS 2012 database. Moreover, a focused survey is conducted to reveal the attributes considered by designers. Both building design attributes and measured energy data are fed into the Bayesian Network Classifier to provide a decision-making tool that enhances the robustness and scientific rigor of building efficiency design. The rest of the paper is organized as follows: Section 2 discusses the data and methodology; Section 3 provides the results; Section 4 gives a detailed discussion; finally, Section 5 outlines key conclusions and future work.

2. Methodology

An overview of the methodology of the proposed Bayesian Network Classifier for primary cooling system selection is shown in Fig. 1, with the following steps:

1. Select a suitable database.
2. Conduct a similar building analysis. Design strategies from buildings similar to the design building are preferable to those from very different buildings. In this study, the Bayesian Network model is trained only with the high-performance buildings from the selected group of similar buildings.
3. Conduct a survey to find out the attributes considered by HVAC designers when they select primary cooling/heating systems. This survey helps to understand current practice. These attributes contribute to the construction of the Bayesian Network Classifier.
4. Build the Bayesian Network Classifier, in which attributes that influence cooling energy consumption are included.
5. Recommend HVAC primary systems for three case buildings using the proposed Bayesian Network Classifier.

The red arrow line in Fig. 1 shows the process of applying the Bayesian Network Classifier to select HVAC systems.

Fig. 1. A logical path of applying the Bayesian Network Classifier for selecting cooling system.

2.1. Data and pre-processing

The proposed approach requires a database that contains a significant amount of data with attributes including building type, climate zone, floor area, envelope, HVAC system, and energy consumption. Additionally, a part of the buildings in the database should be energy efficient. Otherwise, it is harder to guarantee the successful application of machine learning techniques to energy efficient design. Based on the above criteria, the 2012 CBECS database was selected as the starting point. The 2012 CBECS database is an open-access database with rich attributes and superior quality. The database is obtained from the U.S. Energy Information Administration [8]. Owing to confidentiality concerns, data that may leak private information have been transformed into other forms or removed. The CBECS database contains 6720 records that represent an estimated 5.6 million commercial buildings in the United States. Each record includes 1119 attributes, although most of them are not described in detail. For example, the construction of the external wall is classified into nine categories as follows: '1' = 'Brick, stone, or stucco', '2' = 'Pre-cast concrete panels', '3' = 'Concrete block or poured concrete (above grade)', '4' = 'Aluminum, asbestos, plastic, or wood materials (siding, shingles, tiles, or shakes)', '5' = 'Sheet metal panels', '6' = 'Window or vision glass (glass that is seen through)', '7' = 'Decorative or construction glass', '8' = 'No one major type', and '9' = 'Other'. Although the CBECS database lacks detailed attributes about operation schedules, it contains the attributes necessary for this study. Table 1 lists the attributes and their abbreviations used in this study.

In this study, we assume that the 20% of buildings with the lowest energy usage intensity (EUI) for cooling are highly efficient. EUI_Cooling is defined as the annual cooling energy consumption per unit area. For cooling system selection, the cooling energy usage intensity is calculated by Eq. (1). The CBECS data show that the cooling energy usage of the lowest-consuming buildings is one or two orders of magnitude smaller than that of the average building. To increase fidelity, the 3% of buildings with the lowest cooling energy usage are excluded. Thus, buildings whose EUI_Cooling ranks between the lowest 3% and the lowest 23% are treated as highly energy efficient buildings.

$$EUI_{Cooling} = \frac{E_{Cooling}}{A} \tag{1}$$

where E_Cooling is the energy for cooling, kBtu; A is the total building area, ft².

Cooling load is one of the key parameters used for HVAC selection. However, this attribute was not collected in the CBECS 2012 database. The cooling loads used by many researchers were generated by building energy simulation software [32], but it is impossible to build energy models for all buildings in the database. One simplified method for calculating the cooling load in the early design stage is the cooling load index method. This method states that the cooling load of a building is approximately equal to the product of the cooling load index and the building area, as Eq. (2) shows. The Practical Heating and Air Conditioning Design Manual gives the range of the cooling load index for each building type [33]. For example, the range of the cooling load index of an office building is 90–115 W/m². HVAC designers may adjust the cooling load index based on their understanding of the circumstances of the building, especially its climate zone or surface area. In this study, buildings in the northeast of the U.S. are given the maximum value of the range; buildings in the midwest are given the middle value of the range; buildings in the west are given the minimum value of the range; and buildings in the south are given 80% of the minimum value.
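The regional rule for the cooling load index can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the region keys, the use of the midpoint for the Midwest, and the example building (a hypothetical 71,000 ft² Midwest office) are introduced here and are not taken from the study; only the 90–115 W/m² office range is quoted from the text.

```python
# Sketch of the cooling load index method (Eq. (2)): CL = I * A.
# Only the office range (90-115 W/m2) is quoted in the text; all names
# and the example building below are illustrative assumptions.

INDEX_RANGE_W_PER_M2 = {"office": (90.0, 115.0)}
SQFT_TO_M2 = 0.09290304  # CBECS reports floor area in square feet


def cooling_load_index(building_type: str, region: str) -> float:
    """Pick an index from the range according to the U.S. region."""
    low, high = INDEX_RANGE_W_PER_M2[building_type]
    rules = {
        "northeast": high,             # maximum of the range
        "midwest": (low + high) / 2,   # middle of the range
        "west": low,                   # minimum of the range
        "south": 0.8 * low,            # 80% of the minimum
    }
    return rules[region]


def cooling_load(building_type: str, region: str, area_m2: float) -> float:
    """Approximate design cooling load in W."""
    return cooling_load_index(building_type, region) * area_m2


# Hypothetical 71,000 ft2 office located in the Midwest.
load_w = cooling_load("office", "midwest", 71_000 * SQFT_TO_M2)
```

Keeping the index lookup separate from the multiplication makes it easy to add the ranges of other building types once they are taken from the design manual.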


Table 1. Attributes and their abbreviations.

| Attribute | Abbreviation |
|---|---|
| Building America Climate Region | PUBCLIM |
| Building Area | SQFT |
| Building Shape | BLDSP |
| Cooling Loads | COOLLOAD |
| Census Division | CENDIV |
| Cooling Degree Days (base 65 °F) | CDD65 |
| Cooling System | CS |
| Energy Usage Intensity | EUI |
| Heating System | HS |
| High efficient cooling system (HECS) | HECS |
| Main Cooling Equipment | MAINCL |
| Number of Employees | NWKER |
| Percent Exterior Glass | CLSSPC |
| Percent Cooled | COOLP |
| Principal Building Activity | PBA |
| Wall Construction Material | WLCNS |
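Using the EUI and HECS attributes of Table 1, the high-efficiency labelling described in Section 2.1 could be sketched as follows. The record layout (`E_COOL`, `SQFT`), the function names, and the rank-based percentile convention are assumptions made for illustration; the study does not specify how the 3% and 23% thresholds are computed.

```python
# Sketch: compute cooling EUI (Eq. (1)) and flag buildings that rank
# between the lowest 3% and the lowest 23% as high-efficient (HECS = 1).
# Field names and the rank-based percentile rule are assumptions.

def cooling_eui(e_cooling_kbtu: float, area_sqft: float) -> float:
    """Annual cooling energy per unit floor area, kBtu/ft2."""
    return e_cooling_kbtu / area_sqft


def label_hecs(records, low=0.03, high=0.23):
    """Return records sorted by EUI with 'EUI' and 'HECS' fields added."""
    scored = [dict(r, EUI=cooling_eui(r["E_COOL"], r["SQFT"])) for r in records]
    ranked = sorted(scored, key=lambda r: r["EUI"])
    n = len(ranked)
    for rank, rec in enumerate(ranked):
        fraction = rank / n  # share of buildings with a lower EUI
        rec["HECS"] = 1 if low <= fraction < high else 0
    return ranked


# Hypothetical data: 100 buildings of 10,000 ft2 with increasing energy use.
buildings = [
    {"ID": i, "E_COOL": 1000.0 * (i + 1), "SQFT": 10_000.0} for i in range(100)
]
labelled = label_hecs(buildings)
efficient_ids = [r["ID"] for r in labelled if r["HECS"] == 1]
```

With 100 hypothetical buildings, the three lowest consumers are excluded as suspicious and the next twenty are labelled high-efficient.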

$$CL_i = I_i \cdot A \tag{2}$$

where I_i is the cooling load index of building type i, W/m²; A is the total building area, m².

2.2. Similarity analysis

In data mining, similarity analysis is one of the most commonly used data pre-processing steps. Several methods are commonly used for similarity analysis, such as the Euclidean distance, road network distance, Manhattan distance, and Minkowski distance [34]. In this study, the Euclidean distance is employed to analyze the similarity of a pair of buildings. If two buildings are similar, we assume that many of their attributes will be highly similar (e.g. areas, building shapes, locations, principal building activities, year of construction, number of employees, and climate regions). The pairwise distance between observations is then used to denote the similarity of different buildings. The Euclidean distance is given by:

$$d(p, q) = d(q, p) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \dots + (p_n - q_n)^2} \tag{3}$$

where p and q are a pair of observations; p_1, p_2, …, p_n are the attributes of p; q_1, q_2, …, q_n are the attributes of q.

Several building attributes, such as wall types and building shapes, are categorical. A categorical attribute of a pair of observations does not exhibit any similarity unless the two values are identical. Different attributes are assumed to have different impacts on the similarity calculation, so weight factors are used to express these differences. There is no fixed pattern for determining these weights; practitioners can adopt different weights in different scenarios. Table 2 lists the attributes used to calculate the similarity and the corresponding weights used in this study. The difference of each attribute is calculated as follows:

$$diff_i = Weight_i \cdot (V_{c,i} - V_{R,i}) \tag{4}$$

where Weight_i is the weight of attribute i; V_{c,i} is the value of attribute i of the case building; V_{R,i} is the value of attribute i of the reference building. The similarity between the case and reference building is expressed as follows:

$$S(p, q) = \sqrt{\sum_{i=1}^{n} diff_i^2} \tag{5}$$

where diff_i is the difference of attribute i between the case and reference buildings as calculated by Eq. (4). As expressed in Eq. (5), the smaller the value is, the more similar the two buildings are. The similarity is calculated for each record by Eqs. (4) and (5). Finally, all the similarity values are sorted in ascending order, and the top N (in this study, N = 300) records are used for further analysis.

Table 2. Attributes used to calculate similarity.

| Attributes | Weight | Variable type |
|---|---|---|
| Principal building activity | 5 | Categorical |
| Building area | 2 | Continuous |
| Wall construction material | 1 | Categorical |
| Building shape | 1 | Categorical |
| Percent exterior glass | 2 | Continuous |
| Year of construction | 2 | Continuous |
| Number of employees | 1 | Continuous |
| Building America Climate Region | 5 | Categorical |

2.3. Determining factors for primary HVAC system selection

Prior to applying Bayes' theorem to recommend a set of heating and cooling systems for designers, we conducted a focused survey to reveal the attributes considered by designers when they select the heating and cooling system in the early design stage. The questionnaire contains two questions. The first asks about the participant's occupation. The second asks participants to indicate which of eight candidate attributes they consider when choosing the primary HVAC systems. There are 50 interviewees in total, most of whom have a building service engineering background. Sixteen HVAC experts also participated in the second question. The survey results are shown in Fig. 2. Fig. 2 shows that the top factors considered by designers are climate zone, main activity, and design cooling/heating loads. Occasionally, designers consider the requirements of the indoor environment and the initial investment. Only a few designers consider annual building energy consumption and the corresponding fees, possibly because designers cannot acquire enough operational and budget information in the early design stage.

Fig. 2. Votes of different attributes considered by HVAC designers.

2.4. Bayesian Network Classifier

Bayes' theorem is a simple but powerful method to explore the potential relationship between results and their causes from statistical data. Eq. (6) states Bayes' theorem mathematically as follows:

$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \tag{6}$$

where A and B denote events; P(B) denotes the probability of observing event B; P(A) denotes the probability of observing event A; P(A|B) denotes the conditional probability that event A occurs given that event B occurred; P(B|A) denotes the conditional probability that event B occurs given that event A occurred.

If every attribute is independent of the others, a Naïve Bayesian Classifier can be used to fulfill the recommendation goal. A set of bivariate analyses is implemented to test the dependence of any two attributes. The pairwise p-values for a case building are presented in Table 3. If the significance level is set to α = 0.01 and the p-value is lower than α, the null hypothesis of independence is rejected, and the class-conditional independence of the two attributes does not hold. As a result, it is unreliable to apply the Naïve Bayesian Classifier in this study. More information on the bivariate analysis and p-value is given in Ref. [35]. Furthermore, the Apriori algorithm is deployed to analyze the co-occurrence of heating and cooling systems. The Apriori algorithm is an unsupervised learning algorithm designed for mining frequent items that satisfy a minimal support level and the association rules between frequent items [36].

In contrast to the Naïve Bayesian method, the Bayesian Network, which is also based on Bayes' theorem, considers the dependencies among attributes. The Bayesian Network employs a directed acyclic graph and a set of conditional probability tables (CPT) to represent the relationship between different sets of attributes [36]. A directed acyclic graph represents the conditional probability relationships between a group of attributes, for example, a disease (lung cancer) and its associated factors and symptoms (smoker, family history, emphysema, dyspnea, and X-ray result). As shown in the survey results, climate zone, main activity, and design cooling/heating loads are used to determine the HVAC system. Furthermore, Lokhandwala and Nateghi [37] applied statistical and machine learning algorithms to the CBECS database and identified principal building activity, cooling degree days, percent cooled, and census division as the four most important predictors of building cooling energy consumption. Based on this information, a directed acyclic graph is constructed as shown in Fig. 3. Each node in the directed acyclic graph denotes an attribute. All the attributes are discrete. A directed arrow represents the dependence between two attributes. For instance, Building America climate region, design cooling load, and principal building activity are the "parents" of main cooling equipment, which indicates that these three attributes are able to affect the selection of the main cooling equipment. Specifically, PUBCLIM, MAINCL, PBA, CDD65, and COOLP are the parent nodes of the high efficient cooling system (HECS). For the proposed building, given that it has a high-efficient cooling system, the Bayesian Network Classifier is capable of recommending the most likely cooling system. Thus, this cooling system recommendation process satisfies the maximum a posteriori hypothesis.

In order to successfully construct the Bayesian Network Classifier, observations of an attribute that account for a small proportion are combined into one category. Otherwise, the number of entries in the conditional probability table increases exponentially. For example, if each of the five parent attributes of HECS has 5 observations, then there are 5^5 × 2 = 6250 combinations in the CPT. The number of observations is reduced based on their original distributions. For example, the initial distribution of PUBCLIM is 'Very cold/Cold': 0.84; 'Mixed-humid': 0.075; 'Hot-dry/Mixed-dry/Hot-humid': 0.070; 'Marine': 0.016. Since the shares of 'Mixed-humid', 'Hot-dry/Mixed-dry/Hot-humid', and 'Marine' are comparatively small, these three observations are combined into a new category. Continuous attributes are binned into several categories; for example, cooling degree days are binned into the categories shown in Table 4. Table 4 summarizes the states of the attributes of the proposed building. The conditional probability tables for MAINCL and HECS are summarized from the data of all similar buildings. In this study, the Bayesian Network Classifiers are developed with Pomegranate, a Python module for probabilistic modeling published by Jacob Schreiber [38].

3. Results

Three buildings were selected as the case buildings. The case buildings are located in three different climate zones. Case 1 is a small building, case 2 is a medium-sized building, and case 3 is a large building. Detailed information on these buildings is presented in Table 5. We take case building 1 as an example to illustrate the similarity analysis. Table 6 lists the top 5 buildings that are similar to case building 1, as well as the attribute values of these buildings. Table 6 shows that none of the buildings has exactly the same attributes as case building 1. Table 7 lists all the heating and cooling systems in the CBECS database. The observed frequencies of heating and cooling systems in the buildings similar to case building 1, in the high-performance buildings among those similar buildings, and in all buildings are shown in Figs. 4 and 5. As shown in Fig. 4, for commercial buildings in the U.S., the packaged central unit (HS-2) and boilers inside (HS-3) are the most commonly

Table 3. Dependence between pair-wise attributes for a case building.

| Attribute 1 | Attribute 2 | P-value | Dependence |
|---|---|---|---|
| PUBCLIM | COOLLOAD | 0.952 | No |
| PUBCLIM | PBA | 0.994 | No |
| PBA | CENDIV | 2.82e-20 | Yes |
| PBA | MAINCL | 1.03e-10 | Yes |
| PBA | CDD65 | 0.0 | Yes |
| PBA | COOLP | 1.87e-33 | Yes |
| MAINCL | CDD65 | 1.0 | No |
| CDD65 | COOLP | 0.0 | Yes |
| MAINCL | COOLP | 0.992 | No |
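A dependence check of the kind summarized in Table 3 could be sketched as below. The study does not name the bivariate test it uses, so the chi-square test of independence is an assumption here, as are the function name and the hypothetical samples. The sketch is restricted to two binary attributes, because with one degree of freedom the p-value has a closed form that needs only the standard library.

```python
# Sketch: chi-square test of independence for two binary attributes.
# The paper does not name its bivariate test; chi-square is an assumption.
import math
from collections import Counter


def chi_square_2x2(xs, ys):
    """Return (statistic, p_value) for two equally long 0/1 sequences."""
    n = len(xs)
    counts = Counter(zip(xs, ys))
    row = [sum(counts[(i, j)] for j in (0, 1)) for i in (0, 1)]
    col = [sum(counts[(i, j)] for i in (0, 1)) for j in (0, 1)]
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n
            stat += (counts[(i, j)] - expected) ** 2 / expected
    # With one degree of freedom the chi-square tail equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


ALPHA = 0.01  # significance level used in the text

# Hypothetical samples: 'pba' is strongly tied to 'coolp', 'maincl' is not.
pba = [0] * 50 + [1] * 50
coolp = [0] * 45 + [1] * 5 + [0] * 5 + [1] * 45
maincl = [0, 1] * 50

stat_dep, p_dep = chi_square_2x2(pba, coolp)
stat_ind, p_ind = chi_square_2x2(pba, maincl)
dependent = p_dep < ALPHA       # independence rejected
independent = p_ind >= ALPHA    # independence not rejected
```

Attributes with more than two states would need the general chi-square distribution, for which a statistics library is the more practical choice.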


Fig. 3. Proposed Directed acyclic graph of Bayesian Network for primary cooling system selection.
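The maximum a posteriori selection over the graph of Fig. 3 can be sketched without any library, assuming the two parent sets named in the text are the only factors that involve MAINCL. The Laplace smoothing, the function names, and the hypothetical records are assumptions made for this illustration; the study itself uses Pomegranate, whose API is not reproduced here.

```python
# Sketch of the maximum a posteriori choice of MAINCL given HECS = 1.
# Parent sets follow the text; data, smoothing and names are assumptions.
from collections import Counter

PARENTS = {
    "MAINCL": ("PUBCLIM", "COOLLOAD", "PBA"),
    "HECS": ("PUBCLIM", "MAINCL", "PBA", "CDD65", "COOLP"),
}
MAINCL_STATES = (1, 2, 3, 4)  # states of main cooling equipment (Table 4)


def fit_cpt(records, child, n_states, alpha=1.0):
    """Estimate P(child | parents) from counts with Laplace smoothing."""
    joint, marginal = Counter(), Counter()
    for r in records:
        key = tuple(r[p] for p in PARENTS[child])
        joint[(key, r[child])] += 1
        marginal[key] += 1

    def prob(value, evidence):
        key = tuple(evidence[p] for p in PARENTS[child])
        return (joint[(key, value)] + alpha) / (marginal[key] + alpha * n_states)

    return prob


def recommend(records, evidence):
    """Return the MAINCL state with the highest posterior given HECS = 1."""
    p_maincl = fit_cpt(records, "MAINCL", n_states=len(MAINCL_STATES))
    p_hecs = fit_cpt(records, "HECS", n_states=2)
    scores = {}
    for m in MAINCL_STATES:
        full = dict(evidence, MAINCL=m)
        # Root-node priors are identical for every candidate and cancel out.
        scores[m] = p_maincl(m, full) * p_hecs(1, full)
    total = sum(scores.values())
    posterior = {m: s / total for m, s in scores.items()}
    return max(posterior, key=posterior.get), posterior


# Hypothetical similar buildings sharing the attributes of the case building.
base = {"PUBCLIM": 1, "COOLLOAD": 8, "PBA": 1, "CDD65": 2, "COOLP": 1}
similar = (
    [dict(base, MAINCL=3, HECS=0)] * 6    # chillers: common, not efficient
    + [dict(base, MAINCL=2, HECS=1)] * 4  # packaged units: mostly efficient
    + [dict(base, MAINCL=2, HECS=0)] * 1
)
best, posterior = recommend(similar, base)
```

In this toy data set the most frequent system (state 3) is not the one recommended (state 2), because the recommendation is conditioned on the building being high-efficient.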

used heating equipment. Compared with similar buildings, many buildings with high-efficient heating systems are equipped with heat pumps (HS-5), which indicates that buildings may consume less energy for heating if heat pumps are installed. Central chillers inside (CS-3) and packaged air conditioning units (CS-2) are the two dominant cooling systems, together accounting for 81.4%. Table 8 lists the observed frequency of different cooling systems in the similar buildings of the three case buildings. For case buildings 2 and 3, packaged air conditioning units (CS-2) and central chillers inside (CS-3) are the most frequently observed systems, respectively.

In the Apriori analysis for buildings similar to case building 1, with the minimum support level set to 0.025, the frequent systems include HS-3, HS-5, CS-2, CS-3, and CS-5. The pairwise frequent systems include {HS-3, CS-3}, {HS-4, CS-3}, {HS-5, CS-5}, {HS-2, CS-2}, and {HS-3, CS-2}. After the frequent items are determined in similar buildings, it is straightforward to calculate the association rules between them using a minimum confidence. Table 9 lists the association rules from the systems in the first column to the systems in the second column with a minimum confidence of 0.1. Systems in the first column also correspond to all the items that satisfy the minimum support level of 0.025. The table reveals the traditional combinations of heating and cooling systems. For example, when HS-3 is adopted, the likelihood of CS-3 being used is 0.488 and that of CS-2 is 0.465. The heat pump is an HVAC system that includes both heating and cooling modes. Thus, the confidence from heating system 5 to cooling system 5, listed in the last row of Table 9, should be 1.0. The observed confidence is 0.75, so some faulty records must exist.

The results of the Bayesian Network Classifier are listed in Table 10. The predicted systems for the case buildings are packaged air conditioning units or central chillers inside. Case building 1 has a smaller cooling load owing to its small building area, and thus the Bayesian Network Classifier recommends packaged air conditioning units. For a building with a smaller cooling load, it is reasonable to select a packaged air conditioning system because it typically has a smaller cooling capacity than systems with central chillers. For case buildings 1 and 2, the Bayesian Network Classifiers recommend cooling systems that differ from the most frequently used systems in similar buildings, which are central

Table 5. Main attributes of the case buildings.

| Attribute | Case 1 | Case 2 | Case 3 |
|---|---|---|---|
| Principal building activity | Office | Lodging | Office |
| Climate zone | Cold or very cold | Mixed-humid | Hot-dry/Mixed-dry/Hot-humid |
| CDD65 | 1131 | 1916 | 1366 |
| Square footage | 71,000 | 130,000 | 1,000,000 |
| Percent cooled | 95 | 90 | 95 |
| Wall construction material | Brick, stone, or stucco | Brick, stone, or stucco | Decorative or construction glass |
| Building shape | "L" shaped | "U" shaped | Other shape |
| Percent exterior glass | 26 to 50% | 26 to 50% | 76 to 100% |
| Number of floors | 2 | 4 | 15 to 25 |
| Number of underground floors | 1 | 0 | 3 |
| Year of construction | 1989 | 1974 | 1983 |

chillers and packaged air conditioning units, respectively. Compared with case buildings 1 and 2, case building 3 has a higher design cooling load, and the Bayesian Network Classifier recommends a central chiller system, which is reasonable based on HVAC selection principles.

4. Discussion

In this study, we propose a methodology in which a Bayesian Network Classifier is applied to recommend primary HVAC systems. The CBECS 2012 database satisfies all the requirements of the proposed approach. However, CBECS 2012 lacks detailed building information, for example, the configurations of HVAC systems. As a result, it cannot be used to power data-driven methods for configuring HVAC systems. The results of the focused survey indicate that, when singling out a cooling system in the early design stage, most HVAC designers do not consider building energy consumption. This may be because designers lack simple but powerful analysis methods and tools. Similarity analysis clusters a group of similar buildings. In similarity analysis, it is difficult to determine the attributes and their weights. An acceptable solution involves incorporating knowledge from domain experts. However, the choice of those attributes and their weights still needs further study. An underlying problem of similarity

Table 4 Observations of attributes for case building 1.

| Attributes | States |
| --- | --- |
| Building America climate region | 0: Others, 1: Very cold/Cold |
| Design cooling load | 7: 0–10^7, 8: 10^7–10^8, 9: 10^8–10^9, 10: 10^9–10^10, 11: 10^10–10^11, 12: 10^11–10^12, 13: 10^12–10^13 |
| Principal building activity | 0: Others, 1: Office |
| Main cooling equipment | 1: Residential-type central air conditioners, 2: Packaged air conditioning, 3: Central chillers inside, 4: Others |
| Cooling degree day | 0: 0–600, 1: 600–900, 2: 900–1200, 3: 1200–1500, 4: 1500–inf |
| Percent cooled | 0: 0–90, 1: 90–100 |
| High-efficient cooling system | 0: Non-high-efficient, 1: High-efficient |
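To make the state coding in Table 4 concrete, the following minimal Python sketch maps raw attribute values to state indices for two of the attributes, using the bin edges listed in the table. The function and variable names are illustrative assumptions, and treating a value that sits exactly on an edge as belonging to the upper bin is also an assumption, since the table does not say which side of a boundary belongs to which state.

```python
from bisect import bisect_right

# Bin edges taken from Table 4; the names are illustrative only.
CDD_EDGES = [600, 900, 1200, 1500]   # states 0..4 for cooling degree days
PERCENT_COOLED_EDGES = [90]          # states 0..1 for percent cooled


def to_state(value, edges):
    """Return the index of the bin that contains value.

    Assumption: a value equal to an edge falls into the upper bin.
    """
    return bisect_right(edges, value)


# Case building 1 (Table 5): CDD65 = 1131, percent cooled = 95.
cdd_state = to_state(1131, CDD_EDGES)
cooled_state = to_state(95, PERCENT_COOLED_EDGES)
print(cdd_state, cooled_state)
```

Under these assumptions, case building 1 falls into the 900–1200 cooling degree day state and the 90–100 percent cooled state.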


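Table 6 Attribute values of case building 1 and its top 5 similar buildings.

| Buildings | PBA | SQFTC | WLCNS | BLDSHP | GLSSPC | YRCONC | NWKERC | PUBCLIM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Case building 1 | 20 | 6 | 1 | 9 | 4 | 6 | 7 | 1 |
| Similar B1 (a) | 13 | 6 | 3 | 2 | 1 | 10 | 3 | 3 |
| Similar B2 | 2 | 6 | 1 | 2 | 4 | 8 | 7 | 1 |
| Similar B3 | 23 | 6 | 3 | 2 | 3 | 6 | 6 | 1 |
| Similar B4 | 16 | 8 | 1 | 10 | 3 | 6 | 7 | 1 |
| Similar B5 | 16 | 7 | 1 | 11 | 3 | 5 | 8 | 1 |

(a) B1 to B5 are abbreviations for buildings 1 to 5.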
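The similarity ranking behind Table 6 relies on Euclidean distance. A minimal sketch of a weighted variant is given below. The weights, the attribute vectors, and the function names are all hypothetical, and the sketch assumes the attributes have already been scaled to comparable ranges, which this section does not specify.

```python
import math


def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def top_k_similar(target, candidates, weights, k):
    """Return the names of the k candidates closest to target, nearest first."""
    ranked = sorted(
        candidates,
        key=lambda name: weighted_euclidean(target, candidates[name], weights),
    )
    return ranked[:k]


# Hypothetical, already-normalized attribute vectors (not CBECS records).
target = [0.2, 0.5, 0.9]
candidates = {
    "A": [0.2, 0.5, 0.8],
    "B": [0.9, 0.1, 0.1],
    "C": [0.3, 0.4, 0.9],
}
weights = [1.0, 1.0, 1.0]  # equal weights are an assumption

print(top_k_similar(target, candidates, weights, k=2))
```

Changing the entries of `weights` is one way domain knowledge could be injected, at the cost of making the ranking depend on choices that still need justification.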

Table 7 Heating and cooling systems in the CBECS 2012 database.

| ID | Main heating equipment | ID | Main cooling equipment |
| --- | --- | --- | --- |
| HS-0 | Not applicable | CS-0 | Not applicable |
| HS-1 | Furnaces that heat air directly without using steam or hot water | CS-1 | Residential-type central air conditioners (other than heat pumps) that cool air directly and circulate it without using chilled water |
| HS-2 | Packaged central unit (roof mounted) | CS-2 | Packaged air conditioning units (other than heat pumps) |
| HS-3 | Boilers inside (or adjacent to) the building that produce steam or hot water | CS-3 | Central chillers inside (or adjacent to) the building that chill water for air conditioning |
| HS-4 | District steam or hot water piped in from outside the building | CS-4 | District chilled water piped in from outside the building |
| HS-5 | Heat pumps (other than components of a packaged unit) | CS-5 | Heat pumps for cooling |
| HS-6 | Individual space heaters (other than heat pumps) | CS-6 | Individual room air conditioners (other than heat pumps) |
| HS-7 | Other heating equipment | CS-7 | “Swamp” coolers or evaporative coolers |
| | | CS-8 | Other cooling equipment |

Fig. 4. Observed frequency of the heating systems in the different groups of buildings.

For case building 1, the number of conditional probabilities of HECS with respect to its parent attributes is 320. The set of similar buildings should contain a comparable amount of data, and 300 similar buildings were selected from the CBECS database. In the aforementioned three cases, we assume that the buildings are highly efficient in terms of cooling. Thus, the prediction result corresponds to the system with the highest posterior probability, i.e., max{P(x | HECS = 1)}, where x denotes a type of cooling system. However, it is difficult to verify the effectiveness of the cooling system recommended by the Bayesian Network Classifier. Typically, building energy simulation methods are applied to compare the energy efficiency of different systems. However, detailed energy simulation exhibits significant uncertainty in the early design stage. The case buildings were designed to test whether the HVAC systems recommended by the Bayesian Network Classifier comply with the HVAC design logic in which central chillers fit buildings with a high cooling load, while packaged air conditioning units fit low-rise buildings with a lower design cooling load. The results indicate that the proposed Bayesian Network Classifier successfully collects inherent

analysis involves extreme cases where few buildings are similar to the design building. In such a scenario, a baseline value is needed to resolve this problem. The frequent items and their association analysis serve as a preprocessing step for the Bayesian Network analysis. The usage frequency of different heating and cooling systems indicates that there is a positive relationship between the occurrence of the efficient cooling system and the usage frequency of heat pump systems. The association rules analysis explains the inner connection between the heating and cooling systems. In this study, the survey discloses the factors considered by designers when they select the primary HVAC system. Therefore, the part of the directed acyclic graph (DAG) topology that determines MAINCL was constructed based on expert knowledge, while the part of the DAG topology that determines HECS was inferred from the CBECS database. By default, each attribute has many observations. For example, PBA has 27 observations. In some cases, observations that account for only a small proportion of the total are combined. Otherwise, the required volume of data grows exponentially.


Fig. 5. Observed frequency of the cooling systems in the different groups of buildings.

Table 8 Observed frequency of different cooling systems in similar buildings.

| System ID | CS-0 | CS-1 | CS-2 | CS-3 | CS-4 | CS-5 | CS-6 | CS-7 | CS-8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Case building 1 | 0.0 | 0.082 | 0.393 | 0.421 | 0.025 | 0.050 | 0.029 | 0.0 | 0.0 |
| Case building 2 | 0.0 | 0.164 | 0.352 | 0.153 | 0.011 | 0.188 | 0.129 | 0.003 | 0.0 |
| Case building 3 | 0.0 | 0.025 | 0.245 | 0.614 | 0.047 | 0.061 | 0.007 | 0.0 | 0.0 |
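The most frequently used system for each case can be read directly from Table 8. The short Python sketch below does this. The data layout and variable names are illustrative, while the frequencies are copied from the table.

```python
SYSTEM_IDS = ["CS-0", "CS-1", "CS-2", "CS-3", "CS-4",
              "CS-5", "CS-6", "CS-7", "CS-8"]

# Observed frequencies from Table 8, ordered CS-0 .. CS-8.
FREQUENCIES = {
    "Case building 1": [0.0, 0.082, 0.393, 0.421, 0.025, 0.050, 0.029, 0.0, 0.0],
    "Case building 2": [0.0, 0.164, 0.352, 0.153, 0.011, 0.188, 0.129, 0.003, 0.0],
    "Case building 3": [0.0, 0.025, 0.245, 0.614, 0.047, 0.061, 0.007, 0.0, 0.0],
}

# Pick the system with the highest observed frequency for each case.
most_frequent = {}
for case, freqs in FREQUENCIES.items():
    most_frequent[case] = SYSTEM_IDS[freqs.index(max(freqs))]

for case, system in most_frequent.items():
    print(case, system)
```

Read this way, CS-3 (central chillers) leads for case buildings 1 and 3 and CS-2 (packaged air conditioning units) leads for case building 2, which agrees with the last column of Table 10.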

a few HVAC designers take building energy efficiency as a criterion when selecting primary HVAC systems. A building similarity analysis model was built using the Euclidean distance. We constructed a Bayesian Network Classifier for primary cooling system selection in which a high-efficiency cooling system is one of the key parameters. The Bayesian Network Classifier was then applied to select cooling systems for three case buildings. The experimental results indicated that the recommended cooling systems coincide with typical HVAC design logic. Thus, it is suggested that the proposed method is able to learn the design strategies from high-efficiency buildings and recommend the optimal solution for selecting the primary HVAC system. Furthermore, it provides architects and HVAC designers with an innovative methodology in addition to the conventional approach that is based purely on experts' knowledge and experience.

Table 9 Confidence between different heating and cooling systems for buildings similar to case building 1.

| Heating or cooling system (antecedent) | Heating or cooling system (consequent) | Confidence |
| --- | --- | --- |
| HS-3 | CS-3 | 0.488 |
| CS-3 | HS-3 | 0.778 |
| HS-2 | CS-2 | 0.905 |
| CS-2 | HS-2 | 0.633 |
| HS-3 | CS-2 | 0.465 |
| CS-2 | HS-3 | 0.333 |
| HS-4 | CS-3 | 0.667 |
| CS-3 | HS-4 | 0.148 |
| HS-5 | CS-5 | 0.750 |
| CS-5 | HS-5 | 1.000 |
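Assuming the standard association-rule definition, confidence(A -> B) is the number of buildings that have both A and B divided by the number that have A, which is why the two directions of a pair differ in Table 9. The sketch below computes it in Python. The building records are hypothetical and the function name is illustrative.

```python
def confidence(records, antecedent, consequent):
    """Return confidence(antecedent -> consequent) over a list of item tuples."""
    with_antecedent = [r for r in records if antecedent in r]
    if not with_antecedent:
        raise ValueError("antecedent never occurs in the records")
    with_both = [r for r in with_antecedent if consequent in r]
    return len(with_both) / len(with_antecedent)


# Hypothetical (heating, cooling) pairs, one per building; not CBECS data.
records = [
    ("HS-3", "CS-3"), ("HS-3", "CS-3"), ("HS-3", "CS-2"),
    ("HS-2", "CS-2"), ("HS-2", "CS-2"), ("HS-5", "CS-5"),
]

print(confidence(records, "HS-3", "CS-3"))  # share of HS-3 buildings using CS-3
print(confidence(records, "CS-3", "HS-3"))  # share of CS-3 buildings using HS-3
```

In this toy data the rule CS-3 -> HS-3 is stronger than HS-3 -> CS-3, the same kind of asymmetry seen in the first two rows of Table 9.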

wisdom from the database.

5. Conclusion

5.2. Challenges and future works

One of the key challenges found in this study is judging whether two buildings are similar to each other. In addition, initial investment is another key factor of concern to building owners, which cannot be addressed at present due to the lack of cost data. Another challenge is designing HVAC components using data-driven methods, since existing building performance databases are short of detailed building data, such as the U-value of exterior walls. Following this study, more analyses of similar buildings need to be carried out to mitigate the current limitations in determining the weight factors and whether two buildings are similar or not. A comparison study of other machine learning algorithms, such as decision tree, in HVAC

5.1. Achievements

In this study, we proposed an innovative methodology wherein a Bayesian Network Classifier is applied to recommend primary HVAC cooling systems while also considering the effects of energy consumption, which are often neglected by designers. One of the benefits of the method is that it recommends HVAC systems based on the solutions of hundreds, even thousands, of high-efficiency buildings that are similar to the design building. Survey results reconfirmed the authors' concern that only

Table 10 Prediction results for the case buildings by the Bayesian Network Classifier.

| Case | Notes | Bayesian Network prediction result | Most frequently used system in similar buildings |
| --- | --- | --- | --- |
| Case 1 | Small-size office building in cold region | Packaged air conditioning units | Central chillers inside |
| Case 2 | Medium-size lodging building in mixed-humid region | Central chillers inside | Packaged air conditioning units |
| Case 3 | Big-size office building in hot-humid region | Central chillers inside | Central chillers inside |
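Table 10 shows that the classifier's recommendation can differ from the most frequently used system. A simplified way to see why, assuming the cooling system is the only attribute linked to HECS and ignoring the rest of the network, is Bayes' rule: P(x | HECS = 1) is proportional to P(HECS = 1 | x) P(x). In the Python sketch below, the priors P(x) are the case building 1 frequencies from Table 8, whereas the likelihoods P(HECS = 1 | x) are hypothetical numbers chosen only for illustration and are not results from this study.

```python
# Prior P(x): observed frequencies for case building 1 (Table 8).
prior = {"CS-1": 0.082, "CS-2": 0.393, "CS-3": 0.421,
         "CS-4": 0.025, "CS-5": 0.050, "CS-6": 0.029}

# HYPOTHETICAL likelihoods P(HECS = 1 | x); not taken from the paper.
likelihood = {"CS-1": 0.30, "CS-2": 0.50, "CS-3": 0.40,
              "CS-4": 0.40, "CS-5": 0.45, "CS-6": 0.20}

# Unnormalized posterior P(HECS = 1 | x) * P(x) for each system x.
unnormalized = {}
for system, p in prior.items():
    unnormalized[system] = p * likelihood[system]

# Normalize so that the posterior values sum to 1.
evidence = sum(unnormalized.values())
posterior = {}
for system, value in unnormalized.items():
    posterior[system] = value / evidence

recommended = max(posterior, key=posterior.get)
most_frequent = max(prior, key=prior.get)
print(recommended, most_frequent)
```

With these made-up likelihoods the posterior favors CS-2 even though CS-3 is more common, which mirrors the Case 1 row of Table 10. The actual classifier conditions on several attributes at once, so this is only a single-attribute illustration.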


selection, can also be done to prove the reliability of the proposed Bayesian Network Classifier.

Acknowledgement

This paper is financially supported by the Ministry of Science and Technology of China (Project number: 2016YFC0700102) and the Scientific Research Foundation of the Graduate School of Southeast University (YBJJ1801).
