Refrigerant charge fault detection method of air source heat pump system using convolutional neural network for energy saving


Energy 187 (2019) 115877


Yong Hwan Eom, Jin Woo Yoo, Sung Bin Hong, Min Soo Kim*
Department of Mechanical and Aerospace Engineering, Seoul National University, Seoul 08826, Republic of Korea

Article history: Received 30 March 2019; Received in revised form 29 June 2019; Accepted 2 August 2019; Available online 21 August 2019

Abstract

In heat pumps, refrigerant leakage is one of the most frequent faults. Since these systems perform best at the optimal charge, it is essential to predict the refrigerant charge amount. Hence, refrigerant charge fault detection (RCFD) methods have been developed by many researchers. Owing to improvements in computing speed and the growth of big data, data-driven techniques such as artificial neural networks (ANNs) have recently drawn attention. However, most existing ANN-based RCFD methods use low-performance shallow neural networks (SNNs) and require features extracted through experts' experience. They also have critical limitations. First, they cannot provide quantitative information on the recharge amount, owing to simple classification into categories such as undercharge or overcharge. Second, many ANN-based RCFD methods can be used in only one operation mode (cooling or heating). To overcome these limitations, a novel RCFD strategy based on convolutional neural networks (CNNs) is suggested. Two prediction models, using classification and regression, can predict the quantitative refrigerant amount in both cooling and heating mode with a single model. The mean accuracy of the CNN-based classification model was 99.9% for the learned cases. Also, the CNN-based regression model showed excellent prediction performance, with a root-mean-square (RMS) error of 3.1%, including untrained refrigerant charge amounts. © 2019 Published by Elsevier Ltd.

Keywords: Heat pump system; Refrigerant charge fault detection; Convolutional neural network; Quantitative prediction; Classification; Regression

1. Introduction

Growing energy demands and limited energy resources make energy efficiency a persistent concern. According to the Energy Information Administration, the commercial and residential buildings sectors used about 39.7% of total U.S. energy consumption in 2015, and space heating and cooling accounted for nearly 40.7% of the buildings sector's energy use [1]. Heat pumps can serve both space heating and cooling with high efficiency, so they are widely used in many countries. Even a slight improvement in heat pump performance can therefore reduce overall energy consumption significantly. Hence, it is imperative, first, to enhance the energy efficiency of heat pumps and, second, to diagnose the faults that may occur during their operation so that the improved performance is maintained, because many studies have shown that faults, including inappropriate refrigerant charge, affect heat pumps negatively, resulting in a significant increase in energy use.

* Corresponding author. E-mail address: [email protected] (M.S. Kim). https://doi.org/10.1016/j.energy.2019.115877 0360-5442/© 2019 Published by Elsevier Ltd.

Jacobs et al. [2] investigated 215 rooftop units of small commercial heating, ventilation, and air conditioning (HVAC) systems installed in 75 buildings and reported that 46% of the units did not hold the optimal refrigerant charge amount. The authors estimated the average energy waste attributable to refrigerant charge at nearly 5% of the annual cooling energy. According to NBI's research, the performance improvement obtainable with the optimal refrigerant charge was estimated at 5 to 11% of the cooling energy [3]. The optimal refrigerant charge is therefore significant for the energy efficiency of vapor compression systems [4-8]. Kim et al. [4] showed that both single and cascade heat pump systems have a maximum COP at the optimal charge amount, and that refrigerant leakage causes performance degradation and reduced thermal comfort. A proper leakage detector makes it easy to find the leakage point, whereas it is challenging to determine the amount of refrigerant leaked. Accordingly, beyond the mere detection of refrigerant leakage, many methods for estimating the leaked amount have been developed. Grace et al. [9] found experimentally that, owing to a strong


correlation between refrigerant leakage and the degree of superheat (DSH) at the evaporator outlet, monitoring the DSH could detect refrigerant charge levels down to 25% leakage. They also noted that the degree of subcooling (DSC) at the condenser outlet was very sensitive to refrigerant overcharge. Li and Braun [10] suggested the virtual refrigerant charge sensor method, which estimates the refrigerant charge level using the DSH and the DSC. Kim and Braun [11] proposed an extended prediction method for the refrigerant charge of heat pumps with variable-speed compressors and fans by including the evaporator inlet quality and the compressor discharge DSH. Yoo et al. [12] investigated a refrigerant leakage detection method for a household air conditioner with an electronic expansion valve (EEV) and a restricted sensor set, focusing on the temperature difference between the inlet air and the mid-point of the evaporator.
Recently, owing to improvements in computing speed and big data, many fault detection and diagnosis (FDD) methods based on machine learning, such as the artificial neural network (ANN), have been suggested. The ANN is a typical machine learning method frequently used in various engineering fields. Tassou and Grace [13] suggested a refrigerant leakage and overcharge detection method utilizing an ANN and real-time monitoring. Kocyigit [14] proposed a method to detect faults, including refrigerant leakage and overcharge, by connecting a fuzzy inference system and an ANN. Shi et al. [15] studied an RCFD method utilizing a Bayesian artificial neural network (BANN) combined with a ReliefF filter: the ReliefF filter extracts the features sensitive to the refrigerant charge amount, and the BANN classifies two refrigerant charge fault classes, undercharge and overcharge. Guo et al. [16] studied an FDD strategy for four faults in variable refrigerant flow (VRF) systems using correlation analysis, an association rules method based on the Apriori algorithm, and a back-propagation neural network (BPNN): correlation analysis and the association rules method elicit the features of the faults, and the BPNN classifies the fault categories. Also, Guo et al. [17] proposed a new FDD approach for VRF systems using a Deep Belief Network.
However, ANN-based FDD methods have some significant limitations. First, they cannot give valuable quantitative information on how much refrigerant to recharge, owing to simple classification into undercharge and overcharge. Second, many existing ANN-based RCFD articles consider only one operating mode (heating or cooling). Last, the studies other than that of Guo et al. [17] used shallow neural networks (SNNs), which have one hidden layer and require a feature extraction process. While feature extraction is a vital process with significant effects on the results, there are many candidate feature extraction methods, including the ReliefF filter, the association rules method based on the Apriori algorithm, correlation analysis, principal component analysis, and wavelet analysis. Moreover, applying an appropriate feature extraction method takes considerable effort because experts' knowledge and experience are necessary.
Therefore, various studies have attempted to resolve the above limitations. Some researchers combined the virtual refrigerant charge sensor technique with machine learning and thus addressed the first limitation [18]. Recently, Li et al. [19] showed that, using a weighted k-Nearest-Neighbor model, a single model could predict refrigerant charge faults in both cooling and heating mode. Also, whereas many conventional methods predicted three levels (undercharge, normal charge, and overcharge), their results presented a six-step prediction. However, the authors used feature engineering to improve the model, and feature engineering requires expert knowledge.
To improve upon the limitations of the existing methods, this paper suggests a novel RCFD strategy for heat pumps using the CNN, which has been highlighted in recent years. This research especially focuses on the

following four points. First, to determine how much refrigerant is lacking or excessive, two detailed quantitative prediction models for the current refrigerant charge level in heat pumps were attempted. Because neural networks can be applied to both classification and regression tasks, this research dealt with both kinds of model, and the results were compared to see which of the two was more suitable for the RCFD method. Second, this study considered both cooling and heating mode: a single model which could predict the refrigerant charge amount in both modes was developed. A single model for both modes is beneficial because the time and cost of training and optimizing models can be reduced (hyperparameter optimization, in particular, is quite time-consuming), and because halving the number of models reduces the cost of managing them. Third, we studied some important hyperparameters, such as weight initialization, activation functions, and architectures, to find appropriate hyperparameters for the CNN-based RCFD method. Last, the CNN extracts features from raw data automatically; this automatic feature extraction makes it easy to apply the CNN-based RCFD method to detect refrigerant charge faults in heat pumps.
Describing the results of the two models briefly, the CNN-based classification model showed an excellent accuracy of 99.9% for the learned refrigerant amounts; its mean accuracy was decidedly superior to those of the SNN-based and FCDNN-based models. The CNN-based regression model, in turn, achieved an RMS prediction error of 3.1%, including refrigerant charge amount cases that were not learned. Both the CNN-based classification model and the regression model were thus useful for estimating the refrigerant charge amount. The purpose of RCFD methods is to predict the refrigerant charge amount of heat pumps when unseen field data are provided, and the refrigerant charge amount is continuous. Therefore, the regression model was more suitable for the RCFD method, as it produces a continuous output and covers unobserved data.

2. Methodology

2.1. Convolutional neural network (CNN)

There are many kinds of deep learning technologies, but the fully-connected deep neural network (FCDNN), the CNN, and the recurrent neural network (RNN) are widely used in various fields. Generally, networks with one hidden layer are called SNNs, and networks with two or more hidden layers are called FCDNNs. A CNN has an additional structure, unlike an FCDNN: it consists of convolutional layers, pooling layers, and a fully-connected layer. The convolutional and pooling layers together play the role of extracting features automatically from raw data [20], and the fully-connected layer calculates the probabilities of the output nodes. The CNN architecture is illustrated briefly in Fig. 1.

2.1.1. Convolutional layer

Convolutional layers have convolutional filters and a nonlinear activation function. The convolutional filters make new feature maps through a convolution operation with the input image. A convolution operation can be depicted as follows.

[Fig. 1. Architecture of refrigerant charge fault detection method using CNN: input image → convolutional layer (convolution operation with filters, ReLU) → feature maps → pooling layer (pooling operation) → feature maps → fully-connected layer.]

u^l_{ij} = Σ_{a=1}^{A} Σ_{b=1}^{B} w^l_{ab} x^{l−1}_{i+a−1, j+b−1} + b^l_{ij}    (1)

where u^l_{ij} is the value of the feature map after the convolution operation and before the activation function, A is the height of a convolutional filter, B is the width of a convolutional filter, w^l_{ab} is the weight parameter of the convolutional filters in the l-th layer, x^{l−1}_{ij} is the value of the input image data in the l-th layer, and b^l_{ij} is a bias in the l-th layer. Fig. 2 shows an example of a convolution operation. The weights of the convolutional filters are determined automatically through a learning process [20].

[Fig. 2. Convolution operation with filters: the filter slides over the input image; e.g. 1·1 + 1·0 + 4·0 + 6·1 = 7.]

After the convolution operation, a nonlinear activation function is applied to all the elements of the feature map arrays. A typical nonlinear activation function is the ReLU function, described as follows.

x = max(u, 0)    (2)

The output of the ReLU function equals its input when the input is positive, so its gradient does not saturate for positive inputs. Hence, the ReLU function has the effect of mitigating the gradient vanishing phenomenon, which is one of the limitations of the sigmoid function in the back-propagation algorithm process [21]. The sigmoid function is shown as follows.

x = 1 / (1 + e^{−u})    (3)

2.1.2. Pooling layer

Pooling layers play a role in reducing the dimension of the feature maps. This pooling operation can reduce the noise and the distortion of the input data [20,22]. A pooling operation is applied separately to the elements of the feature map to merge neighborhood values into one representative value. Typical pooling operations are max-pooling and average-pooling [22]. An average-pooling operation is described as follows.

u_{ij} = (1/A²) Σ_{a=1}^{A} Σ_{b=1}^{A} x_{i+a−1, j+b−1}    (4)

where u_{ij} is the value of the feature map after the pooling operation, A is the height and width of the pooling area, and x_{ij} is the value of a feature map after the activation function. Fig. 3 explains the average-pooling operation used in this study.
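Eqs. (1)-(4) can be sketched in pure Python as follows. The function and variable names are ours, not the paper's; the pixel values around the worked sum of Fig. 2 (1·1 + 1·0 + 4·0 + 6·1 = 7) are made up for illustration.

```python
# Illustrative sketch of Eqs. (1)-(4): valid convolution, ReLU, and
# non-overlapping average pooling on plain Python lists.

def conv2d(image, kernel, bias=0.0):
    """Eq. (1): u_ij = sum_ab w_ab * x_{i+a-1, j+b-1} + b (valid convolution)."""
    H, W = len(image), len(image[0])
    A, B = len(kernel), len(kernel[0])
    out = []
    for i in range(H - A + 1):
        row = []
        for j in range(W - B + 1):
            s = sum(kernel[a][b] * image[i + a][j + b]
                    for a in range(A) for b in range(B))
            row.append(s + bias)
        out.append(row)
    return out

def relu(fmap):
    """Eq. (2): x = max(u, 0), applied element-wise."""
    return [[max(u, 0.0) for u in row] for row in fmap]

def avg_pool(fmap, size=2):
    """Eq. (4): average pooling over non-overlapping size x size windows."""
    return [[sum(fmap[i + a][j + b] for a in range(size) for b in range(size))
             / (size * size)
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# Top-left window reproduces the worked sum of Fig. 2: 1*1 + 1*0 + 4*0 + 6*1 = 7
image = [[1, 1, 0],
         [4, 6, 2],
         [3, 5, 8]]
kernel = [[1, 0],
          [0, 1]]
fmap = relu(conv2d(image, kernel))
print(fmap)            # [[7.0, 3.0], [9.0, 14.0]]
print(avg_pool(fmap))  # [[8.25]]
```

In a real CNN the kernel weights are learned, not fixed; here they are constants so the arithmetic of Eqs. (1)-(4) is easy to follow.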

[Fig. 3. Average pooling operation.]

2.1.3. Fully-connected layer

The fully-connected layer has a similar structure to an FCDNN. All the values of the feature maps after the pooling operation are connected to the input nodes of the fully-connected layer, as shown in Fig. 1. Hence, the number of input nodes is the same as the total number of feature-map pixels. The number of output nodes must equal the number of target classes. The feedforward propagation process is shown as follows.

x^l_j = f( Σ_{i=1}^{I} w^l_{ji} x^{l−1}_i + b^l_j )    (5)

where x^l_j is the output value of the j-th node in the l-th layer after the activation function, f(·) is a nonlinear activation function, w^l_{ji} is the weight parameter connecting node j of the l-th layer to node i of the (l−1)-th layer, x^{l−1}_i is the input value of the i-th node in the l-th layer, and b^l_j is a bias in the l-th layer.
In a fully-connected layer, a widely used nonlinear activation is also the ReLU function, but a softmax function is utilized as the activation function of the output layer. It is usually used to compute the probabilities of the output nodes and is depicted as follows.

y_k = e^{x_k} / Σ_{k=1}^{K} e^{x_k}    (6)

where K is the total number of output nodes, x_k is the input value of the k-th output node, and y_k is the output value of the k-th output node. The softmax function allows us to calculate the probabilities for all output nodes. If the number of nodes in the output layer is three, and the output values of the softmax function are 0.1, 0.2, and 0.7, it can be interpreted that the correct answer is class 3, since its probability is the highest at 70%.
A cost function is needed to train a CNN model. A cost function is computed from the error between the predicted outputs and the correct answers at the output nodes. A cost function using a cross-entropy error is depicted as follows.

E(w) = −(1/N) Σ_{n=1}^{N} Σ_{k=1}^{K} d_{nk} log_e y_{nk}    (7)

where w is the weight of a neural network, N is the total number of samples, K is the total number of output nodes (target classes), d_{nk} is the correct answer of the k-th output node in the n-th sample, and y_{nk} is the output value of the k-th output node in the n-th sample. This cost function with cross-entropy error is usually utilized for a classification model with a softmax function [23,24]. Also, a cost function using a mean squared error is described as follows.

E(w) = (1/2N) Σ_{n=1}^{N} Σ_{k=1}^{K} (y_{nk} − d_{nk})²    (8)

where the symbols have the same meanings as in Eq. (7). This cost function with mean squared error is used for a regression model with an identity output function [25].
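Eqs. (5)-(8) can be sketched for a single sample as follows. The layer weights and example inputs are illustrative, not the paper's trained parameters.

```python
# Sketch of Eqs. (5)-(8): a fully-connected layer, the softmax output,
# and the two per-sample cost functions.
import math

def dense(x, W, b, f):
    """Eq. (5): x_j = f(sum_i w_ji * x_i + b_j)."""
    return [f(sum(wji * xi for wji, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]

def softmax(x):
    """Eq. (6): y_k = exp(x_k) / sum_k exp(x_k)."""
    m = max(x)                       # subtract the max for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(y, d):
    """Eq. (7) for one sample: E = -sum_k d_k * ln(y_k)."""
    return -sum(dk * math.log(yk) for yk, dk in zip(y, d))

def mse(y, d):
    """Eq. (8) for one sample: E = (1/2) * sum_k (y_k - d_k)^2."""
    return 0.5 * sum((yk - dk) ** 2 for yk, dk in zip(y, d))

y = softmax([1.0, 2.0, 3.0])
print([round(v, 3) for v in y])              # [0.09, 0.245, 0.665]
print(round(cross_entropy(y, [0, 0, 1]), 3)) # -ln(0.665...) = 0.408
```

With the one-hot target [0, 0, 1], the cross-entropy reduces to -ln of the probability assigned to the correct class, which is why a confident correct prediction yields a small cost.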

2.1.4. Weight updating method

The learning algorithm is the procedure which updates the weights of a neural network to minimize the cost function value. The new weights are obtained by subtracting the gradient of the cost function, multiplied by a learning rate, from the previous weights. The equation for updating the weights is shown as follows [26].

w^{(t+1)} = w^{(t)} − ε∇E = w^{(t)} − ε ∂E/∂w    (9)

where w is the weight of a neural network, (t) is the current step, (t+1) is the next step, ε is the learning rate, and E is the cost function. This simplest version of the procedure is called the gradient descent method. Eq. (9) is a general representation of how to update the weights of an FCDNN; a CNN requires some modifications in the convolutional and pooling layers. More details of the weight update method for CNNs are described in Bouvrie's study [27]. Usually, a mini-batch stochastic gradient descent method is used for updating the weights [20,26]: the training data are shuffled randomly and divided into mini-batches of a given size.
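Eq. (9), extended with the momentum method that Section 2.4 later states was used (velocity v accumulates past gradients), can be sketched as follows. The momentum coefficient 0.9 and the toy quadratic cost are our illustrative choices, not values from the paper.

```python
# Sketch of Eq. (9) with momentum: v <- mu*v - eps*dE/dw, w <- w + v.
# Plain gradient descent is recovered with mu = 0.

def momentum_step(w, v, grad, eps=0.01, mu=0.9):
    """One update of all weights given the gradient dE/dw."""
    v_new = [mu * vi - eps * gi for vi, gi in zip(v, grad)]
    w_new = [wi + vi for wi, vi in zip(w, v_new)]
    return w_new, v_new

# Toy cost E(w) = w0^2 + w1^2, so dE/dw = [2*w0, 2*w1]
w, v = [1.0, -2.0], [0.0, 0.0]
for _ in range(200):
    w, v = momentum_step(w, v, [2 * w[0], 2 * w[1]])
print([round(wi, 4) for wi in w])   # close to [0.0, 0.0]
```

In mini-batch stochastic gradient descent, the gradient passed in would be computed on one shuffled mini-batch per step rather than the full dataset.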


2.2. Time-domain data to 2-D image data conversion method

Many conventional SNN-based RCFD methods have employed a feature extraction process to reduce meaningless attributes and to create new features which explain the data well [15,16]. However, the feature extraction process is not easy, owing to the many diverse methods available and the need for experts' knowledge. Since CNN models have the practical ability to extract features automatically, a CNN was employed to develop the RCFD method in this research. For the CNN, it is necessary to transform the time-domain signals of multiple sensors in heat pumps into a 2-D image.
In the FDD field, 1-D time-domain data to 2-D image conversion methods have been developed in various ways. Recently, CNN-based FDD models have been introduced to diagnose the faults of bearings, with various 2-D conversion methods [28-31]. Wen et al. [28] proposed a conversion method that makes a 2-D image by stacking one sensor's time-domain signals horizontally. Hoang and Kang [29] suggested a method that makes a 2-D image by stacking time-domain data vertically. He and He [30] used time-frequency-based 2-D images after an FFT analysis of time-domain data. Xia et al. [31] suggested a conversion method that collocates the time-domain data of multiple sensors installed on rotating machinery. This research employed their conversion method because heat pumps have various sensors. This conversion method is illustrated in Fig. 4, and Fig. 5 shows the flowchart of the CNN-based RCFD method.

2.3. Collecting, preprocessing and splitting of data

In neural network methods, data normalization is essential for performance. In this investigation, min-max scaling, one of the common normalization methods, was used. The training, validation, and test data were normalized by Eq. (10). After min-max scaling, all the data ranged from 0 to 1.
If the elements are multiplied by 255, the dataset can be displayed as a gray image.
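The conversion of Section 2.2 can be sketched as follows: each feature's time series becomes one row of the image. The feature names and readings are hypothetical; the paper's actual matrix is 28 features × 28 sampling instants.

```python
# Sketch of the Xia-et-al-style 2-D conversion: time series of m
# sensors/features over n sampling instants stacked row-wise into an
# m x n single-channel image (here 4 x 4 to keep the example small).

def to_image(signals):
    """signals: {feature_name: [v_t1, v_t2, ...]} -> list of image rows."""
    return [list(series) for series in signals.values()]

signals = {                     # hypothetical readings, sampled 2 s apart
    "cond_pressure": [17.1, 17.2, 17.2, 17.3],
    "evap_pressure": [ 9.8,  9.8,  9.7,  9.7],
    "dsh":           [ 5.1,  5.0,  5.2,  5.1],
    "comp_speed":    [60.0, 60.0, 60.0, 60.0],
}
img = to_image(signals)
print(len(img), len(img[0]))    # 4 4  (one row per feature)
```

After min-max scaling to [0, 1], multiplying each element by 255 would render this matrix as the gray image mentioned above.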

z′_{ij} = (z_{ij} − Min_i) / (Max_i − Min_i)    (10)

where z_{ij} is an element of the originally acquired matrix Z, Max_i is the maximum value of the i-th feature in the training data, Min_i is the minimum value of the i-th feature in the training data, and z′_{ij} is the corresponding element of the normalized matrix Z′.
The measured data were sampled every 2 s and acquired after the heat pump system reached a steady state. The steady-state detector [32] used in this study indicates that the heat pump has reached a steady state when the moving-window standard deviations of the features fall below predefined thresholds. The details are explained in Ref. [32].
The goal in machine learning is for the model to predict well on previously unobserved data; the ability to work well on unseen cases is called generalization [33,34]. Generally, three data groups are needed to improve the generalization performance of machine learning models; the reasoning is detailed in section 5.3 of reference [33]. First, the total data should be divided into an entire training set and a test set. Then, the whole training set should be split into training data, used to learn the parameters, and validation data, used to optimize the hyperparameters. The number of complete data samples obtained in this study is 34,608; split as described above, there are 25,200 samples for training, 5,040 for validation, and 4,368 for testing. As mentioned earlier, one input dataset consists of the time-domain data of the 28 features, and the matrix size of the converted 2-D dataset is 28 × 28; that is, the time-domain data of the 28 features were acquired for 56 s. Thus, after conversion to 2-D input datasets, there were 900 training datasets, 180 validation datasets, and 156 test datasets.
In general, data might be split 50% for training, 25% for validation, and 25% for testing [35]; there is also a 60%/20%/20% rule. However, it is known to be difficult to state a general rule for splitting data into three parts [35]. In this research, the test datasets were set to 156 to estimate the generalization error under all experiment cases; two random datasets were selected for each experiment case. Training and validation datasets are commonly divided in an 8:2 ratio [33,34]. However, because increasing the amount of training data helps to improve the generalization of the model, we increased the training datasets from 864 (80%) to 900 (83.3%).
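Eq. (10), with Min_i and Max_i taken feature-wise from the training data only and then reused for validation and test data as described above, can be sketched as follows (the numbers are illustrative).

```python
# Sketch of Eq. (10): min-max scaling fitted on training data and
# applied to any split (rows = samples, columns = features).

def fit_min_max(train_rows):
    """Per-feature Min_i and Max_i from the training data."""
    cols = list(zip(*train_rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def scale(rows, mins, maxs):
    """Eq. (10): z'_ij = (z_ij - Min_i) / (Max_i - Min_i)."""
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(r, mins, maxs)]
            for r in rows]

train = [[10.0, 100.0], [20.0, 300.0], [15.0, 200.0]]
mins, maxs = fit_min_max(train)
print(scale(train, mins, maxs))            # [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
print(scale([[12.5, 150.0]], mins, maxs))  # [[0.25, 0.25]]  (unseen sample)
```

Fitting the scaler on the training data alone prevents information from the validation and test sets leaking into training, which would bias the generalization estimate.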
Before training, the order of the training data should be randomly shuffled to improve performance [36].

2.4. Architectures and hyperparameters of various neural networks

Three architectures were selected to compare the performance of the most common existing SNN-based RCFD methods with the newly suggested RCFD method. Since most conventional SNN-based RCFD methods dealt with a classification model, this comparison was carried out with classification models. The first architecture was the SNN used in many traditional RCFD methods. The second was a basic FCDNN with two hidden layers. The last was the basic CNN utilized for the new RCFD method; this basic CNN model had one convolutional layer, one pooling layer, and one fully-connected layer. Since the SNN has a simple architecture, basic architectures which are as simple as

[Fig. 4. Time-domain data to 2-D image conversion method: the time-domain data (sampled every 2 s for 56 s) of features #1-#28 are stacked row-wise into a 28 × 28 input image.]


[Fig. 5. Flowchart of the CNN-based RCFD method: heat pump system sensors 1…m → time-domain data acquisition → calculation of DSH and DSC (28 monitored features) → conversion of time-domain signals into 2-D images → CNN initialization, training, and hyperparameter selection with the training and validation datasets → trained CNN model evaluated on the test dataset → refrigerant charge amount.]

possible were also selected for the FCDNN and the CNN. The architectures of the SNN-based and FCDNN-based RCFD models are described in Table 1, and the architecture of the CNN-based model is described in Table 2.
Also, to investigate the effect of the activation function, this research compared the results of SNN classification models using the sigmoid and the ReLU function. Additionally, this study investigated the effect of four different weight initialization methods on all the structures: Xavier normal distribution initialization, Xavier uniform distribution initialization [37], He normal distribution initialization [38], and the uniform distribution initialization suggested in LeCun et al.'s study [39] were chosen to examine which initialization method was more suitable for an RCFD method. Each equation is described in detail in these papers [37-39]. Therefore, the total number of simulations for the performance comparison of the various neural networks mentioned above is 16.
In all simulations, an early stopping algorithm was applied. Generally, when training neural network models, a validation loss is calculated on the validation data after each training pass. As the learning procedure repeats, the model learns, the validation loss decreases, and the validation accuracy increases. However, when overfitting occurs, the validation loss increases again. Hence, early stopping is widely used to avoid the overfitting that degrades the performance of a model [40]. The early stopping algorithm utilized in this paper is briefly described as follows. If a newly calculated validation loss is less than the minimum validation loss so far, the learning results are saved, and the new validation loss replaces the minimum value.
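The bookkeeping of this early-stopping schedule (save on improvement, count patience otherwise, cut the learning rate after 5 misses, stop after 10 learning-rate changes) can be sketched as follows. The `validate` stub and the toy loss curve stand in for an actual train-then-validate epoch; only the control flow mirrors the paper's description.

```python
# Sketch of the early-stopping schedule of Section 2.4.

def early_stopping_loop(validate, lr=0.01, max_epochs=1000,
                        patience_limit=5, max_lr_changes=10):
    best_loss, best_epoch = float("inf"), None
    patience, lr_changes = 0, 0
    for epoch in range(max_epochs):
        val_loss = validate(epoch, lr)       # stub: train one epoch, then validate
        if val_loss < best_loss:             # improvement: save and reset patience
            best_loss, best_epoch, patience = val_loss, epoch, 0
        else:
            patience += 1
            if patience == patience_limit:   # 5 misses in a row
                lr *= 0.9                    # learning rate "reduced by 10%"
                lr_changes += 1
                patience = 0
                if lr_changes == max_lr_changes:
                    break                    # stop after ten LR changes
    return best_epoch, best_loss

# Toy validation curve that improves, then overfits
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.62, 0.63, 0.64, 0.65] + [0.7] * 990
print(early_stopping_loop(lambda e, lr: losses[e]))   # (3, 0.5)
```

The returned state is the checkpoint with the lowest validation loss, i.e. the "best model" the paper selects; the epoch cap (`max_epochs`) and the predetermined-loss stop described in the text could be added as further exit conditions.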

Table 1. Architectures of the SNN- and FCDNN-based RCFD classification models.

Hyperparameters              SNN (Sigmoid)   SNN (ReLU)   FCDNN
Input size                   28 × 1          28 × 1       28 × 1
Nodes in input layer         28              28           28
Nodes in hidden layer 1      57              57           57
Nodes in hidden layer 2      -               -            57
Hidden activation function   Sigmoid         ReLU         ReLU
Nodes in output layer        12              12           12
Output activation function   Softmax         Softmax      Softmax

However, if the new validation loss is higher than the minimum loss value, a patience count increases. When the patience count reaches 5, the learning rate is reduced by 10%. After repeating these procedures, if the learning rate has been changed ten times, or the lowest validation loss falls below a predetermined value, the learning process is stopped early. Also, the maximum number of learning epochs is limited to a preset number, and the learning process ends when this limit is reached. The learning results with the lowest validation loss are then chosen as the best model.
For all 16 simulation cases, the initial learning rate started at 0.01, and the momentum method was used as the gradient descent optimization algorithm. In the models other than the CNN, the number of nodes in the hidden layers was derived from Eq. (11) in Hecht-Nielsen's paper [41], as cited in Guo et al.'s research [17].

N_hidden = 2 × N_input + 1    (11)

where N_hidden is the number of nodes in the hidden layer, and N_input is the number of nodes in the input layer.

3. Experimental setup

3.1. Experimental setup and conditions

A commercial heat pump system with a nominal capacity of 30 kW was employed for this experiment. It consists of one outdoor unit and two indoor units. The system has a variable-speed scroll compressor, inverter-driven fans, and multiple EEVs. The nominal refrigerant charge amount in this system is 9 kg in cooling mode and 6 kg in heating mode. Fig. 6 shows a schematic of the experimental setup [42]. The outdoor unit has an accumulator, a variable-speed scroll compressor, a variable-speed fan, an oil separator, a four-way valve, and a condenser. Each of the two indoor units contains an evaporator equipped with an EEV and a fan. The heat pump system has many sensors to monitor and control the system. The temperature sensors are located at the compressor outlet, the condenser inlet and outlet, the liquid line, the inlets and outlets of the two evaporators, the accumulator inlet, and the indoor and outdoor air side. Besides, there are two pressure


Table 2. Architecture of the CNN-based RCFD classification model.

Layer name        Input size    Filter size   Filter number   Stride   Output size   Activation function
Input             28 × 28 × 1   -             -               -        -             -
Convolutional     28 × 28 × 1   9 × 9         8               1        20 × 20 × 8   ReLU
Average-pooling   20 × 20 × 8   2 × 2         -               2        10 × 10 × 8   -
Flatten           10 × 10 × 8   -             -               -        800 × 1       -
Fully-connected   800 × 1       -             -               -        100 × 1       ReLU
Output            100 × 1       -             -               -        12 × 1        Softmax
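The output sizes listed in Table 2 follow from the standard valid-convolution formula; a quick consistency check can be sketched as follows.

```python
# Shape check for the CNN of Table 2: 28x28 input, 9x9 convolution
# (8 filters, stride 1), 2x2 average pooling (stride 2), then flatten.

def conv_out(n, filt, stride):
    """Output width of a valid convolution/pooling: (n - filt) // stride + 1."""
    return (n - filt) // stride + 1

h = conv_out(28, 9, 1)   # after the 9x9 convolution
print(h, h, 8)           # 20 20 8
p = conv_out(h, 2, 2)    # after 2x2 average pooling with stride 2
print(p, p, 8)           # 10 10 8
print(p * p * 8)         # 800 flattened inputs to the fully-connected layer
```

The flattened 800 values feed the 100-node fully-connected layer and the 12-node softmax output, matching the last two rows of Table 2.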

Fig. 6. Schematic of experimental setup [42].

sensors to measure the pressure at the accumulator inlet and at the compressor outlet. While the sensors displayed as black circles are originally installed sensors, a mass flow sensor, RTD sensors, and a pressure difference transducer were installed additionally for this experiment, together with a wind tunnel and nozzles, to measure the state of the refrigerant and the air and the cooling or heating capacity of the heat pump system. However, the information collected from the additionally installed sensors was not used for training the CNN-based models. The air volume flow rate of the indoor unit was calculated based on ANSI/AMCA 210 [43] by measuring the pressure difference between the nozzle inlet and outlet with the pressure difference transducer. The system was installed in standard psychrometric chambers: the outdoor unit was located in one chamber, and the indoor units were installed in the other.
Table 3 presents in detail the 28 features used for the RCFD method in this heat pump system. The 28 features include the compressor speed, the voltage applied to the compressor, and the current flowing in the compressor motor, as well as the values of various sensors. The 28 features were selected based on previous research [9-11] and on the variables needed to analyze the thermodynamic properties of the system. Some indirect features, such as feature 3 (DSH), feature 4 (degree of superheat at compressor discharge), and feature 6 (DSC), were calculated by Eqs. (12)-(14).

Feature 3 = Feature 9 − Feature 11    (12)

Feature 4 = Feature 12 − Feature 10    (13)

Feature 6 = Feature 10 − Feature 18    (14)
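Eqs. (12)-(14) amount to simple temperature differences over the measured features; a sketch with hypothetical temperature values follows (feature numbers per Table 3).

```python
# Eqs. (12)-(14) as code: the indirect features computed from directly
# measured temperatures. The example values are made up, not measurements.

def derived_features(f):
    return {
        3: f[9] - f[11],   # DSH = compressor suction temp - evaporating temp
        4: f[12] - f[10],  # discharge DSH = discharge temp - condensing temp
        6: f[10] - f[18],  # DSC = condensing temp - liquid pipe temp
    }

f = {9: 15.0, 10: 45.0, 11: 10.0, 12: 75.0, 18: 40.0}  # degC, illustrative
print(derived_features(f))   # {3: 5.0, 4: 30.0, 6: 5.0}
```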

All the features used for developing the models in this study could be obtained from the monitored variables provided by the manufacturer.
Various experiments were conducted to develop and validate the new fault detection models for the refrigerant charge amount, covering both heating and cooling mode. In all experimental conditions in cooling mode, the experiments started with a refrigerant charge of 6 kg. After the experimental data were obtained while varying the target DSH for at least 2 h at each refrigerant charge amount, the refrigerant charge was increased by 1 kg. This procedure was repeated until the final amount of refrigerant reached 11 kg. In cooling mode, the control target DSH level was tested in three steps: 5 K, 10 K, and 15 K. In heating mode, the control target DSH level was tested in two steps: 5 K and 10 K. In the heating mode experiments, the refrigerant charge amount was increased stepwise from 3 to 8 kg as described above. Also, the reason that the refrigerant charge amount is 6-11 kg in cooling mode and 3-8 kg in heating mode is


Table 3. Monitored features from various sensors.

 1. Condensing pressure
 2. Evaporating pressure
 3. Degree of superheat
 4. Degree of superheat at compressor discharge
 5. Compression ratio
 6. Degree of sub-cooling
 7. Compressor speed
 8. Outdoor air temperature
 9. Compressor suction temperature
10. Condensing temperature
11. Evaporating temperature
12. Compressor discharge temperature
13. Outdoor heat exchanger temperature 1
14. Outdoor heat exchanger temperature 2
15. Outdoor heat exchanger temperature 3
16. Subcooler temperature 1
17. Subcooler temperature 2
18. Liquid pipe temperature
19. Compressor electric current
20. Compressor electric voltage
21. Indoor unit 1 suction air temperature
22. Indoor unit 1 heat exchanger temperature 1
23. Indoor unit 1 heat exchanger temperature 2
24. Indoor unit 1 degree of superheat/sub-cooling
25. Indoor unit 2 suction air temperature
26. Indoor unit 2 heat exchanger temperature 1
27. Indoor unit 2 heat exchanger temperature 2
28. Indoor unit 2 degree of superheat/sub-cooling

that the nominal refrigerant charge amount is 9 kg in cooling mode and 6 kg in heating mode. Generally, the volume of the condenser significantly affects the appropriate refrigerant charge amount. Because the tested heat pump has only two indoor units, the volume of the outdoor-unit heat exchanger is much larger than that of the indoor-unit heat exchangers; therefore, the nominal refrigerant charge amount in cooling mode is greater than that in heating mode. In these conditions, the compressor speed was set to either 60 or 100 Hz. Operation mode, DSH, and compressor speed were the variables used to validate the proposed refrigerant charge prediction method under various conditions. The fan speed was held constant, and two indoor units were operated in all the experiments. In heating mode, the EEVs for the two indoor units are fully open and the EEV for the outdoor unit modulates; conversely, in cooling mode, the EEVs for the two indoor units modulate and the EEV for the outdoor unit is fully open. In cooling mode, the openings of the two indoor-unit EEVs were controlled to the same degree at all test conditions, and the two indoor-unit fans operated at the same speeds under all test conditions. According to ANSI/AHRI Standard 1230 [44], two

air temperature conditions for cooling mode and two air temperature conditions for heating mode were selected. The detailed experimental conditions are shown in Table 4. More details of the various experiments are given in Yoo's study [42].

3.2. Uncertainty analysis

Table 5 shows the uncertainty of the measured data and the derived data of the experiment; more details for the commercial heat pump system are described in section 4.2.2 of the reference [42]. The two pressure transducers and the thermistors are originally installed sensors, while the RTD sensors and thermocouples were installed additionally for this experiment.

4. Results and discussion

4.1. Results of CNN-based classification model

Many conventional data-driven RCFD models using SNNs and deep learning indicated only whether the current state was a fault state or a normal state, such as undercharge, normal, or overcharge. Therefore, these existing models could not give a service engineer detailed information about the deficient or excessive refrigerant amount. In this research, by contrast, two quantitative RCFD methods were attempted.

4.1.1. Detailed results of SNN-based, FCDNN-based, and CNN-based models

The classification results show the quantitative classification performance of the proposed CNN-based model in both cooling and heating mode. Additionally, the performance of the CNN-based model was compared with those of a conventional SNN-based model and an FCDNN-based model. The classification results and the comparison of mean accuracies are

Table 5. Uncertainty analysis at rating condition [42].

Measurement                              | Total error
Pressure transducer (condenser side)     | ±0.8%
Pressure transducer (evaporator side)    | ±1.7%
Dry bulb air inlet temperature (RTD)     | ±0.2 K
Wet bulb air inlet temperature (RTD)     | ±0.2 K
Dry bulb air outlet temperature (RTD)    | ±0.2 K
Wet bulb air outlet temperature (RTD)    | ±0.2 K
Thermistor                               | ±0.8 K
Mass flow rate (refrigerant)             | ±0.9%
Mass flow rate (air)                     | ±3.1%
Power consumption                        | ±0.5%
Cooling capacity                         | ±5.5%
COP                                      | ±5.5%

Table 4. Experimental conditions for cooling and heating mode.

Variables                        | Cooling mode                        | Heating mode
Refrigerant                      | R410A                               | R410A
Lubricant                        | Polyvinyl ether (PVE) type          | Polyvinyl ether (PVE) type
Lubricant charge amount (kg)     | 3                                   | 3
Refrigerant charge amount (kg)   | 6–11 (Δ = 1)                        | 3–8 (Δ = 1)
DSH (K)                          | 5, 10, 15 (or maximum EEV opening)  | 5, 10 (or maximum EEV opening)

Air conditions (ANSI/AHRI 1230):
Cooling, Rating:              ID inlet 26.7 °C DB / 19.4 °C WB; OD inlet 35.0 °C DB / 23.9 °C WB; compressor 60 and 100 Hz
Cooling, Condensate:          ID inlet 26.7 °C DB / 23.9 °C WB; OD inlet 26.7 °C DB / 23.9 °C WB; compressor 60 Hz
Heating, Rating (low temp.):  ID inlet 21.1 °C DB / 15.6 °C WB (max); OD inlet −8.3 °C DB / −9.4 °C WB; compressor 60 and 100 Hz
Heating, Rating (high temp.): ID inlet 21.1 °C DB / 15.6 °C WB (max); OD inlet 8.3 °C DB / 6.1 °C WB; compressor 60 Hz
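The stepped charge/DSH test matrix described above can be enumerated programmatically. The sketch below is illustrative only (the paper publishes no such script); it covers only the charge and DSH steps per operation mode from Table 4, leaving out the air-condition and compressor-speed combinations, since not every combination of those was run.

```python
# Illustrative enumeration of the charge/DSH test steps implied by Table 4
# (not the authors' code; air conditions and compressor speeds are omitted
# because they were not combined exhaustively with every step).
import itertools

cooling_steps = list(itertools.product(range(6, 12), (5, 10, 15)))  # (charge kg, target DSH K)
heating_steps = list(itertools.product(range(3, 9), (5, 10)))

print(len(cooling_steps), len(heating_steps))  # 18 12
```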


presented in Table 6. Because of random processes such as the randomized initial weights and the shuffling of the training data, the results were summarized over ten trials. Table 7 gives the detailed ten-trial results for cases 5, 9, and 13. Table 6 shows that the CNN-based RCFD model reaches a mean accuracy of 99.9% (case 13) and far better classification performance than the other models when the ReLU function and the He normal initialization scheme are applied. Under the same conditions, the mean accuracy of the SNN-based RCFD model is 88.4% (case 5), and that of the FCDNN-based model is 98.0% (case 9). The CNN-based model was so much better than the SNN-based model that a detailed comparison of those two was considered unnecessary. The accuracy of the FCDNN-based model, however, was 98.0% (case 9), only 1.9% lower than that of the CNN-based model. Although a difference of 1.9% in accuracy may not seem large, the detailed results of the two models differ significantly. Above all, the error of the CNN-based model was much lower than that of the FCDNN-based model: Table 6 indicates an error of 0.1% for the CNN-based model (case 13) against 2.0% for the FCDNN-based model (case 9). Moreover, the validation loss of the CNN is 0.011 (case 13), while that of the FCDNN is 3.6 (case 9); the lower the validation loss, the better the model has learned. Likewise, the test loss of the CNN-based model is 0.013 (case 13), while that of the FCDNN-based model is 0.1 (case 9); the lower the test loss, the smaller the error between the correct answer and the predicted value.
From these results, it is clear that the CNN-based RCFD model is superior to both the SNN-based and the FCDNN-based models.

4.1.2. Comparison results of weight initializers, activation functions, and architectures of neural networks

The classification results of the 16 neural networks reveal three significant findings. First, for weight initialization, He normal initialization outperformed the other three initialization methods. In previous studies, Glorot and Bengio [37] showed that Xavier normal initialization is better than the heuristic uniform-distribution weight initialization; they proposed both Xavier normal and Xavier uniform initialization considering the sigmoid function and the hyperbolic


tangent function [37]. He et al. [38] reported that He normal initialization is better than the two Xavier methods for deeper CNNs using ReLU. In this study on the RCFD method for the heat pump system, however, whether the sigmoid or the ReLU function was used, models with He initialization always outperformed those with the other initialization methods; Table 6 illustrates this well (cases 1–8). Second, models using ReLU performed much better when the effects of the sigmoid and ReLU functions are compared. Krizhevsky et al. [45] showed that a CNN using the ReLU function can learn six times faster than a comparable CNN using the hyperbolic tangent function, and a similar trend was observed in this research. Table 6 shows that the mean accuracies of the SNNs using ReLU (cases 5–8) are more than twice those of the SNNs using the sigmoid function (cases 1–4) over the same epochs, and that their validation losses (cases 5–8) are roughly half those of the sigmoid SNNs (cases 1–4). This means the ReLU function learned faster than the sigmoid function. Finally, comparing the results of the SNN, FCDNN, and CNN, the CNN shows the best performance of the three. Above all, Table 6 shows that the CNN-based model has the best mean accuracy among the three models, and the three architectures differ greatly in validation loss and test loss when both the ReLU function and the He initialization method are applied (cases 5, 9, 13). Although the learning rate started at 0.01 and the same early-stopping algorithm was applied, the SNN and FCDNN could not reach the target validation error even after the maximum of 300 epochs.
The CNN, however, terminated the learning procedure early because it achieved a sufficiently low target validation loss after approximately 150 epochs. As a result, the CNN-based RCFD model showed excellent performance in classification accuracy, validation loss, and test loss, with no significant increase in computation time.

4.1.3. Confusion matrix of CNN-based classification model

Fig. 7 presents the confusion matrix of the CNN-based RCFD model. The rows of the confusion matrix indicate the predicted label, and the columns represent the actual label. Classes 1 to 6 correspond to refrigerant amounts of 6 kg to 11 kg in cooling mode, and classes 7 to 12 correspond to refrigerant amounts of 3 kg to 8 kg in heating mode. For example,

Table 6. Results of various neural networks after ten trials.

Case | Type       | Activation function | Weight initialization (distribution) | Mean accuracy (%) | Validation loss | Test loss | Mean run time (s)
  1  | Shallow NN | Sigmoid             | He normal                            | 38.8              | 40.9            | 1.5       |  81.7
  2  | Shallow NN | Sigmoid             | Xavier normal                        | 35.8              | 42.4            | 1.6       |  80.9
  3  | Shallow NN | Sigmoid             | Xavier uniform                       | 37.3              | 42.2            | 1.6       |  81.3
  4  | Shallow NN | Sigmoid             | LeCun uniform                        | 36.3              | 42.3            | 1.6       |  81.7
  5  | Shallow NN | ReLU                | He normal                            | 88.4              | 18.1            | 0.7       |  84.1
  6  | Shallow NN | ReLU                | Xavier normal                        | 85.7              | 19.9            | 0.8       |  75.2
  7  | Shallow NN | ReLU                | Xavier uniform                       | 86.0              | 19.4            | 0.7       |  74.5
  8  | Shallow NN | ReLU                | LeCun uniform                        | 86.3              | 19.4            | 0.7       |  75.3
  9  | FCDNN      | ReLU                | He normal                            | 98.0              | 3.6             | 0.1       | 139.5
 10  | FCDNN      | ReLU                | Xavier normal                        | 96.5              | 4.7             | 0.2       | 145.5
 11  | FCDNN      | ReLU                | Xavier uniform                       | 96.9              | 5.0             | 0.2       | 142.3
 12  | FCDNN      | ReLU                | LeCun uniform                        | 96.7              | 5.2             | 0.2       | 140.5
 13  | CNN        | ReLU                | He normal                            | 99.9              | 0.011           | 0.013     | 143.2
 14  | CNN        | ReLU                | Xavier normal                        | 99.6              | 0.012           | 0.019     | 149.2
 15  | CNN        | ReLU                | Xavier uniform                       | 99.4              | 0.011           | 0.018     | 146.4
 16  | CNN        | ReLU                | LeCun uniform                        | 99.2              | 0.009           | 0.019     | 142.3
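The four initializers compared in Table 6 differ only in the scale they assign to the initial weights. The sketch below uses the standard textbook scale formulas (not the authors' code); the fan_in/fan_out values are illustrative, with fan_in = 28 chosen to match the feature count of Table 3.

```python
# Sketch of the weight scales behind the four initializers in Table 6,
# using the standard definitions from Glorot & Bengio [37] and He et al. [38].
import math

def init_scale(scheme, fan_in, fan_out):
    if scheme == "he_normal":       # std = sqrt(2 / fan_in), suited to ReLU
        return math.sqrt(2.0 / fan_in)
    if scheme == "xavier_normal":   # std = sqrt(2 / (fan_in + fan_out))
        return math.sqrt(2.0 / (fan_in + fan_out))
    if scheme == "xavier_uniform":  # limit = sqrt(6 / (fan_in + fan_out))
        return math.sqrt(6.0 / (fan_in + fan_out))
    if scheme == "lecun_uniform":   # limit = sqrt(3 / fan_in)
        return math.sqrt(3.0 / fan_in)
    raise ValueError(scheme)

# For a hidden layer fed by the 28 features of Table 3 (fan_out illustrative):
print(round(init_scale("he_normal", 28, 64), 3))  # 0.267
```

He initialization keeps the variance of ReLU activations stable layer to layer, which is consistent with its consistently better results in Table 6.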


Table 7. Detailed results for cases 5, 9, and 13.

Trial order     |   1  |   2  |   3  |   4  |   5  |   6  |   7  |   8  |   9  |  10  | Mean accuracy (%) | Mean run time (s)
SNN (case 5)    | 84.6 | 89.7 | 92.3 | 88.5 | 86.5 | 87.2 | 89.1 | 90.4 | 86.5 | 89.1 | 88.4              |  84.1
FCDNN (case 9)  | 98.7 | 97.4 | 96.2 | 98.7 | 98.7 | 97.4 | 99.4 | 97.4 | 96.8 | 99.4 | 98.0              | 139.5
CNN (case 13)   | 100  | 99.4 | 100  | 100  | 100  | 99.4 | 100  | 100  | 100  | 100  | 99.9              | 143.2
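The early-stopping behavior noted in section 4.1.2 (the CNN reaching its target validation loss near epoch 150 while the SNN and FCDNN ran the full 300 epochs) can be sketched with a simple target-loss rule. The rule and the numbers below are simplifying assumptions for illustration; the paper's actual criterion follows Prechelt [40].

```python
# Sketch of an early-stopping rule (assumption: stop when validation loss
# reaches a target, or at a maximum epoch count; not the paper's exact rule).

def train_epochs(val_losses, target=0.02, max_epochs=300):
    """Return the 1-based epoch at which training stops."""
    for epoch, loss in enumerate(val_losses, start=1):
        if loss <= target or epoch >= max_epochs:
            return epoch
    return len(val_losses)

# A loss curve that reaches the target stops early; a flat one runs to the cap:
falling = [1.0 / e for e in range(1, 301)]  # 1.0, 0.5, 0.333, ...
print(train_epochs(falling))        # 50  (1/50 = 0.02 <= target)
print(train_epochs([1.0] * 300))    # 300 (never reaches target)
```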

[Fig. 7. Confusion matrix of the ten-trial results of the CNN-based classification model. Rows: predicted label; columns: actual label. Classes 1–6 correspond to 6–11 kg in cooling mode and classes 7–12 to 3–8 kg in heating mode. Every class was predicted with 100.0% accuracy except classes 2 and 3 (7 kg and 8 kg, cooling mode), which were confused with each other in 0.6% of samples.]

class 1 represents the refrigerant charge of 6 kg in cooling mode, and class 12 indicates the refrigerant charge of 8 kg in heating mode. The confusion matrix shows that the CNN-based RCFD model classifies all 12 classes, covering both cooling and heating mode, with high accuracy: it predicted the refrigerant charge amount with 100% accuracy in heating mode and confused class 2 (7 kg) with class 3 (8 kg) with only a 0.6% error in cooling mode.
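A confusion matrix such as Fig. 7 is built by tallying predicted labels against actual labels. The following is a minimal pure-Python sketch, not the authors' code; the sample label sequences are invented to mimic the single class 2/class 3 confusion.

```python
# Sketch: building a Fig. 7-style confusion matrix from label sequences.
# Rows index the predicted label, columns the actual label (as in Fig. 7).

def confusion_matrix(actual, predicted, n_classes=12):
    m = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        m[p - 1][a - 1] += 1  # row: predicted class, column: actual class
    return m

actual    = [1, 2, 2, 3, 12]
predicted = [1, 2, 3, 3, 12]  # one class-2 sample misread as class 3
m = confusion_matrix(actual, predicted)
print(m[2][1])  # 1: class 3 was predicted once when class 2 was actual
```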

4.2. Results of CNN-based regression model

Some studies chose test data from the same experiments on which the model was trained, and this study selected test data in the same way. However, the primary purpose of RCFD methods is to predict the refrigerant charge amount of the heat pump system when unseen field data are given. Moreover, refrigerant leakage is not stepwise but continuous, so it is of interest how the models behave when experimental data from unlearned regions are presented as input. To estimate the generalization performance, additional data were therefore collected for several experimental conditions that had not been used anywhere in the learning process; these extra data are defined as new test data in this study.

4.2.1. Results of CNN-based classification model when the unlearned cases data are provided

[Fig. 8. Predicted results of untrained classes in the CNN-based classification model. For each untrained heating-mode charge (4.5, 5.5, 6.5, and 7.5 kg), the predictions fall almost entirely on the two adjacent trained classes; no sample was assigned to a distant class or to the wrong operation mode.]

The experimental data for refrigerant charges of 4.5 kg, 5.5 kg, 6.5 kg, and 7.5 kg in heating mode were not used to train the proposed CNN-based classification model. These new test data were applied to the model for testing, and the results shown in Fig. 8 are revealing. For example, for the experimental data with a refrigerant charge of 5.5 kg in heating mode, the test results indicated the 5 kg class or the 6 kg class in heating mode. The other experimental cases
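The behavior seen in Fig. 8 amounts to snapping an untrained charge onto an adjacent trained class. The sketch below uses a class-to-charge mapping that follows the text; the helper function itself is hypothetical, invented for illustration.

```python
# Sketch: class-to-charge mapping for heating mode (classes 7-12 -> 3-8 kg,
# per the text) and the error when an untrained charge maps to a nearby class.

HEATING_CLASS_KG = {7: 3, 8: 4, 9: 5, 10: 6, 11: 7, 12: 8}

def class_error_kg(predicted_class, true_charge_kg):
    """Absolute error (kg) of a class prediction against the true charge."""
    return abs(HEATING_CLASS_KG[predicted_class] - true_charge_kg)

# Predictions of class 9 (5 kg) or class 10 (6 kg) for a true 5.5 kg charge
# both lie within the reported +/-0.5 kg error band:
print(class_error_kg(9, 5.5), class_error_kg(10, 5.5))  # 0.5 0.5
```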


showed similar results. Even when experimental data at refrigerant charge levels that had not been categorized were entered, the RCFD model estimated a value adjacent to the correct answer; this is an excellent characteristic of a supervised-learning classification model. The model neither assigned a refrigerant amount far from the correct answer nor indicated the wrong operation mode. The developed RCFD model thus showed a classification performance that could predict the refrigerant charge amount within an error range of ±0.5 kg when unlearned new test data were used.

[Fig. 9. Actual and predicted refrigerant charge amounts (kg) of the CNN-based regression model over five trials, with a ±10% error band; predictions lie close to the actual values over the 3–11 kg range.]
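The RMS-error figure reported for the regression model (Fig. 9) can be computed as below. This is a sketch under the assumption that the error is expressed as a percentage of a nominal charge, since this section does not state the exact normalization; the sample values are invented.

```python
# Sketch: RMS error of predicted vs. actual refrigerant charge, expressed
# as a percentage of an assumed nominal charge (normalization is an assumption).
import math

def rms_error_percent(actual_kg, predicted_kg, nominal_kg):
    n = len(actual_kg)
    mse = sum((a - p) ** 2 for a, p in zip(actual_kg, predicted_kg)) / n
    return 100.0 * math.sqrt(mse) / nominal_kg

actual    = [5.0, 5.5, 6.0, 7.0]   # invented sample charges (kg)
predicted = [5.1, 5.4, 6.2, 6.9]   # invented predictions (kg)
print(round(rms_error_percent(actual, predicted, nominal_kg=6.0), 2))
```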

4.2.2. Results of CNN-based regression model

According to Ref. [34], in supervised learning the task is classification if the output is categorical data from some finite set, and regression if the output is a continuous real-valued scalar. Because the refrigerant charge amount of a heat pump system is essentially a continuous real-valued quantity, predicting it is closer to a regression task than to a classification task. Neural network models are commonly used for both classification and regression, so the proposed CNN-based model can be modified from a classification model into a regression model. To do so, the output layer of the aforementioned fully-connected layer should have just one node: the only quantity of interest is the refrigerant charge, so only one output node is required. The activation function of the output layer is an identity function, and the mean square error is used as the cost function [25]. A few hyperparameters, such as the learning rate, may be changed for optimization. Also, the labels of the training data should be changed from categories to real numbers; for example, if the refrigerant charge amount is 5 kg, the label is 5. Fig. 9 shows the predicted results of the CNN-based regression model, with the x-axis showing the actual label and the y-axis the predicted label. A performance that could predict the current refrigerant charge amount with an RMS error of 3.1% was achieved, and the results include the refrigerant charge amounts that had not been learned. These results were

[Fig. 10. Training and validation error curves of the CNN-based RCFD models: (a) classification model; (b) regression model. In both, the training and validation errors decrease together over about 100 epochs, with no sign of overfitting.]

much better than those of the classification model. When the CNN-based classification model included the untrained-case data, its RMS error was 4.6%. As mentioned above, for the experiment with a refrigerant charge of 5.5 kg in heating mode, the CNN-based classification model predicted the charge as 5 kg or 6 kg, with a standard deviation of 0.51 kg, whereas the values predicted by the regression model had an average of 5 kg and a standard deviation of 0.12 kg. Besides, the calculation time of the regression model was 125.7 s, shorter than that of the classification model. As a result, a single CNN-based regression model could predict the refrigerant charge amount with an RMS error of 3.1% over all the experimental conditions, including cooling and heating mode. Fig. 10 shows the training and validation error curves of the CNN-based RCFD models and confirms that overfitting did not occur during training. The CNN-based regression model is the more suitable RCFD method, because the purpose of RCFD methods is to predict the refrigerant charge amount of the heat pump system even when new test data are given.

5. Conclusions

This study suggests two CNN-based RCFD strategies with excellent quantitative prediction performance: a CNN-based classification model and a CNN-based regression model. The conclusions are as follows.

(1) More detailed quantitative prediction methods for the refrigerant charge amount than in previous studies were achieved. Therefore, critical information about how much the current refrigerant amount of a heat pump system is undercharged or overcharged can be obtained.
(2) These methods predicted the refrigerant charge amount of heat pump systems in both cooling and heating operating modes using a single model.
(3) The CNN-based classification model attained a mean prediction accuracy of 99.9% for the trained classes, including both cooling and heating mode, much higher than the 88.4% mean accuracy of a conventional SNN-based model.
(4) The CNN-based regression model could predict the present refrigerant charge amount within an RMS error of 3.1% over the whole region, including unobserved new test data.
(5) Several important hyperparameters were investigated; the CNN architecture, the ReLU function, and the He initialization method proved suitable for the CNN-based RCFD method.
(6) The newly proposed CNN-based RCFD methods are simple and uncomplicated thanks to automatic feature extraction.

Declaration of interests

None.

Acknowledgments

This work was supported by the Brain Korea 21 Plus Project (F14SN02D1310) of Seoul National University. Additional support by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (2016R1A2A1A05005510) is much appreciated. This project was also supported by the R&D Center for reduction of Non-CO2 Greenhouse gases (2017002430001) funded by the Korea Ministry of Environment (MOE) as a Global Top Environment R&D Program. Support from the Korea Institute of Energy Technology Evaluation and Planning (KETEP) of the Ministry of Trade, Industry and Energy, Korea (No. 20173010032150) is also greatly appreciated.

References

[1] U.S. Energy Information Administration. Annual energy outlook 2017. US Energy Information Administration; 2017.
[2] Jacobs P, Smith V, Higgins C, Brost M. Small commercial rooftops: field problems, solutions and the role of manufacturers. In: National conference on building commissioning; 2003.
[3] Cowan A, New Buildings Institute. Review of recent commercial roof top unit field studies in the Pacific Northwest and California. 2004.
[4] Kim DH, Park HS, Kim MS. The effect of the refrigerant charge amount on single and cascade cycle heat pump systems. Int J Refrig 2014;40:254–68.
[5] Kim N-H. Application of the natural refrigerant mixture R-290/DME to a soft ice cream refrigerator. Int J Air Cond Refrig 2016;24(04):1650027.
[6] Kim N-H. Optimization of the water spray nozzle, refrigerant charge amount and expansion valve opening for a unitary ice maker using R-404A. Int J Air Cond Refrig 2017;25(03):1750025.
[7] Kim JH, Cho JM, Lee IH, Lee JS, Kim MS. Circulation concentration of CO2/propane mixtures and the effect of their charge on the cooling performance in an air-conditioning system. Int J Refrig 2007;30(1):43–9.
[8] Kim JH, Cho JM, Kim MS. Cooling performance of several CO2/propane mixtures and glide matching with secondary heat transfer fluid. Int J Refrig 2008;31(5):800–6.
[9] Grace IN, Datta D, Tassou SA. Sensitivity of refrigeration system performance to charge levels and parameters for on-line leak detection. Appl Therm Eng 2005;25(4):557–66.
[10] Li H, Braun JE. Development, evaluation, and demonstration of a virtual refrigerant charge sensor. HVAC&R Res 2009;15(1):117–36.
[11] Kim W, Braun JE. Extension of a virtual refrigerant charge sensor. Int J Refrig 2015;55:224–35.
[12] Yoo JW, Hong SB, Kim MS. Refrigerant leakage detection in an EEV installed residential air conditioner with limited sensor installations. Int J Refrig 2017;78:157–65.
[13] Tassou SA, Grace IN. Fault diagnosis and refrigerant leak detection in vapour compression refrigeration systems. Int J Refrig 2005;28(5):680–8.
[14] Kocyigit N. Fault and sensor error diagnostic strategies for a vapor compression refrigeration system by using fuzzy inference systems and artificial neural network. Int J Refrig 2015;50:69–79.
[15] Shi S, Li G, Chen H, Liu J, Hu Y, Xing L, et al. Refrigerant charge fault diagnosis in the VRF system using Bayesian artificial neural network combined with ReliefF filter. Appl Therm Eng 2017;112:698–706.
[16] Guo Y, Li G, Chen H, Wang J, Guo M, Sun S, et al. Optimized neural network-based fault diagnosis strategy for VRF system in heating mode using data mining. Appl Therm Eng 2017;125:1402–13.
[17] Guo Y, Tan Z, Chen H, Li G, Wang J, Huang R, et al. Deep learning-based fault diagnosis of variable refrigerant flow air-conditioning system for building energy saving. Appl Energy 2018;225:732–45.
[18] Li G, Hu Y, Chen H, Shen L, Li H, Li J, et al. Extending the virtual refrigerant charge sensor (VRC) for variable refrigerant flow (VRF) air conditioning system using data-based analysis methods. Appl Therm Eng 2016;93:908–19.
[19] Li Z, Tan J, Li S, Liu J, Chen H, Shen J, et al. An efficient online wkNN diagnostic strategy for variable refrigerant flow system based on coupled feature selection method. Energy Build 2019;183:222–37.
[20] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–44.
[21] Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics; 2011.
[22] Jarrett K, Kavukcuoglu K, LeCun Y. What is the best multi-stage architecture for object recognition? In: 2009 IEEE 12th international conference on computer vision; 2009.
[23] Hinton G, Deng L, Yu D, Dahl G, Mohamed A-r, Jaitly N, et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process Mag 2012;29(6):82–97.
[24] Géron A. Hands-on machine learning with Scikit-Learn and TensorFlow: concepts, tools, and techniques to build intelligent systems. O'Reilly Media, Inc.; 2017.
[25] Gao J. Machine learning applications for data center optimization. Google White Paper; 2014.
[26] LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE 1998;86(11):2278–324.
[27] Bouvrie J. Notes on convolutional neural networks. 2006. p. 38–44.
[28] Wen L, Li X, Gao L, Zhang Y. A new convolutional neural network-based data-driven fault diagnosis method. IEEE Trans Ind Electron 2018;65(7):5990–8.
[29] Hoang D-T, Kang H-J. Convolutional neural network based bearing fault diagnosis. In: Intelligent computing theories and application; 2017. p. 105–11.
[30] He M, He D. Deep learning based approach for bearing fault diagnosis. IEEE Trans Ind Appl 2017;53(3):3057–65.
[31] Xia M, Li T, Xu L, Liu L, De Silva CW. Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks. IEEE/ASME Trans Mechatron 2018;23(1):101–10.
[32] Kim M, Yoon SH, Payne WV, Domanski PA. Cooling mode fault detection and diagnosis method for a residential heat pump. NIST Special Publication 1087; 2008.
[33] Goodfellow I, Bengio Y, Courville A. Deep learning. MIT Press; 2016.
[34] Murphy KP. Machine learning: a probabilistic perspective. MIT Press; 2012.
[35] Friedman J, Hastie T, Tibshirani R. The elements of statistical learning. New York: Springer Series in Statistics; 2001.
[36] Hinton GE. A practical guide to training restricted Boltzmann machines. In: Neural networks: tricks of the trade. Springer; 2012. p. 599–619.
[37] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics; 2010.
[38] He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE international conference on computer vision; 2015.
[39] LeCun YA, Bottou L, Orr GB, Müller K-R. Efficient backprop. In: Neural networks: tricks of the trade. Springer; 2012. p. 9–48.
[40] Prechelt L. Automatic early stopping using cross validation: quantifying the criteria. Neural Netw 1998;11(4):761–7.
[41] Hecht-Nielsen R. Kolmogorov's mapping neural network existence theorem. In: Proceedings of the IEEE international conference on neural networks III; 1987.
[42] Yoo JW. Study on the detection method of refrigerant leakage amount in air heat pump system. Ph.D. Dissertation. Seoul National University; 2018.
[43] ANSI/AMCA 210. Laboratory methods of testing fans for aerodynamic performance rating. Air Movement and Control Association International, Inc.; 2007.
[44] ANSI/AHRI Standard 1230. Performance rating of variable refrigerant flow (VRF) multi-split air-conditioning and heat pump equipment. 2010.
[45] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems; 2012.

Nomenclature

ANN: artificial neural network
b_ij: bias
BANN: Bayesian artificial neural network
BPNN: back propagation neural network
CNN: convolutional neural network
COP: coefficient of performance
d_nk: the correct answer of the kth output node in the nth sample
DSC: degree of sub-cooling
DSH: degree of superheat
EEV: electronic expansion valve
E(w): cost function (loss function)
f: nonlinear activation function
FCDNN: fully-connected deep neural network
FDD: fault detection and diagnosis
HVAC: heating, ventilation, and air conditioning
K: the total number of output nodes
Max_i: the maximum value of the ith feature in the training data
Min_i: the minimum value of the ith feature in the training data
N: the total number of samples
N_hidden: the number of nodes in the hidden layer
N_input: the number of nodes in the input layer
RCFD: refrigerant charge fault detection
ReLU: rectified linear unit
SNN: shallow neural network
u_ij: the value of the feature map after the convolution operation
VRF: variable refrigerant flow
w: the weight of a neural network
x_ij: the value of the input image data
x_k: the input value of the kth output node
y_k: the output value of the kth output node
y_nk: the output value of the kth output node in the nth sample
z_ij: the element of the originally acquired matrix Z
z'_ij: the element of the normalized matrix Z'

Greek symbols

ε: learning rate
