Hidden Markov model based principal component analysis for intelligent fault diagnosis of wind energy converter systems


Journal Pre-proof

Abdelmalek Kouadri, Mansour Hajji, Mohamed-Faouzi Harkat, Kamaleldin Abodayeh, Majdi Mansouri, Hazem Nounou, Mohamed Nounou

PII: S0960-1481(20)30011-2
DOI: https://doi.org/10.1016/j.renene.2020.01.010
Reference: RENE 12872
To appear in: Renewable Energy
Received Date: 2 August 2019
Revised Date: 30 December 2019
Accepted Date: 2 January 2020

Please cite this article as: Kouadri A, Hajji M, Harkat M-F, Abodayeh K, Mansouri M, Nounou H, Nounou M, Hidden Markov model based principal component analysis for intelligent fault diagnosis of wind energy converter systems, Renewable Energy (2020), doi: https://doi.org/10.1016/j.renene.2020.01.010. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2020 Published by Elsevier Ltd.

Author Contributions Section

In the few related works (we found none similar), the detection and isolation of such a wide range of faults in Wind Turbine Converters (WTC) is infrequently studied, and no multivariate statistical analysis tool has been investigated in this field so far. Most approaches are data-based but do not use statistical analysis; instead they rely on straightforward tools (simple classifiers). These techniques are fully supervised, and their decisions are often uncertain because they require additional information about the system for training, including signals recorded during faults. In the current work, a novel data-based framework is designed around multivariate statistical analysis tools. Note that the simulated benchmark model is used only as a source of data. A minimal and informative set of variables is first proposed for this framework according to the system and signal characteristics. The datasets required to construct the scheme are then defined with regard to the particular system conditions. The proposed scheme can be readily generalized to a real-time application, provided that datasets of the given dimensions and sampling rate, measured during the chosen normal modes of operation, are available. In this work, a novel fault detection and diagnosis (FDD) framework based on the Hidden Markov model (HMM) and principal component analysis (PCA), capable of detecting and identifying faults, is developed. Features are extracted through the PCA approach, by which an optimal number of features is selected. Because a more sophisticated model is needed to account adequately for the randomness of the operating environment, a well-established probabilistic model based on the HMM is used to classify the different faults that can occur in Wind Energy Conversion (WEC) power converters.
The FDD performance of the PCA-based HMM is illustrated using simulated data collected from the WEC under different operating conditions. The obtained results showed that the PCA-based HMM performed better than the PCA-based support vector machine (SVM) when the data were heavily mixed with noise and measurement errors. The comparison was made based on several performance metrics. Finally, this new approach showed a high FDD accuracy, which makes it a good candidate for real-world testing.

Abdelmalek Kouadri (a,c), Mansour Hajji (a,b), Mohamed-Faouzi Harkat (a,d), Kamaleldin Abodayeh (e), Majdi Mansouri (a), Hazem Nounou (a), Mohamed Nounou (f)

(a) Electrical and Computer Engineering Program, Texas A&M University at Qatar, Qatar
(b) Institut Supérieur des Sciences Appliquées et de Technologie de Kasserine, Kairouan University, Tunisia
(c) Signals and Systems Laboratory, Institute of Electrical and Electronics Engineering, University M'Hamed Bougara of Boumerdes, Boumerdes, Algeria
(d) Department of Electronics, Faculty of Engineering Annaba, Badji Mokhtar, Annaba, Algeria
(e) Department of Mathematical Sciences, Prince Sultan University, Riyadh, Saudi Arabia
(f) Chemical Engineering Program, Texas A&M University at Qatar, Qatar

Abstract

Fault Detection and Diagnosis (FDD) for modern Wind Energy Conversion (WEC) systems, and particularly for their converters, is still a challenge because of the high randomness of their operating environment. This paper presents an advanced FDD approach that aims to increase the availability, reliability and required safety of WEC Converters (WECC) under different conditions. The developed FDD approach must be able to detect and correctly diagnose the occurrence of faults in WEC systems. It exploits the benefits of the machine learning (ML)-based Hidden Markov model (HMM) and the principal component analysis (PCA) model. The PCA technique is used to extract and select, efficiently, the features fed to the HMM classifier. The effectiveness and higher classification accuracy of the developed PCA-based HMM approach are demonstrated using simulated data collected from the WEC. The obtained results demonstrate the efficiency of the PCA-based HMM method over the PCA-based support vector machine (SVM) method. The comparison is made based on several performance metrics across different operating conditions of the WEC systems.

Preprint submitted to Renewable Energy, January 4, 2020

Keywords: Machine Learning (ML), Hidden Markov Model (HMM), Principal Component Analysis (PCA), Wind Energy Conversion Converter (WECC) Systems, Fault Detection and Diagnosis (FDD).


1. Introduction

At the heart of Wind Energy Conversion (WEC) systems is the power converter, which links the machine to the grid [1, 2, 3]. Because of its influence on power quality, the WEC Converter (WECC) should be inherently reliable and continuously available. It has been reported in [4] that 21% from 25% of the total failures in WECC are caused by the semiconductor. In addition, the downtime becomes increasingly dependent on the WEC system size [5]. In this context, the wind speed and power output of a WECC are used for overall rotor condition monitoring regardless of an increased blade surface roughness [6], [7]. A spectral analysis of the nacelle oscillation has been successfully applied to rotor blade supervision. In the same way, the issue of blade imbalance in the doubly-fed induction generator (DFIG) has been addressed in [8], where the stator current of the DFIG is analyzed to extract imbalance fault features of the WEC under different wind speeds and imbalance coefficients [9], [10], [11]. The authors in [12] presented an unknown input observer to estimate faults in a wind turbine converter, assuming that the wind speed is unknown; the wind speed in turn affects the rotational speed, whose control is based on the converter torque. For the same purposes, a diagnosis formalism based on fuzzy prototypes is provided in [13]. Some other techniques are summarized in [14]. In addition, a deep learning approach has been proposed for anomaly detection in the WEC gearbox and generator [15]. More recently, an artificial intelligence-based probabilistic anomaly detection approach has been used for reliable WEC condition monitoring; this has been achieved by quantifying realistic uncertainties. The WECC power stage is based on Insulated Gate Bipolar Transistors (IGBTs).
Nonetheless, one of the main causes of WECC faults is periodic switching, which affects the thermal cycling of materials with different expansion coefficients and therefore shortens their life cycle [16]. Besides, switching losses become increasingly important as the internal resistance increases. In addition, the meteorological conditions, vibrations, dust and chemical products under which the WEC operates represent another source of faults in the converter [17]. Under these circumstances, a quite realistic WEC environment has been simulated and different experiments have been carried out. The majority of fault detection and diagnosis (FDD) works have focused on overall WEC fault diagnosis, and much less of the related literature addresses the WECC. Therefore, in this work, a novel FDD framework based on the Hidden Markov model (HMM) and principal component analysis (PCA), capable of detecting and identifying faults, is developed. Features are extracted through the PCA approach, by which an optimal number of features is selected. Because a more sophisticated model is needed to account adequately for the randomness of the operating environment, a well-established probabilistic model based on the HMM is used to classify the different faults that can occur in WEC power converters. The FDD performance of the PCA-based HMM is illustrated using simulated data collected from the WEC under different operating conditions. The application of the PCA-based HMM approach supports the reliability and safety of the overall WEC system through the FDD of the converter. The remainder of the paper is organized as follows: Section 2 briefly describes the theoretical background of PCA as used for feature extraction and selection. Section 3 is devoted to the description of the machine learning technique using the HMM. Section 4 presents the proposed PCA-based HMM framework for fault detection and classification. The simulation results that assess the performance of the proposed PCA-based HMM are presented in Section 5. Finally, conclusions are drawn in Section 6.


2. Feature extraction and selection


2.1. PCA-based feature extraction

The PCA technique is mainly used for dimensionality reduction. It determines a set of orthogonal vectors, called loading vectors, which describe the most dominant features and the main trends in the data. Consider a process operating under healthy (normal) conditions with $m$ sensors, from which $n$ observations are collected and grouped in the data matrix $X$. The data are first normalized to zero mean and unit variance:

$$X = [X_1 \; X_2 \; \cdots \; X_m] \in \Re^{n \times m} \tag{1}$$


Then, by means of the eigenvalue decomposition of the sample covariance matrix $\Phi$,

$$\Phi = \frac{1}{n-1} X^T X = P \Lambda P^T, \tag{2}$$

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \cdots, \lambda_m)$ is a diagonal matrix containing the eigenvalues sorted in decreasing order and $P$ is the loading matrix, PCA transforms the data matrix $X$ into a new matrix $T \in \Re^{n \times m}$:

$$T = X P \tag{3}$$

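2.2. PCA-based feature selection

The most significant data features can be selected by splitting $P$ and $\Lambda$ into modelled and non-modelled variations as follows:

$$P = \begin{bmatrix} \hat{P}_{\ell} & \tilde{P}_{m-\ell} \end{bmatrix} \tag{4}$$

$$\Lambda = \begin{bmatrix} \hat{\Lambda}_{\ell} & 0 \\ 0 & \tilde{\Lambda}_{m-\ell} \end{bmatrix} \tag{5}$$

The first part of the original space, known as the principal subspace, is defined by $\hat{P}_{\ell} \in \Re^{m \times \ell}$ and $\hat{\Lambda}_{\ell} \in \Re^{\ell \times \ell}$, whilst the other part, constituting the residual subspace, is defined by $\tilde{P}_{m-\ell}$ and $\tilde{\Lambda}_{m-\ell}$. Projecting the data onto the principal subspace gives

$$\hat{T} = X \hat{P}_{\ell} \tag{6}$$

and

$$\hat{X} = \hat{T} \hat{P}_{\ell}^{T}. \tag{7}$$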

$\hat{T}$ represents the selected features, which are obtained by projecting $X$ onto the first $\ell$ eigenvectors, corresponding to the largest variances of the sample covariance matrix. In summary, the PCA model is determined from an eigen-decomposition of the covariance matrix $\Phi$. The obtained PCA model is used to extract and select the significant features that serve as HMM observables for classification. Such features should be extracted in a way that emphasizes the differences between normal and one or several abnormal operating conditions.
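The extraction and selection steps of this section can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `fit_pca` and `extract_features`, the synthetic data and the choice of two retained components are all assumptions made for the example.

```python
import numpy as np

def fit_pca(X_healthy, n_components):
    """Fit a PCA model on healthy training data (rows are observations)."""
    mean = X_healthy.mean(axis=0)
    std = X_healthy.std(axis=0, ddof=1)
    Xs = (X_healthy - mean) / std               # zero mean, unit variance
    n = Xs.shape[0]
    Phi = Xs.T @ Xs / (n - 1)                   # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Phi)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    P_hat = eigvecs[:, :n_components]           # retained loading vectors
    return mean, std, eigvals, P_hat

def extract_features(X, mean, std, P_hat):
    """Scale X with the training statistics and project it: T_hat = X P_hat."""
    return ((X - mean) / std) @ P_hat

# Hypothetical data: 500 observations of 4 correlated variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X = latent @ rng.normal(size=(2, 4)) + 0.05 * rng.normal(size=(500, 4))

mean, std, eigvals, P_hat = fit_pca(X, n_components=2)
T_hat = extract_features(X, mean, std, P_hat)   # selected features
X_hat = T_hat @ P_hat.T                         # modelled part of the scaled data
```

New (testing or faulty) data would be passed through `extract_features` with the mean, standard deviation and loadings obtained from the healthy training set, so that all scenarios are expressed in the same principal subspace.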


3. Description of Hidden Markov model technique

The Hidden Markov model (HMM) consists of a finite set of states $\{S_i\}_{i=1}^{N}$, each of which is associated with a (generally joint) probability distribution of multivariate observables. Conceptually, an HMM is based on a Markov chain whose states are hidden: only an external observation is visible at each state [18], [19], [20]. Transitions between the states are governed by a matrix of probabilities $A = \{a_{ij}\}$, called transition probabilities, i.e.,

$$a_{ij} = p(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i, j \le N \tag{8}$$

where $q_t$ represents the state at time $t$. These probabilities should satisfy

$$\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i \le N \tag{9}$$

The emission probability distribution in a given state $S_j$ is $b_j$, such that

$$b_j = p(v_t \mid q_t = S_j), \tag{10}$$

where $v_t$ denotes the observation at time $t$. The emission probability distribution in each state should satisfy

$$\sum_{t} b_j = 1, \quad 1 \le j \le N \tag{11}$$

Another parameter characterizing the HMM is the initial state probability, $\pi = \{\pi_i\}$,

$$\pi_i = p(q_1 = S_i), \quad 1 \le i \le N \tag{12}$$

Therefore, the three main elements of an HMM are the state transition probability matrix $A$, the measurement probability distribution matrix $B$, and the initial state probability distribution $\pi$. For convenience, a compact notation is used to indicate the complete parameter set of the model, so that an HMM can be denoted by the triplet

$$G = (A, B, \pi). \tag{13}$$

Figure 1 presents an HMM scheme. The observable probability distributions in each state, $\{b_j\}$, are joint discrete probability mass functions. They jointly represent the probability distributions of the features, each of which corresponds to a given process status. Therefore, each state indicates the corresponding process status (Healthy, Faulty 1 ($F_1$), Faulty 2 ($F_2$), $\cdots$, Faulty N ($F_N$), where N is the total number of faults). Transition probabilities are the weights of the links between the different states of an HMM. They are often set manually to ensure a certain required performance. For FDD purposes, the transitions from the healthy status to one of the faulty statuses, and vice versa, are assigned minimal rates, which increases robustness and sensitivity [21], [22]. In this work, the transition probabilities of the HMM are selected with regard to the diagnosis performance. The process statuses are recognized by finding the most probable corresponding hidden states through a series of feature observations. Many efficient approaches have been developed for this purpose [18]. The main idea is to compute the probability of each state at time $t$ for a given sequence of feature observations,

$$\alpha_i(t) = p(q_t = S_i \mid v_t, G), \quad 1 \le i \le N \tag{14}$$

The Bayesian approach is used to find $p(q_t = S_i \mid v_t, G)$ over all states; the maximum likelihood decision for the process status at time $t$ is then the state associated with the largest $\alpha_i(t)$.

Figure 1: General description of HMM
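As an illustration of the decision rule built on equation (14), the sketch below applies a standard recursive Bayesian (forward) filter to a small discrete HMM and picks the state with the largest probability at each time step. It is a hedged example under stated assumptions: the three-state model, its numerical values and the function name `filter_states` are hypothetical, and the recursion conditions on all observations up to time $t$, which is one common reading of (14).

```python
import numpy as np

def filter_states(obs, A, B, pi):
    """Filtered state probabilities, one row per time step.

    obs : sequence of observation symbol indices
    A   : (N, N) transition matrix, A[i, j] = p(q_{t+1}=S_j | q_t=S_i)
    B   : (N, K) emission matrix, B[j, k] = p(v_t=k | q_t=S_j)
    pi  : (N,) initial state probabilities
    """
    alpha = np.zeros((len(obs), len(pi)))
    belief = pi * B[:, obs[0]]                  # Bayes update at t = 1
    alpha[0] = belief / belief.sum()
    for t in range(1, len(obs)):
        predicted = alpha[t - 1] @ A            # propagate through the Markov chain
        belief = predicted * B[:, obs[t]]       # weight by the emission probability
        alpha[t] = belief / belief.sum()        # normalize to a distribution
    return alpha

# Hypothetical model: Healthy, F1, F2 with three observation symbols.
A = np.array([[0.98, 0.01, 0.01],
              [0.01, 0.98, 0.01],
              [0.01, 0.01, 0.98]])
B = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
pi = np.full(3, 1 / 3)                          # equally likely initial states
obs = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]

alpha = filter_states(obs, A, B, pi)
status = alpha.argmax(axis=1)                   # most probable status at each t
```

With small off-diagonal transition probabilities, as in this example, a single atypical observation is not enough to change the decision; the estimated status follows a change in the observations only after a few consistent samples, which is the robustness effect mentioned above.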


4. Fault detection and classification using PCA-based HMM technique

For FDD purposes, the main steps of the proposed framework are as follows:

i) Different measurements are recorded from the process under different operating conditions. The collected data represent the healthy scenario and the different faulty scenarios that can occur in the process. They are divided into two sets: one is used for training and the other for testing.

ii) A PCA model is built using only the training data recorded while the system is working under normal operating conditions.

iii) A set of features is extracted through the PCA model, ordered so that successive features carry less and less of the information in the original data. Therefore, the features that capture the most information are kept; they represent the projection of the data onto a subspace defined by a reduced number of projection directions. This number must be chosen carefully because it significantly affects the classification performance.

iv) Joint probabilities of the selected features over the different process situations are computed.

The HMM structure depends on the number of scenarios: a corresponding state is assigned to each scenario. Based on some requirements, and for large-sized data, the intermediate transitions between all HMM states are defined manually, whereas in the univariate case it is more practical to estimate the transition probabilities. Once the HMM is trained, meaning that its parameter triplet is defined, a testing set is used to assess its performance. The different steps of the proposed strategy for FDD purposes are summarized in the block diagram illustrated in Figure 2.

[Block diagram: the WECC training data and the WECC testing data each pass through data pre-processing, feature extraction and selection using PCA, and fault classification using HMM; the training branch yields the model and the testing branch yields the prediction, leading to the fault diagnosis results.]

Figure 2: Illustration of PCA-based HMM procedures for WECC fault detection and diagnosis
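To make the training part of the procedure more concrete, the following sketch builds the three elements of the triplet for a toy problem. It is only an assumption-laden illustration: the paper uses joint distributions over several PCA features, whereas this example quantizes a single hypothetical feature into three symbols; the helper names `sticky_transitions` and `emission_pmf`, the bin edges, the self-transition probability of 0.99 and the add-one smoothing are all choices made for the example.

```python
import numpy as np

def sticky_transitions(n_states, stay=0.99):
    """Manually defined transition matrix with a high self-transition probability."""
    off = (1.0 - stay) / (n_states - 1)
    A = np.full((n_states, n_states), off)
    np.fill_diagonal(A, stay)
    return A

def emission_pmf(features, labels, n_states, edges):
    """Estimate p(symbol | state) from labelled 1-D features by histogram counting."""
    symbols = np.digitize(features, edges)      # symbol index in 0..len(edges)
    n_symbols = len(edges) + 1
    B = np.ones((n_states, n_symbols))          # add-one smoothing avoids zero probabilities
    for state, symbol in zip(labels, symbols):
        B[state, symbol] += 1
    return B / B.sum(axis=1, keepdims=True)     # each row is a probability mass function

# Hypothetical labelled training feature: 200 samples for each of 3 statuses.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 200)
features = rng.normal(loc=np.array([-2.0, 0.0, 2.0])[labels], scale=0.5)
edges = np.array([-1.0, 1.0])                   # hypothetical quantization thresholds

A = sticky_transitions(3)
B = emission_pmf(features, labels, 3, edges)
pi = np.full(3, 1 / 3)                          # equally likely initial states
```

The resulting `(A, B, pi)` plays the role of the triplet $G$; the testing data would be quantized with the same edges before being classified.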


5. Simulation Results

5.1. Description of wind turbines converter systems

In this study, a variable-speed wind turbine converter system topology is considered. This topology is based on a squirrel-cage induction machine. The asynchronous machine is coupled to the turbine through a speed multiplier (see Figure 3). The variable-speed operation of these turbines has become possible through the development of static converters and their control systems. Two static converters, interfaced by a DC bus, are used. These converters are connected to the grid through a passive filter that reduces the current harmonics. The first converter controls the generated power by acting on the generator speed. This makes it possible to control the orientation of the blade system during wind gusts. The second converter, with adequate control, delivers fixed-frequency currents corresponding to the grid frequency, with power factor adjustment. Thus, the rated power of the machine determines the maximum power supplied by the wind turbine. For this installation, the converters are sized to transfer the total power exchanged between the machine and the grid. The wind turbine parameters are listed in Table 1.

Figure 3: Variable speed wind turbine based on asynchronous machine.


Table 1: Wind Turbine parameters

| Parameter | Symbol | Value |
|---|---|---|
| Nominal power of turbine | Ptn | 15 kW |
| Moment of inertia of turbine | Jt | 1000 kg·m² |
| Stator resistance | Rs | 0.087 Ohm |
| Stator leakage inductance | ls | 0.8 mH |
| Rotor resistance | Rr | 0.228 Ohm |
| Rotor leakage inductance | lr | 0.8 mH |
| Magnetizing inductance | Lm | 34.7 mH |
| Number of poles | P | 4 |
| Moment of inertia of generator | Jg | 0.2 kg·m² |

The power converter topology used in the wind chain is two-level. Each converter is composed of three arms, and each arm consists of a high-side and a low-side IGBT (see Figure 4). The diagnostic study concerns only two IGBTs, one for each converter (IGBT11 for the generator converter and IGBT21 for the grid converter). Three types of faults are considered: short-circuit, open-circuit and wear-out (see Table 2). The last fault is modeled by an internal resistance of two Ohms.

[Diagram: generator converter with switches IGBT11 to IGBT16 and grid converter with switches IGBT21 to IGBT26.]

Figure 4: Converters topology.

Table 2: Main electrical faults in wind energy converters

| Fault symbol | Description | Fault symbol | Description |
|---|---|---|---|
| SC11 | IGBT11 Short-Circuit | SC21 | IGBT21 Short-Circuit |
| OC11 | IGBT11 Open-Circuit | OC21 | IGBT21 Open-Circuit |
| WO11 | IGBT11 Wear-Out | WO21 | IGBT21 Wear-Out |

Figures 5 to 9 show the behavior of some electrical and mechanical variables for the different fault scenarios.

[Plot: mechanical torque versus number of samples (0 to 14000) for the Healthy, SC11, SC21, WO11, WO21, OC11 and OC21 conditions.]

Figure 5: Mechanical torque for different conditions.

[Plot: generator speed versus number of samples (0 to 14000) for the Healthy, SC11, SC21, WO11, WO21, OC11 and OC21 conditions.]

Figure 6: Generator speed for different conditions.

[Plot: generator current (phase a) versus number of samples (0 to 14000) for the Healthy, SC11, SC21, WO11, WO21, OC11 and OC21 conditions.]

Figure 7: Generator current for different scenarios.

[Plot: grid current (phase a) versus number of samples (0 to 14000) for the Healthy, SC11, SC21, WO11, WO21, OC11 and OC21 conditions.]

Figure 8: Grid current for different scenarios.

[Plot: bus voltage versus number of samples (0 to 14000) for the Healthy, SC11, SC21, WO11, WO21, OC11 and OC21 conditions.]

Figure 9: Bus voltage for different conditions.

5.2. Fault classification results

Different experiments are conducted under different operating conditions of the WECC for FDD purposes. Measurements of the 12 simulated variables listed in Table 3 are collected. These variables are recorded for one healthy and 6 different faulty operating modes of the WECC. For the training phase, each mode is described by 2000 samples, taken with a spacing of 10, at a sampling frequency of 20 kHz. For the testing phase, a separate data set, different from the one used for training, is employed (see Table 4). The training data set recorded under healthy operating conditions, after normalization to zero mean and unit variance, is used to build a PCA model. The variances of the transformed WECC variables, obtained through the eigenvalue decomposition, are illustrated in Figure 10. According to the most significant information captured in the data via its projection, a PCA model with 5 directions has been constructed. This selection is based on the fact that components with variances less than one are neglected and considered as noise and measurement errors. Typically, this variance limit can be lowered to 0.7; therefore, only 5 principal components are retained and used as observables in an HMM classifier. In this study, and in view of the number of different scenarios, the HMM consists of 7 states representing the different process statuses (see Table 4). Furthermore, a joint probability distribution over the observables has been determined. The initial state probabilities $\{\pi_i\}$ are taken as equally likely. The transition probabilities between the different HMM states are established beforehand and kept as fixed parameters in this work. Consequently, to diagnose faults, the state with the largest probability given the presented observation is chosen as the candidate process status. In the training phase, the HMM shows the ability to recognize all the different operating conditions of the WECC. For the support vector machine (SVM), by comparison, an accuracy of 93.61% has been recorded. The two confusion matrices obtained with the HMM and SVM techniques are listed in Tables 5 and 6, respectively. To thoroughly evaluate the quality of the classification, in addition to the accuracy, five other metrics are used for the testing data set: Recall, Precision, Specificity, F-score and Matthews Correlation Coefficient (MCC) [23], [24], [25], [26]. Recall, Precision, Specificity and MCC are defined as follows:

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{15}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{16}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{17}$$

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) \times (TP + FN) \times (TN + FP) \times (TN + FN)}}, \tag{18}$$

where $TP$ is the number of samples that are correctly identified; $TN$ is the number of samples that are correctly dismissed; $FP$ is the number of samples that are incorrectly identified; and $FN$ is the number of samples that are incorrectly dismissed. All the aforementioned metrics are usually expressed as percentages. The Recall metric, also known as sensitivity, measures the individual classification accuracy. From Table 5, in terms of the recall metric, the HMM perfectly identifies the faults WO11, OC11 and OC21, a slight improvement over the SVM classifier (see Table 6). The other three faults are assigned relatively well to their corresponding classes, and here the HMM shows a clear improvement over the SVM classifier. From the fault detection point of view, false alarms are reduced by almost 10%. The proposed HMM also performs better than the SVM in terms of classification accuracy. The last rows of Tables 5 and 6 list the Precision measures obtained with the HMM and the SVM, respectively. It can be clearly seen that the HMM performs better than the SVM for the faults SC11 and SC21, whereas only a small improvement is obtained for the classification of the other faults. Furthermore, the precision for the healthy operating conditions of the WECC is higher with the HMM than with the SVM. Besides Accuracy, other metrics are also suited to evaluating classification performance. To provide an efficient single performance indicator, the Recall and Precision are combined in the F-score metric when analyzing the classifier. Table 7 shows that the SVM classifier generally scores lower than the HMM in terms of F-score. On the other hand, the diagnosis quality with respect to the classes other than the fault concerned is evaluated via the Specificity; the HMM shows slightly better Specificity than the SVM technique. To avoid a poor generalization in WECC fault diagnosis based on the accuracy alone, the MCC is usually considered as a balanced metric to assess the classification accuracy. The MCC obtained with the SVM approach is clearly worse than that of the proposed HMM for the healthy mode and the first two faulty operating modes of the WECC. For the last three faults, the HMM brings only a small gain over the SVM, whilst for the WO11 fault the two classifiers differ only marginally (99.85% for the HMM against 99.97% for the SVM, see Table 7).
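As a check on definitions (15) to (18), the following minimal sketch recomputes the per-class metrics from the SVM confusion matrix reported in Table 6, treating each state in turn as the positive class. The F-score is not written out in the text; the sketch assumes the usual definition F1 = 2 × Precision × Recall / (Precision + Recall). The function name `per_class_metrics` and the dictionary layout are illustrative choices.

```python
import numpy as np

# SVM confusion matrix from Table 6 (rows: actual S0..S6, columns: predicted S0..S6).
C = np.array([
    [1532,  101,  341,    0,   26,    0,    0],
    [  99, 1793,  107,    0,    1,    0,    0],
    [ 298,  248, 1435,    0,   19,    0,    0],
    [   0,    0,    0, 1999,    0,    1,    0],
    [  50,    1,   27,    0, 1922,    0,    0],
    [   0,    0,    0,    0,    0, 1999,    1],
    [   0,    0,    0,    0,    0,    0, 2000],
], dtype=float)

def per_class_metrics(C):
    """One-vs-rest metrics (in percent) for every class of a confusion matrix."""
    total = C.sum()
    tp = np.diag(C)
    fn = C.sum(axis=1) - tp                     # actual class, predicted as another
    fp = C.sum(axis=0) - tp                     # predicted class, actually another
    tn = total - tp - fn - fp
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)   # assumed standard F1
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "recall": 100 * recall,
        "precision": 100 * precision,
        "specificity": 100 * specificity,
        "f1": 100 * f1,
        "mcc": 100 * mcc,
        "accuracy": 100 * tp.sum() / total,
    }

metrics = per_class_metrics(C)
```

For the healthy state S0, the computed recall, precision, F1, specificity and MCC agree, to the reported precision, with the SVM values given in Tables 6 and 7, and the overall accuracy agrees with the 90.57% shown in Table 6.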


Table 3: Variables description

| Variable | Description |
|---|---|
| x1 | Cm: Mechanical torque (Nm) |
| x2 | Ng: Generator speed (rev/min) |
| x3 | isag: Generator current phase a (A) |
| x4 | isbg: Generator current phase b (A) |
| x5 | Isd: Generator current along d-axis (A) |
| x6 | Isq: Generator current along q-axis (A) |
| x7 | VDC: Bus voltage (V) |
| x8 | POut: Output power (W) |
| x9 | isar: Grid current phase a (A) |
| x10 | isbr: Grid current phase b (A) |
| x11 | Isd: Grid current along d-axis (A) |
| x12 | Isq: Grid current along q-axis (A) |

Table 4: Different HMM states, process statuses and their corresponding data

| State | Process status | Training data (samples) | Testing data (samples) |
|---|---|---|---|
| S0 | Healthy | 2000 | 2000 |
| S1 | Faulty-SC11 | 2000 | 2000 |
| S2 | Faulty-SC21 | 2000 | 2000 |
| S3 | Faulty-WO11 | 2000 | 2000 |
| S4 | Faulty-WO21 | 2000 | 2000 |
| S5 | Faulty-OC11 | 2000 | 2000 |
| S6 | Faulty-OC21 | 2000 | 2000 |

[Plot: variances (0 to 3.5) versus number of principal components (up to 12).]

Figure 10: Selected number of principal components
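The variance-threshold rule used to choose the number of retained components can be written as a one-line count. The sketch below is illustrative only: the eigenvalues are hypothetical values for a 12-variable correlation matrix (they are not read from Figure 10), and the function name `n_retained` is an assumption.

```python
import numpy as np

def n_retained(eigvals, threshold):
    """Number of principal components whose variance reaches the threshold."""
    return int(np.sum(np.asarray(eigvals, dtype=float) >= threshold))

# Hypothetical eigenvalues of a 12-variable correlation matrix (they sum to 12).
eigvals = [3.2, 2.4, 1.9, 1.3, 0.8, 0.6, 0.5, 0.4, 0.35, 0.3, 0.15, 0.1]

strict = n_retained(eigvals, threshold=1.0)     # keep variances of at least one
relaxed = n_retained(eigvals, threshold=0.7)    # variance limit lowered to 0.7
```

For these made-up values, lowering the limit from 1.0 to 0.7 admits one additional component, which illustrates how sensitive the retained dimension can be to the chosen threshold.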

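Table 5: Confusion matrix with different performance metrics for PCA-based HMM classifier in testing phase (rows: actual process statuses; columns: predicted process statuses)

| Actual \ Predicted | S0 | S1 | S2 | S3 | S4 | S5 | S6 | Recall |
|---|---|---|---|---|---|---|---|---|
| S0 | 1741 | 47 | 207 | 5 | 0 | 0 | 0 | 87.05 |
| S1 | 47 | 1890 | 63 | 0 | 0 | 0 | 0 | 94.50 |
| S2 | 147 | 128 | 1694 | 0 | 4 | 0 | 0 | 84.70 |
| S3 | 0 | 0 | 0 | 2000 | 0 | 0 | 0 | 100 |
| S4 | 9 | 0 | 8 | 0 | 1983 | 0 | 0 | 99.15 |
| S5 | 0 | 0 | 0 | 0 | 0 | 2000 | 0 | 100 |
| S6 | 0 | 0 | 0 | 0 | 0 | 0 | 2000 | 100 |
| Precision | 89.56 | 91.53 | 85.90 | 99.75 | 99.80 | 100 | 100 | 95.06 |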

Table 6: Confusion matrix with different performance metrics for SVM classifier in testing phase (rows: actual process statuses; columns: predicted process statuses)

| Actual \ Predicted | S0 | S1 | S2 | S3 | S4 | S5 | S6 | Recall |
|---|---|---|---|---|---|---|---|---|
| S0 | 1532 | 101 | 341 | 0 | 26 | 0 | 0 | 76.60 |
| S1 | 99 | 1793 | 107 | 0 | 1 | 0 | 0 | 89.65 |
| S2 | 298 | 248 | 1435 | 0 | 19 | 0 | 0 | 71.75 |
| S3 | 0 | 0 | 0 | 1999 | 0 | 1 | 0 | 99.95 |
| S4 | 50 | 1 | 27 | 0 | 1922 | 0 | 0 | 96.10 |
| S5 | 0 | 0 | 0 | 0 | 0 | 1999 | 1 | 99.95 |
| S6 | 0 | 0 | 0 | 0 | 0 | 0 | 2000 | 100 |
| Precision | 77.41 | 83.67 | 75.13 | 100 | 97.66 | 99.95 | 99.95 | 90.57 |

Table 7: Different performance metrics for PCA-based HMM/SVM classifiers in testing phase (each entry is given as HMM/SVM)

| State | F1 | Specificity | MCC |
|---|---|---|---|
| S0 | 88.29/77.00 | 98.31/96.28 | 86.38/73.20 |
| S1 | 92.99/86.56 | 98.54/97.08 | 91.81/84.29 |
| S2 | 85.29/73.40 | 97.68/96.04 | 82.87/69.11 |
| S3 | 99.87/99.97 | 99.96/100 | 99.85/99.97 |
| S4 | 99.47/96.87 | 99.62/99.62 | 99.39/96.36 |
| S5 | 100.0/99.95 | 100/99.99 | 100/99.94 |
| S6 | 100.0/99.97 | 100/99.99 | 100/99.97 |

6. Conclusion

In this paper, a machine learning-based Hidden Markov model (HMM) merged with principal component analysis (PCA) was proposed to deal with the problem of fault detection and diagnosis (FDD) in wind energy conversion converter (WECC) systems. The PCA model was applied to extract and select more efficient features to be used as observables in the HMM technique. Different operating conditions of the WECC were considered to show the robustness and the efficiency of the developed PCA-based HMM approach. The obtained results showed that the PCA-based HMM performed better than the PCA-based support vector machine (SVM) when the data were heavily mixed with noise and measurement errors. The comparison was made based on several performance metrics. Finally, this new approach showed a high FDD accuracy, which makes it a good candidate for real-world testing.


Acknowledgment


This work was made possible by NPRP grant NPRP9-330-2-140 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.


References


[1] P. B. Dao, W. J. Staszewski, T. Barszcz, T. Uhl, Condition monitoring and fault detection in wind turbines based on cointegration analysis of SCADA data, Renewable Energy 116 (2018) 107–122.
[2] H. Niemann, N. Kjølstad Poulsen, M. Mirzaei, L. C. Henriksen, Fault diagnosis and condition monitoring of wind turbines, International Journal of Adaptive Control and Signal Processing 32 (4) (2018) 586–613.
[3] D. Zappalá, N. Sarma, S. Djurović, C. Crabtree, A. Mohammad, P. Tavner, Electrical & mechanical diagnostic indicators of wind turbine induction generator rotor faults, Renewable Energy 131 (2019) 14–24.
[4] W. Qiao, D. Lu, A survey on wind turbine condition monitoring and fault diagnosis, Part I: Components and subsystems, IEEE Transactions on Industrial Electronics 62 (10) (2015) 6536–6545.
[5] J. Lan, R. J. Patton, X. Zhu, Fault-tolerant wind turbine pitch control using adaptive sliding mode estimation, Renewable Energy 116 (2018) 219–231.
[6] P. Caselitz, J. Giebhardt, Rotor condition monitoring for improved operational safety of offshore wind energy converters, Journal of Solar Energy Engineering 127 (2) (2005) 253–261.
[7] Y. Zhang, B. Chen, G. Pan, Y. Zhao, A novel hybrid model based on VMD-WT and PCA-BP-RBF neural network for short-term wind speed forecasting, Energy Conversion and Management 195 (2019) 180–197.


[8] D. Yang, J. Tang, F. Zeng, Blade imbalance fault diagnosis of doubly fed wind turbine based on current coordinate transformation, IEEJ Transactions on Electrical and Electronic Engineering 14 (2) (2019) 185–191.

[9] H. Habibi, I. Howard, S. Simani, Reliability improvement of wind turbine power generation using model-based fault detection and fault tolerant control: A review, Renewable Energy (2018).

[10] M. Shahbazi, P. Poure, S. Saadate, Real-time power switch fault diagnosis and fault-tolerant operation in a DFIG-based wind energy system, Renewable Energy 116 (2018) 209–218.

[11] E. Artigao, A. Honrubia-Escribano, E. Gomez-Lazaro, Current signature analysis to monitor DFIG wind turbine generators: A case study, Renewable Energy 116 (2018) 5–14.

[12] P. F. Odgaard, J. Stoustrup, Unknown input observer based scheme for detecting faults in a wind turbine converter, IFAC Proceedings Volumes 42 (8) (2009) 161–166.

[13] S. Simani, P. Castaldi, A. Tilli, Data-driven approach for wind turbine actuator and sensor fault detection and isolation, IFAC Proceedings Volumes 44 (1) (2011) 8301–8306.

[14] Z. Yang, Y. Chai, A survey of fault diagnosis for onshore grid-connected converter in wind energy conversion systems, Renewable and Sustainable Energy Reviews 66 (2016) 345–359.

[15] H. Zhao, H. Liu, W. Hu, X. Yan, Anomaly detection and fault analysis of wind turbine components based on deep learning network, Renewable Energy 127 (2018) 825–834.

[16] A. Timbus, M. Liserre, R. Teodorescu, P. Rodriguez, F. Blaabjerg, Evaluation of current controllers for distributed power generation systems, IEEE Transactions on Power Electronics 24 (3) (2009) 654–664.

[17] E. Wolfgang, L. Amigues, N. Seliger, G. Lugert, Building-in reliability into power electronics systems, The World of Electronic Packaging and System Integration (2005) 246–252.

[18] L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77 (2) (1989) 257–286.

[19] M. M. Rashid, J. Yu, Hidden Markov model based adaptive independent component analysis approach for complex chemical process monitoring and fault detection, Industrial & Engineering Chemistry Research 51 (15) (2012) 5506–5514.

[20] C. Ning, M. Chen, D. Zhou, Hidden Markov model-based statistics pattern analysis for multimode process monitoring: an index-switching scheme, Industrial & Engineering Chemistry Research 53 (27) (2014) 11084–11095.

[21] M. Quiñones-Grueiro, A. Prieto-Moreno, C. Verde, O. Llanes-Santiago, Data-driven monitoring of multimode continuous processes: A review, Chemometrics and Intelligent Laboratory Systems (2019).

[22] M. S. Kan, A. C. Tan, J. Mathew, A review on prognostic techniques for non-stationary and non-linear rotating systems, Mechanical Systems and Signal Processing 62 (2015) 1–20.

[23] D. M. Powers, Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation (2011).

[24] S. Boughorbel, F. Jarray, M. El-Anbari, Optimal classifier for imbalanced data using Matthews correlation coefficient metric, PLoS ONE 12 (6) (2017) e0177678.

[25] Q. Zou, S. Xie, Z. Lin, M. Wu, Y. Ju, Finding the best classification threshold in imbalanced classification, Big Data Research 5 (2016) 2–8.

[26] R. M. Losee, When information retrieval measures agree about the relative quality of document rankings, Journal of the American Society for Information Science 51 (9) (2000) 834–840.

Highlights:

1. A machine-learning-based hidden Markov model (HMM) technique has been developed for fault detection and diagnosis (FDD).

2. The most relevant features have been extracted and selected via the principal component analysis (PCA) approach.

3. The extracted and selected features have been used as observables in the HMM procedure.

4. The developed PCA-based HMM approach has shown good FDD efficiency in Wind Energy Conversion systems.

Conflicts of Interest Statement

Manuscript title: Hidden Markov model based principal component analysis for intelligent fault diagnosis of wind energy converter systems

The authors whose names are listed immediately below certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Author names:

Abdelmalek Kouadri, Mansour Hajji, Mohamed-Faouzi Harkat, Kamaleldin Abodayeh, Majdi Mansouri, Hazem Nounou, Mohamed Nounou

The authors whose names are listed immediately below report the following details of affiliation or involvement in an organization or entity with a financial or non-financial interest in the subject matter or materials discussed in this manuscript. Please specify the nature of the conflict on a separate sheet of paper if the space below is inadequate.

Author names: