A multi-attribute classification fusion system for insulated gate bipolar transistor diagnostics


Microelectronics Reliability 53 (2013) 1117–1129


Prasanna Tamilselvan a, Pingfeng Wang a,*, Michael Pecht b

a Department of Industrial and Manufacturing Engineering, Wichita State University, Wichita, KS 67208, USA
b Center for Advanced Life Cycle Engineering (CALCE), University of Maryland, College Park, MD 20742, USA

Article history: Received 8 September 2012. Received in revised form 18 February 2013. Accepted 29 April 2013. Available online 25 May 2013.

Abstract

Effective health diagnosis provides benefits such as improved safety, improved reliability, and reduced costs for the operation and maintenance of complex engineered systems. This paper presents a multi-attribute classification fusion system which leverages the strengths provided by multiple membership classifiers to form a robust classification model for insulated gate bipolar transistor (IGBT) health diagnostics. The developed diagnostic system employs a k-fold cross-validation model for the evaluation of membership classifiers, and develops a multi-attribute classification fusion approach based on a weighted majority voting with dominance scheme. An experimental study of IGBT degradation was first carried out for the identification of failure precursor parameters, and classification techniques (e.g., supervised learning, unsupervised learning, and statistical inference) were then employed as the member algorithms for the development of a robust IGBT classification fusion system. In this study, the developed classification fusion model based on multiple member classification algorithms outperformed each stand-alone method for IGBT health diagnostics by providing better diagnostic accuracy and robustness. The developed multi-attribute classification fusion system provides an effective tool for the continuous monitoring of IGBT health conditions and enables the development of IGBT failure prognostics systems.

© 2013 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +1 316 978 5910; fax: +1 316 978 3742. E-mail address: [email protected] (P. Wang). http://dx.doi.org/10.1016/j.microrel.2013.04.011

1. Introduction

System health state (HS) classification provides benefits such as improved safety, improved reliability, and reduced costs for the operation and maintenance of complex engineered systems. Research on real-time diagnostics and prognostics interprets data acquired by smart sensors and utilizes these data streams in making critical operation and maintenance (O&M) decisions [1]. Maintenance and life-cycle management is one area that will significantly benefit from improved design and maintenance activities in both the manufacturing and service sectors. Maintenance and life-cycle management activities constitute a large portion of overhead costs in many industries [2]. These costs are likely to increase due to the rising competition in today's global economy. In the manufacturing and service sectors, unexpected breakdowns are prohibitively expensive, since they immediately result in lost production, failed shipping schedules, and poor customer satisfaction. In order to reduce and possibly eliminate such problems, it is necessary to accurately assess the current state of system degradation through health diagnostics. Research on condition monitoring has addressed these challenges by utilizing sensory information from functioning systems and assessing their degradation states. Continuous monitoring of current system HSs notifies users about both early and advanced stages of damage by analyzing the performance degradation of system components [3–5]. Condition monitoring has been successfully applied to bearings [6–9], machine tools [10], transformers [11], engines [12], aircraft wings [13], and turbines. Due to the complexity of HS classification for different engineered systems, machine learning and statistical inference techniques are often employed to solve diagnostic problems. The machine learning-based health diagnostics methodology can be broadly classified into supervised, semi-supervised, and unsupervised learning techniques. A supervised learning technique is the process of learning the relationship between the input values and the desired target value in the form of a set of patterns having both an input object and the desired target output. Error values are evaluated and fed back to the learning model in order to obtain a potential solution. The learned relationship/function from the training data is used as a classifier model to predict the unlearned and unknown patterns. An unsupervised learning process is a process of learning a hidden relationship in the input values without the target labels/outputs (unlabeled data). Unlabeled data generally refers to a training data set for system input variables without the corresponding system outputs, which is often used for unsupervised learning. In contrast, labeled data refers to a paired training data set having both system input variables and


Nomenclature

Acronyms
HS      health state
WMVD    weighted majority voting with dominance
DBN     deep belief network
BNN     back-propagation neural network
GA      genetic algorithm
SVM     support vector machine
SOM     self-organizing map
MD      Mahalanobis distance
RBM     restricted Boltzmann machine
RUL     remaining useful life
IGBT    insulated gate bipolar transistor
MOSFET  metal oxide semiconductor field effect transistor
BJT     bipolar junction transistor
VCE     collector–emitter voltage
ICE     collector–emitter current
T       case temperature

Notation
k         total number of folds used for fusion formulation, where 1 ≤ i ≤ k
n         total number of HSs, where 1 ≤ j ≤ n
r         total number of training data points in each fold, where 1 ≤ l ≤ r
c         total number of classifier methods used in the fusion process, where 1 ≤ m ≤ c
CC_{i,j}  multi-attribute classifier decision matrix of jth HS for ith fold
d_{l,m}   classification decision of mth classifier for lth training data point
AC_{i,j}  target classification decision matrix of jth HS for ith fold
a_l       target classification decision of lth training data point
wt_{m,j}  weight value of mth classifier in classifying jth HS
inc_{m,j} classification decision of mth classifier as jth HS of the incoming data point
T         classification fusion decision of the incoming data point
x_i       p-dimensional vector
q_i       ith class label
w         normal vector of the hyper-plane
b         bias of the hyper-plane
μ_j       mean vector of the training data
S_j       variance matrix of the training data
ξ_i       slack variable
C         penalty parameter
w_{i,j}   synaptic weight between the ith and the jth neurons
u_i       state of the ith neuron
P(·)      probability distribution function
b_i       bias of the ith neuron
h_i       state of the ith neuron in the hidden layer
v_i       state of the ith neuron in the visible layer
h(·)      neighborhood function
α(t)      learning coefficient
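The distinction between labeled and unlabeled data above can be illustrated with a minimal supervised example: a hypothetical nearest-centroid classifier in pure Python, where labeled training pairs (feature vector, HS label) are used to learn class centroids. The feature values and HS names are invented purely for illustration.

```python
import math

def nearest_centroid_train(points, labels):
    """Supervised learning: use labeled data to compute one centroid per class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        s = sums.setdefault(y, [0.0] * len(p))
        for d, v in enumerate(p):
            s[d] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def nearest_centroid_predict(centroids, p):
    """Assign an unseen point to the class with the closest centroid."""
    return min(centroids, key=lambda y: math.dist(p, centroids[y]))

# Toy labeled data: two health states in a 2-D feature space.
X = [(0.1, 0.2), (0.2, 0.1), (1.9, 2.0), (2.1, 1.8)]
y = ["healthy", "healthy", "degraded", "degraded"]
model = nearest_centroid_train(X, y)
print(nearest_centroid_predict(model, (0.0, 0.0)))  # -> healthy
```

An unsupervised method would instead have to discover the two clusters from X alone, without access to the labels in y.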

improving the robustness and accuracy of health diagnostics. Fusion methods can be classified by their combination strategy: consensus or learning. Examples of noted fusion methods and brief descriptions are given in Table 1 [31]. Fusion methods are applied in a wide variety of research fields, such as the development of committees of neural networks [32,33], meta-modeling for the design of modern engineered systems [34–36], the discovery of regulatory motifs in bioinformatics [37], the detection of traffic incidents [38], the transient identification of nuclear power plants [39], and the development of ensemble Kalman filters [40]. Similarly, for health diagnostics applications, Hu et al. [41] developed an ensemble prognostic system by combining different prognostic member algorithms to predict remaining useful life (RUL) and utilized a k-fold cross-validation process to evaluate the error of each member algorithm; the results from the different algorithms are combined into a single predicted RUL by an optimized weighting process. However, in most existing diagnostic classification fusion systems, multiple classification models are developed by training on multiple training subsets drawn from a single training data set, and the results from the different classification models are combined into a single diagnostic decision (e.g., committees of neural networks [32,33]). Existing research on HS diagnostics does not use a dominant member algorithm for each HS, and thus does not exploit the advantage of each member algorithm in diagnosing its corresponding HS when developing a classification fusion system. Although there have been significant advances in diagnostics and health monitoring, degradation in electronics is more difficult to detect and inspect than in most mechanical systems and structures due to the small scale (micro- to nano-scale) but complex architecture of most electronic products [46].
Insulated gate bipolar transistors (IGBTs) are used in applications such as the switching of automobile and train traction motors and in switch-mode power supplies to regulate DC voltage [47]. The failure of these

Table 1. Examples of noted fusion methods [31].

By consensus:
- Bagging (Breiman [41]): determines a class label by majority voting over multiple classifiers.
- Random forest (Breiman [42]): improves the performance of bagging by combining it with a random feature selection scheme.
- Boosting (Schapire [43]): trains weak classifiers and combines them into a strong classifier.
- AdaBoost (Freund and Schapire [44]): trains each base classifier with a weighted data set, whose weighting coefficients are computed from the classification errors of the previous classifiers, and then aggregates the base classifiers into one classifier.

By learning:
- Rule ensemble (Friedman and Popescu [45]): not only uses a basis function as a base classifier but also includes a rule as a base classifier; since a rule has a simple form, it is easy to understand the influence of rules on predictions and their degree of dependency on each other.

switches can reduce the efficiency of the system and lead to unexpected system failures [47]. Through accurate health diagnostics of critical components such as IGBTs, cost benefits can be achieved by avoiding unscheduled maintenance while improving system safety [47]. Patil et al. [48] applied the Mahalanobis distance (MD) to anomaly detection in IGBTs, monitoring the collector–emitter voltage and collector–emitter current as input parameters to calculate the MD. The MD values obtained from the healthy data were transformed into three-sigma limits and used as a threshold to detect degradation in the IGBTs [48]. Most research on IGBT diagnostics has not considered the classification of the current health condition into different HSs. Early warnings of IGBT failure will aid the user in taking the necessary O&M actions to avoid the unexpected breakdown of IGBTs. Therefore, it is important to identify the current IGBT health condition and assign it to one of the possible HSs of the IGBT unit. This will warn system operators about the early stages of failure, so that appropriate actions can be taken to avoid catastrophic failures induced by the malfunction of IGBTs. Despite the success of HS diagnostics in different applications, there are still problems with handling multiple heterogeneous data, identifying the appropriate precursor parameters to detect the current IGBT health condition, and choosing the appropriate classifier model for IGBT health diagnostics. This paper presents a multi-attribute classification fusion system which leverages the strengths provided by multiple membership classifiers to form a robust classification model for IGBT health diagnostics. The diagnostics system employs a k-fold cross-validation model for the evaluation of membership classifiers and develops a multi-attribute fusion approach based on a weighted majority voting scheme.
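The three-sigma MD thresholding scheme of Patil et al. [48] described above can be sketched as follows. This is a minimal pure-Python illustration with two features standing in for the collector–emitter voltage and current; the healthy baseline values and the degraded test point are hypothetical.

```python
import math

def mean_cov2(data):
    """Sample mean and 2x2 covariance of [(v_ce, i_ce), ...] healthy samples."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis2(p, mean, cov):
    """MD of a 2-D point from the healthy distribution (explicit 2x2 inverse)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return math.sqrt((d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det)

# Hypothetical healthy baseline: a small deterministic spread around (5 V, 2 A).
healthy = [(5.0 + 0.01 * (i % 5), 2.0 + 0.02 * (i % 3)) for i in range(30)]
mean, cov = mean_cov2(healthy)

# Three-sigma threshold on the MDs of the healthy data themselves.
mds = [mahalanobis2(p, mean, cov) for p in healthy]
mu = sum(mds) / len(mds)
sigma = math.sqrt(sum((x - mu) ** 2 for x in mds) / (len(mds) - 1))
threshold = mu + 3 * sigma

# A point far outside the healthy cloud is flagged as degraded.
print(mahalanobis2((6.0, 3.0), mean, cov) > threshold)
```

Note that this yields only a binary healthy/anomalous decision, which is exactly the limitation the paper raises: it does not classify the condition into multiple HSs.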
To apply the classification fusion system to the health diagnostics of IGBTs, an experimental study of IGBT degradation was first carried out for the identification of failure precursor parameters, and state-of-the-art classification techniques were then employed as the member algorithms. The rest of the paper is organized as follows. Section 2 introduces the generic multi-attribute classification fusion system and the weighted majority voting with dominance (WMVD) process. Section 3 discusses the multi-attribute classification member algorithms. Section 4 discusses the health diagnostics of IGBTs. Section 5 demonstrates the developed multi-attribute classification fusion system for IGBT diagnostics, and Section 6 summarizes the presented research and future work.

2. A multi-attribute classification fusion system In this section, a multi-attribute classification fusion system for health diagnostics is developed. Section 2.1 presents a generic framework utilizing a multi-attribute classification fusion system for a health diagnostics system. Section 2.2 details the developed classification fusion systems with k-fold cross-validation for diagnostic accuracy evaluation and weighted majority voting with a dominance approach for classifier fusion.

2.1. Framework for the multi-attribute classification fusion system

As shown in Fig. 1, developing a multi-attribute classification fusion system using classifier models involves three major steps: (1) system familiarization and data preprocessing; (2) the multi-attribute classification fusion system; and (3) online monitoring. In the first step, the specific diagnostic problem and the system HSs of interest are defined. After defining the diagnostic problem, the precursor parameters need to be identified, which are observable and directly or indirectly relevant to the health condition of the system. The data collected for the precursor parameters are preprocessed and categorized into the different predefined HSs. The second part of health diagnostics is the multi-attribute classification fusion system, which consists of three primary steps: fusion formulation, multi-attribute classifier diagnostics using member algorithms, and classifier fusion. The sensory data preprocessed and labeled with known HSs in the first step will be utilized to determine weights for each multi-attribute classifier model based on the classification rate of each classification technique. The testing data set will be divided into its corresponding HSs using the different multi-attribute classification techniques. The results for each set of precursor parameters from the multi-attribute classification techniques are combined into a single output for each set using a classifier fusion process, so that fusion diagnostics results are obtained for each set of precursor parameters. The last step of the health diagnostics process is performing online diagnostics using the trained classifier fusion model, which involves the extraction of real-time data for the precursor parameters and the diagnosis of the current health state using the trained classifier model. As more online data are collected, continuous learning is implemented by updating the initial diagnostic model with the newly collected data.
The fusion process of the diagnostic algorithms is explained in detail in the following subsection.

2.2. Multi-attribute classification fusion

It is essential to develop a robust diagnostic solution that accurately classifies the different HSs using data features extracted from multi-dimensional sensory signals. To construct such a unified health diagnostic framework, this paper develops (i) a k-fold cross-validation (CV) approach to evaluate the error metric associated with a candidate classifier model; and (ii) a WMVD approach for the fusion of multi-attribute classification algorithms. Fig. 2 shows the overall procedure of the developed fusion approach with the k-fold CV and WMVD approaches. This data-driven fusion diagnostic approach is first carried out offline and is composed of two steps: fusion formulation and classifier fusion. The fusion formulation is done with the k-fold CV, and the classification rates of the k folds are computed, where the classification rate is defined as the ratio of the number of data points which are correctly diagnosed into their corresponding HSs to the total number of data points in the data set. k-fold CV is an effective CV process for evaluating the different member algorithms of the classification fusion system


[Figure omitted: flowchart with three stages — System Familiarization and Data Preprocessing (define problem and system HS; identify precursor health parameters; derive different health states), Multi-attribute Classification Fusion System (fusion formulation; multi-attribute classifier diagnostics; classifier fusion; fusion diagnostics results), and Online Monitoring (real-time acquired health parameters; HS diagnostics using the fusion system; continuous monitoring).]

Fig. 1. Framework of classification fusion systems for health diagnostics.

[Figure omitted: flowchart — the dataset feeds fusion formulation (training and testing datasets, k-fold cross-validation, multi-attribute member classifiers, accuracy-based weighting) and classifier fusion (WMVD process), producing the fusion diagnostics results.]

Fig. 2. Flowchart of the classification fusion system for health diagnostics.

based on the classification error. The complete data set is divided into k subsets; k − 1 subsets are utilized for training the classification model, and the trained classification model is tested with the remaining subset. This process is continued until each of the k subsets is used for testing exactly once and for training k − 1 times. The dominant algorithms and the weights of the member algorithms for each HS are determined based on the classification rate. The online diagnostic process combines the HS classifications from all member algorithms to form the fusion diagnostic output, using either the dominant classifiers or accuracy-based weighting with a weighted majority voting process. The computationally expensive training process with multiple algorithms is done offline; therefore, the online classification process with multiple algorithms requires only a relatively small amount of computational effort. In many engineered systems, diagnostic accuracy is treated as more important than computational complexity, since a catastrophic system failure causes greater economic losses than the increased computational effort. Therefore, in cases where the fusion approach considerably improves the diagnostic accuracy over any sole member algorithm, the fusion approach is preferred. In Section 2.2.1, the k-fold CV technique to evaluate diagnostic accuracy is explained first, and the developed WMVD fusion approach is then presented.

2.2.1. k-Fold cross-validation for fusion formulation and evaluation of diagnostic accuracy

The accuracy of the fusion process is determined by the fusion formulation process. The k-fold CV is used in this study to evaluate the accuracy of a given fusion process. Let Y = {y1, y2, . . ., ytot} be a data set consisting of multi-dimensional sensory signals (e.g., acceleration, strain, pressure) from different HSs; it is randomly divided into k mutually exclusive subsets (or folds), Y1, Y2,

. . ., Yk, of approximately equal size r [14]. Of the k subsets, one is used as the test set, and the other k − 1 subsets are put together as the training set. The CV process is performed k times, with each of the k subsets used exactly once as the test set. The important indices used in the proposed WMVD are i, j, m, and l. The index i represents the subset of the k-fold and ranges from 1 to the total number of subsets (k). The index j represents an HS and ranges from 1 to the total number of HSs (n). The index m represents the classifier and ranges from 1 to the total number of classifiers (c). The index l represents the data point in a subset and ranges from 1 to the total number of data points (r) in a subset. The classification decision on the testing data set of the ith subset by each classifier method for the jth HS is represented as CC_{i,j}, as shown in Eq. (1). It is calculated for all k subsets and all n HSs. Similarly, the actual classification decision on the testing data set of the ith subset for the jth HS is represented as AC_{i,j}, as shown in Eq. (2). The value d_{l,m} of CC_{i,j} is equal to one when the lth data point of the ith subset is classified as the jth HS by the mth classifier; otherwise, it is set to zero. Similarly, for AC_{i,j}, the value of a_l is equal to one when the lth data point of the ith subset is actually from the jth HS; otherwise, it is zero.

$$CC_{i,j} = \begin{bmatrix} d_{1,1} & \cdots & d_{1,c} \\ \vdots & \ddots & \vdots \\ d_{r,1} & \cdots & d_{r,c} \end{bmatrix} \qquad (1)$$

$$AC_{i,j} = \begin{bmatrix} a_1 \\ \vdots \\ a_r \end{bmatrix} \qquad (2)$$
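The construction of the CC_{i,j} and AC_{i,j} matrices in Eqs. (1) and (2) can be sketched as follows, for one fold i and one HS j. This is a minimal plain-Python illustration; the two threshold-rule "classifiers" and the data points are hypothetical stand-ins for trained member algorithms.

```python
def build_decision_matrices(test_points, true_hs, classifiers, j):
    """
    For one fold i and one HS j:
    CC_{i,j} is r x c, with d_{l,m} = 1 when classifier m labels point l as HS j
    (Eq. (1)); AC_{i,j} is r x 1, with a_l = 1 when point l actually belongs to
    HS j (Eq. (2)).
    """
    CC = [[1 if clf(x) == j else 0 for clf in classifiers] for x in test_points]
    AC = [1 if y == j else 0 for y in true_hs]
    return CC, AC

# Two hypothetical threshold-rule "classifiers" over a 1-D feature; HS labels 0/1.
clf_a = lambda x: 0 if x < 0.5 else 1
clf_b = lambda x: 0 if x < 0.8 else 1
CC, AC = build_decision_matrices([0.2, 0.6, 0.9], [0, 1, 1], [clf_a, clf_b], 1)
print(CC)  # [[0, 0], [1, 0], [1, 1]]
print(AC)  # [0, 1, 1]
```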

The classification evaluation metric measures the agreement between the HS diagnosed by one classifier and the actual classification result, as shown in Eq. (3). If the classifier-diagnosed result is the same as the actual result, the value of R is set to one; otherwise, it is set to zero.



$$R(CC_{i,j}(l,m), AC_{i,j}(l,1)) = \begin{cases} 1, & \text{if } CC_{i,j}(l,m) = AC_{i,j}(l,1) \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

The evaluation metric is utilized for the determination of a classification index, v_{m,j}, for each classifier and HS. The classification index v_{m,j} is a metric used to determine the weights of each member algorithm in the developed WMVD approach; it is computed as the average classification rate of the mth classifier for the jth HS over all subsets, and can be expressed as:

$$v_{m,j} = \frac{1}{k \cdot r} \sum_{i=1}^{k} \sum_{l=1}^{r} R(CC_{i,j}(l,m), AC_{i,j}(l,1)) \qquad (4)$$
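Eqs. (3) and (4) can be sketched together as follows: for one fixed HS j, the classification index of each classifier is its average agreement with the target decisions over all k folds and r points per fold. The fold contents in this minimal Python illustration are hypothetical.

```python
def classification_index(CC, AC):
    """
    CC[i]: r x c matrix of decisions d_{l,m} for fold i (Eq. (1)).
    AC[i]: length-r target vector of a_l for fold i (Eq. (2)).
    Returns v[m], the average classification rate of classifier m over all
    k folds (Eq. (4)), for one fixed HS j; equal fold sizes are assumed.
    """
    k, r, c = len(CC), len(AC[0]), len(CC[0][0])
    v = [0.0] * c
    for m in range(c):
        total = 0
        for i in range(k):
            for l in range(r):
                # Eq. (3): R = 1 when the decision matches the target.
                total += 1 if CC[i][l][m] == AC[i][l] else 0
        v[m] = total / (k * r)
    return v

# Two folds, two data points each, two classifiers (for one HS j).
CC = [[[1, 0], [1, 1]], [[0, 0], [1, 1]]]
AC = [[1, 1], [0, 1]]
print(classification_index(CC, AC))  # -> [1.0, 0.75]
```

Here classifier 0 agrees with the target on all 4 points, while classifier 1 agrees on 3 of 4.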

The traditional hold-out approach is the basic form of CV, in which the complete data set is divided into a training data set and a testing data set. The classification model is trained using the training data set and tested using the testing data set. The hold-out approach suffers from high classification errors due to the particular selection of training and testing data points from the data set. The classification errors from the k-fold approach are likely to be reduced compared to the traditional hold-out approach, resulting in superior performance when employing a small data set and also when k is increased. However, the disadvantage of k-fold CV against the hold-out method is a greater computational expense, because the training process has to be executed k times. As a commonly used setting for CV, a 10-fold CV is employed in this study.

2.2.2. WMVD for classifier fusion

The classifier fusion process consists of two steps: weight determination and WMVD. The first step determines the weights based on the accuracy-based weighting scheme, and the second step combines the diagnostic solutions of the different classification techniques into a unified robust solution using the WMVD approach. These two steps are detailed in the rest of this subsection.

2.3. Accuracy-based weighting

The accuracy-based weighting scheme is utilized to determine the weights of member algorithms based on the classification rate of each algorithm. The classification rate of the mth member algorithm for detecting the jth HS is quantified by its classification index, as shown in Eq. (4). The weight, wt_{m,j}, of the mth member algorithm for detecting the jth HS can then be defined as the normalization of the corresponding classification index, expressed as:

$$wt_{m,j} = \frac{v_{m,j}}{\sum_{j=1}^{n} \sum_{m=1}^{c} v_{m,j}} \qquad (5)$$
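Eq. (5) above can be sketched in a few lines of Python (the classification-index values here are hypothetical):

```python
def accuracy_weights(v):
    """
    v[m][j]: classification index of classifier m for HS j (Eq. (4)).
    Returns wt[m][j] = v[m][j] / (sum over all m and j of v), per Eq. (5),
    so that all weights sum to one.
    """
    total = sum(sum(row) for row in v)
    return [[x / total for x in row] for row in v]

v = [[0.9, 0.6], [0.3, 0.2]]   # 2 classifiers x 2 HSs
wt = accuracy_weights(v)
print(wt[0][0])  # 0.9 / 2.0 = 0.45
```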

This definition indicates that a larger weight is assigned to a member algorithm with a higher classification rate. Thus, a member algorithm with a better classification rate has a larger influence on the fusion classification. This weighting scheme relies exclusively on the classification rate to determine the weights of member algorithms.

2.4. Weighted majority voting with dominance

A simple voting system for the HS diagnostics obtained from the member algorithms assigns equal weights to the member algorithms, and the current health condition is defined as the HS with the maximum number of votes by the member algorithms. This is acceptable only when the member algorithms provide the same level of accuracy for a given problem. However, it is more likely that a single classifier algorithm is more accurate than the others for detecting one or more HSs, and this accurate classifier can be termed the dominant classifier for the corresponding HSs. The classification rate of the dominant member algorithm in detecting its corresponding dominant HS can be utilized for the fusion process. The dominant classifier for each HS is determined by the maximum classification rate of the member algorithms for the corresponding HS. The HS classification decisions made by the dominant classifier algorithms are the final decisions in the diagnostics process. However, there are some situations in which the dominance rule will not be effective:

(i) When two or more dominant classifiers claim the incoming point as their corresponding dominant HSs.
(ii) When none of the dominant classifiers claims the incoming point as any of its corresponding dominant HSs.

To handle these situations, an accuracy-based weighting process is utilized for determining the weights of each classifier algorithm for each HS, and the WMVD approach is utilized to combine the classifier results. It is ideal to assign a greater weight to the member algorithm with higher prediction accuracy in order to enhance the overall prediction accuracy and robustness. The incoming point classification result of each classifier is determined for each HS, as shown in the following equation:

$$inc_{m,j} = \begin{cases} 1, & \text{if classifier } m \text{ classified the incoming point as the } j\text{th HS} \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

In the WMVD approach, the classification decision inc_{m,j} of the mth classifier for the jth HS of the incoming data point is multiplied by its corresponding weight wt_{m,j} to obtain the weighted classification decision of the mth classifier for the jth HS. The sum of the weighted classification decisions over all classifiers gives the weighted-sum formulation for the jth HS, T_j, as shown in Eq. (7). The jth HS corresponding to the maximum weighted sum, T_j, is identified as the HS of the incoming point.

$$T_j = \sum_{m=1}^{c} wt_{m,j} \cdot inc_{m,j} \qquad (7)$$
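The dominance rule and the weighted-majority-voting fallback of Eqs. (6) and (7) can be sketched together as follows. In this minimal Python illustration, the classifier decisions, weights, and dominant-classifier assignments are hypothetical values, not results from the paper.

```python
def wmvd_fuse(inc, wt, dominant):
    """
    inc[m][j]  : 1 if classifier m labels the incoming point as HS j (Eq. (6)).
    wt[m][j]   : accuracy-based weight of classifier m for HS j (Eq. (5)).
    dominant[j]: index of the dominant classifier for HS j.
    Returns the fused HS index for the incoming point.
    """
    n = len(inc[0])
    # Dominance rule: which dominant classifiers claim their own dominant HS?
    claims = [j for j in range(n) if inc[dominant[j]][j] == 1]
    if len(claims) == 1:
        return claims[0]  # exactly one dominant claim: use it directly
    # Conflict (zero or multiple claims): weighted majority voting, Eq. (7).
    T = [sum(wt[m][j] * inc[m][j] for m in range(len(inc))) for j in range(n)]
    return T.index(max(T))

inc = [[1, 0], [0, 1], [0, 1]]               # 3 classifiers, 2 HSs
wt = [[0.30, 0.10], [0.15, 0.25], [0.05, 0.15]]
dominant = [0, 1]                             # classifier 0 dominates HS 0, etc.
print(wmvd_fuse(inc, wt, dominant))           # both dominants claim -> vote -> 1
```

In this example both dominant classifiers claim their own HS (situation (i) above), so the weighted vote decides: T_0 = 0.30 versus T_1 = 0.25 + 0.15 = 0.40, giving HS 1.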

The stepwise procedure of the developed WMVD approach for health diagnostics is shown as pseudo code in Table 2. It is composed of two important modules: (i) weight determination and (ii) incoming point diagnostics. The first step in the weight determination process is to divide the data set Y randomly into k folds or subsets. The multi-attribute classifier decision CC_{i,j} and the target classifier decision AC_{i,j} for each data point in each subset are calculated for all HSs. The CC_{i,j} and AC_{i,j} values are compared, and the classification index for each HS is calculated for each member algorithm. Based on the classification index for each HS provided by the different member algorithms, the weight values wt_{m,j} are determined by the accuracy of each classification method in detecting each HS. The member algorithm with the maximum classification index for an HS will be the dominant classifier of the corresponding HS. The next process is health diagnostics of the new incoming data point, x_new. The classification results, inc_{m,j}, of each classifier for all HSs are determined first. If a single dominant classifier diagnoses the incoming point as its corresponding dominant HS, then the final decision for the incoming point is made by that dominant classifier algorithm. If either of the conflicting dominance-rule situations discussed above occurs, the incoming point is classified by the accuracy-based weighting process and the weighted majority voting system. With the developed WMVD approach, classification algorithms with any classification rate can be included in the classification fusion process. If a new member algorithm is added to the classification fusion, the dominant member algorithm for an HS will not change unless the added algorithm performs better than the corresponding dominant member algorithm.
When the dominance rule conflicts, the accuracy-based weighting and weighted majority voting approaches are performed to determine the HS. Algorithms with lower classification accuracy for an HS receive a low weight in the majority voting process, and vice versa. Although the classification result of the added algorithm will be included in the weighted majority voting, the fusion result will not be affected unless the added algorithm is one of the highly weighted classifiers for the corresponding HS. Therefore, if more member algorithms are employed in the classification fusion process, the classification rate will not be worse than that of a fusion with fewer member algorithms. Moreover, the fusion


Table 2. Pseudo code for weighted majority voting with dominance.

Input: member algorithms, training data Y, and new incoming data point x_new

(A) Weight determination of each classifier for each HS
1. Divide the data Y randomly into k-fold training and testing data subsets
2. Calculate the multi-attribute classifier decision, CC_{i,j}, and formulate the target classifier decision, AC_{i,j}, for each subset
3. Determine the classification index, v_{m,j}, for each HS by each classifier method using Eq. (4)
4. Determine the weights, wt_{m,j}, for each HS by each classifier method using Eq. (5)
5. Determine the dominant classifier for each HS based on the maximum classification index

(B) HS diagnostics of the incoming point, x_new
1. Calculate the incoming point classification result for each classifier and each HS, inc_{m,j}, using Eq. (6)
2. If the dominance rule of WMVD applies, the dominant classifier determines the incoming point HS
3. Otherwise, classify the incoming point using the accuracy-based weighting process and the weighted majority voting system, calculating T using Eq. (7)

Output: incoming point HS, T

classification rate might be better if the added member algorithm becomes the dominant algorithm for some HSs. The next section discusses the multi-attribute classification techniques used as the member algorithms in the classification fusion system.
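As a concrete illustration, the WMVD decision logic of Table 2 can be sketched in Python. The classifier names, weights, and predictions below are hypothetical placeholders; in the paper the weights come from the accuracy-based weighting of Eq. (5):

```python
# Sketch of the WMVD decision logic in Table 2. Classifier names, weights,
# and predictions are hypothetical placeholders; in the paper the weights
# come from the accuracy-based weighting of Eq. (5).

def wmvd_fuse(weights, dominant, predictions):
    """weights[clf][hs]: accuracy-based weight of classifier clf for state hs.
    dominant[hs]: classifier with the maximum classification index for hs.
    predictions[clf]: HS each member classifier assigns to the new point."""
    # Dominance rule: exactly one dominant classifier votes for its own HS
    claims = [hs for hs, clf in dominant.items() if predictions[clf] == hs]
    if len(claims) == 1:
        return claims[0]
    # Conflict: fall back to accuracy-based weighted majority voting
    score = {}
    for clf, hs in predictions.items():
        score[hs] = score.get(hs, 0.0) + weights[clf][hs]
    return max(score, key=score.get)

weights = {
    "MD":  {"healthy": 0.22, "degrading": 0.23, "failed": 0.23},
    "SVM": {"healthy": 0.23, "degrading": 0.23, "failed": 0.23},
    "BNN": {"healthy": 0.24, "degrading": 0.23, "failed": 0.24},
}
dominant = {"healthy": "BNN", "degrading": "MD", "failed": "BNN"}
preds = {"MD": "degrading", "SVM": "degrading", "BNN": "healthy"}
state = wmvd_fuse(weights, dominant, preds)
```

If exactly one dominant classifier claims its own dominant HS, its vote is final; otherwise the accuracy-weighted vote decides, as in the conflicting-dominance case above.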

3. Member algorithms for classification fusion system

This section discusses the multi-attribute classification techniques used to develop the IGBT multi-attribute classification fusion system. To analyze the efficacy of the proposed classification fusion system, five classification algorithms are selected as representatives of broader categories of machine learning and statistical distance-based classification techniques. Machine learning algorithms can be broadly divided into supervised and unsupervised learning techniques. Supervised learning, the most commonly used classification approach, learns the relationship between the input data points and the target classes through mechanisms such as networks, kernels, and deep learning. BNN, SVM, and DBN are chosen as representatives of network-based, kernel-based, and deep-learning-based classification techniques, respectively. SOM and MD are chosen as representatives of unsupervised learning and statistical distance-based classification techniques, respectively. Each of these classification algorithms is explained in this section, along with its working principles and classification capabilities.

3.1. Back-propagation neural networks

A diagnostic methodology that employs an artificial neural network imitates the neural structure of the human brain. Among the various supervised artificial neural network techniques, the back-propagation neural network (BNN) is the most common. A BNN has a basic neural network structure with three types of layers: an input layer, an output layer, and hidden layers [8]. The size of the input layer depends on the problem dimensionality, and the size of the output layer depends on the number of classification classes. The number of hidden layers and the number of neurons vary with the complexity of the problem. The model input is fed through the input layer of the network and is connected to the hidden layers by synaptic weights. Training the neural network means learning the relationship between the input layer and the output layer by adjusting the weights and bias values of each neuron in the network for each training pattern. The BNN model is trained by optimizing the synaptic weights and biases of all neurons until the maximum number of epochs is reached.
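The weight-and-bias adjustment described above can be illustrated with a deliberately tiny example: one sigmoid neuron trained by gradient descent on a single hypothetical pattern (all values below are made up for illustration):

```python
import math

# Tiny illustration of back-propagation: a single sigmoid neuron trained by
# gradient descent on one hypothetical pattern (all values are made up).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)

def backprop_step(w, b, x, target, lr=0.5):
    out = forward(w, b, x)
    # Gradient of the squared error 0.5*(out - target)^2 through the sigmoid
    delta = (out - target) * out * (1.0 - out)
    w = [w[0] - lr * delta * x[0], w[1] - lr * delta * x[1]]
    b -= lr * delta
    return w, b

x, target = [0.8, 0.2], 1.0
w, b = [0.1, -0.2], 0.0
err_before = abs(forward(w, b, x) - target)
for _ in range(100):
    w, b = backprop_step(w, b, x, target)
err_after = abs(forward(w, b, x) - target)
```

Repeating the update drives the output toward the target; a full BNN applies the same rule layer by layer through the hidden layers.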

3.2. Support vector machine

In contrast to approaches employing a neuron-based learning process, there are kernel-based machine learning techniques, of which the SVM is the most popular for HS diagnostics. SVM is a leading-edge machine learning technique for multi-dimensional classification based on supervised learning. Each input xi is a p-dimensional real vector containing the preprocessed sensory data, and qi is the ith class label (e.g., 0, 1, or 2) indicating the class to which the point xi belongs. With the organized input data, SVM constructs hyper-planes with maximum margins to divide data points with different qi values. A hyper-plane can be written as the set of points x satisfying:

w · x − b = 0    (8)

where the vector w is normal (perpendicular) to the hyper-plane and b determines its offset; the parameter b/||w|| is the offset of the hyper-plane from the origin along the normal vector w. The optimization eventually yields optimized w and b that define the classification margins [22]. Given a set of preprocessed sensory data as input, the diagnostic SVM outputs the corresponding classification classes. The intricate problem in SVM diagnostics is to develop the relationship between the class decision and the input x, so that the target class can be estimated for a given x. For classes that are not linearly separable, the optimization problem can be formulated as in Eq. (9) by including the slack (error) variables ξi and the penalty parameter C; the corresponding separating hyper-plane constraint is given in Eq. (10):

min [ ||w||²/2 + C Σ(i=1..r) ξi ]    (9)

s.t. yi (w · xi − b) ≥ 1 − ξi    (10)
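A minimal sketch of this soft-margin objective, minimized by sub-gradient descent on a hypothetical 2-D toy data set (the paper's actual inputs are the IGBT precursor parameters):

```python
# Minimal sketch of the soft-margin SVM objective of Eqs. (9) and (10),
# minimized by sub-gradient descent on a hypothetical 2-D toy data set;
# the paper's actual inputs are the IGBT precursor parameters.

def svm_subgradient(points, labels, C=10.0, lr=0.01, epochs=200):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (w[0] * x[0] + w[1] * x[1] - b)
            if margin < 1.0:
                # Hinge term active: step along C*y*x and shrink w (regularizer)
                w[0] += lr * (C * y * x[0] - w[0])
                w[1] += lr * (C * y * x[1] - w[1])
                b -= lr * C * y
            else:
                # Only the ||w||^2/2 regularizer contributes
                w[0] -= lr * w[0]
                w[1] -= lr * w[1]
    return w, b

pts = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (4.0, 3.0)]   # toy features
lbls = [-1, -1, 1, 1]                                    # two classes
w, b = svm_subgradient(pts, lbls)
preds = [1 if w[0] * x[0] + w[1] * x[1] - b > 0 else -1 for x in pts]
```

Points that violate the margin constraint of Eq. (10) pull the hyper-plane toward them, while satisfied points only feel the regularizer, which is what produces the maximum-margin separator.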

3.3. Deep belief networks

Deep belief networks (DBNs) employ a multi-layered architecture consisting of one visible layer and multiple hidden layers. The visible layer of the DBN accepts the input data and transfers it to the hidden layers to complete the machine learning process [49]. The DBN structure is similar to a stacked network of Restricted Boltzmann Machines (RBMs) [50]. The overall learning process of the DBN classifier model can be divided into two primary steps: RBM learning and back-propagation learning. The RBM units are trained iteratively with the input training data. Each training epoch consists of two phases: a positive phase and a negative phase. The positive phase transforms the input data from the visible layer to the hidden layer, whereas the negative phase reconstructs the data from the hidden

P. Tamilselvan et al. / Microelectronics Reliability 53 (2013) 1117–1129

layer to the successive visible layer. Eq. (11) denotes the sigmoid transformation of the hidden layer state to the visible layer state.

P(vj = 1 | h) = sigm( bj + Σi hi wi,j )    (11)

where hi is the state of the ith neuron in the hidden layer and vj is the state of the jth neuron in the visible layer. The learning process continues iteratively from lower layers to higher layers until the maximum number of layers has been trained. The HS target information of the training data is then used in the subsequent supervised back-propagation classification training. The trained DBN classifier model can then be utilized for HS classification. Tamilselvan et al. [20] applied DBN-based HS classification, exploiting the advantages of deep machine learning techniques for system health diagnostics.
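The reconstruction step of Eq. (11) can be sketched for a toy layer pair; the hidden states, weights, and biases below are hypothetical:

```python
import math

# Sketch of the negative-phase reconstruction of Eq. (11) for a toy RBM
# layer pair: P(v_j = 1 | h) = sigm(b_j + sum_i h_i * w_ij).
# Hidden states, weights, and biases below are hypothetical.

def sigm(z):
    return 1.0 / (1.0 + math.exp(-z))

def visible_given_hidden(h, W, b):
    # W[i][j] connects hidden unit i to visible unit j; b[j] is a visible bias
    return [sigm(b[j] + sum(h[i] * W[i][j] for i in range(len(h))))
            for j in range(len(b))]

h = [1, 0, 1]                                   # binary hidden states
W = [[0.5, -0.3], [0.2, 0.1], [-0.4, 0.6]]      # 3 hidden x 2 visible weights
b = [0.0, -0.1]                                 # visible biases
p = visible_given_hidden(h, W, b)               # reconstruction probabilities
```

The positive phase applies the mirror-image transformation from visible to hidden states; alternating the two phases constitutes one RBM training epoch.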

3.4. Self-organizing maps

The methodologies discussed above are supervised machine learning processes in which the target HS classes are known. If the user is not familiar with the different health conditions and their functional relationships with the input parameters, an unsupervised learning process helps the user find the possible health conditions of the system, and the algorithm can segregate the data based on those possible health conditions. The SOM is a type of artificial neural network that is trained using unsupervised learning to produce a two-dimensional discretized representation of the input space of the training samples. The SOM uses a neighborhood function to preserve the topological properties of the input space and to determine the unit closest to the input vector [26], which is used to construct class boundaries graphically on a two-dimensional map. The weight vectors of the best matching unit (BMU) and its topological neighbors are fine-tuned to move closer to the input vector space [8]. The learning rule for updating the weight vectors can be expressed as:

wi(t + 1) = wi(t) + α(t) h(nBMU, ni, t) (X − wi(t))    (12)

where wi(t + 1) is the updated weight vector, wi(t) is the weight vector from the previous iteration, α(t) is the monotonically decreasing learning coefficient with 0 < α < 1, and h(nBMU, ni, t) is the neighborhood function, which decreases monotonically with increasing distance between the BMU nBMU and the neuron ni in the lattice. Through this iterative learning, the input data are transformed into different HS clusters, and overlap between clusters is treated as misclassification.

3.5. Mahalanobis distance

Besides the machine learning-based multi-attribute classification methods, which classify input data samples by learning the relationship between input and output, statistical inference can also be used to classify health-relevant input samples into different HSs based on their relative statistical distances. The Mahalanobis distance (MD) classifier is one such technique. In statistics, MD is a distance measure based on the correlations between variables, by which different patterns can be identified and analyzed; it gauges the similarity of an unknown sample set to a known one. The MD measure quantifies the deviation of a measured data point xf from a reference training set with mean μ, and can be calculated as:

D(xf) = sqrt( (xf − μf)^T Sf^(−1) (xf − μf) )    (13)

where xf = (x1, x2, ..., xF) is a multi-dimensional data vector, and μ and S are the mean vector and covariance matrix, respectively, of the reference training data set. MD-based health diagnostics considers the correlation between different input variables and determines the system HS from the minimum MD value of the testing sample relative to training samples from the different HSs. Patil et al. [48,51] used MD for anomaly detection in IGBTs, monitoring collector–emitter voltage and collector–emitter current as input parameters to calculate the MD. Wang et al. [29] employed MD to classify different HSs in the development of sensor networks for health monitoring. Given the variety of multi-attribute classification techniques that could be used for the health diagnostics of IGBTs, different methods have advantages for different applications. The following section presents an experimental study of the IGBT degradation process and the identification of failure precursor parameters for health diagnostics.

4. Experimental study of IGBT failures

4.1. IGBT failure modes

IGBTs are power transistors mainly used in medium- to high-power, low-frequency applications [14], including railway traction motors, wind turbines, electric vehicles, hybrid vehicles, and uninterrupted power supplies. The failure of an IGBT can reduce the efficiency of a system and lead to system failure. Therefore, health diagnostics of IGBTs provides many benefits, such as improved safety, improved reliability, and reduced costs for operation and maintenance. An IGBT combines a metal oxide semiconductor field effect transistor (MOSFET) and a bipolar junction transistor (BJT): its switching characteristics are similar to a MOSFET's, and its high current and voltage capabilities are similar to a BJT's. A schematic representation of an IGBT power module is shown in Fig. 3. The typical IGBT module consists of an IGBT silicon die attached to a direct-bond-copper on ceramic (Al2O3) substrate soldered to a copper base plate, which is in turn attached to a heat sink. The IGBT silicon die is interconnected to the ceramic substrate and the base plate by aluminum wire bond interconnects. The failure locations on the IGBT power module include: (a) the gate oxide in the IGBT die; (b) the wire bonds; (c) the die-attach solder interface between the silicon die and the ceramic substrate; and (d) the solder interface between the ceramic substrate and the base plate. The failure modes of IGBT power modules include short circuits, open circuits, leakage current increases, and loss of gate control (inability to turn off) [53]. Failures are generally due to environmental conditions, such as high temperatures, and operating conditions, such as thermal and electrical stresses. The IGBT power module failure mechanisms include solder fatigue, wire bond and wire flexure fatigue, hot electrons, and time-dependent dielectric breakdown.

Fig. 3. Schematic representation of an IGBT power module [52].

Fig. 4. ICE ON and VCE ON vs. time for N1 and N2.

4.2. IGBT failure precursor parameter identification

The identification of IGBT failure precursor parameters was carried out through accelerated aging with in-situ monitoring of selected parameters, such as collector–emitter current, transistor case temperature, transient and steady-state gate voltages, and collector–emitter voltages. The thermal overstresses were applied within the safe operating area of the current and voltage limits, within which the IGBT can be operated without destructive failure. A detailed description of the IGBT accelerated aging study can be found in [41,47]. IGBT aging was conducted until a loss of gate control was observed. This behavior indicated latch-up, attributed either to a temperature-stimulated parasitic thyristor within the IGBT structure or to secondary breakdown caused by localized hotspots in the device leading to thermal runaway. The transient collector–emitter voltage increased initially during aging and then decreased as die-attach degradation progressed [47]. The degraded die-attach increases the thermal impedance and thus the temperature at the p–n junction above the collector. The increased temperature at this junction raises the intrinsic carrier concentration and lowers the voltage drop at the junction after aging. The increased device temperature due to the degraded die-attach also increases the susceptibility of the IGBT to latch-up. During latch-up, the collector current is not controlled by the gate; if latch-up is not terminated quickly, the device burns out due to excessive power dissipation. The collector current at which latch-up occurs is called the latching current. The magnitude of the collector–emitter current required to induce latch-up decreases as the device temperature increases, so the hotter the device, the more susceptible the IGBT is to latch-up.
Therefore, the precursor parameters for IGBT failure diagnostics are identified as collector–emitter ON voltage (VCE ON), collector–emitter ON current (ICE ON), and case temperature (T). These precursor parameters are used for the development of the IGBT multi-attribute classification fusion system.

4.3. Experimental data

In this case study, two non-punch-through IGBTs, N1 and N2, were tested, with the case temperature varying from 100 °C to 200 °C, i.e., a temperature swing of 100 °C. Each power cycle corresponds to one temperature swing, and there were up to 10,000 power cycles before an IGBT failed. The precursor parameters identified in the previous section, VCE ON, ICE ON, and case temperature T, were collected for both IGBT N1 and N2 until IGBT latch-up. The ICE ON and VCE ON data of IGBTs N1 and N2 are plotted against time in hours in Fig. 4. With increasing total working hours, VCE ON increases while ICE ON decreases. The abrupt change in the ICE ON and VCE ON curves indicates IGBT failure and shows that IGBT N1 reached failure (at nearly 19 h) before IGBT N2 (at nearly 35 h). In Fig. 5, the ICE ON and VCE ON data of IGBT N1 are plotted against the case temperature, varying from 100 °C to 200 °C, in three different time zones (t1, t2, and t3), where t1 = 0–5 working hours, t2 = 5–10 working hours, and t3 = 10–15 working hours. Moving from time zone t1 to t3 increases the VCE ON of the IGBT at the same temperature levels, whereas the ICE ON decreases. Degradation at the wire bond and die interface occurs as a result of temperature cycling, which leads to an increase in VCE ON (Figs. 4 and 5). Based on the collected precursor parameters, classifying the current IGBT health condition into different HSs (normal, degrading, and failed) warns the user about the early stages of IGBT failure. It is therefore essential to classify the IGBT HS based on its current health condition to avoid unexpected breakdown of the IGBT and, thus, total failure of the system. The following section presents the results of applying the developed multi-attribute classification fusion technique to IGBT health diagnostics.

Fig. 5. ICE ON and VCE ON vs. temperature for N1.

5. IGBT diagnostics using the multi-attribute classification fusion system

In this section, the developed multi-attribute classification fusion system is applied to IGBT health diagnostics with the experimental data discussed in Section 4. The multi-attribute classification fusion system unifies the different classification approaches for IGBT health diagnostics based on the classification rate of each approach, and is demonstrated with the IGBT experimental data. The trained fusion model of the multi-attribute classification fusion system can be utilized for real-time condition monitoring of IGBT health. Following the general procedure outlined in Fig. 1, this section presents IGBT health diagnostics using the multi-attribute classification fusion system based on the member algorithms discussed in Section 3. The steps involved in IGBT HS diagnostics are shown in Fig. 6: (1) system familiarization and data processing; (2) multi-attribute classification fusion; and (3) online monitoring. The following subsections detail the development process of the IGBT health diagnostic system.

Fig. 6. Multi-attribute classification for IGBT health diagnostics.

5.1. System familiarization and data preprocessing

The IGBT diagnostic problem is the classification of the IGBT health condition into different HSs based on the precursor parameters. In this step, the observable precursor parameters of the IGBT are identified, and the relevant health conditions of the system are defined. From the testing of IGBTs discussed in Section 4, the precursor parameters for IGBT failure diagnostics are the collector–emitter ON voltage, the collector–emitter ON current, and the case temperature. The different health conditions (healthy, degrading, and failed states) are derived from the abrupt change in the VCE ON and ICE ON data, and the identified precursor parameters are used to categorize the data into the different HSs. The abrupt change for IGBT N1 occurs at 8 h, and the degrading state is defined to extend 3 h before and after it. Thus, for IGBT N1, the healthy HS covers 0–5 h of operation, the degrading state 5–11 h, and the failed state 11–16 h. Similarly, for IGBT N2, the abrupt data change occurs at 14.5 h, and the degrading state is defined to extend 5 h before and after it: the healthy state ranges from 0 to 9.5 h, the degrading state from 9.5 to 19.5 h, and the failed state from 19.5 to 30 h. Fig. 7 shows the partition of the different HSs in the VCE ON–ICE ON time chart for IGBT N1. The data collected for the precursor parameters are categorized into the different predefined HSs as selected features. Because of the difference in testing loading conditions, IGBTs N1 and N2 exhibited different useful lives. Since this study aims to demonstrate the proposed WMVD classification fusion approach for health diagnosis, for simplicity we defined the three HSs evenly over the total useful life of each IGBT. We

Fig. 7. ICE ON and VCE ON vs. time for IGBT N1.

Fig. 8. SOM results for IGBT N1 and N2.

assume that, under generally the same operational loading conditions, an IGBT exhibits similar characteristics in the identified failure precursor parameters; thus, a new test unit operating under the same loading conditions can be classified into the predefined HSs based on the training data collected under those conditions.
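The time-window labeling above can be sketched as a simple thresholding function; the function name and structure are illustrative, with the change times and half-widths taken from the text (8 h ± 3 h for N1, 14.5 h ± 5 h for N2):

```python
# Sketch of the HS labeling described above: the degrading window is centered
# on the abrupt-change time with the stated half-width (3 h for N1, 5 h for
# N2). The function name and structure are illustrative.

def label_hs(hours, change_time, half_width):
    if hours < change_time - half_width:
        return "healthy"
    if hours <= change_time + half_width:
        return "degrading"
    return "failed"
```

Applying this to each monitored data point yields the per-HS training sets used by the member classifiers.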

5.2. Multi-attribute classification fusion system

The multi-attribute classification fusion system is demonstrated on the two non-punch-through IGBTs, N1 and N2, from the experimental study discussed earlier. The IGBT classification fusion system consists of three steps: fusion formulation,

Table 3
Weight values of different member algorithms using accuracy-based weighting.

                     IGBT N1                                IGBT N2
Member      Healthy    Degrading  Failed      Healthy    Degrading  Failed
algorithm   state      state      state       state      state      state
MD          0.2214     0.2344     0.2326      0.2188     0.2470     0.2181
SVM         0.2319     0.2276     0.2336      0.2395     0.2445     0.2142
BNN         0.2352     0.2301     0.2364      0.2242     0.1724     0.2137
DBN         0.2323     0.2238     0.2340      0.2139     0.2398     0.2177
SOM         0.0790     0.0839     0.0631      0.1035     0.0962     0.1364

Table 4
Classification results of member algorithms and the classification fusion system.

IGBT N1
Member      Healthy  Degrading  Failed  Total  Overall classification
algorithm   state    state      state          rate (%)
MD          465      486        490     1441   96.07
SVM         487      472        492     1451   96.73
BNN         494      477        498     1469   97.93
DBN         488      464        493     1445   96.33
SOM         166      174        133     473    31.53
Fusion      495      482        495     1472   98.13

IGBT N2
MD          446      480        499     1425   95.00
SVM         488      475        490     1453   96.87
BNN         457      335        489     1281   85.40
DBN         436      466        498     1400   93.33
SOM         211      187        312     710    47.33
Fusion      481      480        499     1460   97.33
multi-attribute classifier diagnostics, and classifier fusion. In the fusion formulation process, a k-fold CV model is developed with the training and testing data sets of IGBT N1 and IGBT N2. For the CV process, the training data set, with 500 data points from each HS, is divided into 10 data subsets of 50 data points each. Each data subset is used nine times for training and once for testing. The multi-attribute classifier models are developed with the ten sets of training and testing data to classify the HSs as healthy, degrading, and failed. Five classification techniques are used as member algorithms for the development of the IGBT diagnostic system: BNN, SVM, DBN, SOM, and MD. The network architectures of the classifier models used in this IGBT health diagnostic system are as follows. The BNN architecture has three processing layers: input, hidden, and output, with 5, 1, and 3 neurons, respectively, and uses the TanH transfer function. The trained DBN model architecture has three network layers of 50, 50, and 100 neurons. A Gaussian kernel function is used to train the SVM model. The SOM is trained using a 10 × 10 architecture of neurons. In MD-based classification, the HS with the minimum determined MD is the corresponding HS of the data. The IGBT diagnostic model of each classification technique is trained using the training data subset, and the accuracy and efficiency of the trained diagnostic model are validated with the testing data subset. Being an unsupervised learning technique, SOM does not require any prior knowledge about the health conditions of the IGBT system. The multi-dimensional input data is represented as a two-dimensional SOM topology with different clusters/HSs. The SOM classifier diagnostic results for the IGBT N1 and N2 test data are shown in Fig. 8.
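The MD decision rule used here (assign the HS whose reference set yields the minimum Mahalanobis distance, Eq. (13)) can be sketched with hypothetical 2-D reference data standing in for the (VCE ON, ICE ON) features:

```python
# Sketch of MD-based HS classification: assign the HS whose reference set
# yields the minimum Mahalanobis distance (Eq. (13)). The 2-D reference
# samples below are hypothetical stand-ins for (V_CE_ON, I_CE_ON) features.

def mean_and_inv_cov(samples):
    n = len(samples)
    mu = [sum(s[k] for s in samples) / n for k in (0, 1)]
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for s in samples:                      # 2x2 sample covariance
        d = [s[0] - mu[0], s[1] - mu[1]]
        for a in (0, 1):
            for c in (0, 1):
                cov[a][c] += d[a] * d[c] / (n - 1)
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    return mu, inv

def mahalanobis_sq(x, mu, inv):
    d = [x[0] - mu[0], x[1] - mu[1]]
    return sum(d[a] * inv[a][c] * d[c] for a in (0, 1) for c in (0, 1))

reference = {
    "healthy":   [(1.8, 6.6), (1.9, 6.5), (2.0, 6.4), (1.9, 6.6)],
    "degrading": [(2.6, 6.0), (2.7, 5.9), (2.8, 5.8), (2.7, 6.0)],
    "failed":    [(3.4, 5.3), (3.5, 5.2), (3.6, 5.1), (3.5, 5.3)],
}
models = {hs: mean_and_inv_cov(pts) for hs, pts in reference.items()}

def classify(x):
    return min(models, key=lambda hs: mahalanobis_sq(x, *models[hs]))
```

Because the covariance matrix is estimated per HS, the distance accounts for the correlation between the precursor parameters, which is the stated advantage of the MD classifier.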
The BMUs from the input vectors are determined and the different HS clusters are formed in the SOM topology based on the neighborhood distances as discussed in Section 3.4. The developed SOM classifier is tested using the testing data for the classification of different HSs based on the precursor parameters. The x and y

dimensions of Fig. 8 show the architecture of the neurons and determine the number of partitions used to segregate the multi-dimensional data; e.g., the SOM model for the IGBT case study is trained using a 10 × 10 architecture of neurons, giving 100 partitions. Each colored shape in Fig. 8 represents one HS of the IGBTs; HS clusters are formed where shapes of the same color accumulate within a neighborhood. The SOM results for IGBT N1 do not show distinct clusters for each HS of the IGBT. In contrast, the SOM results for IGBT N2 show a somewhat distinct cluster for the failed HS, but the data points of the healthy and degrading states overlap. The multi-attribute classification fusion system is developed using the member algorithms MD, SVM, BNN, DBN, and SOM. The weight of each member algorithm is determined by accuracy-based weighting using Eq. (5) and is listed in Table 3. The classification results of each HS for each member algorithm are shown in Table 4. The diagnostic results in Table 4 show that BNN provides the most accurate diagnostic outcomes for the healthy and failed states of IGBT N1, with correct classification of 494 and 498 out of 500 data points, respectively. Similarly, MD provides a better classification rate for the degrading state of IGBT N1 than the other member algorithms, with correct classification of 486 out of 500 data points. For IGBT N2, SVM provides the most accurate diagnostic outcomes for the healthy state, with correct classification of 488 out of 500 data points, while MD provides the best classification rates for the degrading and failed states, with correct classification of 480 and 499 out of 500 data points, respectively.
From the classification rate for each of the member algorithms, the dominant member algorithm for each HS is determined. BNN is the dominant classifier of the healthy and failed HSs of IGBT N1. BNN can be applied to different


problems with diversified relationships, such as the non-linearity of the different health conditions and precursor parameters and dependencies within different precursor parameters. The IGBT BNN diagnostic classifier model gave a classification rate of 97.93% for IGBT N1 and 85.40% for IGBT N2. While BNN provided high classification results for IGBT N1, the classification rate for IGBT N2 was low due to the inability to learn the complex patterns of the IGBT precursor parameters, as in an under-fitting condition (inadequate training process). SVM is used as the dominant classifier of the IGBT N2 healthy state. The overall classification results of the SVM member algorithm for IGBT N1 and IGBT N2 are 96.73% and 96.87%, respectively. The main reason for the high classification rate is learning the non-linear separable nature of different HSs based on the precursor parameters of the IGBT. DBN supports classification of different health conditions in complex or non-linear separable health conditions based on the precursor parameters and results in classification rates of 96.33% and 93.33% for IGBT N1 and IGBT N2, respectively. Although DBN has a capability for learning the non-linear relationships between the precursor parameters and the different HSs of the IGBT, the classification rate for IGBT N1 is lower compared to MD and BNN, and the IGBT N2 results are lower than with MD and SVM. Among the five member algorithms, SOM did not perform well for the discussed IGBT case study; its classification rates for IGBT N1 and IGBT N2 were 31.53% and 47.33% respectively. MD is the dominant classifier for the degrading state of IGBT N1 and the degrading and failed states of IGBT N2. The overall classification rate for IGBT N1 is 96.07%, and for IGBT N2 it is 95%. The correlation between the multiple IGBT precursor parameters, ICE ON, VCE ON, and case temperature is utilized for analysis and results in a high classification rate of the MD classifier model for IGBT N1. 
MD provides good results for linearly separable data, and its implementation is straightforward, requiring no training process. Despite being the dominant classifier for both the degrading and failed HSs of IGBT N2, MD has its overall classification rate for IGBT N2 held back by its low healthy-state classification; MD cannot handle non-linearly separable relationships between the different HSs and their corresponding precursor parameters. The classification results of the member classifier models show that no single member algorithm performs best for all existing HSs in the IGBT health diagnostic system. To make the classification robust, the strengths of the member algorithms must be combined for both the IGBT N1 and IGBT N2 cases. The WMVD approach is utilized as the classifier fusion process to combine the results of the member algorithms into a unified, robust classification result, based on the dominant classifiers and the accuracy-based weighting process. The dominant classifiers for IGBT N1 are BNN and MD, and for IGBT N2 they are MD and SVM. The WMVD approach processes the individual results of the member algorithms, and the final classification results are then determined. The classification fusion results are listed in Table 4. The classification fusion system achieved an overall classification rate of 98.13% for IGBT N1 and 97.33% for IGBT N2, higher than any member algorithm. For IGBT N1, the classification fusion system provided higher healthy-state classification than the other member algorithms due to the WMVD approach.
Although the classification results of the degrading and failed states are lower than the dominant classifiers, the overall classification rate of the developed classification fusion system is higher than the member algorithms. The IGBT N2 results show that the classification fusion system provided classification rates for both the degrading and

failed states equal to those of the corresponding dominant classifiers. MD is the dominant algorithm for the degrading and failed states of IGBT N2 diagnostics, but it is among the lowest performing algorithms for the healthy state. However, the healthy-state diagnostics of the classification fusion system improved on it, leading to a higher overall classification rate of 97.33%. No individual member algorithm provided a robust classification rate for all HSs. For instance, in IGBT N1 diagnostics, MD is the dominant algorithm for the degrading state but among the lowest performing member algorithms for the healthy state. Similarly, in IGBT N2, MD is the dominant algorithm for the degrading and failed HSs but among the lowest performing algorithms for the healthy HS. Thus the member algorithms do not individually provide a robust classification rate for every HS, which limits their overall classification rates. Although the different algorithms perform well for their dominant HSs, no single member algorithm classifies all the different HSs accurately. The developed classification fusion system overcame these challenges and robustly classified all the existing HSs for both the IGBT N1 and N2 cases, providing better classification results than the member algorithms for determining the health condition of the IGBTs.

6. Conclusions

This paper presented a novel classification fusion approach for health diagnostics with three sequential stages: (i) fusion formulation using a k-fold cross-validation model; (ii) diagnostics using multiple multi-attribute classifiers as member algorithms; and (iii) classification fusion using a weighted majority voting with dominance scheme.
State-of-the-art multi-attribute classification techniques (e.g., supervised learning, unsupervised learning, and statistical inference) were employed as the member algorithms, and the developed approach was demonstrated with IGBT health diagnostics. By combining the classifications of all member algorithms, the classification fusion approach achieves better accuracy in HS classification than any stand-alone member algorithm. Furthermore, it has the inherent flexibility to incorporate any advanced diagnostic algorithm that may be developed. Since the computationally expensive training process is carried out offline and the online prediction process requires little computational effort, the fusion approach is computationally feasible. The classification fusion system can also be applied to structural health diagnostics and to the fusion of different condition monitoring systems. Given the enhanced classification accuracy, the fusion approach enables effective condition-based maintenance and the development of IGBT failure prognostic systems.

Acknowledgments

This research was partially supported by the National Science Foundation (CMMI-1200597), the Kansas NSF EPSCoR program (NSF-0068316), and Wichita State University through the University Research Creative Project Awards (UCRA).
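For completeness, the fusion formulation stage, in which each member algorithm is scored by k-fold cross-validation, can also be sketched. The function name, data layout, and the choice of per-state validation accuracy as the voting weight are illustrative assumptions rather than the paper's exact procedure; the sketch only shows how held-out folds yield per-health-state accuracies for a generic member classifier.

```python
import random
from collections import defaultdict

def kfold_state_accuracies(samples, labels, train_fn, k=5, seed=0):
    """Estimate a member algorithm's per-health-state accuracy by k-fold
    cross-validation; the accuracies can then serve as voting weights.

    samples, labels : training observations and their health-state labels
    train_fn        : callable(train_x, train_y) -> predict(x), standing
                      in for any member classifier's training routine
    """
    # Shuffle indices reproducibly and split them into k folds.
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]

    correct = defaultdict(int)
    total = defaultdict(int)
    for fold in folds:
        test = set(fold)
        train_x = [samples[i] for i in idx if i not in test]
        train_y = [labels[i] for i in idx if i not in test]
        predict = train_fn(train_x, train_y)
        # Score the held-out fold, bucketed by true health state.
        for i in fold:
            total[labels[i]] += 1
            if predict(samples[i]) == labels[i]:
                correct[labels[i]] += 1
    return {s: correct[s] / total[s] for s in total}

# Hypothetical usage with a trivial threshold "classifier" on 1-D data.
samples = list(range(10))
labels = ["healthy" if s < 5 else "failed" for s in samples]
oracle = lambda X, y: (lambda x: "healthy" if x < 5 else "failed")
print(kfold_state_accuracies(samples, labels, oracle, k=5))
```

Because every sample appears in exactly one held-out fold, the returned dictionary covers every health state present in the labels, which is what the voting stage requires of its weight table.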