Microelectronics Reliability 53 (2013) 1117–1129
A multi-attribute classification fusion system for insulated gate bipolar transistor diagnostics

Prasanna Tamilselvan a, Pingfeng Wang a,⇑, Michael Pecht b

a Department of Industrial and Manufacturing Engineering, Wichita State University, Wichita, KS 67208, USA
b Center for Advanced Life Cycle Engineering (CALCE), University of Maryland, College Park, MD 20742, USA

Article info
Article history: Received 8 September 2012; Received in revised form 18 February 2013; Accepted 29 April 2013; Available online 25 May 2013
Abstract

Effective health diagnosis provides benefits such as improved safety, improved reliability, and reduced costs for the operation and maintenance of complex engineered systems. This paper presents a multi-attribute classification fusion system which leverages the strengths provided by multiple membership classifiers to form a robust classification model for insulated gate bipolar transistor (IGBT) health diagnostics. The developed diagnostic system employs a k-fold cross-validation model for the evaluation of membership classifiers, and develops a multi-attribute classification fusion approach based on a weighted majority voting with dominance scheme. An experimental study of IGBT degradation was first carried out for the identification of failure precursor parameters, and classification techniques (e.g., supervised learning, unsupervised learning, and statistical inference) were then employed as the member algorithms for the development of a robust IGBT classification fusion system. In this study, the developed classification fusion model based on multiple member classification algorithms outperformed each stand-alone method for IGBT health diagnostics by providing better diagnostic accuracy and robustness. The developed multi-attribute classification fusion system provides an effective tool for the continuous monitoring of IGBT health conditions and enables the development of IGBT failure prognostics systems.

© 2013 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +1 316 978 5910; fax: +1 316 978 3742. E-mail address: [email protected] (P. Wang).
0026-2714/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.microrel.2013.04.011

1. Introduction

System health state (HS) classification provides benefits such as improved safety, improved reliability, and reduced costs for the operation and maintenance of complex engineered systems. Research on real-time diagnostics and prognostics interprets data acquired by smart sensors and utilizes these data streams in making critical operation and maintenance (O&M) decisions [1]. Maintenance and life-cycle management is one area that will significantly benefit from the improved design and maintenance activities in both the manufacturing and service sectors. Maintenance and life-cycle management activities constitute a large portion of overhead costs in many industries [2]. These costs are likely to increase due to the rising competition in today’s global economy. In the manufacturing and service sectors, unexpected breakdowns are prohibitively expensive, since they immediately result in lost production, failed shipping schedules, and poor customer satisfaction. In order to reduce and possibly eliminate such problems, it is necessary to accurately assess the current state of system degradation through health diagnostics. Research on condition monitoring addressed these challenges by utilizing sensory information from functioning systems and assessing their degradation states. Continuous monitoring of current system HSs notifies users about both early and advanced stages of damage by analyzing the performance degradation of system components [3–5]. Condition monitoring has been successfully applied in bearings [6–9], machine tools [10], transformers [11], engines [12], aircraft wings [13], and turbines.

Due to the complexity of HS classification for different engineered systems, machine learning and statistical inference techniques are often employed to solve diagnostic problems. The machine learning-based health diagnostics methodology can be broadly classified into supervised, semi-supervised, and unsupervised learning techniques. A supervised learning technique learns the relationship between the input values and the desired target value from a set of patterns, each having both an input object and the desired target output. Error values are evaluated and given as feedback to the learning model in order to get a potential solution. The relationship/function learned from the training data is used as a classifier model to predict unlearned and unknown patterns. An unsupervised learning process learns a hidden relationship in the input values without the target labels/outputs (unlabeled data). Unlabeled data generally refers to a training data set of system input variables without the corresponding system outputs, and is often used for unsupervised learning. In contrast, labeled data refers to a paired training data set having both system input variables and
Nomenclature

Acronyms
HS: health state
WMVD: weighted majority voting with dominance
DBN: deep belief networks
BNN: back-propagation neural network
GA: genetic algorithm
SVM: support vector machine
SOM: self-organizing map
MD: Mahalanobis distance
RBM: restricted Boltzmann machine
RUL: remaining useful life
IGBT: insulated gate bipolar transistor
MOSFET: metal oxide semiconductor field effect transistor
BJT: bipolar junction transistor
VCE: collector–emitter voltage
ICE: collector–emitter current
T: case temperature

Notation
k: total number of folds used for fusion formulation, where 1 ≤ i ≤ k
n: total number of HSs, where 1 ≤ j ≤ n
r: total number of training data points in each fold, where 1 ≤ l ≤ r
c: total number of classifier methods used in fusion process, where 1 ≤ m ≤ c
corresponding system outputs, which is often used for the supervised learning. A semi-supervised learning technique learns the relationship between the input values by utilizing both labeled (input values with target outputs) and unlabeled data (input values without target outputs). Besides the different types of machine learning-based diagnostic algorithms, statistical distance-based algorithms can also be used for classifying different HSs based on their statistical distances, such as Euclidean distance and Mahalanobis distance. Significant advancements in the diagnostics area have been achieved by applying classification techniques based on machine learning or statistical inferences, resulting in a number of classification methods [14], such as back-propagation neural networks (BNNs) [15–18], deep belief networks (DBNs) [19,20], support vector machines (SVMs) [21–25], self-organizing maps (SOMs) [26], genetic algorithms [27,28], and Mahalanobis distance (MD) [29,14]. Despite successful applications of different diagnostic algorithms in various engineering fields, a challenge for health diagnostics is that the implicit relationship between different system HSs and features of sensory signals makes it difficult to develop a generic health diagnostics algorithm. Furthermore, there are many factors that influence the efficacy of diagnostic systems, such as (i) the dependency of the algorithm’s accuracy on the number of data points in a training data set; (ii) the significant variability in manufacturing conditions and large uncertainties in environmental and operational conditions; and (iii) the sensory signal relationships with different HSs (e.g., linear, non-linear). Therefore, no single diagnostic classifier works well for all possible situations. Instead of using an individual diagnostic algorithm, it would be beneficial to leverage the strengths of different algorithms to form a robust unified algorithm [30]. 
A classification fusion system is an algorithm in which results from different individual diagnostic algorithms are combined into a single diagnostic decision, thereby
Notation (continued)
CCi,j: multi-attribute classifier decision matrix of jth HS for ith fold
dl,m: classification decision of mth classifier for lth training data point
ACi,j: target classification decision matrix of jth HS for ith fold
al: target classification decision of lth training data point
wtm,j: weight value of mth classifier in classifying jth HS
incm,j: classification decision of mth classifier as jth HS of the incoming data point
T: classification fusion decision of the incoming data point
xi: p-dimensional vector
qi: ith class label
w: normal vector of the hyper-plane
b: bias of the hyper-plane
μj: mean vector of the training data
Sj: variance matrix of the training data
ξi: slack variable
C: penalty parameter
wi,j: synaptic weight between the ith and the jth neurons
ui: state of the ith neuron
P(): probability distribution function
bi: bias of the ith neuron
hi: state of the ith neuron in hidden layer
vi: state of the ith neuron in visible layer
h(): neighborhood function
α(t): learning coefficient
improving the robustness and accuracy of health diagnostics. The fusion methods can be classified by their combination strategy: consensus or learning. Examples of noted fusion methods and brief descriptions are in Table 1 [31]. The fusion method is applied in a wide variety of research fields, such as in the development of committees of neural networks [32,33], meta-modeling for the design of modern engineered systems [34–36], discovery of regulatory motifs in bioinformatics [37], detection of traffic incidents [38], transient identification of nuclear power plants [39], and development of ensemble Kalman filters [40]. Similar to health diagnostics applications, Hu et al. [41] developed an ensemble prognostic system by combining different prognostic member algorithms to predict remaining useful life (RUL) and utilized a k-fold cross-validation process to evaluate the error of each member algorithm. The results from different algorithms are combined into a single predicted RUL by an optimized weighting process. However, in most of the existing diagnostic classification fusion systems, multiple classification models are developed by training on multiple training subsets drawn from a single training data set, and the results from the different classification models are combined into a single diagnostic decision (committees of neural networks [32,33]). Existing research on HS diagnostics does not use a dominant member algorithm for each HS, and therefore does not exploit the advantage of each member algorithm in diagnosing the corresponding HS when developing a classification fusion system.

Although there have been significant advances in diagnostics and health monitoring, degradation in electronics is more difficult to detect and inspect than in most mechanical systems and structures due to the small scale (micro- to nano-scale) but complex architecture of most electronic products [46].
Insulated gate bipolar transistors (IGBTs) are used in applications, such as the switching of automobile and train traction motors and in switch mode power supplies, to regulate DC voltage [47]. The failure of these
Table 1. Examples of noted fusion methods.
Combining strategy | Fusion method | Description | Reference
By consensus | Bagging | Bagging determines a class label with majority voting by multiple classifiers | Breiman [41]
By consensus | Random Forest | Random forest improves the performance of bagging by combining with random feature selection scheme | Breiman [42]
By learning | Boosting | Boosting trains weak classifiers and combines them into a strong classifier | Schapire [43]
By learning | Adaboost | Adaboost trains each base classifier with a weighted data set and its weighting coefficients are computed from classification errors by the previous classifiers and then aggregates the base classifiers into one classifier | Freund and Schapire [44]
By learning | Rule ensemble | Rule Ensemble not only uses a basis function as a base classifier, it also includes a rule as a base classifier. As the rule has a simple form, it is easy to understand the influences of rules on predictions and the degree of dependency on each other | Friedman and Popescu [45]
switches can reduce the efficiency of the system and lead to unexpected system failures [47]. Through accurate health diagnostics on critical components such as IGBTs, cost benefits can be achieved by avoiding unscheduled maintenance while improving system safety [47]. Patil et al. [48] applied Mahalanobis distance (MD) for anomaly detection for IGBTs and monitored collector–emitter voltage and collector–emitter currents as input parameters to calculate the MD. The MD values obtained from the healthy data were transformed into three sigma limits and used as a threshold to detect degradation in the IGBTs [48]. Most research on IGBT diagnostics has not considered the classification of the current health condition into different HSs. Early warnings of IGBT failure will aid the user in taking necessary O&M actions to avoid the unexpected breakdown of IGBTs. Therefore, it is important to identify the current IGBT health condition and assign it to one of the possible HSs of the IGBT unit. This will warn system operators about the early stages of failure, so that appropriate actions can be taken to avoid catastrophic failures induced by the malfunction of IGBTs. Despite the success of HS diagnostics in different applications, there are still problems with handling multiple heterogeneous data, identifying the appropriate precursor parameters to detect the current IGBT health condition, and choosing the appropriate classifier model for IGBT health diagnostics. This paper presents a multi-attribute classification fusion system which leverages the strengths provided by multiple membership classifiers to form a robust classification model for IGBT health diagnostics. The diagnostics system employs a k-fold cross-validation model for the evaluation of membership classifiers and develops a multi-attribute fusion approach based on a weighted majority voting scheme. 
To apply the classification fusion system to the health diagnostics of IGBTs, an experimental study of IGBT degradation was first carried out for the identification of failure precursor parameters, and state-of-the-art classification techniques were then employed as the member algorithms. The rest of the paper is organized as follows. Section 2 introduces the generic multi-attribute classification fusion system and the weighted majority voting with dominance (WMVD) process. Section 3 discusses the multi-attribute classification member algorithms. Section 4 discusses the health diagnostics of IGBTs. Section 5 demonstrates the developed multi-attribute classification fusion system for IGBT diagnostics, and Section 6 summarizes the presented research and future work.
2. A multi-attribute classification fusion system

In this section, a multi-attribute classification fusion system for health diagnostics is developed. Section 2.1 presents a generic framework utilizing a multi-attribute classification fusion system for a health diagnostics system. Section 2.2 details the developed classification fusion system, with k-fold cross-validation for diagnostic accuracy evaluation and a weighted majority voting with dominance (WMVD) approach for classifier fusion.
2.1. Framework for the multi-attribute classification fusion system

As shown in Fig. 1, developing a multi-attribute classification fusion system using classifier models involves three major steps: (1) system familiarization and data preprocessing; (2) multi-attribute classification fusion system; and (3) online monitoring. In the first step, the specific diagnostic problem and system HS of interest are defined. After defining the diagnostic problem, the precursor parameters need to be identified; these are observable parameters that are directly or indirectly relevant to the health condition of the system. The data collected for the precursor parameters will be preprocessed and categorized into different predefined HSs. The second part of health diagnostics is the multi-attribute classification fusion system, which consists of three primary steps: fusion formulation, multi-attribute classifier diagnostics using member algorithms, and classifier fusion. The sensory data with a known HS, preprocessed in the first step, will be utilized to determine weights for each multi-attribute classifier model based on the classification rate of each classification technique. The testing data set will be divided into its corresponding HSs using different multi-attribute classification techniques. The results for each set of precursor parameters from the multi-attribute classification techniques are combined into a single output for each set using a classifier fusion process. Therefore, fusion diagnostics results are obtained for each set of precursor parameters. The last step of the health diagnostics process is performing online diagnostics using the trained classifier fusion model, which involves the extraction of real-time data for the precursor parameters and diagnosis of the current health state using the trained classifier model. As more online data are collected, continuous learning is implemented by updating the initial diagnostic model with the newly collected data.
The fusion process of diagnostic algorithms is explained in detail in the following subsection.

2.2. Multi-attribute classification fusion

It is essential to develop a robust diagnostic solution that accurately classifies the different HSs using data features extracted from multi-dimensional sensory signals. To construct such a unified health diagnostic framework, this paper develops (i) a k-fold cross-validation (CV) approach to evaluate the error metric associated with a candidate classifier model; and (ii) a WMVD approach for the fusion of multi-attribute classification algorithms. Fig. 2 shows the overall procedure of the developed fusion approach with the k-fold CV and WMVD approaches. This data-driven fusion diagnostic approach is first carried out offline and is composed of two steps: fusion formulation and classifier fusion. The fusion formulation is done with the k-fold CV, and the classification rates of the k folds are computed, where the classification rate is defined as the ratio of the number of data points which are correctly diagnosed into their corresponding HSs to the total number of data points in the data set. k-fold CV is an effective CV process for evaluating different member algorithms of the classification fusion system
[Fig. 1 flowchart. Stages and box labels: System Familiarization and Data Preprocessing (Define Problem and System HS; Identify precursor health parameters; Derive different health states); Multi-attribute Classification Fusion System (Fusion Formulation; Multi-attribute Classifier Diagnostics; Classifier Fusion); Online Monitoring (Real time acquired health parameters; HS diagnostics using Fusion System; Fusion Diagnostics Results; Continuous Monitoring).]
Fig. 1. Framework of classification fusion systems for health diagnostics.

[Fig. 2 flowchart. Box labels: Multi-attribute Classification Fusion System; Dataset; Fusion Formulation; Training & Testing Dataset; K-fold Cross Validation; Multi-attribute Member Classifiers; Classifier Fusion; Accuracy Based Weighting; WMVD Process; Fusion Diagnostics Results.]
Fig. 2. Flowchart of classification fusion system for health diagnostics.
based on the classification error. The complete data set is divided into k subsets; k − 1 subsets are utilized for training the classification model, and the trained classification model is tested with the remaining subset. This process is continued until each of the k subsets has been used for testing exactly once and for training k − 1 times. The dominant algorithms and weights of member algorithms for each HS are determined based on the classification rate. The online diagnostic process combines the HS classifications from all member algorithms to form the fusion diagnostic output using the dominant classifiers or using accuracy-based weighting and a weighted majority voting process. The computationally expensive training process with multiple algorithms is done offline; therefore, the online classification process with multiple algorithms requires only a relatively small amount of computational effort. In many engineered systems, the diagnostic accuracy is treated as more important than the computational complexity, since a catastrophic system failure causes more economic losses than the increased computational efforts. Therefore, in cases where the fusion approach considerably improves the diagnostic accuracy over any sole member algorithm, the fusion approach is preferred. In the following, the k-fold CV technique used to evaluate diagnostic accuracy is explained first (Section 2.2.1), and the developed WMVD fusion approach is then presented (Section 2.2.2).

2.2.1. k-Fold cross validation for fusion formulation and evaluation of diagnostic accuracy

The accuracy of the fusion process is determined by the fusion formulation process. The k-fold CV is used in this study to evaluate the accuracy of a given fusion process. Let Y = {y1, y2, . . ., ytot} be a data set consisting of multi-dimensional sensory signals (e.g., acceleration, strain, pressure) from different HSs, and let it be randomly divided into k mutually exclusive subsets (or folds), Y1, Y2, . . ., Yk, having approximately equal size r [14]. Of the k subsets, one is used as the test set and the other k − 1 subsets are put together as a training set. The CV process is performed k times, with each of the k subsets used exactly once as the test set. The important indexes used in the proposed WMVD are i, j, m and l. The index i represents the subset of the k-fold and it ranges from 1 to the total number of subsets (k). The index j represents an HS and it ranges from 1 to the total number of HSs (n). The index m represents the classifier and it ranges from 1 to the total number of classifiers (c). The index l represents the data point in a subset and it ranges from 1 to the total number of data points (r) in a subset. The classification decision of the testing data set of the ith subset by each classifier method for the jth HS is represented as CCi,j, as shown in Eq. (1). It is calculated for all k subsets and all n HSs. Similarly, the actual classification decision of the testing data set of the ith subset for the jth HS is represented as ACi,j, in Eq. (2). The value dl,m of CCi,j is equal to one when the lth data point of the ith subset is classified as the jth HS by the mth classifier; otherwise, it is set to zero. Similarly, for ACi,j, the value of al is equal to one when the lth data point of the ith subset is actually from the jth HS; otherwise, it is zero.
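\[
CC_{i,j} = \begin{bmatrix} d_{1,1} & \cdots & d_{1,c} \\ \vdots & \ddots & \vdots \\ d_{r,1} & \cdots & d_{r,c} \end{bmatrix} \tag{1}
\]

\[
AC_{i,j} = \begin{bmatrix} a_{1} \\ \vdots \\ a_{r} \end{bmatrix} \tag{2}
\]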
The classification evaluation metric, R, compares the decision of one classifier for one HS against the actual classification result, as shown in Eq. (3). If the classifier-diagnosed result is the same as the actual result, then the R value is set to one; otherwise, it is set to zero.
\[
R\big(CC_{i,j}(l,m),\, AC_{i,j}(l,1)\big) = \begin{cases} 1, & \text{if } CC_{i,j}(l,m) = AC_{i,j}(l,1) \\ 0, & \text{otherwise} \end{cases} \tag{3}
\]
The evaluation metric is utilized for the determination of a classification index, vm,j, for each classifier and HS. The classification index (vm,j) is a metric used to determine the weights of each member algorithm in the developed WMVD approach, which is computed as the average classification rate of the mth classifier for the jth HS over all subsets, and it can be expressed as:
\[
v_{m,j} = \frac{1}{k \cdot r} \sum_{i=1}^{k} \left[ \sum_{l=1}^{r} R\big(CC_{i,j}(l,m),\, AC_{i,j}(l,1)\big) \right] \tag{4}
\]
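Eqs. (1) to (4) can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names `kfold_indices` and `classification_index`, the array layout, and the toy labels are assumptions made for this example. The predictions from all k test folds are assumed to be stacked into one array, so the mean over all tested points plays the role of the 1/(k·r) factor when every fold has size r.

```python
import numpy as np

def kfold_indices(n_points, k, seed=0):
    """Randomly split the indices 0..n_points-1 into k mutually exclusive folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_points), k)

def classification_index(pred, actual, n_states):
    """Return v with shape (c, n); v[m, j] is the classification index of
    classifier m for health state j.

    pred   : (N, c) integer HS labels predicted by each of c classifiers for
             the N tested points (all test folds stacked together).
    actual : (N,) true integer HS labels of the same points.
    """
    pred = np.asarray(pred)
    actual = np.asarray(actual)
    states = np.arange(n_states)
    cc = pred[:, :, None] == states        # d_{l,m} for every HS j, shape (N, c, n)
    ac = actual[:, None, None] == states   # a_l for every HS j, shape (N, 1, n)
    r_metric = cc == ac                    # Eq. (3): 1 where decisions agree
    return r_metric.mean(axis=0)           # Eq. (4): average over all tested points

# Hypothetical toy data: 6 tested points, 3 health states, 2 classifiers.
actual = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([[0, 0], [0, 1], [1, 1], [1, 1], [2, 2], [2, 0]])
v = classification_index(pred, actual, n_states=3)
folds = kfold_indices(len(actual), k=3)
print(v)
```

In this toy case the first classifier agrees with the target decisions for every HS, while the second one makes two errors, which lowers its index for each HS those errors touch.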
The traditional hold-out approach is the basic kind of CV in which the complete data set is divided into a training data set and a testing data set. The classification model is trained using
the training data set and tested using the testing data set. The traditional hold-out approach can have high classification errors, because the result depends on which data points are selected for training and testing. The classification errors from the k-fold approach are likely to be lower than those of the traditional hold-out approach, resulting in superior performance when a small data set is employed and also when k is increased. However, it is important to note that the disadvantage of the k-fold CV against the hold-out method is a greater computational expense, because the training process has to be executed k times. As a commonly used setting for CV, a 10-fold CV is employed in this study.

2.2.2. WMVD for classifier fusion

The classifier fusion process consists of two steps: weight determination and WMVD. The first step determines the weight based on the accuracy-based weighting scheme, and the second step combines the diagnostic solutions of different classification techniques into a unified robust solution using the WMVD approach. These two steps are detailed in the rest of this subsection.

2.3. Accuracy-based weighting

The accuracy-based weighting scheme is utilized to determine the weights of member algorithms based on the classification rate of each algorithm. The classification rate of the mth member algorithm for detecting the jth HS is quantified by its classification index, as shown in Eq. (4). The weight, wtm,j, of the mth member algorithm for detecting the jth HS can then be defined as the normalization of the corresponding classification index, expressed as:
\[
wt_{m,j} = \frac{v_{m,j}}{\sum_{j=1}^{n} \sum_{m=1}^{c} v_{m,j}} \tag{5}
\]
This definition indicates that a larger weight is assigned to a member algorithm with a higher classification rate. Thus, a member algorithm with a better classification rate has a larger influence on the fusion classification. This weighting scheme relies exclusively on the classification rate to determine the weights of member algorithms.

2.4. Weighted majority voting with dominance

A simple voting system for HS diagnostics assigns equal weights to the member algorithms, and the current health condition is defined as the HS with the maximum number of votes from the member algorithms. This is acceptable only when the member algorithms provide the same level of accuracy for a given problem. However, it is more likely that a single classifier algorithm is more accurate than the others for detecting one or more HSs, and this classifier can be termed the dominant classifier for the corresponding HSs. The classification rate of the dominant member algorithm in detecting its corresponding dominant HS can be utilized for the fusion process. The dominant classifier for each HS is the member algorithm with the maximum classification rate for that HS. The HS classification decisions made by the dominant classifier algorithms are the final decisions in the diagnostics process. However, there are some situations in which the dominance rule will not be effective:
(i) When two or more dominant classifiers claim the incoming point as their corresponding dominant HSs.
(ii) When none of the dominant classifiers assigns the incoming point to any of their corresponding dominant HSs.
To handle these situations, an accuracy-based weighting process is utilized for determining the weights of each classifier algorithm for each HS, and the WMVD approach is utilized to combine the classifier results. It is ideal to assign a greater weight to the member algorithm with higher prediction accuracy in order to enhance the overall prediction accuracy and robustness. The incoming point classification result of each classifier is determined for each HS, as shown in the following equation:
\[
inc_{m,j} = \begin{cases} 1, & \text{if classifier } m \text{ classified the incoming point as the } j\text{th HS} \\ 0, & \text{otherwise} \end{cases} \tag{6}
\]
In the WMVD approach, the classification decision of mth classifier as jth HS of the incoming data point (incm,j) is multiplied with its corresponding weight value of mth classifier for classifying as jth HS (wtm,j) to determine the weighted classification decision of mth classifier for classifying as jth HS. The sum of weighted classification decision for all classifiers for classifying as jth HS is determined as the weighted-sum formulation for jth HS, Tj, as shown in Eq. (7). The jth HS corresponding to the maximum value of weighted sum, Tj, will be identified as the HS of the incoming point.
\[
T_{j} = \sum_{m=1}^{c} wt_{m,j}\, inc_{m,j} \tag{7}
\]
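As a small worked illustration of Eqs. (5) to (7), the following sketch normalizes a set of classification indices into weights and forms the weighted sum for one incoming point. The function names `accuracy_weights` and `weighted_votes` and all numeric values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def accuracy_weights(v):
    """Eq. (5): divide each classification index v[m, j] by the sum of all indices."""
    v = np.asarray(v, dtype=float)
    return v / v.sum()

def weighted_votes(wt, predicted_states):
    """Eqs. (6)-(7): T[j] is the sum over classifiers m of wt[m, j] * inc[m, j]."""
    c, n = wt.shape
    inc = np.zeros((c, n))
    inc[np.arange(c), predicted_states] = 1.0   # Eq. (6): one vote per classifier
    return (wt * inc).sum(axis=0)               # Eq. (7)

# Hypothetical classification indices for 3 classifiers (rows) and 2 HSs (columns).
v = np.array([[0.9, 0.6],
              [0.7, 0.8],
              [0.5, 0.7]])
wt = accuracy_weights(v)

# Classifier 0 votes for HS 0; classifiers 1 and 2 vote for HS 1.
T = weighted_votes(wt, [0, 1, 1])
print(T, "-> HS", int(T.argmax()))
```

Here the indices sum to 4.2, so T for HS 0 is 0.9/4.2 and T for HS 1 is (0.8 + 0.7)/4.2, and the incoming point is assigned to HS 1.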
The stepwise procedure of the developed WMVD approach for health diagnostics is shown as pseudo code in Table 2. It is composed of two important modules: (i) weight determination and (ii) incoming point diagnostics. The first step in the weight determination process is to divide the data set Y randomly into k number of folds or subsets. The multi-attribute classifier decision CCi,j and the target classifier decision ACi,j for each data point in each subset are calculated for all HSs. The CCi,j and ACi,j values are compared, and the classification index for each HS is calculated for each member algorithm. Based on the classification index for each HS provided by the different member algorithms, the wtm,j weight values are determined by the accuracy of each classification method in detecting each HS. The member algorithm with the maximum classification index for an HS will be the dominant classifier of the corresponding HS. The next process is health diagnostics of the new incoming data point, xnew. The classification results, incm,j, for each classifier for all HSs are determined first. If a single dominant classifier diagnoses the incoming point as its corresponding dominant HS, then the final decision for the incoming point is made based on the dominant classifier algorithm. If any of the conflicting dominance rule situations occur, as discussed above, the incoming point is then classified by the accuracy-based weighting process and weighted majority voting system. With the developed WMVD approach, classification algorithms with any classification rate could be included for the classification fusion process. If a new member algorithm is added to the classification fusion, then the dominant member algorithm for the HS will not be changed unless the added algorithm is performing better than the corresponding dominant member algorithm. 
When the dominance rule is in conflict, the accuracy-based weighting and weighted majority voting approaches are used to determine the HS. Algorithms with lower classification accuracy for an HS receive lower weights in the majority voting process, and vice versa. Although the classification result of the added algorithm will be included in the weighted majority voting approach, the fusion result will not be affected unless the added algorithm is one of the highly weighted classifiers for the corresponding HS. Therefore, if more member algorithms are employed in the classification fusion process, the classification rate will not be worse than that of the fusion with fewer member algorithms. Moreover, the fusion
Table 2. Pseudo code for weighted majority voting with dominance.
Input: Member algorithms, training data Y, and new incoming data xnew
Loop:
(A) Weight determination of each classifier for each HS
  1. Divide the data Y randomly into k-fold training and testing data subsets
  2. Calculate the multi-attribute classifier decision, CCi,j, and formulate the target classifier decision, ACi,j, for each subset
  3. Determine the classification index, vm,j, for each HS by each classifier method using Eq. (4)
  4. Determine weights, wtm,j, for each HS by each classifier method using Eq. (5)
  5. Determine the dominant classifiers for each HS based on the maximum classification index
(B) HS diagnostics of incoming point, xnew
  1. Calculate the incoming point classification result for each classifier and each HS, incm,j, using Eq. (6)
  2. If the dominance rule of WMVD applies, then the dominant classifier determines the incoming point HS
  3. Otherwise, the incoming point is classified based on the accuracy-based weighting process and the weighted majority voting system, with T calculated using Eq. (7)
Output: Incoming point HS, T
classification rate might be better if the added member algorithm becomes the dominant algorithm for some HSs. The next section discusses the multi-attribute classification techniques used as the member algorithms in the classification fusion system.
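The two modules of Table 2 can be illustrated with a short Python sketch. It is not the authors' implementation: Eqs. (4)-(7) are not reproduced in this part of the paper, so the sketch assumes that the weight of a classifier for an HS is its count of correct classifications normalized over all member algorithms, and that the dominance rule applies only when exactly one HS is claimed by its own dominant classifier. The function names and the small count matrix are hypothetical.

```python
import numpy as np

def accuracy_weights(correct_counts):
    """Normalize per-HS correct counts so that each HS column sums to 1.

    correct_counts[m, j] is the number of validation points of HS j that
    member algorithm m classified correctly (an assumed stand-in for the
    classification index of Eq. (4)).
    """
    counts = np.asarray(correct_counts, dtype=float)
    return counts / counts.sum(axis=0, keepdims=True)

def wmvd_predict(member_preds, correct_counts):
    """Fuse the member predictions for one incoming point.

    member_preds[m] is the HS index predicted by member algorithm m.
    Returns (fused HS index, True if the dominance rule decided).
    """
    counts = np.asarray(correct_counts, dtype=float)
    preds = np.asarray(member_preds)
    n_members, n_states = counts.shape
    dominant = counts.argmax(axis=0)  # dominant classifier for each HS
    # HSs that are predicted by their own dominant classifier
    claimed = [j for j in range(n_states) if preds[dominant[j]] == j]
    if len(claimed) == 1:
        return claimed[0], True
    # Conflicting or missing dominance: weighted majority vote (assumed form)
    weights = accuracy_weights(counts)
    scores = np.zeros(n_states)
    for m in range(n_members):
        scores[preds[m]] += weights[m, preds[m]]
    return int(scores.argmax()), False

# Hypothetical counts for three member algorithms (rows) and three HSs (columns)
counts = np.array([[90, 70, 80],
                   [85, 95, 75],
                   [60, 65, 99]])
print(wmvd_predict([0, 0, 1], counts))  # one dominant claim: dominance rule
print(wmvd_predict([0, 1, 0], counts))  # two dominant claims: weighted vote
```

In the second call, two dominant classifiers claim different HSs, so the decision falls back to the weighted vote, which is the behavior described for conflicting dominance situations.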
3. Member algorithms for classification fusion system
This section discusses several multi-attribute classification techniques that will be used to develop the IGBT multi-attribute classification fusion system. To analyze the efficacy of the proposed classification fusion system, five different classification algorithms are selected as representatives of broader categories of machine learning and statistical distance based classification techniques. The algorithms for machine learning can be broadly classified as supervised and unsupervised learning techniques. Supervised learning is a commonly used classification technique that learns the relationship between the input data points and the target classes based on mechanisms such as network, kernel, and deep learning. BNN, SVM, and DBN are chosen as representatives of network based, kernel based, and deep learning based classification techniques, respectively. SOM and MD are chosen as representatives of unsupervised learning techniques and statistical distance based classification techniques, respectively. Each of these classification algorithms is explained with its working principles and classification capabilities in this section.
3.1. Back-propagation neural networks
Diagnostic methodology that employs an artificial neural network imitates a human brain’s neural network. There are different types of supervised artificial neural network techniques available, among which back-propagation neural network (BNN) is the most common. BNN is a supervised learning technique with a basic neural network structure with three types of layers: input layer, output layer, and hidden layers [8]. The size of the input layer depends on the problem dimensionality, and the output layer size depends on the number of classification classes of the problem. The number of hidden layers and the number of neurons vary based on the complexity of the problem. The model input is fed through the input layer of the network and is connected to hidden layers by synaptic weights. The training of the neural network is to learn the relationship between the input layer and the output layer through adjusting the weights and bias values of each neuron in the network for each training pattern. The BNN model is trained by optimizing the synaptic weights and biases of all neurons until the maximum number of epochs is reached.
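The training loop described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: a single hidden layer with tanh activation, a softmax output with cross-entropy loss, full-batch gradient descent, and synthetic three-class data. The layer sizes, learning rate, and function names are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row maximum for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_bnn(X, labels, n_hidden=5, n_classes=3, epochs=500, lr=0.1, seed=0):
    """Train a one-hidden-layer network by full-batch back-propagation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=(n_hidden, n_classes))
    b2 = np.zeros(n_classes)
    targets = np.eye(n_classes)[labels]  # one-hot class targets
    losses = []
    for _ in range(epochs):  # stop when the epoch limit is reached
        hidden = np.tanh(X @ W1 + b1)  # forward pass
        probs = softmax(hidden @ W2 + b2)
        losses.append(-np.mean(np.log(probs[np.arange(n), labels] + 1e-12)))
        d_out = (probs - targets) / n  # gradient at the output layer
        d_hid = (d_out @ W2.T) * (1.0 - hidden ** 2)  # propagated through tanh
        W2 -= lr * hidden.T @ d_out
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_hid
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), losses

def predict_bnn(params, X):
    W1, b1, W2, b2 = params
    return softmax(np.tanh(X @ W1 + b1) @ W2 + b2).argmax(axis=1)

# Synthetic three-class data in three dimensions (illustrative only)
rng = np.random.default_rng(1)
centers = np.array([[-2.0, 0.0, 1.0], [0.0, 2.0, -1.0], [2.0, -1.0, 0.0]])
X = np.vstack([c + 0.3 * rng.normal(size=(100, 3)) for c in centers])
y = np.repeat(np.arange(3), 100)
params, losses = train_bnn(X, y)
pred = predict_bnn(params, X)
```

The list `losses` records the cross-entropy at each epoch, which gives a simple way to check whether the epoch limit was large enough for the weights to settle.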
3.2. Support vector machine
In contrast to approaches employing the neuron-based learning process, kernel-based machine learning techniques include SVM, the most popular method for HS diagnostics. SVM is one of the leading edge machine learning techniques for multidimensional classification based on supervised learning. Each xi is a p-dimensional real vector representing the preprocessed sensory data for the ith class label. With the organized input data, SVM constructs hyper-planes with maximum margins to divide data points with different qi values, where qi is the ith class label (e.g., 0, 1, or 2) indicating the class to which the point xi belongs. A hyper-plane can be written as a set of points x satisfying:
$$ w \cdot x - b = 0 \tag{8} $$
where vector w is a normal vector that is perpendicular to the hyper-plane, and b is the offset of the hyper-plane. The parameter b/||w|| determines the offset of the hyper-plane from the origin along the normal vector w. The optimization problem eventually yields a set of optimized w and b that define the different classification margins [22]. The diagnostic SVM results provide different classification classes as a solution when a set of preprocessed sensory data is provided as an input. The intricate problem in SVM diagnostics is to develop the relationship between d and x, to estimate the target d given x. The optimization problem for non-linearly separable classes can be formulated as defined in Eq. (9) by including the error or slack variable $\xi_i$ and the penalty parameter C. The corresponding separating hyper-plane constraint is formulated in Eq. (10), where $y_i$ denotes the class label of $x_i$ and r is the number of training points.
$$ \min \left[ \frac{\|w\|^2}{2} + C \sum_{i=1}^{r} \xi_i \right] \tag{9} $$
$$ \text{s.t.} \quad y_i (w \cdot x_i - b) > 1 - \xi_i \tag{10} $$
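To make the roles of the slack variables and the penalty parameter C concrete, the following sketch evaluates the objective of Eq. (9) for a given hyper-plane. It assumes binary labels of +1 and -1 and takes each slack variable as the smallest non-negative value that meets the margin constraint with equality allowed. The function name and the toy data are hypothetical.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Return the value of ||w||^2 / 2 + C * sum(xi) and the slack vector xi."""
    w = np.asarray(w, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    margins = y * (X @ w - b)  # y_i (w . x_i - b)
    slack = np.maximum(0.0, 1.0 - margins)  # zero for points outside the margin
    return 0.5 * w @ w + C * slack.sum(), slack

# Toy data: two points per class, hyper-plane x1 = 0
X = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [0.2, 0.0]]
y = [1, 1, -1, -1]
objective, slack = soft_margin_objective([1.0, 0.0], 0.0, X, y, C=10.0)
print(objective, slack)
```

Only the second point (inside the margin) and the fourth point (on the wrong side) receive non-zero slack, so a larger C penalizes exactly those points more heavily.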
3.3. Deep belief networks
Deep belief networks (DBNs) employ a multi-layered architecture which consists of one visible layer and multiple hidden layers. The visible layer of the DBN accepts the input data and transfers the data to the hidden layers in order to complete the machine learning process [49]. The DBN structure is similar to the stacked network of the Restricted Boltzmann Machine (RBM) [50]. The overall learning process of the DBN classifier model can be divided into two primary steps: the RBM learning process and back-propagation learning. The RBM units will be trained iteratively with input training data. Each training epoch consists of two phases: the positive phase and the negative phase. The positive phase transforms the input data from the visible layer to the hidden layer, whereas the negative phase reconstructs the data from the hidden layer to the successive visible layer. Eq. (11) denotes the sigmoid transformation of the hidden layer state to the visible layer state.
$$ P(v_j = 1 \mid h) = \mathrm{sigm}\left( -b_j - \sum_i h_i w_{i,j} \right) \tag{11} $$
where $h_i$ is the state of the ith neuron in the hidden layer and $v_j$ is the state of the jth neuron in the visible layer. The learning process continues through an iterative process from a lower layer to a higher layer until the maximum number of layers is trained. The HS target information of the training data will be used during the succeeding supervised back-propagation classification training process. The trained DBN classifier model can be utilized for HS classification. Tamilselvan et al. [20] utilized the DBN HS classification by including the advantages of deep machine learning techniques in system health diagnostics.
3.4. Self-organizing maps
The methodologies discussed above were different machine learning processes where the target HS classes are known. If the user is not familiar with the different health conditions and their functional relationships with the input parameters, then an unsupervised learning process will help the user find the possible health conditions of the system. Moreover, the algorithm can segregate the data based on the possible health conditions. The SOM is a type of artificial neural network that is trained using unsupervised learning to produce a two-dimensional discretized representation of the input space of the training samples. The SOM uses a neighborhood function to preserve the topological properties of the input space and determine the closest unit distance to the input vector [26], which will be used to construct class boundaries graphically on a two-dimensional map. The weight vectors of the best matching unit (BMU) and its topological neighbors are fine-tuned to move closer to the input vector space [8]. The learning rule for updating the weight vectors can be expressed as:
$$ w_i(t+1) = w_i(t) + \alpha(t)\, h(n_{BMU}, n_i, t)\, (X - w_i(t)) \tag{12} $$
where $w_i(t+1)$ is the updated weight vector, $w_i(t)$ is the weight vector from the previous iteration, $\alpha(t)$ is the monotonically decreasing learning coefficient with $0 < \alpha < 1$, and $h(n_{BMU}, n_i, t)$ is the neighborhood function which decreases monotonically with an increase in the distance between the BMU $n_{BMU}$ and the neuron $n_i$ in the lattice. Thus, through the iterative learning, the input data is transformed into different HS clusters, and the overlapping of the different clusters will be determined as the misclassification.
3.5. Mahalanobis distance
Besides the machine learning-based multi-attribute classification methods, which classify input data samples through learning the relationship between input and output, statistical inference can also be used for classifying health relevant input samples into different HSs based on their relative statistical distances. The Mahalanobis distance classifier is one of these classification techniques. In statistics, MD is a distance measure based on the correlations between variables by which different patterns can be identified and analyzed. It gauges the similarity of an unknown sample set to a known one. The MD measure shows the degree of the deviation of the measured data point $x_f$ from a reference training set ($\mu$), which can be calculated as:
$$ D(x_f) = \sqrt{(x_f - \mu_f)^T S_f^{-1} (x_f - \mu_f)} \tag{13} $$
where $x_f = (x_1, x_2, \ldots, x_F)$ is a multi-dimensional data vector, and $\mu$ and S are the mean vector and covariance matrix, respectively, of the reference training data set. MD health diagnostics considers the correlation between different input variables and determines the system HS based on the minimum MD values of the testing sample, compared to training samples from different HSs. Patil et al. [48,51] used Mahalanobis distance (MD) for anomaly detection for IGBTs and monitored collector–emitter voltage and collector–emitter currents as input parameters to calculate the MD. Wang et al. [29] employed MD to classify different HSs for the development of sensor networks for health monitoring. Given the variety of multi-attribute classification techniques that could be used for the health diagnostics of IGBTs, different methods have advantages for different applications. The following section presents an experimental study of the IGBT degradation process and identification of the failure precursor parameters for health diagnostics.
4. Experimental study of IGBT failures
4.1. IGBT failure modes
IGBTs are power transistors that are mainly used in medium to high power and low frequency applications [14], including railway traction motors, wind turbines, electric vehicles, and hybrid vehicles for uninterrupted power supplies. The failure of an IGBT can reduce the efficiency of a system and lead to system failure. Therefore, health diagnostics of IGBTs provides many benefits, such as improved safety, improved reliability, and reduced costs for operation and maintenance. An IGBT is a combination of a metal oxide semiconductor field effect transistor (MOSFET) and a bipolar junction transistor (BJT). The switching characteristics of IGBTs are similar to a MOSFET, and the high current and voltage capabilities are similar to a BJT. A schematic representation of an IGBT power module is shown in Fig. 3. The typical IGBT module consists of an IGBT silicon die attached to a direct bond copper on ceramic (Al2O3) substrate soldered to a copper base plate. The copper base plate is attached to a heat sink. The IGBT silicon die is interconnected to the ceramic substrate and the base plate by wire bond interconnects made of aluminum. The failure locations on the IGBT power module include: (a) the gate oxide in the IGBT die; (b) the wire bonds; (c) the die attach solder interface between the silicon die and the ceramic substrate; and (d) the solder interface between the ceramic substrate and the base plate. The different failure modes of IGBT power modules include short circuits, open circuits, leakage current increases, or loss of gate control (inability to turn off) [53]. The failures are generally due to environmental conditions, such as high temperatures, and operating conditions, such as thermal and electrical stresses. The IGBT power module failure mechanisms include solder fatigue, wire bond and wire flexure fatigue, hot electrons, and time dependent dielectric breakdown.
Fig. 3. Schematic representation of an IGBT power module [52].
Fig. 4. ICE ON and VCE ON vs. time for N1 and N2. (Two panels: ICE ON (A) and VCE ON (V) plotted against time (hr) for IGBT N1 and IGBT N2.)
4.2. IGBT failure precursor parameter identification
The identification of IGBT failure precursor parameters was carried out through accelerated aging with in-situ monitoring of selected parameters, such as collector–emitter current, transistor case temperature, transient and steady state gate voltages, and collector–emitter voltages. The thermal overstresses were conducted within the safe operating areas of the current and voltage limits, within which the IGBT can be operated without destructive failure. A detailed description of the IGBT accelerated aging study can be found in [41,47]. IGBT aging was conducted until a loss of gate control was observed. This behavior indicated latch-up attributed to a temperature-stimulated parasitic thyristor within the IGBT structure or secondary breakdown due to localized hotspots in the device leading to thermal runaway. The transient collector–emitter voltage increased initially during aging and then decreased with an increase in die-attach degradation [47]. The degraded die-attach leads to increased temperature at the p–n junction above the collector due to thermal impedance. The increased temperature at this junction results in an increase in the intrinsic carrier concentration and a decrease in the voltage drop at the junction after aging. The increased device temperature due to the degraded die-attach increases the susceptibility of the IGBT to latch-up. During latch-up, the collector current is not controlled by the gate. When latch-up occurs, if it is not terminated quickly, then the device burns out due to excessive power dissipation. The collector current at which latch-up occurs is called the latching current. The magnitude of the collector–emitter current required to induce latch-up decreases with an increase in device temperature. Also, the higher the device temperature, the more susceptible the IGBT is to latch-up.
Therefore, the precursor parameters for IGBT failure diagnostics are identified as collector–emitter ON voltage (VCE ON), collector–emitter ON current (ICE ON), and case temperature (T). These precursor parameters are used for the development of the IGBT multi-attribute classification fusion system.
4.3. Experimental data
In this case study, two non-punch-through IGBTs, N1 and N2, were tested, with the case temperature varying from 100 °C to 200 °C for a temperature swing of 100 °C. Each power cycle corresponds to one temperature swing, and there were up to 10,000 power cycles before an IGBT failed. The precursor parameters identified in the previous section, VCE ON, ICE ON, and case temperature T, are collected for both IGBT N1 and N2 until IGBT latch-up. The ICE ON and VCE ON data of IGBT N1 and N2 are plotted against the time in hours, as shown in Fig. 4. With the increase in total working hours, the VCE ON increases, while the ICE ON decreases. The abrupt data change in the ICE ON and VCE ON curves indicates IGBT failure and also shows that IGBT N1 reached failure (at nearly 19 h) before IGBT N2 (at nearly 35 h). In Fig. 5, the ICE ON and VCE ON data of IGBT N1 are plotted against the case temperature, varying from 100 °C to 200 °C, in three different time zones (t1, t2, and t3), where t1 = 0–5 working hours, t2 = 5–10 working hours, and t3 = 10–15 working hours. An increase in the time zone from t1 to t3 increases the VCE ON of the IGBT at the same temperature levels, whereas the ICE ON decreases with an increase in the time zone. Degradation at the wire bond and die interface occurs as a result of temperature cycling, which leads to an increase of the VCE ON (Figs. 4 and 5). Based on the collected precursor parameters, the classification of the current IGBT health condition into different HSs (normal, degrading, and failed) will warn the user about the early stages of IGBT failure. Therefore, it is essential to classify the IGBT HS based on its current health condition to avoid unexpected breakdown of the IGBT and, thus, total failure of the system. The following section presents the results of applying the developed multi-attribute classification fusion technique for IGBT health diagnostics.
Fig. 5. ICE ON and VCE ON vs. temperature for N1. (Two panels: ICE ON (A) and VCE ON (V) plotted against temperature (°C) for Time Zones 1, 2, and 3.)
5. IGBT diagnostics using the multi-attribute classification fusion system
In this section, the developed multi-attribute classification fusion system is applied to IGBT health diagnostics with the experimental data discussed in Section 4. The multi-attribute classification fusion system unifies the different classification approaches for IGBT health diagnostics based on the classification rate of each approach. The developed multi-attribute classification system for IGBT health diagnostics is demonstrated with IGBT experimental data. The trained fusion model of the multi-attribute classification fusion system can be utilized for real-time condition monitoring of IGBT health. Following the general procedure outlined in Fig. 1, this section presents IGBT health diagnostics using the multi-attribute classification fusion system based on the member algorithms discussed in Section 3. The steps involved in IGBT HS diagnostics are shown in Fig. 6: (1) system familiarization and data preprocessing; (2) multi-attribute classification fusion system; and (3) online monitoring. This subsection details the development process of the IGBT health diagnostic system.
Fig. 6. Multi-attribute classification for IGBT health diagnostics.
5.1. System familiarization and data preprocessing
The IGBT diagnostic problem is the classification of the IGBT health condition into different HSs based on the precursor parameters. In this step, the observable precursor parameters of the IGBT are identified, and the relevant health conditions of the system are defined. From the testing of IGBTs discussed in Section 4, the precursor parameters for IGBT failure diagnostics are identified as the collector–emitter ON voltage, collector–emitter ON current, and case temperature. The different health conditions, including the healthy, degrading, and failed states, are derived, and the identified precursor parameters are utilized for categorizing the data into different HSs. The different health conditions are derived based on the abrupt change in the data of VCE ON and ICE ON. The abrupt change of IGBT N1 is found at 8 h, and the degrading state is defined to be 3 h before and after. Thus, the healthy HS is defined as the data from 0 to 5 h of operation, the degrading state is defined as the data from 5 h to 11 h of operation, and the failed state is from 11 h to 16 h for IGBT N1. Similarly, for IGBT N2, the abrupt data change occurs at 14.5 h, and the degrading state is defined to be 5 h before and after the abrupt data change point. For IGBT N2, the healthy state ranges from 0 to 9.5 h; the degrading state ranges from 9.5 to 19.5 h; and the failed state ranges from 19.5 to 30 h. Fig. 7 shows the partition of different HSs in the VCE ON–ICE ON time chart for IGBT N1. The data collected for the precursor parameters are categorized into different predefined HSs as selected features.
Fig. 7. ICE ON and VCE ON vs. time for IGBT N1. (Two panels: ICE ON (A) and VCE ON (V) plotted against time (hr), partitioned into the healthy, degrading, and failure states.)
Fig. 8. SOM results for IGBT N1 and N2. (Two panels, one per IGBT: X–Y SOM map with points marked as healthy, degrading, or failure state.)
Because of the difference in test loading conditions, IGBT N1 and N2 showed different useful lives. Since this study aims to demonstrate the proposed WMVD classification fusion approach for health diagnosis, for simplicity we evenly defined three HSs over the total useful life of each IGBT. We assume that, under generally the same operating load condition, IGBTs exhibit similar characteristics for the identified failure precursor parameters; thus, a new test unit from the same operating load condition can be classified into the predefined HSs based on the training data for that particular operating load condition.
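The HS definitions in Section 5.1 amount to a time window centered on the abrupt change point. A minimal sketch of that labeling rule follows. The change points and half-widths are the values stated in the text (8 h and 3 h for IGBT N1, 14.5 h and 5 h for IGBT N2), whereas the function name, the dictionary, and the treatment of the boundaries as half-open intervals are assumptions made for illustration.

```python
def label_health_state(t_hours, change_point, half_width):
    """Return 'healthy', 'degrading', or 'failed' for an operating time in hours."""
    if t_hours < change_point - half_width:
        return "healthy"
    if t_hours < change_point + half_width:
        return "degrading"
    return "failed"

# (abrupt change point, half-width of the degrading window), both in hours
WINDOWS = {"N1": (8.0, 3.0), "N2": (14.5, 5.0)}

for unit, t in [("N1", 4.0), ("N1", 8.0), ("N1", 12.0), ("N2", 9.0), ("N2", 20.0)]:
    print(unit, t, label_health_state(t, *WINDOWS[unit]))
```

With these windows, IGBT N1 is labeled healthy before 5 h, degrading from 5 h to 11 h, and failed afterwards, which reproduces the ranges given in the text.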
Table 3
Weight values of different member algorithms using accuracy-based weighting.

Member algorithm   IGBT N1                            IGBT N2
classifier         Healthy   Degrading   Failed      Healthy   Degrading   Failed
MD                 0.2214    0.2344      0.2326      0.2188    0.2470      0.2181
SVM                0.2319    0.2276      0.2336      0.2395    0.2445      0.2142
BNN                0.2352    0.2301      0.2364      0.2242    0.1724      0.2137
DBN                0.2323    0.2238      0.2340      0.2139    0.2398      0.2177
SOM                0.0790    0.0839      0.0631      0.1035    0.0962      0.1364

Table 4
Classification results of member algorithms.

IGBT   Classifier   Healthy state   Degrading state   Failed state   Total   Overall classification rate (%)
N1     MD           465             486               490            1441    96.07
N1     SVM          487             472               492            1451    96.73
N1     BNN          494             477               498            1469    97.93
N1     DBN          488             464               493            1445    96.33
N1     SOM          166             174               133            473     31.53
N2     MD           446             480               499            1425    95.00
N2     SVM          488             475               490            1453    96.87
N2     BNN          457             335               489            1281    85.40
N2     DBN          436             466               498            1400    93.33
N2     SOM          211             187               312            710     47.33
Classification fusion system
N1     Fusion       495             482               495            1472    98.13
N2     Fusion       481             480               499            1460    97.33

5.2. Multi-attribute classification fusion system
The multi-attribute classification fusion system is demonstrated with the experimental study of the two types of non-punch-through IGBTs, N1 and N2, discussed earlier. The IGBT classification fusion system consists of three steps: fusion formulation,
multi-attribute classifier diagnostics, and classifier fusion. In the fusion formulation process, a k-fold CV model is developed with the training and testing data sets of the IGBT N1 and IGBT N2. For the CV process, the training data set with 500 data points from each HS is divided into 10 data subsets with 50 data points each. Each data subset is used nine times for training and once for testing. The multi-attribute classifier models with ten sets of training and testing data sets are developed to classify the HSs as healthy, degrading, and failed. Five different classification techniques are used as member algorithms for the development of the IGBT diagnostic system: BNN, SVM, DBN, SOM, and MD. The architectures of the classifier models used in this IGBT health diagnostic system are as follows. The BNN architecture has three processing layers: input, hidden, and output, with 5, 1, and 3 neurons in each layer, respectively, and the transfer function used is TanH. The trained DBN model architecture has three network layers of 50, 50, and 100 neurons. A Gaussian kernel function is used to train the SVM model. The SOM is trained using a 10 × 10 architecture of neurons. In MD-based classification, the HS with the minimum determined MD is the corresponding HS of the data. The IGBT diagnostic model of each classification technique is trained using the training data subset, and the accuracy and efficiency of the trained diagnostic model are validated with the testing data subset. Being an unsupervised learning technique, SOM does not require any prior knowledge about the health conditions of the IGBT system. The multi-dimensional input data is represented as a two-dimensional SOM topology with different clusters/HSs. The SOM classifier diagnostic results for IGBT N1 and N2 test data are shown in Fig. 8.
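The k-fold procedure and the minimum-MD decision rule described above can be combined in a short sketch. It assumes that the 10 subsets are drawn separately within each HS (50 points per HS per subset), uses synthetic Gaussian data in place of the IGBT measurements, and accumulates the number of correct classifications per HS over the ten test folds. All function names and the data are hypothetical.

```python
import numpy as np

def stratified_folds(y, k, seed=0):
    """Split indices into k folds holding an equal share of every HS."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for j in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == j))
        for i, chunk in enumerate(np.array_split(idx, k)):
            folds[i].append(chunk)
    return [np.concatenate(f) for f in folds]

def fit_md(X, y, n_states):
    """Mean vector and inverse covariance matrix of each HS."""
    stats = []
    for j in range(n_states):
        Xj = X[y == j]
        stats.append((Xj.mean(axis=0), np.linalg.inv(np.cov(Xj, rowvar=False))))
    return stats

def predict_md(stats, X):
    """Assign each point to the HS with the minimum Mahalanobis distance."""
    # Squared distances suffice because the square root preserves the ordering
    d2 = np.stack([np.einsum("ni,ij,nj->n", X - mu, s_inv, X - mu)
                   for mu, s_inv in stats], axis=1)
    return d2.argmin(axis=1)

def cv_correct_counts(X, y, n_states=3, k=10, seed=0):
    """Correct classifications per HS, accumulated over the k test folds."""
    folds = stratified_folds(y, k, seed)
    correct = np.zeros(n_states, dtype=int)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for m, f in enumerate(folds) if m != i])
        stats = fit_md(X[train_idx], y[train_idx], n_states)
        pred = predict_md(stats, X[test_idx])
        for j in range(n_states):
            correct[j] += np.sum((pred == j) & (y[test_idx] == j))
    return correct

# Synthetic stand-in for three precursor parameters, 500 points per HS
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [4.0, 4.0, 4.0], [8.0, 8.0, 8.0]])
X = np.vstack([c + rng.normal(size=(500, 3)) for c in centers])
y = np.repeat(np.arange(3), 500)
folds = stratified_folds(y, 10)
counts = cv_correct_counts(X, y)
print(counts)
```

Each data point is tested exactly once, so the per-HS counts are out of 500, which is the same form as the per-HS entries reported for the member algorithms.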
The BMUs from the input vectors are determined and the different HS clusters are formed in the SOM topology based on the neighborhood distances as discussed in Section 3.4. The developed SOM classifier is tested using the testing data for the classification of different HSs based on the precursor parameters. The x and y
dimensions of Fig. 8 show the architecture of the neurons and determine the number of partitions used to segregate the multidimensional data; for example, the SOM model for the IGBT case study is trained using a 10 × 10 architecture of neurons with 100 partitions. Each colored shape in Fig. 8 represents one HS of the IGBTs; HS clusters are formed based on the high accumulation of shapes of the same color within the neighborhood. The SOM results for IGBT N1 do not show distinct clusters formed for each HS of the IGBT. In contrast, the SOM results for IGBT N2 show that there is a slightly distinct cluster formed for the failed HS, but there is overlap of data points for the healthy and degrading states. The multi-attribute classification fusion system is developed using multiple member algorithms: MD, SVM, BNN, DBN, and SOM. The weight for each member algorithm is determined with accuracy-based weighting using Eq. (5) and is listed in Table 3. The classification results of each HS for each member algorithm are shown in Table 4. The diagnostic results of the member algorithms in Table 4 show that BNN provides more accurate diagnostic outcomes for the healthy state and the failed state of IGBT N1 compared to other member algorithms, with correct classification of 494 and 498 out of 500 data points, respectively. Similarly, MD provides a better classification rate for the degrading state of IGBT N1 than the other member algorithms, with correct classification of 486 out of 500 data points. For IGBT N2, SVM provides more accurate diagnostic outcomes for the healthy state compared to other member algorithms, with correct classification of 488 out of 500 data points. MD provides a better classification rate for the degrading state and the failed state of IGBT N2 than the other member algorithms, with correct classification of 480 and 499 out of 500 data points, respectively.
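The relation between Tables 3 and 4 can be checked numerically. Eq. (5) is not reproduced in this part of the paper, so the normalization below is an inference from the tabulated values rather than the authors' stated formula: dividing each classifier's correct count for an HS by the sum of the counts of all five member algorithms for that HS gives values that agree with the Table 3 weights to within about 0.0001. The variable and function names are hypothetical.

```python
import numpy as np

members = ["MD", "SVM", "BNN", "DBN", "SOM"]
# Correct classifications out of 500 (columns: healthy, degrading, failed), Table 4
correct_n1 = np.array([[465, 486, 490], [487, 472, 492], [494, 477, 498],
                       [488, 464, 493], [166, 174, 133]])
correct_n2 = np.array([[446, 480, 499], [488, 475, 490], [457, 335, 489],
                       [436, 466, 498], [211, 187, 312]])
# Weights as listed in Table 3 (same row and column order)
table3_n1 = np.array([[0.2214, 0.2344, 0.2326], [0.2319, 0.2276, 0.2336],
                      [0.2352, 0.2301, 0.2364], [0.2323, 0.2238, 0.2340],
                      [0.0790, 0.0839, 0.0631]])
table3_n2 = np.array([[0.2188, 0.2470, 0.2181], [0.2395, 0.2445, 0.2142],
                      [0.2242, 0.1724, 0.2137], [0.2139, 0.2398, 0.2177],
                      [0.1035, 0.0962, 0.1364]])

def normalized_weights(correct):
    """Share of each classifier in the total correct count of every HS."""
    return correct / correct.sum(axis=0, keepdims=True)

def overall_rate(correct, n_per_state=500):
    """Overall classification rate in percent for every classifier."""
    return 100.0 * correct.sum(axis=1) / (n_per_state * correct.shape[1])

def dominant_classifiers(correct):
    """Classifier with the highest correct count for every HS."""
    return [members[i] for i in correct.argmax(axis=0)]

print(np.abs(normalized_weights(correct_n1) - table3_n1).max())
print(np.round(overall_rate(correct_n1), 2))
print(dominant_classifiers(correct_n1), dominant_classifiers(correct_n2))
```

The same counts also reproduce the overall classification rates in Table 4 and the dominant classifiers named in the text (BNN and MD for IGBT N1, SVM and MD for IGBT N2).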
From the classification rate for each of the member algorithms, the dominant member algorithm for each HS is determined. BNN is the dominant classifier of the healthy and failed HSs of IGBT N1. BNN can be applied to different
problems with diversified relationships, such as the non-linearity of the different health conditions and precursor parameters and dependencies within different precursor parameters. The IGBT BNN diagnostic classifier model gave a classification rate of 97.93% for IGBT N1 and 85.40% for IGBT N2. While BNN provided high classification results for IGBT N1, the classification rate for IGBT N2 was low due to the inability to learn the complex patterns of the IGBT precursor parameters, as in an under-fitting condition (inadequate training process). SVM is used as the dominant classifier of the IGBT N2 healthy state. The overall classification results of the SVM member algorithm for IGBT N1 and IGBT N2 are 96.73% and 96.87%, respectively. The main reason for the high classification rate is learning the non-linear separable nature of different HSs based on the precursor parameters of the IGBT. DBN supports classification of different health conditions in complex or non-linear separable health conditions based on the precursor parameters and results in classification rates of 96.33% and 93.33% for IGBT N1 and IGBT N2, respectively. Although DBN has a capability for learning the non-linear relationships between the precursor parameters and the different HSs of the IGBT, the classification rate for IGBT N1 is lower compared to MD and BNN, and the IGBT N2 results are lower than with MD and SVM. Among the five member algorithms, SOM did not perform well for the discussed IGBT case study; its classification rates for IGBT N1 and IGBT N2 were 31.53% and 47.33% respectively. MD is the dominant classifier for the degrading state of IGBT N1 and the degrading and failed states of IGBT N2. The overall classification rate for IGBT N1 is 96.07%, and for IGBT N2 it is 95%. The correlation between the multiple IGBT precursor parameters, ICE ON, VCE ON, and case temperature is utilized for analysis and results in a high classification rate of the MD classifier model for IGBT N1. 
MD provides good results for linearly separable data, and the implementation of MD is straightforward and does not require a training process. Despite being the dominant classifier for both the degrading and failed HSs for IGBT N2, MD has a lower overall classification rate than SVM (95.00% vs. 96.87%) because of its low healthy state classification for IGBT N2 (446 out of 500). MD cannot handle non-linearly separable relationships between the different HSs based on their corresponding precursor parameters. The classification results of the member classifier models show that no single member algorithm provides the best classification for all existing HSs in the IGBT health diagnostic system. To make the classification robust, the advantages of the member algorithms must be utilized to combine the classification results for both the IGBT N1 and IGBT N2 cases. The WMVD approach is utilized as the classifier fusion process to combine the results of the member algorithms to obtain a unified, robust classification result. The WMVD rules are utilized to perform the classifier fusion process based on the dominant classifiers and accuracy-based weighting process. The dominant classifiers of IGBT N1 are BNN and MD, and for IGBT N2, the dominant classifiers are MD and SVM. The WMVD approach is utilized to process the individual results of the member algorithms, and then the final classification results are determined. The classification fusion process results are listed in Table 4. The classification fusion system provided overall classification rates of 98.13% for IGBT N1 and 97.33% for IGBT N2, which are higher than those of all member algorithms. Based on the IGBT N1 results, the classification fusion system provided higher healthy state classification compared to the member algorithms due to the WMVD approach.
Although the classification results of the degrading and failed states are lower than the dominant classifiers, the overall classification rate of the developed classification fusion system is higher than the member algorithms. The IGBT N2 results show that the classification fusion system provided classification rates for both the degrading and
failed states equal to the corresponding dominant classifier of both HSs. MD is the dominant algorithm for the degrading and failed states of IGBT N2 diagnostics, but it is among the lower performing algorithms for the healthy state (446 out of 500, ahead of only DBN and SOM). However, the healthy state diagnostics of the classification fusion system has been improved, leading to a higher overall classification rate of 97.33%. None of the individual member algorithms provided a robust classification rate for all HSs. For instance, in IGBT N1 diagnostics, MD is the dominant algorithm for the degrading state, but apart from SOM it is the lowest performing member algorithm for the healthy state. Similarly, in IGBT N2, MD is the dominant algorithm for the degrading and failed HSs; however, MD is one of the lowest performing algorithms for classifying the healthy HS among the member algorithms. This shows that the member algorithms do not have a robust classification rate for each HS individually, which lowers their overall classification rates. Although the different algorithms perform well for their dominant HSs, there is no robust member algorithm to classify all the different HSs accurately. On the other hand, the developed classification fusion system overcame these challenges and robustly classified all the existing HSs for both the IGBT N1 and N2 cases. The developed classification fusion algorithm provided better classification results than the member algorithms to determine the health condition of the IGBTs.
6. Conclusions
This paper presents a novel classification fusion approach for health diagnostics with three sequential stages: (i) fusion formulation using a k-fold cross-validation model; (ii) diagnostics using multiple multi-attribute classifiers as member algorithms; and (iii) classification fusion using a weighted majority voting with dominance system.
State-of-the-art multi-attribute classification techniques (e.g., supervised learning, unsupervised learning, and statistical inference) were employed as the member algorithms, and the developed approach was demonstrated with IGBT health diagnostics. By combining the classifications of all member algorithms, the classification fusion approach achieved better accuracy in HS classification than any stand-alone member algorithm. Furthermore, the approach has the inherent flexibility to incorporate any advanced diagnostic algorithm that may be developed. Because the computationally expensive training process is done offline and the online prediction process requires only a small amount of computational effort, the fusion approach is computationally feasible. The classification fusion system can also be applied to structural health diagnostics and to the fusion of different condition monitoring systems. Given its enhanced classification accuracy, the classification fusion approach opens the possibility of effective condition-based maintenance and the development of IGBT failure prognostic systems.

Acknowledgments

This research is partially supported by the National Science Foundation (CMMI-1200597), the Kansas NSF EPSCoR program (NSF-0068316), and Wichita State University through the University Research Creative Project Awards (UCRA).