Deep Laplacian Auto-encoder and its application into imbalanced fault diagnosis of rotating machinery


Journal Pre-proofs

Authors: Xiaoli Zhao, Minping Jia, Mingyao Lin
PII: S0263-2241(19)31184-4
DOI: https://doi.org/10.1016/j.measurement.2019.107320
Reference: MEASUR 107320
To appear in: Measurement
Received Date: 10 September 2019
Revised Date: 31 October 2019
Accepted Date: 22 November 2019

Please cite this article as: X. Zhao, M. Jia, M. Lin, Deep Laplacian Auto-encoder and its Application into Imbalanced Fault Diagnosis of Rotating Machinery, Measurement (2019), doi: https://doi.org/10.1016/j.measurement.2019.107320


© 2019 Published by Elsevier Ltd.

Xiaoli Zhao¹, Minping Jia¹*, Mingyao Lin²

¹ School of Mechanical Engineering, Southeast University, Nanjing 211189, China
² Electrical Engineering Department, Southeast University, Nanjing 210096, China

Abstract: In real-world cases, the measured health condition data of mechanical systems often exhibit an imbalanced distribution. To improve diagnostic accuracy on imbalanced data sets, this paper develops a novel imbalanced fault diagnosis approach for rotating machinery based on the Deep Laplacian Auto-encoder (DLapAE). First, the collected vibration signals are fed directly into the constructed DLapAE for layer-by-layer feature extraction; the extracted deep discriminative features are then fed into a Back Propagation (BP) classifier for health condition diagnosis. In DLapAE, a Laplacian regularization term is added to the original objective function of the Deep Auto-encoder (DAE) to smooth the manifold structure of the data. This regularization improves the generalization performance of the fault diagnosis framework and makes it more suitable for feature learning and classification of imbalanced data. Finally, two experimental bearing systems are used to demonstrate the effectiveness of the proposed methodology. Compared with other existing fault diagnosis methods based on deep learning, the proposed method achieves accurate fault diagnosis of rotating machinery on both balanced and imbalanced datasets.

Keywords: Rotating machinery; Fault diagnosis; Imbalanced dataset; Deep Laplacian Auto-encoder (DLapAE); Laplacian regularization term

1. Introduction

As an irreplaceable device in modern industrial systems, rotating machinery plays a crucial role in industrial production and intelligent manufacturing [1-3]. Real-time condition monitoring and fault diagnosis of rotating machinery not only ensure the normal operation of mechanical equipment, but also detect equipment faults in time and avoid unnecessary economic loss or personal injury [4-5]. According to statistics, 30% of rotating machinery failures are caused by faults of core components such as rolling bearings and rotors [6-7]. It is therefore necessary to analyze and diagnose the health conditions of the core components of mechanical systems effectively [8-9].

Currently, fault diagnosis methods for rotating machinery mostly determine the operating condition by detecting and analyzing various state parameters [10-11]. Diagnosis based on vibration signal analysis is one of the most common and effective approaches: a number of sensors are arranged at key points of the mechanical system to collect vibration signals for analysis [11-12]. With the continuous development of modern measurement and sensor control technology, novel data acquisition and measurement methods keep emerging [13-14]. In current industrial applications, the monitored equipment groups are commonly large, many measuring points are required, the sampling frequency at each measuring point is high, and data are collected over the long period from the beginning of service to the end of health monitoring and fault diagnosis [15-16]. Thus, the amount of data collected from rotating machinery is large, and its complexity is high.
In summary, massive measured data and signals must be acquired to assess the health condition of mechanical equipment, which brings health monitoring and management of mechanical systems into the era of "mechanical big data" [16-17]. Meanwhile, mechanical big data also raises the problem that fault data and historical data are difficult to acquire, because the uptime of mechanical equipment is greater than its downtime [16-18]. Owing to the inherent characteristics of industrial production, data collected under failure conditions are difficult to obtain, and the number of failure samples is often far smaller than the number of samples under normal conditions [17-20]. This poses a series of challenges for traditional intelligent diagnosis methods.

Traditional intelligent fault diagnosis methods commonly assume that the numbers of training samples for the different mechanical health conditions are roughly equal [18-19]. On this basis, most of them build on manifold learning [21], Artificial Neural Networks (ANN) [22], Support Vector Machines (SVM) [23] and other machine learning algorithms, which use mathematical models to mine the diagnostic information hidden in the collected fault data. Jiang et al. [24] designed a supervised manifold learning algorithm (S-Lapeig) for fault feature extraction of rotating machinery. Zhao et al. [21] developed a global-local margin Fisher analysis (GLMFA) algorithm for fault diagnosis of rolling bearings. In [25], an intelligent fault diagnosis model for gearboxes based on a wavelet support vector machine (WSVM) and an immune genetic algorithm was proposed. In Ref. [26], a framework was proposed to perform fault detection and classification in nonstationary environments; it can detect new errors without receiving any updates, and its application to the diagnosis of bearing defects in induction motors verified the feasibility of industrial use. In [27], an efficient sampling technique for Ensemble Learning (EL) and for diagnosing bearing defects under the Class Imbalanced Condition (CIC) was proposed; the resulting diagnostic solution can effectively diagnose a variety of induction motor bearing defects under class imbalance. Farajzadeh-Zanjani et al. [28] designed an effective fault diagnosis method for bearing defects in asynchronous motors. The method employs empirical mode decomposition to analyze the vibration signal and extracts the informative mode functions as a set of features. The maximum compression metric learning (MCML) method is then used to reduce the dimension of the extracted feature set and to generate a small set of informative features for fault classification.

Nevertheless, numerous researchers have pointed out that unequal numbers of training samples across categories seriously affect model learning [11, 29]. The above-mentioned intelligent methods based on Machine Learning (ML) mostly assume approximately balanced classes. For class-imbalanced samples, the performance of traditional ML-based intelligent diagnosis deteriorates severely, and under extreme imbalance the trained fault diagnosis model loses its practical diagnostic significance [11, 29].

At present, a growing body of research focuses on intelligent imbalanced fault diagnosis based on ML. Mao et al. [19] pointed out that mechanical data usually present a non-equilibrium distribution. Lan et al. [30] designed a two-stage fault diagnosis framework for the condition diagnosis of rolling bearings with unbalanced fault data. Roozbeh et al. [18] proposed a hybrid fault diagnosis method for asynchronous motor bearing faults under unbalanced conditions; the framework consists of signal segmentation, feature extraction, feature reduction and fault classification. However, these unbalanced fault diagnosis methods mostly act on the classification process, and their model learning is mostly shallow.
For a practical fault diagnosis model, diagnostic performance depends mainly on the quality of feature extraction [21-31]. As research on the imbalance problem has deepened, many researchers have noted that traditional unbalanced fault diagnosis methods aim at minimizing the misclassification rate but do not fully consider the distribution and feature representation of the data [11,18]. There is therefore an urgent need to address imbalanced fault diagnosis from the perspective of feature extraction. Accordingly, this paper studies deep feature learning as a way to improve the generalization performance of imbalanced fault diagnosis.

Traditional imbalanced fault diagnosis methods rely mainly on expert knowledge and hand-crafted features, which seriously restricts the intelligent and automatic development of mechanical equipment fault diagnosis [11, 32-33]. As one of the most advanced data and information processing methods, deep learning has been widely used in machine vision [32], speech processing [35], and fault diagnosis [33]. In 2013, Tamilselvan et al. [36] employed Deep Belief Networks (DBN) for the health classification of aircraft engines to improve the safety and reliability of aircraft engine evaluation systems. Deep learning is now steadily being applied to fault diagnosis, but few studies address the diagnosis of imbalanced data. In [11], a mechanical fault imbalanced classification method based on a deep normalized convolutional neural network (DNCNN) was proposed and validated on rolling bearing data sets with three different levels of imbalance. Wu et al. [37] proposed a weighted long convolution LSTM model based on a sampling strategy (wLRCL-D) for imbalanced fault diagnosis.

More importantly, some studies suggest that a good deep neural network should extract a feature representation that is not distorted when the input data are perturbed to a certain extent, which helps avoid over-fitting [38]. Compared with other deep learning models, the Auto-encoder (AE) / Deep Auto-encoder (DAE) learns effective features by reconstructing its input at its output [38-39], and fault diagnosis based on AE/DAE has therefore been widely applied [38-40]. Sun et al. [38] constructed a deep neural network based on a sparse denoising Auto-encoder for fault diagnosis of induction motors. Shao et al. [39] proposed an intelligent fault diagnosis method for rolling bearings based on a Wavelet Auto-encoder (DWAE) with an Extreme Learning Machine (ELM). Although the traditional AE/DAE has good feature learning ability, it still suffers from over-fitting. To extract more robust features from imbalanced data, a deep learning algorithm with strong generalization ability is needed. Regularization is widely applied as an effective means of improving the generalization performance of deep learning [40-42]. In practice, the best-fitting model (in the sense of minimizing generalization error) is often found to be a large model with appropriate regularization. As a typical regularization approach, Laplacian graph theory was first proposed by Belkin et al. [43] in 2003.
Laplacian graph theory can be understood intuitively: samples that are neighbors in the high-dimensional space should remain neighbors on the low-dimensional manifold, or tend to be assigned the same category [38]. It has been widely used in supervised learning, semi-supervised learning and other fields to improve the generalization performance of models [21, 43-45]. In Ref. [40], a semi-supervised diagnostic framework based on the surface estimation of faulty distributions was designed, and its effectiveness was verified on induction motor bearing defect diagnosis. Razavi-Far et al. [42] developed a semi-supervised deep learning fault diagnosis method for the induction motor shaft gearbox, which consists of two main modules: information fusion and decision making.

In this paper, Laplacian regularization is introduced into the DAE/AE model to improve its generalization performance. The resulting algorithm, the Deep Laplacian Auto-encoder (DLapAE), is developed to preserve the internal structure of the data in the low-dimensional embedding. To the best of our knowledge, the related work on Laplacian regularization and auto-encoders is as follows. In [46], an unsupervised manifold learning method called Laplacian Auto-Encoders (LAE) was designed and validated on benchmark visual datasets. LAE is a multi-layer feature extraction process without fine-tuning, so it is an unsupervised feature extraction method, whereas our DLapAE is a fine-tuned deep neural network. Yu et al. [47] proposed a stacked denoising auto-encoder (SDAE) algorithm, termed Manifold Regularized SDAE (MRSDAE), based on particle swarm optimization (PSO).
It embeds two Laplacian matrices into the SDAE model, at local and non-local scope respectively, to retain the local and non-local manifold structure of the data; its superiority was verified on gearbox vibration signals. MRSDAE thus uses two Laplacian constraints to maintain both local and non-local structural information, whereas our DLapAE emphasizes maintaining the local manifold structure. In [48], Zhao et al. proposed the Laplacian pyramidal Auto-encoder (LPAE), a directly modified auto-encoder framework for unsupervised representation learning. This method reconstructs the original image and the low-pass filtered image using multiple encoder sub-networks within the Laplacian pyramid framework. The experimental results demonstrated that the Laplacian pyramid makes training more stable and effective and improves the learned representation of scale information. In image processing, the Laplacian pyramid is a linear, invertible image representation that divides the image into layers like a pyramid, made up of a series of band-pass images. Its basic principle is therefore different from that of our DLapAE. Shao et al. [49] employed an enhanced deep feature fusion method for rotating machinery fault diagnosis. First, a new deep auto-encoder was constructed from the denoising auto-encoder (DAE) and the contractive auto-encoder (CAE) to enhance feature learning. Second, Locality Preserving Projection (LPP) was applied to fuse the deep features and further improve their quality. Finally, the fused deep features were fed into a softmax classifier for fault diagnosis. This DAE-CAE with LPP method is a multi-stage feature extraction process, whereas our DLapAE-based method extracts fault features in a single step.

To sum up, this paper proposes a new deep learning algorithm, the Deep Laplacian Auto-encoder (DLapAE), for intelligent imbalanced fault diagnosis of rotating machinery. The proposed framework has two major steps. First, the vibration signals collected from rotating machinery are fed into the constructed DLapAE for layer-by-layer feature extraction. Then, the extracted deep discriminative features are fed into a Back Propagation (BP) classifier for health condition diagnosis. A Laplacian regularization term is added to the objective function to form the Laplacian Auto-Encoder (LapAE), which smooths the manifold structure of the data, and DLapAE is built from multiple LapAEs to improve the generalization performance of fault diagnosis and make it better suited to feature learning on imbalanced data. Finally, two experimental bearing systems are used to validate the effectiveness and superiority of the proposed method.

The rest of this paper is organized as follows. Section 2 briefly introduces the theoretical background of the Deep Auto-encoder (DAE) and Laplacian regularization.
Section 3 proposes the Deep Laplacian Auto-encoder (DLapAE) algorithm and develops an imbalanced fault diagnosis method for rotating machinery based on it. Section 4 validates the proposed method on two experimental bearing systems. Conclusions are given in Section 5.

2. Theoretical background

2.1. Brief review of Deep Auto-encoder (DAE)

An Auto-encoder (AE) is generally a 3-layer symmetric neural network consisting of two parts, an Encoder and a Decoder [37-39]; its structure is shown in Fig. 1. The Encoder converts the D-dimensional training samples $x \in \{x_m^{(D)}\}$ (m is the number of samples) into D'-dimensional encoding vectors $h \in \{h^{(D')}\}$. The Decoder then reconstructs the D'-dimensional encoding vectors in the original D-dimensional space as $z \in \{z_m^{(D)}\}$, and z = x is used as the training target to obtain the optimal data representation in the hidden layer. In other words, the AE is trained so that its output approximates its input, and it does not need the label information of the data. Its encoding process can be expressed as

$$h = f_{\theta}(x) = S_f(Wx + b) \tag{1}$$

where θ = {W, b} is the parameter set of the encoder, W is the D'×D weight matrix, b is the offset vector, and $S_f$ is the activation function, typically sigmoid or tanh. Accordingly, the decoding process can be written as

$$z = g_{\theta'}(h) = S_g(W'h + b') \tag{2}$$

where θ' = {W', b'} is the parameter set of the decoder, W' is the D×D' weight matrix, b' is the offset vector, and $S_g$ is the activation function of the decoder.
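As an illustration of Eqs. (1) and (2), the following minimal Python sketch runs one encoding and one decoding pass. It is not the authors' implementation: the function names, the use of the sigmoid for both $S_f$ and $S_g$, and the random initialization are assumptions made for this example, and plain lists are used instead of an array library to keep it self-contained.

```python
import math
import random


def sigmoid(v):
    """Logistic activation, assumed here for both S_f and S_g."""
    return 1.0 / (1.0 + math.exp(-v))


def affine_sigmoid(W, b, v):
    """Return S(W v + b) for a weight matrix W (list of rows) and a vector v."""
    return [sigmoid(sum(w * x for w, x in zip(row, v)) + bk)
            for row, bk in zip(W, b)]


def encode(x, W, b):
    """Eq. (1): h = S_f(W x + b), with W of size D' x D."""
    return affine_sigmoid(W, b, x)


def decode(h, W_dec, b_dec):
    """Eq. (2): z = S_g(W' h + b'), with W' of size D x D'."""
    return affine_sigmoid(W_dec, b_dec, h)


def init_params(D, D_hidden, seed=0):
    """Small random weights and zero offsets (an illustrative choice)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(D)] for _ in range(D_hidden)]
    W_dec = [[rng.uniform(-0.1, 0.1) for _ in range(D_hidden)] for _ in range(D)]
    return W, [0.0] * D_hidden, W_dec, [0.0] * D


x = [0.2, 0.4, 0.6, 0.8, 1.0]            # one sample with D = 5
W, b, W_dec, b_dec = init_params(D=5, D_hidden=2)
h = encode(x, W, b)                       # code with D' = 2
z = decode(h, W_dec, b_dec)               # reconstruction in the input space
print(len(h), len(z))
```

With untrained weights the reconstruction z is of course poor; training adjusts θ and θ' so that z approaches x.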

Fig. 1. Schematic diagram of the general Auto-Encoder (AE). The Encoder (weights W) maps the inputs x1, ..., xD to the hidden units h1, ..., hd, and the Decoder (weights W') maps them to the outputs z1, ..., zD.

The training objective of the AE is to optimize the parameter sets so as to minimize the reconstruction error. The cost function measuring the reconstruction error between the input vector and the reconstructed vector is defined as

$$J(z, x) = \min \frac{1}{N} \sum_{i=1}^{N} \left\| z_i - x_i \right\| \tag{3}$$

In other words, training the AE yields the optimized θ and θ'. A DAE is a deep neural network composed of multiple Auto-encoders (AEs) [50]: the output of each layer is taken as the input of the next layer to learn deep features. The features extracted by each layer of the DAE have better fault tolerance and stronger robustness. As Refs. [38-39] demonstrated, the same greedy layer-by-layer pre-training algorithm can be applied to train the multiple AEs. More details of DAE and AE are given in Refs. [37-38].
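The reconstruction error of Eq. (3) and the layer-wise stacking described above can be sketched as follows. This is an illustrative fragment only: the helper names are hypothetical, a sigmoid activation is assumed, and the norm in Eq. (3) is taken to be the Euclidean norm.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def reconstruction_error(X, Z):
    """Eq. (3): mean Euclidean distance between inputs x_i and outputs z_i."""
    dists = [math.sqrt(sum((zi - xi) ** 2 for zi, xi in zip(z, x)))
             for x, z in zip(X, Z)]
    return sum(dists) / len(dists)


def stacked_encode(x, layers):
    """Pass x through a list of (W, b) encoders; each output feeds the next."""
    a = x
    for W, b in layers:
        a = [sigmoid(sum(w * v for w, v in zip(row, a)) + bk)
             for row, bk in zip(W, b)]
    return a
```

In a DAE, `layers` would hold the encoder parameters obtained by greedy layer-by-layer pre-training, one (W, b) pair per AE.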

2.2. Brief review of Laplacian regularization technology

Regularization has been widely applied in machine learning, image processing, computer science and other fields [38-39], and it plays different roles from different perspectives. In machine learning, regularization prevents over-fitting. In function fitting, it makes the fit smoother. In semi-supervised learning, regularization terms keep the internal structure of the input unchanged. Numerous studies have shown that high-dimensional nonlinear data can be embedded in a low-dimensional manifold; that is, high-dimensional data of the same class also stay close together after being mapped to the low-dimensional space. Based on this, Belkin et al. [43-45] introduced the Laplacian matrix of the intrinsic geometry as a penalty term into the regularization framework of kernel learning, yielding a unified Laplacian framework that can be used for supervised and semi-supervised learning. Laplacian regularization can be understood intuitively as follows: samples that are adjacent in the high-dimensional space should also be adjacent on the low-dimensional manifold. It is described in detail below.

Given a sample set $X = \{x_1, x_2, \ldots, x_n\}$ in the high-dimensional space $R^D$, assume that its mapping onto the low-dimensional embedded space $R^d$ is $Y = \{y_1, y_2, \ldots, y_n\}$, with d < D. The weight between samples $x_i$ and $x_j$ is defined as

$$w_{ij} = \begin{cases} e^{-\left\| x_i - x_j \right\|^2 / t^2}, & \text{if } x_i \text{ and } x_j \text{ are neighbors} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

The Laplacian feature map then reconstructs the local structure of the data manifold by constructing a graph with adjacency matrix W, where t is the kernel parameter. The objective function of the Laplacian feature map can be expressed as

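$$
\begin{aligned}
R &= \sum_{i,j} \left\| x_i - x_j \right\|^2 W_{ij} \\
  &= \sum_{i=1}^{n} \sum_{j=1}^{n} \left( x_i^T x_i - 2 x_i^T x_j + x_j^T x_j \right) W_{ij} \\
  &= \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} W_{ij} \Big) x_i^T x_i + \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} W_{ij} \Big) x_j^T x_j - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij}\, x_i^T x_j \\
  &= \sum_{i=1}^{n} D_{ii}\, x_i^T x_i + \sum_{j=1}^{n} D_{jj}\, x_j^T x_j - 2 \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij}\, x_i^T x_j \\
  &= 2\,\mathrm{trace}(X^T D X) - 2\,\mathrm{trace}(X^T W X) = 2\,\mathrm{trace}(X^T L X)
\end{aligned}
\tag{5}
$$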

where the constraint $Y^T D Y = I$ guarantees that the optimization problem has a solution, and the superscript T denotes matrix transposition. W is the adjacency matrix of the graph, D is the diagonal degree matrix of the graph with $D_{ii} = \sum_{j=1}^{n} W_{ij}$, and $L = D - W$ is the Laplacian matrix.
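To make Eqs. (4) and (5) concrete, the sketch below builds a heat-kernel adjacency matrix, the corresponding Laplacian matrix, and checks numerically that the pairwise form of R equals $2\,\mathrm{trace}(X^T L X)$, with the samples stored as the rows of X. The k-nearest-neighbor rule used to decide which samples are neighbors, the symmetrization step, and all function names are assumptions of this example; this section of the paper does not specify them.

```python
import math


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def heat_kernel_adjacency(X, k=2, t=1.0):
    """Eq. (4): w_ij = exp(-||x_i - x_j||^2 / t^2) for neighbors, else 0."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # k nearest neighbors of sample i, excluding i itself (assumed rule)
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sq_dist(X[i], X[j]))
        for j in order[:k]:
            w = math.exp(-sq_dist(X[i], X[j]) / t ** 2)
            W[i][j] = W[j][i] = w          # keep W symmetric
    return W


def graph_laplacian(W):
    """L = D - W, where D is the diagonal degree matrix."""
    n = len(W)
    deg = [sum(row) for row in W]
    return [[(deg[i] if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def pairwise_penalty(X, W):
    """First line of Eq. (5): sum_ij ||x_i - x_j||^2 W_ij."""
    n = len(X)
    return sum(W[i][j] * sq_dist(X[i], X[j])
               for i in range(n) for j in range(n))


def trace_form(X, L):
    """Last line of Eq. (5): 2 * trace(X^T L X), samples as rows of X."""
    n, dim = len(X), len(X[0])
    tr = 0.0
    for c in range(dim):
        col = [X[i][c] for i in range(n)]
        tr += sum(col[i] * L[i][j] * col[j]
                  for i in range(n) for j in range(n))
    return 2.0 * tr


X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
W = heat_kernel_adjacency(X, k=2, t=1.0)
L = graph_laplacian(W)
print(abs(pairwise_penalty(X, W) - trace_form(X, L)) < 1e-9)
```

The identity relies on W being symmetric, which is why the sketch symmetrizes the nearest-neighbor graph.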

3. The proposed deep learning algorithm and fault diagnosis method

In this paper, a new deep learning algorithm, the Deep Laplacian Auto-encoder (DLapAE), is proposed for intelligent imbalanced fault diagnosis of rotating machinery. This section has two parts: 1) the proposed Deep Laplacian Auto-encoder algorithm; 2) the general procedure of the proposed fault diagnosis method based on DLapAE.

3.1. The proposed Deep Laplacian Auto-encoder (DLapAE)

This subsection consists of two parts: 1) based on the general AE, the Laplacian AE (LapAE) is designed by introducing Laplacian regularization; 2) based on the LapAE, DLapAE is constructed from multiple LapAEs to improve the generalization performance of fault diagnosis and make it better suited to feature learning and classification of imbalanced data.

3.1.1 Constructing the new objective function of Laplacian AE (LapAE)

When the Laplacian regularization matrix is introduced, each update of the network weights adjusts the parameters of every layer so that the error between the actual output and the label is as small as possible, while samples of the same class are drawn close together and samples of different classes are pushed apart. The network is thus updated iteratively to extract sensitive features that are more conducive to fault recognition. Suppose there is a sample set $\{(x_1, l_1), \ldots, (x_n, l_n)\}$ belonging to c classes, where $l_i$ is the category label of sample $x_i$ and $z_i$ is its reconstructed output. The loss function of the general AE is defined as

$$J = \frac{1}{n} \sum_{i=1}^{n} L_{MSE}(x_i, z_i) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{2} (z_i - x_i)^2 \tag{6}$$

To draw samples of the same class closer together and separate samples of different classes more clearly, a Laplacian regularization term is added to the loss function of the general AE, which strengthens the generalization ability of the proposed model. The Laplacian regularization term R is defined as in Eq. (5); it yields a sufficiently smooth projection in the low-dimensional space that maintains the manifold structure of the data in the original space. Compared with other regularization techniques, Laplacian regularization is based on the smoothness assumption of manifold learning: the goal of the embedding is to preserve the neighborhood relations of same-class data when they are mapped from the high-dimensional to the low-dimensional space. In other words, this constraint makes the intra-class distance of the extracted features tighter and the inter-class distance larger, so that the extracted features generalize well. To make the features extracted at each layer more favorable to classification, the loss function of the Laplacian AE, combined with Laplacian regularization, is written as

$$J_{Lap} = J + \varepsilon R \tag{8}$$

where J is the loss function of the general AE, $\varepsilon$ is the adjustment parameter of the Laplacian regularization term, and R is the Laplacian regularization term. The new loss function $J_{Lap}$ makes the intra-class and inter-class distances of the learned features more distinct, and with a small number of training samples it is better suited to classification. The value of the regularization parameter $\varepsilon$ in Eq. (8) is determined experimentally and usually ranges from 0 to 1. Training seeks the minimum of $J_{Lap}$, which can be optimized by gradient descent with the iterative equations

$$W_{ij}^{l} = W_{ij}^{l} - \beta \frac{\partial J_{Lap}(W, b)}{\partial W_{ij}^{l}} \tag{9}$$

$$b_{i}^{l} = b_{i}^{l} - \beta \frac{\partial J_{Lap}(W, b)}{\partial b_{i}^{l}} \tag{10}$$

where β is the learning rate. After the residual of the last layer has been computed, all updated weights are obtained iteratively.

3.1.2. Constructing Deep Laplacian Auto-Encoder

The Laplacian Auto-encoder above is a shallow network. Because the health features contained in the training samples are complex, it is difficult to learn fault features with good representation ability through only one hidden layer. Therefore, multiple Laplacian Auto-encoders are stacked and a classification layer is added to construct a deep Laplacian auto-encoder neural network. When multiple Laplacian AEs are stacked, the output of each Laplacian AE is used as the input of the next. The encoding process of each layer can be written as

$$a^{(l)} = f(z^{(l)}), \qquad z^{(l+1)} = W^{(l,1)} a^{(l)} + b^{(l,1)} \tag{11}$$

where $a^{(l)}$ is the output of layer l, and $z^{(l)}$ and $z^{(l+1)}$ are the inputs of layer l and layer l+1, respectively. Similarly, the decoding process of the multi-layer neural network is defined as

$$a^{(n+l)} = g(z^{(n+l)}), \qquad z^{(n+l+1)} = W^{(n-l,2)} a^{(n+l)} + b^{(n-l,2)} \tag{12}$$

where $a^{(n+l)}$ is the output of the deepest hidden unit and is the highest-order representation of the input. The training of each layer is unsupervised; that is, each auto-encoder is trained so that its output equals its input. After the Laplacian auto-encoders are stacked, the output of the last hidden layer can be regarded as an approximation of the original input. In this paper, BP is selected as the classification layer, and the number of neurons in the classification layer equals the number of categories. The basic steps of DLapAE are summarized as follows:

1) Train the first Laplacian Auto-encoder in an unsupervised manner;

2) Use the output of this Laplacian Auto-encoder as the input of the next Laplacian Auto-encoder;

3) Repeat step 2) until all Laplacian Auto-encoders are trained;

4) Use the output of the last hidden layer as the input of the classification layer, whose number of neurons equals the number of health condition categories, in preparation for the supervised fine-tuning.
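Steps 1) to 3), together with the LapAE loss of Eq. (8) and the gradient descent updates of Eqs. (9) and (10), can be sketched as follows. This is a toy illustration under several assumptions that are not taken from the paper: the Laplacian term is evaluated on the hidden codes, every pair of samples is treated as neighbors with heat-kernel weights, gradients are approximated by central finite differences instead of back-propagation, and all names and hyper-parameter values are hypothetical. The classification layer of step 4) is omitted.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def layer(X, W, b):
    """Apply S(W x + b) to every sample (row) of X."""
    return [[sigmoid(sum(w * x for w, x in zip(row, xi)) + bk)
             for row, bk in zip(W, b)] for xi in X]


def heat_kernel(X, t):
    """Heat-kernel weights of Eq. (4); all pairs are treated as neighbors."""
    n = len(X)
    return [[0.0 if i == j else math.exp(-sq_dist(X[i], X[j]) / t ** 2)
             for j in range(n)] for i in range(n)]


def unpack(theta, D, d):
    """Split the flat parameter list into encoder (W, b) and decoder (W2, b2)."""
    W = [theta[r * D:(r + 1) * D] for r in range(d)]
    b = theta[d * D:d * D + d]
    off = d * D + d
    W2 = [theta[off + r * d:off + (r + 1) * d] for r in range(D)]
    b2 = theta[off + D * d:off + D * d + D]
    return W, b, W2, b2


def lap_ae_loss(theta, X, A, eps, d):
    """Eq. (8): J_Lap = J + eps * R, with R evaluated on the hidden codes."""
    n, D = len(X), len(X[0])
    W, b, W2, b2 = unpack(theta, D, d)
    H = layer(X, W, b)
    Z = layer(H, W2, b2)
    J = sum(0.5 * sq_dist(z, x) for z, x in zip(Z, X)) / n        # Eq. (6)
    R = sum(A[i][j] * sq_dist(H[i], H[j])
            for i in range(n) for j in range(n))
    return J + eps * R


def num_grad(f, theta, step=1e-5):
    """Central finite-difference gradient (stand-in for back-propagation)."""
    grad = []
    for k in range(len(theta)):
        up = theta[:k] + [theta[k] + step] + theta[k + 1:]
        dn = theta[:k] + [theta[k] - step] + theta[k + 1:]
        grad.append((f(up) - f(dn)) / (2 * step))
    return grad


def train_lap_ae(X, d, eps=0.1, beta=0.1, iters=50, t=1.0, seed=0):
    """Train one LapAE; return encoder weights, offsets and the loss history."""
    rng = random.Random(seed)
    D = len(X[0])
    A = heat_kernel(X, t)
    theta = [rng.uniform(-0.5, 0.5) for _ in range(2 * d * D + d + D)]

    def f(th):
        return lap_ae_loss(th, X, A, eps, d)

    history = [f(theta)]
    for _ in range(iters):
        grad = num_grad(f, theta)
        theta = [p - beta * g for p, g in zip(theta, grad)]   # Eqs. (9)-(10)
        history.append(f(theta))
    W, b, _, _ = unpack(theta, D, d)
    return W, b, history


def pretrain_stack(X, hidden_sizes, **kwargs):
    """Steps 1)-3): train LapAEs one by one, feeding each output to the next."""
    layers, feats = [], X
    for d in hidden_sizes:
        W, b, _ = train_lap_ae(feats, d, **kwargs)
        layers.append((W, b))
        feats = layer(feats, W, b)
    return layers, feats
```

For example, `pretrain_stack(X, [3, 2])` would train two LapAEs in sequence and return their encoder parameters together with the 2-dimensional features of the last hidden layer, which would then feed the classification layer.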

3.1.3 Pre-training and supervised fine-tuning for DLapAE

The Deep Laplacian Auto-encoder is established by pre-training, and the features learned in each layer are representations of the data at different orders. The network must then be supervised with labeled samples to improve the performance of DLapAE: a supervised learning algorithm further adjusts the pre-trained neural network, and the weights and offsets are optimized over multiple iterations. The process is as follows.

1) The labeled samples $(x^{(i)}, l^{(i)})$ are fed into the pre-trained network, and forward propagation is applied to obtain the activation values of each layer;

2) For the output layer $n_l$, the residual is defined as

$$\delta^{(n_l)} = \left( \nabla_{a^{(n_l)}} J_{Lap} \right) \odot f'(z^{(n_l)}) \tag{13}$$

where $\nabla_{a^{(n_l)}} J_{Lap}$ is the partial derivative of $J_{Lap}$ with respect to the output activations and $\odot$ denotes the element-wise product;

3) For the layers $l = n_l - 1, n_l - 2, \ldots, 2$, the residual is written as

$$\delta^{(l)} = \left( (W^{(l)})^T \delta^{(l+1)} \right) \odot f'(z^{(l)}) \tag{14}$$

4) The partial derivatives are calculated as

$$\nabla_{W^{(l)}} J_{Lap}(W, b) = \delta^{(l+1)} (a^{(l)})^T, \qquad \nabla_{b^{(l)}} J_{Lap}(W, b) = \delta^{(l+1)} \tag{15}$$

In this way, the network parameters can be adjusted slightly according to Eqs. (13)-(15). Pre-training followed by supervised fine-tuning completes the DLapAE neural network and realizes an organic combination of unsupervised self-learning and supervised fine-tuning. Different hidden layers can learn different features. Hence, the deep Laplacian Auto-encoder has a complete topological structure and strong nonlinear fitting ability. In summary, the deep Laplacian Auto-encoder model proposed in this paper is shown in Fig. 2.

[Figure 2 labels: feature extraction layers with weights W and Laplacian regularization; classification layer; fine-tuning and weight updates]

Fig 2. Schematic diagram of Deep Laplacian Auto-encoder (DLapAE)
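The residual recursion of the supervised fine-tuning stage can be illustrated with a small pure-Python sketch. It assumes a sigmoid activation and uses the plain squared error J = 0.5 * ||a_out - y||^2 as a stand-in for J_Lap (the Laplacian regularization term is omitted here), so it shows the mechanics of back-propagating residuals rather than the exact objective of the paper. The function names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, biases, x):
    """Return the activations of every layer (a[0] is the input)."""
    activations = [list(x)]
    for W, b in zip(weights, biases):
        a_prev = activations[-1]
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a_prev)) + b_i
             for row, b_i in zip(W, b)]
        activations.append([sigmoid(z_i) for z_i in z])
    return activations


def backprop(weights, biases, x, y):
    """Gradients of the stand-in loss 0.5 * ||a_out - y||^2 (assumption)."""
    a = forward(weights, biases, x)
    # Output residual: dJ/da times f'(z), where f'(z) = a * (1 - a) for a sigmoid.
    delta = [(a_i - y_i) * a_i * (1.0 - a_i) for a_i, y_i in zip(a[-1], y)]
    grads_W, grads_b = [], []
    for l in range(len(weights) - 1, -1, -1):
        # Weight gradient is the outer product of the residual and the
        # activation of the layer below; bias gradient is the residual itself.
        grads_W.insert(0, [[d_i * a_j for a_j in a[l]] for d_i in delta])
        grads_b.insert(0, list(delta))
        if l > 0:
            # Propagate the residual one layer down through W^T.
            W = weights[l]
            delta = [sum(W[i][j] * delta[i] for i in range(len(delta)))
                     * a[l][j] * (1.0 - a[l][j]) for j in range(len(a[l]))]
    return grads_W, grads_b
```

A gradient-descent update would then subtract the learning rate times these gradients from each weight and bias; a finite-difference comparison is a simple way to check such a routine.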

3.2 The general procedure of the proposed fault diagnosis method based on DLapAE

According to the above discussion, the proposed imbalanced fault diagnosis method of rotating machinery based on DLapAE is shown in Fig. 3. Its main implementation steps are as follows.

1) Signal acquisition and data processing. The vibration signals of the rotating machine are collected by sensors and a data acquisition card, and the sample set of the rolling bearing under different health conditions is obtained. The sample set is then divided into a training sample set and a testing sample set, each of which is normalized;

2) Pre-training of DLapAE. The network structure parameters (including the number N of Laplacian Auto-encoders, the number of neurons, the regularization coefficient, etc.) are set to construct DLapAE. Afterwards, the raw signal values of the training samples are entered into DLapAE for layer-by-layer training;

3) Supervised fine-tuning of the network. The parameters of DLapAE are fine-tuned by back-propagation using a small number of labeled training samples, which completes the training of the network parameters;

4) Fault diagnosis by DLapAE. The testing samples are entered into the trained DLapAE, and the network output gives the diagnostic results.

[Figure 3 blocks: rotating machinery; signal acquisition and processing; establishing unbalanced health data sets (training samples, testing samples); fault diagnosis by DLapAE (feature extraction layers with weights W, Laplacian regularization, classification layer, fine-tuning and weight updates); verification and application of the proposed method, with a 3-D scatter of features F1-F10 on the 1st, 2nd and 3rd PC axes]

Fig 3. General procedure of the proposed method

4. Experimental Validation and Analysis

To illustrate the applicability of the proposed imbalanced fault diagnosis method, this section validates it on two rolling bearing fault experimental datasets: the motor bearing fault data of Case Western Reserve University (CWRU) [42] and our laboratory rolling bearing fault dataset from the Accelerated Bearing Life Tester (ABLT-1A) at Southeast University (SEU).

4.1 Experimental validation and analysis for CWRU bearing data set

Bearings are among the most common and vulnerable parts of rotating machinery, so fault diagnosis of rotating machinery relies on effective condition monitoring and health management of such typical components. The CWRU bearing fault data set is a widely used public data set for fault diagnosis and has become a benchmark in this field. Therefore, once a general fault diagnosis method is proposed, it is necessary to verify it on the benchmark CWRU bearing fault data set.

4.1.1 Data acquisition and parameter setting

To validate the effectiveness of the proposed imbalanced fault diagnosis method, the rolling bearing experimental data from the CWRU bearing test bench shown in Fig. 4 are used, following Ref. [51]. The data were collected under the following experimental conditions: the motor load was 3 hp, the sampling frequency was 48 kHz, the rotating speed was 1730 r/min, and the vibration signals under the various working conditions were collected by an acceleration sensor on the drive end bearing. The bearing faults were seeded by Electrical Discharge Machining (EDM) at three fault levels, with depths of 0.18 mm (slight fault level), 0.36 mm (moderate fault level) and 0.54 mm (serious fault level). In total, the experiment simulated 10 health conditions of the rolling bearing, namely ball slight fault (BS), inner ring slight fault (IRS), outer ring slight fault (ORS), ball moderate fault (BM), inner ring moderate fault (IRM), outer ring moderate fault (ORM), ball late fault (BL), inner ring late fault (IRL), outer ring late fault (ORL) and normal state (N). Each sample consists of 1024 vibration data points, and 100 samples are available for each health condition. To reflect both balance and imbalance of the fault data, two data sets, B (balanced data set) and UB (unbalanced data set), are constructed as shown in Tab. 1.

[Figure 4 labels: motor, driving end bearing, fan end bearing, measuring transducer, load, engine base]

Fig. 4. The motor-bearing test rig of CWRU [51]

Tab. 1 The Case Western Reserve University bearing balanced data set B and unbalanced data set UB

| Failure types | Proportion of training samples to the total samples (data set B) | Proportion of training samples to the total samples (data set UB) | Proportion of testing samples to the total samples |
|---|---|---|---|
| H1 (BS) | 50% | 20% | 50% |
| H2 (IRS) | 50% | 20% | 50% |
| H3 (ORS) | 50% | 20% | 50% |
| H4 (BM) | 50% | 30% | 50% |
| H5 (IRM) | 50% | 30% | 50% |
| H6 (ORM) | 50% | 30% | 50% |
| H7 (BL) | 50% | 40% | 50% |
| H8 (IRL) | 50% | 40% | 50% |
| H9 (ORL) | 50% | 40% | 50% |
| H10 (N) | 50% | 50% | 50% |

Notes: The balanced data set is denoted as B; the unbalanced data set is denoted as UB
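The construction of the two data sets can be sketched as follows, using the proportions listed in Tab. 1 and the 100 samples per health condition stated in the text. How the individual samples are chosen is not specified in the paper; the seeded random shuffle below is an assumption, and all names are illustrative.

```python
import random

# Training proportions from Tab. 1; every class keeps 50% of its samples for testing.
TRAIN_FRACTION_B = {f"H{i}": 0.5 for i in range(1, 11)}
TRAIN_FRACTION_UB = {"H1": 0.2, "H2": 0.2, "H3": 0.2,
                     "H4": 0.3, "H5": 0.3, "H6": 0.3,
                     "H7": 0.4, "H8": 0.4, "H9": 0.4,
                     "H10": 0.5}
TEST_FRACTION = 0.5
SAMPLES_PER_CLASS = 100


def split_indices(train_fraction, n=SAMPLES_PER_CLASS, seed=0):
    """Return {label: (train_indices, test_indices)} with disjoint index sets."""
    rng = random.Random(seed)           # assumption: random selection of samples
    split = {}
    for label, fraction in train_fraction.items():
        order = list(range(n))
        rng.shuffle(order)
        n_test = round(TEST_FRACTION * n)
        n_train = round(fraction * n)
        test = order[:n_test]
        train = order[n_test:n_test + n_train]
        split[label] = (train, test)
    return split


ub = split_indices(TRAIN_FRACTION_UB)
total_train = sum(len(train) for train, _ in ub.values())   # 320 training samples
total_test = sum(len(test) for _, test in ub.values())      # 500 testing samples
```

With these proportions, data set B has 500 training samples and data set UB has 320, while both share 500 testing samples.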

As shown in Tab. 1, the two datasets (B and UB) differ in their degree of balance. In data set B, 50% of the samples of each health condition were used as training samples, and the remaining samples were used as testing samples. Since the number of training samples is the same for every health condition, dataset B is a balanced dataset. In practical industrial applications, fault samples are more difficult to collect than normal samples, so the number of training samples for the fault conditions of data set B is reduced to form dataset UB, which simulates the imbalance of health data. The time domain waveforms of the vibration signals under the different health conditions are shown in Fig. 5. It can be seen from Fig. 5 that it is difficult to quantify the different degrees and types of faults with traditional time-frequency domain analysis, which relies heavily on a large amount of expert knowledge and field experience. Therefore, an intelligent fault diagnosis method is needed to quantify the fault diagnosis result. At present, intelligent fault diagnosis methods based on machine learning are widely used. To improve the diagnostic accuracy on unbalanced datasets, a novel deep learning algorithm (DLapAE) is proposed in this paper, together with an imbalanced fault diagnosis approach for rotating machinery based on DLapAE.

[Figure 5 panels: bearing ball slight (BS), bearing inner ring slight (IRS), bearing outer ring slight (ORS), bearing ball moderate (BM), bearing inner ring moderate (IRM), bearing outer ring moderate (ORM), bearing ball late (BL), bearing inner ring late (IRL), bearing outer ring late (ORL) and bearing normal (N); each panel plots Amplitude (-5 to 5) against Time (s) (0 to 1)]

Fig. 5 Time domain waveform diagram of vibration signals under different health conditions

Following Refs. [7, 9], the network model parameters of DLapAE are set as follows. The numbers of neurons in the layers are set to [1024-200-100-1024-100-10]. Furthermore, based on the proposed fault diagnosis method (Fig. 3) and data set B (Tab. 1), the regularization adjustment parameter (ɛ) is selected by grid search. The effect of the regularization adjustment factor on the DLapAE model is shown in Fig. 6, from which ɛ = 0.6 appears to be the most suitable value. Therefore, considering the stability and convergence speed of the algorithm, the parameter settings of DLapAE are listed in Tab. 2.

Fig. 6. The influence of adjustment coefficient (ɛ) for fault diagnosis based on DLapAE

Tab. 2. The parameters setting up of DLapAE

| Parameter type | Value | Parameter type | Value |
|---|---|---|---|
| Learning rate | 0.1 | nn.output | 'sigm' |
| The number of hidden layers | 2 | nn.inputZeroMaskedFraction | 0 |
| The number of units in the input layer | 1024 | nn.weightPenaltyL2 | 0 |
| The number of units in the first hidden layer | 200 | Sparsity Penalty | 0 |
| The number of units in the second hidden layer | 100 | Batch-size | 20 |
| The input units of BP | 100 | Sparsity Target | 0.05 |
| The hidden units of BP | 50 | Activation_function | optimal tanh |
| The output units of BP | 10 | Number of iterations | 200 |
| Momentum | 0.5 | Regularization adjustment coefficient | ɛ=0.6 |

4.1.2 Diagnosis results for balanced dataset and imbalanced dataset

For experimental comparison, several existing deep learning models, namely the standard DAE (Deep Auto-encoder), DRAE (Deep Regularization Auto-encoder) and DSAE (Deep Sparse Auto-encoder), were compared with the constructed DLapAE algorithm. Two experimental case studies were carried out under the following conditions.

Experiment 1: For dataset B, no signal pre-processing or artificial feature extraction is required, and the normalized vibration data are used directly as the input of the above four fault diagnosis models;

Experiment 2: For dataset UB, the normalized vibration data are likewise used directly as the input of the above four fault diagnosis models, without signal pre-processing or artificial feature extraction.

Following the diagnostic flowchart in Fig. 3, data set B is diagnosed and classified by the proposed DLapAE-based method and by the DAE-, DRAE- and DSAE-based methods. The recognition results of the testing samples for the four diagnostic methods are displayed in Fig. 8 a)-d), respectively. The results show that, for all of these deep learning methods, the original signal can be entered directly into the model. In Fig. 8, the diagnostic accuracy of the DAE-based method (0.83) is lower than that of the DSAE-based (0.90) and DRAE-based (0.956) methods. The accuracy of the proposed DLapAE-based method, however, is close to 100% (0.99), which indicates that the proposed method can largely eliminate the interference between different health conditions and accurately identify the 10 health conditions of the rolling bearing.

The average diagnostic accuracy over the fault types is defined by the following equation, and the diagnostic details are given in Table 3.

$$Acc_{ave} = \frac{\sum_{i=1}^{N} H_i}{N} \tag{16}$$

where N is the total number of fault types and H_i is the diagnostic accuracy of the i-th fault type.

[Figure 7 panels: low-dimensional embedded scatter distribution of testing samples (H1-H10) on the 1st, 2nd and 3rd PC axes based on (a) DAE, (b) DSAE, (c) DRAE and (d) DLapAE]

Figure 7. 3-D features visualization for different diagnostic models

[Figure 8 panels: predicted value and actual value of the health status (1-10) for the 500 testing samples of dataset B based on (a) DAE (DAE=0.83), (b) DSAE (DSAE=0.90), (c) DRAE (DRAE=0.9560) and (d) DLapAE (DLapAE=0.99)]

Fig. 8 Diagnostic results for dataset B based on different fault diagnosis models

Table 3. A statistic of the averaged precision for dataset B based on different fault diagnosis models

| Different conditions | H1 | H2 | H3 | H4 | H5 | H6 | H7 | H8 | H9 | H10 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DAE | 0.26 | 1 | 1 | 0.98 | 0.18 | 0.98 | 0.94 | 0.98 | 0.98 | 1 | 0.83 |
| DSAE | 0.94 | 1 | 1 | 0.84 | 0.48 | 0.86 | 0.94 | 0.96 | 0.98 | 1 | 0.90 |
| DRAE | 0.98 | 1 | 1 | 1 | 0.82 | 0.84 | 0.92 | 1 | 1 | 1 | 0.956 |
| DLapAE (the proposed) | 0.96 | 1 | 1 | 1 | 0.96 | 0.98 | 1 | 1 | 1 | 1 | 0.99 |
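As a quick check of Eq. (16), the averages in the last column of Table 3 can be recomputed from the per-condition accuracies. The short Python sketch below does this; the dictionary and function names are illustrative, and the numbers are copied from Table 3.

```python
# Per-condition accuracies H1..H10 copied from Table 3.
PER_CLASS_ACCURACY = {
    "DAE":    [0.26, 1, 1, 0.98, 0.18, 0.98, 0.94, 0.98, 0.98, 1],
    "DSAE":   [0.94, 1, 1, 0.84, 0.48, 0.86, 0.94, 0.96, 0.98, 1],
    "DRAE":   [0.98, 1, 1, 1, 0.82, 0.84, 0.92, 1, 1, 1],
    "DLapAE": [0.96, 1, 1, 1, 0.96, 0.98, 1, 1, 1, 1],
}


def average_accuracy(per_class):
    """Eq. (16): the mean of the per-condition accuracies H_i."""
    return sum(per_class) / len(per_class)


averages = {name: average_accuracy(h) for name, h in PER_CLASS_ACCURACY.items()}
for name, value in averages.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values agree with the reported averages of 0.83, 0.90, 0.956 and 0.99.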

To validate the feature extraction ability of the proposed fault diagnosis method, the last-layer features extracted by the four diagnostic methods were reduced to three dimensions for visualization by the t-distributed stochastic neighbor embedding (t-SNE) technique [52], and the sensitive features extracted by DLapAE were compared with those of DAE, DSAE and DRAE. Taking experiment 1 as an example, t-SNE was employed because the features extracted by DLapAE, DAE, DSAE and DRAE are all high-dimensional (100 dimensions each). The resulting 3-D visualization of the features learned by the different fault diagnosis models is shown in Fig. 7, where PC1, PC2 and PC3 represent the first three principal components, respectively. As can be seen from Fig. 7, compared with the other three kinds of features, the deep features learned by DLapAE separate the input data more easily. The reasons may be summarized as follows: 1) the proposed deep learning model has a strong ability to learn representative information from the input data; 2) Laplacian regularization makes the extracted features more discriminative for classification.

In experiment 2, the unbalanced data (dataset UB) are recognized by DLapAE. DAE, DRAE and DSAE are also applied to the unbalanced dataset as comparative experiments. Following the diagnostic flowchart in Fig. 3, the recognition results of the testing samples (dataset UB) for the four methods are shown in Fig. 9 a)-d), respectively.

[Figure 9 panels: predicted value and actual value of the health status (1-10) for the 500 testing samples of dataset UB based on a) DLapAE (DLapAE=0.896), b) DRAE (DRAE=0.726), c) DSAE (DSAE=0.758) and d) DAE (DAE=0.862)]

Fig. 9. Diagnostic results of data set UB based on different fault diagnosis models

Compared with data set B, the diagnostic accuracy of the four methods on dataset UB generally decreases. On dataset UB, the diagnostic accuracy of DLapAE is 0.896, that of DRAE is 0.726, that of DSAE is 0.758, and that of DAE is 0.862. Owing to the Laplacian regularization term, the decline in accuracy is alleviated for DLapAE. Therefore, the proposed DLapAE outperforms the other three methods on the unbalanced fault diagnosis task. Meanwhile, to evaluate the feature extraction ability of the proposed fault diagnosis method, the last-layer features extracted by the four diagnostic methods are again reduced to three dimensions for visualization by the t-SNE technique. The three-dimensional visualization of the features for the different fault diagnosis models is displayed in Fig. 10. As can be seen from Fig. 10, apart from DLapAE, the diagnostic models have difficulty separating the 10 bearing health conditions.

[Figure 10 panels: low-dimensional embedded scatter distribution of dataset UB testing samples (H1-H10) on the 1st, 2nd and 3rd PC axes based on a) DLapAE, b) DRAE, c) DSAE and d) DAE]

Fig. 10. Diagnostic results of dataset UB based on different fault diagnosis models

4.1.3 Evaluation and analysis of the extracted deep sensitive features

To evaluate the separation and clustering performance of the features extracted by the different fault diagnosis methods, two parameters, the inter-class covariance Sb and the intra-class covariance Sw, are calculated to reflect the separability and clustering degree of the samples [53]. In cluster analysis, the intra-class covariance Sw describes the compactness of the sample distribution within each class, and the inter-class covariance Sb describes the degree of separation between classes. Let the feature vectors be {v1, v2, ..., vd}, where d is the target dimension of the feature vector. Sw and Sb are then defined by Equations (17) and (18), respectively:

$$S_w = \sum_{r=1}^{l}\sum_{k \in L_r}\left(v_k - m_v^{r}\right)\left(v_k - m_v^{r}\right)^{T}, \quad k = 1, 2, \ldots, d \tag{17}$$

$$S_b = \sum_{r=1}^{l}\left(m_v^{r} - m_v\right)\left(m_v^{r} - m_v\right)^{T} \tag{18}$$

where $m_v^{r}$ is the mean of the sample feature vectors of the r-th class (r ∈ L = {1, 2, ..., l}), and $m_v = \frac{1}{l}\sum_{r=1}^{l} m_v^{r}$ is the average of the class means over all classes. The inter-class covariance Sb describes the degree of dispersion between different classes, and the intra-class covariance Sw represents the degree of clustering within the same class. In short, the larger the ratios Sb/Sw and (Sw+Sb)/Sw are, the better the extracted features are. Accordingly, this paper uses two evaluation indices to quantitatively describe the quality of different features, defined as:

$$E_1 = \frac{S_b}{S_w} \tag{19}$$

$$E_2 = \frac{S_w + S_b}{S_w} \tag{20}$$

The calculation results for the four diagnostic methods are shown in Tab. 4. According to the feature evaluation criteria, the larger Ei (i=1,2) is, the better the classification result is. The features learned by DLapAE attain the largest values of both indices, indicating the best distinguishability and clustering among the four methods.

Tab. 4 The quantitative evaluation of different features by different diagnostic methods

| Extracted features by diagnostic methods | Evaluation index E1 | Evaluation index E2 |
|---|---|---|
| DAE | 25.9506 | 28.7505 |
| DRAE | 32.9868 | 37.8542 |
| DSAE | 16.8570 | 24.5344 |
| DLapAE | 72.0195 | 81.1752 |
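The two indices can be sketched in pure Python as below. The paper does not state how the matrix-valued Sb and Sw are reduced to scalars before taking the ratios; this sketch assumes the traces tr(Sb) and tr(Sw) are used, so it will not necessarily reproduce the numbers in Tab. 4. The function names are illustrative.

```python
def mean_vector(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def scatter_indices(features, labels):
    """Return (E1, E2) with tr(Sb) and tr(Sw) as scalar summaries (assumption)."""
    classes = sorted(set(labels))
    class_means = {}
    trace_sw = 0.0
    for c in classes:
        members = [f for f, y in zip(features, labels) if y == c]
        class_means[c] = mean_vector(members)
        # tr((v - m)(v - m)^T) equals the squared Euclidean distance |v - m|^2.
        trace_sw += sum(squared_distance(v, class_means[c]) for v in members)
    overall_mean = mean_vector(list(class_means.values()))  # mean of the class means
    trace_sb = sum(squared_distance(class_means[c], overall_mean) for c in classes)
    e1 = trace_sb / trace_sw
    e2 = (trace_sw + trace_sb) / trace_sw
    return e1, e2


# Toy example: two tight clusters that are far apart give large indices.
toy_features = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
e1, e2 = scatter_indices(toy_features, toy_labels)   # e1 = 12.5, e2 = 13.5
```

In the toy example tr(Sw) = 4 and tr(Sb) = 50, so compact, well-separated classes yield large E1 and E2, which is the behavior the evaluation criterion rewards.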

4.2 Experimental validation and analysis for ABLT-1A bearing data set

In practical industrial applications, the failure of rotating machinery does not occur suddenly; a bearing failure develops over a complete life cycle. Bearing failures generally go through four stages: normal state, early failure, failure development, and late failure. The ABLT-1A test rig is capable of simulating the full life cycle of a bearing. The experimental data used here are fault simulation data produced in our own laboratory, and they serve as a supplement to the CWRU bearing fault data set for verifying the effectiveness of the proposed method.

4.2.1 Data acquisition and parameter setting

To evaluate the performance of the proposed DLapAE-based method on imbalanced rolling bearing fault data, it is necessary to simulate the various health conditions of rolling bearings. The fault simulation test bench used in this section is the ABLT-1A (Accelerated Bearing Life Tester provided by Hangzhou Bearing Test Research Center), shown in Fig. 11. Its main components are the computer control system, test head, lubrication system, transmission system, loading system, and measurement and data acquisition system. The test bearings are mounted on a shaft driven by an AC motor, and the drive system consists of a rubber belt that connects the AC motor to the shaft via two pulleys. Because a bearing usually takes a long time to progress from the normal state to final failure, a natural run-to-failure test requires a lot of manpower and material resources. In view of this, this paper adopts an enhanced accelerated loading experiment to simulate bearing faults. Without changing the contact fatigue mechanism of the bearing, the load on each tested bearing is set close to one-third of the rated dynamic load C, and the test is run until the bearings reach fatigue failure. The loading diagram of the four bearings in the test rig is displayed in Fig. 12 a).

[Figure 11 labels: loading system, test bearings and sensors, driving system, data acquisition, control system]

Fig. 11. Real show of Accelerated Bearing Life Tester (ABLT-1A) Test Bench

Meanwhile, an NI 9234 data acquisition card and a PCB acceleration sensor are employed for vibration data acquisition. The sampling frequency is 25600 Hz and the rotation speed is 3000 r/min. The loading conditions of the test rig are shown in Tab. 5. The rated dynamic load of the bearing 6308 is 42.3 kN, and its actual weight is 30 kg; the radial load applied to each bearing is 15 kN. After running for 10 hours to reach the full load, the testing machine was eventually shut down when the RMS index reached the threshold. Since this is an accelerated fatigue experiment, the life degradation process of the rolling bearing is shorter than under actual operating conditions. After the test bench was stopped, the outer rings of the four bearings were cut open, and the surface of a rolling element of bearing 2 was found to have peeled off, as shown in Fig. 12 b).

[Figure 12 panels: a) the schematic of load loading and sensor (sensor, thermocouple); b) the failure condition of the bearing rolling element (bearing rolling element damage)]

Fig. 12. Bearing installation and fault conditions for Accelerated Bearing Life Tester

Tab. 5. Experimental conditions of Accelerated Bearing Life Tester

| Test conditions | Specification |
|---|---|
| Bearing type | 6308 single row deep groove ball bearings |
| Test speed | 3000 r/min |
| Bearing test number | 4 sets |
| Sampling frequency | 25600 Hz |
| Data save interval | 30 s |
| Vibration signal data length | 25600 |
| Radial load on each bearing | 15 kN |

4.2.2 Experimental results and analysis

The kurtosis curve of channel 2 over the full life of bearing 2 is displayed in Fig. 13. In theory, the kurtosis index is sensitive to the fault impact characteristics of a rolling bearing. Fig. 13 shows that the kurtosis value changes abruptly at the data point X = 1131, after which the bearing enters the failure stage. That is to say, the amount of data recorded in the normal condition is far greater than the amount recorded after the fault occurred. Here, the point X = 1131 is regarded as the onset of the early weak fault stage of the rolling bearing. The life cycle of the bearing is therefore divided into four stages: normal (Full N), early failure (Full E), failure development (Full D) and late failure (Full L). Each sample consists of 1024 vibration data points, and 100 samples are available for each health condition. The waveforms of the four stages are displayed in Fig. 14. Correspondingly, the two constructed data sets, Full life B (balanced data set) and Full life UB (unbalanced data set), are described in Tab. 6.
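For readers who want to reproduce a degradation curve of this kind, a kurtosis index can be computed per recorded segment as sketched below. The paper does not give its exact formula; this sketch assumes the common definition as the fourth central moment divided by the squared variance (about 3 for Gaussian noise), and the function name is illustrative.

```python
def kurtosis(signal):
    """Fourth central moment divided by the squared variance (assumed definition)."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return m4 / (m2 ** 2)


# A smooth alternating signal has low kurtosis; a single sharp impact raises it.
smooth = [1.0, -1.0] * 512
impulsive = [0.0] * 1023 + [10.0]
k_smooth = kurtosis(smooth)        # equals 1.0 for this two-level signal
k_impulsive = kurtosis(impulsive)  # much larger because of the single spike
```

Applying such a function to each saved vibration record in time order yields one index value per record, and an abrupt rise in that sequence is the kind of change point discussed above.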

[Figure 13 plot: full life performance degradation index of rolling bearing 6308; kurtosis index against Time (min), with the point at which the earlier failure occurred marked at X: 1131, Y: 6.854]

Fig. 13. The full life performance degradation curve of rolling bearing 6308

[Figure 14 panels: time domain waveform (Amplitude against Time (s)) and frequency domain waveform (Amplitude against Frequency (Hz)) of the bearing normal period (Full N), early failure period (Full E), failure development period (Full D) and failure later period (Full L) signals]

Fig. 14. Vibration signal waveforms of different health conditions

The time domain waveforms and single-sided frequency spectra of the vibration signals in the four stages are shown in Fig. 14. The parameter settings tuned in the previous section, together with the corresponding diagnostic model, are applied to this experimental data.

Tab. 6. Full life balanced data set Full Life B and unbalanced dataset Full Life UB

| Failure types | Proportion of training samples to the total samples (dataset Full life B) | Proportion of training samples to the total samples (dataset Full life UB) | Proportion of testing samples to the total samples |
|---|---|---|---|
| Full L | 50% | 20% | 50% |
| Full D | 50% | 30% | 50% |
| Full E | 50% | 40% | 50% |
| Full N | 50% | 50% | 50% |

Notes: Full life balanced data set is denoted as FB; and the Full life unbalanced data set is denoted as FUB

To reveal more fault diagnosis information, the confusion matrices obtained by DAE and DLapAE on the ABLT-1A rolling bearing unbalanced fault dataset are given in Fig. 15. It can be seen from Fig. 15 that the proposed method can distinguish the various fault stages, whereas DAE confuses the early failure and failure development stages.

[Figure 15 panels: confusion matrices of true labels against predicted labels for Full N, Full E, Full D and Full L, with a color scale from 0 to 1. Panel a): Full N 1.00, Full L 1.00, row Full E contains 0.72 and 0.28, row Full D contains 0.22 and 0.78, all other entries 0.00. Panel b): all four stages 1.00, all other entries 0.00]

Fig. 15. The confusion matrix for fault diagnosis of rolling bearing dataset (FUB) based on: a) DAE; b) DLapAE
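A row-normalized confusion matrix of the kind plotted in Fig. 15 can be computed as in the following sketch. The class order and the toy labels are illustrative assumptions; they are not the experimental predictions of the paper.

```python
STAGES = ["Full N", "Full E", "Full D", "Full L"]


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes, rows sum to 1."""
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Toy labels (hypothetical): one "Full E" sample is mistaken for "Full D".
y_true = ["Full N", "Full N", "Full E", "Full E", "Full D", "Full L"]
y_pred = ["Full N", "Full N", "Full E", "Full D", "Full D", "Full L"]
matrix = confusion_matrix(y_true, y_pred, STAGES)
```

Each diagonal entry is the per-stage recognition rate, so off-diagonal mass in the Full E and Full D rows corresponds to the confusion between adjacent degradation stages.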

To validate the feature extraction ability of the proposed fault diagnosis method, the last-layer features extracted by the two diagnostic methods are reduced to two dimensions for visualization by the t-SNE technique. Fig. 16 shows the two-dimensional visualization of the different types of features, where PC1 and PC2 represent the first two principal components, respectively. It can be seen from Fig. 16 that, compared with the standard DAE features, the deep features learned by DLapAE are completely separated, whereas the DAE-based fault diagnosis method can only separate the normal stage and the late fault stage. This demonstrates the effectiveness of the proposed method.

[Figure 16 panels: 2-D scatter of Full N, Full E, Full D and Full L features on the 1st PC and 2nd PC axes for a) DAE and b) DLapAE, with the annotations "Aliasing" and "Separated"]

Fig.16. 2-D feature distribution map based on different diagnostic models:a). DAE; b). DLapAE (the proposed method)

4.2.3 Diagnosis performance comparison

To further demonstrate the advantages of the proposed fault diagnosis method, this section compares its diagnostic performance with that of other existing deep learning models and of shallow learning-based fault diagnosis methods. For deep learning, DRAE (deep regularized AE model) and DBN (a common deep learning model composed of multiple restricted Boltzmann machines (RBMs)) are selected. The shallow learning-based fault diagnosis methods combine a traditional feature dataset with a classifier (the time-frequency domain mixed feature dataset follows Ref. [21]), where the classifier is BPNN or SVM. The following points need to be emphasized: 1) The deep learning methods for intelligent fault diagnosis of rolling bearings require no signal pre-processing or feature extraction, unlike the traditional shallow learning-based methods. Their input is the 1024-dimensional original vibration data, and their classifier is a BPNN by default; 2) The input of the shallow learning-based methods is a mixed artificial feature dataset drawn from the time domain, frequency domain and time-frequency domain. The specific shallow learning and deep learning fault diagnosis methods are described in Tab. 7.

Tab. 7. The description of different fault diagnosis models

| Fault diagnosis method | Specific description |
|---|---|
| DLapAE (DF1) | Its architecture is 1024-200-100. Learning rate is 0.1; momentum is 0.9, iteration number is 200, and regularization coefficient λ = 0.6; BPNN. |
| DRAE (DF2) | Its architecture is 1024-200-100. Learning rate is 0.1, momentum is 0.9; noise coefficient = 0.5, and regularization coefficient λ = 0.0001; iteration number is 200; BPNN. |
| DBN (DF3) | Its architecture is 1024-400-200-100; learning rate is 0.1; batch = 20; iteration number is 200; BPNN. |
| Artificial features + SVM (DF4) | RBF kernel is applied. Penalty factor is 20, and kernel radius is 0.7017. |
| Artificial features + BP (DF5) | Its architecture is 1000-100-7. Learning rate is 0.05, momentum is 0.95, and iteration number is 200; BPNN. |

The testing results of the above fault diagnosis methods on the balanced and imbalanced datasets are listed in Tab. 8.

Tab. 8. The different fault diagnosis results based on unbalanced and balanced data

| Method | Balanced data set: diagnostic results | Balanced data set: run time (s) | Imbalanced dataset: diagnostic results | Imbalanced dataset: run time (s) |
| --- | --- | --- | --- | --- |
| DLapAE (DF1) | 1 | 40 | 1 | 20 |
| DRAE (DF2) | 0.95 | 35 | 0.9125 | 21 |
| DBN (DF3) | 0.91 | 33 | 0.85 | 15 |
| Artificial features + SVM (DF4) | 0.85 | 21 | 0.66 | 12.56 |
| Artificial features + BP (DF5) | 0.8 | 16 | 0.72 | 10.22 |

To validate the anti-noise and generalization performance of the proposed fault diagnostic method, mixed noise with different interference coefficients is added to the original dataset (Dataset full life UB), and the noisy fault dataset is fed into the above fault diagnosis methods. The interference coefficient a of the random noise is set to 0.1, 0.2, 0.4, 0.5, 0.6 and 0.8, respectively [25]. The noise-added imbalanced dataset is therefore $UX_{new} = ux + a \cdot \mathrm{rand}(\mathrm{size}(ux))$, where size(ux) is the size of the signals and rand(·) is the function that generates random numbers in MATLAB 2018a (The MathWorks, Inc., Natick, MA, USA). The noisy datasets are then fed into the different fault diagnosis methods to verify their generalization and anti-noise performance, and the diagnostic results on the testing samples, averaged over ten runs for each interference coefficient, are shown in Fig. 17.
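The noise injection described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `add_uniform_noise`, the all-zero stand-in signal and the fixed seed are hypothetical, and `random.random()` (uniform on [0, 1)) is used as a stand-in for MATLAB's `rand`, which also draws uniformly distributed numbers.

```python
import random

def add_uniform_noise(signal, a, rng):
    """Return signal + a * u, with u drawn uniformly from [0, 1) for each point."""
    return [x + a * rng.random() for x in signal]

# Hypothetical stand-in for one 1024-point vibration sample (not real data).
rng = random.Random(0)
ux = [0.0] * 1024

# Interference coefficients listed in the text.
coefficients = [0.1, 0.2, 0.4, 0.5, 0.6, 0.8]

# One noisy copy of the sample per interference coefficient.
noisy = {a: add_uniform_noise(ux, a, rng) for a in coefficients}
```

Because the noise is uniform on [0, 1) and scaled by a, every perturbation is non-negative and smaller than a, so larger coefficients both widen the noise and shift the signal mean upward.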

Fig. 17. The recognition accuracy of different fault diagnosis models for random noise with interference coefficients

As can be seen from Fig. 17, the anti-noise performance of the proposed fault diagnosis method is relatively stable, and its generalization performance is stronger. This is because the proposed DLapAE algorithm improves the generalization performance of the diagnosis model by adding Laplacian regularization to the original Auto-encoder. In other words, the fault diagnosis model benefits the feature extraction and fault recognition performance on imbalanced data.

4.2.4 Comparison with other existing regularized Auto-encoder based fault diagnosis methods

To the best of our knowledge, the following works on Laplacian regularization and Auto-encoders are relevant. In Ref. [46], a novel unsupervised manifold learning method called Laplacian Auto-Encoders (LAE) was designed. LAE is purely a multi-layer feature extraction process: it has no fine-tuning stage, so it is an unsupervised feature extraction method, whereas our DLapAE is a fine-tuned deep neural network. Ref. [47] proposed a novel stacked denoising auto-encoder (SDAE) algorithm based on particle swarm optimization (PSO), termed Manifold Regularized SDAE (MRSDAE). It embeds two Laplacian matrices into the SDAE model, one in the local scope and one in the non-local scope, to preserve the local and non-local manifold structure of the data. MRSDAE is thus constrained by two Laplacian matrices that maintain both local and non-local structural information, whereas our DLapAE emphasizes maintaining the local manifold structure information. Shao et al. [49] employed an enhanced deep feature fusion method for rotating machinery fault diagnosis. First, a new deep auto-encoder was constructed from the denoising auto-encoder (DAE) and the contractive auto-encoder (CAE) to enhance the feature learning ability. Second, locality preserving projection (LPP) was applied to fuse the deep features and further improve the quality of the learned features. Finally, the fused deep features were fed into a softmax classifier for fault diagnosis. In other words, the DAE-CAE with LPP based fault diagnosis method is a multi-stage feature extraction process, which differs from our proposed method: fault diagnosis based on DLapAE extracts fault features in a single step.
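For orientation, a generic graph Laplacian penalty of the kind used in manifold regularization [43, 44] can be sketched in plain Python. The binary k-nearest-neighbour graph, the function names and the toy data are assumptions made for illustration only; the exact DLapAE objective is the one defined earlier in the paper and may weight or normalize the graph differently.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def knn_adjacency(X, k):
    """Symmetric binary k-nearest-neighbour adjacency matrix W (assumed graph)."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sq_dist(X[i], X[j]))
        for j in order[:k]:
            W[i][j] = W[j][i] = 1.0
    return W

def laplacian_penalty(H, W):
    """0.5 * sum_ij W_ij * ||h_i - h_j||^2, equal to tr(H^T L H) with L = D - W."""
    n = len(H)
    return 0.5 * sum(W[i][j] * sq_dist(H[i], H[j])
                     for i in range(n) for j in range(n))

# Toy inputs: two tight clusters on a line (hypothetical data).
X = [[0.0], [0.1], [5.0], [5.1]]
W = knn_adjacency(X, k=1)

H_smooth = [[0.0], [0.0], [1.0], [1.0]]  # neighbouring inputs share a code
H_rough = [[0.0], [1.0], [0.0], [1.0]]   # neighbouring inputs get different codes

smooth_penalty = laplacian_penalty(H_smooth, W)
rough_penalty = laplacian_penalty(H_rough, W)
```

The penalty is zero when neighbouring inputs map to the same hidden code and grows when they are pulled apart, which is the sense in which such a term preserves local manifold structure.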
Generally speaking, the Deep Regularization Auto-encoder (DRAE) model applies an L2-norm penalty to the weights w. Its weight penalty coefficient is generally small and is set to 0.0001 in this experiment. Based on the above references and algorithm theory, LAE [46], DRAE, MRSDAE [47], and DAE-CAE-LPP [49] are employed as comparison methods for the proposed DLapAE model, denoted as {LAE = RD1; DRAE = RD2; MRSDAE = RD3; DAE-CAE-LPP = RD4; DLapAE = RD5}. The experimental parameters are set according to the relevant references, and the parameters of DRAE follow those of DLapAE. To assess the generalization performance of the regularized fault diagnosis models, the five methods (RD1-RD5) are applied to the balanced ABLT-1A dataset, following the diagnostic flow chart and using five-fold cross-validation. The five-fold cross-validation experiments for the five fault diagnosis methods proceed as follows:

1) First, the original balanced ABLT-1A rolling bearing fault dataset (100 samples per state) is divided into 5 equal subsets of fault samples;

2) Then, one of the subsets is taken as the testing set and the remaining four subsets are used as training samples, and the average recognition accuracy over 10 fault diagnosis runs is calculated;

3) To cross-train and verify on different testing and training samples, step 2) is repeated with each subset in turn as the testing set, and the averaged testing accuracy is taken as the estimate of the prediction accuracy on unknown data, which gives the diagnosis results of the different regularized Auto-encoder based fault diagnosis methods.
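The cross-validation procedure above can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names, the fixed seed and the `dummy_scorer` callback are hypothetical, and a real callback would train a diagnosis model on the training indices and return its accuracy on the testing indices.

```python
import random
from statistics import mean

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into n_folds equal subsets.
    Assumes n_samples is divisible by n_folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    size = n_samples // n_folds
    return [idx[f * size:(f + 1) * size] for f in range(n_folds)]

def cross_validate(n_samples, train_and_score, n_folds=5, n_repeats=10):
    """Return the accuracy averaged over folds, plus the per-fold averages.
    Each fold's value is itself the mean of n_repeats runs."""
    folds = five_fold_indices(n_samples, n_folds)
    fold_scores = []
    for f, test_idx in enumerate(folds):
        # All samples outside the held-out fold form the training set.
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        runs = [train_and_score(train_idx, test_idx, run)
                for run in range(n_repeats)]
        fold_scores.append(mean(runs))
    return mean(fold_scores), fold_scores

# Hypothetical placeholder scorer: returns the test fraction, not an accuracy.
def dummy_scorer(train_idx, test_idx, run):
    return len(test_idx) / (len(train_idx) + len(test_idx))

avg, per_fold = cross_validate(100, dummy_scorer)
```

With 100 samples the placeholder scorer returns 0.2 for every fold, which simply confirms the 80/20 split; the spread of `per_fold` is what a stability comparison between methods would look at once a real scorer is plugged in.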

Fig. 18. The diagnostic precision of 5-fold cross-validation based on existing regularized Auto-encoder fault diagnosis methods

In summary, the averaged diagnostic accuracy of the five fault diagnosis models under five-fold cross-validation is shown in Fig. 18. Cross-validation reflects the generalization performance of a fault diagnosis method. As can be seen from Fig. 18, when the training samples change, DLapAE, MRSDAE and DAE-CAE-LPP all achieve good diagnostic performance, but DLapAE has the best diagnostic stability and generalization performance. The LAE-based diagnostic performance is poor: LAE is purely unsupervised and has no effective parameter fine-tuning, so its accuracy is lower than that of the other deep learning networks. The average diagnostic accuracy and the standard deviation of the results of the above 10 cross-validation runs are listed in Tab. 9. The average accuracy of the proposed method is the closest to 1 and its standard deviation is the smallest, which shows that the proposed fault diagnosis method is more stable than the other diagnostic methods.

Tab. 9. Statistics of the average and standard deviation of the diagnosis results based on different methods

| Methods | Average accuracy | Standard deviation |
| --- | --- | --- |
| LAE (RD1) | 0.8958 | 1.0286 |
| DRAE (RD2) | 0.945 | 0.8150 |
| MRSDAE (RD3) | 0.97437 | 0.4918 |
| DAE-CAE-LPP (RD4) | 0.96874 | 0.5625 |
| DLapAE (RD5) | 0.99484 | 0.2514 |

To visualize the diagnostic results of the five diagnostic models on the ABLT-1A imbalanced dataset, a box plot of the results of the five regularized fault diagnosis methods over 10 diagnosis runs is given in Fig. 19.

[Figure 19: box plot; horizontal axis: the regularized Auto-encoder fault diagnosis methods (RD1 to RD5); vertical axis: fault diagnosis accuracy, with ticks from 0.75 to 1.]

Fig. 19. Box plot of different regularized Auto-encoder fault diagnosis methods

It can be seen from the box plot that the proposed fault diagnosis model RD5 (DLapAE) has a relatively high recognition rate and excellent stability.

The complexity of an algorithm is an important indicator of its performance, and complexity analysis generally covers time complexity and training runtime [54-56]. We therefore evaluate the regularized fault diagnosis methods from two aspects: computational complexity and model training time. Assume that the input dimension is $n_1$, the number of hidden neurons is $n_2$, and the output dimension is $n_1$. The computational complexity of the feed-forward process is then $O(n_1 n_2 + n_2 + n_2 n_1 + n_1) = O(n^2)$. If two layers are stacked, with hidden sizes $n_2$ and $n_2'$, the computational complexity is $O(n_2^2 + {n_2'}^2)$. For the classifier (assuming $m$ classes), the computational complexity is $O(n_2 + n_2' + n_2' m)$. The computational complexity of the back-propagation process is $O(nCl)$, where $n$ is the training batch size, $C$ is the time complexity of calculating the gradient of one sample, and $l$ is the number of iterations. According to Refs. [55-58], the complexity of LPP is $O(dn \log n) + O(n^2) + O(pn^2)$, where the eigen-analysis of the $n \times n$ matrix requires $O(pn^2)$ ($p$ is the ratio of non-zero elements to zero elements in the sparse matrix). Manifold learning algorithms within the graph embedding framework, such as LPP, construct a k-nearest-neighbor graph. The maximum time complexity of this step is $O(dn^2)$, which depends only on the number of samples $n$ and the initial dimension $d$, and the calculation of the neighboring weight matrix requires $O(n^2)$. Since the computational complexity of a deep model is the sum of the complexities of its layers, the above analysis is combined to give the time complexity of each regularized Auto-encoder model for fault diagnosis in Tab. 10. In general, DLapAE attains its recognition rate and classification accuracy at the cost of some added complexity, but the complexity of MRSDAE and DAE-CAE-LPP is higher, because those fault diagnosis methods involve multi-stage feature extraction. LAE has the lowest complexity, because it is an unsupervised process.

Tab. 10. Time complexity analysis and training runtime of different regularized Auto-encoder fault diagnosis methods

| Regularized methods | Training runtime (s) | Computational complexity |
| --- | --- | --- |
| LAE | 14.278 | $O(n^2) + O(n_2 + n_2' + n_2' m) + O(n^2)$ |
| DRAE | 25.929 | $O(n^2) + O(n_2 + n_2' + n_2' m) + O(nCl)$ |
| MRSDAE | 27.515 | $O(n^2) + O(n_2 + n_2' + n_2' m) + O(nCl) + O(n^2) + O(n^3)$ |
| DAE-CAE-LPP | 26.93 | $O(n^2) + O(n_2 + n_2' + n_2' m) + O(nCl) + O(n^2) + O(dn \log n) + O(n^2) + O(pn^2)$ |
| DLapAE | 22.23 | $O(n^2) + O(n_2 + n_2' + n_2' m) + O(nCl) + O(n^2)$ |
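As a rough worked example of the feed-forward and classifier terms above, the counts can be evaluated for the 1024-200-100 architecture listed in Tab. 7. The function names are hypothetical, and the class count m = 7 is an assumption chosen only to make the arithmetic concrete; the result is an order-of-magnitude illustration, not a measured cost.

```python
def feedforward_ops(n1, n2):
    """Count n1*n2 + n2 + n2*n1 + n1 for one auto-encoder:
    encoder weights and biases, then decoder weights and biases."""
    return n1 * n2 + n2 + n2 * n1 + n1

def classifier_ops(n2, n2_prime, m):
    """Count n2 + n2' + n2'*m for an m-class classifier, as in the text."""
    return n2 + n2_prime + n2_prime * m

layer1 = feedforward_ops(1024, 200)   # first auto-encoder: 1024 -> 200 -> 1024
layer2 = feedforward_ops(200, 100)    # second auto-encoder: 200 -> 100 -> 200
head = classifier_ops(200, 100, 7)    # m = 7 is an assumed class count
total = layer1 + layer2 + head
```

The first layer accounts for about 91% of the count (410824 of 452124), which illustrates why the input dimension dominates the feed-forward cost of such a stack.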

4.3 Discussion

To promote the validity and practicability of fault diagnosis on imbalanced rotating machinery datasets, a novel intelligent imbalanced fault diagnosis method for rotating machinery based on the Deep Laplacian Auto-encoder (DLapAE) is developed for the first time in this paper. To strengthen the research value of the paper, this discussion section is added at the end of the experiments. Some open questions and research directions remain to be improved and studied, and several questions of practical significance that deserve future research are discussed below.

The first point concerns the impact of two variables, fault type and imbalance rate, on fault diagnosis. When the imbalanced fault datasets were first set up, we may have overlooked the impact of different fault types on the entire dataset. The diagnostic performance at different imbalance levels may also depend on which fault types are sensitive to the diagnostic results, which introduces some uncontrolled factors into our experimental results. In future work we will therefore study the specific impact of fault type and imbalance degree on fault diagnosis, design more reasonable experiments, and investigate the relationship between the two.



It can be seen from Table 3 that the diagnostic recognition rates of health types H1, H5 and H6 are not high, while the diagnostic performance for H2-H3 and H10 is consistently good. This shows that different fault types differ in their importance to the overall fault diagnosis; that is, fault type and fault imbalance are two independent variables. Some faults are easy to diagnose and others are generally difficult, but the proposed method shows superior diagnostic performance across the different fault types.



On the one hand, the CWRU and ABLT-1A fault datasets represent different application scenarios. The CWRU dataset is a well-known open public dataset whose faults were manufactured by EDM, so its fault signatures are obvious and idealized. Our own experimental dataset (ABLT-1A) comes from a full-life experiment simulating industrial production, in which the machine runs through its whole life cycle until the fault occurs, so it is closer to real industrial production. On the other hand, the generalization performance of a new algorithm needs to be validated on at least two datasets, which makes the proposed method more convincing.

5. Conclusion

In practice, the health condition data of mechanical systems often presents an imbalanced distribution. To promote the validity and practicability of fault diagnosis on imbalanced rotating machinery datasets, a novel intelligent imbalanced fault diagnosis method for rotating machinery based on the Deep Laplacian Auto-encoder (DLapAE) is developed for the first time in this paper. First, the collected vibration signals are fed into the constructed DLapAE algorithm for layer-by-layer feature extraction; the extracted deep discriminative sensitive features are then fed into a Back Propagation (BP) classifier for health condition diagnosis. A Laplacian regularization term is added to the objective function to design Laplacian auto-encoders (LapAEs) that smooth the manifold structure of the data. The DLapAE is constructed from multiple LapAEs to improve the generalization performance of fault diagnosis and make it suitable for feature learning and classification of imbalanced data. Finally, two cases of experimental bearing systems validate the effectiveness of the proposed methodology. Accurate fault diagnosis of balanced and imbalanced rotating machinery datasets with deep learning and regularization techniques is a promising topic, and we will continue to investigate it in the future.

Acknowledgments

This work was supported by the State Key Program of National Natural Science Foundation of China (No. 51937002), the National Natural Science Foundation of China (Grant No. 51675098), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China (No. SJKY19_0064). This research is also funded by the China Scholarship Council (CSC). The authors also thank the anonymous reviewers and the editor for their valuable comments.

References

[1] J. Lee, F. Wu, W. Zhao, et al., Prognostics and health management design for rotary machinery systems - Reviews, methodology and applications. Mech. Syst. Signal Process. 2014; 42(1-2): 314-34.
[2] A. Heng, S. Zhang, A. C. C. Tan, et al., Rotating machinery prognostics: State of the art, challenges and opportunities. Mech. Syst. Signal Process. 2009; 23(3): 724-39.
[3] C. Sun, M. Ma, Z. Zhao, et al., Sparse Deep Stacking Network for Fault Diagnosis of Motor. IEEE Trans. Ind. Electron. 2018; 14(7): 3261-70.
[4] X. Zhang, Y. Liang, J. Zhou, A novel bearing fault diagnosis model integrated permutation entropy, ensemble empirical mode decomposition and optimized SVM. Measurement 2015; 69: 164-179.
[5] K. Yu, T. R. Lin, H. Ma, H. Li, J. Zeng, A combined polynomial chirplet transform and synchroextracting technique for analyzing nonstationary signals of rotating machinery. IEEE Trans. Instrum. Meas. 2019; DOI: 10.1109/TIM.2019.2913058.
[6] R. Razavi-Far, M. Saif, Ensemble of extreme learning machines for diagnosing bearing defects in non-stationary environments under class imbalance condition. In: Proceedings of IEEE Symposium Series on Computational Intelligence (SSCI), Athens, Greece, 2016.
[7] A. Glowacz, W. Glowacz, Z. Glowacz, et al., Early fault diagnosis of bearing and stator faults of the single-phase induction motor using acoustic signals. Measurement 2018; 113: 1-9.
[8] Y. Li, Y. Yang, G. Li, et al., A fault diagnosis scheme for planetary gearboxes using modified multi-scale symbolic dynamic entropy and mRMR feature selection. Mech. Syst. Signal Process. 2017; 91: 295-312.
[9] H. Shao, H. Jiang, H. Zhao, et al., A novel deep autoencoder feature learning method for rotating machinery fault diagnosis. Mech. Syst. Signal Process. 2017; 95: 187-204.
[10] J. Jiao, N. Zhao, G. Wang, S. Yin, A nonlinear quality-related fault detection approach based on modified kernel partial least squares. ISA Trans. 2017; 66: 275-83.
[11] J. Feng, L. Yaguo, L. Na, et al., Deep normalized convolutional neural network for imbalanced fault classification of machinery and its understanding via visualization. Mech. Syst. Signal Process. 2018; 110: 349-67.
[12] X. Yan, M. Jia, Application of CSA-VMD and optimal scale morphological slice bispectrum in enhancing outer race fault detection of rolling element bearings. Mech. Syst. Signal Process. 2019; 122: 56-86.
[13] Y. Lei, N. Li, L. Guo, et al., Machinery health prognostics: A systematic review from data acquisition to RUL prediction. Mech. Syst. Signal Process. 2018; 104: 799-834.
[14] Z. Zhang, S. Li, J. Wang, Y. Xin, Z. An, General normalized sparse filtering: A novel unsupervised learning method for rotating machinery fault diagnosis. Mech. Syst. Signal Process. 2019; 124: 596-612.
[15] C. Li, R.V. Sanchez, G. Zurita, et al., Multimodal deep support vector classification with homologous features and its application to gearbox fault diagnosis. Neurocomputing 2015; 168: 119-27.
[16] Y. Lei, F. Jia, J. Lin, S. Xing, An intelligent fault diagnosis method using unsupervised feature learning towards mechanical big data. IEEE Trans. Ind. Electron. 2016; 63(5): 3137-47.
[17] X. W. Dai, Z. W. Gao, From model, signal to knowledge: A data-driven perspective of fault detection and diagnosis. IEEE Trans. Ind. Informat. 2013; 9(4): 2226-8.
[18] R. Razavi-Far, M. Farajzadeh-Zanjani, M. Saif, An Integrated Class-Imbalanced Learning Scheme for Diagnosing Bearing Defects in Induction Motors. IEEE Trans. Ind. Informat. 2017; 13(6): 2758-69.
[19] W. Mao, L. He, Y. Yan, et al., Online sequential prediction of bearings imbalanced fault diagnosis by extreme learning machine. Mech. Syst. Signal Process. 2017; 83: 450-73.
[20] I. Martin-Diaz, D. Morinigo-Sotelo, Q. Duque-Perez, et al., Early fault detection in induction motors using AdaBoost with imbalanced small data and optimized sampling, pattern recognition techniques. IEEE Trans. Ind. Appl. 2017; 53(3): 3066-75.
[21] X. Zhao, M. Jia, Fault diagnosis of rolling bearing based on feature reduction with global-local margin Fisher analysis. Neurocomputing 2018; 315: 447-64.
[22] M. Kordestani, M. F. Samadi, M. Saif, et al., A new fault diagnosis of multifunctional spoiler system using integrated artificial neural network and discrete wavelet transform methods. IEEE Sensors J. 2018; 18(12): 4990-5001.
[23] X. Zhu, J. Xiong, Fault diagnosis of rotation machinery based on support vector machine optimized by quantum genetic algorithm. IEEE Access 2018; 6: 33583-8.
[24] Q. Jiang, M. Jia, J. Hu, et al., Machinery fault diagnosis using supervised manifold learning. Mech. Syst. Signal Process. 2009; 23(7): 2301-11.
[25] F. Chen, B. Tang, R. Chen, A novel fault diagnosis model for gearbox based on wavelet support vector machine with immune genetic algorithm. Measurement 2013; 46(1): 220-32.
[26] R. Razavi-Far, E. Hallaji, M. Saif, G. Ditzler, A Novelty Detector and Extreme Verification Latency Model for Nonstationary Environments. IEEE Trans. Ind. Electron. 2019; 66(1): 561-570.
[27] M. Farajzadeh-Zanjani, R. Razavi-Far, M. Saif, Efficient sampling techniques for ensemble learning and diagnosing bearing defects under class imbalanced condition. In: IEEE Symposium Series on Computational Intelligence (SSCI), 2016, pp. 1-7.
[28] M. Farajzadeh-Zanjani, R. Razavi-Far, M. Saif, Dimensionality reduction-based diagnosis of bearing defects in induction motors. In: 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Oct 2017, pp. 2539-2544.
[29] W. Zong, G. B. Huang, Y. Chen, Weighted extreme learning machine for imbalance learning. Neurocomputing 2013; 101: 229-42.
[30] Y. Lan, X. Han, W. Zong, X. Ding, X. Xiong, J. Huang, B. Ma, Two-step fault diagnosis framework for rolling element bearings with imbalanced data based on GSA-WELM and GSA-ELM. Proceedings of the Institution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science 2018; 232(16): 2937-47.
[31] H. Liu, C. Liu, Y. Huang, Adaptive feature extraction using sparse coding for machinery fault diagnosis. Mech. Syst. Signal Process. 2011; 25(2): 558-74.
[32] M. Gan, C. Wang, C. Zhu, Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech. Syst. Signal Process. 2016; s72-73(2): 92-104.
[33] R. Zhao, R. Yan, Z. Chen, et al., Deep learning and its applications to machine health monitoring. Mech. Syst. Signal Process. 2019; 115: 213-237.
[34] Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 2015; 521(7553): 436-444.
[35] G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, B. Kingsbury, Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag. 2012; 29(6): 82-97.
[36] P. Tamilselvan, P. Wang, Failure diagnosis using deep belief learning based health state classification. Rel. Eng. Syst. Safety 2013; 115: 124-135.
[37] Z. Wu, Y. Guo, W. Lin, et al., A Weighted deep representation learning model for imbalanced fault diagnosis in Cyber-Physical Systems. Sensors 2018; 18(4): 1096.
[38] W. Sun, S. Shao, R. Zhao, et al., A sparse Auto-encoder-based deep neural network approach for induction motor faults classification. Measurement 2016; 89 (ISFA): 171-8.
[39] H. D. Shao, H. K. Jiang, X. Q. Li, S. P. Wu, Intelligent fault diagnosis of rolling bearing using deep wavelet auto-encoder with extreme learning machine. Knowl.-Based Syst. 2018; 140(15): 1-14.
[40] R. Razavi-Far, E. Hallaji, M. Farajzadeh-Zanjani, M. Saif, A semi-supervised diagnostic framework based on the surface estimation of faulty distributions. IEEE Trans. Ind. Informat. 2019; 15(3): 1277-1286.
[41] Z. Chen, W. Li, Multisensor feature fusion for bearing fault diagnosis using sparse Autoencoder and deep belief network. IEEE Trans. Instrum. Meas. 2017; 66(7): 1693-702.
[42] R. Razavi-Far, E. Hallaji, M. Farajzadeh-Zanjani, M. Saif, S.H. Kia, H. Henao, G. Capolino, Information fusion and semi-supervised deep learning scheme for diagnosing gear faults in induction machine systems. IEEE Trans. Ind. Electron. 2019; 66(8): 6331-6342.
[43] M. Belkin, P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv. Neural Inf. Process. Syst. 2002; 15: 585-92.
[44] M. Belkin, P. Niyogi, V. Sindhwani, Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 2006; 7: 2399-434.
[45] M. Belkin, P. Niyogi, Semi-supervised learning on Riemannian Manifolds. Machine Learning 2004; 56(1-3): 209-39.
[46] K. Jia, L. Sun, S. Gao, Z. Song, B. E. Shi, Laplacian auto-encoders: An explicit learning of nonlinear data manifold. Neurocomputing 2015; 160: 250-260.
[47] Jian-Bo Yu, Evolutionary manifold regularized stacked denoising autoencoders for gearbox fault diagnosis. Knowl.-Based Syst. 2019; 178: 111-122.
[48] Q. Zhao, Z. Li, Unsupervised Representation Learning with Laplacian Pyramid Auto-encoders. 2018; arXiv preprint arXiv: 1801.05278.
[49] H. Shao, H. Jiang, F. Wang, H. Zhao, An enhancement deep feature fusion method for rotating machinery fault diagnosis. Knowl.-Based Syst. 2017; 119: 200-220.
[50] P. Vincent, H. Larochelle, I. Lajoie, et al., Stacked denoising autoencoders: learning useful representations in a Deep Network with a local denoising criterion. J. Mach. Learn. Res. 2010; 11(12): 3371-408.
[51] K.A. Loparo, The Case Western Reserve University. Bearing data center. [EB/OL] [2018-10-25]. http://csegroups.case.edu/bearingdata-center/home.
[52] L. Maaten, G. E. Hinton, Visualizing data using t-SNE. J. Mach. Learn. Res. 2008; 9(3): 2579-605.
[53] X. Ding, Q. He, N. Luo, A fusion feature and its improvement based on locality preserving projections for rolling element bearing fault classification. J. Sound Vib. 2015; 335: 367-83.
[54] Time complexity of 3-layer stacked auto-encoder. [EB/OL] [2019-10-27]. https://blog.csdn.net/PoGeN1/article/details/88595162.
[55] X. Zhao, R. Zhao, A Method of Dimension Reduction of Rotor Faults Data Set Based on Fusion of Global and Local Discriminant Information. Acta Automatica Sinica 2017; 43(4): 560-567.
[56] A. Majumdar, Graph structured autoencoder. Neural Networks 2018; 106: 271-280.
[57] S. Feng, M. F. Duarte, Graph Regularized Autoencoder-Based Unsupervised Feature Selection. 2018 52nd Asilomar Conference on Signals, Systems, and Computers, DOI: 10.1109/ACSSC.2018.8645362.
[58] D. Charte, F. Charte, S. García, M. J. del Jesus, F. Herrera, A practical tutorial on auto-encoders for nonlinear feature fusion: Taxonomy, models, software and guidelines. Information Fusion 2018; 44: 78-96.

Highlights: A novel Laplacian Auto-encoder (LapAE) model is first of all proposed. Deep Laplacian Auto-encoder (DLapAE) algorithm can be constructed with the multiple LapAEs.

An intelligent imbalanced fault diagnosis method of rotating machinery based on DLapAE is developed.