Mechanism and Machine Theory 36 (2001) 157±175
www.elsevier.com/locate/mechmt
Artificial neural network design for fault identification in a rotor-bearing system

Nalinaksh S. Vyas *, D. Satishkumar
Department of Mechanical Engineering, Indian Institute of Technology, Kanpur 208016, India

Received 1 June 1998; received in revised form 26 April 1999; accepted 12 January 2000
Abstract

A neural network simulator built for prediction of faults in rotating machinery is discussed. A back-propagation learning algorithm and a multi-layer network have been employed. The layers are constituted of nonlinear neurons and an input vector normalization scheme has been built into the simulator. Experiments are conducted on an existing laboratory rotor-rig to generate training and test data. Five different primary faults and their combinations are introduced in the experimental set-up. Statistical moments of the vibration signals of the rotor-bearing system are employed to train the network. Network training is carried out for a variety of inputs. The adaptability of different architectures is investigated. The networks are validated for test data with unknown faults. An overall success rate of up to 90% is observed. © 2001 Elsevier Science Ltd. All rights reserved.
1. Introduction

Fault identification and diagnosis has become a vigorous area of work during the past decade. Attempts have been made towards classification of the most common types of rotating machinery problems, defining their symptoms and searching for remedial measures [1-3,7]. Diagnostic techniques like waveform analysis, orbital analysis, spectrum and cepstrum analysis and expert systems are routinely used for fault identification in operational rotating machinery and also in design and development processes. Time domain or waveform analysis involves analysis of the shape of the vibration signal. It provides information on signal shape, i.e. truncation, pulses, modulation, glitch or shaft-induced signals obtained from a proximity probe that are caused by irregularities in the shaft cross-section. Time
* Corresponding author. Tel.: +91-512-597-040; fax: +91-512-590-007. E-mail address: [email protected] (N.S. Vyas).

0094-114X/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0094-114X(00)00034-3
Nomenclature

$a$ — logistic constant (slope parameter)
$a_v(t)$ — acceleration in the vertical direction
$a_h(t)$ — acceleration in the horizontal direction
$dz(t)$ — differential complex time series
$E$ — expectation operator
$E_p$ — mean square error
$f^h_j$ — activation function at node j of the hidden layer h
$l$ — time lag
$L$ — number of nodes in a hidden layer
$M$ — number of nodes in the output layer
$\mathrm{net}^h_{pj}$ — net input to the hth hidden layer unit
$o^h_{pj}$ — output from the hidden layer h, at node j, for the pth input
$o^o_{pk}$ — output of the kth node of the output layer, for the pth input vector
$p(x)$ — probability density function
$r_{xy}$ — second-order cross-moments
$u_k$ — output from the kth summing junction
$w_{kj}$ — weight factor at the kth node, for the jth input
$y_k$ — output from the kth neuron
$x_j$ — jth input signal to the network
$x_{p1}, x_{p2}, \ldots, x_{pN}$ — pth input vector to the network
$z(t)$ — complex time series
$\alpha$ — momentum
$\varphi(\cdot)$ — activation function
$\theta_k$ — threshold at the kth neuron
$\theta^h_L$ — threshold for node L of the layer h
$\mu_k$ — central moments
$\nu_k$ — modified output from the kth summing junction
$\eta$ — learning rate
$\zeta$ — mean
$\Delta_p w^h_{ji}$ — weight change for the jth input at the ith node, hth hidden layer
$\Delta_p w^o_{ji}$ — weight change for the jth input at the ith node, output layer
between events represents the frequency components within the machine. The phase between two signals provides information about vibratory behavior that can be used to diagnose a fault such as misalignment. Orbital analysis, whereby the horizontal and vertical motions of the rotor with respect to a sensor mounted on the bearing are simultaneously obtained to get the instantaneous position of the rotor, has been effectively used for identification of phenomena like oil whirl and other asynchronous motions, as well as synchronous phenomena such as mass unbalance and misalignment. Spectrum analysis, the most popular diagnostic tool, provides crucial information about the amplitude and phase content of the vibration at various frequencies. Frequencies of the vibration response are related to direct excitation frequencies or their orders, natural frequencies, sidebands, subharmonics and sum-difference frequencies. The peaks at multiple
orders of the fundamental are identified with the variety of faults that are present in rotating machinery. The fault is predicted by comparison with good (baseline) test data; the quality of the prediction is directly proportional to the information available on the design of a machine and its working mechanisms. A cepstrum plot, which can be viewed as a modification of spectrum analysis, is effective for accurately measuring frequency spacing, harmonic and sideband patterns in the power spectrum. An artificial intelligence (AI) scheme, such as an expert system, on the other hand, is an algorithm based on available human expertise. Knowledge is stored in the form of facts and rules and is controlled by an inference engine, which interacts with the user and the knowledge base according to the rules contained in it. Since the knowledge or data in most cases may be incomplete or uncertain, the models employ probabilistic reasoning techniques such as Bayes's rule, fuzzy logic, Dempster-Shafer calculus, etc. Artificial neural networks (ANNs) are often thought of as distinct from AI, though ANNs are common in the AI literature. ANNs simulate the biological processes of the human brain and nervous system. The development of an algorithm for a neural network simulator for prediction of faults in rotating machinery is discussed in this paper. Neural networks are knowledge-based systems. A relationship is developed between observed symptoms and probable causes. The existence or creation of a knowledge base is essential in order to train the network. Reference can be made to [7], where an existing knowledge base, from the work of Sohre [8], has been employed to develop an expert system. In situations where such a knowledge base does not exist, one needs to be created. Collection of such data is facilitated in the case of machinery where inspection and maintenance are carried out at regular intervals.
For example, in aircraft, overhauling and balancing of rotating components of the engine are carried out on a regular basis during routine checks. The engines are run on test-beds prior to and after overhaul, and the vibration levels are noted at specific points on the engine casing. Data of this type, in conjunction with the inspection report, provide good information towards creation of a knowledge base for the aeroengine. In the present study, the neural network simulator is developed and its use for fault prediction is illustrated by employing a knowledge base created through laboratory experiments on a rotor-rig. Five different primary faults and their combinations were introduced in the experimental set-up. The vibration signals collected through piezoelectric transducers from the bearing blocks were employed to train the network. A nonlinear model of a neuron is employed and the network uses a back-propagation learning algorithm. An input vector normalization scheme has been built into the simulator. The adaptability of various neural network architectures has also been investigated. Neural network training was carried out with the chosen architecture till a desired degree of convergence was achieved. The network was finally tested and validated for test data with unknown faults. An overall success rate of up to 90% was observed.

2. Network design

The basic features of the network designed for rotating machinery fault diagnosis are described here. Reference can be made to the text by Haykin [4] for a detailed review of neural network procedures. The simulator employs the nonlinear model of a neuron described in Fig. 1. The connecting links, called synapses, specify the connection between a signal xj at the input of synapse j and a neuron k, through a weighting factor, wkj. An adder sums up the input signals
Fig. 1. Nonlinear model of neuron.
weighted by the respective synapses of the neuron. The operation is similar to that of a linear combiner. The activation function limits the amplitude of the output of the neuron. The model includes an externally applied threshold, $\theta_k$, that has the effect of lowering the net input to the activation function. If, instead, the net input to the activation function is to be increased, the term is called a bias rather than a threshold. The activation function, $\varphi(\cdot)$, defines the output of a neuron in terms of the activity level at its input. The sigmoidal function is the most common activation function used in the construction of artificial neural networks. It is defined as a strictly increasing function exhibiting smoothness and asymptotic properties, e.g. the logistic function

$$\varphi(\nu) = \frac{1}{1 + \exp(-a\nu)}, \qquad (1)$$
where a is the slope parameter of the sigmoid function (Fig. 2). Mathematically, the neuron is described by

$$u_k = \sum_{j=1}^{p} w_{kj}\, x_j, \qquad y_k = \varphi(u_k - \theta_k) = \varphi(\nu_k). \qquad (2)$$
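The neuron model of Eqs. (1) and (2) can be sketched in a few lines. The function and variable names below are illustrative, not taken from the paper:

```python
import math

def logistic(v: float, a: float = 1.0) -> float:
    """Sigmoidal activation of Eq. (1): phi(v) = 1 / (1 + exp(-a*v))."""
    return 1.0 / (1.0 + math.exp(-a * v))

def neuron_output(x: list, w: list, theta: float, a: float = 1.0) -> float:
    """Nonlinear neuron of Eq. (2): y_k = phi(sum_j w_kj * x_j - theta_k)."""
    u = sum(wj * xj for wj, xj in zip(w, x))   # linear combiner (the adder)
    return logistic(u - theta, a)              # apply threshold, then activation

# Example: two inputs, unit weights, zero threshold
y = neuron_output([0.5, 0.5], [1.0, 1.0], theta=0.0)
```

The output is always bounded in (0, 1), which is the amplitude-limiting property of the sigmoid referred to in the text.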
A back-propagation algorithm (BPA) (Fig. 3) has been employed in the present case. The BPA is the most common learning algorithm; it has been tested on a number of different problems and found to perform well in most cases. The BPA performs the input-to-output mapping by minimizing a cost function using a gradient search technique. The cost function (equal to the mean squared difference between the desired and the actual net output) is minimized by making weight connection adjustments according to the error between the computed and target output processing element values. In the first stage of the BPA, namely the forward pass, all the weights of the network are initialized randomly, and the network outputs and the difference between the actual and target outputs (i.e. the error) are calculated for the initialized weights. During the second stage, the backward pass, the
Fig. 2. The sigmoidal function.
Fig. 3. Back-propagation network architecture.
initialized weights are adjusted to minimize the error by propagating the error backwards. The network outputs and error are calculated again with the updated weights, and the process repeats till the error is acceptably small. Referring to Fig. 3, for the pth input vector, $x_p = (x_{p1}, \ldots, x_{pN})$, the net input to the jth unit in the hth hidden layer is

$$\mathrm{net}^h_{pj} = \sum_{i=1}^{N} w^h_{ji}\, x_{pi} + \theta^h_j, \qquad (3)$$

where $w^h_{ji}$ and $\theta^h_j$ represent the respective weight and threshold values. The output from the hidden layer is

$$o^h_{pj} = f^h_j(\mathrm{net}^h_{pj}), \qquad (4)$$

where $f^h_j$ is the activation function at node j of the hidden layer h. The outputs, $o^h_{pj}$, from the last hidden layer form the input to the output layer o. The net input at the kth unit in the output layer is

$$\mathrm{net}^o_{pk} = \sum_{j=1}^{L} w^o_{kj}\, o^h_{pj} + \theta^o_k \qquad (5)$$

and the output from the kth unit in the output layer is

$$o^o_{pk} = f^o_k(\mathrm{net}^o_{pk}). \qquad (6)$$

If the target output at the kth unit in the output layer is $y_{pk}$, the total mean square error at the output layer is

$$E_p = \frac{1}{2} \sum_{k=1}^{M} (y_{pk} - o^o_{pk})^2. \qquad (7)$$
The weights at the output and hidden layers are adjusted, during the backward pass, to minimize the mean square error. The adjustment required in the weights is computed as

$$\Delta_p w^h_{ji} = -\frac{\partial E_p}{\partial w^h_{ji}} = x_{pi}\, o^h_{pj}(1 - o^h_{pj}) \sum_k (y_{pk} - o^o_{pk})\, o^o_{pk}(1 - o^o_{pk})\, w^o_{kj}. \qquad (8)$$

In practice, instead of directly applying the above weight change to the weights, two network parameters are introduced, namely the learning rate coefficient, $\eta$, and the momentum, $\alpha$, in order to make the learning progress smooth and to ensure that the weight changes always take place in the same direction [4]:

$$w^h_{ji}(t+1) = w^h_{ji}(t) + \eta\, x_{pi}\, o^h_{pj}(1 - o^h_{pj}) \sum_k (y_{pk} - o^o_{pk})\, o^o_{pk}(1 - o^o_{pk})\, w^o_{kj} + \alpha\, \Delta w^h_{ji}(t-1), \qquad (9)$$

where $w^h_{ji}(t-1)$, $w^h_{ji}(t)$ and $w^h_{ji}(t+1)$ are the weights during successive passes. Usually, $\eta$ is a small number (0.05-0.9). A small value of $\eta$ implies that the network will have to make a large number of iterations. It is often possible to increase the value of $\eta$ as the network error decreases, thereby increasing the speed of convergence towards the target output. The other way to increase the convergence speed is by adopting a momentum term, $\alpha$ (equal to a fraction of the previous
weight change $\Delta_p w$), while updating the weights. This additional term tends to keep the weight changes in the same direction.
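The forward and backward passes of Eqs. (3)-(9) can be sketched for a single hidden layer of logistic units. This is a minimal sketch, not the authors' code; the array shapes and helper names are assumptions, and the thresholds are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_step(x, y, Wh, Wo, prev_dWh, prev_dWo, eta=0.1, alpha=0.9):
    """One forward/backward pass for a one-hidden-layer network.
    Wh: (hidden, n_in), Wo: (n_out, hidden)."""
    # Forward pass, Eqs. (3)-(6)
    oh = sigmoid(Wh @ x)              # hidden layer outputs o^h_pj
    oo = sigmoid(Wo @ oh)             # output layer outputs o^o_pk
    # Backward pass: output- and hidden-layer error terms, Eq. (8)
    delta_o = (y - oo) * oo * (1.0 - oo)
    delta_h = (Wo.T @ delta_o) * oh * (1.0 - oh)
    # Weight updates with learning rate and momentum, Eq. (9)
    dWo = eta * np.outer(delta_o, oh) + alpha * prev_dWo
    dWh = eta * np.outer(delta_h, x) + alpha * prev_dWh
    Ep = 0.5 * np.sum((y - oo) ** 2)  # mean square error, Eq. (7)
    return Wh + dWh, Wo + dWo, dWh, dWo, Ep

# Tiny demo: 5 inputs -> 10 hidden -> 5 outputs, as in a (5, 10, 5) architecture
x = rng.uniform(0.1, 0.9, 5)
y = np.eye(5)[2]                      # one-hot target vector
Wh = rng.uniform(-0.5, 0.5, (10, 5))  # weights initialized in (-0.5, 0.5)
Wo = rng.uniform(-0.5, 0.5, (5, 10))
dWh, dWo = np.zeros_like(Wh), np.zeros_like(Wo)
errs = []
for _ in range(200):
    Wh, Wo, dWh, dWo, Ep = train_step(x, y, Wh, Wo, dWh, dWo)
    errs.append(Ep)
```

Repeating the step drives the error $E_p$ down towards the target accuracy, which is the iterative behavior described in the text.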
3. Network training and testing

The training and the test data for the present study were generated on a laboratory rotor-rig. The rig (Fig. 4) consists of a 10 mm diameter shaft carrying a centrally located steel disc weighing 0.5 kg. It is supported in identical rolling element bearings (type 6200 SKF ball bearings) at the two ends and driven by a 50 W, 230 V AC/DC electric motor through a flexible spider coupling. The following faults were deliberately introduced in the rig for generating training data:

Rotor with no fault. The rotor is balanced, and the alignments and fittings are done properly, so as to presumably classify it as a system with no fault.

Rotor with mass unbalance. A mass of about 0.05 kg was added at a radius of 25 mm on the rotor disk, creating unbalance.

Rotor with bearing cap loose. The cap on the bearing block was loosened so as to create a gap of approximately 1 mm between the outer race of the bearing and the cap of the bearing block.

Rotor with misalignment. Misalignment was created by shifting the bearing block sideways by about 3 mm, so that the axes of the two bearing blocks are out of alignment by about 3 mm.

Play in spider coupling. The flexible rubber spider in the coupling was removed and a small cut was introduced such that a radial clearance was created between the two halves of the coupling at the outer diameter of 15 mm.

Rotor with both mass unbalance and misalignment. In this case mass unbalance and misalignment were both introduced simultaneously in the rig.
Fig. 4. Laboratory arrangement of the rotor-rig.
Fig. 5. Time and frequency domain signals for various rotor conditions. (a) No-fault case; (b) coupling loose case; (c) mass unbalance case; (d) bearing cap loose case; (e) misalignment case.
These faults were introduced one at a time and the vibration signals of the rotor were picked up from the bearing-block caps by piezoelectric accelerometers, for a range of rotor speeds from 500 to 900 rpm. The signals were amplified by charge amplifiers and stored on floppy diskettes using a portable dual-channel FFT analyzer for further processing. Typical rotor vibrations, sensed at the bearing cap by the accelerometer for the faults introduced in the rotor, are shown in Fig. 5(a)-(e). The signals are shown in the time domain along with their FFTs in these figures and pertain to a typical rotor speed of 800 rpm. Twenty such vibration signals were taken, for each of the faults introduced, at every speed of rotation. Signals were obtained for five rotor speeds in the range 500-900 rpm. It can be seen from the frequency domain signals that the faults cannot be distinguished by selecting a limited number of frequencies from the FFT spectra. In order to retain all the relevant features of the signal, amplitude and phase information at a reasonably large number of frequencies may have to be provided as training data. Other high-resolution methods, such as autoregressive analysis and time-frequency distribution methods, e.g. the wavelet transformation, also generate a large number of inputs. An effective way to reduce the number of inputs is to employ
the moments of the vibration signals acquired in the time domain [5]. The moments of such a time series characterize the probability density function of the vibration signal [6]. The characteristic function of the series, which is the Fourier transform of the probability density function, can be approximated by a linear combination of the moments of the time series. If the probability density function is different for each fault condition, then fault classification should be possible using the moments of the time series.
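The moment approximation of the characteristic function can be made explicit. Expanding the transform of the density in a power series gives the standard identity (stated here for completeness; it is not reproduced from the paper):

$$\Phi(\omega) = E\{e^{j\omega z}\} = \int_{-\infty}^{\infty} e^{j\omega z}\, p(z)\, dz = \sum_{k=0}^{\infty} \frac{(j\omega)^k}{k!}\, E\{z^k\},$$

so a truncated set of moments approximates $\Phi(\omega)$, and hence the density whose differences across fault conditions make classification possible.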
Fig. 6. Statistical moments for the no-fault case. (a) Moments of $|z(t)|$; (b) moments of $|dz(t)|$; (c) cross-moments.
Representing the data collected by

$$z(t) = a_h(t) + j\, a_v(t) \quad \text{and} \quad dz(t) = z(t) - z(t-1), \qquad (10)$$

where $a_h(t)$ is the vibration signal from the bearing in the horizontal direction and $a_v(t)$ is the signal in the vertical direction, the following moments are computed:
· Mean

$$E\{z(t)\} = \zeta = \int_{-\infty}^{\infty} z(t)\, p(x)\, dx. \qquad (11)$$
Fig. 7. Statistical moments for the coupling loose case. (a) Moments of $|z(t)|$; (b) moments of $|dz(t)|$; (c) cross-moments.
· Central moments

$$\mu_k = E\{(z(t) - \zeta)^k\} = \int_{-\infty}^{\infty} (z(t) - \zeta)^k\, p(x)\, dx, \qquad k = 1, 2, 3, \ldots \qquad (12)$$

· Cross-moments (cross-correlation)

$$r_{xy} = \sum_{t=1}^{\infty} a_x(t)\, a_y(t - l), \qquad l = 0, 1, 2. \qquad (13)$$
The moments thus computed are shown as functions of operating speed in Figs. 6-10.
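Sample estimates of the moments defined in Eqs. (10)-(13), together with the 0.1-0.9 input scaling used later for training, might look as follows. The function names and the min-max normalization formula are assumptions for illustration:

```python
import numpy as np

def feature_moments(ah, av, k_max=5):
    """Mean and central moments of |z(t)|, with z(t) = ah(t) + j*av(t);
    sample versions of Eqs. (10)-(12)."""
    z = np.abs(ah + 1j * av)                 # magnitude of the complex series
    mean = z.mean()                          # sample mean, Eq. (11)
    central = [((z - mean) ** k).mean() for k in range(2, k_max + 1)]  # Eq. (12)
    return np.array([mean, *central])

def cross_moments(ax, ay, lags=(0, 1, 2)):
    """Cross-correlation r_xy at lags l = 0, 1, 2, Eq. (13)."""
    return np.array([np.sum(ax[l:] * ay[:len(ay) - l]) for l in lags])

def normalize(m, lo=0.1, hi=0.9):
    """Min-max scale a feature vector into [0.1, 0.9] before feeding the network."""
    return lo + (hi - lo) * (m - m.min()) / (m.max() - m.min())

# Example on a toy record
m = feature_moments(np.arange(10.0), np.zeros(10))
scaled = normalize(m)
```

Each vibration record thus collapses to a five-element feature vector (mean plus second- to fifth-order central moments), which is the input format used in Table 1.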
Fig. 8. Statistical moments for the mass unbalance case. (a) Moments of $|z(t)|$; (b) moments of $|dz(t)|$; (c) cross-moments.
The network architecture is referred to in terms of numerals as (i, j1, j2, ..., jn, k), where i denotes the number of neurons in the input layer, jn the number of neurons in the nth hidden layer and k the number of output neurons. The computed moments are normalized between 0.1 and 0.9 before being fed to the network. The target output for training is made a five-dimensional vector; its dimension equals the number of primary faults for which data are available. Since five types of primary faults were introduced in the
Fig. 9. Statistical moments for the bearing cap loose case. (a) Moments of $|z(t)|$; (b) moments of $|dz(t)|$; (c) cross-moments.
Fig. 10. Statistical moments for the misalignment case. (a) Moments of $|z(t)|$; (b) moments of $|dz(t)|$; (c) cross-moments.
laboratory rig, the dimension of the target output vector was made five. The sixth fault is a combination of mass unbalance and misalignment. Each component of the target output vector takes the value 1 or 0: a value of 1 in a dimension indicates the presence of a particular fault, while a value of 0 signifies its absence. It should be noted that the 'no fault' condition is treated as a fault, and is present in all other cases too. However, the indication of the presence or absence of a fault by 0 or 1 is notional. For simplicity, the target vectors for the various cases have been kept as
Table 1
Training data^a

No.   m1      m2       m3        m4         m5           NF  COU  MU  BCL  MA
1     3.825    5.497    50.216    201.730    1356.723    0   0    1   0    1
2     9.104   35.538   407.055   3702.857   40905.828    0   1    0   0    0
3     1.058    0.628     3.292      5.597      19.738    0   0    0   1    0
4     3.073    5.475    38.027    163.432     974.312    0   0    0   0    1
5     1.348    1.505     7.460     22.315      91.944    0   1    0   0    0
6     3.144    5.434    37.044    150.688     858.668    1   0    0   0    0
7     1.062    0.546     3.242      4.753      17.142    0   0    0   1    0
8     2.725    5.196    33.661    150.329     866.723    0   0    1   0    0
9     1.127    0.695     3.972      7.054      26.745    0   0    0   1    0
10    4.481    7.322    71.231    329.811    2471.105    0   0    0   0    1
11    4.063    7.876    64.686    338.588    2596.214    0   0    1   0    1
12    3.338    5.774    40.471    182.959    1142.190    0   1    0   0    0
13    3.452    5.385    43.947    173.109    1057.272    0   0    1   0    1
14    4.282    7.427    62.638    267.333    1737.198    0   0    1   0    0
15    4.920    6.883    70.051    251.523    1715.546    0   0    1   0    0
16    4.174    7.779    68.666    334.536    2387.193    0   0    0   0    1
17    2.024    4.243    24.412    120.046     716.274    0   0    0   1    0
18    4.490    6.139    65.094    253.583    1866.117    0   0    0   0    1
19    3.669    5.560    48.572    204.294    1369.507    0   0    1   0    1
20    3.605    5.023    43.976    166.848    1082.105    0   0    0   0    1

^a m1-m5: moments (input vector); NF: no fault; COU: coupling loose; MU: mass unbalance; BCL: bearing cap loose; MA: misalignment (output vector).
No-fault case (NF)              1 0 0 0 0
Play in spider coupling (COU)   0 1 0 0 0
Mass unbalance (MU)             0 0 1 0 0
Bearing cap loose (BCL)         0 0 0 1 0
Misalignment (MA)               0 0 0 0 1
and have been trained accordingly. The vector size could, however, be kept smaller, for example, by choosing the set of target vectors as NF = 0 0 0, COU = 1 0 0, etc. The training scheme with the central moments of $|z(t)|$ alone is described in Table 1. Six hundred such data are used for training the network. Initially, all the weights in the network are set randomly between -0.5 and 0.5. Once the learning process starts, the neural network is so designed that the weights and the thresholds between different layers adjust automatically, so as to minimize the mean square error between the actual network output and the target output. In minimizing the error, the other network parameters like momentum, learning rate, number of hidden layers and nodes in each layer are adjusted. The network algorithm is trained for different architectures and training schemes. The training process was started with the mean and the second to fifth central moments of $|z(t)|$. The network needs to have a (5, j1, j2, ..., jn, 5) architecture, since there are five input nodes,
each node corresponding to a moment input; and an output layer consisting of five nodes, each node corresponding to one of the five primary faults under consideration. The number of hidden layers can be varied, as can the number of neurons in each hidden layer. A total of 600 samples, comprising 100 samples for each fault, was used for training the network. A numerical value is required initially to quantitatively define the accuracy of the training process. A training process is successful if it is able to correctly identify the fault. The degree of such accuracy, called the target error, is the mean square difference between a target value (0 or 1) and the achieved output. To start with, the achievable accuracy was arbitrarily chosen as 0.01. The output vectors generated by the network do not always consist of exactly 0 or 1; if a very high degree of target accuracy is to be achieved, an enormous number of iterations of the forward and backward passes is required. Therefore, the number of iterations is kept to a manageable level by considering a component value greater than or equal to 0.6 as 1 and, similarly, one less than or equal to 0.3 as 0. If all components of the output vector are less than 0.6, the prediction of the fault is ambiguous. Tables 2 and 3 describe the training success of different architectures with the five moments of $|z(t)|$ alone and of $|dz(t)|$ alone, respectively, as inputs. An attempt was made to improve the convergence by training with the moments of $|z(t)|$ and $|dz(t)|$ simultaneously. For a (10, 10, 5) architecture (the input layer now has 10 neurons, since there are 10 inputs), the training process was carried out with number of iterations 2000, learning rate coefficient η = 0.1 and momentum α = 0.9. The values of η and α were then changed to achieve a network which was able to correctly identify the faults in about 68% of the cases. Table 4 gives the results obtained for such tests. Since the results were encouraging, an additional hidden layer with 10 neurons was introduced. The architecture now was (10, 10, 10, 5). The performance of this architecture was noted for

Table 2
Testing success with first five moments of $|z(t)|$^a

No.   Network architecture   Testing success (%)
1     5, 5, 5                35.0
2     5, 10, 5               47.5
3     5, 5, 5, 5             41.0
4     5, 6, 6, 5             42.5
5     5, 8, 8, 5             42.5

^a Number of iterations 2000, learning rate coefficient η = 0.1, momentum α = 0.9.

Table 3
Testing success with first five moments of $|dz(t)|$^a

No.   Network architecture   Testing success (%)
1     5, 5, 5                50.0
2     5, 10, 5               52.5
3     5, 5, 5, 5             57.5
4     5, 6, 6, 5             60.0
5     5, 8, 8, 5             57.5

^a Number of iterations 2000, learning rate coefficient η = 0.1, momentum α = 0.9.
Table 4
Performance of the (10, 10, 5) architecture^a

No.   Network architecture   Momentum   Learning rate   Testing success (%)
1     10, 10, 5              0.90       0.10            60.00
2     10, 10, 5              0.90       0.20            65.00
3     10, 10, 5              0.80       0.10            65.00
4     10, 10, 5              0.70       0.20            61.25
5     10, 10, 5              0.80       0.25            68.67

^a Number of iterations 2000.
various learning rates. A learning rate of 0.25, with α = 0.80, generated the highest success of 83%. The performance of this training scheme is given in Table 5. Training was further attempted for data comprising the mean, the second-, third- and fourth-order central moments of the horizontal and vertical displacements, along with six cross-moments up to the fourth order. The number of input nodes in this case becomes 14, while the number of output nodes remains 5. Architectures with two hidden layers were trained and tested. The number of neurons in each hidden layer was kept variable. It was found that the best results were obtained with 14 neurons in each hidden layer. Increasing the number of neurons further resulted
Table 5
Network and target outputs for the (10, 10, 10, 5) architecture^a

      Network output                          Actual output
No.   NF     COU    MU     BCL    MA          NF  COU  MU  BCL  MA
1     0.009  0.001  0.065  0.677  0.064       0   0    0   1    0
2     0.000  0.003  0.595  0.023  0.860       0   0    0   0    1
3     0.003  0.000  0.058  0.000  0.832       0   0    0   0    1
4     0.000  1.000  0.001  0.000  0.000       0   1    0   0    0
5     0.232  0.000  0.222  0.077  0.000       0   0    0   1    0
6     0.018  0.001  0.038  0.827  0.029       0   0    0   1    0
7     0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
8     0.006  0.003  0.640  0.004  0.710       0   0    0   0    1
9     0.953  0.003  0.048  0.001  0.016       1   0    0   0    0
10    0.000  1.000  0.001  0.000  0.000       0   1    0   0    0
11    0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
12    0.995  0.002  0.014  0.003  0.002       1   0    0   0    0
13    0.030  0.002  0.663  0.001  0.598       0   0    1   0    1
14    0.000  0.000  0.000  1.000  0.000       0   0    0   1    0
15    0.093  0.002  0.679  0.001  0.510       0   0    1   0    1
16    0.090  0.002  0.679  0.001  0.513       0   0    1   0    1
17    0.000  1.000  0.001  0.000  0.000       0   1    0   0    0
18    0.081  0.002  0.677  0.001  0.521       0   0    1   0    0
19    0.009  0.000  0.944  0.005  0.000       0   0    1   0    0
20    0.034  0.871  0.021  0.001  0.041       0   1    0   0    0

^a Number of iterations 2000, learning rate coefficient η = 0.25, momentum α = 0.8.
Table 6
Testing success for the 14-input architecture^a

No.   Network architecture    Momentum   Learning rate   Testing success (%)
3     14, 14, 14, 5           0.90       0.10            82.85
2     14, 14, 14, 5           0.90       0.15            80.00
5     14, 14, 14, 5           0.90       0.20            87.14
6     14, 14, 14, 5           0.90       0.40            88.57
8     14, 14, 14, 5           0.80       0.10            91.42

^a Number of iterations 2000.
Table 7
Network and target outputs for the 14-input architecture^a

      Network output                          Actual output
No.   NF     COU    MU     BCL    MA          NF  COU  MU  BCL  MA
1     0.000  0.000  0.000  0.999  0.000       0   0    0   1    0
2     0.000  0.000  0.001  0.013  1.000       0   0    0   0    1
3     0.000  0.000  0.000  0.000  1.000       0   0    0   0    1
4     0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
5     0.000  0.000  0.000  0.999  0.000       0   0    0   1    0
6     0.000  0.079  0.000  0.960  0.000       0   0    0   1    0
7     0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
8     0.000  0.000  0.157  0.002  1.000       0   0    0   0    1
9     0.602  0.000  0.409  0.000  0.000       1   0    0   0    0
10    0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
11    0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
12    0.001  0.003  0.000  1.000  0.000       1   0    0   0    0
13    0.000  0.000  0.984  0.000  1.000       0   0    1   0    1
14    0.000  0.003  0.000  1.000  0.000       0   0    0   1    0
15    0.000  0.000  0.766  0.000  0.996       0   0    1   0    1
16    0.000  0.000  0.700  0.006  1.000       0   0    1   0    1
17    0.000  1.000  0.000  0.000  0.000       0   1    0   0    0
18    0.004  0.000  0.005  0.938  0.000       0   0    0   1    0
19    0.025  0.000  0.610  0.431  0.341       0   0    1   0    0
20    0.000  1.000  0.000  0.000  0.000       0   1    0   0    0

^a Number of iterations 2000, learning rate coefficient η = 0.1, momentum α = 0.8.
in overtraining and decreased the success rate. The learning rate and momentum were now varied and the network performance was recorded; Table 6 gives this record. Tests have been carried out for a number of combinations of the learning rate coefficient η and momentum α other than those reported in Table 6. The testing success is not found to be a linear function of these two training parameters, and their influence needs to be investigated further. However, for the tests carried out, the best results are obtained for learning rate coefficient η = 0.1 and momentum α = 0.8, as reported in Table 6. The success rate here is 91% and the network is found to converge to a mean square error of 0.05. A comparison of the network and the actual output for fault identification in this case is shown in Table 7.
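The decision rule described earlier, reading a component greater than or equal to 0.6 as 1, less than or equal to 0.3 as 0, and anything in between as ambiguous, together with the resulting success rate, can be sketched as follows. This is one reading of the rule, with illustrative function names:

```python
def decode(output, hi=0.6, lo=0.3):
    """Map a network output vector to 0/1 per dimension; None if ambiguous."""
    decoded = []
    for v in output:
        if v >= hi:
            decoded.append(1)
        elif v <= lo:
            decoded.append(0)
        else:
            return None          # neither clearly present nor clearly absent
    return decoded

def success_rate(outputs, targets):
    """Percentage of test vectors whose decoded output matches the target."""
    hits = sum(1 for o, t in zip(outputs, targets) if decode(o) == t)
    return 100.0 * hits / len(targets)

# Row 13 of Table 7: both MU and MA are read as present (the combined fault)
row13 = [0.000, 0.000, 0.984, 0.000, 1.000]
```

Applying `decode` row by row to Table 7 reproduces the kind of match/mismatch accounting behind the quoted 91% testing success.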
4. Remarks

The study illustrates the effectiveness of artificial neural network procedures for fault diagnosis in a rotor-bearing system. The simulator is found to identify an unknown fault to a good degree of accuracy. However, no attempt has been made in this work at quantification of the fault once it is identified (e.g. an estimate of the amount of unbalance, if the fault is identified as unbalance). The focus, presently, was to generate data for healthy and faulty rotor systems and develop a preliminary neural network diagnosis frame. It has been found that the testing success, in addition to the input and hidden layer architecture, is crucially dependent on the two training parameters, namely the learning rate coefficient and the momentum. These parameters do not show a linear pattern of behavior and their role needs to be investigated further.

References

[1] M.D. Childs, Turbomachinery Rotordynamics, Wiley, Chichester, 1993.
[2] M.F. Dimentberg, Statistical Dynamics of Nonlinear and Time-varying Systems, Wiley, Chichester, 1998.
[3] F.F. Ehrich, Handbook of Rotordynamics, McGraw-Hill, New York, 1992.
[4] S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan, New York, 1994.
[5] A.C. McCormick, A.K. Nandy, Real-time classification of rotating shaft loading conditions using artificial neural networks, IEEE Transactions on Neural Networks 8 (3) (1997) 748-757.
[6] N.C. Nigam, Introduction to Random Vibrations, MIT Press, Cambridge, MA, 1983.
[7] J.S. Rao, Rotor Dynamics, third ed., Wiley Eastern, New Delhi, 1996.
[8] J.S. Sohre, Turbomachinery problems and their correction, Standardisation and Condition Monitoring Workshop, Chapter 7, Houston, 1991.