Chinese Journal of Aeronautics, Vol. 18, No. 1, February 2005
The Application of Support Vector Machines to Gas Turbine Performance Diagnosis

HAO Ying (1, 2), SUN Jian-guo (1), YANG Guo-qing (1, 3), BAI Jie (2)

(1. College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
(2. Aeronautical Mechanics and Avionics Engineering College, Civil Aviation University of China, Tianjin 300300, China)
(3. Civil Aviation Administration of China, Beijing 100000, China)

Abstract: Support vector machines (SVMs) are a new artificial intelligence methodology derived from Vapnik's statistical learning theory, and they generalize better than artificial neural networks. A C-Support Vector Classifiers Based Fault Diagnostic Model (CBFDM), which gives the 3 most possible fault causes, is constructed in this paper. Five-fold cross validation is chosen as the method of model selection for CBFDM. The simulated data are generated from the PW4000-94 engine influence coefficient matrix at cruise, and the results show that the diagnostic accuracy of CBFDM is over 93% even when the standard deviation of the noise is 3 times the normal level. This model can also be used for other diagnostic problems.

Key words: aerospace propulsion system; performance diagnosis; support vector machines; model selection

The application of support vector machines to gas turbine performance diagnosis. HAO Ying, SUN Jian-guo, YANG Guo-qing, BAI Jie. Chinese Journal of Aeronautics (English edition), 2005, 18(1): 15-19.

Article ID: 1000-9361(2005)01-0015-05
CLC number: V235.1
Document code: A
Gas turbine engine condition monitoring and performance diagnosis are useful tools for realizing on-condition maintenance. The goal of gas turbine performance diagnosis is to accurately detect, isolate and identify faults. Gas path analysis (GPA), a linear model based method, was first introduced by Urban in 1972 [1]. Since then, many different derivatives [2] were developed to improve GPA. Due to the severe non-linearity of engine behavior, a non-linear model based method was introduced in 1990 by Stamatis et al. In order to solve the problem of a local minimum, a genetic algorithm was used by Zedda [3]. In the last two decades, artificial neural networks (ANN) [4] have been widely used in gas turbine performance diagnostics [5].

In order to overcome the disadvantage of neural networks, namely over-learning or under-learning, a new artificial intelligence method, support vector machines (SVMs), was developed by Vapnik in 1995 [6]. The advantage of SVMs is that they have better generalization than ANN.

The purpose of this paper is to investigate the feasibility of the SVM method in gas turbine performance diagnostics.

Received date: 2004-06-07; Revision received date: 2004-11-15
Foundation item: Civil Aviation Science Foundation of China (2003-193-22); Science Foundation of Civil Aviation University of China (04-CAUC-11E)
1 The Fundamentals of SVMs [6]

In order to state the theory simply, a training sample set is first assumed to be separable by a hyperplane (decision function). The decision function is

$$f(x) = \operatorname{sgn}(w^{T} x + b) \tag{1}$$

where w is the normal vector of the hyperplane, b is the offset and x is a pattern.

According to Vapnik's statistical learning theory, the hyperplane with the best generalization is found by solving the following optimization problem

$$\min\ \Phi(w) = \frac{1}{2}\|w\|^{2} \tag{2}$$

$$\text{s.t.}\quad y_i (w^{T} x_i + b) \ge 1, \quad i = 1, 2, \ldots, n \tag{3}$$

where y_i is the label of pattern x_i and n is the number of training samples.

The above problem can be changed into its dual problem, and after kernel substitution it becomes

$$\max\ Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j k(x_i, x_j) \tag{4}$$

$$\text{s.t.}\quad \alpha_i \ge 0, \quad \sum_{i=1}^{n} \alpha_i y_i = 0 \tag{5}$$

where k(x_i, x_j) = ⟨φ(x_i), φ(x_j)⟩, φ is an implicit mapping and α_i (i = 1, 2, ..., n) are Lagrange multipliers.

Since the training sample set is usually not linearly separable, an implicit mapping φ is introduced, which maps the training data in input space into a higher dimensional feature space (see Fig. 1). This is the kernel substitution technique, which can convert a non-linear problem in input space into a linear problem in a higher dimensional feature space. For example, it is known that standard PCA (principal component analysis) in input space is a linear projection and is not suitable for extracting the non-linear structure of a data set. By means of kernel substitution, however, standard PCA in feature space, which is called kernel PCA, can extract the non-linear structure of the data set well [7].

Fig. 1 An implicit non-linear mapping introduced by kernel substitution

For noisy data, however, slack variables ξ_i are introduced to balance the training error and the generalization ability of the decision function. The slack variables change the hard margin into a soft margin, so the optimization problem (2) and the constraints (3) become

$$\min\ \Phi(w, \xi) = \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{n} \xi_i \tag{6}$$

$$\text{s.t.}\quad y_i (w^{T} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, n \tag{7}$$

where C is a constant (C > 0) determining the trade-off. The dual problem of Eqs. (6) and (7) is still Eq. (4), subject to the following constraints

$$\text{s.t.}\quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0 \tag{8}$$

The optimal decision function is

$$f^{*}(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \alpha_i^{*} y_i k(x, x_i) + b^{*} \right) \tag{9}$$

When an SVM with trade-off constant C is used for classification, it is also called C-SVC (C-Support Vector Classifier). This paper uses C-SVC to diagnose gas turbine performance faults.

2 PW4000-94 Engine Fault Diagnosis

PW4000-94 engine faults can be classified into two types: module performance loss and system/instrumentation malfunction. When a module degrades, its efficiency and flow capacity change simultaneously. By statistical analysis, Pratt & Whitney obtained the couple factors of the modules [8]. Table 1 shows the couple factors of the PW4000 engine modules. Therefore a single parameter, module performance loss, is used to measure the degradation. There are 20 faults of the PW4000-94 engine to be diagnosed. This paper focuses on single fault diagnosis.
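As a small illustration of the coupling between efficiency and flow capacity described above, the sketch below converts a module efficiency change into the coupled flow-capacity change. The factor values are those of Table 1; the names `COUPLE_FACTORS` and `coupled_deltas`, and the linear scaling to efficiency changes other than 1%, are assumptions made for illustration and are not stated in the source.

```python
# Couple factors from Table 1: flow-capacity change (%) per 1% efficiency change.
COUPLE_FACTORS = {
    "FAN": 1.25,
    "LPC": 1.10,
    "HPC": 0.80,
    "HPT": -0.75,
    "LPT": -1.65,
}


def coupled_deltas(module, efficiency_delta_pct):
    """Return (efficiency change %, flow-capacity change %) for one module.

    Assumes (hypothetically) that the coupling scales linearly with the
    size of the efficiency change.
    """
    factor = COUPLE_FACTORS[module]
    return efficiency_delta_pct, factor * efficiency_delta_pct


# A 1% efficiency loss in the FAN and in the HPT.
fan_loss = coupled_deltas("FAN", -1.0)
hpt_loss = coupled_deltas("HPT", -1.0)
print(fan_loss, hpt_loss)
```

With this convention a compressor-side module loses flow capacity as its efficiency drops, while a turbine module gains it, which matches the signs listed in the Notes column of Table 1.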
2.1 Training and testing fault samples

The quality of the training data plays an essential role in the accuracy of SVM diagnosis. The smoothed deltas provided by engine condition monitoring software are used as the input parameters of the diagnostic model.

Fault samples are generated by using the PW4000-94 engine influence coefficient matrix at cruise. The formula is

$$\text{Sensor data} = \text{clean data} + K \sigma \cdot \text{randn} \tag{10}$$

where randn is a normally distributed random number with mean zero and variance one, σ is the standard deviation of the smoothed data and K is the control parameter governing the noise level. The standard deviations used here are from Ref. [9]. The training and testing samples are normalized. Here, the focus is on the isolation of single faults.

Table 1 Couple factors of PW4000 modules

Module   FC       Notes
FAN      1.25     Coupled FAN (-1% η, -1.25% FC)
LPC      1.10     Coupled LPC (-1% η, -1.10% FC)
HPC      0.80     Coupled HPC (-1% η, -0.80% FC)
HPT      -0.75    Coupled HPT (-1% η, +0.75% FC)
LPT      -1.65    Coupled LPT (-1% η, +1.65% FC)

Note: η is the efficiency of a module. FC is the flow capacity of a module.

2.2 Using C-SVC in multi-class diagnosis

The C-SVC discussed in Section 1 is a binary classifier, which means that it can only discern one class from another. However, the PW4000-94 engine has 20 faults to be diagnosed. There are usually two approaches to solving a multi-class problem with binary classifiers: the one-against-one approach and the one-against-all approach. Since each binary classifier in the second approach must be trained with all samples, it is not suitable for cases with many classes. Therefore, the one-against-one approach is used in this study.

In order to diagnose the 20 classes of engine faults, 190 binary classifiers (C-SVCs) are constructed. Each one gives a vote, and the fault which gets the most votes is considered the most possible fault. Here the 3 most possible faults are given, ranked by their votes in descending order. The architecture of the C-SVCs Based Fault Diagnostic Model (CBFDM) is shown in Fig. 2. The training algorithm of a binary C-SVC is the sequential minimal optimization (SMO) by Platt [6].

Fig. 2 The architecture of the C-SVCs based fault diagnostic model (CBFDM)
Note: SVCi is the i-th binary C-SVC. VOTE BLOCK ranks the votes of each fault and gives the 3 most possible faults. N1C2: low pressure rotor speed; N2C2: high pressure rotor speed; T49C2: exhaust gas temperature; WF: fuel flow.

2.3 C-SVC model selection

To obtain high diagnostic accuracy, the parameters of a binary C-SVC must be properly chosen. They include C (the trade-off constant), the kind of kernel function and the kernel coefficient. Since the kind of kernel function has less effect on the predictive accuracy of the support vector machine, the radial basis function (RBF) is chosen in this study. The RBF kernel is

$$k(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^{2}\right) \tag{11}$$

where x_i is the i-th fault pattern and γ is the kernel coefficient.

Now there are two parameters, C and γ, to be chosen. In this study, cross validation via parallel parameter grid search is used for model selection. Here the first-pick diagnostic accuracy of the CBFDM is used to determine which combination of C and γ is the best, therefore all 190 C-SVCs have the same parameters. The training sample set generated contains 2,000 samples; each fault has 100 samples and the noise level is K = 1. Here 5-fold cross validation is performed. C and γ vary as the
following:

$$C = 2^{1}, 2^{2}, \ldots, 2^{18}; \qquad \gamma = 2^{-10}, 2^{-9}, \ldots, 2^{8}$$

Therefore there are 342 combinations (18 values of C times 19 values of γ) when performing the grid search. The process of model selection is time consuming: for most combinations, cross validation takes several minutes or even less, but in some cases it may take 1-2 h or even more. Fortunately, the accuracy stays high over a relatively wide area. The results are shown in Fig. 3.

Fig. 3 The contour plot of the accuracy

As shown in Fig. 3, the area with high accuracy is located near a straight line in the plane of log2(C) and log2(γ). Lin's study [10] shows similar results for the asymptotic behaviour of parameters such as C and γ. After the grid search, C = 16 and γ = 2 are chosen. The prediction accuracy of this combination is 95.35%.

2.4 Fault diagnosis and the analysis of results

After model selection, the whole training sample set with K = 1 is used to retrain the CBFDM. Then, four testing sample sets with different noise levels (K = 1, 1.5, 2, 3) are generated to test the CBFDM. Each testing sample set contains 2,000 samples and each fault has 100 samples. The results for K = 1 and K = 3 are shown in Table 2, where the first pick (Top1), Top2 and Top3 accuracies are given.

Table 2 The accuracy of 4 testing sample sets (Unit: %)

            K = 1                     K = 3
Fault    Top1    Top2    Top3     Top1    Top2    Top3
FT01     96.0    100.0   100.0    10.0    100.0   100.0
FT02     97.0    100.0   100.0    19.0    98.0    99.0
FT03     95.0    99.0    100.0    28.0    90.0    94.0
FT04     95.0    99.0    100.0    39.0    88.0    93.0
FT05     95.0    99.0    100.0    46.0    85.0    92.0
FT06     93.0    99.0    100.0    54.0    78.0    87.0
FT07     92.0    99.0    100.0    57.0    76.0    85.0
FT08     90.0    100.0   100.0    57.0    78.0    86.0
FT09     92.0    100.0   100.0    52.0    72.0    83.0
FT10     91.0    100.0   100.0    51.0    68.0    81.0
FT11     93.0    100.0   100.0    54.0    67.0    80.0
FT12     89.0    100.0   100.0    51.0    66.0    83.0
FT13     91.0    100.0   100.0    48.0    70.0    87.0
FT14     91.0    100.0   100.0    47.0    74.0    89.0
FT15     89.0    100.0   100.0    47.0    80.0    91.0
FT16     88.0    100.0   100.0    46.0    83.0    96.0
FT17     93.0    100.0   100.0    45.0    83.0    94.0
FT18     95.0    100.0   100.0    44.0    80.0    92.0
FT19     94.0    100.0   100.0    46.0    81.0    93.0
FT20     96.0    100.0   100.0    40.0    77.0    93.0
Average  95.0    99.9    100.0    61.7    85.8    93.6

Note: K is the noise level.

Table 2 shows that
1) for K = 1, the accuracy of Top1 is 95.0%. As the noise level K increases, the accuracy of Top1 decreases significantly.
2) the accuracy of Top3 still remains at a high value (93.6%) even when K = 3.

The decrease in Top1 accuracy is caused by the distortion of the fault pattern features as K increases. Take FT01 as an example. Table 2 shows that the Top1 accuracy of FT01 is only 10% when K = 3. FT01 is very similar to FT18: FT01 is FAN performance loss, while FT18 is FAN discharge increase or reverser leak. Table 3 gives the predictive output for 5 samples belonging to FT01, randomly selected from the sample set with K = 2. For sample No. 2, the output of the CBFDM is that FT18 gets 19 votes, FT01 gets 18 votes and FT07 gets 17 votes. This shows that a high noise level distorts the features of the fault pattern; however, the Top3 accuracy of the CBFDM still remains at a high level.

Table 3 The CBFDM's output of 5 FT01 samples (K = 2)

        1st Pick         2nd Pick         3rd Pick
No.     Label   Votes    Label   Votes    Label   Votes
1       FT01    19       FT18    18       FT07    17
2       FT18    19       FT01    18       FT07    17
3       FT18    19       FT01    18       FT07    17
4       FT18    19       FT01    18       FT07    17
5       FT18    19       FT01    18       FT07    17
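The one-against-one voting and Top-3 ranking discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `top_k_by_votes` and `nearest_centroid_pair` are hypothetical, the pairwise classifier is a nearest-centroid stand-in rather than a trained C-SVC, and the 2-D centroids are made-up toy values, not engine data.

```python
from collections import Counter
from itertools import combinations


def top_k_by_votes(pattern, pairwise_classifiers, k=3):
    """Rank fault labels by one-against-one votes and return the k best.

    pairwise_classifiers maps a (label_a, label_b) pair to a callable that
    returns whichever of the two labels it prefers for the given pattern.
    """
    votes = Counter()
    for classify in pairwise_classifiers.values():
        votes[classify(pattern)] += 1
    return votes.most_common(k)


def nearest_centroid_pair(centroids, label_a, label_b):
    """Hypothetical stand-in for a trained binary C-SVC: picks the closer centroid."""
    def classify(pattern):
        def dist2(label):
            return sum((p - c) ** 2 for p, c in zip(pattern, centroids[label]))
        return label_a if dist2(label_a) <= dist2(label_b) else label_b
    return classify


# Toy 4-class problem with made-up 2-D centroids (not engine data).
centroids = {
    "FT01": (0.0, 0.0),
    "FT02": (1.0, 0.0),
    "FT03": (0.0, 1.0),
    "FT04": (5.0, 5.0),
}
classifiers = {
    (a, b): nearest_centroid_pair(centroids, a, b)
    for a, b in combinations(sorted(centroids), 2)
}

# Each of the 6 pairwise classifiers casts one vote; the 3 best labels are kept.
ranking = top_k_by_votes((0.1, 0.2), classifiers, k=3)
print(ranking)

# Number of pairwise classifiers needed for 20 fault classes.
n_pairs_20_classes = len(list(combinations(range(20), 2)))
print(n_pairs_20_classes)
```

With 20 fault classes each label can collect at most 19 votes, which is why the first pick in Table 3 has 19 votes and a closely competing fault has 18.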
3 Concluding Remarks

This paper has presented an application of support vector machines to aircraft engine performance diagnosis. The conclusions drawn from this study are as follows:

1) The support vector machines based method is inherently nonlinear through the introduction of a proper kernel function, and can be used in applications where severe non-linearity exists or model information is scarce or lacking.

2) Model selection is very important to the accuracy of diagnostics. Although the process of model selection is time consuming, fortunately the accuracy stays high over a wide area.

3) The C-SVCs Based Fault Diagnostic Model (CBFDM) that gives the 3 most possible faults is feasible. This model is also suitable for other diagnostic problems.

References

[1] Urban L. A gas path analysis applied to turbine engine condition monitoring [R]. AIAA 72-1082, 1972.
[2] Sun C L, Fan Z M. Principle component algorithm for aeroengine diagnosis [J]. Acta Aeronautica et Astronautica Sinica, 1998, 19(3): 342-345. (in Chinese)
[3] Zedda M, Singh R. Gas turbine engine and sensor fault diagnosis using optimization techniques [R]. AIAA 99-2842, 1999.
[4] Chen T, Sun J G, Yang W H, et al. Self-organizing neural networks based fault diagnosis for engine gas path [J]. Acta Aeronautica et Astronautica Sinica, 2003, 24(1): 46-48. (in Chinese)
[5] Hao Y, Sun J G, Bai J. State of the art prospect of aircraft engine fault diagnosis using gas path parameters [J]. Journal of Aerospace Power, 2003, 18(6): 753-760. (in Chinese)
[6] Scholkopf B. Statistical learning and kernel methods [R]. Microsoft: MSR-TR-2000-23, 2000.
[7] Scholkopf B, Smola A, Muller K R. Nonlinear component analysis as a kernel eigenvalue problem [J]. Neural Computation, 1998, 10(5): 1299-1319.
[8] Pratt & Whitney customer training center. Module analysis program network (MAPNET) training guide [M]. USA: Pratt & Whitney, 1997.
[9] Lu P J, Zhang M C, Hsu T H, et al. An evaluation of engine faults diagnostics using artificial neural networks [A]. In: Proceedings of ASME TURBO EXPO 2000 [C]. Munich, Germany, ASME 2000-GT-29, 2000.
[10] Keerthi S S, Lin C J. Asymptotic behaviors of support vector machines with Gaussian kernel [J]. Neural Computation, 2003, 15(7): 1667-1689.

Biographies:

HAO Ying Born in 1973, he received his M.S. from Nanjing University of Aeronautics and Astronautics (NUAA) in 1997. His research fields include condition monitoring and diagnosis, control, and the operational reliability of civil aeroengines. Tel: 022-24093541, E-mail: cauc3541@eyou.com

SUN Jian-guo Born in 1939, a visiting scholar at Columbia University in 1982-1984, he is a professor and doctoral supervisor of NUAA. His research interests include modeling and control of aeroengines, and integrated flight/propulsion control. Tel: 025-84893186, E-mail: jgsunpe@nuaa.edu.cn

YANG Guo-qing Born in 1949, he is a professor and doctoral supervisor of NUAA, and Vice Minister of the Civil Aviation Administration of China. His research interests include image processing, neural networks, etc.

BAI Jie Born in 1963, he received his M.S. from Harbin Institute of Technology in 1988. He is a professor and vice president of Civil Aviation University of China. His research fields include aeroengine condition monitoring and diagnosis, maintenance management, etc. Tel: 022-24092005, E-mail: jbai@cauc.edu.cn