16th European Symposium on Computer Aided Process Engineering and 9th International Symposium on Process Systems Engineering W. Marquardt, C. Pantelides (Editors) © 2006 Published by Elsevier B.V.
Fault Detection and Diagnosis of Pulp Mill Process

Gibaek Lee (a), Thidarat Tosukhowong (b), Jay H. Lee (b)
(a) Chungju National University, Chungju 380-702, Korea
(b) Georgia Institute of Technology, Atlanta, GA 30332, U.S.A.

Abstract
The hybrid fault diagnosis method that combines the signed digraph (SDG) and partial least squares (PLS) improves diagnosis resolution, accuracy, and reliability compared with previous qualitative methods, and enhances the ability to diagnose multiple faults. In this study, the method is applied to the fault diagnosis of the pulp mill process, which produces pulp from wood chips. It is one of the largest processes tested in the fault diagnosis area, and it can serve as a new benchmark process for testing fault diagnosis methods. To account for the large time delays in the process, the diagnosis model was modified to include information on the time delay between process variables. In case studies, the proposed method demonstrated good diagnosis capability compared with the previous hybrid method, which does not consider time delay.

Keywords: fault detection, fault diagnosis, partial least squares, pulp mill process, transportation lag

1. Introduction
To improve the safety of chemical plants and their personnel, an automatic fault diagnosis system analyzes process data on-line, monitors process trends, and diagnoses faults when an abnormal situation arises. A variety of fault diagnosis approaches have been developed for chemical processes, including expert systems, state estimation such as observers and the extended Kalman filter (EKF), signed digraphs (SDG), fault trees, qualitative simulation, statistical methods, and neural networks [1]. The Tennessee Eastman challenge process [2], created by the Eastman Chemical Company, has been used for evaluating fault diagnosis methods during the past several years.
Recently, Castro and Doyle introduced a pulp mill simulator as a benchmark process for process systems engineering studies [3]. The simulated process can be a realistic alternative testbed for plantwide fault diagnosis methods because it includes approximately 8200 states, 140 inputs, 114 outputs, and 6 operating modes. The process also shows common characteristics of industrial processes, such as long measurement delays, many components, several recycle streams, and high nonlinearity. Safe and reliable operation of the pulp mill process is important for survival in a very competitive international market. Although automatic fault diagnosis systems are in demand, to our knowledge there is currently no literature on fault detection and diagnosis of a pulp mill. For the fault diagnosis of the process, this study uses PCA (principal component analysis), which is compared with the hybrid method combining SDG and PLS [4]. The hybrid method improves the diagnosis resolution and accuracy compared with previous qualitative methods. Moreover, it enhances the reliability of the diagnosis for all predictable faults, including multiple faults. Although it is based on statistical process data, the diagnosis model can be built from easily obtainable data sets and does not require data sets from faulty cases.
2. Pulp Mill Process
2.1. Process Description
Pulp mills produce pulp of a given Kappa no. or brightness from wood chips while satisfying criteria on productivity and environmental impact. The Kappa no., a major variable of interest, is a measure of the amount of lignin remaining in the wood. A pulp mill can be divided into two major sections: the fiberline and the chemical recovery. It includes key unit operations such as the digester, brown stock washers, oxygen reactor, bleach plant, evaporators, recovery boilers, smelt dissolving tank, green liquor clarifiers, mud washers, causticizers, white liquor clarifier, and lime kiln. The fiberline removes lignin from the wood to achieve fibers of a certain brightness and strength target. It uses chemicals such as NaOH, NaSH, oxygen, and ClO2. The process is sequential from the digester to the brown stock washers and the bleaching plant. Wood chips enter the digester with the white liquor (WL, a mixture of NaOH and NaSH). Pulp delignified in the digester is sent to the brown stock washers, where dissolved lignin and chemicals are removed and sent to the chemical recovery area to regenerate NaOH and NaSH from the spent liquor. The final operations in the fiberline further delignify and brighten the pulp in the bleaching section, which consists of an oxygen reactor (O) followed by a chlorine dioxide tower (D1), a sodium hydroxide tower (E), and a second chlorine dioxide tower (D2).

2.2. Process and Fault Simulation
The pulp mill simulator developed by Castro and Doyle [3] is used for this study. It is written in the C language using the Matlab® S-function format with Simulink®, and can be downloaded from the homepage of Doyle's group [5]. This study modified the simulator as follows to obtain a more realistic fault diagnosis problem. First, white noise was added to the simulator input and output variables.
The maximum and minimum values of the noise are ±0.1% of the maximum expected errors or changes of the variables, which were specified in the original simulator. For fault diagnosis, 114 measured output variables and 82 controller outputs are used. It is assumed that measurements of the manipulated variables and disturbances are not available. The name of a process variable is formed by appending the consecutive number of the variable to its type. For instance, KN4 means the Kappa number sensor of y(4). The simulator has also been modified to include sensor and control valve faults. The faulty sensors and control valves were chosen from the most important variables of the process. Four sensors were selected: the digester Kappa no., the final brightness of the pulp in the D2 tower, the [OH-] concentration before the E washer, and the D2 production rate. Three types of sensor fault were simulated for these sensors: bias (-5% and +20%), precision degradation (±1% and ±10%), and drift (0.025% × time). Four control valves were also selected: wood chips flow, D1 caustic flow, caustic flow 3, and oxygen WL flow. For each valve, two types of valve fault were defined: bias (1% and -5%) and sticking. In addition, five disturbances already included in the fiberline of the original simulator became target faults: variations in wood temperature and density, changes in the operating temperature and composition of the caustic in the E tower, and a change in the wash water temperature of the E tower. The total number of target faults is 37. The name of a fault is composed by adding the type of fault to the name of the equipment. For example, KNm5Bias means a -5% bias of the digester Kappa number sensor. The sampling interval is 5 minutes and the total simulation time is 1500 minutes. A fault or setpoint change occurs at 600 minutes.
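The three sensor fault types described above can be illustrated with a short Python sketch. It is not the simulator's C/Simulink implementation: the function name `inject_sensor_fault`, the use of the first sample as the nominal value, and the reading of each percentage as a fraction of that nominal value are assumptions made only for illustration.

```python
import random


def inject_sensor_fault(signal, kind, magnitude, start, dt=5.0, seed=0):
    """Return a copy of `signal` with a sensor fault applied from index `start`.

    kind:      "bias"         constant offset of `magnitude` % of nominal
               "degradation"  uniform noise within +/- `magnitude` % of nominal
               "drift"        offset growing by `magnitude` % of nominal per minute
    dt:        sampling interval in minutes (5 minutes in the paper)
    """
    rng = random.Random(seed)
    nominal = signal[0]  # assumption: the first sample is the nominal value
    out = list(signal)
    for k in range(start, len(signal)):
        elapsed = (k - start) * dt  # minutes since the fault occurred
        if kind == "bias":
            out[k] += nominal * magnitude / 100.0
        elif kind == "degradation":
            out[k] += nominal * rng.uniform(-magnitude, magnitude) / 100.0
        elif kind == "drift":
            out[k] += nominal * magnitude / 100.0 * elapsed
        else:
            raise ValueError(f"unknown fault kind: {kind}")
    return out
```

With the 5-minute sampling interval, a fault at 600 minutes corresponds to sample index 120, so a -5% bias would be injected with `inject_sensor_fault(signal, "bias", -5.0, 120)`.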
3. Fault Diagnosis Method
3.1. System Decomposition based on SDG
The first step in building the fault diagnosis model is the system decomposition centered on the measured variables in the SDG. In the SDG, each arc represents the instantaneous effect of the source node on the target node. All source nodes connected to a particular target node by arcs have a direct influence on that target node; that is, only the source nodes connected to a target node can affect it. Because unmeasured source nodes cannot show the direct effects of faults, they are removed, resulting in a reduced digraph that contains only the measured nodes of the original SDG. Each decomposed subprocess includes a central measured variable (the target variable) as well as the measured variables (source variables) and faults connected to the target variable. To diagnose the 37 faults defined in the pulp mill process, the process is decomposed around 16 measured variables, and the reduced digraph for each decomposed subprocess is obtained. The variables are F1, F3, KN4, CD10, T12, T15, and T16 around the digester; KN19, T21, KN22, T23, OH24, T25, BR26, and P37 around the bleaching plant; and CI22 in the recausticizing section. Local fault diagnosis can be performed for each decomposed subprocess. Because fault diagnosis is executed locally for each target variable, the fault diagnosis method using the system decomposition can diagnose all types of multiple faults except those that affect the same measured nodes.

3.2. Fault Diagnosis based on Dynamic PLS Models
The diagnosis based on the decomposition technique estimates the value of each target variable from the measured values of the source variables connected to it. A substantial difference between the estimated and measured values implies the occurrence of one or more faults.
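As a rough illustration of the decomposition in Section 3.1, the following Python sketch reduces a signed digraph to the nodes that are kept (measured variables and faults). The function name `reduce_sdg`, the dictionary representation of arcs, and in particular the rule that a path through unmeasured nodes is bridged with the product of its arc signs are assumptions for illustration; the text only states that unmeasured nodes are removed.

```python
def reduce_sdg(arcs, keep):
    """Reduce a signed digraph to the nodes in `keep`.

    arcs: dict mapping (source, target) -> +1 or -1
    keep: set of node names to retain (measured variables and faults)
    Returns dict: target -> {source: set of path signs}, where every
    intermediate node on a path from source to target is outside `keep`.
    """
    # Index predecessors of every node for a backward search.
    preds = {}
    for (src, dst), sign in arcs.items():
        preds.setdefault(dst, []).append((src, sign))

    reduced = {}
    for target in keep:
        found = {}
        stack = [(target, +1)]
        visited = set()  # (unmeasured node, accumulated sign) pairs already expanded
        while stack:
            node, sign = stack.pop()
            for src, s in preds.get(node, []):
                path_sign = sign * s
                if src in keep:
                    # A kept node ends the path and becomes a source variable.
                    found.setdefault(src, set()).add(path_sign)
                elif (src, path_sign) not in visited:
                    # An unmeasured node is bridged: keep searching upstream.
                    visited.add((src, path_sign))
                    stack.append((src, path_sign))
        reduced[target] = found
    return reduced
```

Each entry of the result corresponds to one decomposed subprocess: the key is the target variable and the value lists its source variables together with the sign of their influence.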
Sensor faults that occur in the sensors of the source variables used for the estimation produce errors in the estimated values, whereas faults added to the target node give rise to errors in the measured values. The estimation in our previous study used a PLS model built for each decomposed subprocess. The input X of the model contains the source variables connected to the target variable, and the output Y is the estimated value of the target variable. To handle the process dynamics accurately, we used dynamic PLS (DPLS), which is integrated with ARMAX. The resulting input of the DPLS model for a target variable includes the source variables, their past values, and the past values of the target variable. The necessary number of past values (time lags) l and the number of principal components (PC) are determined from the learning data. The number l is usually 1 or 2, which indicates the order of the dynamic system. Each DPLS model can be built from an operation data set representing the local relations between the input variables and the output variable of the model. Therefore, the data set required for each DPLS model can be easily obtained. Suitable data sets can be collected in the presence of set-point changes or external disturbances, which occur frequently. The proposed method therefore does not need data sets from faulty cases, which would be difficult to obtain. Fault detection is performed by observing the residual, which is the difference between the measured value and the value estimated by the DPLS model.
$$ r_i = y_i - \hat{y}_i \qquad (1) $$
where $r_i$ is the residual of output variable $i$, and $y_i$ and $\hat{y}_i$ are the measured and estimated values of variable $i$, respectively. A qualitative state, which corresponds to a range of possible values of the residual, becomes an attribute of the residual. We consider methods that use three ranges: low, to which the qualitative state (-) is assigned; normal, assigned (0); and high, assigned (+). If a fault occurs, the qualitative state of the residual may be (+) or (-). An abnormal qualitative state of the residual becomes a symptom, which is expressed as the pair of the target variable and the qualitative state of the residual. The faults that induce the abnormality of each residual are classified along with their symptoms, and the classified faults are stored in a set (called a fault set). Faults can also be classified into two types: faults added to the target variable, and sensor faults that occur in the sensors of the source variables in the DPLS model. The first step of on-line fault diagnosis is the monitoring of the residuals in order to detect a change in their qualitative state. The detected residual of a variable becomes an element of the symptom set. A sensor degradation fault can make the signs of the symptoms fluctuate between (+) and (-), which greatly decreases the diagnosis accuracy. To make the diagnosis stable, CUSUM monitors the squared residuals as well as the residuals of each variable. The next step of fault diagnosis is to obtain the minimum set of faults that can explain all of the detected symptoms.

3.3. Incorporation of Time Delay into the DPLS Model
Time delay can be defined as the time interval between the start of an event at one point and its resulting action at another point. It is also referred to as transportation lag, dead time, or distance-velocity lag. In the target process, large time delays are found in equipment such as the digester, the storage tank, and the D2 tower.
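The residual monitoring step of Section 3.2 can be illustrated with a two-sided tabular CUSUM that maps a residual sequence to a qualitative state. This is a minimal sketch, not the authors' implementation: the function name `cusum_symptom` and the tuning parameters `slack` and `threshold` are assumptions, and the paper does not state which CUSUM variant or settings were used.

```python
def cusum_symptom(residuals, slack, threshold):
    """Two-sided tabular CUSUM on a residual sequence.

    Returns (index, "+") or (index, "-") for the first sample at which the
    upper or lower cumulative sum exceeds `threshold`, or None if the
    residual stays in the normal (0) state.
    """
    s_hi = 0.0  # accumulates positive deviations beyond the slack
    s_lo = 0.0  # accumulates negative deviations beyond the slack
    for k, r in enumerate(residuals):
        s_hi = max(0.0, s_hi + r - slack)
        s_lo = max(0.0, s_lo - r - slack)
        if s_hi > threshold:
            return k, "+"
        if s_lo > threshold:
            return k, "-"
    return None
```

The same function could be applied to a sequence of squared residuals (for example after subtracting their normal-operation mean) to obtain the squared-residual symptoms that stabilize the diagnosis of sensor degradation; that preprocessing is likewise an assumption.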
Because the model assumes that a change in a source variable affects the target variable instantly, these large time delays may make the estimation model inaccurate. To increase the accuracy of the estimation, information on the time delay from the input variables to the output variable should be incorporated into the dynamic PLS model. The input matrix $X_i$ of the DPLS model for variable $i$ is modified accordingly in Eq. (2).

[Eq. (2): input matrix $X_i$ built from the source variables $x_{i1}, \ldots, x_{iN_i}$ and the target variable $y_i$, each evaluated at a delay-shifted time; the full expression is not recoverable from the source text.]

Here $\tau_{MD,i}$ is the measurement delay of variable $i$, $\tau_{TD,ij}$ is the time delay from variable $j$ to variable $i$, and $\tau_i$ is the maximum (MAX) taken over the delays associated with variable $i$ and its source variables $j = 1, \ldots, N_i$. The measured value of variable $i$ used to calculate residual $i$ is $y_i(t + \tau_{MD,i} - \tau_i)$. This study used data obtained with set-point changes to determine the dead time.
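One simple way to estimate a dead time from set-point change data is to look for the lag that maximizes the correlation between a source variable and the target variable. The sketch below uses that approach; it is a swapped-in technique for illustration, since the paper does not say how the dead time was extracted from the data, and the function name `estimate_dead_time` and its parameters are hypothetical.

```python
def estimate_dead_time(source, target, max_lag, dt=5.0):
    """Estimate the delay (minutes) from `source` to `target`.

    Tries lags 0..max_lag (in samples) and returns the lag, converted to
    minutes with sampling interval `dt`, that maximizes the absolute
    correlation between source[k] and target[k + lag].
    Both sequences are assumed to have the same length.
    """
    def corr(a, b):
        n = len(a)
        mean_a, mean_b = sum(a) / n, sum(b) / n
        cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
        var_a = sum((x - mean_a) ** 2 for x in a)
        var_b = sum((y - mean_b) ** 2 for y in b)
        if var_a == 0 or var_b == 0:
            return 0.0  # a constant segment carries no delay information
        return cov / (var_a * var_b) ** 0.5

    best_lag, best_corr = 0, -1.0
    for lag in range(max_lag + 1):
        c = abs(corr(source[:len(source) - lag], target[lag:]))
        if c > best_corr:
            best_lag, best_corr = lag, c
    return best_lag * dt
```

For a pure transport delay the estimate is exact; for a delayed response with additional dynamics it should be treated only as a starting value to be checked against the residuals of the fitted model.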
4. Result and Discussion
Consider the example of D1CStuck (sticking of the D1 ClO2 flow control valve). A +10% setpoint change of the wood chips flow is made at 600 minutes, and the D1 ClO2 flow control valve sticks at the same time. Using the DPLS model with time delay, the detection sequence of symptoms is KN22(+) and KN22^2 from 915 minutes, T21(+) from 920 minutes, T21^2 from 935 minutes, BR26^2 from 1185 minutes, and BR26(-) from 1190 minutes (Figure 1 (a)); the superscript 2 denotes a symptom detected on the squared residual. The bounds in Figure 1 are the minimal jump sizes of CUSUM. There is no false detection, and D1Cm5Bias, D1CStuck, WL1Bias, WLStuck, and WoodDens are obtained as the solution from 920 minutes. Although the resolution is 5, an operator may easily judge that WL1Bias, WLStuck, and WoodDens are not fault candidates because no variables are detected in the digester and oxygen reactor sections. The resolution can also be reduced by using the dynamics of the residuals. Figure 1 (b) shows the residuals obtained by the previous hybrid model without time delay. KN22(+) is detected from 995 minutes, KN22^2 and T21(+) from 1005 minutes, T21^2 from 1035 minutes, BR26^2 from 1055 minutes, and BR26(-) from 1105 minutes. The first detection is 80 minutes later than with the proposed method with time delay. Figure 1 clearly shows that the models with time delay generate clearer and more accurate residuals denoting fault occurrence than the models without time delay.
Fig. 1. Residuals versus time (min) for D1CStuck: (a) obtained by the DPLS models with time delay; (b) obtained by the DPLS models without time delay.
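The last diagnosis step, finding the minimum set of faults that explains all detected symptoms, can be sketched as a search for the smallest fault combinations that cover every symptom. The function name `minimal_fault_sets`, the dictionary layout, and the brute-force search are assumptions for illustration; the fault sets in any example would have to come from the reduced digraphs, not from this sketch.

```python
from itertools import combinations


def minimal_fault_sets(fault_sets, detected):
    """Return the smallest fault combinations explaining all symptoms.

    fault_sets: dict mapping symptom -> set of faults that can cause it
    detected:   iterable of detected symptoms (keys of `fault_sets`)
    Returns a list of frozensets, all of the same (minimum) size; the
    length of that list is the diagnosis resolution.
    """
    detected = list(detected)
    # Only faults that can cause at least one detected symptom are candidates.
    candidates = sorted(set().union(*(fault_sets[s] for s in detected)))
    for size in range(1, len(candidates) + 1):
        hits = [
            frozenset(combo)
            for combo in combinations(candidates, size)
            # Every detected symptom must be caused by some fault in the combo.
            if all(fault_sets[s] & set(combo) for s in detected)
        ]
        if hits:
            return hits
    return []
```

When several single faults each explain every detected symptom, as with the five candidates in the D1CStuck case, all of them are returned; the search is exponential in the worst case, which is acceptable only because each local fault set is small.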
Table 1. Diagnosis result (each entry: proposed method / previous hybrid method without time delay)

Fault         Detection delay (min)   Accuracy
BR0025Drift   *                       1/1
BR10Deg       *                       1/1
BR20Bias      *                       1/1
BRm5Bias      *                       1/0
CFm5Bias      *                       1/1
D1CStuck      315/395                 1/1
D1Cm5Bias     20/20                   0.99/0.94
EbackTemp     15/15                   0.22/0.1
ECausComp     65/65                   1/1
ECausTemp     15/15                   1/1
KN0025Drift   115/105                 1/1
KN10Deg       55/45                   1/1
KN20Bias      35/25                   1/0.97
KNm5Bias      35/35                   1/1
OH0025Drift   65/85                   1/1
OH20Bias      25/25                   0.99/1
OHm5Bias      25/25                   1/1
PR0025Drift   145/175                 1/1
PR20Bias      35/35                   1/1
PRm5Bias      45/45                   1/1
WC1Bias       35/35                   1/1
WCStuck       35/35                   1/1
WCm5Bias      15/35                   0.02/0.0
WL1Bias       25/25                   1/0.94
WLStuck       10/10                   1/1
WLm5Bias      10/20                   0.85/0.28
WoodDens      110/60                  0.26/0.17
WoodTemp      115/70                  1/1

* The detection delays of these five faults read "215/40/560 20/20 20/65/65" in the source and cannot be assigned to individual faults.
To compare the diagnostic performance, accuracy and detection delay are used. The accuracy is 1 if the diagnosis is accurate, that is, if the true fault is included in the final set of fault candidates; otherwise, the accuracy is 0. The detection delay is the time from fault occurrence to fault diagnosis. In Table 1, the first value of each performance parameter is the result obtained by the proposed method, and the second is the result obtained by the previous hybrid method, which does not consider time delay. The 1% faults BR1Deg, CF1Bias, D1C1Bias, KN1Deg, and PR1Deg are too small to be diagnosed by either method and are not shown in Table 1. In addition, the method failed to diagnose CFStuck and PR10Deg. Although T15 is independent of WCm5Bias and WoodDens, the proposed method wrongly detected T15 for these faults. Because WLStuck can explain all of the detected symptoms, the accuracy was very low. The diagnostic performance of the new method is much better than that of the previous one for all cases except WoodDens and WoodTemp. The faster detection by the previous method in these two cases is due to a fast but wrong detection of T12. In the WoodTemp case, the wood temperature increases and the symptom T12(+) should be detected. However, the previous method detected T12(-), and this detection was earlier than that of the new method.
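The two performance measures can be computed from a time series of candidate sets as in the sketch below. Table 1 reports fractional accuracies, which suggests that the 0/1 accuracy is averaged over samples; that averaging, the function name `diagnosis_metrics`, and the convention that an empty set means "no diagnosis yet" are assumptions, not statements from the paper.

```python
def diagnosis_metrics(candidate_sets, true_fault, fault_index, dt=5.0):
    """Compute (detection delay in minutes, mean accuracy) for one run.

    candidate_sets: list with one set of fault candidates per sample;
                    an empty set means no diagnosis at that sample
    true_fault:     name of the fault actually simulated
    fault_index:    sample index at which the fault occurs
    dt:             sampling interval in minutes
    Returns (None, 0.0) if no diagnosis is ever made after the fault.
    """
    diagnosed = [
        k for k in range(fault_index, len(candidate_sets)) if candidate_sets[k]
    ]
    if not diagnosed:
        return None, 0.0
    # Delay: time from fault occurrence to the first non-empty diagnosis.
    delay = (diagnosed[0] - fault_index) * dt
    # Accuracy: fraction of diagnosed samples whose candidates include the true fault.
    correct = sum(1 for k in diagnosed if true_fault in candidate_sets[k])
    return delay, correct / len(diagnosed)
```

Under these assumptions, a run in which the true fault enters the candidate set only halfway through the diagnosed period would score an accuracy of 0.5.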
Acknowledgement
This work was supported by grant No. (R05-2002-000-00057-0) from the Basic Research Program of the Korea Science & Engineering Foundation.
References
[1] V. Venkatasubramanian, R. Rengaswamy, K. Yin and S.N. Kavuri, Comp. Chem. Engng., 27 (2003) 293.
[2] J.J. Downs and E.F. Vogel, Comp. Chem. Engng., 17 (1993) 245.
[3] J.J. Castro and F.J. Doyle III, Journal of Process Control, 14 (2004) 17.
[4] G. Lee, S.-O. Song and E.S. Yoon, Ind. Engng. Chem. Res., 42 (2003) 6145.
[5] http://www.chemengr.ucsb.edu/~ceweb/faculty/doyle/docs/benchmarks/mill