Situation prediction based on fuzzy clustering for industrial complex processes



Information Sciences xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Information Sciences journal homepage: www.elsevier.com/locate/ins

Claudia V. Isaza a, Henry O. Sarmiento b,c,⇑, Tatiana Kempowsky-Hamon d,e, Marie-Veronique LeLann d,e

a Grupo de Investigación SISTEMIC, Facultad de Ingeniería, Universidad de Antioquia UdeA, Calle 70 No. 52 - 21, Medellín, Colombia
b Grupo de Investigación en Control Automático y Robótica (ICARO), Politécnico Colombiano Jaime Isaza Cadavid, Cra 48 No. 7-151 (P19-145), Medellín, Colombia
c Grupo GEPAR, Facultad de Ingeniería, Universidad de Antioquia UdeA, Calle 70 No. 52 - 21, Medellín, Colombia
d CNRS, LAAS, 7 Avenue du Colonel Roche, F-31400 Toulouse, France
e Univ de Toulouse, INSA, LAAS, F-31400 Toulouse, France

Article info

Article history: Received 15 October 2012; received in revised form 26 March 2014; accepted 11 April 2014; available online xxxx

Keywords: Situation prediction; Fuzzy clustering; Markov's chain; Complex process

Abstract

Predicting process behavior is important for understanding the system status and for taking early control actions during operation. This paper presents a fuzzy clustering approach for predicting situations (functional states) in complex industrial processes. The proposed methodology combines a static measurement, such as the result of a fuzzy classifier trained with historical process data, with an estimation algorithm based on Markov's theory for discrete event systems. The situation prediction function is integrated into a process monitoring system without increasing the computational cost, which makes real-time implementation feasible. The monitoring strategy includes two principal stages: an offline stage for designing the fuzzy classifier and the predictor, and an online stage for identifying current process situations and for estimating predicted functional states. Thus, at each sample time, the results of the fuzzy classifier are used as inputs to the prediction procedure. An attractive feature of the proposed method for situation prediction is that it provides information about the evolution of the process. The proposed approach was tested on a monitoring system for a power transmission line and on the monitoring of a boiler subsystem of a steam generator. Experimental results indicate that the proposed technique is effective and can be used by operators as a decision-making tool in industrial processes.

© 2014 Elsevier Inc. All rights reserved.

⇑ Corresponding author at: Grupo de Investigación en Control Automático y Robótica (ICARO), Politécnico Colombiano Jaime Isaza Cadavid, Cra 48 No. 7-151 (P19-145), Medellín, Colombia. Tel.: +57 44343480.
E-mail addresses: [email protected] (C.V. Isaza), [email protected], [email protected] (H.O. Sarmiento), [email protected] (T. Kempowsky-Hamon), [email protected] (M.-V. LeLann).
http://dx.doi.org/10.1016/j.ins.2014.04.030
0020-0255/© 2014 Elsevier Inc. All rights reserved.

Please cite this article in press as: C.V. Isaza et al., Situation prediction based on fuzzy clustering for industrial complex processes, Inform. Sci. (2014), http://dx.doi.org/10.1016/j.ins.2014.04.030

1. Introduction

Monitoring, as a fundamental task in the supervision and fault detection of systems, provides process operators with the mechanisms needed for assessing the current situation, interacting with the process and recording process behavior [30]. Data-driven techniques such as learning classifiers or clustering algorithms are increasingly used in complex process monitoring [6,7,16]. Taking advantage of the large amount of historical information available, classifiers can be used to associate the values of process variables with a given class or situation (functional state) of the process (e.g. normal operation, alarm, fault, shut-down). The information about the class or situation is provided to operators to help them determine and analyze process behavior during the decision task. A situation or functional state of the process is defined as the characterization of the process condition, based on its past track record, on the functions to be performed and on the possible transitions towards new situations [22].

The use of hard (or "crisp") and fuzzy classifiers [24] in monitoring systems has been reported in the literature [6,8,16,20,30]. Fuzzy classifiers have the advantage of estimating the membership degree of each data vector (instantaneous values of process variables) to each class; the vector is then associated with the class with the maximum membership value. Membership degrees can be used to describe the process behavior. Fuzzy clustering algorithms can be used in the training stage to generate the classes that characterize the different process situations. Frequently used fuzzy clustering algorithms include Fuzzy C-means (FCM) [5], GK-means (GKM) [12], and the Learning Algorithm for Multivariate Data Analysis (LAMDA) [3,14,15,22], among others. All are based on an n-dimensional time-independent relationship analysis and have shown high performance in industrial monitoring systems for situation assessment [4,6,16,22,23,25,30]. During monitoring, classifiers perform a multivariable analysis of the data vector at time t, which enables the identification of the current situation (normal or fault) of the process. This time-independent (static) analysis is characteristic of data-based classifiers, which therefore do not take into account the dynamic nature of real processes.
Nevertheless, when the process is at time t, it would be useful to have some knowledge of the possible future situation (at t + 1). In [23], a method for fault diagnosis of dynamic systems was proposed. In this method no a priori information about all failure modes is available, and the proposal is based on an incremental clustering procedure that generates fuzzy rules describing operational states. This classifier uses a semi-supervised learning mechanism. For each new cluster developed by the clustering algorithm, a corresponding fuzzy rule is created with antecedent parameters extracted from the cluster. New clusters may indicate new operating conditions or faults, so each rule describes an operational mode. Each cluster can be identified without a significant time delay, but it is not possible to predict the next operating condition.

In order to consider the process evolution and to improve monitoring and supervision systems, prediction algorithms or situation prediction techniques, such as Hidden Markov Models and Recurrent Neural Networks [17], can be used. Thus, knowledge-based methods and machine learning methods have emerged to calculate a probabilistic prediction of process behavior [32]. Qualitative knowledge-based methods may lead to combinatorial explosion [36]. Machine learning methods, such as clustering algorithms or neural networks, make it possible to identify relevant information about the behavior of processes using historical data, and the models obtained are not highly complex. In order to describe the process behavior without rules, it is possible to formulate an automaton (a schema including classes and connections between classes). To represent the system behavior using the group information obtained with clustering techniques, the relationships (possible connections) between clusters may be established. In [6], an automaton is built by using fuzzy clustering algorithms.
However, this method does not permit the estimation of the next state (prediction of situations) or of the evolution of the connecting links among functional states (the functional states are represented by the clusters). This algorithm can be considered a static analysis and only includes historical data information. To the best of our knowledge, research on situation prediction based on fuzzy relations among clusters obtained using fuzzy classification algorithms has not been reported in the literature.

Du and Yeung [9] proposed a methodology to calculate Fuzzy Transition Probabilities (FTP), based on a supervision system designed with classifiers. Their proposal estimates the probability of the system remaining in its current situation (current functional state) and of evolving towards other situations. In order to include time information, the mathematical formulation of discrete-time Markov chains was integrated. In [10], Fuzzy Probabilities (FP) [35] are employed for problem solving, but the proposal is only applicable to processes with non-renewable states (i.e. the system cannot return to situations that have already appeared during the process operation, as in mechanical degradation processes).

In this paper, a new method for online situation (functional state) prediction in complex processes is proposed. The proposal combines the advantages of fuzzy classifiers with a prediction task and can be applied to renewable processes. The essential part of the methodology is the association of fuzzy information (membership degrees obtained from clustering) with the mathematical formulation of Markov chains, which makes the prediction task possible. The first step is to design the classifier; however, the quality of the data space partition obtained with the fuzzy classifier is beyond the scope of this paper. Thus, it is assumed that a suitable classification (the best) can be obtained by any fuzzy classification algorithm.
In the second step, a matrix is generated using the membership degree values provided by the classifier, enabling the predictions to be calculated. The final step is to obtain the online prediction at each time sample.

Section 2 presents some fuzzy clustering generalities and the characteristics that make this technique useful in a process monitoring framework. Section 3 presents the theoretical basis and the development of the proposed methodology. Section 4 describes how situation prediction is integrated within a monitoring structure based on fuzzy clustering, and details how the offline (training) and online (monitoring) stages are carried out by means of a simple example. Case studies of two real industrial applications are then presented in Section 5. In the first case, a thermal monitoring system for a power transmission line, the proposed methodology is developed in detail, while in the second case prediction results are reported for a boiler subsystem of a steam generator. Finally, the last section presents conclusions and future work.


2. Process monitoring based on fuzzy clustering

In process monitoring, the principal aim of classification is to automatically classify samples taken from the process. Each sample is classified according to its similarity to a reference class or prototype [25]. That is, a classifier must be developed to determine with which class a sample has the greatest similarity. The classifier is generally designed using a training set composed of samples for which a class label may or may not be known. Each sample is represented as a data vector $x \in \mathbb{R}^a$, $x^T = [x'_1, x'_2, \ldots, x'_a]$, where the dimension a is the number of available measurements (called descriptors) describing each sample. The samples provided by a SCADA (Supervisory Control and Data Acquisition) system can be associated with the situations or functional states of the process. Jain et al. [18] give a brief presentation of the main approaches that can be used to generate a data-based classifier. Among them, K-means [19], Fuzzy C-means (FCM) [5] and GK-means (the Gustafson and Kessel algorithm) [12] have been widely used in process industries (continuous and batch), mainly for monitoring and fault detection [30]. Fuzzy classifiers have been successfully integrated into process monitoring and supervision schemes [2,4,14,16,18,21,22].

2.1. Fuzzy clustering

The group of n data vectors $X = [x_1, x_2, \ldots, x_n]$, each with a descriptors, where n corresponds to the number of samples (time samples), is divided into m clusters or classes using a clustering method. Fuzzy clustering enables calculation of the membership degree matrix $U = [\mu_{jf}]_{m \times n}$, where $\mu_{jf}$ represents the membership degree of sample f to the jth cluster. The cluster in which a sample f is located can be determined from the membership degrees of that sample. In general, the highest membership degree determines which cluster is assigned to the sample (see Eq. (1)).

$$C_f = \arg\max_{j} \{\mu_{jf}\}, \quad j = 1, 2, \ldots, m, \ \text{for each } f \qquad (1)$$
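As an illustration of Eq. (1), the following minimal Python sketch assigns each sample to the cluster with the highest membership degree. The function name `assign_classes` and the example matrix are hypothetical and not taken from the paper; ties are resolved here in favor of the lowest cluster index, which is an assumption.

```python
def assign_classes(U):
    """For each sample f, return the (zero-based) index j maximizing mu_jf.

    U is the membership degree matrix as a list of m rows (clusters),
    each holding n membership degrees (one per sample).
    """
    m = len(U)       # number of clusters
    n = len(U[0])    # number of samples
    # max() returns the first maximal index, so ties go to the lowest j.
    return [max(range(m), key=lambda j: U[j][f]) for f in range(n)]


# Hypothetical membership matrix: 3 clusters (rows) x 4 samples (columns).
U = [
    [0.7, 0.2, 0.1, 0.3],
    [0.2, 0.5, 0.1, 0.3],
    [0.1, 0.3, 0.8, 0.4],
]
classes = assign_classes(U)
```

The fuzzy information is not discarded by this step: the full columns of U remain available, and they are what the prediction stage described later operates on.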

Different algorithms have been proposed to perform fuzzy clustering; those based on a distance metric are the best known. Among these methods are the Fuzzy C-means (FCM) [5] and Gustafson-Kessel Means (GKM) [12] algorithms. These algorithms use an optimization criterion $J_b$ (see Eq. (2)) that makes data clustering possible according to the similarity among individuals. The fuzziness exponent b > 1 regulates the 'fuzziness' of the partition.

$$J_b(U, v) = \sum_{f=1}^{n} \sum_{j=1}^{m} (\mu_{jf})^b \, (d_{jf})^2 \qquad (2)$$

In these algorithms, the similarity is evaluated using the distance function $d_{jf}$ (Eq. (3)), which is measured between the individuals and the cluster prototypes or class centers $v = \{v_1, v_2, \ldots, v_m\}$.

$$d_{jf} = (x_f - v_j)^T H_j \, (x_f - v_j) \qquad (3)$$

The FCM distance measurement is Euclidean ($H_j = I$, the identity matrix). This distance produces spherical clusters in the a-dimensional space (hyper-spheres). For GKM, $H_j$ is defined according to Eq. (4), where $d_j$ is the volumetric index of cluster j and $F_j$ is the fuzzy covariance matrix of cluster j. The distance, in this case, generates ellipsoidal clusters (hyper-ellipsoids), which may adapt better to the data than spherical clusters. For both algorithms the number of clusters must be given a priori.

$$H_j = [d_j \det(F_j)]^{1/n} \, (F_j)^{-1} \qquad (4)$$
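To make the criterion of Eq. (2) concrete, here is a small sketch that evaluates it for the FCM (Euclidean) case. It assumes, as in the standard FCM formulation, that $(d_{jf})^2$ is the squared Euclidean norm of $x_f - v_j$; the function names and toy data are hypothetical.

```python
def squared_euclidean(x, v):
    """Squared Euclidean distance between a sample x and a prototype v."""
    return sum((xi - vi) ** 2 for xi, vi in zip(x, v))


def fcm_objective(U, X, V, b=2.0):
    """Evaluate J_b(U, v) = sum_f sum_j (mu_jf)^b * (d_jf)^2.

    U[j][f]: membership of sample f to cluster j
    X[f]:    data vector of sample f
    V[j]:    prototype (center) of cluster j
    b:       fuzziness exponent, b > 1
    """
    total = 0.0
    for f, x in enumerate(X):
        for j, v in enumerate(V):
            total += (U[j][f] ** b) * squared_euclidean(x, v)
    return total


# Toy example (hypothetical): two 1-D samples sitting on two prototypes.
X = [[0.0], [1.0]]
V = [[0.0], [1.0]]
crisp = [[1.0, 0.0], [0.0, 1.0]]    # each sample fully in its own cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5]]    # memberships split evenly
j_crisp = fcm_objective(crisp, X, V)
j_fuzzy = fcm_objective(fuzzy, X, V)
```

In this toy case the crisp partition gives a criterion of zero, while the evenly split partition is penalized for assigning membership to the distant prototype.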

The FCM and GKM clustering algorithms are iterative procedures in which n individuals are grouped into m classes. The number of classes m (1 < m < n) is selected by the user. Class prototypes are randomly initialized and are modified during the iteration process. Consequently, the fuzzy partition of the data space is also modified until matrix U stabilizes (i.e. $\|U^{t} - U^{t-1}\| < \varepsilon$, where $\varepsilon$ is a termination tolerance).

An algorithm widely used for monitoring tasks is LAMDA (Learning Algorithm for Multivariate Data Analysis) [3]. It is a fuzzy methodology for conceptual clustering and classification that combines concepts from fuzzy clustering and neural networks. It is based on finding the global membership degree of a sample to an existing class, considering the contributions of all descriptors. Each numeric component of the data vector x is the normalized value of a descriptor. The contribution of each descriptor is called the marginal adequacy degree (MAD). When the descriptor is numerical, the MAD is calculated by selecting one of several possible functions [1]; among these, the "fuzzy" extension of the binomial function (5) and the Gaussian function (6) are the most commonly used.

$$MAD[x'_g \mid \rho_{jg}] = \rho_{jg}^{\tilde{x}'_g} \, (1 - \rho_{jg})^{(1 - \tilde{x}'_g)} \qquad (5)$$

where $\tilde{x}' = (x' - \min x') / (\max x' - \min x')$, and $\rho_{jg}$ corresponds to the mean value of descriptor g characterizing class j.

$$MAD[x'_g \mid m_{jg}, \sigma_{jg}] = e^{-\frac{1}{2} \left( \frac{x'_g - m_{jg}}{\sigma_{jg}} \right)^2} \qquad (6)$$

where m is the mean, and parameter $\sigma$ measures the proximity to the prototype.
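The two marginal adequacy functions can be sketched in a few lines of Python. This is only an illustration of Eqs. (5) and (6); the function names are hypothetical, and the min-max normalization is assumed to use the extreme values of the descriptor over the training data.

```python
import math


def normalize(x, x_min, x_max):
    """Min-max normalization of a descriptor value to [0, 1]."""
    return (x - x_min) / (x_max - x_min)


def mad_binomial(x_norm, rho):
    """Fuzzy binomial MAD, Eq. (5): rho^x * (1 - rho)^(1 - x).

    x_norm: normalized descriptor value in [0, 1]
    rho:    mean normalized value of the descriptor for the class
    """
    return (rho ** x_norm) * ((1.0 - rho) ** (1.0 - x_norm))


def mad_gaussian(x, mean, sigma):
    """Gaussian MAD, Eq. (6): exp(-0.5 * ((x - mean) / sigma)^2)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Hypothetical descriptor ranging from 10 to 20, observed at 15.
x_norm = normalize(15.0, 10.0, 20.0)
mad_b = mad_binomial(x_norm, rho=0.5)
mad_g = mad_gaussian(2.0, mean=2.0, sigma=0.5)
```

Both functions return values in [0, 1]; the Gaussian form reaches 1 exactly at the class mean.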


Fig. 1. Monitoring scheme based on classifiers [15].

Marginal adequacies are combined using fuzzy logic connectives [34] as aggregation operators in order to obtain the global adequacy degree (GAD) of an individual to a class [3]. Fuzzy logic connectives are fuzzy versions of the binary logic operators, specifically intersection (t-norm) and union (t-conorm). The aggregation function [26] is a linear interpolation between the t-norm ($\gamma$) and the t-conorm ($\beta$), as shown in Eq. (7), where the parameter $\alpha$, $0 \le \alpha \le 1$, is called exigency.

$$GAD(x' \mid C) = \alpha \cdot \gamma\big(MAD(x'_1 \mid C), \ldots, MAD(x'_a \mid C)\big) + (1 - \alpha) \cdot \beta\big(MAD(x'_1 \mid C), \ldots, MAD(x'_a \mid C)\big) \qquad (7)$$
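The aggregation of Eq. (7) can be sketched as follows. The code assumes the two connective pairs named in the text (product with probabilistic sum, and min with max); the function names and the example values are hypothetical.

```python
import math
from functools import reduce


def t_norm_product(values):
    """Product t-norm applied to a list of marginal adequacies."""
    return math.prod(values)


def t_conorm_prob_sum(values):
    """Probabilistic-sum t-conorm: a + b - a*b, folded over the list."""
    return reduce(lambda a, b: a + b - a * b, values)


def gad(mads, alpha, t_norm=t_norm_product, t_conorm=t_conorm_prob_sum):
    """Global adequacy degree: alpha * t_norm + (1 - alpha) * t_conorm.

    mads:  marginal adequacy degrees of one sample to one class
    alpha: exigency parameter, 0 <= alpha <= 1
    """
    return alpha * t_norm(mads) + (1.0 - alpha) * t_conorm(mads)


# Hypothetical marginal adequacies of a two-descriptor sample.
gad_mixed = gad([0.5, 0.5], alpha=0.5)                       # product / prob. sum
gad_strict = gad([0.2, 0.9], alpha=1.0, t_norm=min, t_conorm=max)
gad_lenient = gad([0.2, 0.9], alpha=0.0, t_norm=min, t_conorm=max)
```

With exigency 1 the result is driven entirely by the t-norm (the least adequate descriptor under min), and with exigency 0 entirely by the t-conorm.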

The most commonly used fuzzy logic operators are $\{\gamma(a, b) = a \cdot b;\ \beta(a, b) = a + b - a \cdot b\}$ and $\{\gamma(a, b) = \min(a, b);\ \beta(a, b) = \max(a, b)\}$. Finally, the GAD value can be interpreted as the membership of a sample to each class. An element is assigned to the class that exhibits the maximum GAD. To avoid assigning a poorly represented element (i.e. one with a small membership) to a class, a minimum global adequacy threshold is employed, called the non-informative class (NIC) [27]. Passive recognition, self-learning and supervised learning are all possible in LAMDA. LAMDA can process quantitative, qualitative and interval information [13]; in self-learning mode, the number of clusters does not have to be determined in advance; it is not an iterative algorithm and has a low computational cost.

A characteristic of fuzzy clustering techniques is that they are independent of time. The clustering analysis is done using the similarity in the data space (vector x), where time is not taken into account. The resulting classifier is employed in the online phase to estimate the membership degrees of the data sample at time t to the classes created.

2.2. Monitoring based on fuzzy clustering

Monitoring based on fuzzy clustering consists of two main stages: an offline learning stage and an online recognition stage (see Fig. 1). In the learning stage, historical data of the process are used to train the fuzzy classifier (i.e. to model situations using the clusters). The classifier is then used online (recognition step) to process every new sample taken from the process. The result is to provide the operator, in real time, with the membership degrees of the sample to each class or, by means of a final decision loop, with the class to which the sample is assigned. Since this type of approach is time-independent, the main objective of this article is to include a situation prediction algorithm in order to improve process monitoring.
Such algorithms provide the operator with a possible panorama of the future situations (at time t + 1) likely to occur according to the current evolution of the process.

3. Situation prediction based on membership degrees

In the situation prediction strategy, two fundamental elements are highlighted: the required information (membership degrees) and the construction of the prediction algorithm. Membership degrees are obtained when applying a fuzzy clustering algorithm to the historical dataset of the process; these values are the basic information for the proposal presented in this paper. This information is normally contained in the membership degree matrix (U). The prediction algorithm proposed in this paper is based on the work on Fuzzy Transition Probabilities by Du and Yeung [9]. Relevant details of this work are described below.

3.1. Fuzzy transition probabilities – FTP

Du and Yeung [9] based their proposal on the calculation of probabilities associated with discrete events established by Markov [28]. Markov demonstrated that the probability $P_j$ of the discrete event j at time t + 1 can be calculated if the probability $P_k$ of the discrete event k at time t is known. $P_j$ for the next sample time ($S_{t+1}$) can be estimated using Eq. (8), where $P_{kj}$ is the transition probability between two discrete events k and j (from k to j), m is the number of discrete events (classes in a fuzzy classifier), and S corresponds to the sample.

$$P_j(S_{t+1}) = \sum_{k=1}^{m} P_{kj} \cdot P_k(S_t), \quad j = 1, 2, \ldots, m \qquad (8)$$

In addition, the condition in Eq. (9) must be satisfied.

$$\sum_{j=1}^{m} P_{kj} = 1, \quad k = 1, 2, \ldots, m \qquad (9)$$
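A minimal sketch of the one-step Markov prediction of Eq. (8), with the row-sum condition of Eq. (9) checked explicitly, could look as follows. The function name and the two-state transition matrix are hypothetical.

```python
def markov_step(P, p_t, tol=1e-9):
    """One-step prediction P_j(S_{t+1}) = sum_k P[k][j] * p_t[k].

    P[k][j]: transition probability from event k to event j
    p_t[k]:  probability of event k at the current sample
    """
    m = len(P)
    # Condition of Eq. (9): every row of P must sum to one.
    for k, row in enumerate(P):
        if abs(sum(row) - 1.0) > tol:
            raise ValueError(f"row {k} of P does not sum to 1")
    return [sum(P[k][j] * p_t[k] for k in range(m)) for j in range(m)]


# Hypothetical two-state chain (e.g. normal / alarm).
P = [
    [0.9, 0.1],   # from state 0: mostly stays in state 0
    [0.2, 0.8],   # from state 1: may return to state 0
]
p_next = markov_step(P, [0.5, 0.5])
```

Because each row of P sums to one, the predicted vector is again a probability distribution whenever the input is one.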

Using Eq. (8), Du and Yeung [9] proposed to use fuzzy probabilities and to estimate the fuzzy probability (FPj) for the discrete state j at time t + 1. This probability (see Eq. (10)) is estimated from the Fuzzy Transition Probability between states j and k (FPkj) and the fuzzy probability FPk(St) for the discrete state k in the current time sample t.

$$FP_j(S_{t+1}) = \sum_{k=1}^{j} FP_{kj} \cdot FP_k(S_t), \quad j = 1, 2, \ldots, m \qquad (10)$$

where the upper limit of the sum is k = j.

The following condition must be fulfilled:

$$\sum_{j=1}^{m} FP_{kj} = 1, \quad k = 1, 2, \ldots, m \qquad (11)$$

The fuzzy probabilities are obtained by means of a classifier (based on intervals) developed by those authors. Determining the classifier and calculating the $FP_{kj}$ constitute the offline training stage of the monitoring process. Calculating the online prediction of the fuzzy probability $FP_j(S_{t+1})$ requires both the $FP_{kj}$ found in the training stage and the current fuzzy probability $FP_k(S_t)$ calculated by the classifier. Because the sum in Eq. (10) stops at k = j, Du and Yeung's proposal is restricted to non-renewable processes. Since a large number of industrial processes can return to previous situations (e.g. from an alarm to normal operation), the FTP methodology cannot be applied in every case; in addition, it depends on the classification method proposed by Du and Yeung [9]. For this reason, this paper proposes a generalization of the FTP method in which return transitions are included.

3.2. Proposed method for situation prediction

To predict the next situation of the monitored system, a general methodology based on fuzzy clustering methods is proposed. This approach is based on the FTP theory presented in Section 3.1. The new algorithm predicts the membership degrees of the sample at time t + 1, i.e. $\mu_k(S_{t+1})$ (k = 1, 2, ..., m), based only on the current membership degrees at time t, i.e. $\mu_k(S_t)$ (k = 1, 2, ..., m). These membership degrees are obtained using any fuzzy clustering (or fuzzy classification) algorithm. The estimated situation at t + 1 is the one associated with the maximum value of the estimated membership degrees. The new methodology has two stages: an offline training stage to obtain what we have called Weights of Fuzzy Transition (WFT), and an online stage that predicts the future state. In order to include return transitions, the summation in Eq. (10) is modified: the upper limit j is replaced by m (k = m) to obtain Eq. (12).

$$FP_j(S_{t+1}) = \sum_{k=1}^{m} FP_{kj} \cdot FP_k(S_t), \quad j = 1, 2, \ldots, m \qquad (12)$$

Since the task is to predict the membership degrees that the process will have at time t + 1 for each situation j (j = 1, 2, ..., m), the probabilities in Eq. (12) are replaced by membership degrees. The replacement is feasible because of the bijective relationship between possibility ($\pi$) and probability (p) proposed by Dubois and Prade [11], based on the Consistency Principle established by Zadeh [34]. Dubois and Prade [11] deduced a mathematical formulation to convert probability measures into possibility measures (principle of maximum specificity) and conversely (principle of insufficient reason); see Eqs. (13) and (14). As a consequence, the behavior of the discrete events expressed in terms of probabilities and the behavior of the discrete events expressed in terms of possibilities preserve the same pattern and relationship among values. The probabilities can be obtained from the possibilities, or vice versa.

$$\forall i,\ i = 1, \ldots, m: \quad p_i = \sum_{j=i}^{m} \frac{1}{j} \, (\pi_j - \pi_{j+1}) \qquad (13)$$

$$\forall i,\ i = 1, \ldots, m: \quad \pi_i = \sum_{j=1}^{m} \min(p_i, p_j) \qquad (14)$$
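The two transformations can be checked numerically with a short sketch. It rests on assumptions that the extracted text does not state explicitly: $\pi$ denotes possibility and p probability, the values are indexed in non-increasing order, and $\pi_{m+1} = 0$. The function names are hypothetical.

```python
def probability_to_possibility(p):
    """pi_i = sum_j min(p_i, p_j)."""
    return [sum(min(p_i, p_j) for p_j in p) for p_i in p]


def possibility_to_probability(pi):
    """p_i = sum_{j=i}^{m} (pi_j - pi_{j+1}) / j, with pi_{m+1} = 0.

    Assumes pi is sorted in non-increasing order (1-based index j
    in the formula corresponds to j + 1 in the zero-based code).
    """
    m = len(pi)
    ext = list(pi) + [0.0]   # append pi_{m+1} = 0
    return [
        sum((ext[j] - ext[j + 1]) / (j + 1) for j in range(i, m))
        for i in range(m)
    ]


# Hypothetical probability distribution, already sorted.
p = [0.5, 0.3, 0.2]
pi = probability_to_possibility(p)        # approx. [1.0, 0.8, 0.6]
p_back = possibility_to_probability(pi)   # recovers p up to rounding
```

Under these assumptions the round trip returns the original distribution, and the ordering of the values is preserved, which is the property the argument above relies on.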

Then, from Eq. (12), considering the possibility of returning to previous situations, the fuzzy probabilities can be replaced with the membership degrees obtained by any fuzzy clustering method. Thus, the prediction (at time t + 1) of membership degrees is defined as:

$$\mu_j(S_{t+1}) = \sum_{k=1}^{m} WFT_{kj} \cdot \mu_k(S_t), \quad j = 1, 2, \ldots, m \qquad (15)$$

The matrix equation (Eq. (16)), which is equivalent to the system of equations generated by Eq. (15), allows the prediction of the process membership degrees for each situation, $[\mu(S_{2 \ldots n})]$, at time t + 1, as well as the calculation of the WFT vector.

$$[\mu(S_{2 \ldots n})] = [\mu(S_{1 \ldots (n-1)})] \cdot [WFT_{kj}] \qquad (16)$$

Eq. (15) is expanded for every situation j in Eq. (17). The membership degree to situation j at time t + 1, $\mu_j(S_{t+1})$, is calculated as:

$$\mu_j(S_{t+1}) = WFT_{1j} \cdot \mu_1(S_t) + WFT_{2j} \cdot \mu_2(S_t) + \ldots + WFT_{mj} \cdot \mu_m(S_t) \qquad (17)$$
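The online use of Eqs. (15) and (17) can be sketched as below: the current membership degrees are propagated through a WFT matrix, and the predicted situation is the one with the highest predicted membership. The WFT values and the function names are hypothetical; in the methodology the weights come from the offline training stage.

```python
def predict_memberships(WFT, mu_t):
    """mu_j(S_{t+1}) = sum_k WFT[k][j] * mu_t[k] for every situation j."""
    m = len(mu_t)
    return [sum(WFT[k][j] * mu_t[k] for k in range(m)) for j in range(m)]


def predicted_situation(mu_next):
    """Zero-based index of the situation with the highest predicted membership."""
    return max(range(len(mu_next)), key=lambda j: mu_next[j])


# Hypothetical weights for three situations. WFT[2][0] > 0 encodes a
# return transition from situation 2 back to situation 0.
WFT = [
    [0.7, 0.3, 0.0],
    [0.1, 0.6, 0.3],
    [0.2, 0.0, 0.8],
]
mu_t = [0.2, 0.7, 0.1]                 # current memberships from the classifier
mu_next = predict_memberships(WFT, mu_t)
situation = predicted_situation(mu_next)
```

Unlike the transition probabilities of Eq. (9), nothing in this sketch requires the rows of WFT to sum to one.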

Eq. (17) takes into account the fact that a transition is possible from any existing situation to situation j. This is expressed by letting k cover all the existing situations (i.e. k = 1, ..., m), meaning that the process may return to a previous situation. With the historical information of the process, Eq. (16) is solved for WFT. A solution for WFT is obtained by minimizing E in Eq. (18) using the Least Squares Method (LSM). Since a linear system with more equations than unknowns may admit multiple solutions, we constrained the solution to values equal to or greater than zero. In Eq. (18), x corresponds to vector WFT, matrix A corresponds to the membership degrees organized according to Eq. (20), and vector b to the membership degrees organized according to Eq. (19).

$$E = \|A \cdot x - b\|, \quad x \ge 0 \qquad (18)$$

Vector WFT represents the transitions that are likely to happen at time t from situation k to the different situations j of the process (j = 1, 2, ..., k, ..., m), where

$$[\mu(S_{2 \ldots n})] = \begin{bmatrix} \mu_1(S_2) \\ \vdots \\ \mu_1(S_n) \\ \mu_2(S_2) \\ \vdots \\ \mu_2(S_n) \\ \vdots \\ \mu_m(S_2) \\ \vdots \\ \mu_m(S_n) \end{bmatrix}, \qquad [WFT_{kj}] = \begin{bmatrix} WFT_{11} \\ \vdots \\ WFT_{1m} \\ WFT_{21} \\ \vdots \\ WFT_{2m} \\ \vdots \\ WFT_{m1} \\ \vdots \\ WFT_{mm} \end{bmatrix} \qquad (19)$$

and

$$[\mu(S_{1 \ldots (n-1)})] = \left[\begin{array}{cccc|cccc|c|cccc}
\tilde{A}_1 & \tilde{0} & \cdots & \tilde{0} & \tilde{A}_2 & \tilde{0} & \cdots & \tilde{0} & & \tilde{A}_m & \tilde{0} & \cdots & \tilde{0} \\
\tilde{0} & \tilde{A}_1 & \cdots & \tilde{0} & \tilde{0} & \tilde{A}_2 & \cdots & \tilde{0} & \cdots & \tilde{0} & \tilde{A}_m & \cdots & \tilde{0} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
\tilde{0} & \tilde{0} & \cdots & \tilde{A}_1 & \tilde{0} & \tilde{0} & \cdots & \tilde{A}_2 & & \tilde{0} & \tilde{0} & \cdots & \tilde{A}_m
\end{array}\right] \qquad (20)$$


where n is the number of samples (training data vectors) and $\tilde{A}_i$ corresponds to the vector of membership degrees to situation i at the previous samples (i.e. from $\mu_i(S_1)$ to $\mu_i(S_{n-1})$):

$$\tilde{A}_i = \begin{bmatrix} \mu_i(S_1) \\ \mu_i(S_2) \\ \vdots \\ \mu_i(S_{n-1}) \end{bmatrix}, \ i = 1, 2, \ldots, m, \quad \text{and} \quad \tilde{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (21)$$
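The training step of Eqs. (16) to (21) can be sketched in plain Python. Two points are assumptions rather than statements from the paper: first, the block structure of the system means that each target situation j only involves the weights $WFT_{kj}$, so the problem is solved here as m independent small problems; second, projected gradient descent is used as a simple stand-in for the non-negative least-squares solver, which the text does not specify. All names and data are hypothetical.

```python
def fit_wft(history, iterations=5000):
    """Estimate WFT[k][j] >= 0 from a time-ordered membership history.

    history[t][k] is the membership degree to situation k at sample t.
    For each j, minimizes sum_t (mu_j(S_{t+1}) - sum_k WFT[k][j] * mu_k(S_t))^2
    subject to WFT[k][j] >= 0.
    """
    m = len(history[0])
    past = history[:-1]     # mu(S_1 ... S_{n-1})
    future = history[1:]    # mu(S_2 ... S_n)
    # Gram matrix G = past^T * past, shared by all m sub-problems.
    G = [[sum(r[a] * r[c] for r in past) for c in range(m)] for a in range(m)]
    # trace(G) bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(G[a][a] for a in range(m))
    WFT = [[0.0] * m for _ in range(m)]
    for j in range(m):
        # h = past^T * (column j of future)
        h = [sum(r[a] * f[j] for r, f in zip(past, future)) for a in range(m)]
        w = [0.0] * m
        for _ in range(iterations):
            grad = [sum(G[a][c] * w[c] for c in range(m)) - h[a] for a in range(m)]
            # Gradient step followed by projection onto w >= 0.
            w = [max(0.0, w[a] - step * grad[a]) for a in range(m)]
        for k in range(m):
            WFT[k][j] = w[k]
    return WFT


# Synthetic check: build a history from known (hypothetical) weights.
W_true = [[0.9, 0.1], [0.2, 0.8]]
history = [[1.0, 0.0]]
for _ in range(5):
    prev = history[-1]
    history.append([sum(W_true[k][j] * prev[k] for k in range(2)) for j in range(2)])
W_est = fit_wft(history)
```

On this noise-free synthetic history the estimated weights match the generating ones closely; with real membership data the fit would only be approximate, and a dedicated non-negative least-squares routine would normally be preferable to this hand-written loop.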

Since Eqs. (16), (19) and (20) have membership degrees as inputs (matrix U), the result of the training stage is a matrix of weights of fuzzy transition (WFT). Obtaining an appropriate WFT matrix requires a suitable membership degree matrix (U) as input, regardless of which classification algorithm is used. In order to consider the evolution of the process, besides the values of the membership degrees obtained from historical data, the change in membership degrees, $\Delta\mu$, is included. This $\Delta\mu$ incorporates information about the variations (trend, Stewart [29]) in membership degrees according to the process evolution. In general, for any state (situation) j, the change in membership degrees between two time samples is defined by:

$$\Delta\mu_j = \mu_j(S_t) - \mu_j(S_{t-1}) \qquad (22)$$

These changes in membership degrees are introduced into Eqs. (16), (19) and (20) in order to predict changes in membership degrees, producing Eqs. (23)–(26). In the training stage, Eqs. (23)–(26) are evaluated to find the Delta Weight of the Fuzzy Transition ($\Delta WFT$) vector.

$$[\Delta\mu(S_{2 \ldots n-1})] = [\Delta\mu(S_{1 \ldots (n-2)})] \cdot [\Delta WFT_{kj}] \qquad (23)$$

where

$$[\Delta\mu(S_{2 \ldots n-1})] = \begin{bmatrix} \Delta\mu_1(S_2) \\ \vdots \\ \Delta\mu_1(S_{n-1}) \\ \Delta\mu_2(S_2) \\ \vdots \\ \Delta\mu_2(S_{n-1}) \\ \vdots \\ \Delta\mu_m(S_2) \\ \vdots \\ \Delta\mu_m(S_{n-1}) \end{bmatrix}, \qquad [\Delta WFT_{kj}] = \begin{bmatrix} \Delta WFT_{11} \\ \vdots \\ \Delta WFT_{1m} \\ \Delta WFT_{21} \\ \vdots \\ \Delta WFT_{2m} \\ \vdots \\ \Delta WFT_{m1} \\ \vdots \\ \Delta WFT_{mm} \end{bmatrix} \qquad (24)$$

and

$$[\Delta\mu(S_{1 \ldots (n-2)})] = \left[\begin{array}{cccc|cccc|c|cccc}
\tilde{B}_1 & \tilde{0} & \cdots & \tilde{0} & \tilde{B}_2 & \tilde{0} & \cdots & \tilde{0} & & \tilde{B}_m & \tilde{0} & \cdots & \tilde{0} \\
\tilde{0} & \tilde{B}_1 & \cdots & \tilde{0} & \tilde{0} & \tilde{B}_2 & \cdots & \tilde{0} & \cdots & \tilde{0} & \tilde{B}_m & \cdots & \tilde{0} \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
\tilde{0} & \tilde{0} & \cdots & \tilde{B}_1 & \tilde{0} & \tilde{0} & \cdots & \tilde{B}_2 & & \tilde{0} & \tilde{0} & \cdots & \tilde{B}_m
\end{array}\right] \qquad (25)$$

where $\tilde{B}_i$ corresponds to the vector of the changes in membership degrees at the previous time samples, and

$$\tilde{B}_i = \begin{bmatrix} \Delta\mu_i(S_1) \\ \Delta\mu_i(S_2) \\ \vdots \\ \Delta\mu_i(S_{n-2}) \end{bmatrix}, \ i = 1, 2, \ldots, m, \quad \text{and} \quad \tilde{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (26)$$

Vectors WFT and $\Delta WFT$ are used to predict the membership degrees of a sample at $S_{t+1}$ during the monitoring stage. To estimate the membership degrees $\mu_j(S_{t+1})$ to all m situations at time t + 1 via Eq. (15), the WFT matrix and the current membership degrees $\mu_j(S_t)$ provided by the classifier are used. The change in membership degrees $\Delta\mu_j(S_t)$ is calculated using Eq. (22) from the membership degrees provided by the classifier. Next, since $\Delta WFT$ is known, Eq. (23) can be evaluated with the value of $\Delta\mu_j(S_t)$, and the result is the prediction $\Delta\mu_j(S_{t+1})$. Then, the Initial Prediction of Membership Degrees ($\mu'_j(S_{t+1})$, Eq. (27)) combines the resulting values of $\mu_j(S_{t+1})$ and $\Delta\mu_j(S_{t+1})$.


$$\mu'_j(S_{t+1}) = \mu_j(S_{t+1}) + \Delta\mu_j(S_{t+1}) \qquad (27)$$

To predict a change of state (e.g. from a normal to an abnormal situation), it is necessary to identify the moment when the transition starts. Considering that during a transition between two situations the membership degrees for each sample lj(St) to each class (j = 1, . . ., m) have similar values, there is no real certainty as to which the current situation of the process is. For this reason, to predict the moment when a transition occurs, an information measure was included. The information index proposed by Isaza [15] is a measurement of the difference among the membership degree with the highest value (lM) and all the others lj. The largest difference is observed when there is the greatest degree of certainty regarding the current state of the process. The information index ID(l) is evaluated according to Eq. (28) yielding values between ’0’ and ’1’; ’0’ corresponds to the situation in which all membership degrees are equal (minimum certainty), and ’1’ when only one membership degree is the maximum and the others are zero (maximum certainty).

$$I_D(\mu) = \frac{\sum_i k_i \, e^{k_i}}{C \, \mu_M \, e^{\mu_M}}, \quad i = 1, 2, \ldots, m \tag{28}$$

Vector $k_i(\mu)$ contains the differences between the maximum membership degree $\mu_M$ and the other membership degrees of the same sample (see Eq. (29)), where $\mu_M = \max[\mu_i]$ and $C = m - 1$.

$$k_i(\mu) = \{\mu_M - \mu_i\}_{i \neq M}, \quad i = 1, 2, \ldots, m \tag{29}$$
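As an illustration of Eqs. (28) and (29), the following minimal sketch evaluates the information index for one sample. The function name `information_index` and the handling of an all-zero membership vector are assumptions made here for illustration; they are not part of the original method description.

```python
import math


def information_index(memberships):
    """Information index I_D of Eqs. (28)-(29) for one sample.

    `memberships` is the list of membership degrees of the sample to the
    m classes. The name and the edge-case handling are illustrative.
    """
    m = len(memberships)
    if m < 2:
        raise ValueError("at least two classes are required")
    mu_max = max(memberships)
    if mu_max <= 0.0:
        # Assumption: a sample with no membership at all carries no information.
        return 0.0
    idx_max = memberships.index(mu_max)
    # Eq. (29): differences between the maximum and every other membership.
    k = [mu_max - mu for i, mu in enumerate(memberships) if i != idx_max]
    c = m - 1
    # Eq. (28): normalised sum of k_i * exp(k_i).
    return sum(ki * math.exp(ki) for ki in k) / (c * mu_max * math.exp(mu_max))
```

With equal membership degrees every $k_i$ is zero and the index is 0; when a single membership degree is nonzero, every $k_i$ equals $\mu_M$ and the index is 1, which matches the two limiting cases described in the text.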

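During transitions, it is necessary to amplify the drift of the predicted membership degrees, $K\mu(S_{t+1})$ in Eq. (30). This difference is analyzed to assess the change in the process trend (change between situations). The final membership prediction, $\mu''_j(S_{t+1})$ in Eq. (31), then adds this difference, scaled by the inverse of the $I_D$ value calculated at time $t$, to the current membership degree. Because $I_D$ lies between 0 and 1, the drift is amplified most when the certainty about the current situation is low.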

$$K\mu(S_{t+1}) = \mu'_j(S_{t+1}) - \mu_j(S_t) \tag{30}$$

$$\mu''_j(S_{t+1}) = \mu_j(S_t) + \frac{1}{I_D(\mu)} \, K\mu(S_{t+1}) \tag{31}$$
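The two equations above can be sketched as a short function. This is a minimal illustration under stated assumptions: the name `final_prediction` is hypothetical, and raising an error when $I_D = 0$ (all membership degrees equal) is a choice made here, since the text does not say how that case is treated.

```python
def final_prediction(mu_current, mu_initial, info_index):
    """Final membership prediction following Eqs. (30)-(31).

    mu_current : membership degrees mu_j(S_t) from the classifier
    mu_initial : initial prediction mu'_j(S_{t+1}) from Eq. (27)
    info_index : information index I_D evaluated at time t
    """
    if info_index <= 0.0:
        # Assumption: the zero-certainty case is not covered by the text.
        raise ValueError("information index must be positive")
    # Eq. (30): drift between the initial prediction and the current degrees.
    drift = [p - c for p, c in zip(mu_initial, mu_current)]
    # Eq. (31): current degrees plus the drift amplified by 1 / I_D.
    return [c + d / info_index for c, d in zip(mu_current, drift)]
```

For the values used later in the worked example (current degrees 0.165, 0.496, 0.710; initial prediction 0.1426, 0.4861, 0.6736; $I_D = 0.4172$), this returns approximately 0.1113, 0.4723 and 0.6227.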

The predicted membership degrees $\mu''_j(S_{t+1})$ determine the estimated situation: the highest value of $\mu''_j(S_{t+1})$ corresponds to the situation (class) with the greatest possibility of occurrence. The possibility of transition to the other situations, which is useful in the decision-making process, is also available.

4. Proposed monitoring system

Fig. 2 presents the steps of the proposed methodology. Solid lines indicate the sequence of steps to be followed in the training stage (offline) and the monitoring stage (online). Dotted lines indicate that the information resulting from one step is required by another step.

Fig. 2. Monitoring strategy including the estimation proposal.

In the offline stage, the fuzzy classifier is trained using the historical process data. Any fuzzy clustering method can be used in this step (assuring a suitable classification and choosing the fuzzy clustering method are not the subject of this paper). The predictor (WFT and DWFT) is then built from the membership degrees obtained for each sample to each class. Although the offline stage has a high computational cost, it has no effect on the time required for monitoring (online step). The outputs of the offline stage are WFT and DWFT, obtained by constructing and solving Eqs. (16) and (23). These equation systems are characterized by a high dimensionality.

In the online monitoring stage, the sample at time t is the input to the classifier, which returns its membership degrees to all existing classes (situations). These membership degrees are the only input to the predictor. The computational cost is negligible because obtaining the situation prediction reduces to a simple matrix multiplication (Eqs. (16) and (23)).

To show in detail how the proposed methodology works, a simple example is presented. It consists of 3 process situations (classes), 3 variables, a training dataset with 6 samples (see Table 1), and a test dataset with 3 samples (see Table 2).

4.1. Off-line training

The samples (historical dataset), their corresponding membership degrees (matrix U) obtained by means of the fuzzy classifier (initial stage of the training), and the changes in their membership degrees (matrix DU, Eq. (22)) are presented in Table 1. Three fuzzy situations are defined as shown in Table 1. The WFT model is obtained by expanding Eq. (15) as follows:

$$\mu_j(S_{t+1}) = \mathrm{WFT}_{1j}\,\mu_1(S_t) + \mathrm{WFT}_{2j}\,\mu_2(S_t) + \mathrm{WFT}_{3j}\,\mu_3(S_t), \quad \text{situations } j = 1, 2, 3; \; \text{samples } t = 1 \text{ to } 5 \tag{32}$$

The matrix form of Eq. (32) is:

$$
\begin{bmatrix}
\mu_1(S_2) \\ \mu_1(S_3) \\ \mu_1(S_4) \\ \mu_1(S_5) \\ \mu_1(S_6) \\
\mu_2(S_2) \\ \mu_2(S_3) \\ \mu_2(S_4) \\ \mu_2(S_5) \\ \mu_2(S_6) \\
\mu_3(S_2) \\ \mu_3(S_3) \\ \mu_3(S_4) \\ \mu_3(S_5) \\ \mu_3(S_6)
\end{bmatrix}
=
\begin{bmatrix}
\mu_1(S_1) & 0 & 0 & \mu_2(S_1) & 0 & 0 & \mu_3(S_1) & 0 & 0 \\
\mu_1(S_2) & 0 & 0 & \mu_2(S_2) & 0 & 0 & \mu_3(S_2) & 0 & 0 \\
\mu_1(S_3) & 0 & 0 & \mu_2(S_3) & 0 & 0 & \mu_3(S_3) & 0 & 0 \\
\mu_1(S_4) & 0 & 0 & \mu_2(S_4) & 0 & 0 & \mu_3(S_4) & 0 & 0 \\
\mu_1(S_5) & 0 & 0 & \mu_2(S_5) & 0 & 0 & \mu_3(S_5) & 0 & 0 \\
0 & \mu_1(S_1) & 0 & 0 & \mu_2(S_1) & 0 & 0 & \mu_3(S_1) & 0 \\
0 & \mu_1(S_2) & 0 & 0 & \mu_2(S_2) & 0 & 0 & \mu_3(S_2) & 0 \\
0 & \mu_1(S_3) & 0 & 0 & \mu_2(S_3) & 0 & 0 & \mu_3(S_3) & 0 \\
0 & \mu_1(S_4) & 0 & 0 & \mu_2(S_4) & 0 & 0 & \mu_3(S_4) & 0 \\
0 & \mu_1(S_5) & 0 & 0 & \mu_2(S_5) & 0 & 0 & \mu_3(S_5) & 0 \\
0 & 0 & \mu_1(S_1) & 0 & 0 & \mu_2(S_1) & 0 & 0 & \mu_3(S_1) \\
0 & 0 & \mu_1(S_2) & 0 & 0 & \mu_2(S_2) & 0 & 0 & \mu_3(S_2) \\
0 & 0 & \mu_1(S_3) & 0 & 0 & \mu_2(S_3) & 0 & 0 & \mu_3(S_3) \\
0 & 0 & \mu_1(S_4) & 0 & 0 & \mu_2(S_4) & 0 & 0 & \mu_3(S_4) \\
0 & 0 & \mu_1(S_5) & 0 & 0 & \mu_2(S_5) & 0 & 0 & \mu_3(S_5)
\end{bmatrix}
\begin{bmatrix}
\mathrm{WFT}_{11} \\ \mathrm{WFT}_{12} \\ \mathrm{WFT}_{13} \\
\mathrm{WFT}_{21} \\ \mathrm{WFT}_{22} \\ \mathrm{WFT}_{23} \\
\mathrm{WFT}_{31} \\ \mathrm{WFT}_{32} \\ \mathrm{WFT}_{33}
\end{bmatrix}
\tag{33}
$$
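The row layout of Eq. (33) can be generated mechanically from the U matrix. The sketch below is an illustration only: the function name `build_system` is hypothetical, and the step that solves the resulting system is deliberately left out, because the exact solver and any constraints on the coefficients are not restated in this section.

```python
def build_system(u):
    """Build the design matrix and target vector laid out as in Eq. (33).

    u : list of samples in time order; each sample is the list of its
        m membership degrees (one row of the U matrix).
    Returns (rows, targets), where each row has m*m entries ordered as
    WFT_11, WFT_12, ..., WFT_1m, WFT_21, ..., WFT_mm.
    """
    m = len(u[0])
    rows, targets = [], []
    for j in range(m):                  # one block of equations per situation j
        for t in range(len(u) - 1):     # consecutive pairs (S_t, S_{t+1})
            row = [0.0] * (m * m)
            for i in range(m):
                row[i * m + j] = u[t][i]    # coefficient that multiplies WFT_ij
            rows.append(row)
            targets.append(u[t + 1][j])     # left-hand side mu_j(S_{t+1})
    return rows, targets
```

For a training set with 6 samples and 3 classes this yields 15 equations in 9 unknowns, which is the size of the system in Eq. (33). The same function applied to the rows of the DU matrix (skipping the first sample, whose change is undefined) gives the corresponding system for DWFT.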

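Table 1. A simple example (training database). Columns x1 to x3 are the variables (descriptors), μ1 to μ3 are the fuzzy membership degrees (U matrix), and Δμ1 to Δμ3 are the fuzzy membership degree changes (DU matrix).

| Sample | x1 | x2 | x3 | μ1 | μ2 | μ3 | Δμ1 | Δμ2 | Δμ3 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.6 | 0.4 | 0.586 | 0.205 | 0.095 | – | – | – |
| 2 | 1 | 0.5 | 0.5 | 0.784 | 0.225 | 0.083 | 0.198 | 0.020 | -0.013 |
| 3 | 0.8 | 0.6 | 0.4 | 0.529 | 0.280 | 0.154 | -0.255 | 0.055 | 0.071 |
| 4 | 0.3 | 0.8 | 0.4 | 0.213 | 0.583 | 0.311 | -0.316 | 0.303 | 0.157 |
| 5 | 0.3 | 1 | 0.2 | 0.157 | 0.475 | 0.858 | -0.056 | -0.108 | 0.547 |
| 6 | 0.3 | 0.9 | 0.2 | 0.157 | 0.475 | 0.670 | 0 | 0 | -0.187 |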


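Table 2. A simple example (test database). Columns x1 to x3 are the variables (descriptors), μ1 to μ3 are the fuzzy membership degrees (U matrix), and Δμ1 to Δμ3 are the fuzzy membership degree changes (DU matrix).

| Sample | x1 | x2 | x3 | μ1 | μ2 | μ3 | Δμ1 | Δμ2 | Δμ3 |
|---|---|---|---|---|---|---|---|---|---|
| 1P* | 0.28 | 0.98 | 0.25 | 0.167 | 0.502 | 0.665 | – | – | – |
| 2P* | 0.3 | 0.97 | 0.24 | 0.165 | 0.496 | 0.710 | -0.002 | -0.006 | 0.045 |
| 3P* | 0.32 | 0.96 | 0.22 | 0.167 | 0.476 | 0.761 | | | |

* P = Sample in test database.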

Eq. (33) is built using the U matrix in Table 1. This equation provides a solution for WFT (see Eqs. (34) and (35)).

$$
\begin{bmatrix}
0.784 \\ 0.529 \\ 0.213 \\ 0.157 \\ 0.157 \\
0.225 \\ 0.280 \\ 0.583 \\ 0.475 \\ 0.475 \\
0.083 \\ 0.154 \\ 0.311 \\ 0.858 \\ 0.670
\end{bmatrix}
=
\begin{bmatrix}
0.586 & 0 & 0 & 0.205 & 0 & 0 & 0.095 & 0 & 0 \\
0.784 & 0 & 0 & 0.225 & 0 & 0 & 0.083 & 0 & 0 \\
0.529 & 0 & 0 & 0.280 & 0 & 0 & 0.154 & 0 & 0 \\
0.213 & 0 & 0 & 0.583 & 0 & 0 & 0.311 & 0 & 0 \\
0.157 & 0 & 0 & 0.475 & 0 & 0 & 0.858 & 0 & 0 \\
0 & 0.586 & 0 & 0 & 0.205 & 0 & 0 & 0.095 & 0 \\
0 & 0.784 & 0 & 0 & 0.225 & 0 & 0 & 0.083 & 0 \\
0 & 0.529 & 0 & 0 & 0.280 & 0 & 0 & 0.154 & 0 \\
0 & 0.213 & 0 & 0 & 0.583 & 0 & 0 & 0.311 & 0 \\
0 & 0.157 & 0 & 0 & 0.475 & 0 & 0 & 0.858 & 0 \\
0 & 0 & 0.586 & 0 & 0 & 0.205 & 0 & 0 & 0.095 \\
0 & 0 & 0.784 & 0 & 0 & 0.225 & 0 & 0 & 0.083 \\
0 & 0 & 0.529 & 0 & 0 & 0.280 & 0 & 0 & 0.154 \\
0 & 0 & 0.213 & 0 & 0 & 0.583 & 0 & 0 & 0.311 \\
0 & 0 & 0.157 & 0 & 0 & 0.475 & 0 & 0 & 0.858
\end{bmatrix}
\begin{bmatrix}
\mathrm{WFT}_{11} \\ \mathrm{WFT}_{12} \\ \mathrm{WFT}_{13} \\
\mathrm{WFT}_{21} \\ \mathrm{WFT}_{22} \\ \mathrm{WFT}_{23} \\
\mathrm{WFT}_{31} \\ \mathrm{WFT}_{32} \\ \mathrm{WFT}_{33}
\end{bmatrix}
\tag{34}
$$

$$
\begin{bmatrix}
\mathrm{WFT}_{11} \\ \mathrm{WFT}_{12} \\ \mathrm{WFT}_{13} \\
\mathrm{WFT}_{21} \\ \mathrm{WFT}_{22} \\ \mathrm{WFT}_{23} \\
\mathrm{WFT}_{31} \\ \mathrm{WFT}_{32} \\ \mathrm{WFT}_{33}
\end{bmatrix}
=
\begin{bmatrix}
0.7936 \\ 0.2383 \\ 0 \\ 0 \\ 0.7468 \\ 1.1615 \\ 0.0171 \\ 0.1068 \\ 0.1513
\end{bmatrix}
\tag{35}
$$

In a similar way, the DU matrix (Table 1) is used to build Eq. (23), which provides a solution for DWFT (see Eqs. (36) and (37)).

$$
\begin{bmatrix}
-0.255 \\ -0.316 \\ -0.056 \\ 0 \\
0.055 \\ 0.303 \\ -0.108 \\ 0 \\
0.071 \\ 0.157 \\ 0.547 \\ -0.187
\end{bmatrix}
=
\begin{bmatrix}
0.198 & 0 & 0 & 0.020 & 0 & 0 & -0.013 & 0 & 0 \\
-0.255 & 0 & 0 & 0.055 & 0 & 0 & 0.071 & 0 & 0 \\
-0.316 & 0 & 0 & 0.303 & 0 & 0 & 0.157 & 0 & 0 \\
-0.056 & 0 & 0 & -0.108 & 0 & 0 & 0.547 & 0 & 0 \\
0 & 0.198 & 0 & 0 & 0.020 & 0 & 0 & -0.013 & 0 \\
0 & -0.255 & 0 & 0 & 0.055 & 0 & 0 & 0.071 & 0 \\
0 & -0.316 & 0 & 0 & 0.303 & 0 & 0 & 0.157 & 0 \\
0 & -0.056 & 0 & 0 & -0.108 & 0 & 0 & 0.547 & 0 \\
0 & 0 & 0.198 & 0 & 0 & 0.020 & 0 & 0 & -0.013 \\
0 & 0 & -0.255 & 0 & 0 & 0.055 & 0 & 0 & 0.071 \\
0 & 0 & -0.316 & 0 & 0 & 0.303 & 0 & 0 & 0.157 \\
0 & 0 & -0.056 & 0 & 0 & -0.108 & 0 & 0 & 0.547
\end{bmatrix}
\begin{bmatrix}
\mathrm{DWFT}_{11} \\ \mathrm{DWFT}_{12} \\ \mathrm{DWFT}_{13} \\
\mathrm{DWFT}_{21} \\ \mathrm{DWFT}_{22} \\ \mathrm{DWFT}_{23} \\
\mathrm{DWFT}_{31} \\ \mathrm{DWFT}_{32} \\ \mathrm{DWFT}_{33}
\end{bmatrix}
\tag{36}
$$


$$
\begin{bmatrix}
\mathrm{DWFT}_{11} \\ \mathrm{DWFT}_{12} \\ \mathrm{DWFT}_{13} \\
\mathrm{DWFT}_{21} \\ \mathrm{DWFT}_{22} \\ \mathrm{DWFT}_{23} \\
\mathrm{DWFT}_{31} \\ \mathrm{DWFT}_{32} \\ \mathrm{DWFT}_{33}
\end{bmatrix}
=
\begin{bmatrix}
0.2306 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1.8365 \\ 0 \\ 0.0118 \\ 0.0235
\end{bmatrix}
\tag{37}
$$

The fuzzy classifier, WFT and DWFT are the elements that allow the online prediction for the example process.

4.2. Online monitoring

A prediction can be obtained once the current sample ($S_t$) and the previous sample ($S_{t-1}$) have been classified. To illustrate the procedure, test samples 1P and 2P in Table 2 (test dataset) are used to predict the membership degrees of test sample 3P. As online data, sample 2P ($x_1 = 0.3$, $x_2 = 0.97$ and $x_3 = 0.24$) is taken from the process with its classification ($\mu_1 = 0.165$, $\mu_2 = 0.496$ and $\mu_3 = 0.710$), and the differences in membership degrees (Eq. (22)) are calculated from samples 2P and 1P ($\Delta\mu_1 = \mu_1(2P) - \mu_1(1P) = 0.165 - 0.167 = -0.002$, $\Delta\mu_2 = \mu_2(2P) - \mu_2(1P) = -0.006$ and $\Delta\mu_3 = 0.045$). Once WFT, DWFT, $\mu$ and $\Delta\mu$ (for sample 2P) are known, Eqs. (16) and (23) are evaluated as shown in Eqs. (38) and (39) respectively.

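$$
\begin{bmatrix} \mu_1(S_3) \\ \mu_2(S_3) \\ \mu_3(S_3) \end{bmatrix}
=
\begin{bmatrix}
0.165 & 0 & 0 & 0.496 & 0 & 0 & 0.710 & 0 & 0 \\
0 & 0.165 & 0 & 0 & 0.496 & 0 & 0 & 0.710 & 0 \\
0 & 0 & 0.165 & 0 & 0 & 0.496 & 0 & 0 & 0.710
\end{bmatrix}
\begin{bmatrix}
\mathrm{WFT}_{11} \\ \mathrm{WFT}_{12} \\ \mathrm{WFT}_{13} \\
\mathrm{WFT}_{21} \\ \mathrm{WFT}_{22} \\ \mathrm{WFT}_{23} \\
\mathrm{WFT}_{31} \\ \mathrm{WFT}_{32} \\ \mathrm{WFT}_{33}
\end{bmatrix}
=
\begin{bmatrix} 0.1431 \\ 0.4856 \\ 0.6835 \end{bmatrix}
\tag{38}
$$

$$
\begin{bmatrix} \Delta\mu_1(S_3) \\ \Delta\mu_2(S_3) \\ \Delta\mu_3(S_3) \end{bmatrix}
=
\begin{bmatrix}
-0.002 & 0 & 0 & -0.006 & 0 & 0 & 0.045 & 0 & 0 \\
0 & -0.002 & 0 & 0 & -0.006 & 0 & 0 & 0.045 & 0 \\
0 & 0 & -0.002 & 0 & 0 & -0.006 & 0 & 0 & 0.045
\end{bmatrix}
\begin{bmatrix}
\mathrm{DWFT}_{11} \\ \mathrm{DWFT}_{12} \\ \mathrm{DWFT}_{13} \\
\mathrm{DWFT}_{21} \\ \mathrm{DWFT}_{22} \\ \mathrm{DWFT}_{23} \\
\mathrm{DWFT}_{31} \\ \mathrm{DWFT}_{32} \\ \mathrm{DWFT}_{33}
\end{bmatrix}
=
\begin{bmatrix} -0.0005 \\ 0.0005 \\ -0.010 \end{bmatrix}
\tag{39}
$$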
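The online step of Eqs. (38) and (39), followed by the sum of Eq. (27), amounts to two small matrix-vector products. The sketch below reproduces the arithmetic of the example with plain Python lists; the helper name `propagate` and the nested-list storage of WFT and DWFT (entry `[i][j]` holding the coefficient with subscripts $i+1$, $j+1$) are assumptions made for illustration, and the coefficient values are those listed in Eqs. (35) and (37).

```python
def propagate(weights, values):
    """Return sum_i weights[i][j] * values[i] for every class j.

    This is the product written out in Eqs. (38) and (39); weights[i][j]
    is the coefficient that links class i at time t to class j at t + 1.
    """
    m = len(values)
    return [sum(weights[i][j] * values[i] for i in range(m)) for j in range(m)]


# Coefficients from Eqs. (35) and (37), arranged as 3 x 3 nested lists.
WFT = [[0.7936, 0.2383, 0.0],
       [0.0, 0.7468, 1.1615],
       [0.0171, 0.1068, 0.1513]]
DWFT = [[0.2306, 0.0, 0.0],
        [0.0, 0.0, 1.8365],
        [0.0, 0.0118, 0.0235]]

mu_1p = [0.167, 0.502, 0.665]   # membership degrees of test sample 1P
mu_2p = [0.165, 0.496, 0.710]   # membership degrees of test sample 2P

# Eq. (22): change in membership degrees between consecutive samples.
delta_2p = [curr - prev for prev, curr in zip(mu_1p, mu_2p)]

mu_next = propagate(WFT, mu_2p)            # predicted membership degrees
delta_next = propagate(DWFT, delta_2p)     # predicted membership changes
# Eq. (27): initial prediction of membership degrees for sample 3P.
mu_initial = [a + b for a, b in zip(mu_next, delta_next)]
```

Evaluating `mu_next` gives approximately 0.1431, 0.4856 and 0.6835, and `mu_initial` gives approximately 0.1426, 0.4861 and 0.6736.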

In Eq. (40) the result of the initial prediction according to Eq. (27) is shown.

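$$
\begin{bmatrix} \mu'_1(S_3) \\ \mu'_2(S_3) \\ \mu'_3(S_3) \end{bmatrix}
=
\begin{bmatrix} 0.1431 \\ 0.4856 \\ 0.6835 \end{bmatrix}
+
\begin{bmatrix} -0.0005 \\ 0.0005 \\ -0.010 \end{bmatrix}
=
\begin{bmatrix} 0.1426 \\ 0.4861 \\ 0.6736 \end{bmatrix}
\tag{40}
$$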

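The value of $I_D$ (Eqs. (28) and (29)) is calculated in Eqs. (41) and (42) using the membership degrees of sample 2P.

$$
\mu_M = \mu_3 = 0.71; \quad
\begin{bmatrix} k_1 \\ k_2 \end{bmatrix}
=
\begin{bmatrix} \mu_M - \mu_1 \\ \mu_M - \mu_2 \end{bmatrix}
=
\begin{bmatrix} 0.71 - 0.165 \\ 0.71 - 0.496 \end{bmatrix}
=
\begin{bmatrix} 0.545 \\ 0.214 \end{bmatrix}
\tag{41}
$$

$$
I_D = \frac{0.545 \, e^{0.545} + 0.214 \, e^{0.214}}{2 \times 0.71 \times e^{0.71}} = 0.4172
\tag{42}
$$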

The change value $K\mu(S_{3P})$ of Eq. (30), evaluated as the difference between the initial prediction and the membership degrees of sample 2P, is calculated as follows:


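$$
\begin{bmatrix} K\mu_1(S_3) \\ K\mu_2(S_3) \\ K\mu_3(S_3) \end{bmatrix}
=
\begin{bmatrix} 0.1426 \\ 0.4861 \\ 0.6736 \end{bmatrix}
-
\begin{bmatrix} 0.165 \\ 0.496 \\ 0.710 \end{bmatrix}
=
\begin{bmatrix} -0.0224 \\ -0.0099 \\ -0.0364 \end{bmatrix}
\tag{43}
$$

The final prediction of membership degrees $\mu''_j(S_3)$ of Eq. (31) is evaluated in Eq. (44).

$$
\begin{bmatrix} \mu''_1(S_{3P}) \\ \mu''_2(S_{3P}) \\ \mu''_3(S_{3P}) \end{bmatrix}
=
\begin{bmatrix} 0.165 \\ 0.496 \\ 0.710 \end{bmatrix}
+
\frac{1}{0.4172}
\begin{bmatrix} -0.0224 \\ -0.0099 \\ -0.0364 \end{bmatrix}
=
\begin{bmatrix} 0.1113 \\ 0.4723 \\ 0.6227 \end{bmatrix}
\tag{44}
$$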

The final situation for sample 3P is determined by the maximum predicted membership value, which in Eq. (44) corresponds to class j = 3. The predicted membership degrees (Eq. (44)) and the expected membership degrees of sample 3P (see Table 2) are similar and, in both cases (predicted values and expected values), the maximum membership degree corresponds to class 3 (j = 3). The situation of sample 3P is therefore predicted correctly from the information available at sample 2P.

5. Real application examples

Two case studies are presented in this paper: a thermal monitoring system in an electrical conductor, for which the proposed methodology is presented in detail, and a monitoring system in a boiler subsystem of a steam generator [22]. These two cases allow the performance of the predictor within the monitoring scheme to be analyzed. In both applications, according to the proposed methodology, the starting point is the best fuzzy classification obtained and validated by the expert. For the boiler subsystem, Botia et al. [6] showed that a suitable classification could be obtained by different fuzzy algorithms such as FCM, GKM and LAMDA. Procedures and prediction calculations were performed using MATLAB software on a PC with an Intel(R) Core(TM) 2 Duo CPU at 2.53 GHz, 6 GB RAM, 64 bits. The estimated average time to execute the training stage was 2 s for both cases; tuning of the classification algorithm parameters was not included, as they were considered valid. In the online monitoring stage, the estimated average time to execute a prediction was 0.014 s.

5.1. Thermal monitoring in a power transmission line: description of the process

The elementary fiber Bragg grating (FBG) consists of a short section of single-mode optical fiber in which the core refractive index is periodically modulated [33]. An FBG reflects particular wavelengths of light and transmits the others: the fiber core acts as a wavelength-specific dielectric mirror (Bragg wavelength). A fiber Bragg grating can, therefore, be used as an inline optical filter to block certain wavelengths, or as a wavelength-specific reflector. In an FBG, there is a linear relationship between spectral displacement and temperature changes. FBGs also offer reduced size and weight, electromagnetic immunity, the possibility of multiplexing, and remote sensing. These characteristics facilitate the configuration of quasi-distributed temperature sensing in power transmission lines [31].

A section of power transmission line (test system), with the required fittings and isolators, was installed inside the High Voltage laboratory at the Engineering Faculty of Antioquia University. An 8-meter-long 1/0 ACSR conductor was connected to a transformer fed with a variable electrical source to control the applied voltage. An adjustable current flowed through the circuit made up of the secondary winding of the transformer and the conductor. Three FBGs on the same optical fiber (at 2, 4 and 6 meters from one of the ends) were installed on top of the conductor. The wavelengths (WL) at 25 °C corresponding to sensors 1, 2 and 3 were 1554.11 nm, 1556.14 nm and 1558.18 nm respectively. A Bragg grating interrogator was installed at one end of the fiber in order to record temperature values. The conductor was heated through a controlled current increase (normal operation) and through external temperature sources (flame from a butane torch), which applied a disturbance and generated a fault situation (i.e. an abnormal temperature increase). The thermal profiles used for the training and test stages of the monitoring system are presented in Fig. 3. Normal operation and fault states were present in both stages. In Fig. 3, the horizontal axis corresponds to the samples of the signals (1 sample per second) and the vertical axis shows the variations in the wavelength (reading of the fiber interrogator) for each sensor. The real temperature can be calculated as a function of the wavelength variation (WL), see Eq. (45). The temperatures oscillated between 30 °C (room temperature) and 110 °C.

$$T\,(^{\circ}\mathrm{C}) = 102.629 \times (WL) + 30 \tag{45}$$
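Eq. (45) is a linear calibration and can be written as a one-line helper. In this sketch the function name is hypothetical, and the assumption that the wavelength variation is expressed in nanometers (so that the slope is in °C per nm) is ours; the text does not state the unit.

```python
def wavelength_shift_to_celsius(wl_shift, slope=102.629, offset=30.0):
    """Convert an FBG wavelength variation to temperature using Eq. (45).

    wl_shift : wavelength variation read from the interrogator
               (assumed here to be in nm; the unit is not stated in the text)
    slope, offset : calibration constants taken from Eq. (45)
    """
    return slope * wl_shift + offset
```

A zero wavelength variation maps to the 30 °C room temperature quoted above.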

The training and test datasets contain 9339 and 5699 samples respectively. Each sample has 3 variables (S1 = Wavelength Variation sensor 1, S2 = Wavelength Variation sensor 2 and S3 = Wavelength Variation sensor 3), one for each FBG. Seven classes were necessary to characterize the temperature behavior in the conductor as a consequence of the electric current flow and the external temperature sources (Table 3).

Fig. 3. Temperature record with FBG: (a) training dataset, (b) test dataset.

Table 3. Power line: classes vs. functional states.

| Class | Functional state |
|---|---|
| C1 | Normal operation: Uniform variation of temperature in the 3 sensors |
| C2 | Fault: Temperature increase, active external source near sensor 3 |
| C3 | Recovery: Temperature decrease, inactive external source near sensor 3 |
| C4 | Fault: Temperature increase, active external source near sensor 2 |
| C5 | Recovery: Temperature decrease, inactive external source near sensor 2 |
| C6 | Fault: Temperature increase, active external source near sensor 1 |
| C7 | Recovery: Temperature decrease, inactive external source near sensor 1 |

5.1.1. Signal pre-processing

Before applying the clustering method, the selected temperature signals were preprocessed. The first part of the preprocessing consists of three steps: (a) signal filtering to eliminate high-frequency noise; (b) subsampling because of the large amount of data (data are registered every 10 samples); and (c) elimination of the thermal inertia that the flow of the same current produces in the three sensors, by taking the absolute value of the difference between signals ($|S1 - S2|$, denoted S12; $|S1 - S3|$, denoted S13; and $|S2 - S3|$, denoted S23). The objective of the second part of the preprocessing was to distinguish the time at which a fault occurs from its recovery towards normal (or degraded) operation. The slope calculated over a window of 10 samples for each signal (original training dataset) generated three new variables (IS1, IS2 and IS3). The resulting signals were then normalized according to Eq. (46) and are shown in Fig. 4.

$$x_{in} = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \tag{46}$$
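The preprocessing steps that are fully specified above (absolute differences between sensors, slope over a 10-sample window, and the min-max normalization of Eq. (46)) can be sketched as follows. Several details are assumptions made for illustration: the function names, the use of a least-squares line fit to estimate the slope (the text does not say how the slope is computed), the use of trailing windows, and the mapping of a constant signal to zero. The filtering and subsampling steps are omitted because their parameters are not given.

```python
def abs_difference(a, b):
    """Element-wise |a - b| between two sensor signals (e.g. S12 from S1, S2)."""
    return [abs(x - y) for x, y in zip(a, b)]


def window_slope(signal, window=10):
    """Slope of a least-squares line fitted over each trailing window.

    The estimator and the trailing-window convention are assumptions;
    the text only states that the slope is computed over 10 samples.
    """
    x_mean = (window - 1) / 2.0
    denom = sum((x - x_mean) ** 2 for x in range(window))
    slopes = []
    for end in range(window, len(signal) + 1):
        ys = signal[end - window:end]
        y_mean = sum(ys) / window
        num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
        slopes.append(num / denom)
    return slopes


def min_max_normalize(values):
    """Min-max normalization of Eq. (46), mapping the signal to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant signal is mapped to zero to avoid 0/0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `min_max_normalize([2.0, 4.0, 6.0])` returns `[0.0, 0.5, 1.0]`, and a linear ramp passed to `window_slope` returns its constant gradient for every window.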

Fig. 4. Signal preprocessing: (a) training dataset, (b) test dataset.


5.1.2. Offline training of the fuzzy classifier

The classifier is trained with the LAMDA algorithm in a supervised scheme, i.e. classes are established in advance and the normal and fault situations are present in the historical dataset. For every fault situation, the presence and the isolation of the fault were differentiated. The association of the 7 classes with known process situations is presented in Table 3. The results obtained with the classifier in the training stage are presented in Fig. 5a, where the solid line represents the evolution of the pre-established classes (reference classes) and the dashed line represents the evolution of the classes recognized by the classifier. All the classes are well recognized, but in the transition intervals there are some differences in the identification of the situation changes. In Fig. 5b, the first transition from C1 to C2 is delayed with respect to the change in reference classes, whereas in the transition from C2 to C3 the classifier identifies the change of class in advance. Analysis of the natural behavior of the signals shows that the variations are not significant until several samples after the moment when a disturbance occurs. Due to the nature of the clustering method, a change of class cannot be recognized unless there is a considerable change in the spatial location of the sample; that is, a significant change in the measured values of the variables is required.

Fig. 5. Classification obtained with LAMDA: (a) classification for training dataset, (b) zoom: samples between 125 and 180.

5.1.3. Offline training: calculating the predictor for membership degrees and for changes of membership degrees

In order to obtain the U matrix for the training dataset, the LAMDA algorithm with supervised learning was used. With the membership degrees of the historical dataset, Eqs. (15) and (23) are used to determine WFT and DWFT. A solution for both WFT and DWFT was then obtained by applying the least squares method.

5.1.4. Online process monitoring

The monitoring stage uses the WFT and DWFT matrices (calculated only once during training) and the membership degrees $\mu_j(S_{t-1})$ and $\mu_j(S_t)$ provided by the fuzzy classifier. These membership degrees correspond to the values registered at time $t - 1$ and at the current time $t$ respectively. Eq. (31) is evaluated to obtain the final prediction of the membership degrees $\mu''_j(S_{t+1})$. The situations (classes) for the training and test datasets, assigned according to the maximum estimated membership degrees, are illustrated in Fig. 6a and Fig. 8a respectively. In the figures, the solid line represents the evolution of the reference situations; the dashed line, the evolution of the situations according to the fuzzy classifier; and the dash-dot line, the evolution of the predictions.

5.1.5. Analysis of results

In order to validate the proposed algorithm, the performance of the predictor was first tested with the training dataset. Fig. 6a presents the results of prediction and classification for the training dataset. The fuzzy classifier (dashed line) identifies the situations that are present, but there are delays in the transitions with respect to the reference class. The situation predictor (dash-dot line) is also able to identify the current situations and, in all transitions, it is ahead of the classifier by at least 3 samples. In the training dataset, once the fault (abnormal temperature increase near sensor 3) starts at sample 128, affecting the process slightly and gradually, the classifier detects it only at sample 149, while the prediction estimates the situation change accurately at sample 146 (see Fig. 6b). Fig. 7a illustrates the membership degrees obtained with the fuzzy classifier for samples 142 to 154, and Fig. 7b, the classes obtained both by the classifier and the predictor. In Fig. 7b, the fuzzy classifier registers the change from class C1 to class C2 at sample 149 (see solid line with black circles), while the predictor estimates the change of class at sample 146 (see grey solid line with grey ‘x’s). Based on the decreasing trend TC1 (dotted line) and the constant trend TC2 (x’s) shown in Fig. 7b, the predictor estimates membership degrees in class C1 lower than those in class C2, thus obtaining an accurate prediction 4 samples ahead of the classifier. With the test dataset, the predictor satisfactorily fulfills the prediction task by being ahead of the classifier in all cases (see Fig. 8). The performance of the predictor is adequate, since the variations in the signals of the process caused by the disturbances produce gradual variations in the membership degrees provided by the classifier, and these patterns are learned by the predictor.

Fig. 6. Situation behavior for the training dataset: (a) all samples, (b) zoom – samples 120 to 160: abnormal temperature increase near sensor 3 (around sample 128).

5.2. Case study 2: boiler subsystem

The second case study corresponds to the boiler subsystem of a steam generator [22]. The steam generator was designed as a scaled pilot version of a real steam generator at a nuclear plant. The process operates as follows: the feed water flow is generated by a pump that propels water to a boiler. To maintain a constant water level in the boiler, an On–Off controller acts on the pump. The heat power of the boiler depends on the steam accumulator pressure. When the accumulator pressure drops below a minimum value, the heat resistance that gives the maximum heat power is activated, and when a maximum pressure is reached the heat resistance is cut off to maintain the pressure at the set-point. The historical data contain 937 training samples, 1800 test samples, and 5 descriptor variables corresponding to physical measurements: the feed water flow, heat power, boiler pressure, boiler level and output steam flow. To homogenize the influence of the order of magnitude of the variables, data were normalized with respect to their maximum and minimum values. Membership degrees (matrix U) for each historical sample were obtained using the LAMDA algorithm. Five classes were identified by the fuzzy clustering method. Each sample was classified according to its maximum membership degree. In Table 4, the 5 resulting classes are associated with situations after validation by the process expert (100% clustering performance). Based on the matrix U estimated for the training dataset, the WFT and DWFT matrices are obtained by solving Eqs. (15) and (23). The estimator is then tested on the training and test datasets by applying the online monitoring steps (see Figs. 9 and 10 respectively). It can be observed, in Fig.
9a, that the classes are correctly predicted for the training dataset. It is also important to highlight that, contrary to the classifier, the prediction avoids (in some cases) the oscillation between classes C2 and C3 (see details in Fig. 9b). Classes C2 and C3 correspond to the same pressure regulation situation according to Table 4. Class C3 differs from class C2 in that its variable values are very close to the values that determine the next situation. Prediction results for the test dataset are shown in Fig. 10a. The predictor retains the desired characteristic of eliminating the oscillations between classes C2 and C3 (see detail in Fig. 10b). It is also important to highlight that the change from class C2 to C5 was accurately predicted, even though this transition was not present in the training dataset (see detail in Fig. 10c).

Fig. 7. Prediction results for the training dataset (samples 142 to 154): (a) membership degrees obtained for the classifier, (b) classes: Classifier (circles) vs. Predictor (grey ‘x’s).

Fig. 8. Situation behavior for the test dataset: (a) all samples, (b) samples 406 to 450: abnormal temperature increase near sensor 3.

Table 4. Boiler subsystem: classes vs. states.

| Class | Functional state |
|---|---|
| C1 | Normal operation |
| C2, C3 | Regulation: Pressure |
| C4 | Regulation: Level |
| C5 | Regulation: Level and pressure (simultaneously) |

Fig. 9. Situation behavior: classifier (empty circles) vs. predictor (asterisks): (a) training dataset, (b) zoom – samples from 870 to 890: oscillation between classes C2 and C3 identified by the classifier before the process evolves from class C1 to class C2.

Fig. 10. Situation behavior for the test dataset: classifier (empty circles) vs. predictor (asterisks): (a) all samples, (b) zoom – samples 330 to 370: oscillations between classes C2 and C3; (c) zoom – samples 105 to 140: change from class C2 to C5.

5.2.1. Analysis of results

This case study differs from the temperature system in its time response: in the boiler process, the changes happen faster. In the FBG temperature monitoring system, once the disturbance occurs, the variables of the system vary slowly, and a trend in the behavior is likely to be found before a change of class arises. In the boiler, the changes are drastic (changes of class between samples 8 and 9, following the occurrence of the disturbance), as can be seen in Fig. 11.

Fig. 11a is a zoom on the graph of the membership degrees obtained for the historical dataset (between samples 4 and 12). Soon after the disturbance arises (sample 7), the membership degrees change strongly (membership values of class C1 and class C3 in samples 7 and 8), generating a change of class. This rapid change in the membership degrees is associated with the rapid change of the process variable values caused by the disturbance. Hence, no pattern can be detected, and the predictor can, at most, estimate the same class obtained by the classifier, thus eliminating oscillatory transitions (C3). Fig. 11b compares the classes identified by the classifier (solid line with circles) and the classes obtained by the predictor (solid line with ‘x’s). The disturbance that arises after sample 7 produces a drastic change in the membership degrees, as recorded in sample 8. Consequently, the prediction for sample 8 follows the trend of the values for sample 7 and remains in class C1 (maximum membership value). Afterwards, with the recording of sample 8, the predictor estimates class C2 for sample 9, and again, the prediction for sample 10 based on the recording of sample 9 is class C2. These predictions for samples 8, 9, and 10 have bypassed the oscillation between classes C2 and C3 before the process evolves from class C1 to class C2.


Fig. 11. Prediction results for the test dataset (samples 4 to 12): (a) membership degrees obtained for the classifier, (b) class identification: classifier (empty circles) vs. predictor (x’s)

6. Conclusions and future work

This paper presents a method for predicting situations based on the information extracted from a process by means of fuzzy classifiers. The relationship between probability values and possibility values (supported in the literature) allowed a prediction structure from the probability field, such as Markov chains, to be used for the prediction of membership degrees. The basis of this work is a suitable classification that characterizes the process situations; this classification may be obtained by any fuzzy clustering algorithm that provides membership degrees. The proposed prediction of membership degrees was integrated into a general process monitoring system, and was evaluated as satisfactory according to the results obtained for the two case studies. When a system has a slow response to a disturbance, the predictor is able to estimate the following state adequately. If the response of the system is too fast, the predictor follows this behavior in a conservative way (maintaining the state previous to the recorded fault).

As future work, based on the state prediction proposal, it would be useful to build a representative automaton of the process. This automaton could incorporate the evolution of the process (situation prediction). In the proposed automaton, connection links between states would assist the operator by providing information about the evolution of the process at each time sample, and could be used as support in the decision-making task. Since matrix WFT contains the information on transitions between situations, the values of this matrix could be used as weights for the connections among the states of the process.

Acknowledgements

This work was sponsored by the international cooperation agreement COLCIENCIAS (COL) – ECOSNord (FRA), and CODI – Universidad de Antioquia UdeA (COL).

References

[1] C. Aguado, J. Aguilar-Martin, A mixed qualitative-quantitative self-learning classification technique applied to diagnosis, in: QR’99 The Thirteenth International Workshop on Qualitative Reasoning, Chris Price, 1999, pp. 124–128.
[2] D. Aguado, C. Rosen, Multivariate statistical monitoring of continuous wastewater processing plants, Eng. Appl. Artif. Intell. 21 (7) (2008) 1080–1091.
[3] J. Aguilar-Martín, R. Lopez De Mantaras, The process of classification and learning the meaning of linguistic descriptors of concepts, in: M.M. Gupta, E. Sanchez (Eds.), Approximate Reasoning in Decision Analysis, North Holland, 1982, pp. 165–175.
[4] C. Bedoya, C. Uribe, C. Isaza, Unsupervised Feature Selection Based on Fuzzy Clustering for Fault Detection of the Tennessee Eastman Process, Advances in Artificial Intelligence, Springer-Verlag LNAI 7637, 2012, pp. 350–360.
[5] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Publishing Corporation, New York, USA, 1981.


[6] J. Botía, C. Isaza, T. Kempowsky, M-V. Le Lann, J. Aguilar-Martín, Automaton based on fuzzy clustering methods for monitoring industrial processes, Eng. Appl. Artif. Intell. 26 (4) (2013) 1211–1220.
[7] A.A. Cuadrado, I. Díaz, A.B. Diez, M. Domínguez, J.A. González, F. Obeso, Maprex: a SOM based condition monitoring system, in: International Federation of Automatic Control 15th IFAC World Congress, Barcelona, España, 2002.
[8] S. De Silva, M. Dias, V. Lopez, M.J. Brennan, Structural damage detection by fuzzy clustering, Mech. Syst. Signal Process. 22 (2008) 1636–1649.
[9] R. Du, K. Yeung, Fuzzy transition probability: a new method for monitoring progressive faults. Part 1. The theory, Eng. Appl. Artif. Intell. 17 (2004) 457–467.
[10] R. Du, K. Yeung, Fuzzy transition probability: a new method for monitoring progressive faults. Part 2. Applications examples, Eng. Appl. Artif. Intell. 19 (2006) 145–155.
[11] D. Dubois, H. Prade, Unfair coins and necessity measures: towards a possibilistic interpretation of histograms, Fuzzy Sets Syst. 10 (1983) 15–20.
[12] D.E. Gustafson, W.C. Kessell, Fuzzy clustering with a fuzzy covariance matrix, in: IEEE Conference on Decision and Control Including the 17th Symposium on Adaptive Processes, University of California, Berkeley, California, 1978, pp. 761–766.
[13] L. Hedjazi, J. Aguilar-Martin, M-V. Le Lann, Similarity-margin based feature selection for symbolic interval data, Pattern Recogn. Lett. 32 (4) (2011) 578–585.
[14] L. Hedjazi, J. Aguilar-Martin, M-V. Le Lann, T. Kempowsky, Towards a unified principle for reasoning about heterogeneous data: a fuzzy logic framework, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 20 (2) (2012) 281–302.
[15] C. Isaza, E. Diez-Lledo, H. Hernandez de Leon, J. Aguilar-Martin, M-V. Le Lann, Decision method for functional states validation in a drinking water plant, in: 10th International Symposium on Computer Applications in Biotechnology, Cancún-Mexico, 2007, pp. 359–364.
[16] C. Isaza, A. Orantes, T. Kempowsky, M-V. Le Lann, Contribution of fuzzy classification for the diagnosis of complex systems, in: 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, Barcelona-Spain, 2009, pp. 1132–1137.
[17] R. Isermann, P. Ballé, Trends in the application of model-based fault detection and diagnosis of technical processes, Control Eng. Pract. 5 (5) (1997) 709–719.
[18] A.K. Jain, M.N. Murty, P.J. Flynn, Data clustering: a review, ACM Comput. Surv. 31 (3) (1999) 264–323.
[19] A.K. Jain, Data clustering: 50 years beyond k-means, Pattern Recogn. Lett. 31 (8) (2010) 651–665.
[20] K. Jyoti, S. Singh, Data clustering approach to industrial process monitoring, fault detection and isolation, Int. J. Comput. Appl. 17 (2) (2011) 41–45.
[21] T. Kempowsky, A. Subias, J. Aguilar-Martin, Supervision of complex processes: strategy for fault detection and diagnosis, in: MCPL’04, Third Conference on Management and Control of Production and Logistics, Santiago de Chile-Chile, 2004.
[22] T. Kempowsky, A. Subias, J. Aguilar-Martin, Process situation assessment: from a fuzzy partition to a finite state machine, Eng. Appl. Artif. Intell. 19 (2006) 461–477.
[23] A. Lemos, W. Caminhas, F. Gomide, Adaptive fault detection and diagnosis using an evolving fuzzy classifier, Inform. Sci. 220 (2013) 64–85.
[24] M. Omran, A.P. Engelbrecht, A. Salman, An overview of clustering methods, Intell. Data Anal. 11 (6) (2007) 583–605.
[25] A. Orantes, T. Kempowsky, M-V. Le Lann, J. Aguilar-Martin, A new support methodology for the placement of sensors used for fault detection and diagnosis, Chem. Eng. Process.: Process Intensif. 47 (3) (2008) 330–348.
[26] N. Piera, J. Aguilar, Controlling selectivity in non-standard pattern recognition algorithms, IEEE Trans. Syst. Man Cybernet. 21 (1) (1991) 71–82.
[27] N. Rakoto-Ravalontsalama, J. Aguilar-Martin, Automatic clustering for symbolic evaluation for dynamical system supervision, in: Proc. of American Control Conference, ACC’92, Chicago, USA, 1992, pp. 1895–1897.
[28] S. Ross, Stochastic Processes, second ed., John Wiley & Sons, USA, 1996.
[29] J. Stewart, Single Variable Calculus: vol. 2, Early Transcendentals, 7th Edition, MacMaster University, CAN, 2012.
[30] G. Sylviane, Supervision des Procedes Complexes, Traite IC2, Serie Systemes Automatises, Lavoisier, 2007.
[31] E. Udd, W.B. Spillman, Fiber Optic Sensors: An Introduction for Engineers and Scientists, second ed., John Wiley & Sons, Hoboken, USA, 2011.
[32] A. Yan, W. Wang, C. Zhang, H. Zhao, A fault prediction method that uses improved case-based reasoning to continuously predict the status of a shaft furnace, Inform. Sci. (2013), http://dx.doi.org/10.1016/j.ins.2013.04.025.
[33] F.T.S. Yu, S. Yin, Fiber Optic Sensors, Marcel Dekker, New York, 2002.
[34] L.A. Zadeh, Fuzzy sets as a basis of theory of possibility, Fuzzy Sets Syst. 1 (1978) 3–28.
[35] L.A. Zadeh, Fuzzy probabilities, Inform. Process. Manage. 20 (3) (1984) 363–372.
[36] Z.J. Zhou, C.H. Hu, D.L. Xu, J.B. Yang, D.H. Zhou, New model for system behavior prediction based on belief rule based systems, Inform. Sci. 180 (2010) 4834–4864.

Claudia Isaza is an Assistant Professor in Electronic Engineering at the University of Antioquia, Medellín, Colombia. Her research interests include complex systems monitoring using clustering methods, data mining, and fuzzy logic. She received her bachelor’s degree in electronic engineering from Distrital F.J.C. University (Bogota-Colombia) in 2002, her M.S. degree in electrical engineering (control emphasis) from Andes University (Bogota-Colombia) in 2004, and her Ph.D. degree in Automatic Systems from INSA, Toulouse, France in 2007.

Henry O. Sarmiento Maldonado is an Associate Professor in Instrumentation and Control Engineering at the Jaime Isaza Cadavid Polytechnic Institute, Medellín, Colombia. His research interests include complex systems monitoring using clustering methods, data mining, fuzzy logic, artificial neural networks, and electric power systems. He received his bachelor’s degree in electrical engineering from the University of Antioquia (Medellín-Colombia) in 1996, his Specialist degree in industrial automation from the University of Antioquia in 1998, and his M.S. degree in engineering (electric power systems emphasis) from the University of Antioquia in 2008. He is a final-year Ph.D. student in Electronic Engineering at the University of Antioquia.


Tatiana Kempowsky-Hamon has been a Research Engineer at the Laboratory for Analysis and Architecture of Systems (LAAS-CNRS), Toulouse, France, since 2007. In 2000 she graduated as an Electrical Engineer from Los Andes University, Bogota, Colombia. She obtained her M.Sc. in Industrial Systems from the National Institute of Applied Sciences (INSA), Toulouse, France, in 2001, and received her Ph.D. on Industrial Systems Supervision in 2004. Her research interests are the supervision, fault detection and diagnosis of industrial systems by means of data-driven methods such as clustering and fuzzy classification. She currently works on feature selection techniques for cancer diagnosis and prognosis.

Marie-Véronique Le Lann obtained her diploma in Chemical Engineering in 1981 at ENSIGC (INP of Toulouse, France) and her Ph.D. in 1988 at INPT. She was an assistant professor at ENSIGC (INPT) until 1999, with activities in the field of model predictive control and, more generally, process control. She has been a full professor at Institut National des Sciences Appliquées since 1999 and carries out her research at the Laboratoire d’Analyse et d’Architecture des Systèmes in Toulouse, France. Her work deals with supervision, diagnosis, feature selection, and information processing applied to processes or to the medical domain. She is the author of 220 articles in journals and conferences.
