Chemical Engineering Science 60 (2005) 399–412
www.elsevier.com/locate/ces

Fuzzy classification with an artificial chemical process

Roberto Irizarry*
DuPont Electronics, Microcircuit Industries, Ltd., P.O. Box 30200, Manati, PR 00674-8501, USA

Received 29 October 2003; received in revised form 3 May 2004; accepted 27 July 2004. Available online 22 October 2004.
Abstract

In this work, a new algorithm to extract a compact set of if/then rules from data for classification problems is presented. The premise part is extracted directly using LARES as the learning tool; LARES is a new global optimization procedure based on the recently introduced paradigm called the artificial chemical process. The conclusion part is determined using soft computing techniques. In the learning phase, the objective function minimizes the number of misclassified patterns from the training data and reduces the conflicts between the rules that generate the pattern partition. The proposed method has many potential applications in industrial processes. Several examples are presented, including fault detection and the operation of reactions with unstable regimes.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Fuzzy logic; Pattern classification; Linguistic model; Artificial chemical process; Global optimization
1. Introduction

1.1. Fuzzy and neuro-fuzzy systems

A fuzzy inference system consists of a set of rules described as if/then statements, which together determine the action (output) for a given situation (input). This capacity for explaining responses on the basis of human-like reasoning has proven to be a very powerful tool for industrial applications. Inference systems have been used in designing feedback controllers (De Carli et al., 1994) and extracting control rules for robotic applications (Zhang and Ferch, 2003). Combined with neural networks, fuzzy inference has been applied to the monitoring of nuclear reactors to estimate the departure from nucleate boiling protection limit (Na, 1999). Other applications to industrial processes that exhibit complex operational behavior include sintering processes (Er et al., 2000), polymer molding processes (Li et al., 2002), and the generation of activated sludge (Du et al., 1999), among others. In
addition to industrial applications and robotics, the development of inference systems also has important applications in medicine, specifically for medical diagnosis (Sanchez, 1998; Sanchez and Bartolin, 1990). This approach can prove very important to the production of high-performance materials in the electronics and pharmaceutical industries, which involves complex interactions between raw material properties, processing conditions and end-product properties.

Given its practical importance, automatic generation of inference systems from data is an important area of research. Algorithms have been developed that combine genetic algorithms (GAs) with fuzzy logic (Karr, 1991; Sugeno and Yasukawa, 1993; Wang and Mendel, 1992; Ishibuchi et al., 1992, 1995). It has been demonstrated that, for some neural networks, there is an equivalence between the neural network and a fuzzy inference system (Jang and Sun, 1993). Many algorithms exploit this concept by combining GAs, fuzzy logic and neural networks, where the rules are extracted from the trained network (De Carli et al., 1994; Zhang and Ferch, 2003; Na, 1999; Er et al., 2000; Li et al., 2002; Du et al., 1999; Simpson, 1992; Tsoukalas and Uhrig, 1997; Russo, 1998, 2000; Mastorocostas and Theocharis, 2000; Su and Chang, 2000; Chung et al., 2000). Most of the algorithms
consist of a five-layer structure, with the training process consisting of two phases. In the first phase, a GA is used to overcome the problem of getting trapped in local minima; in the second phase, back-propagation is used to improve output precision. Other methods use singular value decomposition to solve an over-specified linear system. Less attention has been paid to developing specialized algorithms for the classification problem. The algorithm developed by Ishibuchi et al. (1992, 1995) is one of the few methodologies developed specifically for classification. It consists of two phases: (1) fuzzy partition of a pattern space and (2) identification of the fuzzy rule for each subspace determined by the partition. The partition of the pattern space is based on a "multi-grid" partition. This coarse-to-fine partition generates high classification power, but it also generates a very large number of if/then rules. To alleviate this problem, a GA was utilized to reduce the number of rules while preserving high classification power (Ishibuchi et al., 1995). This method outperformed many other methods, such as fuzzy k-nearest neighbor, fuzzy c-means, and fuzzy integral with perceptron, in classifying the Iris data (Russo, 1998).
1.2. Optimization algorithms

The learning phase of fuzzy or neuro-fuzzy systems consists of solving a complex multi-modal optimization problem. When the structure of the model is also part of the solution, the learning phase becomes a complex multi-modal combinatorial optimization problem. As mentioned in the previous section, most of the works in the literature rely on GAs (Holland, 1975; Goldberg, 1989) for the training phase, due to their ability to escape from local minima and their coding capability for representing real and combinatorial variables. Other optimization algorithms, such as gradient-based methods, are not suitable for this task, since they get trapped in local minima. Evolution strategies (Schwefel, 1995), developed for real-parameter optimization, cannot be used when the structure of the problem is part of the solution. Simulated annealing is too slow, and its performance depends strongly on the cooling schedule.

In this article, LARES, a new optimization algorithm introduced by Irizarry (2004a), is used in the learning phase. This algorithm is based on a new paradigm called the artificial chemical process. LARES has the property of escaping from local minima and is also very flexible in encoding generic variables, both needed for the learning task. Its performance is very robust in terms of tuning parameters, and its operators are very simple and do not need to be modified. These properties allow LARES to be used as a black box without tuning by an expert, a desired property for application engineers. At the same time, the structure of the operators is simple enough for an expert to tailor the algorithm to a particular application. See Irizarry (2004a) for more details on the properties of the algorithm.
The main objectives of this article are (a) to develop a simple algorithm to generate a compact set of fuzzy rules for the classification problem and (b) to demonstrate that LARES' synergy with fuzzy systems can also be an efficient tool in the learning phase. This paper is organized as follows: in Section 2, the LARES algorithm is described; in Section 3, the new algorithm for pattern classification, LFC, is presented; in Section 4, the results are presented; and the conclusions are given in Section 5.
2. LARES artificial chemical process

The optimization problem: Consider the optimization problem in its most general form:

min F(θ), θ ∈ Ω,    (1)

where θ is the vector of decision variables to be determined, Ω is the feasible space of the decision variables, and F is the performance index to be minimized. No restriction is assumed on the type of variables (continuous, integer, combinatorial, non-numerical or symbolic, etc.), and there is no restriction on the type of function F. The application of LARES to solve the optimization problem consists of:

• Encoding the decision variables, θ, into a new set of variables called molecules.
• Evaluating the performance index at each iteration.

Representation of the decision vector with molecules: The possible solutions to the problem (decision variables) are encoded into a finite set of variables (x_j, j = 1, ..., V) called molecules (θ = H(x)). These molecule variables are defined over a small range of discrete values. Let Φ_j be the set of possible values that the variable x_j can take, Φ_j = {φ_1^j, ..., φ_{M_j}^j}, where M_j is the total number of possible values for molecule j. The discrete value assigned to the molecule is called its state. If the optimization includes real parameters as part of the decision vector, binary encoding similar to GA can be used, but in this case each bit represents a molecule.

Reaction or activated state: The state (value assigned to the molecule variable) of the best solution found so far is called the ground state, x_j^g. During each iteration, some of the molecules change states systematically to generate new trial vectors. This "reaction" to generate an activated state is a powerful concept, different from GA or other population-based algorithms. As will be seen in the description of the algorithm, all of the operators act using this concept, taking one molecule at a time, thus making each molecule independent of the others. This concept is very flexible and can be advantageous for certain types of problems: (1) Each molecule can have a different range of states that can be visited (Φ_j), which allows multiple encoding to be a natural part of the algorithm. This property is not shared with other
algorithms like genetic algorithms. (2) Constraints or bias (provided by problem-specific information) can be added to the states that a given molecule can visit, in order to satisfy constraints or to add knowledge-based information for the problem at hand. (3) Since the molecule variables are independent of each other, the molecule vector can be indexed in any desired way (x_i, x_ijk, etc.).

Underlying principles of the algorithm: This is an iterative improvement methodology, which considers one solution at a time. Let Z be the set of all V molecules in the system and x_j^g the state of molecule j in the best solution found so far, x^g = (x_1^g, ..., x_V^g). If this solution can be improved, then there exists A ⊆ Z in which each molecule has a new state, x_j^t = x_j^a ≠ x_j^g ∀ j ∈ A, while all other molecules remain in the state of the best solution found so far (x_j^t = x_j^g ∀ j ∈ Z\A), generating a new vector x^t such that F(x^t) < F(x^g). To find this set, the following strategy, called the artificial chemical process, is applied. The first step is to perturb the system by selecting a random set AR and assigning a new random state to each of its elements, x_j^a ≠ x_j^g ∀ j ∈ AR. If this is the desired set (AR = A), the trial state vector is accepted as the new best value found so far. If not, the following hypothesis is postulated to improve the perturbation with the hope of finding A:

H1: ∃ A ⊆ AR, in which each molecule in A has a new state, generating a new vector x^t such that F(x^t) < F(x^g).

The hypothesis H1 is tested using the following iterative procedure. In each iteration, select E molecules from AR and return their states to those of the best value found so far (x_j^t = x_j^g ∀ j ∈ E). If the new perturbation improves (by some definition of improvement) on the previous one, then AR = AR\E. Otherwise, all molecules in E are returned to AR in a new random state. This process is continued until A is found or until a predetermined termination criterion for AR is achieved. In the second case it is assumed that H1 is false, so a brand new AR is generated and H1 is tested again. Iterations continue until an algorithm termination criterion is achieved. The elements of the algorithm are:

1. Selection of the perturbation set AR by a probability distribution function (PDF) and selection of new states for each molecule in AR.
2. Selection of the extraction set E by a PDF.
3. The improvement criterion for deciding whether molecules in E are returned to AR in a new state or are separated from AR and returned to the state of the best solution found so far.

2.1. Algorithm outline

Let L, AR, E and S be four disjoint sets whose elements are the molecule variables. Let F be the objective function to be minimized, expressed in terms of the vector of molecule variables, F = F(x). Let the value of the best solution found so far be x^g = (x_1^g, ..., x_V^g). The algorithm consists of redistributing the molecule variables among the sets to generate new trial vectors, using the following rule: whenever a molecule variable x_j becomes a member of the set AR, a new value different from the best value found is assigned to it. Let x_j^a ≠ x_j^g be the new value, which will not change until the variable is moved out of the set AR. For each iteration, the trial vector x^t = (x_1^t, ..., x_V^t) is then constructed using the following formula:

x_j^t = x_j^g if x_j ∉ AR;  x_j^t = x_j^a if x_j ∈ AR.    (2)

With these definitions, the algorithm is described next.

Initialization:
1. The algorithm starts by initializing x^g randomly and placing all variables in L (set AR = E = S = ∅, L = {x_1, ..., x_V}).

Outer loop: Perturbation to form AR.
2. Select a random number |T_rx| (|T_rx| ≤ |L|) from a uniform PDF:

|T_rx| = min(int([ξ V c_o] + 1), |L|),    (3)

where ξ is uniformly distributed in (0,1) and c_o is an adjustable parameter used to select the average fraction of elements to be selected from V, the total number of molecules representing the possible solutions to the problem.
3. Select |T_rx| elements randomly from L to form the subset T_rx (T_rx ⊆ L).
4. Transfer the subset to AR: L = L\T_rx; AR = AR ∪ T_rx.
5. Select a random new value for each molecule variable in T_rx: x_j^a ≠ x_j^g ∀ j ∈ T_rx.
6. The new trial vector, x^t, is generated using Eq. (2).
7. If F(x^t) < F(x^g), the trial vector is accepted as the new best solution found, x^g = x^t. In this case, all of the elements in AR are sent to the set S: S = S ∪ AR; AR = ∅. If the algorithm termination criterion is achieved, exit the algorithm and return x^g as the solution to the optimization problem.
8. If a better solution was found in step 7, skip the inner loop and perform another outer-loop iteration: go to step 17. Otherwise, continue with step 9.
9. Initialize the parameter RP: RP = F(x^t). This parameter is used and modified in the "goodness" test in the inner loop. Also set |AR|_0 = |AR|, the initial number of molecules in AR before starting the next inner loop.

Inner loop: Iterative improvement of AR.
10. Select a random number |E| (|E| ≤ |AR|) from a prescribed PDF. The formula used is

|E| = min(int([ξ |AR|_0 c_i] + 1), |AR|),    (4)
where |AR|_0 is defined in step 9 and c_i is another adjustable parameter.
11. Select |E| elements randomly from AR to form the subset E.
12. Extract the subset E from AR: AR = AR\E.
13. The new trial vector, x^t, is generated using Eq. (2).
14. If F(x^t) < F(x^g), the trial vector is accepted as the new best solution found, x^g = x^t. All of the elements in AR are sent to the set S: S = S ∪ AR; AR = ∅. If the algorithm termination criterion is achieved, exit the algorithm and return x^g as the solution to the optimization problem.
15. Improvement criterion for AR: If F(x^t) ≤ RP, the hypothesis is that there is a high probability that most elements in E prefer to stay in their ground state (x_j = x_j^g) to generate better solutions. In this case, the elements in E are transferred to S: S = S ∪ E; E = ∅, and the metric RP is updated, RP = F(x^t). When F(x^t) > RP, the hypothesis is that there is a high probability that most elements in E will induce a better solution if they are in a state different from their ground state. In this case, a new activated state is generated for all elements in E (x_j = x_j^a ≠ x_j^g ∀ j ∈ E), and all of the elements in E are transferred back to AR (AR = AR ∪ E; E = ∅).
Check conditions to exit or continue the inner loop:
16. If one of the following conditions is satisfied, the algorithm exits the inner loop (go to step 17):
• A better solution was found in step 14.
• The number of elements in AR is less than or equal to one, |AR| ≤ 1. For some particular problems this restriction could be generalized to a generic threshold parameter (|AR| < c), but this case is not considered in this work.
• A large recycle ratio (RR), defined as

RR = n_rec / |AR|_0 ≥ RR_T,    (5)

where n_rec is a counter of the number of times that E is sent back to AR in the current inner loop and |AR|_0, defined in step 9, is the initial number of molecules in AR generated in the outer loop. The parameter RR_T is adjustable.

Otherwise, a new inner-loop iteration is started: go to step 10. Note that most of the inner loops should terminate naturally, either by a better solution or by |AR| ≤ 1. RR_T is set large enough to let most of the inner loops terminate naturally, but it should not be too large, because then many iterations would be spent on an undesired inner loop, affecting the algorithm's performance.
Check the number of elements in L and AR, and the algorithm termination criterion:
17. Check the number of elements in the set L. If the number of elements in L is below a prescribed value, LT, all of the elements in the set S are transferred: L = L ∪ S; S = ∅. If the number of elements in L is still too low (|L| ≤ LT), then all of the elements in AR are transferred to L: L = L ∪ AR; AR = ∅.
18. Start an outer-loop iteration, returning to step 2.

The following parameter values are used: RR_T = 1.0, c_o = 0.3, c_i = 0.25, LT = V/2. A flow chart of the algorithm is shown in Fig. 1.

2.2. The algorithm viewed as an artificial chemical process

This procedure has many analogies with a real chemical process. The molecule variables can be viewed as molecules that are transformed from one state to another (variable value) by a "reversible chemical reaction". First the Activation Reactor, AR, is loaded from the Load unit, L. A reaction is performed, and the undesired byproducts are separated from the product by separation processes. In this separation process, some molecules are extracted from the activation reactor and sent to the Extraction unit, E. If the Reactor Performance, RP, is improved, the extracted material is sent to the Separation unit, S. Otherwise, the material is recycled back into the activation reactor to be re-processed. These outer-inner iterations (reaction-purification) continue until the product cannot be improved any more. A schematic representation of this process is shown in Fig. 2.

Fig. 1. Flow chart of the LARES algorithm.

Fig. 2. Schematic representation of LARES.
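To make the set manipulations above concrete, the following is a minimal sketch of LARES in Python for binary molecules. The function names, the uniform random choices, and the toy usage are illustrative assumptions of this sketch; only the numbered steps and Eqs. (2)-(5) come from the outline above.

```python
import random

def lares(objective, V, iters=20000, co=0.3, ci=0.25, rrt=1.0):
    """Minimal LARES sketch for binary molecules (states {0, 1})."""
    xg = [random.randint(0, 1) for _ in range(V)]   # best solution so far
    fg = objective(xg)
    L, AR, S = list(range(V)), [], []               # molecule index sets
    xa = xg[:]                                      # activated states
    for _ in range(iters):
        # Step 17: housekeeping of L with LT = V/2.
        if len(L) < V // 2:
            L, S = L + S, []
            if len(L) < V // 2:
                L, AR = L + AR, []
        # Steps 2-6: load AR from L and activate the selected molecules.
        n = min(int(random.random() * V * co) + 1, len(L))      # Eq. (3)
        random.shuffle(L)
        Trx, L = L[:n], L[n:]
        AR += Trx
        for j in Trx:
            xa[j] = 1 - xg[j]   # new state (binary: flip the ground state)
        xt = [xa[j] if j in AR else xg[j] for j in range(V)]    # Eq. (2)
        ft = objective(xt)
        if ft < fg:             # step 7: accept and empty AR into S
            xg, fg, S, AR = xt, ft, S + AR, []
            continue            # step 8: skip the inner loop
        rp, ar0, rec = ft, len(AR), 0                           # step 9
        # Steps 10-16: inner loop, iterative improvement of AR.
        while len(AR) > 1 and rec < rrt * ar0:
            ne = min(int(random.random() * ar0 * ci) + 1, len(AR))  # Eq. (4)
            random.shuffle(AR)
            E, AR = AR[:ne], AR[ne:]   # deactivate E (back to ground state)
            xt = [xa[j] if j in AR else xg[j] for j in range(V)]
            ft = objective(xt)
            if ft < fg:                # step 14: accept, exit inner loop
                xg, fg, S, AR = xt, ft, S + AR, []
                break
            if ft <= rp:               # step 15: improvement, keep E out
                S, E, rp = S + E, [], ft
            else:                      # recycle E back to AR, reactivated
                for j in E:
                    xa[j] = 1 - xg[j]
                AR, E, rec = AR + E, [], rec + 1
    return xg, fg
```

For example, lares(sum, V=32) drives a 32-bit string toward the all-zero optimum, since the objective simply counts the bits set to one.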
3. LFC algorithm

The following algorithm, called LARES fuzzy classification (LFC), generates a fuzzy inference system for automatic pattern classification from data, using LARES as the learning tool. The hyperstructure of the algorithm is shown in Fig. 3. At each LARES iteration, the premise part is generated from the new trial vector, and the pattern data are then used to generate the conclusion part using soft computing techniques. The resulting fuzzy inference system is tested in terms of classification power on the same data to determine the objective function to be minimized by LARES. This method is an alternative approach, which generates fuzzy rules to classify patterns without the use of a grid partition, SVD or a fixed neural-network structure. The algorithm is described in detail in the following sections. In Section 3.1, the steps to generate the premise part are described. The premise part of this structure is encoded into a set of molecule variables as described in Section 3.1.1. In Section 3.1.2, a new transformation called the re-assignment transformation is discussed for handling more flexible membership functions.
Fig. 3. Hyperstructure of the LFC algorithm.

Fig. 4. Trapezoid membership function defined in Eq. (10). The parameters are the node locations, which must satisfy the constraint in Eq. (11). The RT is used to handle this type of membership function.
The conclusion part is generated from the training data as described in Section 3.2. This "trial" fuzzy algorithm is tested using the inference engine on the same pattern data set. The objective function is then constructed to maximize classification power, as described in Section 3.3. Remarks on algorithm implementation are presented in Section 3.4.

3.1. Generating the premise part using LARES

The rule-base system considered in this algorithm consists of R rules of the following generic structure:

Rule r: IF x_1 is A_1r AND x_2 is A_2r AND ... AND x_N is A_Nr THEN x belongs to C_r,    (6)
where x_i is the input for variable i to the fuzzy inference system and A_ir is the fuzzy set associated with variable i and rule r. The class C_r is the consequence of rule r. Each fuzzy set A_ir is described by a one-dimensional membership function, μ_ir(x_i). The model includes a combinatorial variable δ_ir that defines the existence of the predicate corresponding to variable i and rule r: when δ_ir = 1, variable i forms part of the premise of rule r, that is, "x_i is A_ir" is part of the premise; otherwise δ_ir is zero. There are several alternatives for the membership functions used to describe the fuzzy set. In this section, it is assumed that all variables are normalized. The membership function is of the form

μ_ir(x_i) = μ(x_i; p_ir),    (7)

where x_i is defined in the interval [0,1] and p_ir is the vector of parameters that define the membership function. The triangular membership function is

μ(x_i; p_ir) = max{1 − |x_i − a_ir|/b_ir, 0}.    (8)

The grade of membership defined by this equation is 1 when x_i = a_ir and positive in the open interval (a_ir − b_ir, a_ir + b_ir). A Gaussian membership function is another two-parameter model that can be used,

μ(x_i; p_ir) = g(c_ir, a_ir) ≡ exp(−(x_i − c_ir)²/a_ir²),    (9)

where c_ir is the center of the distribution and a_ir is the width. A more generic membership function is the trapezoid function defined by four parameters:

μ(y; p_ir) = 0 if y ≤ a_ir,
μ(y; p_ir) = (y − a_ir)/(b_ir − a_ir) if a_ir < y < b_ir,
μ(y; p_ir) = 1 if b_ir ≤ y ≤ c_ir,
μ(y; p_ir) = (d_ir − y)/(d_ir − c_ir) if c_ir < y < d_ir,
μ(y; p_ir) = 0 if y ≥ d_ir,    (10)

where

0 ≤ a_ir ≤ b_ir ≤ c_ir ≤ d_ir ≤ 1.    (11)

The four parameters, a_ir, b_ir, c_ir and d_ir, locate the positions of the four nodes as shown in Fig. 4. In this case, the grade of membership is equal to 1 in the closed interval [b_ir, c_ir] and positive in the open intervals (a_ir, b_ir) and (c_ir, d_ir). This membership function is very flexible, since the non-fuzzy part is explicitly represented and the fuzzy part is not limited to being symmetric. For some applications, the relative importance of the variables must be weighted (Sanchez, 1998). In this model, the membership function is modified by α-cut functions,

μ̃(x_i; p_ir) = max(α_ir, μ(x_i; p_ir)),    (12)

where α_ir is defined between zero and one. A value of one indicates that the variable does not contribute to the premise.
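For illustration, the membership functions of Eqs. (8)-(10) and the α-cut of Eq. (12) can be coded directly. This is a sketch; the function names are assumptions.

```python
import math

def triangular(x, a, b):
    # Eq. (8): grade 1 at x = a, positive on the open interval (a - b, a + b).
    return max(1.0 - abs(x - a) / b, 0.0)

def gaussian(x, c, a):
    # Eq. (9): center c, width a.
    return math.exp(-((x - c) ** 2) / a ** 2)

def trapezoid(x, a, b, c, d):
    # Eq. (10): grade 1 on [b, c], linear flanks on (a, b) and (c, d).
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def alpha_cut(mu, alpha):
    # Eq. (12): a variable with alpha = 1 drops out of the premise.
    return max(alpha, mu)
```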
For a given input pattern vector x, the fire strength of the precedent of a fuzzy rule r is given by

μ_r(x) = ∏_{i=1..N, δ_ir=1} μ(x_i; p_ir),    (13)

when the multiplication operator is utilized. Alternatively, when the min operator is utilized, the fire strength of rule r is given by

μ_r(x) = min_{i=1..N, δ_ir=1} μ(x_i; p_ir).    (14)

Notice that the combinatorial variable δ_ir determines which variables form part of the premise.

3.1.1. Encoding the premise using molecule variables

The decision variables in LARES are the parameters p_ir, which define the membership functions μ_ir(x_i) = μ(x_i; p_ir), and the combinatorial variables δ_ir. There are P·N·R parameters and N·R pointer variables, where P is the number of adjustable parameters for the membership function utilized. For each real parameter, K molecule variables are used, each of which can have two possible state values (0 and 1). Binary encoding is then used to map the states of the molecules to the value of the real parameter. For each pointer δ_ir, a molecule variable is used whose value equals the value of the pointer, where the molecule is the one assigned to represent the value of δ_ir.
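A sketch of the binary decoding for one real parameter, assuming a K-bit uniform quantization of [0, 1] (the paper states only that GA-style binary encoding is used):

```python
def decode_real(bits):
    # Map the states of K binary molecules to a real parameter in [0, 1].
    k = len(bits)
    value = sum(b << i for i, b in enumerate(bits))
    return value / (2 ** k - 1)

# Example: 8 molecules per parameter give a resolution of 1/255.
print(decode_real([1, 0, 1, 1, 0, 0, 1, 0]))  # -> 0.3019...
```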
3.1.2. Encoding non-symmetrical membership functions

When the trapezoid membership function is utilized, the constraint in Eq. (11) adds an additional difficulty in using LARES, as it would be violated frequently if the parameters (a_ir, b_ir, c_ir, d_ir) were used directly as decision variables in the optimization problem. To avoid the complexity of this constraint, a two-step encoding strategy (Irizarry, 2004b) is utilized so that the constraint is satisfied in all iterations. To encode this type of membership function, instead of using the parameters (a_ir, b_ir, c_ir, d_ir) as part of the decision variables, a set of "computational" parameters (z_ir,1, z_ir,2, z_ir,3, z_ir,4), also defined in [0,1] × [0,1] × [0,1] × [0,1], is used as decision variables. LARES samples in the computational domain, and a transformation then determines the original problem parameters. Mapping from the computational space to the parameter space is accomplished by first sorting the computational parameters in ascending order (Bean, 1994):

z_ir,s(1) ≤ z_ir,s(2) ≤ z_ir,s(3) ≤ z_ir,s(4),    (15)

where s(n) is the variable of the computational space with the nth rank in the sorted list. The parameters of the membership function are then set to the parameters of the computational space with the corresponding index:

a_ir = z_ir,s(1), b_ir = z_ir,s(2), c_ir = z_ir,s(3), d_ir = z_ir,s(4).    (16)

This transformation is called the re-assignment transformation (RT).
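The RT thus reduces to a single sort; for instance:

```python
def reassignment_transform(z):
    # Eqs. (15)-(16): sorting the computational parameters guarantees the
    # node ordering a <= b <= c <= d required by Eq. (11).
    a, b, c, d = sorted(z)
    return a, b, c, d

print(reassignment_transform([0.9, 0.2, 0.5, 0.7]))  # -> (0.2, 0.5, 0.7, 0.9)
```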
3.2. Procedure to determine the consequent part for classification

Given the antecedents of the fuzzy rule system from the LARES trial vector, the consequent part is generated using soft computing techniques (Ishibuchi et al., 1992). This is a very effective and rapid method for calculating the conclusion for the classification problem. The training data consist of T patterns. Each pattern belongs to one of the M possible classes. Each pattern p in the training set consists of N variables, x^p = (x_1^p, ..., x_N^p), and the corresponding class is C_p. It is assumed that all variables are normalized. The procedure for determining the conclusion of a rule is as follows: for each rule r, calculate the cumulative firing power of all patterns for each class C_i,

β_Ci = Σ_{p ∈ Ci} μ_r(x^p).    (17)

The class with the largest firing strength is the consequence of the rule:

C_r = C_b such that β_Cb = max{β_C1, ..., β_CM}.    (18)

If there is more than one class with the largest firing strength, the consequence cannot be resolved and the rule is eliminated from the inference engine. Similarly, if β_Cb is zero, there are no data in the partition covered by the premise and the rule is also eliminated from the inference engine. Let the segregation parameter be defined by

S_r ≡ β_Cb / Σ_{i=1..M} β_Ci.    (19)

Although this parameter does not form part of the fuzzy inference system, it will be utilized in the objective function to further manipulate the fuzzy partition.

Inference mechanism: Given the rule set, to classify a new pattern x^t, the inference mechanism consists of finding the rule with maximum fire strength:

μ_r̄(x^t) = max{μ_r(x^t) | rule r exists}.    (20)

Then the classification of x^t is the consequence of rule r̄, C_r̄. If there is more than one class with the maximum μ, the pattern is unclassified.
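The following sketch assembles Eqs. (13)-(20). The rule representation, one membership callable per variable with None standing for δ_ir = 0, is an assumption of this sketch; it composes directly with the membership functions sketched in Section 3.1.

```python
import math

MULT = math.prod   # multiplication operator of Eq. (13); min gives Eq. (14)

def fire_strength(rule, x, op=min):
    # A rule is a list with one membership callable per variable,
    # with None standing for delta_ir = 0 (variable absent from premise).
    grades = [mf(xi) for mf, xi in zip(rule, x) if mf is not None]
    return op(grades) if grades else 0.0

def rule_consequents(rules, patterns, labels, n_classes, op=min):
    # Eqs. (17)-(19): cumulative firing power per class fixes each rule's
    # conclusion C_r; a tie or an empty partition eliminates the rule.
    out = []
    for rule in rules:
        beta = [0.0] * n_classes
        for x, c in zip(patterns, labels):
            beta[c] += fire_strength(rule, x, op)
        top = max(beta)
        if top == 0.0 or beta.count(top) > 1:
            out.append((None, 0.0))                         # rule eliminated
        else:
            out.append((beta.index(top), top / sum(beta)))  # (C_r, S_r)
    return out

def classify(rules, consequents, x, op=min):
    # Eq. (20): the surviving rule with maximum fire strength decides;
    # a tie between different classes leaves the pattern unclassified.
    scored = [(fire_strength(rule, x, op), c)
              for rule, (c, _) in zip(rules, consequents) if c is not None]
    if not scored:
        return None
    best = max(s for s, _ in scored)
    winners = {c for s, c in scored if s == best}
    return winners.pop() if len(winners) == 1 else None
```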
3.3. Objective function for the classification problem

To evaluate the performance of the trial inference system at each LARES iteration, the inference system is applied to the training data. Let N_fail be the number of patterns that were misclassified or not classified. The objective function to be minimized by LARES is of the form

F = N_fail + w Σ_{r=1..R} (1 − S_r)²,    (21)

where w is a weight factor. The second term is used to reduce conflict within a given rule so as to improve the extracted knowledge. This second term is very important when it is desirable to determine which regions of the input space belong to one class with 100% accuracy.
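Eq. (21) is then a one-line computation once N_fail and the S_r values are available (a minimal sketch; the experiments of Section 4 use w = 1):

```python
def objective_value(n_fail, segregations, w=1.0):
    # Eq. (21): misclassified/unclassified count plus rule-conflict penalty.
    return n_fail + w * sum((1.0 - s) ** 2 for s in segregations)
```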
3.4. Remarks on algorithm implementation

To implement the algorithm, the only information needed is the training patterns, x^p = (x_1^p, ..., x_N^p; C_p), p = 1, ..., N_p. The user-specified parameters are (a) the number of rules, R, (b) the type of structure identification, and (c) the type of membership function and operator. The number of rules is the most important parameter of the algorithm. Some remarks on these tuning parameters follow.

(a) Number of rules: This is the most important parameter of the algorithm. The optimal number is found by starting with a low value of R and generating the inference system using LFC. The procedure is repeated with an increased number of rules, R = R + ΔR (where ΔR is a small number; a value of one is recommended), until there is no further improvement in classification rate from increasing the number of rules; a sketch of this continuation loop is given at the end of this section. All other user-specified parameters are held constant during this continuation process. This approach is efficient in finding a compact set of rules, whereas starting with a large value of R risks over-specifying the system. As an example, the Iris data discussed later give the following performance: (R, classification rate) = (2, 50%), (3, 98%), (4, 98%), (5, 99.3%) when the triangular membership function with the min operator is used. When the trapezoid membership function is utilized, the classification power increases to greater than 99% with four rules and 100% with five rules. In this case an inference system of 4-5 rules is selected.

(b) Type of structure identification: The default is to have the combinatorial variables δ_ir active (part of the solution of the optimization problem), in order to identify the structure of the fuzzy rules. When Eq. (13) is used, the combinatorial variables can be eliminated from the set of decision variables, since the α-cut will eliminate the terms that do not contribute to the rule. When both are disabled, it is implied that all variables are
important in each rule, which is not recommended for systems with many variables.

(c) Membership function and operator: As a default, the triangular membership function with the min operator can be utilized to simplify interpretation of the fuzzy system. The trapezoid membership function can be selected if the user wishes to separate the non-fuzzy parts of the domain from the fuzzy parts. This second alternative makes the learning phase slower, due to the twofold increase in parameters for a given problem and the additional nonlinearity introduced by the transformation of Section 3.1.2.

As discussed earlier, LARES is very robust, and its optimization parameters are fixed with no further manipulation. The weight factor in Eq. (21) is set to zero by default; if the user desires each rule to be as independent from the others as possible, this parameter can be set to unity.
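A sketch of the continuation loop on R described in remark (a); train_lfc is a hypothetical callable that runs LFC with R rules and returns the trained system and its classification rate:

```python
def find_compact_rule_base(train_lfc, r_start=2, dr=1):
    # Grow R until the classification rate stops improving, then return
    # the last system that did improve (the compact rule base).
    best_rate, best_system, r = -1.0, None, r_start
    while True:
        system, rate = train_lfc(r)    # one full LFC run with R = r rules
        if rate <= best_rate:
            return best_system, best_rate
        best_system, best_rate, r = system, rate, r + dr
```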
4. Test results

The performance of the LFC procedure is tested on classification benchmark problems. The Iris data have been used extensively to test new algorithms; see Wang and Mendel (1992) and references therein. The second set of benchmark problems consists of a set of non-linear functions used by Ishibuchi et al. (1992) to test their algorithm. The algorithm is also applied to potential applications in chemical engineering: the first application considers a fault detection problem in a chemical reactor, and the second involves classification of areas of stable operation vs. unstable operation of a reactor.

4.1. Iris data

The Iris data (Fisher, 1936) consist of four inputs describing three possible classes: Iris setosa, virginica, and versicolor. The input variables are sepal length in cm (in the range [4.3, 7.9]), sepal width in cm (in the range [2.0, 4.4]), petal length in cm (in the range [1.0, 6.9]), and petal width in cm (in the range [0.1, 2.5]). The best results found for this type of problem are those of Ishibuchi et al. (1995), with high classification power using 13 rules, and Russo (1998, 2000), with high classification power using five rules. The test results using LFC are summarized in Table 1.

Table 1. Pattern classification for the Iris data

Type of operator   Membership function   Number of rules   % classified
Mult.              Trapezoid             5                 100
Mult.              Triangular            5                 100
Mult.              Gaussian              5                 99.3
Min                Trapezoid             5                 99.3
Min                Trapezoid             4                 99.3
Fig. 5. Fuzzy rules for the Iris problem, using five rules with the triangular membership function and the multiplication operator.
Fig. 6. Inference system for the Iris data, using four rules with the trapezoid membership function and the min operator.
In all simulations w was set to unity, and the maximum number of iterations was 300,000, although in most cases the classification was above 99% within 3000-30,000 iterations. With only five rules, the algorithm classifies the Iris data very effectively, and with four rules 99.3% of the data are classified correctly. Fig. 5 shows the inference system using five rules, with the triangular membership function and the multiplication operator. Fig. 6 shows the resulting inference system with only four rules, using the trapezoid membership function and the min operator. This system is very compact and easy to analyze; it is probably one of the most compact inference systems developed in the literature for the Iris data.

The effect of adding the second factor to the objective function (w = 1) is shown in Fig. 7. When w = 1, conflict between rules is minimized while high classification power is preserved. Rules with S_r = 1, in addition to forming part of the inference system, can be examined separately, since they define regions in the pattern space that are not fuzzy and represent only one class.

The effect of different parameters on the performance of LFC for the Iris data is shown in Fig. 8. The algorithm is robust in terms of parameter selection. At the end of the run (300,000 iterations), all cases reach a classification power of 99.3% or higher. For the triangular membership function the classification power is above 99% within the first 10,000 iterations, while for the trapezoid membership function it is above 98% within the first 20,000 iterations.
Fig. 7. The effect of S_r in the objective function. When w = 1, three out of five rules represent a given class with no other class included in the partition described by the rule. These rules can be analyzed in an isolated manner, in addition to forming part of the inference system. (Axes: segregation coefficient S_r vs. rule number.)

Fig. 8. LFC best-so-far curves as a function of different parameters for the Iris classification problem (first 30,000 iterations with R = 5): (a) mf = triangular, operator = min, opt = combinatorial; (b) mf = triangular, operator = mult, opt = combinatorial; (c) mf = trapezoid, operator = min, opt = combinatorial; (d) mf = trapezoid, operator = min, opt = combinatorial; (e) mf = triangular, operator = mult, opt = combinatorial disabled; (f) mf = triangular, operator = mult, opt = α-cut variable included.
4.2. Pattern classification of non-linear functions

The following functions are used to generate a partition of the space [0,1] × [0,1] into two classes:

f_1(x) = −sin(2πx_1)/4 + x_2 − 0.5,    (22a)
f_2(x) = −sin(2πx_1)/3 + x_2 − 0.5,    (22b)
f_3(x) = −sin(2πx_1 − π/2)/3 + x_2 − 0.5,    (22c)
f_4(x) = −|−2x_1 + 1| + x_2,    (22d)
f_5(x) = (x_1 + x_2 − 1)(−x_1 + x_2),    (22e)
f_6(x) = −(x_1 − 0.5)²/0.4² + (x_2 − 0.5)²/0.3² + 1,    (22f)
f_7(x) = −(x_1 − 0.5)²/0.15² + (x_2 − 0.5)²/0.2² + 1.    (22g)
Classes 1 and 2 are defined by y = sign(f(x)). These functions were used by Ishibuchi et al. (1992) to test their algorithm. For each function, 10 problems are generated; for each problem, the training data consist of 50 random patterns per class and the test data consist of 50 random patterns per class. Ishibuchi et al. (1992) solved this problem using the multi-grid method. With four rules, the average classification power over the 70 instances was 68.4%; it increased to 91% with 29 rules, to 94.6% with 203 rules, and to 94.2% with 1014 rules.

The results using LFC are presented in Table 2, which gives the average results over the 70 instances and shows the effect of the number of rules and of different parameters. These results demonstrate very good classification power for training and test data with a very compact set of rules. With only two rules, the training data can be classified with 98% accuracy and the test data with 94.3% accuracy; with five rules, these figures improve to 99.5% and 96%, respectively. In these simulations, the combinatorial variables δ_ir were part of the decision variables to be determined by LARES, with a termination criterion of 20,000 iterations. In many cases, the optimal solution was found in fewer than 6000 iterations. The objective function included the segregation term with w = 1.0.

Table 2. Pattern classification for the non-linear function benchmark problem (average over the 70 instances)

Type of operator   Membership function   Number of rules   Training data (%)   Test data (%)
Mult.              Gaussian              2                 98.0                94.3
Mult.              Gaussian              3                 99.1                95.2
Mult.              Gaussian              4                 99.1                95.3
Mult.              Gaussian              5                 99.5                96.0
Min                Gaussian              5                 99.4                94.5
Mult.              Triangular            5                 99.2                94.7
Min                Triangular            5                 99.3                94.5
Fig. 9. Fault detection problem rules.
4.3. Fault detection in a chemical reactor
Fault detection has been identified as a very important area of research for control engineers. The cost of abnormal event management (AEM) in the process industries is on the order of billions of dollars per year, as discussed in Venkatasubramanian et al. (2003) and references therein. The first step in AEM is the diagnosis step, which can be viewed as a classification problem: the input patterns are the process variables, and the classification is the type of fault. The reactor studied in Venkatasubramanian et al. (1990) is used as a prototype fault detection problem. This problem was also analyzed in Agrawal et al. (2003) using support vector machines; it is described in detail in Venkatasubramanian et al. (1990) and Agrawal et al. (2003) and is not repeated here. The pattern space consists of six measured variables, and the case study consists of finding an inference system that can differentiate between a single fault and a double fault.

Fig. 9 shows the resulting inference system when the trapezoid membership function is utilized with the min operator. The combinatorial variables were included in the decision vector, and the objective function included the segregation term with w set to unity. The data were classified 100% correctly with five rules. Table 3 shows different combinations of operators and membership functions; for some combinations, all of the data can be classified with four rules.

Table 3. Pattern classification for the fault detection problem

Type of operator   Membership function   Number of rules   % classified
Min                Trapezoid             4                 92
Min                Trapezoid             5                 100
Mult               Trapezoid             4                 100
Min                Triangular            4                 100
Mult               Triangular            4                 100
4.4. Chemical processes with complex dynamics

Chemical processes can have regions of unstable operation that can generate runaway reactions with large fluctuations in temperature and concentration. In some applications, avoiding those conditions is very important from a quality and safety point of view. This problem can also be viewed as a classification problem in which we want to distinguish between conditions for unstable operation and conditions for stable operation. To simulate such chemical processes, consider the CSTR with a first-order irreversible reaction studied by Uppal et al. (1974, 1976). This system may present steady-state, multiple steady-state, oscillatory, or chaotic solutions, depending on the operating conditions. The system is modeled by two differential equations for the mass and energy balances.
Fig. 10. Example of (a) steady-state behavior and (b) oscillatory behavior for the CSTR problem.
Mass balance:

dy/dτ = −y + Da(1 − y) exp[θ/(1 + θ/γ)].    (23)

Energy balance:

dθ/dτ = −θ + Da·B(1 − y) exp[θ/(1 + θ/γ)] − β(θ − θ_c),    (24)

where y is the reactant conversion, θ is the dimensionless temperature, Da is the Damköhler number, defined as the ratio of the residence time to the characteristic reaction time, γ is the dimensionless activation energy, B is the dimensionless heat of reaction, β is the dimensionless heat transfer coefficient and θ_c is the dimensionless coolant temperature. The numerical solutions can be classified into steady-state solutions as class one and unstable solutions as class two. Simulated process data are generated by sampling the parameter space uniformly and integrating Eqs. (23) and (24) until τ = 100.
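A sketch of this data-generation step, assuming SciPy's solve_ivp, the initial condition (y, θ) = (0, 0), γ fixed at 20, and a simple tail-amplitude test for class two; none of these details are reported in the text:

```python
import numpy as np
from scipy.integrate import solve_ivp

def cstr_rhs(t, s, Da, B, beta, gamma=20.0, theta_c=0.0):
    # Eqs. (23)-(24): dimensionless mass and energy balances.
    y, theta = s
    rate = Da * (1.0 - y) * np.exp(theta / (1.0 + theta / gamma))
    return [-y + rate, -theta + B * rate - beta * (theta - theta_c)]

def is_class_two(Da, B, beta, t_end=100.0, tol=1e-3):
    # Class two (unstable) when the conversion still fluctuates near the
    # end of the run; otherwise the trajectory has settled (class one).
    sol = solve_ivp(cstr_rhs, (0.0, t_end), [0.0, 0.0],
                    args=(Da, B, beta), dense_output=True, rtol=1e-8)
    tail = sol.sol(np.linspace(0.9 * t_end, t_end, 200))[0]
    return tail.max() - tail.min() > tol
```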
Examples of class-one and class-two operation are shown in Fig. 10. The pattern space consists of (Da, β, B): Da varies in the interval [0, 0.2], B in the interval [15, 25], and β in the interval [1.5, 3.0]. The parameter θ_c is kept constant and equal to zero in all simulations. 100 samples were generated for the training data and 100 samples for the test data.

Table 4 shows the results for four study cases. In the first experiment, the combinatorial variables were not included as part of the decision variables (all terms form part of the premise), while in the last three experiments the combinatorial variables formed part of the decision variables. Using 10 rules the algorithm was able to classify the training
data with 99-100% accuracy and the test data with 91-95% accuracy.

Table 4. Pattern classification for the non-linear reactor inference system

Type of operator   Membership function   Number of rules   Training data (%)   Test data (%)   Combinatorial variables
Mult.              Gaussian              10                100                 92              No
Mult.              Gaussian              12                100                 95              Yes
Mult.              Gaussian              14                99                  91              Yes
Mult.              Triangular            12                100                 94              Yes

4.5. Synergy of LARES with fuzzy logic

This work demonstrates that there is good synergy between fuzzy logic and LARES in finding a compact set of rules, making the approach an alternative to more classical GA approaches, which have also been shown to be robust for this application. LARES' particular properties can be very important in certain applications. For example, if the feature space has attribute variables in addition to real parameters, LARES' multiple-coding capability can be used without any need to tailor the algorithm, which is not the case with GA. Another advantage of LARES is that it reaches the neighborhood of the global optimum rapidly, which can be very efficient when combined with a hill-climbing algorithm in a second phase (a back-propagation type of algorithm) to find the global optimum.

To study LARES' performance on other fuzzy models, consider the classification model of Ishibuchi et al. (1992). This model produces a large number of fuzzy rules, which increases exponentially with the number of variables. In Ishibuchi et al. (1995), a combinatorial optimization problem was postulated to reduce the number of rules while keeping the high classification power of the original model. A GA was used to solve this optimization problem with the following parameters: population size = 5, 10, 50; one-point mutation with mutation rate = 0.1, 0.01, 0.001, 0.0001; fitness-proportional selection; one-point crossover; and elitism, where the best individual passes to the next generation. The number of rules for the original algorithm applied to the Iris data with no optimization was 683. Depending on the mutation rate and population size selected, the optimal number of reduced rules varied from 18 (N_p = 5, P = 0.001) to 281 (N_p = 50, P = 0.1). The GA was improved by introducing a specialized mutation operation tailored to this particular problem, and the optimal number of rules found was 13, with classification rates over 99%. The same problem was solved using LARES, reducing the system to 11 rules in 13,000 iterations (see Fig. 11), better than the best result found with the tailored GA in Ishibuchi et al. (1995). Notice that when the LFC algorithm was used, four rules were enough to describe 99.3% of the data.

Fig. 11. Decrease in the number of rules when using LARES to solve the Ishibuchi et al. (1995) combinatorial optimization problem applied to the Iris data. After 13,000 iterations the system was described with 11 rules.

5. Conclusion

LFC is a new algorithm developed to generate a compact set of fuzzy classification rules from data. Numerical experiments demonstrate that the algorithm is direct, flexible, robust and efficient. LARES, used here for the first time to train fuzzy systems, demonstrated its utility for this learning task. LARES has the capability to escape from local minima, reaching a near-global solution
effectively. Also, the coding flexibility of LARES helps to define the premise in terms of combinatorial and real variables in order to achieve a better solution. For partitions where flexible membership functions are needed, the generalized trapezoid function with the re-assignment transformation has been introduced. This transformation allows the use of these types of functions in LFC without any tailoring of LARES, and the approach can also be used for more complex membership functions that have collocation points as part of the parameters to be determined. As shown in the numerical experiments, LFC presents a good balance between input-output mapping precision and linguistic interpretation. This algorithm will have applications in areas of the chemical process industries that need to classify patterns, such as diagnosis of possible process faults, detection of unstable operation, and allocation of raw-material types and process conditions for product quality. The algorithm eliminates several problems presented by other algorithms: (1) exponential increase in the number of rules with the number of variables; (2) exponential increase in computational load with the number of rules; (3) the requirement that all variables form part of every rule; and (4) the need for two-phase learning and/or tailoring of the optimization algorithm.

References

Agrawal, M., Jade, A.M., Jayaraman, V.K., Kulkarni, B.D., 2003. Support vector machines: a useful tool for process engineering applications. Chemical Engineering Progress, January, pp. 57-62.
Bean, J.C., 1994. Genetic algorithms and random keys for sequencing and optimization. ORSA Journal on Computing 6 (2), 154-160.
Chung, I.F., Lin, C.J., Lin, C.T., 2000. A GA-based fuzzy adaptive learning control network. Fuzzy Sets and Systems 112, 65-84.
De Carli, A., Liguori, P., Marroni, A., 1994. A fuzzy-PI control strategy. Control Engineering Practice 2, 147-153.
Du, Y.D., Tyagi, R.D., Bhamidimarri, R., 1999. Use of fuzzy neural-net model for rule generation of activated sludge process. Process Biochemistry 35, 77-83.
Er, M.J., Liao, J., Lin, J., 2000. Fuzzy neural networks-based quality prediction system for sintering process. IEEE Transactions on Fuzzy Systems 8, 314–324. Fisher, R.A., 1936. The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179–188. Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, MA. Holland, J.H., 1975. Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, MI. Irizarry, R., 2004a. LARES: an artificial chemical process approach for optimization. Evolutionary Computation Journal 12 (4). Irizarry, R., 2004b. Solution of optimal control problems using LARES: an artificial chemical process for optimization. Chemical Engineering Science, submitted for publication. Ishibuchi, H., Nozaki, K., Tanaka, H., 1992. Distributed representation of fuzzy rules and its application to pattern classification. Fuzzy Sets and Systems 52, 21–32. Ishibuchi, H., Nozaki, K., Yamamoto, N., Tanaka, H., 1995. Selecting fuzzy if–then rules for classification problem using genetic algorithms. IEEE Transactions on Fuzzy Systems 3, 260–270. Jang, J.S., Sun, C.T., 1993. Functional equivalence between radial basis function networks and fuzzy inference systems. IEEE Transactions on Neural Networks 4, 156–159. Karr, C., 1991. Applying genetics to fuzzy logic. AI Expert 3, 39–43. Li, E., Li, J., Yu, J., 2002. A genetic neural fuzzy system-based quality prediction model for injection process. Computers and Chemical Engineering 26, 1253–1263. Mastorocostas, P., Theocharis, J., 2000. FUNCOM: a constrained learning algorithm for fuzzy neural networks. Fuzzy Sets and Systems 112, 1–26. Na, M.G., 1999. Application of a genetic neuro-fuzzy logic to departure from nucleate boiling protection limit estimation. Nuclear Technology 128 (3), 327–340. Russo, M., 1998. FuGeNeSys—a fuzzy genetic neural system for fuzzy modeling. IEEE Transactions on Fuzzy Systems 6, 373–388. Russo, M., 2000. Genetic fuzzy learning. IEEE Transactions on Evolutionary Computation 4, 259–273.
Sanchez, E., 1998. Fuzzy logic and inflammatory protein variations. Clinica Chimica Acta 270 (1), 31-42.
Sanchez, E., Bartolin, R., 1990. Fuzzy inference and medical diagnosis, a case study. Journal of Biometric Fuzzy System Assessment 1, 4-21.
Schwefel, H.P., 1995. Evolution and Optimum Seeking. Wiley, New York.
Simpson, P.K., 1992. Fuzzy min-max neural networks. Part 1: classification. IEEE Transactions on Neural Networks 3, 776-786.
Su, M.C., Chang, H.T., 2000. Application of neural networks incorporated with real-valued genetic algorithms in knowledge acquisition. Fuzzy Sets and Systems 112, 85-97.
Sugeno, M., Yasukawa, T., 1993. A fuzzy-based approach to qualitative modeling. IEEE Transactions on Fuzzy Systems 1, 7-31.
Tsoukalas, L.H., Uhrig, R.E., 1997. Fuzzy and Neural Approaches in Engineering. Wiley, New York.
Uppal, A., Ray, W.H., Poore, A.B., 1974. On the dynamic behavior of continuous stirred tank reactors. Chemical Engineering Science 29, 967-985.
Uppal, A., Ray, W.H., Poore, A.B., 1976. The classification of the dynamic behavior of continuous stirred tank reactors: influence of reactor residence time. Chemical Engineering Science 31, 205-214.
Venkatasubramanian, V., et al., 1990. Process fault detection and diagnosis using neural networks. 1. Steady-state processes. Computers and Chemical Engineering 14, 699-712.
Venkatasubramanian, V., Rengaswamy, R., Yin, K., Kavuri, S.N., 2003. A review of process fault detection and diagnosis. Part I: quantitative model-based methods. Computers and Chemical Engineering 27, 293-311.
Wang, L.X., Mendel, J.M., 1992. Generating fuzzy rules by learning from examples. IEEE Transactions on Systems, Man and Cybernetics 22, 1414-1427.
Zhang, J., Ferch, M., 2003. Extraction and transfer of fuzzy control rules for sensor-based robotic operations. Fuzzy Sets and Systems 134, 147-167.