A new method for designing neuro-fuzzy systems for nonlinear modelling with interpretability aspects


Neurocomputing 135 (2014) 203–217

Contents lists available at ScienceDirect

Neurocomputing journal homepage: www.elsevier.com/locate/neucom

K. Cpałka (principal corresponding author), K. Łapa (corresponding author), A. Przybył, M. Zalasiński

Częstochowa University of Technology, Institute of Computational Intelligence, Poland

Article info

Article history: Received 7 August 2013; received in revised form 12 December 2013; accepted 13 December 2013; communicated by R. Tadeusiewicz; available online 21 January 2014.

Keywords: Nonlinear modelling; Neuro-fuzzy systems; Evolutionary algorithms; Hierarchical clustering; Interpretability

Abstract

In this paper we propose a new approach to nonlinear modelling. It uses the capabilities of the so-called flexible neuro-fuzzy systems and evolutionary algorithms. The aim of our method is not only to achieve appropriate accuracy of the model, but also to make the knowledge within it interpretable. This was achieved by, among others, an appropriate selection of the criteria applied during the evolutionary creation of the model. The method allows interpretable fuzzy rules to be extracted in cases which use learning data, e.g. from identification. The possibility of interpreting the knowledge accumulated in the model seems to be important in practice, because it guarantees predictable operation and facilitates the design of efficient and accurate control methods. Our method was tested with the use of well-known simulation problems from the literature. © 2014 Elsevier B.V. All rights reserved.

0925-2312/$ - see front matter © 2014 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.neucom.2013.12.031

Corresponding authors: K. Cpałka (principal corresponding author), K. Łapa.

1. Introduction

The analysis of technical issues aims at finding and understanding the essence of a problem, that is, at creating a model. The motivation is the desire to ensure predictability, which guarantees safety, decreases costs and ensures control. The goal is not only an accurate model that is able to work in real time, but also a model that is interpretable (data mining). Knowledge of the model facilitates the design of efficient and accurate controllers for the modelled process. The following approaches to modelling are considered in the literature:

• White-box model: This approach uses a phenomenological (theoretical) description of physical phenomena. For dynamic modelling it is presented as differential equations. Most commonly an algebraic form is implemented with the use of the theory of state variables, which is a well-known topic in control theory [11]. It is worth mentioning that a phenomenological model is interpretable but not necessarily accurate enough. This stems from simplifying assumptions or insufficient knowledge of the modelled phenomena. Models of electromechanical drives are an example [45]. In such implementations the simplifying assumptions mainly consist of (a) assuming symmetry and linearity, (b) idealizing the characteristics of actuators and sensors, and (c) neglecting saturation of the magnetic circuit. In order to improve the quality of such a model, compensation of the influence of selected phenomena (e.g. a nonuniform magnetic circuit) or extension of the mathematical description by a description of physical phenomena in the cooperating components has been proposed [4].

• Black-box model: In this approach the behaviour of the object is recreated on the basis of observed cause and effect dependencies. The parameters of a standard (usually very complex) model are tuned to data derived from observation of the object. In such a case it is theoretically possible to obtain high accuracy of the model. However, in this type of modelling the interpretation of the model is often impossible. This stems from the characteristics of the methods which support this type of modelling. These methods include, among others, systems of computational intelligence such as neural networks [21,41,42,55,58]. Neural networks are able to learn from data derived from observation of the object. Unfortunately, the knowledge accumulated in neural networks is not interpretable.

• Grey-box model: This approach is based on a model structure derived from known laws, with parameters tuned to the data that define the behaviour of the object. The aim is a model that reproduces the mapping while keeping the knowledge accumulated within it interpretable. These methods include, among others, hybrid solutions and systems of computational intelligence such as fuzzy systems and neuro-fuzzy systems. Special attention should be paid to neuro-fuzzy systems, which are considered in this work. These systems unite the advantages of both neural networks and fuzzy systems [15,30,49,52,54,60]. The knowledge accumulated in fuzzy systems is described by if–then rules, formulated by an expert, which are easy to read. Neuro-fuzzy systems combine the learning ability of neural networks with the easily readable knowledge representation of fuzzy systems. Thanks to this combination, neuro-fuzzy systems are well suited to nonlinear modelling. It is worth mentioning that the use of a neuro-fuzzy system for nonlinear modelling does not by itself ensure interpretability of the knowledge accumulated within it. The reasons include (a) the use of a large number of fuzzy rules, (b) the use of a large number of antecedents and consequents of the rules, (c) overlapping of the fuzzy sets, etc.

As already mentioned, the grey-box model also includes hybrid solutions. These solutions combine, for example, fuzzy sets theory and fuzzy systems theory with the state variables theory (for dynamic modelling). In this context the following approaches are known: (a) approaches based on the sector-nonlinearity method and (b) approaches based on nonlinearity of selected parameters of a linear model (described by fuzzy rules). The first approach identifies sectors which are the basis for local linear approximation of the object at different operating points. The state variable technique then describes the (locally linear) behaviour of the whole object in the selected sector. This description is based on well-known methods of control theory which apply to linear models, and this is their big advantage. It should also be mentioned that the interpretation of the knowledge accumulated in such a model is difficult. Even the use of fuzzy rules does not change this fact [25].
The if–then fuzzy rules describe only the way of switching between linear models, while the reason for switching the sector is hard to substantiate. In the second approach the interpretation of the accumulated knowledge is somewhat easier. This stems from the fact that the if–then fuzzy rules describe the change of the values of selected coefficients of the linear model in reaction to a change in the input data values, not a change of the whole model [3,34,44].

There is still a search for nonlinear modelling methods which are characterized by good accuracy and by the possibility of interpreting the knowledge accumulated within them. The interpretability issue is much harder in nonlinear modelling than in classification, because in a modelling system the exact value of the output signal matters. Each limitation put upon the system structure to increase interpretability has a negative effect on the accuracy (and vice versa). On the other hand, an attempt to directly interpret the accumulated knowledge, for example in a fuzzy system without techniques that increase interpretability, is difficult.

In the literature there are different approaches to increasing the interpretability of fuzzy systems. These approaches are mainly based on a suitable structure of the fuzzy system (e.g. a hybrid structure) or on the use of a specific training algorithm (e.g. methods from the field of multiobjective optimization or evolutionary optimization) (see e.g. [1,5,16,22–24,36,46,53,56,62]). It seems that the approach used to increase interpretability should be simple to implement and to modify subsequently, and should allow the use of dynamic programming techniques. The accuracy versus interpretability trade-off is also a common topic in the literature [19,62]. An interesting approach to nonlinear modelling is to use the potential of neuro-fuzzy systems while enforcing interpretability of the knowledge accumulated within them.
The interpretability of a neuro-fuzzy system is defined in many ways [5,35,37,62]. In most cases it is defined as interpretability of fuzzy partitions, also known as integrity or similarity, and interpretability of rules, also known as complexity. In [14] a taxonomy was presented that is based on two axes: "complexity versus semantic interpretability", which considers the two main kinds of measures, and "rule base versus fuzzy partitions", which considers the components of the knowledge base to which both kinds of measures can be applied. This taxonomy yields four quadrants of the interpretability of fuzzy rule-based systems:

• Q1: Complexity at the rule base level. This quadrant includes, among others, the number of rules and the number of conditions.

• Q2: Complexity at the fuzzy partition level. This quadrant includes, among others, the number of membership functions and the number of features.

• Q3: Semantics at the rule base level. This quadrant includes, among others, consistency of rules, rules fired at the same time, and transparency of rule structure.

• Q4: Semantics at the fuzzy partition level. This quadrant includes, among others, completeness or coverage, normalization, distinguishability, complementarity, and relative measures.

As mentioned before, considering interpretability in nonlinear modelling is not easy. It requires specific assumptions (e.g. resulting from the considered quadrants Q1–Q4), each of which can affect the accuracy in a negative way.

Population based algorithms are a convenient tool for learning neuro-fuzzy systems in nonlinear modelling tasks. They exist in many variants based on the behaviour of different populations, for example ants, bacteria, or birds. Their popularity stems from their implementation flexibility, simplicity, and effectiveness. Population based algorithms are used, for example, for selection of system parameters [27,32], selection of system structure [17], and reduction of system structure [23,24,56]. These algorithms also allow us to implement mechanisms which ensure the system's interpretability [7,16,19].

When population based algorithms are used to learn neuro-fuzzy systems in nonlinear modelling tasks, it is very important to pay attention to the generation of the initial population [2,46]. The initialization cannot be random, as it is for other kinds of problems solved with population based algorithms. This stems from the fact that some groups of genes within a chromosome are connected (for example, genes which code parameters of membership functions) and depend on each other (for example, genes which code fuzzy sets associated with a single input or output). The initialization is therefore most often based on unsupervised learning, which allows, among others, an initial selection of the rules of the system and their number. If it is conducted correctly, it shortens the learning time of the system and improves the system's quality with respect to the selected criteria. Appropriate initialization of a population based algorithm is not simple, because the initialization algorithm must select the whole population, not a single system, and the diversity of the population needs to be ensured.
In this paper we propose a new approach to nonlinear modelling. It includes a new system initialization algorithm and a new method for selecting the structure and parameters of a neuro-fuzzy system, aimed at obtaining both high accuracy and high interpretability in accordance with the Q1–Q4 criteria presented in [14]. The method we propose:

• Uses the potential of the flexible neuro-fuzzy systems for nonlinear modelling, proposed in our previous works [8,48,50]. These systems have high accuracy, which is very important in nonlinear modelling. They also allow us to introduce a hierarchy of antecedents of rules and of whole rules. This seems to be very important in nonlinear modelling problems in which some of the rules are supposed to be superior. Implementation of this type of system is possible thanks to weighted triangular norms. The result of these operators depends not only on the arguments but also on the weights of these arguments.

• Includes all of the interpretability criteria proposed in [14]. Our method reduces the rule base by reducing inputs, rules, antecedents and discretization points. In addition, our method seeks semantic interpretability by minimizing the number of fuzzy rules activated for the input sets, removing overlapping fuzzy sets, minimizing differences in the shape of the fuzzy sets associated with the inputs and outputs, and ensuring normalization, distinguishability and complementarity of the input fuzzy sets. The process of automatic selection of the structure and parameters of the neuro-fuzzy system is designed to ensure a balance between accuracy and interpretability. It should also be noted that consideration of the interpretability criteria is possible thanks to making the rule base independent of the defuzzification mechanism by using a modified formula of centre of area defuzzification. The details of this formula can be found in our previous work [8].

• Uses the abilities of population based algorithms to ensure proper accuracy of the system and interpretability of the knowledge accumulated within it. In our work we used the evolutionary strategy (μ, λ) as a typical population based method. We wanted to show that even a basic method such as the (μ, λ) strategy can be an effective tool in nonlinear modelling.

• Allows a proper generation of the initial population. In this procedure we used Ward's hierarchical clustering method [59] and we present a new way of generating the whole initial population.

This paper is organized into 5 sections. Section 2 contains description of the flexible neuro-fuzzy system for nonlinear modelling. Description of the new method for designing our system is given in Section 3. Simulation results are presented in Section 4. Conclusions are drawn in Section 5.


2. Flexible neuro-fuzzy system for nonlinear modelling

In our previous works we considered a new class of neuro-fuzzy systems, the flexible neuro-fuzzy systems [10,47]. These systems achieve very high accuracy in classification and approximation problems. In these systems we used compromise fuzzy reasoning, soft and parameterized types, and weighted triangular norms. More details are available in [8,48,50]. However, in this work our goal was to show not only that flexible neuro-fuzzy systems can have good accuracy, but also that they can be highly interpretable. Of course, our approach can be applied to all systems which use fuzzy rules.

We consider a multi-input, multi-output neuro-fuzzy system mapping $X \to Y$, where $X \subset R^n$ and $Y \subset R^m$. The fuzzifier performs a mapping from the observed crisp input space $X \subset R^n$ to the fuzzy sets defined in $X$. The most commonly used fuzzifier is the singleton fuzzifier, which maps $\bar{x} = [\bar{x}_1, \ldots, \bar{x}_n] \in X$ into a fuzzy set $A' \subseteq X$ characterized by the membership function

$$\mu_{A'}(x) = \begin{cases} 1 & \text{if } x = \bar{x}, \\ 0 & \text{if } x \neq \bar{x}. \end{cases} \tag{1}$$

The flexible fuzzy rule base consists of a collection of $N$ fuzzy IF–THEN rules in the form

$$R^k: \left[ \left( \begin{array}{l} \text{IF } (\bar{x}_1 \text{ is } A_1^k)\,|\,w^{A}_{k,1} \text{ AND} \ldots \text{AND } (\bar{x}_n \text{ is } A_n^k)\,|\,w^{A}_{k,n} \\ \text{THEN } (y_1 \text{ is } B_1^k), \ldots, (y_m \text{ is } B_m^k) \end{array} \right) \Big|\, w^{rule}_k \right], \tag{2}$$

where $\bar{x} = [\bar{x}_1, \ldots, \bar{x}_n] \in X$, $y = [y_1, \ldots, y_m] \in Y$, $A_1^k, \ldots, A_n^k$ are fuzzy sets characterized by membership functions $\mu_{A_i^k}(x_i)$, $i = 1, \ldots, n$, $k = 1, \ldots, N$, $B_1^k, \ldots, B_m^k$ are fuzzy sets characterized by membership functions $\mu_{B_j^k}(y_j)$, $j = 1, \ldots, m$, $k = 1, \ldots, N$, $w^{A}_{k,i} \in [0, 1]$, $i = 1, \ldots, n$, $k = 1, \ldots, N$, are weights of antecedents, and $w^{rule}_k \in [0, 1]$, $k = 1, \ldots, N$, are weights of rules.

The fuzzy inference determines a mapping from the fuzzy sets in the input space $X$ to the fuzzy sets in the output space $Y$. Each of the $N$ rules (2) determines fuzzy sets $\bar{B}_j^k \subset Y$ given by the compositional rule of inference

$$\bar{B}_j^k = A' \circ (A^k \to B_j^k), \tag{3}$$

characterized by membership functions

$$\mu_{\bar{B}_j^k}(y_j) = \mu_{A^k \to B_j^k}(\bar{x}, y_j). \tag{4}$$

In our previous papers [49,50] we have shown that neuro-fuzzy systems of the Mamdani type are more suitable for approximation problems, whereas neuro-fuzzy systems of the logical type may be preferred for classification problems. For this reason our method uses the neuro-fuzzy system of the Mamdani type. In such a system a t-norm is the inference operator:

$$\mu_{\bar{B}_j^k}(y_j) = T\{\tau_k(\bar{x}), \mu_{B_j^k}(y_j)\}, \tag{5}$$

where the t-norm $T\{\cdot\}$ is a generalization of the usual two-valued logical conjunction (studied by classical logic) [29] and $\tau_k(\bar{x})$ is a flexible firing strength of the $k$-th rule, $k = 1, \ldots, N$, defined as follows:

$$\tau_k(\bar{x}) = \mu_{A^k}(\bar{x}) = \overset{n}{\underset{i=1}{T^{*}}}\{\mu_{A_i^k}(\bar{x}_i); w^{A}_{k,i}\}, \tag{6}$$

where $T^{*}\{\cdot\}$ is a weighted t-norm [8–10,50] in the form

$$T^{*}\{a_1, a_2; w_1, w_2\} = T\{1 - w_1 \cdot (1 - a_1), 1 - w_2 \cdot (1 - a_2)\} \overset{e.g.}{=} \min\{1 - w_1 \cdot (1 - a_1), 1 - w_2 \cdot (1 - a_2)\}, \tag{7}$$

where $w_1, w_2 \in [0, 1]$ are weights of importance of the arguments $a_1, a_2 \in [0, 1]$. Note that $T^{*}\{a_1, a_2; 1, 1\} = T\{a_1, a_2\}$ and $T^{*}\{a_1, a_2; 1, 0\} = a_1$. More details about the weighted t-norm can be found in our previous papers.

The flexible aggregation operator, applied in order to obtain the fuzzy set $B'_j$, is based on the fuzzy sets $\bar{B}_j^k$, $k = 1, \ldots, N$. In the Mamdani approach the aggregation is carried out by

$$B'_j = \bigcup_{k=1}^{N} \bar{B}_j^k. \tag{8}$$

The membership function of $B'_j$ is given by

$$\mu_{B'_j}(y_j) = \overset{N}{\underset{k=1}{S^{*}}}\{\mu_{\bar{B}_j^k}(y_j); w^{rule}_k\}, \tag{9}$$

where $S^{*}\{\cdot\}$ is a weighted t-conorm [8–10,50] in the form

$$S^{*}\{a_1, a_2; w_1, w_2\} = S\{w_1 \cdot a_1, w_2 \cdot a_2\} \overset{e.g.}{=} \max\{w_1 \cdot a_1, w_2 \cdot a_2\}, \tag{10}$$

where the t-conorm $S\{\cdot\}$ is a generalization of the usual two-valued logical disjunction and $w_1, w_2 \in [0, 1]$ are weights of importance of the arguments $a_1, a_2 \in [0, 1]$. Note that $S^{*}\{a_1, a_2; 1, 1\} = S\{a_1, a_2\}$ and $S^{*}\{a_1, a_2; 1, 0\} = a_1$. As in the case of the weighted t-norm, more details about the weighted t-conorm can be found in our previous papers.
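The weighted operators of Eqs. (7) and (10) can be illustrated with a minimal Python sketch. It assumes the min/max instances given as examples in those equations and extends them from two to n arguments, as used in Eqs. (6) and (9); the function names are hypothetical and are not taken from the paper.

```python
def weighted_tnorm(values, weights):
    """Weighted t-norm of Eq. (7), using min as the underlying t-norm.

    A weight of 0 turns the corresponding argument into 1 (the neutral
    element of a t-norm), so that argument is effectively ignored.
    """
    return min(1.0 - w * (1.0 - a) for a, w in zip(values, weights))


def weighted_tconorm(values, weights):
    """Weighted t-conorm of Eq. (10), using max as the underlying t-conorm.

    A weight of 0 turns the corresponding argument into 0 (the neutral
    element of a t-conorm), so that argument is effectively ignored.
    """
    return max(w * a for a, w in zip(values, weights))


# Firing strength of one rule, Eq. (6): membership degrees of two
# antecedents, the second one being less important (weight 0.5).
tau = weighted_tnorm([0.3, 0.8], [1.0, 0.5])

# Aggregation over two rules at one output point, Eq. (9).
mu = weighted_tconorm([tau, 0.6], [1.0, 1.0])
```

With both weights equal to 1 the functions reduce to the plain min and max, and with weights (1, 0) they return the first argument, which matches the boundary cases noted after Eqs. (7) and (10).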

The defuzzifier performs a mapping from the fuzzy sets $B'_j$ to crisp points $\bar{y}_j$, $j = 1, \ldots, m$, in $Y \subset R$. The flexible centre of area (COA) method is defined in the discrete form by the following formula [47,50,51]:

$$\bar{y}_j = \frac{\sum_{r=1}^{R} \bar{y}^{def}_{j,r} \cdot \mu_{B'_j}(\bar{y}^{def}_{j,r})}{\sum_{r=1}^{R} \mu_{B'_j}(\bar{y}^{def}_{j,r})}, \tag{11}$$

where $\bar{y}^{def}_{j,r}$, $j = 1, \ldots, m$, $r = 1, \ldots, R$, are the discretization points and $R$ is the number of discretization points. Note that neuro-fuzzy architectures developed so far in the literature are based on the centre of area discretization formula with the assumption that the number of discretization points $R$ is equal to the number of rules $N$. In this paper we relax that assumption. It should be noted that in the neuro-fuzzy systems studied so far, for which $R = N$, there is no possibility to reduce rules without changing the defuzzifier. If a small number of rules $N$ correctly describes a problem to be solved, then the correspondingly small number of discretization points $R = N$ makes it impossible to obtain high accuracy. Moreover, reduction of rules causes significant deterioration of the system accuracy. This is a consequence of the simultaneous reduction of the number of rules and the number of discretization points. In our concept we distinguish between the number of rules $N$ and the number of discretization points $R$. As a result we are able to reduce the rule base without deterioration of system accuracy. Finally, in the Mamdani approach formula (11) takes the form

$$\bar{y}_j = \frac{\sum_{r=1}^{R} \bar{y}^{def}_{j,r} \cdot \overset{N}{\underset{k=1}{S^{*}}}\left\{T\left\{\overset{n}{\underset{i=1}{T^{*}}}\{\mu_{A_i^k}(\bar{x}_i); w^{A}_{k,i}\}, \mu_{B_j^k}(\bar{y}^{def}_{j,r})\right\}; w^{rule}_k\right\}}{\sum_{r=1}^{R} \overset{N}{\underset{k=1}{S^{*}}}\left\{T\left\{\overset{n}{\underset{i=1}{T^{*}}}\{\mu_{A_i^k}(\bar{x}_i); w^{A}_{k,i}\}, \mu_{B_j^k}(\bar{y}^{def}_{j,r})\right\}; w^{rule}_k\right\}}. \tag{12}$$

In the next section a new learning algorithm for the evolution of the flexible neuro-fuzzy system (12) is proposed. The aim of the algorithm is the selection of the parameters and structure of the neuro-fuzzy system for nonlinear modelling described in Eq. (12), with the accuracy and interpretability taken into consideration. In the process of evolution (evolution of parameters) we will find all parameters of the neuro-fuzzy system as follows:

• $\{\bar{x}^{A}_{i,k}, \sigma^{A}_{i,k}\}$, $i = 1, \ldots, n$, $k = 1, \ldots, N$: parameters of the Gaussian membership functions $\mu_{A_i^k}(x_i)$ of the input fuzzy sets $A_1^k, \ldots, A_n^k$ (Gaussian functions were used in our simulations, but other membership functions can be used in their place),

• $\{\bar{y}^{B}_{j,k}, \sigma^{B}_{j,k}\}$, $k = 1, \ldots, N$, $j = 1, \ldots, m$: parameters of the Gaussian membership functions $\mu_{B_j^k}(y_j)$ of the output fuzzy sets $B_1^k, \ldots, B_m^k$,

• $w^{A}_{k,i}$, $i = 1, \ldots, n$, $k = 1, \ldots, N$: weights of antecedents,

• $w^{rule}_k$, $k = 1, \ldots, N$: weights of rules,

• $\bar{y}^{def}_{j,r}$, $j = 1, \ldots, m$, $r = 1, \ldots, R$: discretization points.

Moreover, in the process of evolution (evolution of structure) we will find the number of inputs $n$, the number of rules $N$, the number of antecedents and consequents (number of fuzzy sets) and the number of discretization points $R$. In the next section we show how an example population based algorithm (the evolutionary strategy (μ, λ)) is used to select the structure and parameters of the system (12) with the accuracy and interpretability taken into consideration. Moreover, we show how the new method of generating the initial population works. This method is universal and can also be used to generate the initial population of other population based algorithms.

3. Description of the new method for designing neuro-fuzzy system for nonlinear modelling

As mentioned before, for the selection of the structure and parameters of the neuro-fuzzy system (12) for nonlinear modelling we have proposed the evolutionary strategy (μ, λ). It is a typical algorithm among the population based methods [35,47]. The main characteristics of the method that we propose can be summarized as follows:

• It selects the parameters and structure of the neuro-fuzzy system (12) used for nonlinear modelling. In this process all quadrants (Q1–Q4) of interpretability presented in [14] are considered, and complexity and semantics in relation to the rule base and the fuzzy partition are also considered. Moreover, an appropriate trade-off between interpretability and accuracy is ensured.

• It selects via an evolutionary algorithm the number of inputs, the number of rules, the number of antecedents and the number of consequents. Otherwise the number of these elements needs to be established through trial and error for a particular simulation problem, which is time consuming and inconvenient. In our method the possibility of automated selection stems from the use of the flexible structure of the neuro-fuzzy system, proposed in our previous works [8,34]. It allows us to reduce the number of fuzzy rules without causing a negative effect on the accuracy of the system.

• It allows us to automatically select the hierarchy of importance of antecedents and rules based on the learning dataset. This improves the precision of the model in a significant way. This possibility stems from using the weighted triangular norms proposed in our previous works [8,34].

• It allows us to appropriately initialize the evolutionary algorithm. Appropriate initialization allows us to obtain better learning outcomes with respect to the chosen criteria and to shorten the learning process. It is worth mentioning that our initialization method uses Ward's hierarchical clustering algorithm, which allows an automatic selection of group centres and their masses.

The structure of the neuro-fuzzy system is described by formula (12) and its parameters are found using the evolutionary strategy (μ, λ) and the Pittsburgh approach [38,47]. The evolutionary strategy (μ, λ) starts with a random generation of the initial parent population P containing μ individuals (see Fig. 1). Next, a temporary population T containing λ individuals, where λ ≥ μ, is created by means of reproduction. Reproduction consists in a multiple random selection of λ individuals out of population P (multiple sampling) and placing the selected ones in the temporary population T. Individuals of population T undergo crossover and mutation operations, as a result of which an offspring population O is created, which also has size λ. The purpose of the repair procedure of the population O is to correct the parameters if they reach inadmissible values. The new population P containing μ individuals is selected only out of the best λ individuals of population O. A very interesting mechanism used in evolutionary strategies is the self-adaptation of the range of the mutation. At the beginning of the operation of the evolutionary strategy the range is large, while during convergence its gradual reduction is observed. This results in a smooth transition from exploration (occurring at the beginning of the algorithm) to exploitation of the promising areas. For more details on the evolutionary strategy (μ, λ) see [13,47].

Fig. 1. Diagram presenting the process of the standard evolutionary strategy (μ, λ).

3.1. Evolution of parameters

We apply the evolutionary strategy (μ, λ) for learning all parameters in the system described by formula (12). In a single chromosome, according to the Pittsburgh approach, a complete linguistic model is coded in the following way:

$$X^{par}_{ch} = \left\{ \begin{array}{l} \bar{x}^{A}_{1,1}, \sigma^{A}_{1,1}, \ldots, \bar{x}^{A}_{n,1}, \sigma^{A}_{n,1}, \ldots, \bar{x}^{A}_{1,N^{max}}, \sigma^{A}_{1,N^{max}}, \ldots, \bar{x}^{A}_{n,N^{max}}, \sigma^{A}_{n,N^{max}}, \\ \bar{y}^{B}_{1,1}, \sigma^{B}_{1,1}, \ldots, \bar{y}^{B}_{m,1}, \sigma^{B}_{m,1}, \ldots, \bar{y}^{B}_{1,N^{max}}, \sigma^{B}_{1,N^{max}}, \ldots, \bar{y}^{B}_{m,N^{max}}, \sigma^{B}_{m,N^{max}}, \\ w^{A}_{1,1}, \ldots, w^{A}_{n,1}, \ldots, w^{A}_{1,N^{max}}, \ldots, w^{A}_{n,N^{max}}, \\ w^{rule}_{1}, \ldots, w^{rule}_{N^{max}}, \\ \bar{y}^{def}_{1,1}, \ldots, \bar{y}^{def}_{1,R^{max}}, \ldots, \bar{y}^{def}_{m,1}, \ldots, \bar{y}^{def}_{m,R^{max}} \end{array} \right\} = \{X^{par}_{ch,1}, \ldots, X^{par}_{ch,L}\}, \tag{13}$$

where $L = N^{max} \cdot (3 \cdot n + 2 \cdot m + 1) + R^{max}$, $ch = 1, \ldots, \mu$ for the parent population or $ch = 1, \ldots, \lambda$ for the temporary population, $N^{max}$ is the maximum number of rules, and $R^{max}$ is the maximum number of discretization points. The maximum number of rules $N^{max}$ should be selected individually for the problem. When the number of rules $N^{max}$ is too high, it can decrease the system interpretability. The purpose of the evolutionary strategy is to automatically select the number of rules from the range $[1, N^{max}]$.
Analogously, the maximum number of discretization points $R^{max}$ should also be selected individually for the problem. The purpose of the evolutionary strategy is to automatically select the number of discretization points from the range $[1, R^{max}]$. The number of discretization points has no influence on the interpretability of the knowledge accumulated in the neuro-fuzzy system (12). In particular, it has no effect on the rule base (2). This allows us to obtain both good accuracy and a small number of rules [8]. The purpose of the evolutionary strategy (μ, λ) is also to select the number of antecedents and consequents within each rule of the rule base. The number of antecedents in each rule is within the range $[1, n]$, and the number of consequents in each rule is within the range $[1, m]$. The reduction of the system is done with the use of an additional chromosome $X^{red}_{ch}$, which is described in detail in Section 3.2. It is one of the mechanisms for increasing the interpretability of the system according to the interpretability criteria Q1 and Q2 defined in [14]. The self-adaptive feature of the algorithm is realized by assigning to each gene a separate mutation range described by the standard deviation

$$\boldsymbol{\sigma}^{par}_{ch} = (\sigma^{par}_{ch,1}, \ldots, \sigma^{par}_{ch,L}), \tag{14}$$

where $ch = 1, \ldots, \mu$ for the parent population or $ch = 1, \ldots, \lambda$ for the temporary population. For the temporary population we use the recombination (crossover) and mutation operations:

• Crossover with averaging of the values of the genes:

$$X'^{par}_{ch1,g} = \tfrac{1}{2} \cdot (X^{par}_{ch1,g} + X^{par}_{ch2,g}), \qquad X'^{par}_{ch2,g} = X'^{par}_{ch1,g}, \tag{15}$$

and

$$\sigma'^{par}_{ch1,g} = \tfrac{1}{2} \cdot (\sigma^{par}_{ch1,g} + \sigma^{par}_{ch2,g}), \qquad \sigma'^{par}_{ch2,g} = \sigma'^{par}_{ch1,g}, \tag{16}$$

where $g = 1, \ldots, L$. The prime symbol in Eqs. (15) and (16) denotes the value of a gene after the crossover operator has been applied.

• Mutation:

$$\sigma'^{par}_{ch,g} = \sigma^{par}_{ch,g} \cdot \exp(\tau' \cdot N(0, 1) + \tau \cdot N_{ch,g}(0, 1)), \tag{17}$$

and

$$X'^{par}_{ch,g} = X^{par}_{ch,g} + \sigma'^{par}_{ch,g} \cdot N_{ch,g}(0, 1), \tag{18}$$

where $\sigma^{par}_{ch,g}$, $ch = 1, \ldots, \lambda$, $g = 1, \ldots, L$, denotes the current value of the mutation range of the $g$-th gene of the $ch$-th chromosome, $\sigma'^{par}_{ch,g}$, $ch = 1, \ldots, \lambda$, $g = 1, \ldots, L$, denotes the new value of the mutation range, $N(0, 1)$ is a number drawn from the normal distribution, $N_{ch,g}(0, 1)$ is a number drawn from the normal distribution for the $g$-th gene of the $ch$-th chromosome, and $\tau'$, $\tau$ denote constants chosen prior to the evolution process. The following formulas can be found in the literature [13]:

$$\tau' = \frac{C}{\sqrt{2L}} \tag{19}$$

and

$$\tau = \frac{C}{\sqrt{2\sqrt{L}}}, \tag{20}$$

where $C$ most frequently takes the value 1. In order to avoid convergence of the mutation range to 0, we use the following formula:

$$\sigma'^{par}_{ch,g} = \max\{\varepsilon_0, \sigma'^{par}_{ch,g}\}, \tag{21}$$

where $\varepsilon_0$ is a small positive number chosen prior to the evolution process.
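The operators of Eqs. (15)–(21) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the shared number N(0, 1) is assumed to be drawn once per chromosome, and the per-gene number in Eq. (18) is assumed to be a fresh draw, independent of the one used in Eq. (17).

```python
import math
import random


def average_crossover(parent1, parent2):
    """Averaging crossover of Eqs. (15)-(16): both children receive the mean.

    The same function can be applied to the gene vectors and to the
    vectors of mutation ranges.
    """
    child = [0.5 * (a + b) for a, b in zip(parent1, parent2)]
    return child, list(child)


def mutate(genes, sigmas, c=1.0, eps0=1e-4, rng=random):
    """Self-adaptive mutation of Eqs. (17)-(21) for one chromosome."""
    length = len(genes)
    tau_prime = c / math.sqrt(2.0 * length)          # Eq. (19)
    tau = c / math.sqrt(2.0 * math.sqrt(length))     # Eq. (20)
    shared = rng.gauss(0.0, 1.0)  # assumption: one draw per chromosome
    new_genes, new_sigmas = [], []
    for x, s in zip(genes, sigmas):
        # Eq. (17): log-normal update of the mutation range
        s_new = s * math.exp(tau_prime * shared + tau * rng.gauss(0.0, 1.0))
        # Eq. (21): keep the mutation range away from zero
        s_new = max(eps0, s_new)
        # Eq. (18): perturb the gene with the updated range
        new_genes.append(x + s_new * rng.gauss(0.0, 1.0))
        new_sigmas.append(s_new)
    return new_genes, new_sigmas
```

A typical call would be `mutate(genes, sigmas, rng=random.Random(seed))`, which makes a run reproducible. A repair step that clips genes to admissible values, as described for population O, would follow the mutation and is not shown here.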

3.2. Evolution of structure

Evolution of the neuro-fuzzy system structure is based on the evolutionary strategy (μ, λ) and the classic genetic algorithm. At the beginning we take the maximum number of rules, antecedents, consequents, inputs, and discretization points. To select the optimal number of rules, antecedents, consequents, inputs and discretization points, the interpretability of fuzzy sets also needs to be considered. The number of rules should be as small as possible. In practice, more than a dozen rules are too difficult to analyze. In the next step, we reduce our system using the evolutionary strategy. For this purpose we use an extra chromosome $X^{red}_{ch}$. Its genes take binary values and indicate which rules, antecedents, consequents, inputs, and discretization points are selected. The chromosome $X^{red}_{ch}$ is given by

$$X^{red}_{ch} = \left\{ \begin{array}{l} x_1, \ldots, x_n, \\ A_1^1, \ldots, A_n^1, \ldots, A_1^{N^{max}}, \ldots, A_n^{N^{max}}, \\ B_1^1, \ldots, B_m^1, \ldots, B_1^{N^{max}}, \ldots, B_m^{N^{max}}, \\ rule_1, \ldots, rule_{N^{max}}, \\ \bar{y}^{def}_{1,1}, \ldots, \bar{y}^{def}_{1,R^{max}}, \ldots, \bar{y}^{def}_{m,1}, \ldots, \bar{y}^{def}_{m,R^{max}} \end{array} \right\} = \{X^{red}_{ch,1}, \ldots, X^{red}_{ch,L^{red}}\}, \tag{22}$$

where $L^{red} = N^{max} \cdot (n + m + 1) + n + m \cdot R^{max}$ is the length of the chromosome $X^{red}_{ch}$, and $ch = 1, \ldots, \mu$ for the parent population or $ch = 1, \ldots, \lambda$ for the temporary population. Its genes indicate which rules ($rule_k$, $k = 1, \ldots, N^{max}$), antecedents ($A_i^k$, $i = 1, \ldots, n$, $k = 1, \ldots, N^{max}$), consequents ($B_j^k$, $j = 1, \ldots, m$, $k = 1, \ldots, N^{max}$), inputs ($x_i$, $i = 1, \ldots, n$), and discretization points ($\bar{y}^{def}_{j,r}$, $j = 1, \ldots, m$, $r = 1, \ldots, R^{max}$) are included in the system.


We can easily notice that the number of inputs used in the system encoded in the chromosome $ch$ can be determined as follows:

$$n_{ch} = \mathbf{X}^{red}_{ch}\{x_1\} + \cdots + \mathbf{X}^{red}_{ch}\{x_n\}, \tag{23}$$

where $\mathbf{X}^{red}_{ch}\{x_i\}$ denotes, by convention, the gene of the chromosome $\mathbf{X}^{red}_{ch}$ associated with the input $x_i$ (as previously mentioned, if the value of the gene is 1, the associated input is taken into account during operation of the system). The number of rules ($N_{ch}$) used in the system encoded in the chromosome $ch$ may be determined analogously. For the temporary population we use recombination (crossover) and mutation operations analogous to those in the classical genetic algorithm [47]:

• Single-point crossover, with probability $p_c \in [0, 1]$ chosen prior to the evolution process.
• Mutation, with probability $p_m \in [0, 1]$ chosen prior to the evolution process.
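The counting rule of formula (23) and the two genetic operators listed above can be sketched as follows. The sketch assumes that the binary chromosome is stored as a Python list whose first n genes correspond to the inputs, as in formula (22), and that mutation is applied as an independent bit flip per gene; the paper does not fix these details, and all function names are hypothetical.

```python
import random


def count_inputs(x_red, n):
    """Formula (23): number of selected inputs = sum of the first n genes."""
    return sum(x_red[:n])


def single_point_crossover(a, b, pc, rng):
    """With probability pc, swap the tails of two parents at one random cut."""
    if len(a) > 1 and rng.random() < pc:
        cut = rng.randrange(1, len(a))  # cut point strictly inside the chromosome
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]


def bit_flip_mutation(x, pm, rng):
    """Flip each binary gene independently with probability pm (an assumption)."""
    return [1 - g if rng.random() < pm else g for g in x]


rng = random.Random(0)
child_1, child_2 = single_point_crossover([0] * 8, [1] * 8, pc=0.8, rng=rng)
child_1 = bit_flip_mutation(child_1, pm=0.1, rng=rng)
print(child_1, count_inputs(child_1, n=3))
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; the values 0.8 and 0.1 are the probabilities reported for the simulations.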

3.3. Chromosome population evaluation

Each individual $\mathbf{X}_{ch}$ of the parent and temporary populations is represented by a sequence of chromosomes $\langle \mathbf{X}^{par}_{ch}, \boldsymbol{\sigma}^{par}_{ch}, \mathbf{X}^{red}_{ch} \rangle$, given by formulas (13), (14) and (22). The genes of the first two chromosomes take real values, whereas the genes of the last chromosome take integer values from the set $\{0, 1\}$. The system aims to minimize the following fitness function:

$$ff(\mathbf{X}_{ch}) = ffaccuracy(\mathbf{X}_{ch}) \cdot ffcomplexity(\mathbf{X}_{ch}) \cdot ffinterpretability(\mathbf{X}_{ch}), \tag{24}$$

where the individual components are defined as follows:

• The component $ffaccuracy(\mathbf{X}_{ch})$ determines the accuracy of the system (12), i.e. the average normalized system error for all outputs and all data from the learning sequence:

$$ffaccuracy(\mathbf{X}_{ch}) = \frac{1}{m_{ch}} \sum_{j=1}^{m_{ch}} \frac{\frac{1}{Z} \sum_{z=1}^{Z} \left| d_{z,j} - \bar{y}_{z,j} \right|}{\max_{z=1,\ldots,Z}\{d_{z,j}\} - \min_{z=1,\ldots,Z}\{d_{z,j}\}}, \tag{25}$$

where $m_{ch}$ is the number of outputs coded in the chromosome $ch$, $Z$ is the number of samples in the learning sequence, $d_{z,j}$ is the desired value of the output signal $j = 1, \ldots, m$ for the input vector $z = 1, \ldots, Z$, and $\bar{y}_{z,j}$ is the actual value of the output signal $j = 1, \ldots, m$ for the input vector $z = 1, \ldots, Z$. The purpose of normalizing the component $ffaccuracy(\mathbf{X}_{ch})$ is to ensure an even influence of every component on the function (24).

• The component $ffcomplexity(\mathbf{X}_{ch})$ determines the complexity of the system (12), i.e. the number of elements of the system (rules, antecedents – input fuzzy sets, consequents – output fuzzy sets, inputs, and discretization points) retained after reduction, in relation to the length of the chromosome $\mathbf{X}^{red}_{ch}$:

$$ffcomplexity(\mathbf{X}_{ch}) = w_{ffcomplexity} + \frac{1}{L^{red}} \sum_{g=1}^{L^{red}} X^{red}_{ch,g}, \tag{26}$$

where $L^{red}$ is the number of elements of the system which can be reduced and $X^{red}_{ch,g}$ is the gene of the chromosome $\mathbf{X}^{red}_{ch}$ that describes the reduction of the $g$-th element. The component is stabilized by adding the variable $w_{ffcomplexity}$, so that $ffcomplexity(\mathbf{X}_{ch}) \in (w_{ffcomplexity}, w_{ffcomplexity} + 1]$. The value of $w_{ffcomplexity}$ should be set experimentally. Taking the component (26) into account in the evaluation of the population is in accordance with the following interpretability criteria: Q1 – complexity at the rule base level (evaluates the number of rules and antecedents) and Q2 – complexity at the level of fuzzy partitions (evaluates the number of inputs taken into account).

• The component $ffinterpretability(\mathbf{X}_{ch})$ determines the semantic interpretability of the system (12) coded in the tested chromosome:

$$ffinterpretability(\mathbf{X}_{ch}) = w_{ffinterpretability} + \frac{1}{5} \cdot \left( ffint_A(\mathbf{X}_{ch}) + ffint_B(\mathbf{X}_{ch}) + ffint_C(\mathbf{X}_{ch}) + ffint_D(\mathbf{X}_{ch}) + ffint_E(\mathbf{X}_{ch}) \right), \tag{27}$$

where the variable $w_{ffinterpretability}$ stabilizes the component, $ffinterpretability(\mathbf{X}_{ch}) \in (w_{ffinterpretability}, w_{ffinterpretability} + 1]$, and the individual components of the formula (27) are defined as follows:

○ The component $ffint_A(\mathbf{X}_{ch})$ minimizes the number of rules fired at the same time in the system (12):

$$ffint_A(\mathbf{X}_{ch}) = 1 - \frac{1}{Z} \sum_{z=1}^{Z} \frac{\max_{k=1,\ldots,N_{ch}}\{\tau_k(\bar{\mathbf{x}}_z)\}}{\sum_{k=1}^{N_{ch}} \tau_k(\bar{\mathbf{x}}_z)}, \tag{28}$$

where $\tau_k(\bar{\mathbf{x}}_z)$, $k = 1, \ldots, N_{ch}$, is the activation level of the rule $k$, described by the formula (6), and $Z$ is the number of samples in the learning sequence. Taking the component $ffint_A(\mathbf{X}_{ch}) \in (0, 1]$ into account in the evaluation of the population is in accordance with the following interpretability criterion: Q3 – semantics at the rule base level (evaluates the number of rules fired at the same time).

○ The component $ffint_B(\mathbf{X}_{ch})$ maximizes the fit of the input fuzzy sets of the system (12) coded in the tested chromosome to the training data:

$$ffint_B(\mathbf{X}_{ch}) = \frac{1}{Z \cdot n_{ch}} \sum_{z=1}^{Z} \sum_{i=1}^{n_{ch}} \left( 1 - \max_{k=1,\ldots,N_{ch}} \left\{ \mu_{A^k_i}(\bar{x}_{z,i}) \right\} \right), \tag{29}$$

where $\bar{x}_{z,i}$, $z = 1, \ldots, Z$, $i = 1, \ldots, n_{ch}$, is the input signal retrieved from the learning sequence (after passing through the fuzzification block), $Z$ is the number of samples in the learning sequence, $n_{ch}$ is the number of inputs of the particular chromosome $ch$, and $\mu_{A^k_i}(\cdot)$, $i = 1, \ldots, n_{ch}$, $k = 1, \ldots, N_{ch}$, is the membership function of the input fuzzy set $A^k_i$. Taking the component $ffint_B(\mathbf{X}_{ch}) \in (0, 1]$ into account in the evaluation of the population is in accordance with the following interpretability criterion: Q4 – semantics at the fuzzy partition level (evaluates normalization).

○ The component $ffint_C(\mathbf{X}_{ch})$ reduces the overlapping of the input and output fuzzy sets of the system (12) coded in the tested chromosome:

$$
ffint_C(\mathbf{X}_{ch}) = \frac{2}{(n_{ch} + m_{ch}) \cdot N_{ch} \cdot (N_{ch} - 1)}
\left(
\sum_{i=1}^{n_{ch}} \sum_{k1=1}^{N_{ch}-1} \sum_{k2=k1+1}^{N_{ch}} sim\left( \mathbf{X}^{par}_{ch}\{A^{k1}_i\},\ \mathbf{X}^{par}_{ch}\{A^{k2}_i\} \right)
+ \sum_{j=1}^{m_{ch}} \sum_{k1=1}^{N_{ch}-1} \sum_{k2=k1+1}^{N_{ch}} sim\left( \mathbf{X}^{par}_{ch}\{B^{k1}_j\},\ \mathbf{X}^{par}_{ch}\{B^{k2}_j\} \right)
\right),
\tag{30}
$$

where $\mathbf{X}^{par}_{ch}\{A^k_i\}$ is the part of the chromosome $\mathbf{X}^{par}_{ch}$ which codes the parameters of the input fuzzy set $A^k_i$, and $\mathbf{X}^{par}_{ch}\{B^k_j\}$ is the part of the chromosome $\mathbf{X}^{par}_{ch}$ which codes the parameters of the output fuzzy set $B^k_j$. The similarity of fuzzy sets in (30) can be defined in many different ways [26]. In our simulations we use the Jaccard index, which can be defined as

$$
sim\left(A^{k1}_i, A^{k2}_i\right) =
\frac{\left| \min\left\{ Gauss\left(x_i; \bar{x}^A_{i,k1}, \sigma^A_{i,k1}\right),\ Gauss\left(x_i; \bar{x}^A_{i,k2}, \sigma^A_{i,k2}\right) \right\} \right|}
{\left| \max\left\{ Gauss\left(x_i; \bar{x}^A_{i,k1}, \sigma^A_{i,k1}\right),\ Gauss\left(x_i; \bar{x}^A_{i,k2}, \sigma^A_{i,k2}\right) \right\} \right|}.
\tag{31}
$$

Taking the component $ffint_C(\mathbf{X}_{ch}) \in (0, 1]$ into account in the evaluation of the population is in accordance with the following interpretability criteria: Q3 – semantics at the rule base level (it determines the size and number of the fuzzy sets simultaneously active) and Q4 – semantics at the fuzzy partition level (by penalizing overlapping rules).

○ The component $ffint_D(\mathbf{X}_{ch})$ increases the integrity of the shape of the input and output fuzzy sets associated with the inputs and outputs of the system (12) coded in the tested chromosome:

$$
ffint_D(\mathbf{X}_{ch}) = \sum_{k=1}^{N_{ch}}
\left(
\sum_{i=1}^{n_{ch}} \frac{\left| \mathbf{X}^{par}_{ch}\{\sigma^A_{i,k}\} - \frac{1}{N_{ch}} \sum_{k2=1}^{N_{ch}} \mathbf{X}^{par}_{ch}\{\sigma^A_{i,k2}\} \right|}{n_{ch} \cdot \sum_{k2=1}^{N_{ch}} \mathbf{X}^{par}_{ch}\{\sigma^A_{i,k2}\}}
+ \sum_{j=1}^{m_{ch}} \frac{\left| \mathbf{X}^{par}_{ch}\{\sigma^B_{j,k}\} - \frac{1}{N_{ch}} \sum_{k2=1}^{N_{ch}} \mathbf{X}^{par}_{ch}\{\sigma^B_{j,k2}\} \right|}{m_{ch} \cdot \sum_{k2=1}^{N_{ch}} \mathbf{X}^{par}_{ch}\{\sigma^B_{j,k2}\}}
\right),
\tag{32}
$$

where $\sigma^A_{i,k}$, $i = 1, \ldots, n_{ch}$, $k = 1, \ldots, N_{ch}$, is the width of the input fuzzy set (Gaussian function) $A^k_i$. Taking the component $ffint_D(\mathbf{X}_{ch}) \in (0, 1]$ into account in the evaluation of the population is in accordance with the following interpretability criterion: Q4 – semantics at the fuzzy partition level (evaluates distinguishability).

○ The component $ffint_E(\mathbf{X}_{ch})$ increases the complementarity of the input fuzzy sets of the system (12) coded in the tested chromosome:

$$ffint_E(\mathbf{X}_{ch}) = \frac{1}{Z \cdot n_{ch}} \sum_{z=1}^{Z} \sum_{i=1}^{n_{ch}} \left| 1 - \sum_{k=1}^{N_{ch}} \mu_{A^k_i}(\bar{x}_{z,i}) \right|, \tag{33}$$

where $\bar{x}_{z,i}$, $z = 1, \ldots, Z$, $i = 1, \ldots, n_{ch}$, is the input signal retrieved from the learning sequence (after passing through the fuzzification block), $Z$ is the number of samples in the learning sequence, $n_{ch}$ is the number of inputs of the particular chromosome $ch$, and $\mu_{A^k_i}(\cdot)$, $i = 1, \ldots, n_{ch}$, $k = 1, \ldots, N_{ch}$, is the membership function of the input fuzzy set $A^k_i$. Taking the component $ffint_E(\mathbf{X}_{ch}) \in (0, 1]$ into account in the evaluation of the population is in accordance with the following interpretability criterion: Q4 – semantics at the fuzzy partition level (evaluates complementarity).
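Several of the interpretability components described above reduce to simple averages over the learning sequence. The following Python sketch illustrates the components of formulas (28), (29) and (33) and a grid approximation of the Jaccard index (31). It rests on assumptions that are not stated in this section: the Gaussian is taken as exp(-((x - c) / s)^2), rule activations and membership values are supplied as nested lists, every sample fires at least one rule, and the set cardinalities in (31) are approximated by sums over a uniform grid. All names are hypothetical.

```python
import math


def gauss(x, centre, width):
    """Gaussian membership function (the exact form used in the paper is assumed)."""
    return math.exp(-((x - centre) / width) ** 2)


def ffint_a(firing):
    """Formula (28). firing[z][k] is the activation of rule k for sample z."""
    return 1.0 - sum(max(row) / sum(row) for row in firing) / len(firing)


def ffint_b(memberships):
    """Formula (29). memberships[z][i][k] is the membership of input i of
    sample z in the fuzzy set of rule k."""
    total = sum(1.0 - max(sets) for sample in memberships for sets in sample)
    return total / (len(memberships) * len(memberships[0]))


def ffint_e(memberships):
    """Formula (33): mean deviation of the summed memberships from 1."""
    total = sum(abs(1.0 - sum(sets)) for sample in memberships for sets in sample)
    return total / (len(memberships) * len(memberships[0]))


def jaccard(c1, w1, c2, w2, lo, hi, steps=1000):
    """Formula (31) on a uniform grid over [lo, hi]: area(min) / area(max)."""
    xs = [lo + (hi - lo) * t / steps for t in range(steps + 1)]
    inter = sum(min(gauss(x, c1, w1), gauss(x, c2, w2)) for x in xs)
    union = sum(max(gauss(x, c1, w1), gauss(x, c2, w2)) for x in xs)
    return inter / union


print(ffint_a([[1.0, 0.0], [0.5, 0.5]]))
print(jaccard(0.0, 1.0, 0.5, 1.0, lo=-5.0, hi=5.0))
```

A sample in which a single rule fires contributes 0 to the sum in `ffint_a`, whereas a sample that fires all rules equally contributes the most, which matches the intent of criterion Q3.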

Note that the components (25), (26), (28), (29) and (33) of the fitness function can be implemented in such a way that they reuse quantities of the system (12) which have to be determined anyway during its operation. To sum up, the proposed fitness function allows us to promote chromosomes that encode neuro-fuzzy systems (12) which not only have good accuracy, but also satisfy the interpretability criteria (Q1–Q4) proposed in [14].
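How the three top-level components combine into the fitness function (24) can be sketched as follows. The sketch assumes that desired and actual outputs are given as lists of rows, that no output column is constant (otherwise the normalization in (25) would divide by zero), and that the five interpretability components have already been computed. The function names and the default weights of 1 are illustrative choices.

```python
def ff_accuracy(desired, actual):
    """Formula (25): mean absolute error per output, normalized by the range
    of the desired signal. desired[z][j] and actual[z][j] are lists of rows."""
    n_samples, n_outputs = len(desired), len(desired[0])
    total = 0.0
    for j in range(n_outputs):
        column = [row[j] for row in desired]
        span = max(column) - min(column)  # assumed to be non-zero
        mae = sum(abs(desired[z][j] - actual[z][j]) for z in range(n_samples)) / n_samples
        total += mae / span
    return total / n_outputs


def ff_complexity(x_red, w=1.0):
    """Formula (26): stabilizing weight plus the fraction of genes equal to 1."""
    return w + sum(x_red) / len(x_red)


def ff_interpretability(parts, w=1.0):
    """Formula (27): stabilizing weight plus the mean of the five components."""
    return w + sum(parts) / 5.0


def fitness(accuracy, complexity, interpretability):
    """Formula (24): the product of the three components (to be minimized)."""
    return accuracy * complexity * interpretability


acc = ff_accuracy([[0.0], [2.0]], [[0.5], [2.0]])
print(fitness(acc, ff_complexity([1, 0, 0, 0]), ff_interpretability([0.5] * 5)))
```

Because the components are multiplied, the stabilizing weights keep the complexity and interpretability factors away from zero, so a heavily reduced structure cannot hide a large modelling error.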

3.4. Initialization of the initial parents population

A fundamental part of the proposed initialization method is the initialization of the input and output fuzzy sets (genes in chromosomes $\mathbf{X}^{par}_{ch}$, $ch = 1, \ldots, \mu$, corresponding to antecedents and consequents). The initial population is split into two parts. In the first part the initialization of fuzzy sets is particularly random (chromosomes initialized in this way are aimed at the exploration of the search area); in the second part the initialization is based on clusters obtained with Ward's hierarchical clustering method [59] (chromosomes initialized in this way are aimed at the exploitation of potentially promising search areas). This clustering algorithm works on the learning dataset (for example, obtained from an identification process) $\{x_{1,1}, \ldots, x_{1,n}, d_{1,1}, \ldots, d_{1,m}\}, \ldots, \{x_{Z,1}, \ldots, x_{Z,n}, d_{Z,1}, \ldots, d_{Z,m}\}$, where $x_{z,i}$, $z = 1, \ldots, Z$, $i = 1, \ldots, n$, are the input signals and $d_{z,j}$, $z = 1, \ldots, Z$, $j = 1, \ldots, m$, are the desired signals. This algorithm allows us to obtain a specified number of clusters. Based on the learning data, the algorithm generates the centres of the clusters $\mathbf{v}_k = [v_{k,1}, \ldots, v_{k,n+m}]$, $k = 1, \ldots, Nmax$, and the masses $u_k$, $k = 1, \ldots, Nmax$, of the clusters (the mass is the number of data points belonging to a cluster). On the basis of the matrix $\mathbf{v}$ of size $Nmax \times (n + m)$ and the vector $\mathbf{u}$ of length $Nmax$, the membership functions of the input and output fuzzy sets are initialized. In our simulations we use Gaussian membership functions. The part of the chromosome $\mathbf{X}^{par}_{ch}$ coding the centres of the input fuzzy sets, $\mathbf{X}^{par}_{ch}\{\bar{x}^A_{i,k}\}$, is initialized by the following equation:

$$
\mathbf{X}^{par}_{ch}\{\bar{x}^A_{i,k}\} =
\begin{cases}
v_{k,i} + \dfrac{\psi \cdot sgn(rnd(-1, 1)) \cdot rnd(0, 1)}{\left( \max_{z=1,\ldots,Z}\{x_{z,i}\} - \min_{z=1,\ldots,Z}\{x_{z,i}\} \right)^{-1}}
& \text{for } ch = 1, \ldots, \mu \text{ div } 2, \\[2ex]
\min_{z=1,\ldots,Z}\{x_{z,i}\} + \dfrac{\dfrac{1 + 2 \cdot k}{2 \cdot Nmax} + \psi \cdot sgn(rnd(-1, 1)) \cdot rnd(0, 1)}{\left( \max_{z=1,\ldots,Z}\{x_{z,i}\} - \min_{z=1,\ldots,Z}\{x_{z,i}\} \right)^{-1}}
& \text{for } ch = \mu \text{ div } 2 + 1, \ldots, \mu,
\end{cases}
\tag{34}
$$

where $\psi \in [0, 1]$ is a difference parameter of fuzzy sets (in our simulations we assumed $\psi = 0.1$), $sgn(\cdot)$ is the signum function, $rnd(a_{min}, a_{max})$ is a random function generating values from $[a_{min}, a_{max}]$, $\mu$ is the number of chromosomes in the initial population ($1, \ldots, \mu \text{ div } 2$ is the first part of the initial population, $\mu \text{ div } 2 + 1, \ldots, \mu$ is the second part), and $Nmax$ is the maximum number of fuzzy rules. The part of the chromosome coding the widths of the input fuzzy sets, $\mathbf{X}^{par}_{ch}\{\sigma^A_{i,k}\}$, is initialized by the following equation:

$$
\mathbf{X}^{par}_{ch}\{\sigma^A_{i,k}\} =
\begin{cases}
\dfrac{2 \cdot \gamma \cdot \left( \dfrac{u_k}{Z} \right) + \psi \cdot sgn(rnd(-1, 1))}{\left( \max_{z=1,\ldots,Z}\{x_{z,i}\} - \min_{z=1,\ldots,Z}\{x_{z,i}\} \right)^{-1}}
& \text{for } ch = 1, \ldots, \mu \text{ div } 2, \\[2ex]
\dfrac{2 \cdot \gamma \cdot \left( \dfrac{1}{Nmax} \right) + \psi \cdot sgn(rnd(-1, 1))}{\left( \max_{z=1,\ldots,Z}\{x_{z,i}\} - \min_{z=1,\ldots,Z}\{x_{z,i}\} \right)^{-1}}
& \text{for } ch = \mu \text{ div } 2 + 1, \ldots, \mu,
\end{cases}
\tag{35}
$$

where $\gamma$ is a constant selected by trial and error (in our simulations we assumed $\gamma = 0.3$). The output fuzzy sets are initialized analogously. The part of the chromosome $\mathbf{X}^{par}_{ch}$ coding the centres of the output fuzzy sets, $\mathbf{X}^{par}_{ch}\{\bar{y}^B_{j,k}\}$, is initialized by the following equation:

$$
\mathbf{X}^{par}_{ch}\{\bar{y}^B_{j,k}\} =
\begin{cases}
v_{k,n+j} + \dfrac{\psi \cdot sgn(rnd(-1, 1)) \cdot rnd(0, 1)}{\left( \max_{z=1,\ldots,Z}\{d_{z,j}\} - \min_{z=1,\ldots,Z}\{d_{z,j}\} \right)^{-1}}
& \text{for } ch = 1, \ldots, \mu \text{ div } 2, \\[2ex]
\min_{z=1,\ldots,Z}\{d_{z,j}\} + \dfrac{\dfrac{1 + 2 \cdot k}{2 \cdot Nmax} + \psi \cdot sgn(rnd(-1, 1)) \cdot rnd(0, 1)}{\left( \max_{z=1,\ldots,Z}\{d_{z,j}\} - \min_{z=1,\ldots,Z}\{d_{z,j}\} \right)^{-1}}
& \text{for } ch = \mu \text{ div } 2 + 1, \ldots, \mu.
\end{cases}
\tag{36}
$$

The part of the chromosome $\mathbf{X}^{par}_{ch}$ coding the widths of the output fuzzy sets, $\mathbf{X}^{par}_{ch}\{\sigma^B_{j,k}\}$, is initialized by the following equation:

$$
\mathbf{X}^{par}_{ch}\{\sigma^B_{j,k}\} =
\begin{cases}
\dfrac{2 \cdot \gamma \cdot \left( \dfrac{u_k}{Z} \right) + \psi \cdot sgn(rnd(-1, 1))}{\left( \max_{z=1,\ldots,Z}\{d_{z,j}\} - \min_{z=1,\ldots,Z}\{d_{z,j}\} \right)^{-1}}
& \text{for } ch = 1, \ldots, \mu \text{ div } 2, \\[2ex]
\dfrac{2 \cdot \gamma \cdot \left( \dfrac{1}{Nmax} \right) + \psi \cdot sgn(rnd(-1, 1))}{\left( \max_{z=1,\ldots,Z}\{d_{z,j}\} - \min_{z=1,\ldots,Z}\{d_{z,j}\} \right)^{-1}}
& \text{for } ch = \mu \text{ div } 2 + 1, \ldots, \mu.
\end{cases}
\tag{37}
$$

The values of the remaining genes in the initial parent population are as follows:

• Genes in the chromosome $\mathbf{X}^{par}_{ch}$, $ch = 1, \ldots, \mu$, corresponding to the discretization points $\bar{y}^{def}_{j,r}$, $j = 1, \ldots, m$, $r = 1, \ldots, Rmax$, are chosen as random numbers from $[\min_{z=1,\ldots,Z}\{d_{z,j}\}, \max_{z=1,\ldots,Z}\{d_{z,j}\}]$.
• Genes in the chromosome $\mathbf{X}^{par}_{ch}$, $ch = 1, \ldots, \mu$, corresponding to the weights of antecedents $w^A_{i,k}$, $i = 1, \ldots, n$, $k = 1, \ldots, Nmax$, are chosen as random numbers from $[0, 1]$.
• Genes in the chromosome $\mathbf{X}^{par}_{ch}$, $ch = 1, \ldots, \mu$, corresponding to the weights of rules $w^{rule}_k$, $k = 1, \ldots, Nmax$, are chosen as random numbers from $[0, 1]$.
• Genes in the chromosome $\mathbf{X}^{red}_{ch}$, $ch = 1, \ldots, \mu$, are chosen as random numbers from $\{0, 1\}$.
• Genes in the chromosome $\boldsymbol{\sigma}^{par}_{ch}$, $ch = 1, \ldots, \mu$, corresponding to the mutation range are set to 1 before the evolution process.
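The initialization of one input fuzzy set according to formulas (34) and (35) can be sketched in Python as shown below. The clustering step itself is not reproduced: the cluster centres `v` and masses `u` are assumed to be available already (for instance from a Ward linkage routine), and the function name, argument order and 0-based input index are illustrative assumptions.

```python
import random


def sgn(a):
    """Signum function used in formulas (34) and (35)."""
    return (a > 0) - (a < 0)


def init_input_set(ch, k, i, v, u, x_min, x_max, mu, n_max, n_samples,
                   psi=0.1, gamma=0.3, rng=random):
    """Return (centre, width) of the input fuzzy set of rule k and input i.

    ch and k are 1-based as in the paper; i is a 0-based column index.
    v[k - 1][i] are cluster centres and u[k - 1] cluster masses (assumed given).
    """
    span = x_max[i] - x_min[i]  # dividing by span**-1 equals multiplying by span
    noise_c = psi * sgn(rng.uniform(-1, 1)) * rng.uniform(0, 1)
    noise_w = psi * sgn(rng.uniform(-1, 1))
    if ch <= mu // 2:  # first case of (34) and (35)
        centre = v[k - 1][i] + noise_c * span
        width = (2 * gamma * u[k - 1] / n_samples + noise_w) * span
    else:  # second case of (34) and (35)
        centre = x_min[i] + ((1 + 2 * k) / (2 * n_max) + noise_c) * span
        width = (2 * gamma / n_max + noise_w) * span
    return centre, width


rng = random.Random(1)
print(init_input_set(1, 1, 0, [[2.0]], [5], [0.0], [4.0],
                     mu=4, n_max=2, n_samples=10, rng=rng))
```

Taken literally, the width expression can become non-positive when the random term is negative and larger in magnitude than the deterministic term, so a practical implementation would probably need an additional guard; none is described in the text.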

4. Simulation results

In the simulations four problems were considered:

• Box and Jenkins gas furnace problem [6],
• chemical plant problem [57],
• truck backer-upper control problem [39],
• delta ailerons problem [20].

The characteristic features of the used evolution strategy $(\mu, \lambda)$ for all problems can be summarized as follows:

• The size of the parent population was set to $\mu = 128$.
• The size of the temporary population was set to $\lambda = 256$.
• Constant $C = 1.2$ and constant $\varepsilon_0 = 0.01$.
• The algorithm performs 2000 steps (generations).
• The mutation probability was set to $p_m = 0.1$.
• The crossover probability was set to $p_c = 0.8$.
• The value of $w_{ffcomplexity}$ was set to 1.
• The value of $w_{ffsimilarity}$ was set to 1.
• The value of $w_{ffinterpretability}$ was set to 1.

The characteristic features of the used flexible neuro-fuzzy system (12) for all problems can be summarized as follows:

• The Gaussian function was used as the membership function.
• The maximum number of rules was set to $Nmax = 9$.
• The maximum number of discretization points was set to $Rmax = 10$.
• Algebraic t-norms and t-conorms were used.

4.1. Box and Jenkins gas furnace problem

The Box and Jenkins gas furnace data consist of 296 measurements of the gas furnace system: the input measurement $u(k)$ is the gas flow rate into the furnace and the output measurement is the CO2 concentration in the outlet gas. The sampling interval is equal to 9 s. Results for the Box and Jenkins gas furnace problem are shown in Tables 1, 4 (column a) and 5 (row a) and presented in Figs. 2, 6(a) and 7(a):

• Fig. 2 shows the input and output fuzzy sets of the system (12).
• Fig. 6(a) shows the weights of antecedents and rules of the system (12).
• Fig. 7(a) shows the dependence between the accuracy (RMSE value) and the number of reduced elements ($L$) in the neuro-fuzzy system (12).
• Table 1 shows the accuracy of the system (12) and the accuracy achieved by other authors.
• Table 4 (column a) shows the values of the fitness function (24) and its components for the chromosome which represents the neuro-fuzzy system (12) selected in the evolution process.
• Table 5 (row a) shows the levels of reduction of the neuro-fuzzy system (12) selected in the evolution process.

Table 1. The accuracy of the various methods for the Box and Jenkins gas furnace problem.

| Method | Number of inputs/rules | RMSE |
|---|---|---|
| Box and Jenkins [6] | 6/– | 0.4494 |
| Sugeno and Yasukawa [57] | 3/6 | 0.4358 |
| Lin and Cunningham [33] | 5/4 | 0.2664 |
| Kim et al. [28] | 6/2 | 0.2190 |
| Delgado et al. [12] | 2/4 | 0.4100 |
| Rutkowski and Cpałka [50] | 6/4 | 0.2416 |
| Our result | 4/3 | 0.2959 |

Fig. 2. Input and output fuzzy sets of the neuro-fuzzy system (12) for Box and Jenkins gas furnace problem.

The rules of the neuro-fuzzy system (12) for the Box and Jenkins gas furnace problem are described by the formula

$$
\begin{cases}
R^{(1)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^1_1)\,|\,w^{\tau}_{1,1}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^1_1)\ \right] \big|\, w^{rule}_1 \\[1ex]
R^{(2)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^2_1)\,|\,w^{\tau}_{2,1}\ \ \text{AND}\ (x_5\ \text{IS}\ A^2_5)\,|\,w^{\tau}_{2,5}\ \ \text{THEN}\ (y_2\ \text{IS}\ B^2_1)\ \right] \big|\, w^{rule}_2 \\[1ex]
R^{(3)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^3_1)\,|\,w^{\tau}_{3,1}\ \ \text{AND}\ (x_3\ \text{IS}\ A^3_3)\,|\,w^{\tau}_{3,3}\ \ \text{AND}\ (x_6\ \text{IS}\ A^3_6)\,|\,w^{\tau}_{3,6}\ \ \text{THEN}\ (y_3\ \text{IS}\ B^3_1)\ \right] \big|\, w^{rule}_3.
\end{cases}
\tag{38}
$$

Extended notation of the rules in the form (38), which takes into account the linguistic variable names (arising from the description of the problem) and the names of the linguistic variables represented by fuzzy sets, is as follows:

    R(1): [ IF (‘gas_flow_rate(t−1)’ IS ‘low’) | n
            THEN (‘CO2_concentration(t)’ IS ‘low’) ] | i
    R(2): [ IF (‘gas_flow_rate(t−1)’ IS ‘medium’) | i
            AND (‘gas_flow_rate(t−5)’ IS ‘near +0.85’) | i
            THEN (‘CO2_concentration(t)’ IS ‘medium’) ] | vi
    R(3): [ IF (‘gas_flow_rate(t−1)’ IS ‘high’) | vi
            AND (‘gas_flow_rate(t−3)’ IS ‘near +37.50’) | i
            AND (‘gas_flow_rate(t−6)’ IS ‘near −4.50’) | vi
            THEN (‘CO2_concentration(t)’ IS ‘high’) ] | vi,          (39)

where ‘n’ is a neutral element in the base of rules (a neutral antecedent or rule, for which the weight of importance is close to the value 0.5), ‘i’ means an element of increased importance in the base of rules (for which the weight of importance is close to the value 0.75), and ‘vi’ means an element of the highest importance in the base of rules (for which the weight of importance is close to the value 1).
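The mapping from numerical weights to the labels ‘n’, ‘i’ and ‘vi’ can be illustrated with a short Python sketch. The text only says that a weight is "close to" 0.5, 0.75 or 1, so the nearest-level rule used below is an assumption, as is the function name.

```python
def importance_label(weight):
    """Map a weight from [0, 1] to the nearest of the three reference levels.

    Nearest-level assignment is an assumption; the paper only states that
    the weight is close to 0.5 ('n'), 0.75 ('i') or 1 ('vi').
    """
    levels = {"n": 0.5, "i": 0.75, "vi": 1.0}
    return min(levels, key=lambda name: abs(levels[name] - weight))


for w in (0.52, 0.8, 0.97):
    print(w, importance_label(w))
```

Such a helper makes it straightforward to turn trained antecedent and rule weights into the annotated rule notation used in the extended rule listings.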

4.2. Chemical plant problem

We consider a model of an operator's control of a chemical plant. The plant produces polymers by polymerizing some monomers. Since the start-up of the plant is very complicated, human operators have to perform manual operations at the plant. Three continuous inputs are chosen for controlling the system: monomer concentration, change of monomer concentration and monomer flow rate. The output is the set point for the monomer flow rate. Results for the chemical plant problem are shown in Tables 2, 4 (column b) and 5 (row b) and presented in Figs. 3, 6(b) and 7(b):

• Fig. 3 shows the input and output fuzzy sets of the system (12).
• Fig. 6(b) shows the weights of antecedents and rules of the system (12).
• Fig. 7(b) shows the dependence between the accuracy (RMSE value) and the number of reduced elements ($L$) in the neuro-fuzzy system (12).
• Table 2 shows the accuracy of the system (12) and the accuracy achieved by other authors.
• Table 4 (column b) shows the values of the fitness function (24) and its components for the chromosome which represents the neuro-fuzzy system (12) selected in the evolution process.
• Table 5 (row b) shows the levels of reduction of the neuro-fuzzy system (12) selected in the evolution process.

Table 2. The accuracy of the various methods for chemical plant problem.

| Method | RMSE |
|---|---|
| Pal and Chakraborty [40] | 0.0092 |
| Lin and Cunningham [33] | 0.0079 |
| Rutkowski [48] (N = 6) | 0.0042 |
| Our result (N = 3) | 0.0063 |

The rules of the obtained neuro-fuzzy system (12) for the chemical plant problem are described by the formula

$$
\begin{cases}
R^{(1)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^1_1)\,|\,w^{\tau}_{1,1}\ \ \text{AND}\ (x_3\ \text{IS}\ A^1_3)\,|\,w^{\tau}_{1,3}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^1_1)\ \right] \big|\, w^{rule}_1 \\[1ex]
R^{(2)}: \left[\ \text{IF}\ (x_2\ \text{IS}\ A^2_2)\,|\,w^{\tau}_{2,2}\ \ \text{AND}\ (x_3\ \text{IS}\ A^2_3)\,|\,w^{\tau}_{2,3}\ \ \text{THEN}\ (y_2\ \text{IS}\ B^2_1)\ \right] \big|\, w^{rule}_2 \\[1ex]
R^{(3)}: \left[\ \text{IF}\ (x_3\ \text{IS}\ A^3_3)\,|\,w^{\tau}_{3,3}\ \ \text{THEN}\ (y_3\ \text{IS}\ B^3_1)\ \right] \big|\, w^{rule}_3.
\end{cases}
\tag{40}
$$

Extended notation of the rules in the form (40), which takes into account the linguistic variable names (arising from the description of the problem) and the names of the linguistic variables represented by fuzzy sets, is as follows:

    R(1): [ IF (‘monomer_concentration’ IS ‘near +4.0’) | n
            AND (‘monomer_flow_rate’ IS ‘low’) | i
            THEN (‘set_point_for_monomer_flow_rate’ IS ‘low’) ] | vi
    R(2): [ IF (‘change_of_monomer_concentration’ IS ‘near +0.1’) | i
            AND (‘monomer_flow_rate’ IS ‘medium’) | n
            THEN (‘set_point_for_monomer_flow_rate’ IS ‘medium’) ] | n
    R(3): [ IF (‘monomer_flow_rate’ IS ‘high’) | i
            THEN (‘set_point_for_monomer_flow_rate’ IS ‘high’) ] | vi,          (41)

where ‘n’ is a neutral element in the base of rules, ‘i’ means an element of increased importance in the base of rules, and ‘vi’ means an element of the highest importance in the base of rules.

4.3. Truck backer-upper control problem

The truck backer-upper problem is described by the following equations:

$$
\begin{cases}
x(t+1) = x(t) + \cos(\theta(t) + \phi(t)) + \sin(\theta(t)) \cdot \sin(\phi(t)) \\
y(t+1) = y(t) + \cos(\theta(t) + \phi(t)) - \sin(\theta(t)) \cdot \cos(\phi(t)) \\
\phi(t+1) = \phi(t) - \arcsin\left( \dfrac{2 \cdot \sin(\theta(t))}{b} \right),
\end{cases}
\tag{42}
$$

where $x$ is the position of the truck in the parking area on the horizontal axis, $y$ is the position of the truck in the parking area on the vertical axis, $\phi$ is the inclination angle of the truck relative to the vertical axis, $b$ is the length of the truck (in the simulations we assume that $b = 20$), and $\theta \in [-\pi/4, +\pi/4]$ is the steering angle of the wheels of the truck. It is assumed that the dock is at the point $(0, 0)$, $x$ is in the range $[-150, +150]$, $y$ is in the range $[0, +300]$, $\phi$ is in the range $[-\pi, +\pi]$, and $\theta$ is in the range $[-\pi/4, +\pi/4]$. In the considered problem the input parameters are $x$ and $y$, the output parameter is $\theta$, and the goal is to find a steering angle of the wheels such that the truck parks at the dock (i.e. meets the condition $x = 0$, $y = 0$, $\phi = 0$). Results for the truck backer-upper control problem are shown in Tables 4 (column c) and 5 (row c) and presented in Figs. 4, 6(c) and 7(c):

• Fig. 4 shows the input and output fuzzy sets of the system (12).
• Fig. 6(c) shows the weights of antecedents and rules of the system (12).
• Fig. 7(c) shows the dependence between the accuracy (RMSE value) and the number of reduced elements ($L$) in the neuro-fuzzy system (12).
• Table 4 (column c) shows the values of the fitness function (24) and its components for the chromosome which represents the neuro-fuzzy system (12) selected in the evolution process.
• Table 5 (row c) shows the levels of reduction of the neuro-fuzzy system (12) selected in the evolution process.
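A single step of the truck kinematics in formula (42) can be written as a small Python function. The sketch follows the equations as they are read from this section, including the cosine term in the update of y and the minus signs in the second and third equations, which are reconstructed and therefore an assumption; the function name and the tuple return value are illustrative.

```python
import math


def truck_step(x, y, phi, theta, b=20.0):
    """Advance the truck state by one step according to formula (42).

    x, y  - position of the truck; phi - angle of the truck (radians);
    theta - steering angle in [-pi/4, +pi/4]; b - length of the truck.
    """
    x_next = x + math.cos(theta + phi) + math.sin(theta) * math.sin(phi)
    y_next = y + math.cos(theta + phi) - math.sin(theta) * math.cos(phi)
    phi_next = phi - math.asin(2.0 * math.sin(theta) / b)
    return x_next, y_next, phi_next


state = (50.0, 100.0, 0.5)
for _ in range(3):
    state = truck_step(*state, theta=math.pi / 8)
print(state)
```

In a closed-loop simulation the steering angle passed to `truck_step` would come from the neuro-fuzzy controller evaluated on the current state rather than from a constant.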

The rules of the obtained neuro-fuzzy system (12) for the truck backer-upper control problem are described by the formula

$$
\begin{cases}
R^{(1)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^1_1)\,|\,w^{\tau}_{1,1}\ \ \text{AND}\ (x_2\ \text{IS}\ A^1_2)\,|\,w^{\tau}_{2,1}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^3_1)\ \right] \big|\, w^{rule}_1 \\[1ex]
R^{(2)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^2_1)\,|\,w^{\tau}_{1,2}\ \ \text{AND}\ (x_2\ \text{IS}\ A^2_2)\,|\,w^{\tau}_{2,2}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^2_1)\ \right] \big|\, w^{rule}_2 \\[1ex]
R^{(3)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^3_1)\,|\,w^{\tau}_{1,3}\ \ \text{AND}\ (x_2\ \text{IS}\ A^3_2)\,|\,w^{\tau}_{2,3}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^1_1)\ \right] \big|\, w^{rule}_3 \\[1ex]
R^{(4)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^4_1)\,|\,w^{\tau}_{1,4}\ \ \text{AND}\ (x_2\ \text{IS}\ A^4_2)\,|\,w^{\tau}_{2,4}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^4_1)\ \right] \big|\, w^{rule}_4.
\end{cases}
\tag{43}
$$

Extended notation of the rules in the form (43), which takes into account the linguistic variable names (arising from the description of the problem) and the names of the linguistic variables represented by fuzzy sets, is as follows:

    R(1): [ IF (‘x’ IS ‘very_low’) | i
            AND (‘ϕ’ IS ‘very_low’) | vi
            THEN (‘Θ’ IS ‘high’) ] | vi
    R(2): [ IF (‘x’ IS ‘low’) | vi
            AND (‘ϕ’ IS ‘low’) | vi
            THEN (‘Θ’ IS ‘low’) ] | vi
    R(3): [ IF (‘x’ IS ‘high’) | vi
            AND (‘ϕ’ IS ‘high’) | i
            THEN (‘Θ’ IS ‘very_low’) ] | vi
    R(4): [ IF (‘x’ IS ‘very_high’) | i
            AND (‘ϕ’ IS ‘very_high’) | n
            THEN (‘Θ’ IS ‘very_high’) ] | vi,          (44)

where ‘n’ is a neutral element in the base of rules, ‘i’ means an element of increased importance in the base of rules, and ‘vi’ means an element of the highest importance in the base of rules.

Fig. 3. Input and output fuzzy sets of the neuro-fuzzy system (12) for the chemical plant problem.

Fig. 4. Input and output fuzzy sets of the neuro-fuzzy system (12) for the truck backer-upper control problem.

Table 4. The values of the fitness function (24) and its components for the chromosome which represents the neuro-fuzzy system (12) selected in the evolution process: (a) Box and Jenkins gas furnace problem, (b) chemical plant problem, (c) truck backer-upper control problem, and (d) delta ailerons problem.

| Name of the component | Problem a | Problem b | Problem c | Problem d |
|---|---|---|---|---|
| ffaccuracy(X_ch) | 0.2959 | 0.0063 | 5.0100 | 0.0002 |
| ffcomplexity(X_ch) | 1.1429 | 1.2222 | 1.4444 | 1.1481 |
| ffintA(X_ch) | 0.0923 | 0.1256 | 0.4512 | 0.3410 |
| ffintB(X_ch) | 0.0562 | 0.1023 | 0.3252 | 0.2310 |
| ffintC(X_ch) | 0.0232 | 0.0952 | 0.3949 | 0.4792 |
| ffintD(X_ch) | 0.3125 | 0.0897 | 0.2892 | 0.2358 |
| ffintE(X_ch) | 0.1134 | 0.0535 | 0.0417 | 0.0292 |
| ffinterpretability(X_ch) | 1.5976 | 1.4663 | 2.5022 | 2.3162 |
| ff(X_ch) | 0.5402 | 0.0113 | 18.1076 | 0.0005 |

Table 5. Levels of reduction of the neuro-fuzzy system (12) selected in the evolution process: (a) Box and Jenkins gas furnace problem, (b) chemical plant problem, (c) truck backer-upper control problem, and (d) delta ailerons problem. R1 – reduction of inputs, R2 – reduction of antecedents, R3 – reduction of consequents, R4 – reduction of rules, R5 – reduction of discretization points, R6 – reduction of all parameters.

| Problem | R1 [%] | R2 [%] | R3 [%] | R4 [%] | R5 [%] | R6 [%] |
|---|---|---|---|---|---|---|
| a | 33.33 | 88.89 | 66.67 | 66.67 | 70.00 | 85.71 |
| b | 0.00 | 81.48 | 66.67 | 66.67 | 60.00 | 77.78 |
| c | 0.00 | 55.56 | 55.56 | 55.56 | 20.00 | 55.56 |
| d | 16.67 | 86.67 | 77.78 | 77.78 | 50.00 | 85.19 |

4.4. Delta ailerons problem

The delta ailerons problem contains 7129 instances and each instance is described by five attributes (RollRate, PitchRate, currPitch, currRoll, diffRollRate). All attributes are continuous. This dataset is obtained from the task of controlling the ailerons of an F16 aircraft, although the target variable and attributes are different from those of the ailerons domain. The target variable here is a variation instead of an absolute value, and there was some preselection of the attributes.

Table 3. The accuracy of the various methods for delta ailerons problem.

| Method | RMSE |
|---|---|
| Krętowski and Czajkowski [31] | 0.000176 |
| Zhao et al. [61] | 0.000165 |
| Pouzols and Lendasse [43] | 0.000161 |
| Halawani et al. [18] | 0.000172 |
| Our result (N = 2) | 0.000178 |

Results for the delta ailerons problem are shown in Tables 3, 4 (column d) and 5 (row d) and presented in Figs. 5, 6(d) and 7(d):

• Fig. 5 shows the input and output fuzzy sets of the system (12).
• Fig. 6(d) shows the weights of antecedents and rules of the system (12).
• Fig. 7(d) shows the dependence between the accuracy (RMSE value) and the number of reduced elements ($L$) in the neuro-fuzzy system (12).
• Table 3 shows the accuracy of the system (12) and the accuracy achieved by other authors.

• Table 4 (column d) shows the values of the fitness function (24) and its components for the chromosome which represents the neuro-fuzzy system (12) selected in the evolution process.
• Table 5 (row d) shows the levels of reduction of the neuro-fuzzy system (12) selected in the evolution process.

The rules of the obtained neuro-fuzzy system (12) for the delta ailerons problem are described by the formula

$$
\begin{cases}
R^{(1)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^1_1)\,|\,w^{\tau}_{1,1}\ \ \text{AND}\ (x_5\ \text{IS}\ A^1_5)\,|\,w^{\tau}_{5,1}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^2_1)\ \right] \big|\, w^{rule}_1 \\[1ex]
R^{(2)}: \left[\ \text{IF}\ (x_1\ \text{IS}\ A^2_1)\,|\,w^{\tau}_{1,2}\ \ \text{AND}\ (x_2\ \text{IS}\ A^2_2)\,|\,w^{\tau}_{2,2}\ \ \text{AND}\ (x_3\ \text{IS}\ A^2_3)\,|\,w^{\tau}_{3,2}\ \ \text{AND}\ (x_5\ \text{IS}\ A^2_5)\,|\,w^{\tau}_{5,2}\ \ \text{THEN}\ (y_1\ \text{IS}\ B^1_1)\ \right] \big|\, w^{rule}_2.
\end{cases}
\tag{45}
$$

Extended notation of the rules in the form (45), which takes into account the linguistic variable names (arising from the description of the problem) and the names of the linguistic variables represented by fuzzy sets, is as follows:

    R(1): [ IF (‘roll_rate’ IS ‘low’) | n
            AND (‘roll_rate_difference’ IS ‘low’) | n
            THEN (‘sa’ IS ‘high’) ] | vi
    R(2): [ IF (‘roll_rate’ IS ‘high’) | i
            AND (‘pitch_rate’ IS ‘near −0.001’) | i
            AND (‘current_pitch’ IS ‘near +0.070’) | n
            AND (‘roll_rate_difference’ IS ‘high’) | i
            THEN (‘sa’ IS ‘low’) ] | i,          (46)

where ‘n’ is a neutral element in the base of rules, ‘i’ means an element of increased importance in the base of rules, and ‘vi’ means an element of the highest importance in the base of rules.

Fig. 5. Input and output fuzzy sets of the neuro-fuzzy system (12) for the delta ailerons problem.

Fig. 6. Exemplary weights representation in the neuro-fuzzy system (12) (dark areas correspond to low values of weights and vice versa, crossed areas correspond to reduced antecedents) for the following problems: (a) Box and Jenkins gas furnace problem, (b) chemical plant problem, (c) truck backer-upper control problem, and (d) delta ailerons problem.

Fig. 7. The dependence between RMSE and the number of system parameters (12) for the following problems: (a) Box and Jenkins gas furnace problem, (b) chemical plant problem, (c) truck backer-upper control problem, and (d) delta ailerons problem. The results of the best chromosomes are marked with circles.
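The values reported in Tables 4 and 5 can be cross-checked against the product form of the fitness function (24) and the complexity component (26) with a few lines of Python. The check assumes that the sum of genes in (26) counts the retained elements, so that with the stabilizing weight equal to 1 the complexity equals 1 + (1 − R6/100); this reading is an interpretation, and the tolerances only reflect the four-decimal rounding of the published values.

```python
# Per problem: (ffaccuracy, ffcomplexity, ffinterpretability, ff) from Table 4.
table_4 = {
    "a": (0.2959, 1.1429, 1.5976, 0.5402),
    "b": (0.0063, 1.2222, 1.4663, 0.0113),
    "c": (5.0100, 1.4444, 2.5022, 18.1076),
    "d": (0.0002, 1.1481, 2.3162, 0.0005),
}
# Reduction of all parameters R6 [%] from Table 5.
r6_percent = {"a": 85.71, "b": 77.78, "c": 55.56, "d": 85.19}


def check(problem):
    """Return the recomputed fitness and complexity for one problem."""
    accuracy, complexity, interpretability, _ = table_4[problem]
    product = accuracy * complexity * interpretability    # formula (24)
    from_reduction = 1.0 + (1.0 - r6_percent[problem] / 100.0)  # formula (26), w = 1
    return product, from_reduction


for name in sorted(table_4):
    product, from_reduction = check(name)
    print(name, round(product, 4), round(from_reduction, 4))
```

For all four problems the recomputed values agree with the tabulated ones up to rounding, which supports reading (24) as a product of the three components.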

4.5. Conclusions from the simulations

The conclusions from our simulations can be summarized as follows:

• The purpose of our work was to obtain a balance between good accuracy of the system (model) and high interpretability of the knowledge accumulated within the model. We considered interpretability in all quadrants (Q1–Q4) proposed in [14]. We used the interpretability criteria from [14] because it is a current and well-constructed review which describes and considers criteria of interpretability based on fuzzy rules. Our goals seem to have been achieved: we obtained very clear fuzzy rules (formulas (38), (40), (43), (45)) and good accuracy of the system (see Tables 1–3).
• We generated a diagram of the dependency between the RMSE and the number of reduced elements in the system for the last population (generation) of the learning algorithm (see Fig. 7). It should be noted that for all simulation problems our method chose not the chromosome with the best accuracy, but the one with the best balance between accuracy and interpretability (as determined by our fitness function). For example, for the Box and Jenkins gas furnace problem, we obtained a chromosome which codes a neuro-fuzzy system with accuracy 0.2913. The chromosome finally chosen by the system has accuracy 0.2959, but its interpretability is much better (due to a higher number of reduced elements).
• The values of the fitness function and its components are shown in Table 4. It can be noticed that the component ffinterpretability (responsible for system interpretability) has the greatest effect on the fitness function. The second most important component is ffcomplexity (responsible for system complexity), which has a lesser but still important effect on the fitness function. Both components ensure high interpretability of the system.
• The structure of the neuro-fuzzy systems was significantly reduced (see Table 5). Reduction of the rule base was one of the mechanisms responsible for achieving interpretability. It should be noted that the best reduction of all elements of the neuro-fuzzy systems was achieved for the Box and Jenkins gas furnace problem and the delta ailerons problem. The reduction process removed elements of the system which do not affect system accuracy. The reduced elements could also cause conflicts in the rule base, which would make it harder to obtain highly interpretable systems.
• An interesting mechanism is the one which automatically creates a hierarchy of rules and of antecedents of the rules (see formulas (39), (41), (44), and (46) and Fig. 6). As can be seen in Fig. 6, some rules are more important than others. For example, for the chemical plant problem, the rule

    R(3): [ IF (‘monomer_flow_rate’ IS ‘high’) | i
            THEN (‘set_point_for_monomer_flow_rate’ IS ‘high’) ] | vi

  has the highest value of $w^{rule}_3$, which means that this rule is the most important in the system. Therefore, if a learning input vector activates it, the other rules have less influence on the system output. The remaining rules affect the system result if the activation of $R^{(3)}$ is low. For the antecedents this mechanism works analogously.

5. Conclusions

In this paper we proposed a new method for designing neuro-fuzzy systems for nonlinear modelling with interpretability aspects. Neuro-fuzzy systems work on the basis of a set of fuzzy rules; however, this alone does not guarantee interpretability of the knowledge accumulated in those systems. Our purpose was to obtain models that not only work with good accuracy but also ensure high interpretability, which can be seen as merging grey box and white box methods. Our simulation results suggest that these aims were achieved.

Acknowledgement

The authors would like to thank the reviewers for very helpful suggestions and comments in the revision process. The project was financed by the National Science Centre (Poland) on the basis of the decision number DEC-2012/05/B/ST7/02138.


Krzysztof Cpałka received the M.Sc. degree in electrical engineering in 1997 and the Ph.D. degree (Honours) in 2002 in computer engineering, both from the Częstochowa University of Technology, Poland. Since 2010, he has been an associate professor in the Institute of Computational Intelligence at Częstochowa University of Technology. His research interests include fuzzy systems, neural networks, evolutionary algorithms and artificial intelligence. He has published over 70 technical papers in journals and conference proceedings. Krzysztof Cpałka is a recipient of the 2005 IEEE Transactions on Neural Networks Outstanding Paper Award.

Krystian Łapa is a Ph.D. student in the Institute of Computational Intelligence. He received the M.Sc. degree in computer science in 2010 from the Częstochowa University of Technology. His research interests include neuro-fuzzy systems, interpretability of fuzzy systems, nonlinear modelling, population-based algorithms and computational intelligence. He has published 4 technical papers in journals and conference proceedings.

Andrzej Przybył received his Ph.D. in automatics and robotics from the Poznan University of Technology in 2003. He is an assistant professor in the Institute of Computational Intelligence at Częstochowa University of Technology. He is working on developing new control methods used in mechatronics systems. His research interests include motion control systems, real-time Ethernet, FPGA devices and soft computing algorithms for electrical drives. Andrzej Przybył designed various microprocessors, digital signal processors and FPGA based embedded systems. He has published over 20 technical papers.

Marcin Zalasiński is a Ph.D. student in the Institute of Computational Intelligence. He received the M.Sc. degree in computer science in 2009 from the Częstochowa University of Technology. His research interests include biometrics (especially dynamic signature verification), neuro-fuzzy systems, fuzzy systems, evolutionary algorithms and computational intelligence. He has published 6 technical papers in journals and conference proceedings.