Information Sciences 178 (2008) 2150–2162

An adaptive neuro-fuzzy system for efficient implementations

J. Echanobe *, I. del Campo, G. Bosque

Department of Electricity and Electronics, University of the Basque Country, Leioa, 48940 Vizcaya, Spain

Received 20 September 2006; received in revised form 5 December 2007; accepted 6 December 2007

* Corresponding author. Tel.: +34 946015308; fax: +34 946013071. E-mail address: [email protected] (J. Echanobe).

doi:10.1016/j.ins.2007.12.009

Abstract

A neuro-fuzzy system specially suited for efficient implementations is presented. The system is of the same type as the well-known "adaptive network-based fuzzy inference system" (ANFIS) method. However, different restrictions are applied to the system that considerably reduce the complexity of the inference mechanism. Hence, efficient implementations can be developed. Some experiments are presented which demonstrate the good performance of the proposed system despite its restrictions. Finally, an efficient digital hardware implementation is presented for a two-input single-output neuro-fuzzy system. © 2007 Elsevier Inc. All rights reserved.

Keywords: Neuro-fuzzy systems; Hardware implementations; Embedded systems; Non-linear systems

1. Introduction

Fuzzy systems and neural networks are widely used techniques in intelligent systems. These systems cover many different application areas such as automatic control, pattern recognition, human-machine interaction, expert systems, modelling, medical diagnosis, economics, etc. Both techniques have their own advantages and drawbacks. Fuzzy systems have the ability to represent comprehensive linguistic knowledge – given for example by a human expert – and perform reasoning by means of rules. However, fuzzy systems do not provide a mechanism to automatically acquire and/or tune those rules. On the other hand, neural networks are adaptive systems that can be trained and tuned from a set of samples. Once they are trained, neural networks can deal with new input data by generalizing the acquired knowledge. Nevertheless, it is very difficult to extract and understand that knowledge. In other words, fuzzy systems and neural networks are complementary paradigms. In recent years, neuro-fuzzy systems have been proposed which combine the advantages of both techniques, as well as overcome the drawbacks of each one individually. These systems can combine the fuzzy and neural paradigms in two different ways [22]: (a) by introducing fuzzification into the neural-network structure, i.e. fuzzy neural networks [5,6,15], and (b) by providing the fuzzy systems with learning ability by means of
neural-network algorithms, i.e. neural-network-driven fuzzy reasoning techniques [1,3,7,8,10–14,16,20,21]. In the first case, the fuzzification can be introduced in any of the network aspects: neuron inputs, outputs, weights, aggregation operations, transfer functions, etc. However, most of the existing systems limit the fuzzification to the neuron inputs and to the aggregation operation. In the second case, neural-network methods are used both with the aim of identifying rules and membership functions and for tuning the system.

However, neuro-fuzzy systems are rather complex because they integrate many different tasks working in a cooperative way. Hence, neuro-fuzzy implementations must be developed carefully in order to fulfil all the requirements that a real-life application can demand, such as cost, size, power, etc. These implementations can be carried out by means of either hardware or software platforms. Software implementations are more flexible and economical than hardware implementations, but they provide lower processing speed. Hence, hardware implementations are generally used for real-time applications, where high performance is always an essential requirement. If the complexity of the neuro-fuzzy system is high (i.e. numerous inputs, multiple and complex membership functions, high precision, etc.), even hardware implementations could become unfeasible, either because of their dimensions, cost and power, or because of the difficulty in reaching the high performance requirements.

In this paper, we propose a neuro-fuzzy system in which the complexity is highly reduced without appreciably sacrificing its features or capabilities. The system belongs to the above-cited neural-network-driven fuzzy reasoning techniques. In particular, it is of the same type as the well-known "adaptive-network-based fuzzy inference system" (ANFIS) method [8], about which many related works have been written. However, some restrictions are applied to the system in order to considerably reduce the complexity of its inference mechanism. Hence, it can be applied to real-time applications and achieve high performance even for complex configurations.

The rest of the paper is organised as follows. In Section 2 the proposed neuro-fuzzy system is formally described. In addition, the computational cost of the system is examined and compared with that of the generic ANFIS system. Some examples are presented in Section 3 showing that, despite the restrictions imposed on the system, functionality and performance are still maintained. Those examples concern the ability of the system to approximate non-linear functions. In all cases, a comparison with the generic ANFIS system is carried out. In Section 4, an example of a hardware-based implementation of the inference mechanism is shown. We shall see there how the implementation becomes very simple and thus how a high performance system is obtained. Some parameters, such as hardware size, memory occupation, execution times, delays, etc., are also reported in order to analyze efficiency. Finally, the main conclusions of this work are presented in Section 5.

2. A neuro-fuzzy system

In this section we describe the proposed neuro-fuzzy system. The system is an ANFIS-type neuro-fuzzy system [8] with some restrictions on its structure that lead to a very simple and rapid inference mechanism.
This makes it suitable for applications demanding fuzzy systems with a small size, low power consumption and high inference speed. We therefore start by showing how an ANFIS system is defined. An ANFIS system is a fuzzy inference system whose parameters are trained by means of neural-network algorithms. The system can be viewed as a particular neural network that is functionally equivalent to a fuzzy inference system. As long as the fuzzy system is represented as a neural network, it is straightforward to train the system by means of any of the well-known neural-network learning algorithms. This training process adjusts the parameters of the network, which in fact are the parameters of the fuzzy system, such as membership functions, strengths of the rules, consequents, etc. Formally, the ANFIS system can be described as follows. Consider a rule-based n-input fuzzy system with m antecedent functions per dimension. It has $m^n$ rules, where the jth rule can be expressed as

$R_j$: IF $x_1$ is $M^1_{j_1}$ and $x_2$ is $M^2_{j_2}$ ... and $x_n$ is $M^n_{j_n}$ THEN $f$ is $c_j$

where $x = (x_1, x_2, \ldots, x_n)$, $x \in \mathbb{R}^n$, is the input vector; $M^1_{j_1}, M^2_{j_2}, \ldots, M^n_{j_n}$ ($1 \le j_i \le m$) are fuzzy sets (antecedents); $f$ is the output variable; and $c_j$ is a crisp consequent (i.e. a singleton). If the center of gravity defuzzification method is adopted, the output of the system is given by

$$f = \frac{\sum_{j=1}^{m^n} w_j c_j}{\sum_{j=1}^{m^n} w_j} \quad (1)$$

where $w_j = M^1_{j_1}(x_1) M^2_{j_2}(x_2) \cdots M^n_{j_n}(x_n)$, with $M^i_{j_i}(x_i)$ being the membership value $\mu_{M^i_{j_i}}(x_i)$ ($1 \le i \le n$).

The functionally equivalent neural network is shown in Fig. 1. It is a five-layer network: (1) the first layer computes the membership functions $M^i_{j_i}(x_i) = \mu_{M^i_{j_i}}(x_i)$; (2) the second layer computes the values $w_j$ ($1 \le j \le m^n$); (3) the third layer normalizes these values, that is, $\bar{w}_j = w_j / (w_1 + \cdots + w_{m^n})$; (4) the fourth layer computes $\bar{w}_j c_j$ ($1 \le j \le m^n$); (5) finally, the output of the network is provided by the fifth layer, which aggregates the overall output as the sum $f = \sum \bar{w}_j c_j$. It is clear that this network performs the same operation as the fuzzy system. In the following we will refer to this system as the generic ANFIS.

To train the above network, a hybrid algorithm has been proposed [8]. The algorithm is composed of a forward pass, which is carried out by a least squares estimator (LSE) process, followed by a backward pass, which is carried out by a back propagation (BP) algorithm. The LSE computes the consequents and the BP adjusts the parameters of the antecedents. Although the BP algorithm could train the network alone, the training process can be accelerated with the LSE. This is possible because the output of the network is linear in the consequent parameters.

Let us analyze the computational cost involved in computing the inference mechanism of the generic ANFIS. First, Eq. (1) implies one division, $m^n - 1$ sums and $m^n - 1$ products in the numerator, and $m^n - 1$ sums in the denominator. On the other hand, to compute each $w_j$, $n - 1$ products are required, so we have $(n-1)m^n$ additional products. Finally, $nm$ membership functions must be evaluated for the $n$ inputs. The cost of this membership evaluation depends, logically, on the concrete form of those functions, so little more can be said about it in general. To summarize, we have 1 division, $nm^n - 1$ products, $2(m^n - 1)$ sums and $nm$ membership evaluations. As we can see, the inference of the rules implies a rather high computational cost. Hence, a hardware implementation is required if we want to apply it, for example, to real-time applications. However, if the number of inputs, rules or antecedents increases, even a hardware implementation is unable to fulfil the performance requirements.
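To make this cost concrete, the following C sketch evaluates Eq. (1) directly by visiting all $m^n$ rules. It is our own illustration, not code from the paper: all identifiers are ours, the membership function is passed in as a callback (its cost depends on the chosen shape), and the consequents are assumed to be stored in lexicographic order of $(j_1, \ldots, j_n)$.

    #include <stddef.h>

    /* Evaluate Eq. (1) for a generic ANFIS; mu(i, j, x) returns the
     * membership value M_j^i(x).  Assumes n <= 16 inputs. */
    double generic_anfis(size_t n, size_t m, const double *x,
                         double (*mu)(size_t i, size_t j, double xi),
                         const double *conseq)
    {
        size_t idx[16] = {0};   /* current antecedent index per dimension */
        double num = 0.0, den = 0.0;
        size_t rules = 1;
        for (size_t i = 0; i < n; ++i) rules *= m;   /* m^n rules */

        for (size_t r = 0; r < rules; ++r) {
            double w = 1.0;     /* w_j: product of n membership values */
            for (size_t i = 0; i < n; ++i)
                w *= mu(i, idx[i], x[i]);
            num += w * conseq[r];
            den += w;
            /* advance the mixed-radix counter (j_1, ..., j_n) */
            for (size_t i = 0; i < n && ++idx[i] == m; ++i)
                idx[i] = 0;
        }
        return num / den;       /* the single division of Eq. (1) */
    }

The loop structure makes the exponential cost evident: the outer loop body runs $m^n$ times regardless of the membership shape.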

Fig. 1. An n-input ANFIS scheme.
To overcome this problem, we propose to introduce some restrictions on the antecedents that considerably reduce the system's complexity:

(1) The membership functions must be overlapped by pairs; that is, with an overlapping degree of 2.
(2) The sum of the membership functions over all the antecedents of every dimension must be equal to 1; that is, $M^i_1(x_i) + M^i_2(x_i) + \cdots + M^i_m(x_i) = 1$, $1 \le i \le n$.
(3) Triangular membership functions must be selected for the antecedents.

Fig. 2 depicts typical membership functions verifying the above restrictions. In the following we analyze the advantages of restrictions (1)–(3) with regard to the simplicity of the fuzzy system:

(1) With regard to the first restriction, it is easy to see that the number of antecedents that have to be evaluated per dimension is only 2. Therefore the number of active rules for an n-input system is $2^n$. In consequence, the execution of the inference can be performed faster because it is not necessary to evaluate all the rules, and hence Eq. (1) is simplified. Moreover, if all the membership functions are of the same type, differing only in their parameters, a single generic $2^n$-rule inference mechanism can be implemented. This mechanism performs the inference process in a generic way, regardless of which rules are active; the only difference in each inference are the parameter values of the active rules. Of course, a previous step must be designed to localize the active rules for every input, but this step is quite easy, as we will see in Section 4. Therefore the HW implementation is considerably simplified.

(2) The second restriction reduces hardware complexity even more because it implies that the denominator in Eq. (1) is always 1, and thus the division is not necessary. This is an important result because it is well known that division operations always involve a great deal of hardware.

(3) Finally, if triangular-shaped antecedents are selected, the membership value of the input for a concrete antecedent is obtained by a single product. To clarify this point, let us examine Fig. 2, where $M^i_{j_i}$ and $M^i_{j_i+1}$ are the two antecedents that have to be evaluated for a given input $x_i$. It is clear that the first membership value $M^i_{j_i}(x_i)$ is the product of $(a^i_{j_i+1} - x_i)$ and the right slope (absolute value) of $M^i_{j_i}$, with $a^i_{j_i+1}$ being the end point of that triangle. Also, $M^i_{j_i+1}(x_i)$ is the product of $(x_i - a^i_{j_i})$ and the left slope of $M^i_{j_i+1}$, with $a^i_{j_i}$ being the origin of that triangle. Therefore, the membership evaluation of an antecedent requires a product and a subtraction. On the other hand, note that, because of the second restriction, these two slopes necessarily have the same absolute value, and hence fewer memory accesses are needed for membership evaluations. Thus, if $m^i_{j_i}$ denotes the common (positive) slope between $M^i_{j_i}$ and $M^i_{j_i+1}$, the two membership functions are given, respectively, by the following equations:

$$M^i_{j_i}(x_i) = m^i_{j_i}(a^i_{j_i+1} - x_i) \quad (2)$$

$$M^i_{j_i+1}(x_i) = m^i_{j_i}(x_i - a^i_{j_i}) \quad (3)$$

Moreover, the second restriction itself implies that the two membership values $M^i_{j_i}(x_i)$ and $M^i_{j_i+1}(x_i)$ are complementary (see Fig. 2), and hence Eq. (3) can be written as

$$M^i_{j_i+1}(x_i) = 1 - M^i_{j_i}(x_i) \quad (4)$$
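As a software counterpart of Eqs. (2)–(4), the following C sketch (our own; the array names are illustrative) fuzzifies one input. A short comparison loop locates the active pair of triangles, after which one subtraction and one product yield one membership value; the other is simply its complement.

    /* Fuzzify one input under restrictions (1)-(3).  a[0..m-1] holds the
     * triangle peak positions of one dimension in ascending order, and
     * slope[j] the common slope m_j shared by M_j and M_{j+1}.
     * On return, *j is the lower active index and *mu_hi = M_{j+1}(x);
     * the complementary value M_j(x) is 1 - *mu_hi (Eq. (4)). */
    static void fuzzify(double x, const double *a, const double *slope,
                        int m, int *j, double *mu_hi)
    {
        int k = 0;
        while (k < m - 2 && x > a[k + 1])   /* locate [a_k, a_{k+1}] */
            ++k;
        *j = k;
        *mu_hi = slope[k] * (x - a[k]);     /* Eq. (3): one subtraction, one product */
    }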

Fig. 2. The two active antecedents for a generic input $x_i$.

With the above restrictions, the new computational cost of an inference is as follows. First, Eq. (1) now implies only $2^n - 1$ sums and $2^n$ products ($w_j c_j$), and no divisions. In addition, the evaluation of each $w_j$ requires the product of $n$ membership function values, so we have $2^n(n-1)$ more products. Finally, half of the $2n$ membership function values ($M^i_{j_i}(x_i)$) require a total of $n$ sums (subtractions) and $n$ products (see Eq. (2)), while the other half ($M^i_{j_i+1}(x_i)$) require only $n$ sums (see Eq. (4)). Therefore an inference requires a total of $(2^n + 1)n$ products and $2^n + 2n - 1$ sums. As we can see, this cost is considerably less than the initial one, and it does not depend on the number of membership functions m. Moreover, as n and m increase, the difference between the two costs becomes huge. This fact can be appreciated in Table 1, where the cost of both systems, the generic ANFIS and our system, is shown for several n and m values; clearly the difference is outstanding.

Table 1
Number of operations carried out by the two systems for performing an inference

           Unrestricted ANFIS                Our system
           m = 4      m = 5      m = 6
n = 2  p   31         49         71          10
       s   30         48         70           7
       me   8         10         12          (4)
       d    1          1          1           –
n = 3  p   191        374        647         27
       s   126        248        430         13
       me  12         15         18          (6)
       d    1          1          1           –
n = 4  p   1023       2499       5183        68
       s   510        1248       2590        23
       me  16         20         24          (8)
       d    1          1          1           –
n = 5  p   5119       15,624     38,879     165
       s   2046       6248       15,550      41
       me  20         25         30         (10)
       d    1          1          1           –
n = 6  p   24,575     93,749     279,935    390
       s   8190       31,248     93,310      75
       me  24         30         36         (12)
       d    1          1          1           –

p: products; s: sums; me: membership evaluations; d: divisions. Numbers in parentheses mean that the operations involved are already included in the sums and/or products.
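The entries of Table 1 follow directly from the formulas just derived. The short C program below (ours, for verification only) reproduces any row of the table.

    #include <stdio.h>

    /* Operation counts for one inference (see Table 1).
     * Generic ANFIS:     p = n*m^n - 1, s = 2(m^n - 1), me = n*m, d = 1.
     * Restricted system: p = (2^n + 1)n, s = 2^n + 2n - 1, me = 2n, d = 0. */
    int main(void)
    {
        for (unsigned n = 2; n <= 6; ++n) {
            unsigned long two_n = 1UL << n;
            for (unsigned m = 4; m <= 6; ++m) {
                unsigned long mn = 1;
                for (unsigned i = 0; i < n; ++i) mn *= m;   /* m^n */
                printf("n=%u m=%u  generic: p=%lu s=%lu me=%u d=1\n",
                       n, m, n * mn - 1, 2 * (mn - 1), n * m);
            }
            printf("n=%u      ours:    p=%lu s=%lu me=%u d=0\n",
                   n, (two_n + 1) * n, two_n + 2 * n - 1, 2 * n);
        }
        return 0;
    }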

For a better understanding of the above restrictions and the advantages they entail, let us consider the example of Fig. 3, which shows a two-input fuzzy system with five antecedent functions per dimension. The generic ANFIS system requires the evaluation of 10 antecedents (5 per input dimension), which implies a total of 25 rules. Given the input pair $(x_1, x_2)$, the only two antecedents per axis that have to be evaluated are those highlighted in Fig. 3 (i.e. $M^1_3$, $M^1_4$, $M^2_2$ and $M^2_3$). Therefore, our system involves the evaluation of $2 \times 2 = 4$ antecedents and $2^2 = 4$ active rules:

$R_{12}$: IF $x_1$ is $M^1_3$ and $x_2$ is $M^2_2$ THEN $f$ is $c_{12}$
$R_{13}$: IF $x_1$ is $M^1_3$ and $x_2$ is $M^2_3$ THEN $f$ is $c_{13}$
$R_{17}$: IF $x_1$ is $M^1_4$ and $x_2$ is $M^2_2$ THEN $f$ is $c_{17}$
$R_{18}$: IF $x_1$ is $M^1_4$ and $x_2$ is $M^2_3$ THEN $f$ is $c_{18}$

Fig. 3. A two-input fuzzy system with the proposed restrictions. The active antecedents for the particular input are highlighted.

By Eqs. (2) and (3) we have that $M^1_3(x_1) = m^1_3(a^1_4 - x_1)$ and $M^1_4(x_1) = m^1_3(x_1 - a^1_3)$, where $m^1_3$ is the common slope between the triangles $M^1_3$ and $M^1_4$. On the other hand, $M^2_2(x_2) = m^2_2(a^2_3 - x_2)$ and $M^2_3(x_2) = m^2_2(x_2 - a^2_2)$, where $m^2_2$ is the common slope between the triangles $M^2_2$ and $M^2_3$. Hence, the system output is given by

$$f = m^1_3(a^1_4 - x_1)\, m^2_2(a^2_3 - x_2)\, c_{12} + m^1_3(a^1_4 - x_1)\, m^2_2(x_2 - a^2_2)\, c_{13} + m^1_3(x_1 - a^1_3)\, m^2_2(a^2_3 - x_2)\, c_{17} + m^1_3(x_1 - a^1_3)\, m^2_2(x_2 - a^2_2)\, c_{18} \quad (5)$$

As demonstrated above (see Eq. (4)), the two active membership functions in every dimension are complementary, i.e. $m^1_3(x_1 - a^1_3) = 1 - m^1_3(a^1_4 - x_1)$ and $m^2_2(x_2 - a^2_2) = 1 - m^2_2(a^2_3 - x_2)$; substituting these into Eq. (5) and regrouping, we obtain

$$f = m^1_3(a^1_4 - x_1)\, m^2_2(a^2_3 - x_2)\, (c_{12} - c_{13} - c_{17} + c_{18}) + m^1_3(a^1_4 - x_1)(c_{13} - c_{18}) + m^2_2(a^2_3 - x_2)(c_{17} - c_{18}) + c_{18} \quad (6)$$

The training process also clearly benefits from the above restrictions. On the one hand, during the back propagation phase the inference mechanism is carried out many times, so as long as it is simplified, the whole training process is equally accelerated. On the other hand, the above restrictions also imply that the three vertices of each triangle are not free parameters; they are "tied" together, as shown in Fig. 3. Thus, for example, the point $a^1_3$ is at the same time the left vertex of the triangle $M^1_4$, the right vertex of the triangle $M^1_2$, and the projection onto the axis of the central vertex of $M^1_3$. In other words, the only antecedent parameters that have to be adjusted in the learning process are the points $a^i_{j_i}$, and hence the learning process is simplified. Thus, in this example, we have to adjust only three parameters per axis ($a^i_2$, $a^i_3$ and $a^i_4$) against 15 parameters (five triangles with three vertices each) in the case of free triangles.

Restrictions like those described above have sometimes been applied in fuzzy system implementations [2,19] with the aim of reducing complexity. Moreover, triangular membership functions have proven to be effective in several application fields. In particular, membership functions like those used in this work produce a zero reconstruction error in the encoding (fuzzification) and decoding (defuzzification) processes [17,18]. In addition, although it might be thought that these fuzzy systems lose flexibility and modeling ability, it has been proven [19] that they are still universal approximators.
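Gathering the results of this section, the complete restricted inference for the two-input case can be summarized in a few lines of C. The sketch below is our own floating-point illustration (the row-major layout of the consequent array is an assumption, and all names are ours); Section 4 presents the hardware-oriented version.

    /* Two-input restricted inference via Eq. (6).  a1/a2 hold the m peak
     * positions per dimension (ascending), s1/s2 the common slopes, and
     * c the m x m consequents in row-major order (c[j*m + k] = c_jk). */
    double infer2(double x1, double x2, int m,
                  const double *a1, const double *s1,
                  const double *a2, const double *s2, const double *c)
    {
        int j = 0, k = 0;
        while (j < m - 2 && x1 > a1[j + 1]) ++j;  /* active interval, axis 1 */
        while (k < m - 2 && x2 > a2[k + 1]) ++k;  /* active interval, axis 2 */

        double A = s1[j] * (a1[j + 1] - x1);      /* A = M_j(x1), Eq. (2) */
        double B = s2[k] * (a2[k + 1] - x2);      /* B = M_k(x2), Eq. (2) */
        const double *r0 = c + j * m + k;         /* row of c_{j,k}   */
        const double *r1 = r0 + m;                /* row of c_{j+1,k} */

        /* Eq. (6) with the four active consequents */
        return A * B * (r0[0] - r0[1] - r1[0] + r1[1])
             + A * (r0[1] - r1[1]) + B * (r1[0] - r1[1]) + r1[1];
    }

As announced, there are no divisions, and the work is independent of m except for the two short interval searches.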

In the next section, we present several experiments that show that the proposed system has good approximation ability despite the restrictions.

3. Examples and results

In this section, we present some experiments showing that the neuro-fuzzy system presented in Section 2 still performs well despite the imposed restrictions. To be exact, we are going to see that our system has good approximation capability with respect to the generic ANFIS system (i.e. Gaussian-type antecedents with any overlapping degree). Note, however, that the generic system is necessarily more flexible and adaptable than our system. Hence, the conclusions presented in this section do not try to show which of the two methods approximates better; instead, we try to show that our system retains quite good approximation ability. To do this, we have carried out approximation experiments with some non-linear functions that have been used in other related works for testing these kinds of systems [1,4,9,15]. In every experiment, the two systems are trained with a collection of input–output data pairs – taken from the analytical functions – and the results obtained are compared and discussed. These results concern parameters such as training error rate, ability to generalize to non-trained data, learning speed, etc. The training error rate is given by the mean square error (MSE) between the desired output (training data) and the actual output (system output). On the other hand, the ability to generalize is given by the generalization mean square error (GMSE), which is the MSE obtained over a very large collection of points. In all the experiments the training starts with the membership functions equally spaced along each axis, and up to 1000 iterations are performed so the systems have enough time to converge. Furthermore, many different learning rates (for the back propagation algorithm) have been tested in order to obtain the best values for every experiment.

3.1. Experiment 1

The first experiment deals with the modeling of the following non-linear function:

$$f = \sin(\pi x)\sin(\pi y) \quad (7)$$

where x and y are the inputs and f is the output; y ranges from −1 to 0, and x and f range from −1 to 1. Fig. 4a shows this function graphically. To train the systems, 231 input–output pairs ((x, y), f) are taken by sampling x and y with, respectively, 21 and 11 equally distributed points. In these experiments m is varied from 5 to 10. On the other hand, the generalization ability is analyzed by computing the GMSE over a total of 10,000 (100 × 100) test points. Table 2 shows the results obtained in these experiments. First, we can see that the MSE values obtained by our system are very small: for example, a value of 10⁻⁴ corresponds to 0.01% in the output interval [−1, 1].
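For reference, the error measures used throughout this section can be computed as in the following sketch (ours; target is the analytical function, and model stands for a trained system, e.g. the hypothetical infer2 above wrapped with fixed parameters). Only the grid size distinguishes the training MSE from the GMSE.

    #include <math.h>

    #define PI 3.14159265358979323846

    /* Experiment 1 target: f = sin(pi*x) sin(pi*y), Eq. (7). */
    double f_exp1(double x, double y) { return sin(PI * x) * sin(PI * y); }

    /* Mean square error between target and model over an nx-by-ny grid of
     * equally distributed points on [x0,x1] x [y0,y1].  Called with the
     * 21 x 11 training grid it gives the MSE; with a 100 x 100 grid of
     * test points it gives the GMSE. */
    double grid_mse(double (*target)(double, double),
                    double (*model)(double, double),
                    double x0, double x1, double y0, double y1,
                    int nx, int ny)
    {
        double sum = 0.0;
        for (int i = 0; i < nx; ++i) {
            double x = x0 + (x1 - x0) * i / (nx - 1);
            for (int j = 0; j < ny; ++j) {
                double y = y0 + (y1 - y0) * j / (ny - 1);
                double e = target(x, y) - model(x, y);
                sum += e * e;
            }
        }
        return sum / ((double)nx * ny);
    }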

Fig. 4. Output surface for: (a) $f = \sin(\pi x)\sin(\pi y)$; (b) $f = (1 + x^{-2} + y^{-3/2})^2$.

Table 2
Results of experiment 1

       Our system                        Unrestricted ANFIS
m      MSE (×10⁻⁴)   GMSE (×10⁻⁴)        MSE (×10⁻⁴)   GMSE (×10⁻⁴)
5      16.7655       19.5400             0.0113        0.0141
6       5.2783        6.8400             0.0034        0.0076
7       2.3479        4.7400             0.0027        0.0088
8       1.1108        2.5800             0.0018        0.0102
9       0.7706        1.7400             0.0012        0.0133
10      0.0664        0.0800             0.0010        0.0194

Hence, our system has rather good learning ability. Next, as expected, the MSE values are smaller for the generic ANFIS than for our system. For example, taking m = 5 we have MSE = 0.0113 × 10⁻⁴ against MSE = 16.76 × 10⁻⁴. Clearly, the generic ANFIS is more flexible and hence it will generally produce better results. However, we can also see that as m is increased, the difference between the two systems is reduced. In fact, when m = 10 our system provides a value as small as 6.64 × 10⁻⁶. This fact can also be seen in Fig. 5, where the training error curves are depicted for the two systems. With respect to the generalization ability, similar conclusions can be drawn from the GMSE values. That is, although the generic system generalizes better, our system also provides small GMSE values. Furthermore, these values can be reduced by taking larger m values. Fig. 6 shows the output surfaces obtained by our system with different m values for the 100 × 100 test points. We can see how the output surface gets closer and closer to the original one as m increases. Finally, regarding the learning speed, we can also see in Fig. 5 that our system converges almost as rapidly as the ANFIS does. Thus, after approximately 100 iterations (or even fewer) the system almost reaches its final MSE value.

Fig. 5. Training error curves with different m values for experiment 1 (a) and experiment 2 (b). Solid curves and normal labels correspond to our system and dotted curves and cursive labels correspond to the generic ANFIS system.

Fig. 6. Experiment 1: output surface obtained by the proposed system for m = 5, 7, 9 and 10.

3.2. Experiment 2

The second experiment deals with the non-linear function given by the following equation:

$$f = (1 + x^{-2} + y^{-3/2})^2 \quad (8)$$

where $1 \le x, y \le 5$. This function is depicted in Fig. 4b. In order to train the systems, we have collected 121 input–output pairs ((x, y), f) obtained by sampling x and y with 11 equally distributed points each. Here m is varied from 5 to 9. The generalization ability is analyzed for a total of 10,000 (100 × 100) test points. Table 3 shows the results obtained. First, we can see again that our system provides low error rates when it learns from the training data. We can also see that for low m (e.g. m = 5) the ANFIS gives better results than our system. However, when m is higher, our system even surpasses the ANFIS, obtaining MSE values as low as 13 × 10⁻⁸. Fig. 5 shows the learning curves for this experiment. We see how our system clearly provides better results for m ≥ 7. Therefore we can conclude that examples (or functions) can be found where the triangular membership functions can fit a collection of training points better than the ANFIS, and better results can therefore be obtained. With regard to the generalization ability, Table 3 shows that, both in our system and in the ANFIS, the GMSE is small but is hardly reduced as m increases. In fact, in the ANFIS system the GMSE is even slightly higher for high m. In any case, the generalization ability provided by our system can be considered rather good, as can be seen in Fig. 7, where the output surfaces are shown for m = 5 and m = 10.

Table 3
Results of experiment 2

      Our system                        Unrestricted ANFIS
m     MSE (×10⁻⁴)   GMSE (×10⁻⁴)        MSE (×10⁻⁴)   GMSE (×10⁻⁴)
5     143.84        194.90              0.9546        7.2868
6     6.352         119.86              0.5792        7.4196
7     0.048         99.32               0.2536        8.8260
8     0.011         95.58               0.0509        9.912
9     0.0013        85.88               0.0054        14.240

Fig. 7. Experiment 2: output surfaces obtained by the proposed system for m = 5 and m = 10.

3.3. Experiment 3

The third experiment concerns the three-input (i.e. three-dimensional) non-linear function given by the equation:

$$f = (1 + x^{0.5} + y^{-1} + z^{-1.5})^2 \quad (9)$$

where $1 \le x, y, z \le 6$. In order to train the systems, we have collected 1331 input–output pairs ((x, y, z), f) obtained by sampling x, y and z with 11 equally distributed points each. Here, m is varied from 5 to 7.

For the generalization experiments we have taken a total of 1000 (10 × 10 × 10) non-training points. Table 4 shows the results obtained. As in the previous examples, we can see that our system has very good learning ability; moreover, for high m it again provides better MSE values than the ANFIS.

3.4. Experiment 4

Finally, with the aim of analyzing the performance of the system in a high-dimensional problem, an experiment with a four-input non-linear function has been carried out. The function is given by the following equation:

$$f = 4(x_1 - 0.5)(x_4 - 0.5)\sin\left(2\pi\sqrt{x_2^2 + x_3^2}\right) \quad (10)$$

where $-1 \le x_i \le 1$ ($1 \le i \le 4$). To train the systems, 4096 (i.e. $8^4$) input–output pairs have been collected by sampling each $x_i$ with eight equally distributed points. In addition, another 2401 intermediate points have been collected for testing the generalization ability. The results obtained are shown in Table 5. The behaviour of the system is as good as in the previous examples. Moreover, in this case both systems (our system and the generic ANFIS) lead to very similar results for the different m values.

Table 4
Results of experiment 3

      Our system                        Unrestricted ANFIS
m     MSE (×10⁻⁴)   GMSE (×10⁻⁴)        MSE (×10⁻⁴)   GMSE (×10⁻⁴)
5     534.3         470                 21.4          67.3
6     7.877         245                 9.6           47.9
7     0.053         213                 0.22          48.97

Table 5
Results of experiment 4

      Our system               Unrestricted ANFIS
m     MSE        GMSE          MSE        GMSE
4     0.70       0.42          0.9        0.5
5     0.16       0.15          0.1        0.06
6     0.05       0.05          0.08       0.05

From these experiments, we can conclude that the proposed system has quite good approximation capability. Generally speaking, it has been shown that the system is able to approximate the target functions with a precision comparable to that of the generic ANFIS by refining the partitions of the input universes (i.e. by adding new antecedents). Note, however, that using more antecedents implies hardly any additional computation, because the number of active rules is always $2^n$ regardless of the total number of rules in the system (see Table 1); only a few additional comparisons are required to localize the active antecedents. Therefore, the precision of the system can be made arbitrarily high without sacrificing its response time. This is a clear advantage of the proposed system.

4. An efficient hardware implementation

In Section 2 we explained how the inference mechanism of an ANFIS system is largely simplified when the proposed restrictions are applied, and hence how efficient implementations can be achieved. In the following paragraphs we explain how these implementations can be performed. First, we shall assume that, after a previous off-line training phase, the points $a^1_{j_1}$ and $a^2_{j_2}$, the slopes $m^1_{j_1}$, $m^2_{j_2}$ and the consequents $c_j$ are stored in a memory. Then, for every new input, the output is provided by computing Eq. (6). To compute this equation we first need to perform three steps: (1) localize the two active triangles in each dimension; this is easily performed by comparing the inputs $x_i$ with the points $a^i_{j_i}$; (2) calculate the differences between the input coordinates and the vertices of these active triangles (i.e. $x_i - a^i_{j_i}$); and (3) load from memory the n involved slopes and the $2^n$ involved consequents, which is easy because the previous comparisons also determine the values to be loaded. In summary, these steps are very simple and can be carried out rapidly. Once the above values are available, we can finally compute Eq. (6). This equation is a typical sum of products, which allows a high degree of concurrency or parallelism, so it can also be performed rapidly in a hardware implementation.

As an example, we have carried out a digital hardware implementation of the inference mechanism for a two-input system. To understand this implementation better, we first describe the process in the algorithm below, in which we assume that the 2 × m slopes and the m × m consequents are stored in the arrays slopes[2][m] and conseq[m][m], respectively.

    INPUT 1:
        j = 1;
        while (x1 > a1[j+1]) do j = j + 1;   /* locate the active interval [a1[j], a1[j+1]] */
        x1 = x1 - a1[j];                     /* offset inside the interval */
        m1 = slopes[1][j];                   /* common slope of the active pair */

    INPUT 2:
        k = 1;
        while (x2 > a2[k+1]) do k = k + 1;
        x2 = x2 - a2[k];
        m2 = slopes[2][k];

    CONSEQUENTS:
        c1 = conseq[j][k];      c2 = conseq[j][k+1];
        c3 = conseq[j+1][k];    c4 = conseq[j+1][k+1];

    OUTPUT:
        l1 = m1 * x1;            /* l1 = M1[j+1](x1) by Eq. (3); 1 - l1 = M1[j](x1) */
        l2 = m2 * x2;
        f = (1-l1)(1-l2)c1 + (1-l1)l2 c2 + l1(1-l2)c3 + l1 l2 c4;

Let us now describe a digital hardware implementation of this algorithm. Note that although the algorithm describes the operations sequentially, some of them can be executed concurrently.

Fig. 8. Scheme of the system implementation.

The scheme of the hardware implementation is depicted in Fig. 8. First, the "FETCH" module localizes the active triangles for each dimension by comparing the inputs $x_1$ and $x_2$ with the points $a^1_{j_1}$ and $a^2_{j_2}$, respectively. As a result, it provides the updated $x_1$ and $x_2$ values (i.e. $x_i - a^i_{j_i}$) and the indexes for selecting the involved slopes and consequents. Only a comparator, a subtractor and a counter are required for each input in this module. Next, the "SELECT" module receives the previously calculated indexes and, by using multiplexers, selects the slopes and consequents involved. An 8-bit fixed-point numerical representation has been used for all these values: the inputs and slopes are unsigned values and the consequents are signed ones. Once all the needed values have been collected, the last step is to compute the output as indicated in the algorithm. This process is carried out by the "ALU" module which, as can be seen, has a fully parallel architecture that performs the four products in parallel. Another feature that simplifies the design further is that the values $1 - m^i_{j_i}(x_i - a^i_{j_i})$ are obtained directly by complementing the respective $m^i_{j_i}(x_i - a^i_{j_i})$.

This implementation has been programmed and verified on an Altera Stratix II (EP2S15) FPGA using the Quartus II 4.0 software for system description, timing simulation and programming. The fitter reports the use of 577/12,480 (4%) look-up tables (LUTs) and 14/96 (14%) DSP blocks. On the other hand, the timing analyzer reports a maximum clock frequency of 116.54 MHz. The "FETCH" and "SELECT" modules are executed in just one clock cycle, and the "ALU" module in three clock cycles; therefore an inference is performed in just four clock cycles. Hence, the system is able to perform an inference in about 34 ns (four cycles at 116.54 MHz); in other words, it performs about 29 million inferences per second.
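The complementing trick has a simple fixed-point reading: if memberships are stored as unsigned 8-bit values where 255 stands for 1.0, then $1 - \mu$ is just the bitwise inverse of $\mu$. The C sketch below is our own model of the ALU datapath under these assumptions, not the actual RTL; the returned value carries a scale factor of 255² that the real design absorbs when truncating to the output width.

    #include <stdint.h>

    /* 1 - mu in Q0.8 (255 = 1.0) is the one's complement of mu. */
    static inline uint8_t complement(uint8_t mu) { return (uint8_t)~mu; }

    /* Weighted combination of the four active consequents (signed 8-bit),
     * mirroring the four parallel products of the ALU module. */
    int32_t alu(uint8_t mu1, uint8_t mu2,
                int8_t c1, int8_t c2, int8_t c3, int8_t c4)
    {
        uint8_t n1 = complement(mu1), n2 = complement(mu2);
        /* each term is an 8x8x8-bit product, at most 24 bits wide */
        return (int32_t)n1  * n2  * c1 + (int32_t)n1  * mu2 * c2
             + (int32_t)mu1 * n2  * c3 + (int32_t)mu1 * mu2 * c4;
    }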


5. Conclusions

A neuro-fuzzy system allowing simple and efficient implementations has been proposed. The neuro-fuzzy system is an ANFIS-type system in which some restrictions are imposed so that a very simple inference mechanism is obtained. In particular, these restrictions are applied to the antecedent functions in three aspects: (1) they have to be triangular; (2) they have to be overlapped with an overlapping degree of two; and (3) given a particular input, the sum of the membership functions over all the antecedents of every dimension must be equal to 1. Some experimental results show that, despite the imposed restrictions, the system retains quite good approximation capability. A high performance HW implementation has also been presented for this neuro-fuzzy system. This implementation has been carried out on an Altera Stratix II (EP2S15) FPGA, achieving a performance of about 29 million inferences per second.

Acknowledgements

This paper has been partially supported by the University of the Basque Country (UPV224.310-E-15871/2004) and the Basque Country Government under Grant SA-2006/00015.

References

[1] K. Basterretxea, J.M. Tarela, I. del Campo, G. Bosque, An experimental study on nonlinear function computation for neural/fuzzy hardware design, IEEE Transactions on Neural Networks 18 (1) (2007) 266–283.
[2] I. Baturone, S. Sanchez-Solano, Microelectronic design of universal fuzzy controllers, Mathware and Soft Computing 8 (2001) 303–319.
[3] H.R. Berenji, P. Khedkar, Learning and tuning fuzzy logic controllers through reinforcements, IEEE Transactions on Neural Networks 3 (5) (1992) 724–740.
[4] V. Cherkassky, D. Gehring, F. Mulier, Comparison of adaptive methods for function estimation from samples, IEEE Transactions on Neural Networks 7 (4) (1996) 969–984.
[5] M.M. Gupta, D.H. Rao, Neuro-Control Systems: Theory and Applications (a selected reprint volume), IEEE Press, New York, 1994.
[6] K. Hirota, W. Pedrycz, Neurocomputations with fuzzy flip-flops, Proceedings of the International Joint Conference on Neural Networks 2 (1993) 1867–1870.
[7] H. Ishibuchi, R. Fujioka, H. Tanaka, Neural networks that learn from fuzzy if-then rules, IEEE Transactions on Fuzzy Systems 1 (2) (1993) 85–97.
[8] J.-S. Jang, ANFIS: adaptive-network-based fuzzy inference system, IEEE Transactions on Systems, Man and Cybernetics 23 (1993) 665–685.
[9] S.L. Lee, C.S. Ouyang, A neuro-fuzzy system modeling with self-constructing rule generation and hybrid SVD-based learning, IEEE Transactions on Fuzzy Systems 11 (3) (2003) 341–353.
[10] Y.-G. Leu, W.-Y. Wang, T.-T. Lee, Observer-based direct adaptive fuzzy-neural control for nonaffine nonlinear systems, IEEE Transactions on Neural Networks 16 (4) (2005) 853–861.
[11] F.-J. Lin, P.-H. Shen, Adaptive fuzzy-neural-network control for a DSP-based permanent magnet linear synchronous motor servo drive, IEEE Transactions on Fuzzy Systems 14 (4) (2006) 481–495.
[12] J. Bosco Mbede, P. Ele, C.-M. Mveh-Abia, Y. Toure, V. Graefe, S. Ma, Intelligent mobile manipulator navigation using adaptive neuro-fuzzy systems, Information Sciences 171 (4) (2005) 447–474.
[13] P. Melin, O. Castillo, Intelligent control of a stepping motor drive using an adaptive neuro-fuzzy inference system, Information Sciences 170 (2–4) (2005) 133–151.
[14] P. Melin, O. Castillo, An intelligent hybrid approach for industrial quality control combining neural networks, fuzzy logic and fractal theory, Information Sciences 177 (7) (2007) 1531–1728.
[15] S.-K. Oh, W. Pedrycz, H.-S. Park, Genetically optimized fuzzy polynomial neural networks, IEEE Transactions on Fuzzy Systems 14 (1) (2006) 125–144.
[16] M. Panella, An input–output clustering approach to the synthesis of ANFIS networks, IEEE Transactions on Fuzzy Systems 13 (1) (2005) 69–81.
[17] W. Pedrycz, Why triangular membership functions?, Fuzzy Sets and Systems 64 (1994) 21–30.
[18] W. Pedrycz, F. Gomide, Fuzzy Systems Engineering: Toward Human-Centric Computing, IEEE Press/John Wiley & Sons, New Jersey, 2007 (Chapter 9).
[19] R. Rovatti, Fuzzy piecewise-multilinear and piecewise-linear systems as universal approximators in Sobolev norms, IEEE Transactions on Fuzzy Systems 6 (2) (1998) 235–249.
[20] G. Serra, C. Bottura, An IV-QR algorithm for neuro-fuzzy multivariable online identification, IEEE Transactions on Fuzzy Systems 15 (2) (2007) 200–210.
[21] H. Takagi, I. Hayashi, NN-driven fuzzy reasoning, International Journal of Approximate Reasoning 5 (3) (1991) 191–212.
[22] L.H. Tsoukalas, R.E. Uhrig, Fuzzy and Neural Approaches in Engineering, John Wiley & Sons, New York, 1997.