Control of a mobile robot using generalized dynamic fuzzy neural networks


Microprocessors and Microsystems 28 (2004) 491–498 www.elsevier.com/locate/micpro

M.J. Er*, Tien Peng Tan, Sin Yee Loh

School of Electrical and Electronic Engineering, Nanyang Technological University, Block S1, Nanyang Avenue, Singapore, Singapore 639798

Received 13 April 2003; revised 20 April 2004; accepted 28 April 2004

Available online 28 May 2004

Abstract

This paper presents the design and implementation of a neural fuzzy controller suitable for real-time control of an autonomous mobile robot. The neural fuzzy controller is developed based on the Generalized Dynamic Fuzzy Neural Networks (GDFNN) learning algorithm of Wu et al. (IEEE Transactions on Fuzzy Systems 9 (4), 2001, 578–594). Not only can the parameters of the controller be optimized, but the structure of the controller is also self-adaptive. Experimental results show that the proposed controller outperforms a conventional fuzzy-logic-based controller.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Generalized dynamic fuzzy neural networks; Structure learning; Mobile robot control

1. Introduction

One of the challenging tasks in mobile robot navigation is to ensure that the robot follows a given trajectory and avoids any obstacles placed along it. Mobile robots can be deployed for military applications, intelligent transportation, seaport automation, airport automation, hospital services or outdoor storage and retrieval systems. The robot's actions depend directly on its perception of the world by means of its sensors.

A fuzzy-logic-based approach was selected initially since fuzzy logic provides human-like reasoning capabilities for dealing with uncertainties. Fuzzy logic systems represent knowledge in linguistic form, which permits the designer to define a highly abstract behavior in an intuitive fashion. Unlike the conventional fuzzy control algorithm, which requires predefined and fixed fuzzy rules, the Generalized Dynamic Fuzzy Neural Networks (GDFNN) learning algorithm enables fuzzy rules to be recruited or deleted dynamically and parameters to be estimated automatically. Not only can it learn from the environment at a fast rate, but it also supports dynamic self-organizing and adaptive learning. Furthermore, it allows the mobile robot to build its own controller online by means of supervised learning. Because of these properties, the GDFNN is also employed in a wide range of applications such as static function approximation, nonlinear system identification and multilink robot control [3].

* Corresponding author. Tel.: +65-67906850; fax: +65-63162065. E-mail address: [email protected] (M.J. Er).

0141-9331/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.micpro.2004.04.002

2. Problem formulation

The objective of this work is to implement the neural fuzzy controller on the Khepera II mobile robot [1,2] so as to control it for wall-following tasks in real time. The Khepera II robot is widely used around the world as a platform for various robotics experiments and applications. It is cylindrical in shape, small and compact, measuring 70 mm in diameter and 30 mm in height. Its weight of 80 g and small size allow the experiment to be performed in a small area. Fig. 1 shows a typical Khepera environment in which the robot is controlled by a workstation via a serial link.

The basic configuration of the Khepera is composed of the CPU board and the sensory/motor board. The sensory/motor board includes two DC motors coupled with incremental sensors and eight analogue infrared (IR) proximity sensors. Each IR sensor is composed of an emitter and an independent receiver. The sensors measure the absolute ambient light and estimate, by reflection, the position of an object relative to the robot. These readings provide feedback information to the GDFNN-based controller.

Fig. 1. The Khepera II robot with a gripper module and its working environment.

3. Review of conventional fuzzy logic approach

To facilitate the comparison with the GDFNN approach, we first review the conventional fuzzy logic control approach. A block diagram of the conventional fuzzy logic controller is shown in Fig. 2, and the eight IR sensors on the Khepera II are shown in Fig. 3. The sensor readings are integer values in the range [0, 1023]. A sensor value of 1023 indicates that the robot is very close to the object, while a sensor value of 0 indicates that the robot does not receive any reflection of the infrared signal. The readings are normalized to the range [0, 1]. Four input linguistic variables, namely the distance to the left $D_l$, to the front $D_f$, to the right $D_r$ and to the back $D_b$ of the robot, are defined from the eight IR sensors as shown in Fig. 3. The diagonal left and right sensors, $S_1$ and $S_4$ respectively, are not used. The physical domains over which these linguistic variables are defined are given as follows:

$$D_l = S_0, \qquad D_f = \frac{S_2 + S_3}{2}, \qquad D_r = S_5, \qquad D_b = \frac{S_6 + S_7}{2}. \tag{1}$$

Fig. 2. Block diagram of the fuzzy logic controller.

Fig. 3. Position and orientation of sensors on the Khepera II.

The Mamdani type of fuzzy reasoning method is used to compute the output of the fuzzy controller, and defuzzification is carried out using the Mean-of-Maxima method. For each input linguistic variable, we define two membership functions, each of which is a Gaussian function of the following form:

$$\mu_{ij} = \exp\left(-\frac{(x_j - a_{ij})^2}{2 b_{ij}^2}\right) \tag{2}$$

where $x_j$ is the normalized input sensor value, $a_{ij}$ is the center value and $b_{ij}$ is the width of the corresponding membership function curve. The membership function curves for each output linguistic variable, left motor speed and right motor speed, are approximated by singleton functions for simplicity.

The set of rules that controls the Khepera to follow a wall is constructed in an intuitive manner, mimicking a human operator's wall-following and obstacle-avoidance strategy. For example, if there is an obstacle near to the right, the robot is driven such that it tilts left (illustrated in Rule 3) to avoid the obstacle. Likewise, if the robot moves away from the obstacle on the right, it is driven again to tilt right (illustrated in Rule 1) so as to follow the wall very closely. The rules implemented are listed in Table 1. Fig. 4 shows the environment where the robot performs its wall-following task. It should be noted that the rules in Table 1 were derived such that the robot navigates in an anticlockwise manner.

Fig. 4. Environment used for evolving the robot.

While implementing the fixed-rule-based approach, it was discovered that only some rules are activated during the experiment. The remaining rules are redundant since they are never activated in the process. Fig. 5 shows the frequency with which each rule is activated for the wall-following task, and Fig. 6 shows the speeds of the left and right motors corresponding to the activated rules. Under this scheme, the Khepera robot can follow a wall successfully. However, adding rules manually is time consuming and the approach is inflexible. The GDFNN-based controller seeks to overcome this major drawback.

4. Generalized dynamic fuzzy neural-networks-based approach

We now introduce the GDFNN. The architecture of the GDFNN [3] is based on extended ellipsoidal basis function (EBF) neural networks, which are functionally equivalent to TSK fuzzy systems [4] in the special case where the rule consequents are constant. In the proposed design, the fuzzy controller has four inputs and two outputs, $y_1$ and $y_2$, corresponding to the two motor speeds. The structure of the GDFNN is shown in Fig. 7. Each of the four input linguistic variables has two membership functions. The membership functions shown in layer 2 are Gaussian functions of the following form:

$$\mu_{ij}(x_i) = \exp\left[-\frac{(x_i - c_{ij})^2}{\sigma_{ij}^2}\right], \qquad i = 1, 2, \dots, r, \quad j = 1, 2, \dots, u \tag{3}$$

where $\mu_{ij}$ is the $j$th membership function of input variable $x_i$, and $c_{ij}$ and $\sigma_{ij}$ are the center and width of the $j$th Gaussian membership function of $x_i$, respectively. The T-norm operator is used to compute each rule's firing strength, and the output of the $j$th rule $R_j$ $(j = 1, 2, \dots, u)$ in layer 3 is given by

$$\phi_j(x_1, x_2, \dots, x_r) = \exp\left[-\sum_{i=1}^{r} \frac{(x_i - c_{ij})^2}{\sigma_{ij}^2}\right], \qquad j = 1, 2, \dots, u. \tag{4}$$

The outputs are required to be crisp values. Defuzzification is performed in layer 5 using the center-of-gravity method, and the $i$th crisp output is given by

$$y_i = \sum_{j=1}^{u} \bar{\phi}_j w_{ij}, \qquad i = 1, 2, \dots, m \tag{5}$$

where $\bar{\phi}_j$ denotes the normalized firing strength of the $j$th rule computed in layer 4 and $w_{ij}$ is the weight connecting the $j$th rule to the $i$th output.

Table 1. Linguistic rules for wall following

| Rule No. | Distance: Left ($D_l$) | Distance: Front ($D_f$) | Distance: Right ($D_r$) | Distance: Back ($D_b$) | Motor speed: Left motor | Motor speed: Right motor | Direction |
|---|---|---|---|---|---|---|---|
| 01 | Far | Far | Far | Far | Fwd slow | Fwd medium | Tilt right a bit |
| 02 | Far | Far | Far | Near | Fwd fast | Fwd fast | Front |
| 03 | Far | Far | Near | Far | Fwd medium | Fwd slow | Tilt left a bit |
| 04 | Far | Far | Near | Near | Fwd medium | Fwd slow | Tilt left a bit |
| 05 | Far | Near | Far | Far | Fwd fast | Bwd fast | Tilt left |
| 06 | Far | Near | Far | Near | Fwd fast | Bwd fast | Tilt left |
| 07 | Far | Near | Near | Far | Fwd fast | Bwd fast | Tilt left |
| 08 | Far | Near | Near | Near | Fwd fast | Bwd fast | Tilt left |
| 09 | Near | Far | Far | Far | Fwd slow | Fwd medium | Tilt right a bit |
| 10 | Near | Far | Far | Near | Fwd slow | Fwd medium | Tilt right a bit |
| 11 | Near | Far | Near | Far | Fwd medium | Fwd medium | Front |
| 12 | Near | Far | Near | Near | Fwd medium | Fwd medium | Front |
| 13 | Near | Near | Far | Far | Bwd fast | Fwd fast | Tilt right |
| 14 | Near | Near | Far | Near | Bwd fast | Fwd fast | Tilt right |
| 15 | Near | Near | Near | Far | Bwd fast | Fwd fast | Tilt right |
| 16 | Near | Near | Near | Near | Stop | Stop | Stop |

Fig. 5. Frequency of rules being fired.

Fig. 7. Architecture of the generalized dynamic fuzzy neural networks.
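The fixed rule base of Table 1, together with Eqs. (1) and (2), can be exercised with a minimal Python sketch. The paper does not report the membership centers and widths or the singleton motor speeds, so the values in `MEMBERSHIP` and `SPEED` below are hypothetical placeholders, as are all function names. The sketch assumes the minimum operator for the rule antecedents and takes the mean of maxima over the aggregated singleton consequents.

```python
import math
from itertools import product

# Hypothetical (center a, width b) pairs on normalized readings in [0, 1].
# A reading near 0 means no reflection (far); near 1 means very close (near).
MEMBERSHIP = {"Far": (0.0, 0.35), "Near": (1.0, 0.35)}

# Hypothetical singleton speeds for the output labels of Table 1.
SPEED = {"Fwd slow": 2.0, "Fwd medium": 5.0, "Fwd fast": 8.0,
         "Bwd fast": -8.0, "Stop": 0.0}

# Consequents copied from Table 1, rules 01..16.
LEFT = ["Fwd slow", "Fwd fast", "Fwd medium", "Fwd medium",
        "Fwd fast", "Fwd fast", "Fwd fast", "Fwd fast",
        "Fwd slow", "Fwd slow", "Fwd medium", "Fwd medium",
        "Bwd fast", "Bwd fast", "Bwd fast", "Stop"]
RIGHT = ["Fwd medium", "Fwd fast", "Fwd slow", "Fwd slow",
         "Bwd fast", "Bwd fast", "Bwd fast", "Bwd fast",
         "Fwd medium", "Fwd medium", "Fwd medium", "Fwd medium",
         "Fwd fast", "Fwd fast", "Fwd fast", "Stop"]

# Antecedents (Dl, Df, Dr, Db) in the row order of Table 1.
ANTECEDENTS = list(product(("Far", "Near"), repeat=4))


def gaussian(x, a, b):
    """Eq. (2): Gaussian membership with center a and width b."""
    return math.exp(-((x - a) ** 2) / (2.0 * b ** 2))


def distances(s):
    """Eq. (1): map normalized readings S0..S7 to (Dl, Df, Dr, Db)."""
    return (s[0], (s[2] + s[3]) / 2.0, s[5], (s[6] + s[7]) / 2.0)


def mean_of_maxima(strengths, labels):
    """Aggregate rule strengths per output label, then average the peak singletons."""
    aggregated = {}
    for strength, label in zip(strengths, labels):
        aggregated[label] = max(aggregated.get(label, 0.0), strength)
    peak = max(aggregated.values())
    best = [SPEED[lab] for lab, s in aggregated.items() if math.isclose(s, peak)]
    return sum(best) / len(best)


def control(sensors):
    """Return (left speed, right speed) for eight normalized sensor readings."""
    d = distances(sensors)
    strengths = [min(gaussian(x, *MEMBERSHIP[lab]) for x, lab in zip(d, labels))
                 for labels in ANTECEDENTS]
    return mean_of_maxima(strengths, LEFT), mean_of_maxima(strengths, RIGHT)
```

With these placeholder values, an obstacle seen only by the two front sensors selects Rule 05, so `control([0, 0, 1, 1, 0, 0, 0, 0])` yields the "Fwd fast" and "Bwd fast" singletons for the left and right motors.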

In layer 4, normalization is performed such that the normalized firing strength $\bar{\phi}_i$ is

$$\bar{\phi}_i = \frac{\phi_i}{\sum_{j=1}^{u} \phi_j}, \qquad i = 1, 2, \dots, u. \tag{6}$$

In order to construct and train the GDFNN, two types of learning, namely structure learning and parameter learning, are used. The fuzzy system is constructed rule by rule. The firing strength of each rule shown in Eq. (4) can be regarded as a function of the regularized Mahalanobis distance (M-distance), i.e.

$$\phi_j = \exp\left(-md^2(j)\right) \tag{7}$$

where

$$md(j) = \sqrt{(X - C_j)^T \Sigma_j^{-1} (X - C_j)} \tag{8}$$

is the M-distance, $X = (x_1, \dots, x_r)^T \in \mathbb{R}^r$, $C_j = (c_{1j}, c_{2j}, \dots, c_{rj})^T \in \mathbb{R}^r$ and $\Sigma_j^{-1}$ is defined as follows:

$$\Sigma_j^{-1} = \begin{bmatrix} \dfrac{1}{\sigma_{1j}^2} & 0 & \cdots & 0 \\ 0 & \dfrac{1}{\sigma_{2j}^2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \dfrac{1}{\sigma_{rj}^2} \end{bmatrix}, \qquad j = 1, 2, \dots, u. \tag{9}$$
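The computation described by Eqs. (5) to (9) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the nested-list data layout (`centers[j][i]`, `widths[j][i]`, `weights[o][j]`) are assumptions made here, not part of the original implementation.

```python
import math


def m_distance(x, center, width):
    """Eq. (8) with the diagonal matrix of Eq. (9) for one rule."""
    return math.sqrt(sum(((xi - c) / s) ** 2 for xi, c, s in zip(x, center, width)))


def forward(x, centers, widths, weights):
    """Forward pass of a network with u rules, r inputs and m outputs.

    centers[j][i], widths[j][i]: center and width of rule j for input i.
    weights[o][j]: connection weight from rule j to output o.
    """
    # Eq. (7): firing strength of each rule from its M-distance.
    phi = [math.exp(-m_distance(x, c, s) ** 2) for c, s in zip(centers, widths)]
    # Eq. (6): normalization in layer 4 (assumes at least one rule fires).
    total = sum(phi)
    phi_bar = [p / total for p in phi]
    # Eq. (5): crisp outputs as weighted sums of normalized firing strengths.
    y = [sum(w * p for w, p in zip(row, phi_bar)) for row in weights]
    return phi, phi_bar, y
```

For a single input placed midway between two rule centers with equal widths, both rules fire equally, so the output is the plain average of the two weights.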

From the viewpoint of fuzzy logic control, a fuzzy rule is a local representation over a region defined in the input space. If a new sample falls within this local region, the GDFNN does not generate a new rule but accommodates the new sample by updating the parameters of existing rules. When an observation $(X^k, Y^k)$, $k = 1, 2, \dots, n$, enters the system, we calculate the M-distance $md_k(j)$ between the observation $X^k$ and the centers $C_j$ $(j = 1, 2, \dots, u)$ of the existing EBF units according to Eqs. (8) and (9). Next, find

$$J = \arg\min_{1 \le j \le u} \left(md_k(j)\right). \tag{10}$$

If

$$md_{k,\min} = md_k(J) > k_d \tag{11}$$

where $k_d$ is a pre-specified threshold that may decay during the learning process, the existing system cannot adequately represent the new sample and a new rule should be considered.

Fig. 6. Speeds of the motor driven by the rules being fired.

Once a new rule is generated, the next step is to assign initial centers and widths to the corresponding membership functions. The incoming multidimensional input vector $X^k$ is projected onto the corresponding one-dimensional membership function space of each input variable, and the Euclidean distance $ed_i(j)$ between the data $x_i^k$ and the boundary set $\Phi_i$ is computed as follows:

$$ed_i(j) = \left| x_i^k - \Phi_i(j) \right|, \qquad j = 1, 2, \dots, u + 2 \tag{12}$$

where $\Phi_i(j) \in \{x_{i\min}, c_{i1}, c_{i2}, \dots, c_{iu}, x_{i\max}\}$. Next, find

$$j_n = \arg\min_{j = 1, 2, \dots, u+2} \left(ed_i(j)\right). \tag{13}$$

If

$$ed_i(j_n) \le k_{mf} \tag{14}$$

where $k_{mf}$ is a predefined constant that controls the similarity of neighboring membership functions, we assume that $x_i^k$ can be completely represented by the existing fuzzy set $A_{ij_n}(c_{ij_n}, \sigma_{ij_n})$ without generating a new membership function. Otherwise, a new Gaussian membership function is allocated whose center is

$$c_{i(u+1)} = x_i^k \tag{15}$$

and whose width is

$$\sigma_{i(u+1)} = k \max\left\{ \left| c_{i(u+1)} - c_{i-1} \right|, \left| c_{i(u+1)} - c_{i+1} \right| \right\} \tag{16}$$

where $c_{i-1}$ and $c_{i+1}$ are the two centers of the membership functions neighboring the new membership function, and $k$ is an overlap factor, generally chosen between 1.05 and 1.2.

Sometimes a rule may be active initially but gradually come to contribute little to the system. Such redundant rules not only increase the online computation load but also degrade the overall system performance. If inactive rules can be deleted as learning progresses, a more parsimonious network topology can be achieved. In the GDFNN learning algorithm, a pruning mechanism called the Error Reduction Ratio (ERR) method is used [3].

4.1. Error reduction ratio

Given $n$ input–output pairs $\{X(k), Y(k)\}$, $k = 1, 2, \dots, n$, consider Eq. (5) as a special case of the linear regression model:

$$y(k) = \sum_{j=1}^{u} w_j(k)\, \bar{\phi}_j(k) + e(k) \tag{17}$$

where $\bar{\phi}_j(k)$ is the normalized firing strength of the $j$th rule given by Eq. (6). Arranging Eq. (17) in matrix form:

$$D = H\theta + E \tag{18}$$

where $D = [y(1), \dots, y(n)]^T \in \mathbb{R}^n$ is the desired output, $H = (\bar{\phi}_1 \cdots \bar{\phi}_u) \in \mathbb{R}^{n \times u}$ contains the regressors, $\theta = [w_1 \cdots w_u]^T \in \mathbb{R}^u$ contains real parameters, and $E \in \mathbb{R}^n$ is the error vector, which is assumed to be uncorrelated with the regressors $\bar{\phi}_i$ $(i = 1, 2, \dots, u)$. We can transform $H$ into a set of orthogonal basis vectors by QR decomposition

$$H = PN \tag{19}$$

where $P = (p_1, p_2, \dots, p_u) \in \mathbb{R}^{n \times u}$ has the same dimension as $H$ with orthogonal columns and $N \in \mathbb{R}^{u \times u}$ is an upper triangular matrix. This transformation makes it possible to calculate individual contributions to the desired output energy from each basis vector. Substituting Eq. (19) into Eq. (18) yields

$$D = PN\theta + E = PG + E. \tag{20}$$

The linear least square (LLS) solution of $G$ is given by $G = (P^T P)^{-1} P^T D$, or

$$g_i = \frac{p_i^T D}{p_i^T p_i}, \qquad i = 1, 2, \dots, u. \tag{21}$$

The quantities $G$ and $\theta$ satisfy the following equation:

$$N\theta = G. \tag{22}$$

As $p_i$ and $p_j$ are orthogonal for $i \ne j$, the sum of squares or energy of $D$ is given by

$$D^T D = \sum_{i=1}^{u} g_i^2\, p_i^T p_i + E^T E. \tag{23}$$

If $D$ is the desired output vector after its mean has been removed, the variance of $D$ is given by

$$n^{-1} D^T D = n^{-1} \sum_{i=1}^{u} g_i^2\, p_i^T p_i + n^{-1} E^T E. \tag{24}$$

It is seen that $\sum_{i=1}^{u} g_i^2\, p_i^T p_i / n$ is the part of the desired output variance that can be explained by the regressors $p_i$, and $E^T E / n$ is the unexplained variance of $D$. Thus, $g_i^2\, p_i^T p_i / n$ is the increment to the explained desired output variance introduced by $p_i$, and the ERR due to $p_i$ is defined as

$$err_i = \frac{g_i^2\, p_i^T p_i}{D^T D}, \qquad i = 1, 2, \dots, u. \tag{25}$$

Substituting $g_i$ by Eq. (21), we have

$$err_i = \frac{(p_i^T D)^2}{p_i^T p_i\, D^T D}, \qquad i = 1, 2, \dots, u. \tag{26}$$

The above equation offers a simple and effective means of seeking a subset of significant regressors.

4.2. Sensitivity of fuzzy rules

For multi-output systems which have $m$ outputs (in our case, $m = 2$), we can define the ERR matrix $\Delta = (\rho_1, \rho_2, \dots, \rho_u) \in \mathbb{R}^{m \times u}$, whose elements are obtained from Eq. (26) and whose $j$th column is the total ERR corresponding to the $j$th rule. Furthermore, define

$$\eta_j = \sqrt{\frac{\rho_j^T \rho_j}{m}}, \qquad j = 1, 2, \dots, u \tag{27}$$

then $\eta_j$ represents the significance of the $j$th rule. If

$$\eta_j < k_{err}, \qquad j = 1, 2, \dots, u \tag{28}$$

where $k_{err}$ is a pre-specified threshold, then the $j$th rule is deleted.

After the system structure is adjusted according to the learning algorithm, parameter learning is performed. We attempt to minimize

$$V = \frac{1}{2} E\left(y(k) - y_d(k)\right)^2 \tag{29}$$

where $y_d(k)$ is the desired output and $V$ is the cost function. When there is more than one output, the square of the Euclidean distance between the actual and the desired output vectors may be used. For each training data set, starting at the input nodes of layer 1, the forward pass is used to compute all the node outputs up to layer 4, and the consequent parameters are identified by the well-known LLS method or RLS method to improve the learning speed. Then, in the backward pass, the error signals propagate backward and the premise parameters are updated by the gradient descent method of Jang et al. [5].

Suppose that $u$ fuzzy rules are generated for $n$ observations with $r$ input variables and the LLS method is used. Rewriting Eq. (5) in matrix form yields

$$Y = W\Phi \tag{30}$$

where $W = [w_1 \cdots w_u] \in \mathbb{R}^u$, $\Phi = (\bar{\phi}_1 \cdots \bar{\phi}_u)^T \in \mathbb{R}^{u \times n}$ collects the normalized firing strengths of Eq. (6), and $Y \in \mathbb{R}^n$. Assume that the desired output is $T = (t_1, t_2, \dots, t_n) \in \mathbb{R}^n$. The problem of determining the optimal parameters $W^*$ can be formulated as a linear problem of minimizing $\| W\Phi - T \|^2$, and $W^*$ is determined by the pseudoinverse technique

$$W^* = T (\Phi^T \Phi)^{-1} \Phi^T \tag{31}$$

where $\Phi^T$ is the transpose of $\Phi$ and $\Phi^+ = (\Phi^T \Phi)^{-1} \Phi^T$ is the pseudoinverse of $\Phi$.

Using a standard gradient descent method, the parameter vector is updated as follows:

$$Z(k+1) = Z(k) - \Gamma\, \nabla_z V(Z(k)) \tag{32}$$

where $\Gamma$ is a pre-determined learning rate, and

$$\nabla_z V(Z(k)) = \left[ \frac{\partial V}{\partial z_1}, \frac{\partial V}{\partial z_2}, \dots, \frac{\partial V}{\partial z_q} \right] \tag{33}$$

is the gradient of $V$ with respect to the parameters $z_1, \dots, z_q$. In the GDFNN, if $u$ fuzzy rules are generated with $r$ input variables and $m$ output variables, $z$ consists of two sets of parameters, $z = (c_{11}, \dots, c_{ru}, \sigma_{11}, \dots, \sigma_{ru})$. The parameters are updated as follows:

$$c_{ij}(k+1) = c_{ij}(k) - \Gamma_s \left( \frac{\phi_j}{\sum_{h=1}^{u} \phi_h} \right) \left( \frac{2 (x_i - c_{ij})}{\sigma_{ij}^2} \right) \times \sum_{l=1}^{m} \left( y_l(k) - y_{dl}(k) \right) \left( w_{lj} - y_l(k) \right) \tag{34}$$

$$\sigma_{ij}(k+1) = \sigma_{ij}(k) - \Gamma_s \left( \frac{\phi_j}{\sum_{h=1}^{u} \phi_h} \right) \left( \frac{2 (x_i - c_{ij})^2}{\sigma_{ij}^3} \right) \times \sum_{l=1}^{m} \left( y_l(k) - y_{dl}(k) \right) \left( w_{lj} - y_l(k) \right). \tag{35}$$

Fig. 8 shows a flowchart for the GDFNN learning algorithm.

Fig. 8. Flowchart of the learning algorithm for the GDFNN.

Table 2. Linguistic rules after learning wall-following

| Rule No. | Distance: Left ($D_l$) | Distance: Front ($D_f$) | Distance: Right ($D_r$) | Distance: Back ($D_b$) | Connection weight: Left motor | Connection weight: Right motor |
|---|---|---|---|---|---|---|
| 01 | Far | Far | Average | Far | −0.133 | 1.170 |
| 02 | Far | Somehow far | Near | Near | 0.030 | 0.010 |
| 03 | Far | Average | Somehow far | Far | 1.335 | −1.74 |
| 04 | Far | Average | Near | Near | 1.391 | −1.348 |
| 05 | Near | Far | Far | Far | −0.03 | 0.055 |
| 06 | Near | Far | Near | Near | 0.038 | −0.032 |
| 07 | Near | Near | Far | Far | 0.168 | −0.175 |
| 08 | Near | Near | Near | Near | 0.243 | −0.279 |

Fig. 9. Learning errors for wall following.

Fig. 10. Tuned membership function curves.
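The rule-generation test of Eqs. (10) and (11) and the membership-function allocation of Eqs. (12) to (16) can be illustrated with the following Python sketch. All function and argument names are hypothetical. Two assumptions are made that the text does not spell out: the neighbors $c_{i-1}$ and $c_{i+1}$ are taken to be the nearest entries of the boundary set on either side of the new center, and the sample is assumed to lie strictly inside the range of the input variable.

```python
import math


def m_distance(x, center, width):
    """M-distance of Eq. (8) with the diagonal matrix of Eq. (9)."""
    return math.sqrt(sum(((xi - c) / s) ** 2 for xi, c, s in zip(x, center, width)))


def needs_new_rule(x, centers, widths, k_d):
    """Eqs. (10)-(11): a new rule is considered if the closest rule is too far."""
    if not centers:  # no rule exists yet, so the first sample always creates one
        return True
    md_min = min(m_distance(x, c, s) for c, s in zip(centers, widths))
    return md_min > k_d


def allocate_membership(x_i, centers_i, x_min, x_max, k_mf, k=1.1):
    """Eqs. (12)-(16) for one input variable.

    Returns None when an existing fuzzy set already represents x_i,
    otherwise the (center, width) of the new Gaussian membership function.
    """
    boundary = [x_min] + list(centers_i) + [x_max]    # boundary set, u + 2 entries
    ed_min = min(abs(x_i - b) for b in boundary)      # Eqs. (12) and (13)
    if ed_min <= k_mf:                                # Eq. (14)
        return None
    center = x_i                                      # Eq. (15)
    lower = max(b for b in boundary if b < center)    # assumed neighbor below
    upper = min(b for b in boundary if b > center)    # assumed neighbor above
    width = k * max(abs(center - lower), abs(center - upper))  # Eq. (16)
    return center, width
```

As a usage example, a sample at 0.5 with existing centers at 0.1 and 0.9 and `k_mf = 0.2` is not covered by either set, so a new membership function is allocated with a width of roughly `1.1 * 0.4`.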

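The ERR computation of Eq. (26) and the rule significance of Eqs. (27) and (28) can be sketched as follows. This is a hedged illustration rather than the authors' code: it uses classical Gram-Schmidt orthogonalization as a stand-in for the QR decomposition of Eq. (19), assumes the regressor columns are linearly independent, and all names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(columns):
    """Classical Gram-Schmidt: orthogonal vectors p_1..p_u playing the role of P."""
    ortho = []
    for h in columns:
        p = list(h)
        for q in ortho:
            coeff = dot(q, h) / dot(q, q)
            p = [pi - coeff * qi for pi, qi in zip(p, q)]
        ortho.append(p)
    return ortho


def error_reduction_ratios(columns, d):
    """Eq. (26): ERR of every regressor for one desired output vector d."""
    return [dot(p, d) ** 2 / (dot(p, p) * dot(d, d)) for p in orthogonalize(columns)]


def rule_significance(columns, outputs):
    """Eq. (27): eta_j computed from the ERR of rule j over all m outputs."""
    err = [error_reduction_ratios(columns, d) for d in outputs]  # m rows, u columns
    m = len(outputs)
    return [math.sqrt(sum(row[j] ** 2 for row in err) / m)
            for j in range(len(columns))]


def surviving_rules(columns, outputs, k_err):
    """Eq. (28): indices of the rules whose significance is at least k_err."""
    eta = rule_significance(columns, outputs)
    return [j for j, e in enumerate(eta) if e >= k_err]
```

Here `columns[j]` holds the $n$ normalized firing strengths of rule $j$ and `outputs[l]` holds the $n$ desired values of output $l$. Note that with Gram-Schmidt the ERR values depend on the order in which the regressors are processed.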

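One gradient-descent step on the premise parameters, in the spirit of Eqs. (34) and (35), can be sketched as below. The names, the nested-list layout (`c[j][i]`, `s[j][i]`, `w[l][j]`) and the single scalar `rate` are assumptions for illustration; the cost is taken to be half the squared Euclidean distance between the actual and desired outputs for one sample.

```python
import math


def forward(x, c, s, w):
    """Firing strengths (Eq. (4)), their sum, and the outputs (Eqs. (5)-(6))."""
    phi = [math.exp(-sum(((xi - cij) / sij) ** 2
                         for xi, cij, sij in zip(x, cj, sj)))
           for cj, sj in zip(c, s)]
    total = sum(phi)
    y = [sum(wlj * pj for wlj, pj in zip(wl, phi)) / total for wl in w]
    return phi, total, y


def cost(x, yd, c, s, w):
    """Half the squared output error for one training sample."""
    _, _, y = forward(x, c, s, w)
    return 0.5 * sum((yl - ydl) ** 2 for yl, ydl in zip(y, yd))


def update_premise(x, yd, c, s, w, rate):
    """Return updated centers and widths after one step of Eqs. (34)-(35)."""
    phi, total, y = forward(x, c, s, w)
    new_c = [row[:] for row in c]
    new_s = [row[:] for row in s]
    for j in range(len(c)):
        # Factor shared by both updates for rule j.
        common = (phi[j] / total) * sum(
            (y[l] - yd[l]) * (w[l][j] - y[l]) for l in range(len(w)))
        for i in range(len(x)):
            diff = x[i] - c[j][i]
            new_c[j][i] = c[j][i] - rate * common * 2.0 * diff / s[j][i] ** 2
            new_s[j][i] = s[j][i] - rate * common * 2.0 * diff ** 2 / s[j][i] ** 3
    return new_c, new_s
```

A quick sanity check is to compare the analytic step with a finite-difference estimate of the gradient of `cost`, which is a convenient way to catch sign or index mistakes when adapting the update to a different parameterization.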

5. Experimental results and discussions

The GDFNN-based controller was implemented to navigate the Khepera II robot. The robot has to learn a simple behavior from a supervisor, which means that a supervisor or 'teacher' must operate over the network in an online manner. A wall-following robot developed prior to this work was chosen as the teacher to train the GDFNN controller. Wall-following was conducted in the environment shown in Fig. 4. We define three inputs (the back sensor does not provide much information) and two outputs for the controller, normalized within the intervals [0, 1] and [−1, 1], respectively. The rules after learning are shown in Table 2.

Experimental results show that with only eight rules, the robot is able to learn the desired behavior successfully after about 400 iterations, with the instantaneous error decreasing to an average of about 0.1. This corresponds to an actual output error of about 1 pulse per 10 ms, or about 8 mm/s, which is relatively insignificant. It is also observed that there are a number of error peaks, which occur every time the robot makes a 90° turn in our test environment. The error peaks decrease with time because the robot is learning. The interval between the error peaks also shows that the robot has adjusted itself to the desired behavior (Fig. 9). The direction of robot movement is anticlockwise during training. The tuned membership functions are shown in Fig. 10.

No rules were required initially. After training, only eight rules had been generated automatically and the robot could complete the wall-following task successfully. The number of rules does not increase exponentially with the number of input variables. Rules whose membership functions overlap to a large extent with those of other rules are not generated, and rules that contribute little or may degrade the system performance are deleted automatically. Compared with the conventional fuzzy logic approach, the GDFNN is a more efficient algorithm. Not only can the parameters of the fuzzy system be adjusted, but the structure of the fuzzy system is also self-adaptive.
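As a worked check of the quoted output error, the two figures given above (1 pulse per 10 ms and about 8 mm/s) can be related directly. The encoder resolution $d_{pulse}$ is not stated in the text; the value below is simply what the two quoted figures imply together.

```latex
\[
\frac{1\ \text{pulse}}{10\ \text{ms}} = 100\ \frac{\text{pulses}}{\text{s}},
\qquad
d_{pulse} = \frac{8\ \text{mm/s}}{100\ \text{pulses/s}} = 0.08\ \frac{\text{mm}}{\text{pulse}} .
\]
```

An error of one speed unit therefore corresponds to the wheel travelling about 0.08 mm more or less than commanded in each 10 ms control interval.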

6. Conclusions

In this paper, the GDFNN-based controller has been successfully implemented on the Khepera II mobile robot. By virtue of the GDFNN learning algorithm, not only can the parameters of the controller be adjusted, but the structure of the controller is also self-adaptive. The experiment shows that the developed controller has a parsimonious structure and that the performance of the system is superior to that of the conventional fuzzy logic approach.

References

[1] Khepera II, K-team, 2002, http://www.k-team.com.

[2] F. Mondada, E. Franzi, P. Ienne, Mobile robot miniaturization: a tool for investigation in control algorithms, Informatik (1994) 17–20.

[3] S. Wu, M.J. Er, Y. Gao, A fast approach for automatic generation of fuzzy rules by generalized dynamic fuzzy neural networks, IEEE Transactions on Fuzzy Systems 9 (4) (2001) 578–594.

[4] M. Sugeno, T. Takagi, Fuzzy identification of systems and its application to modeling and control, IEEE Transactions on Systems, Man, and Cybernetics 15 (1) (1985) 116–132.

[5] J.-S.R. Jang, C.T. Sun, E. Mizutani, Neuro-fuzzy and Soft Computing, Prentice-Hall, Englewood Cliffs, NJ, 1997.