A Fuzzy Classifier System for evolutionary learning of robot behaviors

Applied Mathematics and Computation 91 (1998) 73-81

Yasushi Iwakoshi a,*, Takeshi Furuhashi b, Yoshiki Uchikawa b

a Chubu Hitachi Electric Co. Ltd., Furo-cho, Chikusa-ku, Nagoya 464-01, Japan
b Department of Information Electronics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-01, Japan

Abstract

This paper presents evolutionary learning of robot behaviors by a Fuzzy Classifier System (FCS). The FCS uses a fuzzy rule base and a fuzzy inference system in place of the rule base and the production system of the Classifier System (CS), which makes it robust to environmental changes. The FCS is applied to the behavioral learning of a mobile robot in order to evolve fuzzy control rules. Simulations are carried out under eight different conditions, and the robustness of the acquired fuzzy rules is compared with that of the rules obtained by the CS. © 1998 Published by Elsevier Science Inc. All rights reserved.

Keywords: Fuzzy Classifier System (FCS); Fuzzy logic; Evolutionary learning

1. Introduction

The problem addressed here is whether a robot with short-sighted sensors and no map can reach a goal within a limited number of steps without crashing into walls. Onuma and Hoshino studied this problem using the Classifier System (CS) [1] and pointed out that the CS was not robust to environmental changes. Valenzuela-Rendon proposed a Fuzzy Classifier System (FCS) [2] by introducing a fuzzy rule base and a fuzzy inference system in place of the rule base and the production system of the CS, respectively. The FCS can handle continuous variables. In Ref. [2], however, the FCS was applied only to the approximation of a single-input single-output function, and the apportionment of credits to fuzzy rules describing multi-input systems had not been studied.

The authors have studied the application of the FCS to knowledge acquisition for large-scale systems [3-5]. A new method for the apportionment of credits to fuzzy rules was proposed in Refs. [4,5]. This method made it possible to find fuzzy rules that describe knowledge in complex multi-input/output systems.

This paper studies evolutionary learning of robot behaviors by the FCS and examines the robustness of the FCS to a changing environment. The FCS is applied to find fuzzy control rules for driving a mobile robot. Simulations incorporating the experimental conditions are carried out, and the robustness of the fuzzy rules found is compared with that of the production rules acquired by the CS.

*Corresponding author. E-mail: [email protected]. 0096-3003/98/$19.00 © 1998 Published by Elsevier Science Inc. All rights reserved. PII: S0096-3003(97)10006-6

2. Mobile robot

The robot used for the experiments is shown in Fig. 1. The robot is a micromouse on which eight infrared sensors are installed. The allocation of the sensors is shown in Fig. 2. The infrared sensors s1-s8 detect the distance between an obstacle and the sensor itself within the range of 5-35 cm. The robot moves under the control of its own CPU.

Fig. 1. Exterior of robot.

Fig. 2. Allocation of sensors. [Labels in the figure: sensors, frame, front face, 200 mm.]

3. Fuzzy classifier system

Fig. 3 shows the configuration of the FCS. The FCS consists of a fuzzy inference system, an apportionment of credit system, a fuzzy rule base, and a rule generation mechanism.

3.1. Fuzzy rule base

The fuzzy rule base holds n fuzzy rules. The input variables of each rule are the values detected by the infrared sensors s1-s8. The antecedent part of each rule consists of eight loci, and each locus takes one of the five labels, small (S), medium small (MS), medium (M), medium big (MB), and big (B), of the membership functions shown in Fig. 4(a). The output is the steering angle u of the robot. The locus for this output also takes one of the five labels of the membership functions in Fig. 4(b).

Fig. 3. Configuration of fuzzy classifier system. [Blocks visible in the figure: rule generation mechanism (GA), fuzzy inference system, apportionment of credit, environment; signals: sensing, action, payoff.]

Fig. 4. Membership functions. (a) Grade versus distance from obstacle (cm), axis ticks 5, 20, 35, labels S, MS, M, MB, B. (b) Grade versus steering angle, axis ticks -40 (left), 0, 40 (right), labels NB, NS, ZO, PS, PB.
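The rule encoding described in Section 3.1 can be sketched as follows. This is only an illustration: the paper gives the label names and the axis ticks of Fig. 4, but not the exact shapes, so the evenly spaced triangular membership functions, the label centres, and the names `DISTANCE_CENTRES`, `STEERING_CENTRES`, `distance_grade`, and `example_rule` are all assumptions.

```python
from typing import Dict, Tuple

# Assumed label centres (cm) for the antecedent labels of Fig. 4(a).
# Only the ticks 5, 20 and 35 appear in the figure; even spacing is an assumption.
DISTANCE_CENTRES: Dict[str, float] = {
    "S": 5.0, "MS": 12.5, "M": 20.0, "MB": 27.5, "B": 35.0,
}

# Assumed label centres (degrees) for the consequent labels of Fig. 4(b).
STEERING_CENTRES: Dict[str, float] = {
    "NB": -40.0, "NS": -20.0, "ZO": 0.0, "PS": 20.0, "PB": 40.0,
}


def triangular_grade(x: float, centre: float, half_width: float) -> float:
    """Grade of a symmetric triangular membership function (assumed shape)."""
    return max(0.0, 1.0 - abs(x - centre) / half_width)


def distance_grade(label: str, distance_cm: float) -> float:
    """Grade of a sensor reading for one label, clamped to the 5-35 cm range."""
    clamped = min(35.0, max(5.0, distance_cm))
    return triangular_grade(clamped, DISTANCE_CENTRES[label], half_width=7.5)


# A rule: eight antecedent loci (one label per sensor s1..s8) and one consequent label.
Rule = Tuple[Tuple[str, ...], str]
example_rule: Rule = (("S", "MS", "M", "B", "B", "M", "MS", "S"), "PS")
```

With this layout, neighbouring labels overlap so that a reading between two centres belongs partly to both, which is the property that lets the FCS handle continuous sensor values.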

3.2. Fuzzy inference system

The fuzzy inference system senses the environment through the eight infrared sensors s1-s8 and drives the robot. Using the fuzzy rules in the fuzzy rule base, this system generates the steering angle command u for the drive system of the robot. The inference method used in this system is the product-sum-centre of gravity method. The robot is controlled until it reaches the goal or collides with the wall. Payoffs are then given to the FCS according to the result of the control.
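A minimal sketch of product-sum-centre of gravity inference is given below. It assumes that the consequent membership functions are symmetric and of equal area, in which case the centre of gravity of the summed, scaled consequents reduces to a weighted average of the label centres. The centres in `STEERING_CENTRES`, the function names, and the choice to steer straight when no rule fires are assumptions, not details taken from the paper.

```python
from math import prod
from typing import Dict, List, Sequence, Tuple

# Assumed centres (degrees) of the consequent labels NB..PB of Fig. 4(b).
STEERING_CENTRES: Dict[str, float] = {
    "NB": -40.0, "NS": -20.0, "ZO": 0.0, "PS": 20.0, "PB": 40.0,
}


def truth_value(grades: Sequence[float]) -> float:
    """Product step: combine the eight antecedent grades of one rule."""
    return prod(grades)


def infer_steering(fired: List[Tuple[float, str]]) -> float:
    """Sum and centre-of-gravity steps for (truth value, consequent label) pairs.

    With symmetric, equal-area consequents (an assumption), the centre of
    gravity is the truth-value-weighted mean of the label centres.
    """
    total = sum(weight for weight, _ in fired)
    if total == 0.0:
        return 0.0  # assumption: steer straight when no rule fires
    weighted = sum(weight * STEERING_CENTRES[label] for weight, label in fired)
    return weighted / total
```

For example, two rules firing with truth values 0.25 (consequent PS) and 0.75 (consequent ZO) give a steering command of 5 degrees under these assumed centres.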

3.3. Apportionment of credit system

The apportionment of credit system delivers credits to the fuzzy rules in the fuzzy rule base. The credit given to each rule is proportional to the payoff from the environment and to the degree to which the rule contributed to the result of the control. The credit c of a fuzzy rule is updated as follows.

(a) When the robot reaches the goal,

$$c = c + \sum_{i=0}^{n_g-1} \omega_{k_g-i} \times 1000 \times |u_{k_g-i}|, \qquad (1)$$

where i is the sampling sequence of the controller, n_g is the number of steps from the start to the goal, k_g is the sampling sequence at the time of the goal, ω_x is the truth value of the fuzzy rule at x, and u_x is the defuzzified value of the output of the rule at x, i.e. the central position of the membership function in the consequent portion.

(b) When the robot reaches one of the sub-goals in the maze,

$$c = c + \sum_{i=0}^{n_{sg}-1} \omega_{k_{sg}-i} \times 1000 \times |u_{k_{sg}-i}|, \qquad (2)$$

where k_sg is the sampling sequence at the time the sub-goal is reached and n_sg is the number of steps to the sub-goal from the previous sub-goal.

(c) When the robot collides with the wall,

$$c = c - \sum_{i=0}^{n_f-1} \omega_{k_f-i} \times 1000 \times |u_{k_f-i}|, \qquad (3)$$

where k_f is the sampling sequence at the time of the collision and n_f is the number of steps over which rules are considered to have contributed to the failure.
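The three credit updates share one form: a sum of truth value times 1000 times the absolute rule output over the most recent steps, added on success and subtracted on collision. A sketch under that reading is shown below; the function names, the per-rule history lists, and the symbol c for the credit are illustrative assumptions.

```python
from typing import Sequence

PAYOFF_SCALE = 1000.0  # the constant that appears in Eqs. (1)-(3)


def credit_delta(truth_values: Sequence[float],
                 outputs: Sequence[float],
                 n_steps: int) -> float:
    """Sum of omega * 1000 * |u| over the last n_steps samples of one rule.

    truth_values[t] and outputs[t] are the rule's truth value and defuzzified
    output at sampling step t; the final entry is the step of the event
    (goal, sub-goal, or collision).
    """
    if n_steps <= 0:
        return 0.0
    recent = list(zip(truth_values, outputs))[-n_steps:]
    return sum(w * PAYOFF_SCALE * abs(u) for w, u in recent)


def update_credit(credit: float,
                  truth_values: Sequence[float],
                  outputs: Sequence[float],
                  n_steps: int,
                  collided: bool) -> float:
    """Add the delta for a goal or sub-goal, subtract it for a collision."""
    delta = credit_delta(truth_values, outputs, n_steps)
    return credit - delta if collided else credit + delta
```

For a goal, n_steps would be n_g; for a sub-goal, n_sg; for a collision, n_f. Because the delta is scaled by the truth value, rules that did not fire in the window receive no credit or blame.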

3.4. Rule generation mechanism

Fuzzy rules are selected and reproduced using the genetic algorithm (GA). The FCS controls the robot in n_e mazes. The GA operation is then applied to the fuzzy rules using the accumulated credits. The n_sel rules with the lowest credits c are changed. The change to each rule is made by crossover or mutation, selected randomly.

The one-point crossover operation is applied to a rule among the n_sel rules and a chromosome reproduced from the remaining n - n_sel rules. One of the two generated rules is chosen randomly to replace the rule among the n_sel rules. If the labels of the membership functions in its antecedent are the same as those of one of the existing rules, the generated rule is deleted. The crossover operation is repeated until a rule with a new antecedent part is generated.

When the mutation operation is selected, the labels of the membership functions in the antecedent of one of the n_sel rules are changed. This mutation operation is likewise applied until a rule with a new antecedent part is generated.

Another mutation operation, which changes the labels of the membership functions in the consequent portions, is applied to the n rules with a probability of p_m. After these genetic operations have been applied, the credits of the n rules are set to zero. The rules of the next generation are then complete, and the control of the robot is resumed.
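One generation of the rule generation mechanism might be sketched as below. Several details are assumptions because the paper does not specify them: crossover and mutation are chosen with equal probability, the crossover partner is drawn uniformly from the surviving rules, the chromosome is taken to be the eight antecedent loci followed by the consequent locus, antecedent mutation changes a single locus, and `max_tries` bounds the retry loop. All function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

ANTECEDENT_LABELS = ("S", "MS", "M", "MB", "B")
CONSEQUENT_LABELS = ("NB", "NS", "ZO", "PS", "PB")
Rule = Tuple[Tuple[str, ...], str]  # (eight antecedent labels, consequent label)


def one_point_crossover(a: Rule, b: Rule, rng: random.Random) -> Rule:
    """Cross two rules at one random cut and return one of the two children."""
    ga, gb = a[0] + (a[1],), b[0] + (b[1],)  # nine loci each (assumed layout)
    cut = rng.randint(1, len(ga) - 1)
    child = rng.choice([ga[:cut] + gb[cut:], gb[:cut] + ga[cut:]])
    return child[:-1], child[-1]


def mutate_antecedent(rule: Rule, rng: random.Random) -> Rule:
    """Change the label at one randomly chosen antecedent locus (assumption)."""
    ante = list(rule[0])
    locus = rng.randrange(len(ante))
    ante[locus] = rng.choice([l for l in ANTECEDENT_LABELS if l != ante[locus]])
    return tuple(ante), rule[1]


def next_generation(rules: Sequence[Rule], credits: Sequence[float], n_sel: int,
                    p_m: float, rng: random.Random,
                    max_tries: int = 1000) -> Tuple[List[Rule], List[float]]:
    """Replace the n_sel lowest-credit rules, mutate consequents, reset credits."""
    order = sorted(range(len(rules)), key=lambda i: credits[i])
    survivors = [rules[i] for i in order[n_sel:]]  # requires n_sel < len(rules)
    new_rules = list(rules)
    for i in order[:n_sel]:
        existing = {r[0] for r in new_rules}  # antecedents already in the base
        for _ in range(max_tries):
            if rng.random() < 0.5:  # assumed 50/50 choice of operator
                cand = one_point_crossover(new_rules[i], rng.choice(survivors), rng)
            else:
                cand = mutate_antecedent(new_rules[i], rng)
            if cand[0] not in existing:  # keep only a new antecedent part
                new_rules[i] = cand
                break
    for i, (ante, cons) in enumerate(new_rules):
        if rng.random() < p_m:  # consequent mutation applied to all n rules
            new_rules[i] = (ante, rng.choice([c for c in CONSEQUENT_LABELS if c != cons]))
    return new_rules, [0.0] * len(new_rules)
```

Rejecting candidates whose antecedent already exists keeps every rule in the base responsible for a distinct region of the sensor space, which is the stated purpose of repeating the operation.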

4. Simulations

Simulations were carried out under n_e = 8 different conditions, as shown in Fig. 5. Each maze was 5 m wide and 5 m long and was divided into 25 sub-areas. The center of every border between sub-areas was a sub-goal. The actual experimental conditions, including the robot in Figs. 1 and 2 and the environment, were incorporated into the simulations. The default direction of the robot at the start point was parallel to the wall. The starting direction was changed randomly within ±30 degrees of the default direction at every trial.

Fig. 5. Mazes for learning.

The fuzzy rule base had n = 100 rules. The number of steps used to identify the rules contributing to a failure, n_f, was 2. The probability of mutation in the consequent portion of the fuzzy rules, p_m, was 0.02. The number of rules screened out by the selection operation, n_sel, was 10. Through the genetic operations, the FCS found fuzzy rules that could control the robot to reach the goal under all eight conditions without colliding with the wall. The tracks of the robot are shown in Fig. 6.

Fig. 6. Tracks of the robot.

The robustness of the acquired fuzzy rules was compared with that of the rules obtained by the CS. Fig. 7 shows the conditions for the comparison. The rule base of the CS held production rules. Each input space was divided crisply, as shown in Fig. 7: a distance of less than 20 cm from an obstacle to the sensor was detected as near, and a distance larger than 20 cm was detected as far. The symbol * meant "don't care" about the obstacle. The steering angle u was discretized into five different angles. The CS could also find rules to control the robot to reach the goal.

Fig. 7. Conditions of FCS and CS. [Two panels. FCS: fuzzy rule base, with antecedent labels S, MS, M, MB, B over distance from obstacle (cm, ticks 5, 20, 35) and a consequent over steering angle (-40 left, 0, 40 right). CS: production rule base, with a crisp antecedent over distance from obstacle (cm), a "don't care" symbol *, and a consequent over steering angle (-40 left, 0, 40 right).]

Fig. 8 shows the mazes used for evaluating the rules acquired by the FCS and the CS. The GA operations were stopped during the evaluation. Ten start points were selected, as indicated by the numbers in Fig. 8. At each start point, the rules of the FCS and those of the CS were each tested 360 times, with the initial direction of the robot rotated by 1° at every test. Table 1 shows the success rates at each start point. The rules acquired by the FCS performed better than those acquired by the CS, with an average success rate of 11.5% against 6.6%. Fig. 9 shows an example of the track of the robot controlled by the rules of the FCS, starting from start point 8. The robot moved along the left side wall.

Fig. 8. Mazes for evaluation.

Table 1
Success rates of rules by FCS and CS in unknown mazes

Start point    FCS (%)    CS (%)
1              9          2
2              0          0
3              18         3
4              3          3
5              7          0
6              22         22
7              0          0
8              14         6
9              31         19
10             11         11
Average        11.5       6.6
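The averages reported in Table 1 can be checked directly from the per-start-point success rates. The short sketch below does only that arithmetic; the variable and function names are illustrative.

```python
from typing import Sequence

# Success rates (%) at start points 1-10, as listed in Table 1.
FCS_RATES = [9, 0, 18, 3, 7, 22, 0, 14, 31, 11]
CS_RATES = [2, 0, 3, 3, 0, 22, 0, 6, 19, 11]


def average_rate(rates: Sequence[float]) -> float:
    """Arithmetic mean of the success rates, rounded to one decimal place."""
    return round(sum(rates) / len(rates), 1)


def points_where_fcs_wins(fcs: Sequence[float], cs: Sequence[float]) -> int:
    """Number of start points at which the FCS rate is strictly higher."""
    return sum(1 for f, c in zip(fcs, cs) if f > c)
```

The FCS rules are strictly better at five of the ten start points and tie at the other five, so the higher FCS average is not offset by any start point where the CS does better.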

Fig. 9. Example of the result.

5. Conclusions

This paper presented evolutionary learning of robot behaviors by the FCS. The FCS has the feature of being robust to environmental changes. Simulations were carried out under eight different conditions. The robustness of the acquired fuzzy rules was better than that of the rules obtained by the CS. The authors are further studying the representation of fuzzy control rules and the apportionment of credit system in order to acquire more robust fuzzy rules.

References

[1] K. Onuma, T. Hoshino, Evolutionary learning of robot behaviors by classifier systems, JSAI Technical Report, 1994, pp. 9-14.
[2] M. Valenzuela-Rendon, The fuzzy classifier system: A classifier system for continuously varying variables, in: Proceedings of the Fourth International Conference on Genetic Algorithms, 1991, pp. 346-353.
[3] T. Furuhashi, K. Nakaoka, K. Morikawa, Y. Uchikawa, Controlling excessive fuzziness in a fuzzy classifier system, in: Proceedings of the Fifth International Conference on Genetic Algorithms, 1993, pp. 635.
[4] T. Furuhashi, K. Nakaoka, Y. Uchikawa, A study on knowledge finding using fuzzy classifier systems, J. Japan Soc. Fuzzy Theory Systems 7 (4) (1995) 839-848.
[5] K. Nakaoka, T. Furuhashi, Y. Uchikawa, A study on apportionment of credits of fuzzy classifier system for knowledge acquisition of large scale systems, in: Proceedings of Third IEEE International Conference on Fuzzy Systems, 1994, pp. 1797-1800.