On-line evolvable fuzzy system for ATM cell-scheduling


Journal of Systems Architecture 52 (2006) 169–183 www.elsevier.com/locate/sysarc

Meng Hiot Lim *, Ju Hui Li, Qi Cao

Center for Integrated and Circuits Systems, School of EEE, Block S1, Nanyang Technological University, Singapore 639798, Singapore

Received 5 February 2003; received in revised form 22 December 2004; accepted 5 May 2005
Available online 11 July 2005

Abstract

Algorithms for the ATM cell-scheduling problem include first-in-first-out (FIFO), static priority (SPR) and dynamically weighted priority scheduling (DWPS), as well as other traditional schemes. However, these traditional algorithms lack flexibility. FIFO and SPR cannot adapt to changes in the cell flow environment. DWPS is more adaptable to changing traffic flow, but its performance also degrades if the cell flow changes dramatically. To address these issues, we propose the framework of an evolvable fuzzy system (EFS). The system is intrinsically evolvable and able to carry out on-line adaptation to meet the desired QoS requirement. The EFS is realizable as a form of evolvable fuzzy hardware (EFH) by means of a reconfigurable fuzzy inference chip (RFIC). With an implementation of the EFS as EFH that carries out intrinsic evolution and on-line adaptation, some open issues pertinent to evolvable hardware (EHW) can be addressed.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Evolvable fuzzy system; Evolvable fuzzy hardware; Intrinsic evolution; On-line adaptation; Cell-scheduling algorithm

* Corresponding author. Tel.: +65 67905408; fax: +65 67933318. E-mail addresses: [email protected] (M.H. Lim), [email protected] (J.H. Li), [email protected] (Q. Cao).

1. Introduction

A genetic algorithm (GA) can be considered a computational simulation of natural evolution, inspired by Darwin's theory of evolution [2]. Innovation in field programmable gate array (FPGA) technology has provided a novel and flexible means of implementing hardware systems [5]. Combining such flexibility in hardware implementation with evolutionary algorithms gives the field of evolvable hardware (EHW) a new dimension. We quote from Yao and Higuchi [8] one definition of EHW: "Evolvable hardware (EHW) refers to hardware that can change its architecture and behavior dynamically and autonomously by interacting with its environment". In recent years, EHW has gained recognition as a possible alternative in many applications. Many

1383-7621/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.sysarc.2005.05.003


M.H. Lim et al. / Journal of Systems Architecture 52 (2006) 169–183

research papers focus on EHW [8,10,11,13–22,24]. EHW has the potential to deal with an unknown environmental interface without the need for human intervention. It can be useful for space exploration vehicles and other unmanned applications. Another important application area of EHW is real-time control. EHW can explore many possible alternatives in the solution space to find an appropriate architecture or scheme to perform the task. The solutions found by EHW are often beyond human preconceived knowledge. At times, it is difficult to fully understand many of the circuit architectures evolved by EHW [8].

Although we may have high expectations of EHW, the research results for real-time applications so far have not been very exciting. Much work has been done using EHW to solve many kinds of problems, for example robot control [13,22], artificial hand control [16] and ATM cell-scheduling [11,25]. Many of these efforts rely on general computing machines to execute the evolution and evaluation of feasible architectures. In [16], the author proposed a gate-level evolvable hardware chip in which the evolution and evaluation mechanisms are integrated on a chip. This can be looked upon as progress towards practical application. However, it is still necessary to rely on some form of general computational platform for full functionality.

The reliance on a general computing machine can be ascribed to the complexity of the systems. Without general computing platforms, only small and simple systems can be evolved efficiently. To address such issues, a method referred to as increased complexity evolution has been proposed [17]. The main idea of this method is to divide a complex system into many simple sub-systems and evolve the small sub-systems. The whole network consisting of the sub-systems can also be evolved if necessary. This is called second-level evolution.
Some experimental results and comparisons on evolving character recognition systems were given. The results show that this method can dramatically decrease the time spent on evolution and evaluation. From the viewpoint of the granularity of EHW, the sub-system can be looked upon as a form of first-level evolution granularity. From the transistor level [18], gate

level [15,16,22], functional level [11,15] to sub-system level [17], the granularity level of EHW is increasing. Following this trend, we can adopt another level of evolutionary granularity: the fuzzy rule. In this mechanism, the system explores good fuzzy rule sets to carry out real-time control. The final fuzzy rule set can be downloaded into a reconfigurable fuzzy inference chip (RFIC). If the fuzzy rule set changes, the configuration of the RFIC adapts accordingly [24].

To demonstrate the workability of our idea in an application area, we attempt to solve the ATM cell-scheduling problem using our evolvable fuzzy system (EFS). There are many algorithms that cater for the desired QoS (quality of service) requirements of ATM cell-scheduling. The common switching schemes are first-in-first-out (FIFO), static priority (SPR) and dynamically weighted priority scheduling (DWPS) [21]. FIFO is a very simple method of cell scheduling. It is easy to implement in hardware, but does not perform well in terms of QoS. SPR is also simple, but it tends to be unfairly biased towards a certain cell class [9,21]. DWPS is a significant improvement over the SPR scheme. It adjusts the priority according to the cell flow scenario. However, its adaptation scheme is simple and may not be very efficient if the cell flow changes dramatically.

Fuzzy logic is a powerful methodology in embedded control and has been widely used in many real-time applications. Network traffic control is a real-time control problem, and hence well suited to a fuzzy control system. Strong performance of fuzzy systems in traffic control has been reported by Catania et al. [26–30]. Although a fuzzy system performs well in its applications, it can be tedious and complicated to configure manually.
For the application of cell scheduling, appropriate fuzzy rules, membership functions, and fuzzy inference schemes have to take into account all possible cell flow scenarios. This may not be very practical in some scenarios since the traffic flow and QoS requirements can vary significantly from time to time. A fuzzy system that is manually derived may not be appropriate for all the possible scenarios. As a result, drastic changes in the traffic


flow may cause the QoS performance to deteriorate. In this respect, our EFS is well suited to handle such a situation. The principle of our EFS is to evolve the fuzzy system according to the changing patterns of the traffic flow and to adapt to the traffic pattern on-line, which means that the fuzzy system is updated immediately when the traffic flow changes. In this way, good system performance can be consistently maintained. It is desirable to implement this kind of EFS ultimately as a form of intrinsic evolvable fuzzy hardware (EFH). This paper focuses on the evolution scheme and the performance demonstration of the EFS. The detailed implementation of EFH is beyond the scope of this paper.

There has been significant research on the application of GA to fuzzy systems. For example, in [3,4,6,7], a fuzzy system or fuzzy logic controller was developed using GA or other evolutionary algorithms. In [3,6,7], the goal is to use genetic computing to design a final fuzzy rule set, membership functions, or both. After that, the fuzzy system does not change or adapt to a changing environment. In [4], the author used a specified learning algorithm to select suitable parameters (for example, fuzziness degree, shape of the membership function, rule set index and rule weight) from a large data set stored in memory. These approaches differ from our proposed EFS. Our scheme calls for intrinsic evolution and on-line adaptation, and is implementable as a form of EFH. In this article, we present simulation results to demonstrate the potential usefulness of the proposed EFS for ATM cell-scheduling. The QoS performance of the EFS is compared with that of three other schemes, namely FIFO, SPR and DWPS.

From the viewpoint of evolution, EHW can be classified into two types, intrinsic and extrinsic EHW [19]. Extrinsic evolvability relies on a simulated evolution process, which is decoupled from the hardware [8].
For example, a software representation of a circuit (such as HDL or C) is subject to an evolutionary algorithm, and only the elite design is downloaded into the reconfigurable hardware [18]. This kind of EHW can help to explore the solution space and relieves system developers of the tedious and time-consuming design task.


Extrinsic EFH relies significantly on a simulator to evaluate each chromosome's performance. This places very high speed and accuracy requirements on the simulator, a major drawback that limits its applicability. In principle, intrinsic EHW can modify its own hardware configuration and behavior autonomously according to changes in the external environment. With intrinsic EFH, each potential solution is evaluated by the hardware itself. It thus avoids the major drawback of extrinsic EHW, which relies on a simulator for performance evaluation.

Although EHW has achieved much progress since its conceptualization, there are still many open issues to be addressed. These issues have constrained the usefulness of EHW from a practical point of view. The main issues are on-line adaptation, scalability and termination of evolution [8]. We briefly outline these issues here and discuss in Section 4 how the EFS described in this paper overcomes them.

EHW can be classified into on-line adaptation and off-line adaptation systems. On-line adaptation means that the system hardware adapts its architecture while the system is in operation. Ideally, the EHW evolves and adapts as soon as the working environment changes. Off-line adaptation means that the system has to suspend its normal operational mode during the evolution process. After the evolutionary phase, the system cannot evolve to adapt to environmental changes unless it suspends its operation again. This limits the applicability of off-line adaptive EHW in time-critical problem areas. For example, in [11,13], only the final elite chromosome was downloaded into the FPGA after some off-line evolutionary processes. Much research has realized off-line adaptation, but not on-line adaptation. On-line adaptation is very hard to realize because the system may have to configure each chromosome into the hardware in order to evaluate it. Some chromosomes may result in very poor performance. If these chromosomes are configured into the hardware, the system may suffer damage or cause a disaster [8].

Scalability refers to the length of the chromosome. In [18], researchers try to evolve some analog circuits on field programmable transistor array


(FPTA) chip. The cell matrix is 16 × 16, and the chromosome has to be coded as a 16 × 16 × 24 structure. This corresponds to a very large search space. In [11], the system was implemented as a form of functional level EHW. To realize ATM cell-scheduling, a chromosome was structured as 100 numeric characters. Since the genes are coded as integer values, the search space is inevitably very large.

The third issue pertains to termination. Termination of evolution refers to the stopping condition for the evolution, for example the number of evolutionary runs. To achieve a close to optimal solution, the system may require a long search time. However, it is usually impossible to predict with certainty the number of runs required to achieve a desirable solution. In this respect, the issue of termination is difficult to solve from a practical point of view.

This paper consists of five sections. In Section 2, we describe our proposed EFS in the context of the ATM cell-switching problem. Section 3 outlines the simulation methodology and presents the simulation results and comparisons with three other scheduling schemes. In Section 4, we describe how the open issues of EHW are tackled. Finally, Section 5 presents some concluding remarks and outlines some aspects of future work.

2. Evolutionary framework: ATM cell-scheduling

2.1. Problem descriptions

A single-node multiplexer with two input channels is adopted to demonstrate the viability of EFS. The simplified block architecture of the ATM cell multiplexer is shown in Fig. 1. In this block diagram, BUF1 and BUF2 refer to buffers while MP refers to the multiplexing unit. The fuzzy switching control block is the part of the hardware that handles cell scheduling. The control scheme is derived through evolutionary mechanisms, which form the backbone of the EFS. For illustration, we classify the ATM services into two types, class1 and class2.

Fig. 1. Simplified multiplexer block diagram. (Blocks: Class1 and Class2 inputs, BUF1, BUF2, MP, OUT, and FSC (fuzzy switching control).)

For ATM switching, class1 can be a form of CBR (constant bit rate) traffic, rt-VBR (real-time variable bit rate) traffic or both. The class2 traffic type may refer to nrt-VBR (non-real-time variable bit rate), UBR (unspecified bit rate) or ABR (available bit rate) [1,23]. The class1 type is delay sensitive while class2 is considered not sensitive to delay. These two cell sources must be multiplexed on the OUT channel by the MP unit through time division. We assume that the bandwidth of either input channel is the same as the capacity of the OUT channel. It is also assumed that the capacities of the input buffers are finite and fixed.

To begin with, we define two symbols for the inputs, c1 and c2. The symbol c1 refers to the status of the class1 buffer, which is a function of V1 and Vmax. V1 is the current cell rate of the class1 cell flow while Vmax is the maximum cell rate of the line capacity. The symbol c2 refers to the buffer status of BUF2, which is a function of L2 and Lmax. L2 is the number of empty units in BUF2 while Lmax is the length of the class2 cell buffer. For c1 and c2, the memberships are characterized by the linguistic terms very small, small, medium, large and very large. The output of the FSC unit, OUT, is characterized by the terms true and false. OUT need not be fuzzy and is hence characterized by two discrete values. Functionally, a true means that the MP unit allocates time packets to cater for the class1 cell flow in BUF1. A false means that switching reverts to the cells in BUF2.

Based on the above characterization of the switching network, it is possible to define n-rule heuristics to control the switching behavior. With the fuzzy memberships defined, one can rely on intuitive logic to define the necessary input–

Table 1
A 25-rule fuzzy system for ATM cell-scheduling (rows and columns are indexed by the linguistic terms of the two inputs c1 and c2)

              very small   small   medium   large   very large
very small    T            F       F        F       F
small         T            T       T        F       F
medium        T            T       T        T       F
large         T            T       T        T       F
very large    T            T       T        T       T

T = true, F = false.

output mappings as shown in Table 1. The 25-rule system serves as the default ATM cell-scheduling algorithm on system startup. We refer to these rules as the core rule set. Although our illustration shows an exhaustive rule set, actual systems may start with a core rule set of n rules, with n less than 25.

2.2. Genetic coding

As an illustration of the coding mechanism, consider the rule set in Table 1. The genetic code for the 25-rule system is represented as the genetic string "1222211122111121111211111". The alleles 1 and 2 correspond to the labels true (T) and false (F) respectively. The position of a gene in the string corresponds to a specific rule in Table 1, read in a row-wise manner.

2.3. Evolution scheme

To satisfy the requirements of real-time operation and on-line evolution, we design the system block architecture as in Fig. 2. In Fig. 2, MP is the multiplexing unit that switches between class1 and class2 according to the switching scheme in the FSC block. In this system, the training buffers TB1 and TB2 are used to collect class1 and class2 cells respectively. The size of TB# (TB1 or TB2), represented by T, is 2 or 3 times that of BUF# (BUF1 or BUF2). When a training buffer is full, the evolution process is triggered. Fitness evaluation is carried out by subjecting each chromosome to the scheduling model according to the cell flow stored in TB#. The purpose of the scheduling model is

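Fig. 2. Adaptation framework for EFS. (Blocks: Class1 and Class2 inputs with c1 and c2, BUF1, BUF2, TB1, TB2, MP, OUT, FSC (fuzzy switching control), scheduling model, evolution module.)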

to simulate the function of the ATM network as in Fig. 1. If a rule set that is better than the working chromosome is found after the evolution process, the working chromosome is replaced immediately. Functionally, the scheduling model simulates the ATM network to derive the cell delay and cell loss parameters. These parameters enable the fitness value to be calculated. The evolution module serves to evolve the fuzzy rule set, and it interacts with the scheduling model to evaluate the fitness of each evolved rule set. If evolution is triggered, it works in the background while the MP unit is in operation.

2.4. Evolution of rule set

As rationalized above, we need not set the maximum number of fuzzy rules. GA can be adopted to explore alternative fuzzy rule sets with good system performance. It is likely that, instead of the maximum rule set, an alternative with a smaller number of rules can be derived. For instance, 0220001200011000011000011 is an example of a 10-rule fuzzy system derived from the evolutionary process. In the string, 0 implies that no rule is defined for the specific input scenario. Using GA, a good fuzzy rule set for the cell multiplexer can be derived through learning from past data on ATM traffic flow.

The most important component necessary for evolution is a mechanism to incorporate the fitness function in order to control the evolution. Based


on the problem specifications, the goal is to use the evolvable fuzzy system to schedule ATM cells such that the QoS requirements are satisfied. The QoS parameters considered are cell delay and cell loss. Since two input channels are considered, one being delay sensitive while the other is not, the overall fitness function needs to consider only the cell delay of class1. EFS is a work-conserving scheduling scheme, which means that the multiplexer cannot be idle if there are cell units queued in BUF1 or BUF2. When the required class1 delay is satisfied, the surplus bandwidth of the output channel is allocated to class2. The fitness is determined using Eq. (1) based on the scheduling performance for the T cells stored in TB#.

\[ F = j - \left| \frac{1}{T} \sum_{i=1}^{T} m(i) - k \times q \times t \right| \qquad (1) \]

for performance comparison. With FIFO, the cells are scheduled according to their arrival time. SPR is a static priority scheduling scheme in which the delay sensitive traffic is always given the highest priority. DWPS refers to the dynamically weighted priority scheduling scheme [21]. The priority assignment for the traffic flow is determined by Eq. (2), where i represents the channel number.

\[ Q_i = \frac{t_i}{[T_i(t)]^{c}} \qquad (2) \]

Here t_i is a constant value assigned to each traffic class; delay sensitive traffic is always assigned a small value. T_i(t) is the waiting time of the cell in the first unit of the queuing buffer. c is an emphasis parameter whose recommended value is 0.9. Q_i is the priority value assigned to channel i. The longer the cell has stayed in the buffer, the smaller Q_i is and the higher the priority of the channel.
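As an illustrative sketch of how Eq. (2) selects a channel (not part of the original simulator, which is implemented in C++), the Python fragment below computes Q_i for each channel and serves the one with the smallest value. The function names, the constants t_i = 1.0 and 4.0, and the treatment of a zero waiting time are assumptions made for illustration; they are not taken from [21].

```python
def dwps_priority(t_i, wait_time, c=0.9):
    """Return Q_i = t_i / T_i(t)**c from Eq. (2); a smaller Q_i means higher priority."""
    if wait_time <= 0.0:
        # Assumption: a head cell that has not waited yet gets the lowest priority.
        return float("inf")
    return t_i / (wait_time ** c)


def dwps_select(constants, wait_times, c=0.9):
    """Return the index of the channel with the smallest Q_i."""
    values = [dwps_priority(t_i, w, c) for t_i, w in zip(constants, wait_times)]
    return min(range(len(values)), key=values.__getitem__)


# Hypothetical constants: channel 0 is delay sensitive, so its t_i is smaller.
constants = [1.0, 4.0]
print(dwps_select(constants, [10.0, 10.0]))   # equal waiting times
print(dwps_select(constants, [1.0, 100.0]))   # channel 1 has waited far longer
```

With equal waiting times the delay sensitive channel is served first; once the head cell of the other channel has waited long enough, its Q_i falls below that of the delay sensitive channel and it is served instead.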

In Eq. (1), i is the index of the cell being scheduled by the evaluation process and m(i) is the delay of each class1 cell unit transmitted in the scheduling model during the evaluation process. The term \( \frac{1}{T}\sum_{i=1}^{T} m(i) \) is the average cell delay of class1. t is the size of the training buffer TB#. q is a constant that depends on the bandwidth capacity of the output channel. k is a weighting ratio for adjusting the significance of cell delay. It can be tuned to bias the EFS towards the desired class1 average delay. The product k × q × t expresses the desired cell delay. The smaller the value of \( \left| \frac{1}{T}\sum_{i=1}^{T} m(i) - k \times q \times t \right| \), the better the chromosome performs. In general, the overall performance of the switching system corresponds to the fitness value. j is a very large constant that makes the value of F proportional to the fitness of the chromosome. After k is fixed, the EFS tries to search for good chromosomes that satisfy the desired cell delay during every evolution process.
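The genetic coding of Section 2.2 and the fitness of Eq. (1) can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the sample delays and the value chosen for the large constant j are hypothetical, and the scheduling model that would produce the delays m(i) is not reproduced here.

```python
# Allele meaning from Sections 2.2 and 2.4: 0 = no rule, 1 = true (T), 2 = false (F).
ALLELES = {"0": None, "1": True, "2": False}


def decode_chromosome(chromosome):
    """Split a 25-gene string row-wise into a 5 x 5 table of True/False/None."""
    if len(chromosome) != 25:
        raise ValueError("expected 25 genes")
    genes = [ALLELES[g] for g in chromosome]
    return [genes[r * 5:(r + 1) * 5] for r in range(5)]


def rule_count(chromosome):
    """Number of rules actually defined (non-zero genes)."""
    return sum(1 for g in chromosome if g != "0")


def fitness(delays, k, q, t, j):
    """Eq. (1): F = j - |average class1 delay - k * q * t|."""
    mean_delay = sum(delays) / len(delays)
    return j - abs(mean_delay - k * q * t)


core = decode_chromosome("1222211122111121111211111")
print(core[0])                                  # first row of Table 1
print(rule_count("0220001200011000011000011"))  # the 10-rule example of Section 2.4
# Hypothetical class1 delays (in microseconds) from one evaluation run; j is an assumed constant.
print(fitness([300.0, 330.0, 360.0], k=0.4, q=2.73, t=300, j=1.0e6))
```

A chromosome whose average class1 delay lies closer to k × q × t receives a larger F, which is all the selection step of the GA needs in order to rank candidates.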

In our system simulator, cell flow traffic is simulated by reading from two input file streams, each corresponding to one cell class. At each simulation time step, the system reads from the files and checks whether the arrival time of the cell being handled is less than or equal to the simulation time. With each cell arrival, the system checks the status of BUF#. When BUF# is full, cell loss occurs and the corresponding cell loss counter Loss# (Loss1 or Loss2) is incremented. If not, the cell is pushed into BUF# and TB#. The time step Tstep for simulation is set at 0.01 µs. The simulator uses an event-driven mechanism, the event being the arrival of ATM cells. Based on a cell that is 53 bytes long and a bandwidth of 155.52 Mbits per second, the maximum speed of cell flow is 366793 cells per second. Accordingly, the minimum cell period is about 2.73 µs.

The data format of each cell in the input files is shown in Fig. 3. Cell# is the cell number. Arrival_time represents the arrival time of each cell. Bit_rate refers to the bit rate of the cell itself. Next_bit_rate is the bit rate of the cells that follow the current cell. This data format is for simulation purposes. In a practical ATM network, the current bit rate can be measured by a specific function block. Notice of a change in Next_bit_rate can be conveyed through the control channel if necessary. The overall flow diagram of

3. Simulation

The EFS for ATM cell-scheduling proposed in this paper is simulated to study its QoS performance in terms of cell loss and cell delay. The EFS simulation is implemented in C++. Similarly, FIFO, SPR and DWPS schemes are also simulated

Fig. 3. Cell format. (Fields: Cell#, Arrival_time, Bit_rate, Next_bit_rate.)

the procedures for the system simulator is shown in Fig. 4. The system triggers evolution when one of the TB# is full. When evolution is triggered, the system registers the time, denoted by the variable E_start. After evolution, if the Flag status is 1, a different chromosome has been found and the system replaces the working chromosome with the new one after a specified time E_time. E_time represents the time cost of the evolution process. Ta1 and Ta2 refer to the cell arrival times for class1 and class2 respectively, while t is the current running time of the simulation.

For the EFS simulation, the parameter k is assigned 0.4, so that the desired average delay is 0.4 × q × t = 0.4 × 2.73 × 300 ≈ 327 µs. The size of BUF# is set to 100 units while t, the size of TB#, is 3 times that of BUF#. The value of q is 2.73, which is the transmission delay of each cell through the output channel. For the evolution process, the population size is set to 10 and the number of generations is set to 20. This ensures that the computation time for each evolution remains manageable.

A robust scheduling algorithm should perform well under various cell flow conditions [12]. Two kinds of cell flow representative of most cell flow scenarios are simulated. The major advantage of EFS to be demonstrated is that it can achieve an average cell delay in both scenarios that is very close to the desired value denoted by k × q × t in Eq. (1). This also means that EFS can have a stable QoS performance even if the traffic scenario changes dramatically. After the simulation, we compare the results of our system with the other three scheduling methods that are popular for ATM cell-scheduling. In both scenarios, we simulate the switching behavior for 2 s.

3.1. Simulation scenario1

In this part of the simulation, we generate the traffic cell flow as in Fig. 5. Class1 is the CBR cell

Fig. 4. Simulation flow diagram. (Boxes: Initialize; Read Class1 / Read Class2; t > Ta1? / t > Ta2?; BUF1 full? / BUF2 full?; Add to BUF#, TB#; Loss#++; TB1 or TB2 full?; Trigger evolution, E_Start = t; Send_Buf empty?; Scheduling, calculate cell_delay; Evolution; Better fuzzy rule set found?; Flag = 1; Flag = 1 and t > E_Start + E_Time?; Replace fuzzy rule set; Flag = 0; t = t + Tstep.)

flow with a bit rate of 155.52 Mbits per second. Class2 is a VBR cell flow, also with a bit rate of 155.52 Mbits per second. The VBR flow has a 2 ms ON period and a 2 ms OFF period. Class1 is delay sensitive and class2 is loss sensitive. Hence, the following analysis mainly considers class1 cell delay and class2 cell loss. We compare the results of EFS scheduling with FIFO and DWPS. In this scenario, static priority scheduling can only send class1 cells, so it need not be considered. EFS evolves as soon as one of the TB# is full. In this scenario, the working rule set was changed 2202 times. This indicates that during normal scheduling operation, EFS performed evolution in the background and, on 2202 occasions, found a rule set performing better than the working rule set and adopted it as the new working rule set. Based on our simulation results, the comparison between the three schemes in terms of cell loss and cell delay is presented in Figs. 6–9.
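To make the scenario1 traffic concrete, the following Python sketch generates cell arrival times for a CBR source and for an ON/OFF source with the stated 2 ms periods. It is only an illustration: the function names are hypothetical, the ON period is assumed to start at time zero, and the original simulator instead reads pre-generated cells from input files.

```python
CELL_BITS = 53 * 8  # one ATM cell is 53 bytes


def cell_period_us(bit_rate_mbps):
    """Time between back-to-back cells in microseconds (bits / (Mbit/s) = us)."""
    return CELL_BITS / bit_rate_mbps


def cbr_arrivals(bit_rate_mbps, duration_us):
    """Arrival times (us) of a constant bit rate source starting at t = 0."""
    period = cell_period_us(bit_rate_mbps)
    arrivals = []
    n = 0
    while n * period < duration_us:
        arrivals.append(n * period)
        n += 1
    return arrivals


def on_off_arrivals(bit_rate_mbps, duration_us, on_us=2000.0, off_us=2000.0):
    """Arrival times (us) of a source that sends at full rate only during ON periods."""
    cycle = on_us + off_us
    # Assumption: every cycle begins with its ON period.
    return [t for t in cbr_arrivals(bit_rate_mbps, duration_us) if t % cycle < on_us]


class1 = cbr_arrivals(155.52, 4000.0)     # one 4 ms window of CBR traffic
class2 = on_off_arrivals(155.52, 4000.0)  # 2 ms ON followed by 2 ms OFF
print(round(cell_period_us(155.52), 2), len(class1), len(class2))
```

During the ON period both sources offer the full line rate, so the combined load is twice the OUT capacity; during the OFF period only class1 is active.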

Fig. 5. Two classes of cell flow. (Labels: Class1; Class2 with ON and OFF periods.)

Fig. 6. Class1 cell loss in scenario1. (Plot: cell loss (cell) versus simulation time (µs); curves for FIFO, EFS and DWPS.)

Fig. 7. Class2 cell loss in scenario1. (Plot: cell loss (cell) versus simulation time (µs); curves for FIFO, DWPS and EFS.)

Fig. 8. Class1 cell delay in scenario1. (Plot: average cell delay (µs) versus simulation time (µs); curves for FIFO, EFS and DWPS.)

In Figs. 6 and 7, the differences in class1 and class2 cell losses among the three scheduling algorithms are not significant, and the total cell losses are about the same. This indicates that none of the three schemes favors class1 so heavily that it ignores the QoS of class2. From Figs. 8 and 9, it can be seen that the class1 cell delay using EFS and DWPS is about 330 µs, which is much better than that of FIFO. The class2 cell loss using EFS and DWPS is larger than that of FIFO. So in this scenario, if the QoS of class1 (class1 cell delay) is considered, EFS and DWPS are better. If the QoS of class2 (class2 cell loss) and the balance of cell loss between class1 and class2 are considered, FIFO appears to be the best. The advantages of EFS are not evident when one considers only this scenario. When one considers other traffic scenarios such as those outlined in Sections 3.2 and 3.3, the strengths of the EFS scheme become evident.

Fig. 9. Class2 cell delay in scenario1. (Plot: average cell delay (µs) versus simulation time (µs); curves for EFS, DWPS and FIFO.)

3.2. Simulation scenario2

In this simulation scenario, we generate cell flows as in Fig. 10. Class1 refers to a CBR cell flow with a bit rate of 100 Mbits per second. Class2 is a VBR cell flow with an unknown random bit rate. The minimum bit rate for VBR is 55.52 Mbits per second while the maximum is 155.52 Mbits per second. In this scenario, since the sum of the CBR and VBR bit rates is larger than the OUT channel capacity, cell loss is unavoidable. From a practical point of view, this scenario is more likely to occur than the first. In this scenario, k is still assigned 0.4, so the desired class1 average delay is about 327 µs. The working rule set was changed 1689 times under this scenario, which means that it did not change as often as in scenario1. Figs. 11–14 show the simulation results of EFS compared with the other three scheduling schemes.

Fig. 10. Two classes of cell flow. (Labels: Class1, Class2.)

Fig. 11. Class1 cell loss in scenario2. (Plot: cell loss (cell) versus simulation time (µs); curves for FIFO, EFS, SPR and DWPS.)

Fig. 12. Class2 cell loss in scenario2. (Plot: cell loss (cell) versus simulation time (µs); curves for DWPS, SPR, EFS and FIFO.)

Fig. 13. Class1 cell delay in scenario2. (Plot: average cell delay (µs) versus simulation time (µs); curves for FIFO, DWPS, EFS and SPR.)

Fig. 14. Class2 cell delay in scenario2. (Plot: average cell delay (µs) versus simulation time (µs); curves for DWPS, SPR, EFS and FIFO.)

In Figs. 11 and 12, the class1 cell loss of DWPS and SPR is very close to zero. This indicates that DWPS, although it has some capability of adaptation, still cannot keep a good balance between class1 cell loss and class2 cell loss. EFS, on the other hand, achieves a good balance between class1 and class2 cell loss. The class1 cell delay of EFS in Fig. 13 is about 300 µs, which is much smaller than that of FIFO and DWPS. Although the class1 cell delay of EFS is larger than that of SPR, in terms of maintaining a balance between class1 cell loss and class2 cell loss, EFS is much better than SPR. Considering cell loss and cell delay on the whole, EFS performs better than the other three schemes.

It should also be noted that the desired class1 delay in scenario1 and scenario2, denoted by the value of k, is about 327 µs, while the achieved class1 cell delays in scenario1 and scenario2 are about 330 µs and 300 µs respectively. The class1 cell delay achieved by DWPS is 330 µs in scenario1 and about zero in scenario2. This indicates that if the traffic scenario changes dramatically, the class1 cell delay using DWPS can also change dramatically. EFS, in contrast, maintains considerably stable QoS performance without being affected by changes in the traffic flow pattern. Although SPR and FIFO also have stable class1 cell delay in both scenarios, the class1 cell delay of FIFO is very large, and SPR is biased too much towards class1 and ignores the QoS of class2. EFS, on the other hand, can be adjusted very easily to perform like FIFO or SPR by changing the value of k. For example, EFS delivers performance similar to that of SPR if k is set to 0. This property of EFS is further demonstrated in Section 3.3. DWPS also provides a parameter to adjust, but adjusting it only moves the cell scheduling results closer to those of SPR or FIFO, with no clear emphasis on QoS performance. The class1 average cell delay cannot be easily predicted after tuning the DWPS system, whereas with EFS it can. In this respect, EFS has much better features, giving system managers greater flexibility of control than other schemes.

3.3. Tuning performance

In the above two sections, the ability of EFS to adapt to changing traffic has been demonstrated. From the simulation results presented, it can be seen that even with dramatically changing cell flows, EFS still maintains desirable QoS for class1 and allocates the surplus bandwidth to class2. Another major advantage of EFS is that its QoS performance can be conveniently adjusted through the value of k. In this section, we demonstrate the performance tunability of the EFS on scenario2.

Table 2
Tuning performance of EFS in Scenario2

k      Class1 desired   Class1 average   Class1 cell    Class2 cell
       delay (μs)       delay (μs)       loss (cell)    loss (cell)
0      0                1.4              0              198,953
0.1    81.9             113.7            2140           196,813
0.2    163.8            181.1            7056           191,799
0.3    245.7            235.8            22,139         176,750
0.4    327.6            297.3            27,758         175,505
0.5    409.5            398.2            23,375         171,180
0.6    491.4            505.2            80,995         117,860
0.7    573.3            549.0            111,769        87,086
0.8    655.2            615.5            150,127        48,728
0.9    737.1            654.8            168,856        29,999
1      819              675.1            177,800        21,106

Table 2 shows the class1 cell delay, class1 cell loss and class2 cell loss for different values of k, adjusted in steps of 0.1. Following the k column, the columns give the desired class1 delay indicated by k, the simulated class1 average delay, the class1 cell loss and the class2 cell loss respectively. From the table, it can be seen that when k is between 0 and 0.8, the class1 average delay is close to the desired delay. As k increases, the class1 average delay increases and the class2 cell loss decreases. This indicates that the balance between class1 and class2 QoS can be adjusted through the value of k. If the QoS of class1 is very important, a small value of k can be assigned; when the QoS of class2 is deemed more important, a larger value of k can be assigned.

Some traditional schemes such as DWPS also possess such a tunability property: one can tune the system to bias it toward a certain QoS characteristic. With EFS, however, the system's performance can be tuned to achieve a desired, pre-specified level of class1 QoS. Although the QoS can be tuned for EFS, one must do so according to the pre-established characterization of the tuning parameter. Based on Table 2, values of k from 0 to 0.8 provide a wide range of QoS tunability, whereas increasing k from 0.8 to 1 has little effect on the class1 average delay. This is mainly because EFS is a work-conserving system, meaning that the output channel cannot be idle while there are cells in the queuing buffer. Even with such a limitation, EFS is still more flexible than the traditional schemes, allowing the system manager to control the network payload according to the subscribed QoS and the limited bandwidth capacity.
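The relationship between k and the class1 delay in Table 2 can be checked numerically. The following is a minimal sketch, assuming that the desired class1 delay scales linearly with k and reaches 819 μs at k = 1; this constant is read off the table rather than stated as a formula in the text, and the function names are illustrative only.

```python
# Values transcribed from Table 2 (scenario2); delays are in microseconds.
K_VALUES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
DESIRED_US = [0, 81.9, 163.8, 245.7, 327.6, 409.5, 491.4, 573.3, 655.2, 737.1, 819]
AVERAGE_US = [1.4, 113.7, 181.1, 235.8, 297.3, 398.2, 505.2, 549.0, 615.5, 654.8, 675.1]

# Assumption: the desired delay is linear in k, with 819 us at k = 1.
MAX_DESIRED_US = 819.0


def desired_delay_us(k):
    """Desired class1 delay implied by k under the linear assumption."""
    return k * MAX_DESIRED_US


def deviations():
    """Return (k, desired, average, average - desired) for each table row."""
    return [(k, d, a, a - d) for k, d, a in zip(K_VALUES, DESIRED_US, AVERAGE_US)]


if __name__ == "__main__":
    for k, d, a, gap in deviations():
        print(f"k={k:.1f}  desired={d:6.1f}  average={a:6.1f}  gap={gap:+7.1f}")
```

Under this reading, the gap between the average and desired delay is largest at k = 0.9 and k = 1 (about -82 μs and -144 μs), which matches the observation that values of k above 0.8 have little further effect.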

4. Discussions

Our main aim in this paper is to achieve intrinsic evolution and on-line adaptation. Our proposed EFS is a viable alternative for addressing some of the open issues pertaining to EHW discussed in Section 1. With the EFS presented in this paper, evolution takes place not at the gate level or the functional level, but at the fuzzy rule level. The gate level represents a very small granularity for EHW. The functional level, although it may involve multiple gates, is still of small granularity for EHW compared with the potential complexity of a practical system: completing a control function may involve thousands of gates or hundreds of functional components (adder, sin, cos, and fpu) [11]. This is an important issue that has to be addressed before EHW can be applied to solve real-time problems. From the viewpoint of real-time applications, we believe that the fuzzy rule is a very suitable level of granularity for evolution. Based on this notion, we have experimented with evolving fuzzy rule sets in our EFS within the context of ATM cell-scheduling. The results presented in this paper suggest that the fuzzy rule represents an appropriate level of granularity for evolution.

In our EFS, in order to carry out on-line adaptation without weak chromosomes deteriorating the performance of the system, we incorporate a scheduling model to simulate real-time cell scheduling. This allows the fitness value of a chromosome to be evaluated and calculated. In essence, the scheduling model is a circuit that simulates the switching performance of the multiplexer system. The basic architecture of the scheduling model is depicted in Fig. 15, and its process flow is shown in Fig. 16. It is similar to the flow diagram in Fig. 4, with some differences. The flow in Fig. 4 represents a whole simulation flow described as a C++ program flow.

[Figure: block diagram with blocks labelled TB1, TB2, E-BUF1, E-BUF2, E_MP, MCU and FLC]
Fig. 15. The diagram of scheduling model.
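The idea of evaluating a candidate on a simulated multiplexer can be sketched in software. The following is a minimal illustration, not the circuit of the paper: it assumes that TB1 and TB2 hold recorded arrival times for class1 and class2, that a cell arriving at a full buffer is counted as lost, and that a hypothetical callable `choose_queue` stands in for the fuzzy controller's scheduling decision. The buffer size, time step and service time are placeholder parameters.

```python
from collections import deque


def simulate_shadow_mux(arrivals1, arrivals2, choose_queue,
                        buf_size=50, t_step=1.0, service_time=1.0):
    """Replay recorded arrival times through a simulated two-class multiplexer.

    choose_queue(len1, len2) is a hypothetical decision function that
    returns 1 or 2, the class to serve next.
    """
    tb1, tb2 = deque(sorted(arrivals1)), deque(sorted(arrivals2))
    buf1, buf2 = deque(), deque()      # simulated buffers hold arrival times
    loss1 = loss2 = 0
    delays1 = []
    busy_until = 0.0                   # time at which the sender becomes free
    t = 0.0
    while tb1 or tb2 or buf1 or buf2:
        # Admit every class1 cell that has arrived by time t.
        while tb1 and tb1[0] <= t:
            stamp = tb1.popleft()
            if len(buf1) >= buf_size:
                loss1 += 1
            else:
                buf1.append(stamp)
        # Admit every class2 cell that has arrived by time t.
        while tb2 and tb2[0] <= t:
            stamp = tb2.popleft()
            if len(buf2) >= buf_size:
                loss2 += 1
            else:
                buf2.append(stamp)
        # If the sender is free and a cell is waiting, schedule one cell.
        if t >= busy_until and (buf1 or buf2):
            pick = choose_queue(len(buf1), len(buf2))
            if pick == 1 and not buf1:
                pick = 2               # fall back to the non-empty queue
            elif pick == 2 and not buf2:
                pick = 1
            if pick == 1:
                delays1.append(t - buf1.popleft())
            else:
                buf2.popleft()
            busy_until = t + service_time
        t += t_step
    avg_delay1 = sum(delays1) / len(delays1) if delays1 else 0.0
    return {"loss1": loss1, "loss2": loss2, "avg_delay1": avg_delay1}
```

A fitness value could then be derived from the returned losses and average delay; how these are combined is not specified here and would be an additional assumption.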

The flow in Fig. 16 is the process flow of the scheduling model of the EFS. Using this scheme, a chromosome can be evaluated while the multiplexing system is still operational, so the system's operation is not disrupted. In the figure, some of the variables carry the prefix "E_". These variables refer to conditions in the simulated multiplexer, not the real multiplexer module. Their meaning is the same as that of the corresponding variables without the "E_" prefix in Fig. 4.

[Figure: flowchart with the steps Initiate; Read class1 from TB1; E_t > E_Ta1?; E_BUF1 full?; Add to E_BUF1; E_Loss1++; Read class2 from TB2; E_t > E_Ta2?; E_BUF2 full?; Add to E_BUF2; E_Loss2++; E_Send_Buf Empty?; Scheduling; Calculate cell_delay; E_t = E_t + E_Tstep]
Fig. 16. Process Flow of Scheduling Module.

In order to achieve on-line adaptation and intrinsic evolution for real-time control, there is another issue that must be addressed: during evolution, training data is required. In [11], the researchers adopted an approach whereby the data for training and testing are the same. This scheme can work well if the real-time data is characteristically similar to the training data. When the cell flow model changes, this kind of EHW system has to relearn the new situation as well as the old one and try to evolve new solutions that deal with both. If the real-time data changes dramatically, it is not practical to use all the real-time data to train the system. We believe that in many real-time control areas, there is no need to train the system in this way. Instead, we can apply the principle of "locality" to address this issue. In computer systems, the design of the cache memory system is based on this principle: if a program is currently accessing a certain part of the memory, then there is a great likelihood that it will access memory within the same locality in the next period. The locality principle rests on the assumption that the time period is very small. On this basis, we can train the EFS using the data flow of the previous period to approximate the expected data model of the subsequent time period. The smaller the time period, the more flexibly the EFS adapts to the cell flows. The best chromosome after a pre-specified time period is used for scheduling in the next time period. Our simulation results show that this training principle, based on the assumption of locality, is appropriate, as we expected.

The third problem we addressed is the termination of evolution. In many EHW systems, the evolution must run for thousands of generations to even get close to the optimal chromosome, which often takes a long time. For example, in [11], in order to get close to the optimal functional EHW for ATM cell-scheduling, the authors reported an evolutionary cycle of 2500 generations with a population size of 400. In [18], the number of generations was fixed at 10,000 to derive the Gaussian DC curve. Such an evolutionary time scale is not appropriate for a real-time intrinsic EHW control system. In our EFS, the population size is 10 and the number of generations is fixed. In our simulations, we achieved good results even with a small number of generations. After evolution, the best chromosome derived replaces the current working chromosome if it performs better on the cell flow sample used in training; otherwise, the current rule set is retained and the system's operating mode is unchanged. The goal is to obtain a fuzzy rule set that is better than the working rule set. Even if the derived rule set is not optimal, it is deemed sufficient if it is better than the current one. With this idea, the criterion for terminating the evolutionary cycle can be easily satisfied.
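The replace-only-if-better policy with a small population and a fixed number of generations can be sketched as follows. Only the population size of 10, the fixed generation count and the acceptance rule come from the text; the chromosome encoding (a list of small integers with at least two genes), the crossover and mutation operators, the default of 5 generations and the `cost` callable are assumptions made for illustration. In keeping with the locality principle, `cost` would be evaluated on the cell flow recorded in the previous period, with lower values meaning better performance.

```python
import random


def mutate(chromosome, n_values, rate, rng):
    """Randomly reassign each gene with probability `rate` (assumed operator)."""
    return [rng.randrange(n_values) if rng.random() < rate else g
            for g in chromosome]


def evolve_for_period(working, cost, n_values=3, pop_size=10,
                      generations=5, mutation_rate=0.1, rng=None):
    """Run a short, fixed-length evolution and return the rule set to use next.

    `working` is the current chromosome (length >= 2); `cost` is a hypothetical
    function scoring a chromosome on the previous period's traffic.
    """
    rng = rng or random.Random(0)
    # Seed the population with the working chromosome and mutated copies of it.
    population = [list(working)] + [
        mutate(working, n_values, mutation_rate, rng)
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[:pop_size // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))            # one-point crossover
            children.append(mutate(a[:cut] + b[cut:], n_values,
                                   mutation_rate, rng))
        population = parents + children
    best = min(population, key=cost)
    # Replace the working chromosome only if the candidate is strictly better.
    return best if cost(best) < cost(working) else list(working)
```

Calling `evolve_for_period` once per time period, with `cost` bound to the most recent traffic sample, gives a result that is never worse than the working chromosome on that sample, so the loop can stop after the fixed number of generations without a convergence test.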

5. Conclusion

The results presented in this paper serve to demonstrate the viability of the proposed scheme and highlight some ideas to address the open issues in EHW. From the simulation results, the overall performance of the proposed EFS is competitive. More importantly, the approach allows for greater flexibility in controlling the dynamic behavior of the system by means of an evolutionary framework. Through simulations of the proposed framework, we have demonstrated the effectiveness and efficiency of the EFS scheduling scheme.


The work discussed forms the basic framework of embedded fuzzy control and embedded evolutionary learning. The whole scheme for ATM cell-scheduling is realizable as EHW. There are essentially two types of EHW: extrinsic EHW and intrinsic EHW [20]. From an implementation point of view, it is desirable to realize the evolution mechanism and the evaluation mechanism within the hardware framework of the scheduler using current system-on-chip (SoC) technology. This notion suggests the implementation of a form of intrinsically evolvable and on-line adaptive fuzzy hardware for dynamic real-time applications. Our future work will focus on the hardware implementation of such evolvable fuzzy hardware (EFH).

References

[1] M.P. Clark, ATM Networks: Principles and Use, John Wiley, Chichester, New York, 1996.
[2] B.P. Buckles, F.E. Petry, Genetic Algorithms, IEEE Computer Society Press, Los Alamitos, CA, 1992.
[3] M.H. Lim, S. Rahardja, B.H. Gwee, A GA paradigm for learning fuzzy rules, Fuzzy Sets and Systems 82 (1996) 177–186.
[4] J.M. Jou, P.Y. Chen, S.F. Yang, An adaptive fuzzy logic controller: Its VLSI architecture and applications, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 8 (1) (2000) 52–60.
[5] H.D. Garis, Evolvable hardware: principles and practice, Communications of the Association for Computer Machinery (CACM Journal) (August) (1997).
[6] Y.S. Shi, R. Eberhart, Y.C. Chen, Implementation of evolutionary fuzzy systems, IEEE Transactions on Fuzzy Systems 7 (2) (1999) 109–119.
[7] C.H. Wang, T.P. Hong, S.F. Tseng, Integrating fuzzy knowledge by genetic algorithms, IEEE Transactions on Evolutionary Computation 2 (4) (1998) 138–149.
[8] X. Yao, T. Higuchi, Promises and challenges of evolvable hardware, IEEE Transactions on Systems, Man and Cybernetics, Part C, Applications and Reviews 29 (1) (1999) 87–97.
[9] J.H. Li, M.H. Lim, A Framework for Evolvable Fuzzy Hardware, in: Special session on Evolutionary Computing for Control and System Applications, ICONIP/SEAL/FSKD Joint Conference, 18–22 Nov 2002, Singapore.
[10] M. Iwata, I. Kajitani, H. Yamada, H. Iba, T. Higuchi, A pattern recognition system using evolvable hardware, in: Proc. Int. Conf. Parallel Probl. Solving Nature (PPSN'96).
[11] W.X. Liu, M. Murakawa, T. Higuchi, ATM cell scheduling by function level evolvable hardware, in: T. Higuchi, M. Iwata, W. Liu (Eds.), Proceedings of the 1st International Conference on Evolvable System: from Biology to Hardware (ICES 1996), vol. 1259, Springer-Verlag, Berlin, Germany, 1997, pp. 180–192.
[12] D.B. Schwartz, ATM scheduling with queuing delay predictions, in: Conference Proceedings on Communications Architectures, Protocols and Applications, ACM SIGCOMM Computer Communication Review 23 (4) (1993).
[13] K.C. Tan, C.M. Chew, K.K. Tan, L.F. Wang, Y.J. Chen, Autonomous robot navigation via intrinsic evolution, in: Proceedings of the Congress on Evolutionary Computation, 2002, CEC'02, vol. 2, pp. 1272–1277.
[14] M.D. Bugajska, A.C. Schultz, Coevolution of form and function in the design of micro air vehicles, in: Proceedings NASA/DoD Conference on Evolvable Hardware, 2002, pp. 154–163.
[15] T. Higuchi, M. Iwata, I. Kajitani, M. Murakawa, S. Yoshizawa, T. Furuya, Hardware evolution at gate and function levels, in: Proceedings of the Biologically Inspired Autonomous Systems: Computation, Cognition and Action, Durham, NC, March, 1996.
[16] M. Iwata, I. Kajitani, Y. Liu, N. Kajihara, T. Higuchi, Implementation of a gate-level evolvable hardware chip, in: Proceedings of ICES2001, LNCS 2210, pp. 38–49, 2001.
[17] J. Torresen, Increased complexity evolution applied to evolvable hardware, in: ANNIE'99, November 1999, St. Louis, USA.
[18] J. Langeheine, K. Meier, J. Schemmel, Intrinsic evolution of quasi DC solutions for transistor level analog electronic circuits using a CMOS FPTA chip, in: Proceedings of NASA/DoD Conference on Evolvable Hardware, 2002, pp. 75–84.
[19] H. de Garis, LSL Evolvable Hardware Workshop Report, in: ATR, Japan, Technical Report, October 1995.
[20] T.G.W. Gordon, P.J. Bentley, On evolvable hardware, in: S. Ovaska, L. Sztandera (Eds.), Soft Computing in Industrial Electronics, Physica-Verlag, Heidelberg, Germany, 2002, pp. 279–323.
[21] T. Lizambri, F. Duran, S. Wakid, Priority scheduling and buffer management for ATM traffic shaping, in: The Seventh IEEE Workshop on Future Trends of Distributed Computing Systems, December 20 1999.
[22] D. Keymeulen, K. Konada, M. Iwata, Y. Kuniyoshi, T. Higuchi, Robot learning using gate-level evolvable hardware, in: Proceedings of the Sixth European Workshop on Learning Robots, in: A. Birk, J. Demiris (Eds.), Lecture Notes in Artificial Intelligence, Springer-Verlag, Berlin, 1998.
[23] ATM Forum, ATM Traffic Management Specification 4.0, April 1996. Available from: .
[24] M.H. Lim, Q. Cao, J.H. Li, W.L. Ng, Evolvable hardware using context switchable fuzzy inference processor, Special Issue of IEE Proceedings of Computers and Digital Techniques 151 (4) (2004) 301–311.
[25] W. Liu, M. Murakawa, T. Higuchi, Evolvable hardware for on-line adaptive traffic control in ATM networks, in: Proceedings of the Second Annual Conference on Genetic Programming, Morgan Kaufmann Publishers, Los Altos, CA, 1997, pp. 504–509.
[26] V. Catania, G. Ficili, S. Palazzo, D. Panno, A fuzzy expert system for usage parameter control in ATM networks, in: Proceedings of the Globecom'95, Singapore, November 13–17, 1995, pp. 1338–1342.
[27] V. Catania, G. Ficili, S. Palazzo, D. Panno, A comparative analysis of fuzzy versus conventional policing mechanisms for ATM networks, IEEE/ACM Transactions on Networking 4 (3) (1996) 449–459.
[28] V. Catania, G. Ficili, S. Palazzo, D. Panno, Using fuzzy logic in ATM source traffic control: lessons and perspectives, IEEE Communications Magazine 34 (11) (1996) 79–81.
[29] G. Ascia, V. Catania, D. Panno, An adaptive fuzzy threshold scheme for high performance shared-memory switches, in: Symposium on Applied Computing, Proceedings of the ACM Symposium on Applied Computing 2001, Las Vegas, NV, USA, pp. 456–461.
[30] G. Ascia, V. Catania, G. Ficili, D. Panno, A fuzzy buffer management scheme for ATM and IP networks, in: Proceedings of INFOCOM, 22–26 April 2001, Anchorage, AK, USA, vol. 3, pp. 1539–1547.

Dr. Meng-Hiot Lim joined the faculty at the School of Electrical and Electronics Engineering, Nanyang Technological University in 1989 and is currently an Associate Professor. He received his B.Sc., M.Sc. and PhD from the University of South Carolina. His research interests include computational intelligence, combinatorial optimization, e-based applications, evolvable hardware systems, reconfigurable circuits and architecture, computational finance and graph theory. Prior to joining the university, he worked as a Reliability Engineer for International Rectifier, a company based in California that specializes in power MOSFETs. During 1999–2000, he was on sabbatical with the Department of Electrical and Computer Engineering, University of Missouri-Rolla as a Visiting Associate Professor. He holds a concurrent appointment as Deputy Director of the Centre for Financial Engineering, a multi-disciplinary research centre anchored at the Nanyang Business School. The Centre, besides promoting multi-disciplinary research, oversees the highly regarded M.Sc. in Financial Engineering program. As one of the founding directors of the Centre, he has played a significant role in the planning and formulation of the curriculum of this program. During his tenure with NTU, he has had several funded research projects. He has also offered technical consultancy for companies and delivered specialized short courses. He is a regular participant in and organizer of major international conferences and publishes his work regularly in technical journals. He is the co-editor of a volume on Recent Advances in Simulated Evolution and Learning published by World Scientific Publishing. He was the Local Arrangement Chair of the highly successful ICONIP/SEAL/FSKD-2002 held in Singapore and will carry out a similar task for CEC'2007. He has served as Asian Liaison in the organizing committee of IJCNN-03, held in July 2003 in Portland, Oregon. For the past few years, he has been cited in the Marquis Who's Who in the World, Who's Who in Finance and Industry, Who's Who in Science and Engineering, and Asia Pacific Who's Who. He is a member of Sigma Xi and Tau Beta Pi and previously held executive committee positions with IEEE Singapore Section and IEEE Computer Chapter. He is a firm believer in the educational paradigm that veers towards empowering students to explore and challenge norms beyond classroom reality.

Ju Hui Li received the Bachelor of Engineering in Computer Communication from Hunan University, Changsha, P.R. China in 1994 and the Master of Engineering in Computer Application from Northwestern Polytechnical University, Xi'an, P.R. China in 2001. He is currently working with Semiconductor-APIC, Philips Electronics Singapore Pte. Ltd. His research interests are evolvable hardware, evolvable fuzzy hardware, digital IC design and mixed-signal IC design.


Qi Cao received the B.Eng. degree in Control Science and Engineering from Huazhong University of Science and Technology, Wuhan, P.R. China in 2000. From 2000 to 2002, he was a hardware design engineer at ALi Corp. He is currently a Ph.D. candidate at the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. His research interests focus on in-system reconfigurable fuzzy inference processor design and SoC design.