Scheduling of scientific workflows using a chaos-genetic algorithm

Scheduling of scientific workflows using a chaos-genetic algorithm

Procedia Computer Science Available online at www.sciencedirect.com ProcediaComputer Computer Science Science 00 (2009) 000–000 Procedia 1 (2012) 14...

1MB Sizes 11 Downloads 157 Views

Procedia Computer Science

Available online at www.sciencedirect.com

ProcediaComputer Computer Science Science 00 (2009) 000–000 Procedia 1 (2012) 1445–1454

www.elsevier.com/locate/procedia www.elsevier.com/locate/procedia

International Conference on Computational Science, ICCS 2010

Scheduling of scientific workflows using a chaos-genetic algorithm Golnar Gharooni-farda, Fahime Moein-darbarib, Hossein Deldaric, Anahita Morvaridid,* a,b

Computer Science department of Islamic Azad University, Mashhad Branch, Emamiyeh Blvd., Ghasem abad, Mashhad, Iran c,d Computer Science department of Ferdowsi University, Azadi Square, Mashhad, Iran

Abstract

The main idea of developing Grid is to make effective use of the computation power distributed all over the world. Economical issues are the most vital motivations of resource owners to share their services. This means that users are required to pay for access to services based on their usage and level of QoS they need. Therefore total cost of executing an application is becoming one of the most important parameters in evaluating QoS, which users tend to decrease. Since, many applications are described in the form of dependent tasks, scheduling of these workflows has become a major challenge in grid environment. In this paper, a novel genetic algorithm called chaos-genetic algorithm is used to solve the scheduling problem considering both user’s budget and deadline. Due to the nature of chaotic variables such as pseudo-randomness, ergodicity and irregularity, the evolutional process of chaos-genetic algorithm makes individuals of subgenerations distribute ergodically in the defined space and circumvents the premature of the individuals of traditional genetic algorithms (TGA). The results of applying chaos-genetic scheduling algorithm (CGS) showed greater performances of CGS compared to traditional genetic algorithm (TGS) on both balanced and unbalanced workflows. c 2012 Published by Elsevier Ltd. Open access under CC BY-NC-ND license. ⃝ Keywords: Grid computing; Chaos-genetic algorithms; Workflow scheduling; Deadline constraints; Budget constraints

1. Introduction Grid computing is based on local grid computing which is basically, a kind of distributed computing (such as cluster computing and, point- to-point computing) which is capable of supporting diverse computing services. This has been made possible by the extra high speed internet and powerful processors that can execute middle wares without distracting computer’s regular job. The main differences between Grid environment and traditional distributed systems are, -

There is no central control over the computers. General-purpose protocols are used. The Quality of Services is usually very high.

As the internet speed increases, the difference between two PCs working next to each other in a single building, or far from each other in a city or country gradually fades out. Therefore, users are able to execute their tasks on * Anahita Morvaridi. Tel.: +98-0915-521-1840. Email Address: [email protected]

c 2012 Published by Elsevier Ltd. Open access under CC BY-NC-ND license. 1877-0509 ⃝ doi:10.1016/j.procs.2010.04.160

1446

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

Golnar Gharooni-fard / Procedia Computer Science 00 (2010) 000–000

geographically distributed sources. The main idea behind introducing Grid was that we could utilize the computation power in the same way that we use water, electricity and gas power. In other words, we are searching for a way to be able to connect to the tremendous computational power of the whole universe, where the costs are directly dependent to the amount of energy being utilized. This has led to the idea of “economical Grid” which adds the concept of considering execution cost issues in computation algorithms. Therefore, taking into account the main objective of increasing the performance, our focus is no longer limited to raising the speed of the computation but also to reducing its execution cost. In general, we can say that traditional models for scheduling Grid are pretty frail. Considering Grid characteristics, a user may request an application that can be executed on the other side of the world, where network properties such as bandwidth, management policies, computational capabilities and etc are totally different. Therefore Grid scheduling has turned to be a major challenge. Here’s a list of the most important challenges of scheduling in Grid environment: -

Sources are usually shared between the users so there may be a competition among them. The scheduler is not in control of the sources. The number of available sources is constantly changing. Sources are located on different management sites. Sources are heterogeneous. Most of the workflow applications are data-centric and therefore need a large amount of data transfer between two sites.

In this paper we investigate the problem of scheduling workflows considering the QoS constraints. Since this problem is an NP-complete one, we proposed a meta-heuristic algorithm based on genetic algorithms to solve the workflow scheduling problem with the objective of minimizing time and cost of the execution. The cost of a service is normally related to the quality of the service it provides. Generally, service providers charge more money in response to higher quality of service. In addition, users may not always desire to complete workflows earlier than they require. Cheaper services with lower QoS that is sufficient to meet the user’s requirements are sometimes preferred. Therefore, a trade off between the time and monetary cost needs to be considered. Given this motivation, we suggest a method considering time and cost simultaneously, when scheduling a workflow execution. The remainder of the paper is organized as follows. We introduce related work in the next section. Then a general overview of the scheduling problem is explained, followed by defining the basic concepts used in our algorithm. Our proposed chaos-genetic algorithm is presented in section 4. Experimental details and simulation results are presented in section 5. Finally, we conclude the paper with directions for further work in the last section. 2. Related work Several heuristics have been proposed to solve the workflow scheduling problem. Generally, scheduling algorithms can be classified into two major groups, in view of their main objectives. First, a group of works that only attempt to minimize workflow execution time, without considering user’s budget. Minmin, which sets the highest priority to tasks with the shortest execution time, and Maxmin, which sets the high priority to the tasks with the long execution times are two major heuristic algorithms employed for scheduling workflows on Grids. Sufferage, is another heuristic algorithm which sets high scheduling priority to tasks whose completion time by the second best resource is far from that of the best resource which can complete the task at earliest time. These algorithms have been used to schedule EMAN bio-imaging application in [1]. Blythe et al [2] developed a workflow scheduling algorithm based on Greedy Randomized Adaptive Search Procedure (GRASP) [3] and compared it with Minmin in different scenarios. In [4], another heuristic algorithm based on genetic algorithms was proposed which takes into account the information of the entire workflow. Another workflow level heuristic is a Heterogeneous-Earliest-Finish-Time (HEFT) algorithm proposed by Wieczorek et al. in [5]. Second, a group of works which address scheduling problems based on user’s budget constraints. Nimrod-G [6] schedules independent tasks for parameter-sweep applications to meet user’s budget. More recently, Tsiakkouri et al [7] developed scheduling approaches, LOSS and GAIN, to adjust a schedule which is generated by a time optimized heuristic and cost optimized heuristic to meet user’s budget constraints. Our aim is to introduce a new

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

1447

Golnar Gharooni-fard / Procedia Computer Science 00 (2010) 000–000

method based on genetic algorithms to solve the scheduling problem considering the budget and deadline of entire network. 3. Problem description A workflow application can be modelled as a Directed Acyclic Graph (DAG). There is a finite set of tasks Ti ( i = 1,2, …, n) and a set of directed arcs of the form ( Ti ,Tj ), where Ti is the parent task of Tj , and Tj is the child of Ti. A child task can never be executed unless all of its parent tasks have been completed. Let B be the cost constraint (budget) and D be the time constraint (deadline), specified by the user’s workflow execution. ௝ Let m be the total number of services available. There’s a set of services ܵ௜ ሺ݅ ൌ ͳǡʹǡ ǥ ǡ ݊ǡ ݆ ൌ ͳǡʹǡ ǥ ǡ ݉௜ ǡ ݉௜ ൑ ݉ሻcapable of executing taskܶ௜ , but each task can only be assigned for execution one of these services. Services ௝ ௝ have varied processing capability delivered at different prices. We denote ‫ݐ‬௜ as the processing time, and ܿ௜ as the ௝ service price for processing ܶ௜ on serviceܵ௜ . ௝ The scheduling problem is to map every ܶ௜ onto a suitable ܵ௜ in order to improve the execution time and cost of a workflow according to the user’s budget and deadline. In the next section, we’ll introduce the main concepts used to design the algorithm. 3.1. Genetic Algorithms Genetic Algorithms were introduced by John Holland in early seventies as a special technique for function optimization. Genetic algorithms are based on the biological phenomenon of genetic evolution. The basic idea is as that the genetic pool of a given population potentially contains the solution, or a better solution, to a given adaptive problem. This solution is not active because the genetic combination on which it relies is split between several subjects. Only the association of different chromosomes can lead to the solution. During reproduction and crossover, new genetic combinations occur and, finally, a subject can inherit a good gene from both parents. The algorithm operates in an iterative manner and evolves a new generation from the current generation by application of genetic operators. A new generation is created by first increasing the population by random individual solutions and then selecting a constant number of solutions based on their fitness values [8]. Therefore given a clearly defined problem to be solved and strings of candidate solutions, a simple GA works as follows: 1. 2. 3. 4. 5.

Initialize the population. Calculate fitness for each individual in the population. Reproduce selected individuals to form a new population. Perform crossover and mutation on the population. Loop to step 2 until some condition is met.

In some GA implementations, operations other than crossover and mutation are carried out in step 4. Crossover, however, is considered by many to be an essential operation of all GAs. Termination of the algorithm is usually based either on achieving a population member with some specified fitness or on having run the algorithm for a given number of generations. 3.2. Chaos Chaos is a none-periodic, long-term behaviour in a deterministic system that exhibits sensitive dependence on initial conditions. Edward Lorenz irregularity in a toy model of the weather displays first chaotic or strange attractor in 1963. It was mathematically defined as randomness generated by simple deterministic systems. A deterministic structure can have no stochastic (probabilistic) parameters. Therefore chaotic systems are not at all equal to noisy systems driven by random processes. The irregular behaviour of the chaotic systems arises from intrinsic nonlinearities rather than noise. In general, the most important defining property of chaotic variables is Sensitive dependence to Initial Conditions (SIC), which requires that trajectories originating from very nearly identical initial conditions diverge at an exponential rate. Pseudo-randomness and ergodicity are other dynamic characteristics of a chaotic structure [9]. The

1448

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

Golnar Gharooni-fardd / Procedia Computer Science 00 (2010) 000–000

latter ensures that the track of a chaotic variaable can travel ergodically over the whole space of intterest. The variation of the chaotic variable has a delicate innherent rule in spite of the fact that it looks like a disorderr. 3.3. Chaos-Genetic Algorithm Recently, the idea of using chaotic systems innstead of random processes has been noticed in several fieelds. One of these fields is optimization theory. In random-bbased optimization algorithms, the role of randomness can be played by a chaotic dynamics. Experimental studies assert a that the benefits of using chaotic signals instead of random signals are often evident although it is not maathematically proved yet [10]. For example in genetic algorithms, chaotic sequences increase the value of some measured algorithm-performance indexes with respectt to random sequences. In this paper a Chaos-Genetic Scheduling alggorithm, CGS, is proposed that combines the concept off chaos with genetic algorithms when looking for an optimal solution, in order to possess a joint advantage of GA and d the chaotic variable [11]. Firstly, CGS takes the advantagess of the characteristics of the chaotic variable to make the individuals of subgenerations distributed ergodically in thee defined space and thus to avoid the premature converg gence of the individuals in the subgenerations. Secondly, CG GS also takes the advantage of the convergence characterristic of GA to overcome the randomness of the chaotic prrocess and hence to increase the probability of finding g the global optimal solution. The idea of combining chaos with Genetic Algorithm A has also been studied in other computer-relateed fields. In [12] a chaos-genetic based approach is proposeed in order to solve the Network-on-Chip mapping prob blem. In the field of neural networks, chaos search is used to t accompany GA in order to overcome the weakness off Traditional Genetic Algorithm (TGA) [13]. In [14] a chaos--genetic algorithm based on the chaos optimization algoriithm (COA) and genetic algorithm, is presented to overcom me premature local optimum and increase the convergence speed of genetic algorithm. Simulation results indicate that the Chaos GA can improve convergence speed and a solution accuracy, in all the literature mentioned above. 4. The proposed algorithm For a workflow scheduling problem, a feasiblle solution is required to meet several conditions. A task can only be started after all its predecessors have completedd, every task appears once and only once in the schedulle, and each task must be allocated to one available time slot of a service capable of executing the task.

Fig.1. Illustration of problem encoding, (a) sample workflow w, (b) set of source-to-task assignments, (c) an example of a one-dimenssional chromosome, (d) execution order of the sample chromosomee.

Each individual in the population represents a feasible solution to the problem, and consists of a vector of task assignments. Each task assignment includes fouur elements (task ID, service ID, start time, end time). The first two parameters identify to which service each task is assigned. Since involving time frames during the genetiic operation may lead to a very complicated situation [15],, in this work we ignore the time frames. Therefore, th he operation strings (chromosomes) encode only the service allocation for each task and the order of the tasks allocaated on each service. Different execution priorities of such parallel p tasks within the workflow may impact the perfformance of workflow execution significantly. For this reasoon, the solution representation strings are required to sho ow the order of task assignments on each service in addition to t service allocation of each task. As it is also used in [15 5], we create a 2D string to represent a schedule as illustratedd in Fig.1. One dimension represents the numbers of servicces whereas

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

1449

Golnar Gharooni-fard / Procedia Computer Science 00 (2010) 000–000

the other dimension shows the order of tasks on each service. Two-dimensional strings are then converted into a one-dimensional string for genetic manipulations. As stated earlier, the problem is to schedule a workflow execution considering both time and user budget constraints. The first decision to be made is how to represent the solution. Fig.1 shows an example of an individual in the initial population. Initializing the population is another important issue, which is usually done randomly. Therefore a random number generator is used to produce values between 1 to n. For each task, these random values are chosen from the sources that are capable of executing that task. The length of the chromosome depends on the number of tasks in the workflow. A chaotic mapping operator is then applied to the initial population generating a new chaotic population. The evolution process of the chaotic variables could be defined through the following equation: ሺ௞ାଵሻ

ܿ‫ݏ‬௜

ሺ௞ሻ

ሺ௞ሻ

ൌ Ͷܿ‫ݏ‬௜ ൫ͳ െ ܿ‫ݏ‬௜ ൯ǡ݅ ൌ ͳǡ ʹǡ ǥ ǡ ݉ሺͳሻ

in which ܿ‫ݏ‬௜ is the i-th chaotic variable and k and k+1 denote the number of iterations. Note that values of ܿ‫ݏ‬௜ are distributed in the range of (0,1). The chaotic mapping operator works as follows: 1. 2. 3.

Divide the interval (0,1) to m equal sub-intervals ( m denotes the number of resources capable of executing a special task). The value of each gene in the first randomly produced population is mapped to new values of ܿ‫ݏ‬௜ in the range of (0,1). ሺଵሻ

These values of ‫ݏ‬௜ , i = 1,2,…, n are linearly mapped using the operator ሺభሻ

௦೔

ሺଵሻ

ൌ ܿ‫ݏ‬௜ ሺʹሻ 

௠೔

4. 5.

where mi is the total number of resources capable of executing Ti. ሺଶሻ The next iteration chaotic variables ܿ‫ݏ‬௜ , will be produced through applying Equation.1 to the values of ሺଵሻ ܿ‫ݏ‬௜ , generated in the previous step. ሺଶሻ ሺଶሻ The chaotic variables ܿ‫ݏ‬௜ , are then used to produce ‫ݏ‬௜ , using ሺଶሻ

‫ݏ‬௜

ሺଶሻ

ൌ  ඃܿ‫ݏ‬௜

ൈ ݉௜ ඇሺ͵ሻ ሺ௞ሻ

Thus, we can continue to produce the values of ‫ݏ‬௜ for each chromosome, through the operators defined in (1) (3). At this stage, the fitness of all 20 individuals is evaluated. The fitness value is often proportional to the output value of the function being optimized according to the given objectives. As the goal of scheduling is to improve the performance of a workflow execution by minimizing the time and cost, the fitness function separates evaluation in two parts [15]: cost-fitness and time-fitness. For the budget constrained scheduling, the cost-fitness component produces results with less cost. The cost fitness function of an individual I is defined by: ‫ܨ‬௖௢௦௧ ሺ‫ܫ‬ሻ ൌ

௖ሺூሻ ஻

ሺͶሻ

where c(I) is the sum of the task execution cost and data transmission cost of I and B is the budget of the workflow. For the budget constrained scheduling, the time-fitness component is designed to produce individuals that satisfy deadline constraint. The time-fitness function of an individual I is defined by: ‫ܨ‬௧௜௠௘ ሺ‫ܫ‬ሻ ൌ

௧ሺூሻ ஽

(5)

where t(I) is the completion time of I, D is the deadline of the workflow. The final fitness function combines the two parts and it is expressed as:

1450

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

Golnar Gharooni-fardd / Procedia Computer Science 00 (2010) 000–000

‫ܨ‬ሺ‫ܫ‬ሻ ൌ ቊ

‫ܨ‬௖௢௦௧ ሺ‫ܫ‬ሻ ൅ ‫ܨ‬௧௜௠௘ ሺ‫ܫ‬ሻǡ݂݅‫ܨ‬௖௢௦௧ ሺ‫ܫ‬ሻ ൐ ͳ‫ܨݎ݋‬௧௜௠௘ ሺ‫ܫ‬ሻ ൐ ͳ ௖ሺூሻ ௠௔௫௖௢௦௧



௧ሺூሻ ௠௔௫௧௜௠௘

ǡ ‫݁ݏ݅ݓݎ݄݁ݐ݋‬

(6)

o the current population and maxtime denotes the largest completion where maxcost is the most expensive solution of time of the current population. t the next Elitism is incorporated into the algorithm by transferring the single fittest individual directly to generation. Crossover is performed on randomlly selected individuals according to the idea that it may result in an even better individuals by combining the two fittest f ones [10]. The crossover operator used in this alg gorithm is a basic two-point crossover which works as follow ws: 1. 2. 3. 4. 5.

Two random parents are chosen in the current c population. Two random points are selected from thhe schedule order of the first parent. All tasks between these two points are chosen as successive crossover points. The locations of all tasks of the crossovver points between the two parents are exchanged. Two new offsprings are generated by combining task assignments taken from two parents.

Fig.2. shows an example of the process explaineed above.

Fig.2. Crossover operation

Finally, a constant mutation rate (0.05) is applied in our proposed algorithm. Mutation aims to reeallocate an alternative service to a task in an individual. An example of the mutation process is illustrated in fig.3. It is implemented as follows: 1. 2.

mosome. A task is randomly selected in a chrom An alternative service which is also cappable of executing the task is randomly selected to replacee the current task allocation.

Fig.3. Mutation operation

The new population is now ready for anotherr round of chaotic mapping, crossover, and mutation, producing yet another generation. So the initial population is replaced by these newly generated individuals. Obvio ously, more generations are produced until the stopping coondition (a maximum number of generations k) is met.. The fittest chromosome is thus returned as a solution.

1451

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

Golnar Gharooni-fard / Procedia Computer Science 00 (2010) 000–000

5. Experimental

results

According to workflow projects, workflow applications can be categorized as either balanced structure or unbalanced structure. Our proposed algorithm is applied to examples of both balanced and unbalanced structures. We use two common workflow applications for our experiments: A balanced application (fMRI workflow shown in fig.4 (a)) and an unbalanced structure (DNA workflow, shown in fig.4 (b)).

(a)

(b)

Fig.4.(a) A balanced workflow (fMRI), (b) An unbalanced workflow (DNA)

The two metrics used to evaluate our algorithm (CGS), are execution time and cost. Table 1 show service speed and corresponding price (time and cost) for executing T1 on different sources, for fMRI workflow. First column of the table denotes the number of sources capable of executing T1. For example, in fMRI, T1 can be executed on five sources S1-S5. Table 1. Data samples for executing T1 in fMRI workflow

Source ID 1 2 3 4 5

Time 14 11 10 16 8

Cost 150 144 151 119 157

The following parameter settings are the default configuration for simulating both Genetic Algorithm and ChaosGenetic Algorithm. Population size of 10 normal chromosomes followed by 10 chaotic chromosomes, crossover probability of 0.98 and mutation probability of 0.05.In order to be able to evaluate the results of our proposed algorithm (CGS), we also implemented a traditional genetic algorithm to solve the workflow scheduling problem. Since GA is a stochastic search algorithm, each of the experiments was repeated 10 times and the average values are used to report the results. ϰϯϬϬ

ϰϯϬϬ

ϰϮϬϬ

ϰϮϬϬ

ϰϭϬϬ

ϰϭϬϬ ϰϬϬϬ ŽƐƚ;'ΨͿ

ŽƐƚ;'ΨͿ

ϰϬϬϬ ϯϵϬϬ ϯϴϬϬ ϯϳϬϬ

ϯϵϬϬ ϯϴϬϬ ϯϳϬϬ

ϯϲϬϬ

ϯϲϬϬ

ϯϱϬϬ

ϯϱϬϬ

ϯϰϬϬ

ϯϰϬϬ ϮϬϬ

ϮϮϬ

ϮϰϬ dŝŵĞ;,ŽƵƌƐͿ

ϮϲϬ

ϮϴϬ

ϮϬϬ

(a) Fig.5. Distribution of individuals when executing (a) TGS (b) CGS on DNA workflow

ϮϮϬ

ϮϰϬ dŝŵĞ;,ŽƵƌƐͿ

(b)

ϮϲϬ

ϮϴϬ

1452

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

Golnar Gharooni-fardd / Procedia Computer Science 00 (2010) 000–000

1.4

1.4

1.2

1.2

1

ĐŽƐƚͬďƵĚŐĞƚ

ĐŽƐƚͬďƵĚŐĞƚ

As mentioned earlier, the major characterisstic of CGS is that it prevents the premature convergeence of the individuals of TGS and thus increases the probaability of finding better solutions. In order to illustrate the distribution of the individuals in our problem, we run our algorithm a (CGS) to execute DNA workflow and comparee the results with TGS. The distribution of the individuals off total 100 generations is illustrated in fig.5. As it is clear in Fig.5 (b), the individuals of suubgenerations generated by CGS are almost evenly scatterred over the defined space and do not concentrate to the centre of the space anymore (see fig.5 (a)). Although the sam me numbers of solutions are considered in plotting both figgures, the reason they look less in the case of TGS is th hat they are mostly very close to each other, and therefore thhe differences are not really clear in Fig.5 (a). In order to evaluate algorithm on reasonable budget and deadline constraints, we also implemented a Traditional Genetic algorithm for scheduling workflow appplications (TGS), so that it would be possible to comparee the results obtained from CGS with the ones gained from TGS, T for the same workflow applications.

0.8 0.6

d'^

0.4

'^

0.2

1 0.8 0.6

d'^

0.4

'^

0.2 0

0 3000 4000 5000 6000 7000 8000

3000 4000 5000 6000 7000 8000

ƵƐĞƌďƵĚŐĞƚ

ƵƐĞƌďƵĚŐĞƚ

(a)

(b)

C on balanced (fMRI) and unbalanced (DNA) structures Fig.6. Comparison between the execution cost of TGS and CGS

1.4

0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0

1.2

d'^ '^

ƚŝŵĞͬĚĞĂĚůŝŶĞ

ƚŝŵĞͬĚĞĂĚůŝŶĞ

In Fig.6 (a) the results are obtained under the assumption of D = 220(H) and in Fig.6 (b) we assume, D = 240(H). he total cost The values of these assumptions are made baseed on [15]. The values in vertical axis are the result of th divided by the user budget constraint, starting frrom G$3000 to G$8000. We observe that both TGS and CGS C cannot satisfy the low budget constraint (about G$30000), and TGS shows the worst results in both applicationss. However, results are gradually improved under medium m budget constraints (about G$5000). Obviously, the descending behaviour of the diagram shows that as the budgget increases, it’ll be easier for the algorithms to meet the user budget constraint. On the other hand, considering thee differences between two approaches, it’s obvious that TGS takes much longer to complete even when the budgeets are high. Therefore, CGS shows better performance compared c to TGS in both applications.

1 0.8 0.6

d'^

0.4

'^

0.2 0

160 180 200 220 240 260 280

180 200 220 240 260 280 300

ĚĞĂĚůŝŶĞ

ĚĞĂĚůŝŶĞ

(a)

(b)

C on balanced (fMRI) and unbalanced (DNA) structures Fig.7. Comparison between the execution time of TGS and CGS

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

1453

Golnar Gharooni-fard / Procedia Computer Science 00 (2010) 000–000

Fig.7 illustrates the comparison between the execution times of the two algorithms with the medium budget of 5000. We change the user deadline values from 180(H) to 300(H) for DNA and 160(H) to 280(H) for fMRI, since the latter is a balanced workflow and takes less time to complete. It can be seen that TGS takes longer to complete in most of the conditions. The differences are better observed in the unbalanced workflow structure (see fig.7 (b)). In all the above illustrations, there may be states where CGS and TGS show similar results (for instance in fig.6 (a) under budget constraint of 7000). These are the conditions where, TGS solutions are not trapped in a local optimum so it works as well as CGS in finding the good results for a given problem. In those conditions, CGS does not do any good in saving the suitable solutions. In the rest of the states though, TGA, is stocked somewhere in a local optimum (as it usually does), which prevents the algorithm from producing better possible results. Our chaosgenetic algorithm (CGS), takes the advantages of the characteristics of the chaotic variable to make the individuals of subgenerations distributed ergodically in the defined space and thus to avoid from the premature of the individuals in the subgenerations. It also takes the advantage of the convergence characteristic of TGA to overcome the randomness of the chaotic process and hence to increase the probability of finding the global optimal solution. 7. Conclusion and future works In this work we introduce a novel chaos-genetic based algorithm that uses chaotic sequences instead of random processes in traditional genetic algorithms. We evaluate our approach by employing it to both balanced and unbalanced workflow structures. The results show better performances of Chaos Genetic Scheduling (CGS) algorithm in both cases, when compared with Traditional Genetic (TGS). The reason is that, chaos-genetic algorithm uses the characteristics of chaotic variables in scattering the solutions among the whole search space and thus avoids the premature convergence of the solutions and produces better results within a shorter time. We will be further enhancing our scheduling algorithm by considering other QoS properties such as reliability. The performance of the algorithm can be improved by using the properties of chaotic sequences in other random decisions made in traditional genetic algorithms such as specifying crossover points. We can also apply other onedimensional chaotic maps instead of Logistic map and compare the performance of our algorithm to find out which one works best for our scheduling problem. Acknowledgments This work is supported by the Iranian Telecommunication Research Center (ITRC) and the Young Researchers Club. References 1. A. Mandal et al., “Scheduling Strategies for Mapping Application Workflows onto the Grid”, In IEEE International Symposium on High Performance Distributed Computing (HPDC 2005), 2005. 2. J. Blythe et al., “Task Scheduling Strategies for Workflow-based Applications in Grids”, In IEEE International Symposium on Cluster Computing and Grid (CCGrid), 2005. 3. T. A. Feo and M. G. C. Resende, “Greedy Randomized Adaptive Search Procedures”, Journal of Global Optimization, 6:109-133, 1995. 4. R. Prodan and T. Fahringer, “Dynamic Scheduling of Scientific Workflow Applications on the Grid using a Modular Optimisation Tool: A Case Study”, In 20th Symposium of Applied Computing (SAC 2005), Santa Fe, New Mexico, USA, March 2005. ACM Press. 5. M. Wieczorek, R. Prodan and T. Fahringer, “Scheduling of Scientific Workflows in the ASKALON Grid Environment”, Special Issues on scientific workflows, ACM SIDMOD Record, 34(3):56-62, ACM Press, 2005. 6. R. Buyya, J. Giddy, and D. Abramson, “An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications”, In 2nd Workshop on Active Middleware Services (AMS 2000), Kluwer Academic Press, August 1, 2000, Pittsburgh, USA. 7. E. Tsiakkouri et al., “Scheduling Workflows with Budget Constraints”, In the CoreGRID Workshop on Integrated research in Grid Computing, S. Gorlatch and M. Danelutto (Eds.), Technical Report TR-05-22, University of Pisa, Dipartimento Di Informatica, Pisa, Italy, Nov. 28-30, 2005, pages 347-357 . 8. Melanie, Mitchell. 1998. An Introduction to Genetic Algorithms, A Bradford Book The MIT Press, Cambridge, Massachusetts. London England. 9. Peter Stavroulakis, 2006, Chaos Application in Telecommunications, Taylor & Francis. 10. M.Bucolo, R.Caponetto, L.Fortuna, M.Frasca, and A.Rizzo, 2002, Does chaos work better than noise? IEEE Circuits and Systems Magazine 2 (3), 4-19. 11. X.F.Yan , D.Z.Chen and S.X.Hu, 2003, Chaos-genetic algorithms for optimizing the operating conditions based on RBF-PLS model” Elsevier Computers and Chemical Engineering , 1393-1404. 12. F. Moei-darbari, A. Khademzadeh, and G. Gharooni-fard, “Evaluating the performance of chaos genetic algorithm for solving the Network-on-Chip mapping problem”, in proc. IEEE International Conference on Computational Science and Engineering. Vancouver, Canada. vol. 2, pp.366-373, August 2009.

1454

G. Gharooni-fard et al. / Procedia Computer Science 1 (2012) 1445–1454

Golnar Gharooni-fard / Procedia Computer Science 00 (2010) 000–000 13. Y. Yong, S. Wanxing, and W. Sunam, “Study of chaos genetic algorithms and its application in neural networks”, in proc. IEEE TENCON’02, pp. 232- 235, November 2008. 14. C. Cheng, W. Wang, D. Xu, and K.W. Chau, “ Optimizing hydropower reservoir operation using hybrid genetic algorithm and chaos”, Water Resources Management, Vol. 22, No. 7, 2008, pp 895-909. 15. J. Yu and R. Buyya, Scheduling Scientific Workflow Applications with Deadline and Budget Constraints using Genetic Algorithms, ScientificProgramming,14:217-230, 2006.