A novel multi-objective bacteria foraging optimization algorithm (MOBFOA) for multi-objective scheduling


Accepted Manuscript

Title: A novel multi-objective bacteria foraging optimization algorithm (MOBFOA) for multi-objective scheduling

Authors: Mandeep Kaur, Sanjay Kadam

PII: S1568-4946(18)30068-1
DOI: https://doi.org/10.1016/j.asoc.2018.02.011
Reference: ASOC 4703

To appear in: Applied Soft Computing

Received date: 13-8-2017
Revised date: 7-2-2018
Accepted date: 7-2-2018

Please cite this article as: Mandeep Kaur, Sanjay Kadam, A novel multi-objective bacteria foraging optimization algorithm (MOBFOA) for multi-objective scheduling, Applied Soft Computing (2018), https://doi.org/10.1016/j.asoc.2018.02.011

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Highlights

• We propose a novel multi-objective bacteria foraging optimization algorithm (MOBFOA), which extends the original BFOA to handle multi-objective problems.

• An adaptive step-size based chemotaxis process enables the proposed MOBFOA to converge to an optimal or near-optimal solution rather than getting stuck in a local minimum.

• The proposed MOBFOA uses a new bacteria fitness assignment method (based on the strength of a solution) and a new bacteria selection procedure for simultaneous optimization of multiple objectives.

• A comparative study is carried out between the proposed MOBFOA and other stochastic algorithms (OMOPSO and NSGA-II) with respect to convergence towards the Pareto-optimal front and the spread of solutions in the search space.

• The proposed MOBFOA is a viable tool for handling the multi-objective scheduling problem.

Mandeep Kaur (a), Sanjay Kadam (b,∗)

(a) Department of Computer Science, Savitribai Phule Pune University (SPPU), Pune, India
(b) Center for Development of Advanced Computing, SPPU Campus, Ganeshkhind, Pune, India

Abstract

In cloud and grid environments, many users compete for resources, so schedules should be generated in the shortest possible time. To address this problem, there have been several research initiatives to use evolutionary and swarm based algorithms to find near-optimal scheduling solutions. The state-of-the-art evolutionary algorithms for handling single/multi-criteria scheduling of n jobs on m resources are still evolving, with efforts aimed at reducing their space/time complexity, maintaining diversity in the population and directing the search towards the true Pareto-optimal solutions. In this paper, we propose an improved multi-objective bacteria foraging optimization algorithm (MOBFOA). We modify the original BFOA to handle multi-objective scheduling problems using a Pareto-optimal front approach. The improvement lies in selecting bacteria positions from both the dominated and the non-dominated fronts to obtain diversity in the solutions. The accuracy and speed of convergence of the BFOA are improved by introducing an adaptive step size in the chemotactic step. The proposed MOBFOA uses a new fitness assignment method and bacteria selection procedure for simultaneous optimization of multiple objectives, where each solution evaluation is computationally expensive. This paper focuses on the scheduling of independent jobs, considering multi-objective trade-offs among the objective functions desired by the users in a grid/cloud environment. The performance of the proposed MOBFOA is discussed in terms of convergence towards the Pareto-optimal front and distribution of solutions in the search space. The paper also provides a comparative study of the results obtained by the proposed MOBFOA with other stochastic optimization algorithms, namely, the non-dominated sorting genetic algorithm-II (NSGA-II) and optimised multi-objective particle swarm optimization (OMOPSO).

Keywords: BFOA, Scheduling, Cloud computing, Multi-objective, Makespan.

∗ Corresponding author. Email addresses: [email protected] (Mandeep Kaur), [email protected] (Sanjay Kadam)

1. Introduction

Grid and cloud computing have evolved as new global infrastructures of the 21st century for solving large-scale problems in engineering, science and other allied areas in a distributed environment [1, 2]. One important aspect of such environments is to discover data and computational resources, which are heterogeneous, dynamic in nature and under different administrative controls, and to allocate these resources to user jobs [3]. The challenge is to schedule the user jobs on these resources within a given time frame so that the grid resources are utilized optimally and the user-specified scheduling criteria are taken care of [4]. Scheduling in grid and cloud environments is about mapping n independent/dependent jobs onto m resources so as to optimise given criteria specified by the user or the system. A scheduling solution that takes care of multiple objectives or criteria is called Pareto-optimal if no other solution is at least as good in all the objectives and strictly better in at least one objective. The Pareto-optimal front comprises the set of all non-dominated solutions [5, 6].

The grid and cloud scheduling problem is NP-hard; hence, no polynomial-time method is known for obtaining optimal scheduling solutions [7]. If exhaustive search is used for such problems, the time taken to produce a scheduling solution could be very high. There have been several research initiatives to use evolutionary and swarm intelligence based algorithms to find near-optimal scheduling solutions. In multi-criteria scheduling, two or more conflicting criteria are considered for optimization during the scheduling process. In this paper we attempt to prepare schedules that can handle multiple scheduling criteria with our proposed multi-objective optimization technique (MOBFOA).

We propose an improved multi-objective bacteria foraging optimization algorithm (MOBFOA), which generates a set of trade-off solutions with respect to multiple criteria for scheduling of independent jobs on heterogeneous resources. We propose a new strategy to assign a fitness value to each bacterium, based on the difference between the strength values of all bacteria it dominates and the strength values of all bacteria that dominate it. The crowding distance together with the fitness values is then used to determine the bacteria that go to the next generation. We also use an adaptive step size in the chemotaxis process. With these changes, we compare the non-dominated solutions obtained by MOBFOA, in terms of their convergence and diversity, with the solutions obtained by other well-known algorithms such as multi-objective particle swarm optimization (OMOPSO) and non-dominated sorting genetic algorithm-II (NSGA-II).

1.1. Multi-objective optimization

Many scheduling optimization problems contain more than one objective function to be optimised, and the objectives may conflict with each other. Such problems have no single best solution. Instead, good 'trade-off' solutions can be found that represent the best possible compromises among the scheduling criteria or objectives [8]. The following issues are to be considered while formulating the multi-objective optimization problem [9]: a) selection of solutions that guide the search towards the Pareto-optimal front, b) a mechanism to preserve the non-dominated solutions across the populations, and c) a mechanism to retain diversity in the population.

Multi-objective optimization algorithms can use several methods for handling multi-criteria problems. These include aggregation, lexicographic ordering, sub-population, Pareto-based and hybrid approaches [9]. In aggregation, all the objectives are combined into a single objective. In lexicographic ordering, the user ranks the objectives in order of relevance or importance. The optimum solution is then obtained by minimizing the objective functions separately: the search begins with the most important objective and proceeds according to the assigned order of relevance of the objective functions. In the sub-population approach, the sub-populations exchange information in order to produce trade-offs among the objectives that are optimized separately. In the Pareto-based method, the idea is to select the non-dominated solutions and guide the search towards the true Pareto-optimal front [10].

Mathematically, we can define the multi-objective optimization problem as minimizing or maximizing each objective, that is, minimize $f(y) \equiv \langle f_1(y), \ldots, f_n(y) \rangle$, where $y \in Y$ and $Y$ is the solution space. A solution dominates another solution if it is at least as good as the other solution in every objective function and strictly better in at least one objective function. That is, $y^*$ dominates $y$ if and only if $f_i(y^*) \le f_i(y)$ for all $i \in \{1, \ldots, n\}$ and $f_j(y^*) < f_j(y)$ for some $j \in \{1, \ldots, n\}$. Fig. 1 shows solutions for a minimization problem with two conflicting objectives, $f_1(y)$ and $f_2(y)$. The solution $y_3$ dominates $x_2$, because both objective values of $y_3$ are lower than those of $x_2$. However, $y_3$ does not dominate $y_4$, since $f_2(y_3) > f_2(y_4)$. We say that $y_3$ and $y_4$ are non-dominated solutions: each is better in one objective, but neither is superior to the other with respect to both objective functions. The set of non-dominated solutions forms a non-dominated set $S$ that satisfies the following conditions: a) any solution in $S$ is non-dominated with respect to any other solution in $S$ over all objectives, and b) any solution in $S$ dominates at least one solution not belonging to $S$. For example, the set of solutions $y_1$ to $y_4$ (Fig. 1) is a non-dominated set for the given problem.
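The dominance relation defined above can be checked directly in code. The following is a minimal Python sketch, assuming that every objective is minimized; the function names dominates and non_dominated and the sample objective vectors are illustrative only and are not taken from the paper or from Fig. 1.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))        # f_i(a) <= f_i(b) for all i
    strictly_better = any(x < y for x, y in zip(a, b))  # f_j(a) <  f_j(b) for some j
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (f1, f2) values, chosen only for illustration.
points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
front = non_dominated(points)
# (3, 4) is dominated by (2, 3); (5, 5) is dominated by every other point.
```

A point never dominates itself, because the strict inequality fails, so the filter needs no special case for comparing a point with itself.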

Figure 1: Five non-dominated and four dominated solutions for the objectives f1 (y) and f2 (y).


1.2. Organization of the paper

The paper is organized into eight sections. The first section provides the background and motivation of the research work. The second section describes the existing multi-objective scheduling approaches. The third section describes the problem statement. The fourth section describes the proposed multi-objective BFO algorithm (MOBFOA). The fifth section provides insights into the other multi-objective optimization algorithms considered for the comparative study with our proposed approach. The sixth section presents simulation results for the independent job scheduling mechanism. The seventh section provides a detailed discussion of the results and observations. The last section concludes the proposed research work and outlines future directions.

2. Related Work

There have been numerous research initiatives over the past two decades in the area of cloud and grid scheduling. Many evolutionary and swarm intelligence based approaches, such as genetic algorithms (GA), bee colony optimization (BCO), particle swarm optimization (PSO), ant colony optimization (ACO), the non-dominated sorting genetic algorithm (NSGA-II) and other multi-objective optimization algorithms, have been proposed and developed for cloud and grid scheduling.

Carretero et al. [11] have proposed a genetic algorithm based evolutionary approach that minimizes flowtime and makespan. A single composite objective function is used to address the two scheduling criteria by allotting weights to both objective functions. Their work is based on the assumption of a one-to-one job-resource mapping, whereas in an actual grid environment many jobs can be allocated to a single resource. Grosan et al. [12] have presented a multi-objective optimization approach to minimize makespan and flowtime. They have compared their approach with other techniques such as GA, PSO and simulated annealing (SA), and have developed their own simulator to create a grid environment. Xhafa et al. have proposed an improved tabu search algorithm for scheduling independent jobs in computational grids [13]. The two objective functions, flowtime and makespan, are optimized to generate a schedule for independent jobs in the grid.

Delavar et al. [14] have proposed a modified ACO algorithm that generates optimized schedules for grid jobs. The objective functions considered for scheduling are cost and job execution time. The authors claim that their modified algorithm outperforms other algorithms such as multi-objective ACO and the original ACO. Pan et al. [15] have proposed a new artificial chemical reaction optimization method for generating near-optimal schedules. In their chemical reaction process, the reactants interact with each other to attain the minimum enthalpy (potential energy) state, which significantly improves the makespan (overall schedule length) in grid scheduling. The proposed approach is compared with a GA approach and the heterogeneous earliest finish time (HEFT) algorithm.

Salimi et al. [16] have proposed NSGA-II with a fuzzy variance based crossover approach and optimized two scheduling criteria, namely, makespan and resource usage price. Paul et al. [17] have developed a multi-objective evolutionary technique based on NSGA-II, with the aim of maximizing resource utilization and minimizing the resource usage cost, penalty time and makespan. In [18], the author has proposed a new multi-objective scheduling approach and attempted to optimize multiple scheduling criteria (resource usage cost, makespan and job success rate). Selvi and Manimegalai [19] have presented a multi-objective variable neighborhood search (MVNS) technique for scheduling independent jobs. The proposed technique simultaneously optimizes makespan and flowtime, which are conflicting objective functions. The proposed technique is compared with other randomized metaheuristics such as GRASP.

Zhang et al. [20] have proposed an ordinal optimization method for multi-objective many-task scheduling in a cloud environment. A rough model is used to reduce the scheduling overhead and to produce sub-optimal scheduling solutions. The proposed method results in a thousand-fold reduction in the search time for semi-optimal workflow schedules compared with the Monte Carlo and Blind Pick methods. Jena [21] has presented a nested PSO framework for multi-objective task scheduling in a cloud environment. This work focuses on task scheduling using a multi-objective nested particle swarm optimization (TSPSO) to optimize energy and processing time. The results obtained by TSPSO are compared with existing scheduling algorithms, and it is found that TSPSO provides an optimal balance of results for multiple objectives. In [22], the authors have introduced an optimized task scheduling algorithm which adapts the advantages of various other existing algorithms according to the situation, while considering the distribution and scalability characteristics of cloud resources.

In [23], the author has presented a task scheduling approach for the cloud using a multi-objective artificial bee colony algorithm (TA-ABC). The proposed algorithm optimizes the energy, cost, resource utilization and processing time of the cloud environment. TA-ABC provides an optimal balance of results for multiple objectives, and a comparative study has been carried out to measure the performance of TA-ABC against a few scheduling algorithms. Yao et al. [24] have proposed a multi-swarm multi-objective optimization algorithm (MSMOOA) to satisfy multiple conflicting objectives. The particles in each swarm are divided into two classes that adopt different strategies to evolve cooperatively. One class of particles communicates with several swarms simultaneously to promote information sharing among swarms, and the other class exchanges information with the particles located in the same swarm. The results of the proposed method are evaluated using hybrid and parallel dependent-task applications.

The state-of-the-art evolutionary and swarm intelligence based algorithms for handling single/multi-criteria scheduling of user jobs in grid and cloud environments are still evolving, with efforts aimed at reducing their space/time complexity, maintaining diversity in the population and directing the search towards the true Pareto-optimal solutions. We have proposed a new, improved MOBFOA. The improvement lies in selecting bacteria positions from both the dominated and the non-dominated fronts to obtain diversity in the solutions. The accuracy and speed of convergence of the BFOA are improved by introducing an adaptive step size in the chemotactic step. The proposed MOBFOA uses a new fitness assignment method and bacteria selection procedure for simultaneous optimization of multiple objectives, where each solution evaluation is computationally expensive.

3. Problem formulation

The problem definition consists of the following:

a) A set of m resources and n independent tasks/jobs that need to be scheduled.
b) The tasks have no dependencies, hence they can be executed in any order.
c) The task size is expressed in MI (million instructions) and the resource capacity is measured in MIPS (million instructions per second).
d) The expected time to compute (ECT) of each task on a particular processor is obtained by dividing the task size by the resource capacity. For example, if the task size is 10,000 MI and the resource capacity is 1000 MIPS, then the ECT is 10000/1000 = 10 seconds.

We formulate the independent job scheduling problem as a multi-objective optimization problem in which a group of conflicting objectives is simultaneously optimized. We have used the Pareto-based approach in our proposed MOBFOA. In Pareto-based multi-objective optimization, the distance to the Pareto-optimal front is to be minimized, while the diversity among the generated solutions is maximized. There are three conflicting objectives: minimization of schedule length (makespan), minimization of flowtime and minimization of resource usage cost, as explained in the subsequent subsection.

3.1. Objective functions

The work presented in this paper focuses on achieving near-optimal scheduling solutions for the objective functions described below.

Minimizing flowtime: The flowtime of a set of jobs is the sum of the finalization times of all the jobs. If $S_t$ is the time required to finalise task $t$ and $N$ denotes the set of all jobs to be scheduled, then the objective of the scheduling algorithm is to minimize the average flowtime, that is, $\min \frac{1}{|N|} \sum_{t \in N} S_t$.

Minimizing makespan: Makespan is the time between the start time of the first job/task and the end time of the last job in a schedule. Let $n$ be the total number of tasks $T = \{t_1, t_2, \ldots, t_n\}$. If $S_i$ is the start time of the first task and $F_i$ is the end/finish time of the last task in the $i$th schedule, then the schedule length is defined as the total time span $TS_i$ between $S_i$ and $F_i$ for the $i$th schedule. The problem is to minimize the makespan across all the schedules, that is, $\min \{TS_i\}$, $TS_i \in$ Schedules.

Minimizing resource usage cost: The third fitness function considered in our research work is minimization of the cost of using the resources. The usage cost $EC_{ij}$ of the task $t_i$ on resource $r_j$ is based on the size $l_i$ (in MI) of the task, the processing capacity $P_j$ (in MIPS) of the resource and the cost $C_j$ (per unit time) of using the resource. It can be defined as $EC_{ij} = (l_i / P_j) \cdot C_j$.
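The three objective functions can be evaluated for a given task-resource assignment as in the following Python sketch. It rests on assumptions that the text does not state: tasks assigned to the same resource run back to back in index order, every resource is free at time 0, and the total usage cost of a schedule is the sum of the per-task costs $EC_{ij}$. The function name evaluate_schedule and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence


def evaluate_schedule(assignment: Sequence[int],
                      sizes_mi: Sequence[float],
                      capacity_mips: Sequence[float],
                      cost_per_sec: Sequence[float]) -> Dict[str, float]:
    """Compute makespan, average flowtime and usage cost of one schedule.

    assignment[t] is the 0-based index of the resource that runs task t.
    Assumption: tasks on one resource run sequentially in index order,
    and every resource is idle at time 0.
    """
    busy_until = [0.0] * len(capacity_mips)   # finish time of the last task per resource
    finish_times: List[float] = []
    total_cost = 0.0
    for task, res in enumerate(assignment):
        run_time = sizes_mi[task] / capacity_mips[res]   # expected time to compute, l_i / P_j
        busy_until[res] += run_time
        finish_times.append(busy_until[res])             # finalization time of this task
        total_cost += run_time * cost_per_sec[res]       # EC_ij = (l_i / P_j) * C_j
    return {
        "makespan": max(busy_until),
        "avg_flowtime": sum(finish_times) / len(finish_times),
        "cost": total_cost,
    }


# Hypothetical data: 4 tasks, 2 resources.
sizes = [10000.0, 4000.0, 6000.0, 2000.0]   # task sizes in MI
capacity = [1000.0, 500.0]                  # resource capacities in MIPS
price = [2.0, 1.0]                          # cost per second of each resource
result = evaluate_schedule([0, 1, 0, 1], sizes, capacity, price)
# makespan = 16.0 s, average flowtime = 11.5 s, cost = 44.0
```

In this example the first task takes 10000/1000 = 10 seconds on the first resource, which matches the ECT example given in the problem definition.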

We have formulated the scheduling problem as a bi-objective optimization problem, where two objectives are taken at a time for optimization. In the first case, makespan and flowtime are minimised simultaneously. In the second case, makespan and resource usage cost are minimised, and in the third case, average flowtime and cost are minimised. Both functions in each case are given equal priority and minimised simultaneously. There is no single optimal solution for each case but rather a set of trade-off solutions, none of which is better than the others in all objectives. The criteria for performance evaluation are the distance metric, box-plots, the spread metric and statistical analysis. The distance metric evaluates the nearness of the non-dominated solutions obtained by an algorithm to the reference Pareto-optimal front. The box-plots and the spread metric provide information on the diversity of solutions and their distribution in the search space. The statistical analysis provides insight into the performance of each algorithm in comparison with the other algorithms in the fitness space.

4. Proposed multi-objective BFOA (MOBFOA)

The bacterial foraging optimization algorithm, inspired by the social foraging behavior of E. coli bacteria, is mainly governed by four processes: chemotaxis, swarming, reproduction and elimination-dispersal [25]. We have used an adaptive chemotaxis approach that helps the bacteria to explore the search space by taking larger chemotaxis steps in the beginning and smaller steps at the later stages. In the conventional BFOA approach, the reproduction step involves selecting the top 50% of the current bacteria positions based on their health, followed by replicating these selected bacteria positions. The adapted reproduction step in our approach considers both the previous and the current bacteria positions to select the next-generation bacteria, using a new fitness assignment and ranking strategy. The improvements proposed to the original BFOA to handle multi-objective problems are summarised below:

1. A mechanism to retain the earlier (before chemotaxis) as well as the current (after chemotaxis) bacteria positions and to compute the fronts for the combined bacteria population.

2. Front-wise selection of bacteria (in the reproduction step) in order to prefer non-dominated solutions over dominated solutions for the next generation.

3. A mechanism to maintain diversity in the bacteria population (using the crowding distance measure proposed by Deb et al. [26]).

4. A mechanism to retain the same population size by discarding the solutions with low fitness values (here the fitness value represents the strength of each solution).

4.0.1. MOBFOA steps

Step 1. Create a random population $\theta_i$ of size $B$ and initialise the required parameters $N_c$, $N_s$, $N_{ed}$, $N_f$, $P_{ed}$ and $S(i)$, $i = 1, 2, \ldots, B$. Here, $B$ is the population size, $N_f$ is the total number of objective functions, $N_c$ is the number of chemotactic steps, $N_s$ is the swim length, $N_{ed}$ is the number of elimination-dispersal steps, $P_{ed}$ is the probability of elimination-dispersal, $S(i)$ is the size of the step taken in the random direction specified by the tumble, and $\theta_i$ is a point in the $p$-dimensional search space for the $i$th bacterium.

Step 2. Elimination-dispersal loop: $el = el + 1$.

Step 3. Chemotactic loop: $c = c + 1$. For $i = 1, 2, \ldots, B$, take a chemotactic step for the $i$th bacterium as follows:

  a) Calculate the fitness function $C_i(f, c, el)$ for all objective functions $f = 1, 2, \ldots, N_f$.

  b) Let $C_{last}(f) = C_i(f, c, el)$, since a better fitness may be found during the chemotaxis process.

  c) Tumble: Generate a random vector $\Delta_i \in R^p$ with each element $\Delta_i(k) \in [-1, 1]$, $k = 1, 2, \ldots, p$.

  d) Move with a step size $S(i)$ for the $i$th bacterium in the direction of the tumble:

     $$\theta_i(c+1, el) = \theta_i(c, el) + S(i)\,\frac{\Delta_i}{\sqrt{\Delta_i^{T}\Delta_i}} \qquad (1)$$

  e) Compute $C_i(f, c+1, el)$ for all objective functions $f = 1, 2, \ldots, N_f$.

  f) Swim:
     • Let $m = 0$ (initialize the swim length counter).
     • While $m < N_s$:
       1) Let $m = m + 1$.
       2) If $C_i(f, c+1, el) \le C_{last}(f)$ (that is, if the new position dominates the last one), let $C_{last}(f) = C_i(f, c+1, el)$ and allow the bacterium to swim further; use Eqn. 2 and $\theta_i(c+1, el)$ to compute the new $C_i(f, c+1, el)$:

          $$\theta_i(c+1, el) = \theta_i(c+1, el) + S(i)\,\frac{\Delta_i}{\sqrt{\Delta_i^{T}\Delta_i}} \qquad (2)$$

       3) Else, let $m = N_s$ to prevent the bacterium from swimming further.

  g) If $i \ne B$, go to sub-step 3(a) to process the next bacterium.

Step 4. Non-dominated sorting to select the better ranked solutions for the next iteration.

  a) Prepare the composite population by combining the new solutions ($P_{c+1}$) with the old ones ($P_c$) as follows: $R = P_c \cup P_{c+1}$.

  b) Perform non-dominated sorting of $R$ and identify the different fronts $F_t$, $t = 1, 2, \ldots$.

  c) Rank the composite population of solutions $R$ based on the ranking strategy (using fitness values).

  d) Create a new population $P_{c+1}$ of size $B$, starting from $P_{c+1} = \emptyset$, from the composite population $R$ by discarding the inferior solutions from the bacteria population.

Step 5. If $c < N_c$, move to Step 3 to continue the chemotaxis.

Step 6. Elimination-dispersal: For each $i = 1, 2, \ldots, B$, the $i$th bacterium is dispersed to a random location in the solution space with a pre-defined probability $P_{ed}$.

Step 7. The step size $S(i)$ of the $i$th bacterium is decreased during iterations using the equation $S(i) = S(i)/(el + 1)$.
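The tumble and swim of Step 3 (Eqns. 1 and 2) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: positions are treated as real-valued vectors, the cost function and the dominance test are supplied by the caller, and the function names (tumble_direction, move, chemotactic_step) and the sample two-objective cost are hypothetical. How a real-valued position is mapped back to discrete resource indices is not covered here.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Cost = Tuple[float, ...]


def tumble_direction(p: int, rng: random.Random) -> List[float]:
    """Draw Delta with components in [-1, 1] and scale it to unit length."""
    while True:
        delta = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(d * d for d in delta))   # sqrt(Delta^T Delta)
        if norm > 0.0:                                # guard against a zero vector
            return [d / norm for d in delta]


def move(theta: Sequence[float], step: float, direction: Sequence[float]) -> List[float]:
    """Eqns. 1 and 2: theta + S(i) * Delta / sqrt(Delta^T Delta)."""
    return [t + step * d for t, d in zip(theta, direction)]


def chemotactic_step(theta: Sequence[float], step: float,
                     cost: Callable[[Sequence[float]], Cost],
                     dominates: Callable[[Cost, Cost], bool],
                     n_swim: int, rng: random.Random) -> List[float]:
    """One tumble followed by at most n_swim swims in the same direction."""
    last_cost = cost(theta)                            # C_last
    direction = tumble_direction(len(theta), rng)      # tumble
    position = move(theta, step, direction)            # Eqn. 1
    m = 0
    while m < n_swim:
        m += 1
        new_cost = cost(position)
        if dominates(new_cost, last_cost):             # improvement: keep swimming
            last_cost = new_cost
            position = move(position, step, direction) # Eqn. 2
        else:
            m = n_swim                                 # stop swimming
    return position


def pareto_dominates(a: Cost, b: Cost) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def sample_cost(theta: Sequence[float]) -> Cost:
    # Hypothetical pair of conflicting objectives, used only for the demo.
    return (sum(t * t for t in theta), sum((t - 2.0) ** 2 for t in theta))


rng = random.Random(42)
new_position = chemotactic_step([5.0, 5.0], 1.8, sample_cost, pareto_dominates, 4, rng)
```

As in the steps above, the bacterium keeps the last position it moved to even when that position is not an improvement; the old position is not lost, because Step 4 merges the populations before and after chemotaxis.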

Initialization of the population: After initializing the user-specified parameters (e.g., population size, number of objective functions, maximum number of solution evaluations, etc.), the initial bacteria population is created randomly, where each bacterium represents a potential solution in the search space. A sample bacterium position depicting a task-resource mapping is shown in Table 1.

Bacterium position vector (1, 2, 2, . . . , 3, 1, 4) (in terms of the resource indices in the second row)

Task      T1  T2  T3  T4  T5  T6  T7  T8  T9  T10
Resource  R1  R2  R2  R1  R3  R3  R2  R3  R1  R4

Table 1: A bacterium representing task-resource mapping
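The encoding of Table 1 can be written out as a short Python sketch. The full position vector below is read from the resource row of Table 1; the variable names are illustrative.

```python
# Position vector from Table 1: entry k is the resource index of task T(k+1).
position = [1, 2, 2, 1, 3, 3, 2, 3, 1, 4]

# Task -> resource mapping, e.g. "T1" -> "R1".
mapping = {f"T{k + 1}": f"R{r}" for k, r in enumerate(position)}

# Resource -> list of tasks, useful when computing the load of each resource.
tasks_on = {}
for task, res in mapping.items():
    tasks_on.setdefault(res, []).append(task)
```

With this encoding, one bacterium is a vector with one entry per task, so its length equals the number of tasks to be scheduled.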

Chemotaxis process: In chemotaxis, the new position of each bacterium is obtained by adding the scaled tumble vector to the old position. In Eqn. 1, the scalar quantity $S(i)$ indicates the step size taken by the bacterium in a random direction defined by the unit-length vector $\Delta_i / \sqrt{\Delta_i^{T}\Delta_i}$. If the cost at position $\theta_i(c+1, el)$ is better than that at its preceding position $\theta_i(c, el)$, the bacterium continues to take successive steps of size $S(i)$ in the same direction; otherwise it tumbles in another direction. We have used an adaptive step-size approach for the chemotaxis process. The BFOA with a fixed chemotaxis step size $S(i)$ may suffer from two problems:

• If the step size is too large, the bacterium may reach the vicinity of the Pareto-optimal front quickly, in which case it may not swim further to move towards the true Pareto-optimal front.

• If the step size is too small, the bacterium may take a large number of chemotaxis iterations to reach the Pareto-optimal front or may converge to a local minimum.

In order to avoid the above-mentioned problems, we have used an adaptive step-size based chemotactic process. In the beginning, the bacteria explore the search space with a large step size. During each elimination-dispersal iteration, the step size is decreased so that the bacteria near the non-dominated positions exploit or refine the solutions. This process guides the search towards the true Pareto-optimal front. The initial step size is taken as 1.8 in our proposed MOBFOA and keeps decreasing during the four elimination-dispersal events. The effect of adaptive chemotaxis is shown in Fig. 2 in the bi-objective plane, where the proposed MOBFOA produces a better, more extended trade-off curve compared with the solutions obtained by MOBFOA using a fixed step size (0.8) for the chemotaxis process, as shown in Fig. 3.

Figure 2: Convergence with adaptive step

Figure 3: Convergence without adaptive step size
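To give a feel for how quickly the step size shrinks, the following Python sketch applies the update $S(i) = S(i)/(el + 1)$ from Step 7 to the initial step size of 1.8 over four elimination-dispersal events. It assumes that the counter $el$ takes the values 1, 2, 3, 4 when the update is applied; the text does not state the initial value of $el$, so the exact numbers are illustrative. The function name step_size_schedule is hypothetical.

```python
from typing import List


def step_size_schedule(initial_step: float, n_events: int) -> List[float]:
    """Apply S = S / (el + 1) for el = 1, ..., n_events and record each value."""
    steps = [initial_step]
    s = initial_step
    for el in range(1, n_events + 1):
        s = s / (el + 1)       # Step 7 update
        steps.append(s)
    return steps


schedule = step_size_schedule(1.8, 4)
# Approximately 1.8, 0.9, 0.3, 0.075, 0.015 (up to floating-point rounding).
```

Under this assumption the step size falls by more than two orders of magnitude over the four events, which is consistent with the described shift from exploration to refinement.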

Composite population and identification of fronts: The composite population $R$ consists of the solutions before and after the chemotaxis process, to ensure the presence of elite solutions in the bacteria population. To identify the fronts, two quantities are computed for each bacterium (solution) $p$: a) the domination count $n_p$, the number of bacteria solutions which dominate the solution $p$, and b) $S_p$, the set of bacteria solutions that are dominated by solution $p$. All bacteria solutions in the first non-dominated front have a domination count $n_p$ of zero. Now, for each solution $p$ with $n_p = 0$, each member $q$ of its set $S_p$ is considered and its domination count is reduced by one. If this domination count becomes zero, $q$ is placed in a separate list $Q$. The members (solutions) of $Q$ belong to the second non-dominated front. This procedure is repeated with each member of $Q$ to obtain the third front, and so on until all the fronts are obtained. The schematic of selecting the fronts is given in Algorithm 1.

Algorithm 1: Non-dominated sorting of P and identification of fronts [26]

  for each p ∈ P
      S_p = ∅
      n_p = 0
      for each q ∈ P
          if p dominates q then
              S_p = S_p ∪ {q}            (add q to the set of solutions dominated by p)
          else if q dominates p then
              n_p = n_p + 1              (increment the domination counter of p)
      if n_p = 0 then                    (p belongs to the first front)
          p_rank = 1
          F_1 = F_1 ∪ {p}
  i = 1                                  (initialize the front counter)
  while F_i ≠ ∅
      Q = ∅                              (used to store the solutions of the next front)
      for each p ∈ F_i
          for each q ∈ S_p
              n_q = n_q − 1
              if n_q = 0 then            (q belongs to the next front)
                  q_rank = i + 1
                  Q = Q ∪ {q}
      i = i + 1
      F_i = Q

Selection of bacteria: The selection of bacteria from the composite population $R$ is done by calculating the fitness of each bacterium and using the ranking strategy. The fitness assignment for a given bacterium solution $y_i$ considers both the solutions it dominates and the solutions that dominate it, together with the strength of each bacterium. The strength $S(y_i)$ of each bacterium $y_i$ in $R$ is the number of bacteria (solutions) it dominates and is given by $S(y_i) = |\{x_j \mid x_j \in R \wedge y_i \succ x_j\}|$, where $|\cdot|$ denotes the cardinality of a set and the symbol $\succ$ corresponds to the Pareto dominance relation. On the basis of the strength values, the fitness $F(y_i)$ of each bacterium is calculated using Eqn. 3. Hence, the fitness value of $y_i$ is equal to the sum of the strength values of all solutions it dominates minus the sum of the strength values of all solutions that dominate it.

$$F(y_i) = \sum_{y_i \succ x_j} S(x_j) \; - \sum_{x_k \succ y_i} S(x_k), \quad \forall\, y_i, x_j, x_k \in R;\ i \ne j \text{ and } i \ne k. \qquad (3)$$

The bio-inspired algorithm SPEA2 [27] considers the strength values of only those solutions that dominate yi, whereas we consider both the dominating and the dominated solutions corresponding to bacterium solution yi. This mechanism provides more information on the Pareto dominance. Although our fitness assignment strategy provides a niching relation among the solutions in the composite population, it may fail when most solutions do not dominate each other (i.e., belong to front 1). In that case, the crowding distance approach is used to maintain diversity in the solutions.
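The strength and fitness assignment of Eqn. 3 can be illustrated with a minimal sketch. It assumes that every objective is minimized and that each solution is given as a tuple of objective values; the function names `dominates`, `strengths` and `fitness` are hypothetical and are not taken from the authors' implementation.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def strengths(objs: List[Sequence[float]]) -> List[int]:
    """S(y_i): the number of solutions in the population that y_i dominates."""
    return [sum(dominates(y, x) for x in objs) for y in objs]


def fitness(objs: List[Sequence[float]]) -> List[int]:
    """F(y_i): strengths of the solutions y_i dominates minus strengths of its dominators."""
    s = strengths(objs)
    n = len(objs)
    result = []
    for i in range(n):
        gain = sum(s[j] for j in range(n) if j != i and dominates(objs[i], objs[j]))
        loss = sum(s[k] for k in range(n) if k != i and dominates(objs[k], objs[i]))
        result.append(gain - loss)
    return result


# Hypothetical bi-objective population (not data from the paper).
population = [(1, 1), (2, 2), (3, 3), (1, 3)]
print(strengths(population))
print(fitness(population))
```

In this toy population the first solution dominates all others and therefore receives the highest fitness, while the solution dominated by everyone else receives the lowest. When all solutions are mutually non-dominated, every strength and fitness value is zero, which is the situation in which the text falls back on crowding distance.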

After obtaining the fitness values of all the bacteria solutions in R, a total of N (population size) bacteria (solutions) are selected for the next generation using the following method:

1. If front 1 contains N or more bacteria, then N bacteria are selected on the basis of the crowding distance measure (i.e., the first N bacteria from a list of bacteria sorted on the ascending values of crowding distance). If front 1 has fewer than N bacteria, the remaining bacteria are selected from the subsequent fronts using the fitness values.

2. To select the remaining bacteria, the bacteria from the subsequent fronts are sorted in descending order of their fitness values and the remaining bacteria are selected accordingly (based on better fitness values).

Crowding distance: The crowding distance value of a particular solution is the average distance of its two neighboring solutions [26]. The crowding-distance computation requires sorting the bacteria population according to each objective function value in ascending order of magnitude. Thereafter, for each objective function, the boundary solutions (solutions with the smallest and the largest function values) are assigned an infinite distance value. All other intermediate solutions are assigned a distance value equal to the absolute normalized difference in the function values of their two adjacent solutions. That is, for an objective function f, if $\theta_i$ is the ith bacterium (with objective function value $f(\theta_i)$), $\theta^i_{right}$ and $\theta^i_{left}$ are its two neighbouring positions, and $f_{max}$ and $f_{min}$ are the largest and smallest objective function values seen in the population, then the crowding distance of bacterium $\theta_i$ with respect to objective f is $|f(\theta^i_{right}) - f(\theta^i_{left})| / (f_{max} - f_{min})$.

This calculation is repeated for the other objective functions. The overall crowding-distance value is the sum of the individual distance values with respect to each objective. Each objective function is normalized before obtaining the crowding distance. In our proposed MOBFOA, crowding distance is applied if front 1 contains more than N bacteria (N being the pre-defined population size), which helps to remove solutions in crowded regions and preserves diversity.
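The crowding-distance computation described above can be sketched as follows. This is an illustrative version under the assumption that each solution is a tuple of objective values; the name `crowding_distance` is hypothetical, and objectives whose values are all equal are simply skipped to avoid a division by zero (a choice made here, not stated in the paper).

```python
import math
from typing import List, Sequence


def crowding_distance(objs: List[Sequence[float]]) -> List[float]:
    """Sum, over all objectives, of the normalized gap between each solution's two neighbours."""
    n = len(objs)
    if n == 0:
        return []
    dist = [0.0] * n
    for m in range(len(objs[0])):
        # Sort solution indices by the m-th objective in ascending order.
        order = sorted(range(n), key=lambda i: objs[i][m])
        f_min, f_max = objs[order[0]][m], objs[order[-1]][m]
        # Boundary solutions receive an infinite distance.
        dist[order[0]] = dist[order[-1]] = math.inf
        if f_max == f_min:
            continue  # assumption: a constant objective contributes nothing
        for pos in range(1, n - 1):
            left, right = order[pos - 1], order[pos + 1]
            dist[order[pos]] += abs(objs[right][m] - objs[left][m]) / (f_max - f_min)
    return dist


# Hypothetical front with two objectives (not data from the paper).
front = [(1, 5), (2, 3), (4, 2), (5, 1)]
print(crowding_distance(front))
```

A larger value means that the neighbours of a solution are farther away, i.e. the solution lies in a less crowded region of the front.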

Elimination-dispersal: In elimination-dispersal, bacteria are stochastically selected for elimination based on a probability Ped and are replaced by new bacteria positions. For example, if the elimination-dispersal probability is 0.3, then 30% of the bacteria will be selected at random for elimination and replaced by newly generated bacteria. This step allows the bacteria to explore the search space to obtain newer and better solutions, if any.
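A minimal sketch of the elimination-dispersal step is given below. It assumes that a bacterium is encoded as a list of resource IDs in the range 1..n_resources (one per job), and it eliminates each bacterium independently with probability Ped, so the eliminated fraction equals Ped only in expectation. The function name and the encoding are assumptions made for illustration.

```python
import random
from typing import List


def eliminate_and_disperse(population: List[List[int]], p_ed: float,
                           n_resources: int, rng: random.Random) -> List[List[int]]:
    """Replace each bacterium, with probability p_ed, by a randomly generated one."""
    new_population = []
    for bacterium in population:
        if rng.random() < p_ed:
            # Dispersal: draw a fresh random job-to-resource mapping of the same length.
            new_population.append([rng.randint(1, n_resources) for _ in bacterium])
        else:
            new_population.append(list(bacterium))
    return new_population


rng = random.Random(42)
population = [[1, 2, 3, 1], [2, 2, 1, 3], [3, 1, 1, 2]]  # hypothetical bacteria
print(eliminate_and_disperse(population, p_ed=0.25, n_resources=3, rng=rng))
```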

Search termination criterion: Many stopping strategies have been proposed in state-of-the-art approaches. These approaches may consider the desired quality of the solutions, a predefined number of evaluations, or the desired computation time. MOBFOA terminates when the number of non-dominated bacteria solutions reaches a pre-defined population size and no changes are observed in the number of non-dominated bacteria for a certain number of solution evaluations. Table 2 displays the control parameters used in MOBFOA.

Sr. No.   Parameters                                   Value
1         Bacteria population                          10
2         Maximum number of swimming steps Ns          3
3         Number of chemotaxis steps Nc                20
4         Magnitude M                                  20
5         Number of elimination-dispersal steps Ned    4
6         Probability Ped                              0.25
7         Size of the chemotaxis step s(i)             1.8

Table 2: Parameter values used in MOBFOA

5. Other multi-objective optimization algorithms (MOOAs)

The scheduling of independent tasks in parallel and distributed computing is an NP-hard problem [28]. Evolutionary and swarm intelligence based algorithms have been widely used to solve multi-objective optimization problems, where several objectives need to be optimized simultaneously. They have also been used to address multi-objective scheduling problems in grids. In this paper, we apply two well-known multi-objective optimization algorithms (NSGA-II and OMOPSO), along with our proposed MOBFOA, to schedule independent tasks while optimizing several objectives simultaneously.

5.1. NSGA-II based scheduling of independent tasks

The multi-objective version of the genetic algorithm was proposed by Deb et al. [26] as the Non-dominated Sorting Genetic Algorithm (NSGA-II). NSGA-II is a robust optimization technique that provides high-quality solutions that can be derived even from a large search space [17]. The scheduling solutions in the problem search space are represented by a set of individuals or chromosomes. Each individual is evaluated on the basis of a fitness function. A detailed description of NSGA-II based scheduling is given below.

Initial population: The initial population consisting of individuals/chromosomes is generated randomly. The parameters required in NSGA-II are also initialized with appropriate values. A chromosome consists of a string of resource IDs corresponding to each job. It is also associated with other parameters such as the objective function, the start time of the job, the end time of the job, the rank of the chromosome and the crowding distance value.

Non-dominated sorting and identification of fronts: An individual a is said to be non-dominated if there does not exist any other individual b ∈ X in the search space that dominates a. In NSGA-II, a set of such non-dominated individuals is called a Pareto-optimal front. All individuals that belong to the first non-dominated Pareto-optimal front are assigned rank 1. After ignoring this front, a second Pareto-optimal front can be obtained and assigned rank 2, and so on [17]. NSGA-II sorts the population according to the ranks of the chromosomes. After sorting the population, it is desirable to maintain diversity in the solutions. In NSGA-II, a crowding distance is applied to the sorted population to maintain this diversity. The schematic of obtaining the non-dominated solutions is displayed in Fig. 4, where Rt represents the composite population, Pt is the parent population and Qt represents the offspring. The Pareto fronts are represented as F1, F2, ..., Fn. The selected chromosomes are represented as Pt+1.
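The ranking into fronts can be sketched in Python along the lines of Algorithm 1 (the fast non-dominated sort of [26]). The sketch assumes that all objectives are minimized and that solutions are tuples of objective values; the names `dominates` and `fast_non_dominated_sort` are illustrative, and fronts are returned as lists of indices into the input.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fast_non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Return the fronts F1, F2, ... as lists of indices into objs."""
    n = len(objs)
    dominated = [[] for _ in range(n)]  # S_p: solutions dominated by p
    count = [0] * n                     # n_p: number of solutions dominating p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                dominated[p].append(q)
            elif dominates(objs[q], objs[p]):
                count[p] += 1
        if count[p] == 0:
            fronts[0].append(p)         # p belongs to the first front
    i = 0
    while fronts[i]:
        next_front = []
        for p in fronts[i]:
            for q in dominated[p]:
                count[q] -= 1
                if count[q] == 0:       # q belongs to the next front
                    next_front.append(q)
        i += 1
        fronts.append(next_front)
    return fronts[:-1]                  # drop the trailing empty front


# Hypothetical bi-objective population (not data from the paper).
print(fast_non_dominated_sort([(1, 1), (2, 2), (3, 3), (1, 3)]))
```

The rank of a solution is the position of its front in the returned list plus one.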


Figure 4: Schematic of non-dominated sorting and identification of fronts

Crowding distance: As explained earlier, the crowding distance (dist_y) of any chromosome y (with neighbours y_left and y_right) in the population represents the density of individuals surrounding it and is given by Eqn. 4, where f_i is the ith objective function, n is the total number of objectives, and f_i^max and f_i^min are the largest and smallest values of the ith objective function seen in the population.

$$dist_y = \sum_{i=1}^{n} \frac{|f_i(y_{left}) - f_i(y_{right})|}{f_i^{max} - f_i^{min}} \tag{4}$$

Based on the crowding distance measure, a crowded comparison operator can be defined as a partial order ≺ between the chromosomes (using the non-domination rank r_y and the crowding distance dist_y), that is, a ≺ b if r_a < r_b, or if (r_a = r_b) and (dist_a > dist_b). If two solutions belong to different non-domination ranks, the solution with the lower rank is preferred. If both solutions have the same rank, then the chromosome from the less crowded region is preferred.

NSGA-II operators: The following operators are used in the NSGA-II implementation.

a) Selection procedure: The selection procedure specifies the method used to select individuals for mating to generate the offspring. The idea is to select better individuals, which can pass on their genes to the next generation. In our study, binary tournament selection is used to select the individuals from a given population for reproduction. In this selection procedure, two tournaments are conducted. In the first tournament, two individuals are selected at random from the current population, and the individual with the better fitness value is chosen to be a parent, as shown in Fig. 5. This method is repeated in a second tournament to obtain the second parent. The selected parents (individuals) then undergo crossover. The binary tournament selection procedure is repeated until the required number of individuals or chromosomes (solutions) is generated.


Figure 5: Binary tournament mechanism [29]
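The crowded comparison operator and the binary tournament can be combined in a short sketch. It assumes that "better fitness" in the tournament means preference under the crowded comparison operator (lower rank first, then larger crowding distance), which is the usual NSGA-II convention; the function names and the example values are hypothetical.

```python
import random
from typing import List, Tuple


def crowded_better(a: int, b: int, rank: List[int], dist: List[float]) -> bool:
    """True if individual a precedes b: lower rank, or equal rank and larger crowding distance."""
    return rank[a] < rank[b] or (rank[a] == rank[b] and dist[a] > dist[b])


def binary_tournament(rank: List[int], dist: List[float], rng: random.Random) -> int:
    """Pick two distinct individuals at random and return the index of the preferred one."""
    a, b = rng.sample(range(len(rank)), 2)
    return a if crowded_better(a, b, rank, dist) else b


def select_parents(rank: List[int], dist: List[float], rng: random.Random) -> Tuple[int, int]:
    """Run two tournaments, one for each parent."""
    return binary_tournament(rank, dist, rng), binary_tournament(rank, dist, rng)


rng = random.Random(7)
rank = [1, 1, 2, 3]                  # hypothetical non-domination ranks
dist = [float("inf"), 0.8, 1.2, 0.4]  # hypothetical crowding distances
print(select_parents(rank, dist, rng))
```

Note that the two tournaments are independent, so the same individual may be selected as both parents; whether the authors prevent this is not stated.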

b) Crossover: The crossover operator is used to find new solutions in the search space. We have implemented a two-point crossover for the task-resource mapping string that forms an individual, as shown in Fig. 6, where two random points are chosen from the task-assignment strings of the selected parents to define a crossover window. In the crossover operation, two new offspring/children are produced from the two parents, as shown in Fig. 7.

Figure 6: Chromosome before two-point crossover

Figure 7: Chromosome after two-point crossover
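A two-point crossover on task-resource mapping strings can be sketched as follows, assuming each chromosome is a list of resource IDs indexed by job. The function name and the example parents are hypothetical.

```python
import random
from typing import List, Tuple


def two_point_crossover(p1: List[int], p2: List[int],
                        rng: random.Random) -> Tuple[List[int], List[int]]:
    """Swap the genes inside a randomly chosen window [i, j) between the two parents."""
    if len(p1) != len(p2) or len(p1) < 2:
        raise ValueError("parents must have the same length (at least 2)")
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    child1 = p1[:i] + p2[i:j] + p1[j:]
    child2 = p2[:i] + p1[i:j] + p2[j:]
    return child1, child2


rng = random.Random(3)
parent1 = [1, 2, 2, 1, 5, 3]  # hypothetical job-to-resource mappings
parent2 = [3, 2, 1, 3, 4, 3]
print(two_point_crossover(parent1, parent2, rng))
```

Because genes are only exchanged position by position, every job keeps a resource ID that one of its parents already assigned to it, so the children remain valid schedules.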

c) Mutation: The mutation operation is applied to a single individual, after the crossover operation, based on the mutation probability Pm. We have used inversion mutation, where two random points in a chromosome (scheduling solution) are selected to form a window and the resources within this window are reversed. In this way a new chromosome is formed. For example, job J3 is allocated to resource R4, J4 to R3 and J5 to R2, as shown in Fig. 8. After performing the mutation operation, J3 is assigned to R2, J4 to R3 and J5 to R4, as shown in Fig. 9.

Figure 8: Chromosome before applying inversion mutation

Figure 9: Chromosome after applying inversion mutation
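The inversion mutation example (J3 to R4, J4 to R3, J5 to R2 becoming J3 to R2, J4 to R3, J5 to R4) can be reproduced with a small sketch. The window positions are passed explicitly here so that the example is deterministic; in the algorithm they would be drawn at random. The resource IDs of the jobs outside the window are hypothetical, since the figures are not available in the text.

```python
from typing import List


def inversion_mutation(chromosome: List[int], i: int, j: int) -> List[int]:
    """Reverse the resource IDs inside the window between positions i and j (inclusive)."""
    lo, hi = min(i, j), max(i, j)
    return chromosome[:lo] + chromosome[lo:hi + 1][::-1] + chromosome[hi + 1:]


# Index k holds the resource of job J(k+1); J3..J5 are mapped to R4, R3, R2 as in the text.
before = [1, 5, 4, 3, 2, 1]
after = inversion_mutation(before, 2, 4)
print(after)  # J3..J5 are now mapped to R2, R3, R4
```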

The parameter settings for control operators for NSGA-II are shown in Table 3.

Sr. No.   Parameters                   Type/Value
1         Crossover                    Two-Point
2         Crossover Probability Pc     0.8
3         Mutation Type                Inversion
4         Mutation Probability Pm      0.5
5         Population Size              10
6         Selection                    Binary-tournament
7         Maximum generations          100

Table 3: Parameter values used in NSGA-II

5.2. OMOPSO based scheduling of grid jobs

The optimised multi-objective particle swarm optimization (OMOPSO) is an improved multi-objective optimization technique proposed by Sierra et al. [10]. Its main features include the use of an external archive, which retains the non-dominated solutions. The crowding distance measure is used to filter out solutions in crowded regions [26]. OMOPSO uses an archive, as in SPEA2 [27], to store the best solutions found during the search. This archive makes use of ε-dominance to limit the number of solutions stored.

Initial swarm formation: The swarm initially contains randomly generated particles (with zero speed). The position of each particle is an n-dimensional vector of resource IDs, where n represents the total number of jobs to be scheduled. The job sequence j1, j2, ..., jn remains fixed, while the resource IDs (on which the jobs get mapped) vary. Each particle is also associated with the following parameters: the speed of the particle, the fitness values with respect to the objective functions, and the start and end time of the job.

Evaluation of swarm: The generated swarm is evaluated with respect to the objective functions. The better the value of an objective function, the better the position of a particle in a given swarm. Since we consider multiple objectives for the evaluation of solutions using the Pareto-optimal approach, rather than obtaining a single solution, all non-dominated particles are considered to be solutions.

Selection of leaders: The selection of particles is made on the basis of Pareto-optimal fronts. Pareto fronts dictate the ranking of solutions. Front 1 contains all the non-dominated solutions, front 2 contains solutions that are dominated by front 1 and dominate all the remaining fronts (front 3 to front n). The non-dominated solutions are considered as leaders in OMOPSO and are stored in a separate archive. The archive size is kept the same as the population size N. A crowding distance is used to maintain this size.

Velocity vector: In each iteration, the velocity of the particles keeps changing and is expressed as $v_i^{k+1} = \omega v_i^k + C_1 r_1 (x_{pbest_i} - x_i^k) + C_2 r_2 (x_{gbest_i} - x_i^k)$. The inertia factor ω determines the impact of previous velocities on the current velocity. The cognitive parameter C1 controls the impact of the personal best position found so far by a particle. The parameter C2 controls the impact of the global best position.

Position vector: The new position of a particle is obtained by adding the velocity vector to the old position, $x_i^{k+1} = x_i^k + v_i^{k+1}$ (assuming each iteration step is one time unit). The position vector works like the chemotaxis process and allows the particles to change their positions by changing the job-resource mapping. In our scheduling problem, the job-resource allocation is changed after applying the position vector to the particles of the swarm, as shown in Table 4 and Table 5. The elements of the new position vectors are checked for consistency, that is, the new resource IDs should lie in a specified range.

Particle position vector (1, 2, 2, . . . , 3, 1, 4) (in terms of resource indices in second row)

Job1   Job2   Job3   Job4   Job5   Job6   Job7   Job8   Job9   Job10
1      2      2      1      5      3      2      3      1      4

Table 4: An old position of the particle

Particle position vector (3, 2, 1, . . . , 5, 4, 3) (in terms of resource indices in second row)

Job1   Job2   Job3   Job4   Job5   Job6   Job7   Job8   Job9   Job10
3      2      1      3      4      3      2      5      4      3

Table 5: A new position of the particle
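The velocity and position updates can be sketched for a particle whose position is a vector of integer resource IDs. The default values of ω, C1 and C2 and the ranges of r1 and r2 follow Table 6; rounding the new position to the nearest integer and clamping it to 1..n_resources is an assumption made here to implement the "consistency check", and drawing r1 and r2 separately for every dimension is likewise an assumption. The function name is hypothetical.

```python
import random
from typing import List, Tuple


def update_particle(x: List[int], v: List[float], pbest: List[int], gbest: List[int],
                    n_resources: int, rng: random.Random,
                    w: float = 0.95, c1: float = 1.5, c2: float = 2.5
                    ) -> Tuple[List[int], List[float]]:
    """Apply one velocity update and one position update to a single particle."""
    new_x, new_v = [], []
    for xi, vi, pb, gb in zip(x, v, pbest, gbest):
        r1 = rng.uniform(0.1, 0.9)
        r2 = rng.uniform(0.5, 0.9)
        vel = w * vi + c1 * r1 * (pb - xi) + c2 * r2 * (gb - xi)
        # Assumption: round to an integer resource ID and clamp it to the valid range.
        pos = max(1, min(n_resources, int(round(xi + vel))))
        new_v.append(vel)
        new_x.append(pos)
    return new_x, new_v


rng = random.Random(1)
old_position = [1, 2, 2, 1, 5, 3, 2, 3, 1, 4]  # Table 4
personal_best = [3, 2, 1, 3, 4, 3, 2, 5, 4, 3]  # hypothetical leader positions
global_best = [3, 2, 1, 3, 4, 3, 2, 5, 4, 3]
print(update_particle(old_position, [0.0] * 10, personal_best, global_best, 5, rng))
```

When a particle already coincides with both its personal best and the leader and has zero velocity, the update leaves it where it is, which is the stagnation case that the mutation operator is meant to address.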

Mutation operator: OMOPSO makes use of a mutation operator that accelerates the convergence of the swarm. It allows the particles to explore new solutions in the search space. Sometimes the velocity of the particles becomes almost zero and it is difficult for them to move any further from their current positions. The mutation operator helps to accelerate the particles. OMOPSO applies a combination of uniform and inversion mutations to the particles. The mutation operators are applied using a pre-defined probability (0.2% in our case study). To perform mutation, the swarm is subdivided into three equal-sized parts. Each sub-part adopts a different mutation scheme: the first sub-part does not apply mutation at all, the second sub-part uses uniform mutation and the third sub-part uses inversion mutation. In uniform mutation, the resource number for a randomly selected job is replaced with a uniform random number generated between the lowest and highest resource index, as shown in Fig. 10 and Fig. 11. In this way a new particle is formed.

Stopping criterion: The termination criterion considered in our work is reaching the maximum number of iterations. The control parameter values used in OMOPSO are listed in Table 6 and are based on an empirical study.
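The three-way mutation scheme described above can be sketched as follows. Particles are assumed to be lists of resource IDs in 1..n_resources; when the swarm size is not divisible by three, the leftover particles are placed in the inversion group, which is a choice made here and not stated in the paper. All function names are hypothetical.

```python
import random
from typing import List


def uniform_mutation(position: List[int], n_resources: int, rng: random.Random) -> List[int]:
    """Replace the resource of one randomly selected job by a uniformly drawn resource ID."""
    mutated = list(position)
    job = rng.randrange(len(mutated))
    mutated[job] = rng.randint(1, n_resources)
    return mutated


def inversion_mutation(position: List[int], rng: random.Random) -> List[int]:
    """Reverse the resource IDs inside a randomly chosen window."""
    i, j = sorted(rng.sample(range(len(position)), 2))
    return position[:i] + position[i:j + 1][::-1] + position[j + 1:]


def mutate_swarm(swarm: List[List[int]], n_resources: int, p_mut: float,
                 rng: random.Random) -> List[List[int]]:
    """First third: no mutation; second third: uniform; remainder: inversion."""
    third = len(swarm) // 3
    mutated = []
    for idx, particle in enumerate(swarm):
        particle = list(particle)
        if idx < third or rng.random() >= p_mut:
            mutated.append(particle)
        elif idx < 2 * third:
            mutated.append(uniform_mutation(particle, n_resources, rng))
        else:
            mutated.append(inversion_mutation(particle, rng))
    return mutated


rng = random.Random(5)
swarm = [[1, 2, 3, 4, 5] for _ in range(6)]  # hypothetical swarm
print(mutate_swarm(swarm, n_resources=5, p_mut=0.5, rng=rng))
```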

Figure 10: Particle before mutation

Figure 11: Mutated particle after applying uniform mutation

Sr. No.   Parameters      Values
1         C1              1.5
2         C2              2.5
3         ω               0.95
4         r1              0.1-0.9
5         r2              0.5-0.9
6         Swarm size      10
7         Archive size    10

Table 6: Parameter values used in OMOPSO

6. Results and Discussion

In our research work, bi-objective optimization cases are studied with respect to three objective functions (makespan, flowtime, and resource-usage cost). In each case, two scheduling criteria are minimized simultaneously using the three stochastic algorithms (NSGA-II, OMOPSO and MOBFOA). An attempt has been made to optimize the conflicting scheduling criteria simultaneously without compromising on any criterion. In our work, we consider 500 jobs to be scheduled on 15 resources, resulting in a large search space of job-resource combinations ($15^{500}$).

The convergence of the solutions towards the Pareto-optimal front can be interpreted graphically or by using the generational distance metric suggested in [30], provided the true Pareto-optimal front is known in advance. We have used the graphical method for interpretation since the true Pareto front is not known in advance. The spread or distribution of the solutions obtained for each objective function can be shown by a spread metric [26] or by box-and-whisker plots. The parameters used in the boxplots are interpreted as follows: a) the minimum and maximum values represent the lowest and highest values obtained for the objective function, b) the range (difference between the minimum and maximum values) represents the spread of the obtained solutions, c) the inter-quartile range gives the extent to which the central 50% of the objective function values are dispersed (w.r.t. the median), d) the median is the middle of the objective function values within the inter-quartile range, and e) the mean represents the average of the objective function values.
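The boxplot parameters listed above can be computed with the Python standard library, as in the sketch below. The quartile convention (the "inclusive" method of `statistics.quantiles`) is an assumption; the paper does not state which convention was used, and other conventions give slightly different Q1 and Q3 values. The input values are hypothetical and are not results from the paper.

```python
import statistics
from typing import Dict, List


def boxplot_summary(values: List[float]) -> Dict[str, float]:
    """Return the summary statistics reported for each boxplot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "q3": q3,
        "q1": q1,
        "iqr": q3 - q1,
        "median": median,
    }


# Hypothetical makespan values of five non-dominated solutions.
print(boxplot_summary([400, 500, 600, 700, 800]))
```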

In our work we have considered the multi-criteria optimization of three bi-objective cases: makespan versus flowtime, makespan versus resource-usage cost, and flowtime versus resource-usage cost. The Pareto-optimal solutions obtained by the three multi-objective optimization algorithms (MOOAs) for the makespan and flowtime objective functions are depicted in Fig. 12. The distribution of the solutions can be observed from the corresponding boxplots. Fig. 13 depicts the spread of solutions w.r.t. makespan, while Fig. 14 depicts the spread of solutions w.r.t. average flowtime. Table 7 displays the parameter values of the boxplots.

Figure 12: Pareto-optimal solutions w.r.t. makespan and flowtime

It can be observed from Fig. 12 and Table 7 that the solutions obtained using MOBFOA are better in terms of convergence to the Pareto-optimal front and diversity than those obtained by OMOPSO and NSGA-II. The boxplots (Fig. 13 and Fig. 14) show that MOBFOA produces better results with respect to both conflicting objectives, makespan and flowtime, while OMOPSO and NSGA-II have failed to span their search over the search space with respect to flowtime. In fact, the solutions obtained by MOBFOA have significantly better diversity than those obtained by NSGA-II, since its fitness assignment mechanism is based on a diversity maintenance strategy. In addition, MOBFOA gives a better distribution of the solutions, which may aid convergence towards the global minima. OMOPSO is the second best algorithm in terms of convergence towards the Pareto-optimal front and distribution of solutions. Note that makespan is related to throughput, and reducing it may lead to long waiting times for some jobs, resulting in an increase in the flowtime. Thus, minimizing makespan tends to increase flowtime, which implies that the two objectives are conflicting.

Figure 13: Boxplot for makespan values

Figure 14: Boxplot for flowtime values

                                Makespan                       Flowtime
Parameters                      MOBFOA   OMOPSO   NSGA-II      MOBFOA   OMOPSO   NSGA-II
Max Value                       900      903      883          280      282      284
Min Value                       380      399      410          170      189      204
Range (Max-Min)                 520      504      473          110      93       80
Mean                            635      643      647          225      235      248
Q3                              770      769      787          253      259      269
Q1                              495      520      501          196      209      227
Interquartile range (Q3-Q1)     275      249      286          57       50       42
Median                          632      640      644          225      233      249

Table 7: Boxplot values for makespan and flowtime

For the second bi-objective case study, the Pareto-optimal solutions obtained by the three MOOAs with respect to the makespan and resource-usage cost objective functions are displayed in Fig. 15. The distribution of the solutions can be observed from the corresponding boxplots. Fig. 16 depicts the spread of solutions obtained by the three MOOAs w.r.t. makespan, while Fig. 17 depicts the spread of solutions w.r.t. resource-usage cost. Table 8 displays the parameter values of the boxplots.

Figure 15: Pareto-optimal solutions w.r.t. makespan and resource-usage cost

Figure 16: Boxplots depicting makespan values

Figure 17: Boxplots depicting resource-usage cost

                                Makespan                       Cost
Parameters                      MOBFOA   OMOPSO   NSGA-II      MOBFOA   OMOPSO   NSGA-II
Max Value                       907      903      899          284      290      294
Min Value                       372      395      406          167      181      196
Range (Max-Min)                 535      508      493          117      109      98
Mean                            633      644      650          222      233      251
Q3                              770      773      775          251      260      277
Q1                              490      497      516          188      205      227
Interquartile range (Q3-Q1)     280      276      259          63       55       50
Median                          630      645      648          223      235      255

Table 8: Boxplot values for makespan and resource-usage cost

Fig. 15 shows that MOBFOA produces a more extended trade-off curve in comparison with those obtained by OMOPSO and NSGA-II. The boxplots in Fig. 16 show that all three MOOAs have produced solutions with good diversity w.r.t. makespan, while the boxplots in Fig. 17 show that the solutions obtained by OMOPSO and NSGA-II w.r.t. the resource-usage cost objective function are farther from its minimum value than those obtained by MOBFOA. In order to reduce makespan, the jobs are allocated to the resources that can complete them earliest. Note that the resource with the lower usage cost may not be the fastest resource to compute the job, and allocating such a resource to the job may result in an increase in the makespan. MOBFOA produces better results with respect to both conflicting objectives, makespan and resource-usage cost. OMOPSO is the second best at locating well-distributed Pareto-optimal solutions.

For the third case study, the Pareto-optimal solutions obtained by the three MOOAs w.r.t. the flowtime and resource-usage cost objective functions are displayed in Fig. 18. The distribution of the solutions can be observed from the corresponding boxplots (Fig. 19 and Fig. 20). Table 9 displays the parameter values of these boxplots.

Figure 18: Pareto optimal solutions w.r.t. flowtime and cost

Note that the resource which computes the job in minimum flowtime may be the most expensive one. On the other hand, the resource with minimum usage cost may increase the flowtime of the job. All three MOOAs have attempted to find the non-dominated solutions in the conflicting bi-objective plane. It can be observed that the solutions obtained using MOBFOA are better in terms of convergence and diversity w.r.t. both conflicting objective functions. OMOPSO is the second best in maintaining diversity and convergence towards the Pareto-optimal front.

Figure 19: Boxplots depicting flowtime values for the three MOOAs

Figure 20: Boxplots depicting cost values for the three MOOAs

                                Flowtime                       Cost
Parameters                      MOBFOA   OMOPSO   NSGA-II      MOBFOA   OMOPSO   NSGA-II
Max Value                       290      291      287          277      287      282
Min Value                       170      184      188          171      188      198
Range (Max-Min)                 120      107      99           106      99       84
Mean                            230      234      236          220      233      242
Q3                              262      266      260          251      257      263
Q1                              198      205      211          192      209      220
Interquartile range (Q3-Q1)     64       61       49           59       48       43
Median                          228      236      238          220      231      246

Table 9: Boxplot values w.r.t. flowtime and resource-usage cost

Apart from the above-mentioned simulated results, the three MOOAs have been tested in a real grid environment, where 80 jobs are considered for the empirical study. The size of each job is between 10,000 and 50,000 MI, whereas the output data size of each job ranges from 1 to 5 MB. There are 5 available resources, with capacities of 2000 to 10,000 MIPS. The execution of jobs on a particular resource depends on the current load of the resource.

Figure 21: The makespan and avg. flowtime of 80 jobs on 5 resources

The proposed MOBFOA has obtained extended trade-off solutions, as shown in Fig. 21, which are closer to the Pareto-optimal front compared to the other MOOAs even in a real grid environment, owing to its novel evaluation and selection strategies. NSGA-II evaluates the population on the basis of the values of each fitness function. It ranks all solutions and selects the non-dominated solutions in the prevailing population. OMOPSO creates an external archive to retain selected particles for the next generation and first copies the non-dominated solutions in the current population to the archive. If the size of the archive is exceeded, the solutions in overcrowded areas are removed from the archive. Unlike NSGA-II and OMOPSO, MOBFOA does not select solutions based on their non-dominated levels at each generation; rather, it selects solutions based on its fitness assignment and ranking strategy. The adaptive chemotaxis approach is one of the main factors behind the better quality of the MOBFOA solutions compared with NSGA-II and OMOPSO. The proposed MOBFOA is capable of efficiently directing the search towards the Pareto-optimal front.

7. Performance analysis

In order to produce a comprehensive comparison of the overall quality of the three alternative approaches, each algorithm is run 30 times with changed values and the convergence of the algorithm is analyzed. The fine tuning of parameter values is determined on the basis of an empirical study for all three algorithms considered in the comparative study. The proposed MOBFOA based scheduling is compared with the alternative algorithms on the basis of three criteria, namely, convergence of the solutions towards the Pareto-optimal front, distribution of solutions in the search space, and the computation time required to find the solutions. A statistical pairwise comparison of the three stochastic algorithms is also presented in this section.

Convergence towards Pareto-optimal front: The solutions obtained from the proposed

MOBFOA are compared with those obtained using NSGA-II and OMOPSO for the three bi-objective cases shown in Fig. 12, Fig. 15 and Fig. 18. The proposed MOBFOA generates more extended trade-off solutions than OMOPSO and NSGA-II in all three cases involving the optimization of conflicting objective functions. MOBFOA explores better non-dominated solutions in comparison with OMOPSO and NSGA-II due to its fitness assignment strategy for the selection of bacteria. We have also used the generational distance metric [30], which evaluates the nearness of the non-dominated solutions obtained by an algorithm to the reference Pareto-optimal front. The results in Table 10 show that the proposed MOBFOA outperforms OMOPSO and NSGA-II, producing better convergence than both. OMOPSO is the second best in terms of convergence towards the Pareto-optimal front.

            Case1                              Case2                              Case3
            MOBFOA     OMOPSO     NSGA-II      MOBFOA     OMOPSO     NSGA-II      MOBFOA     OMOPSO     NSGA-II
Mean        0.000924   0.001189   0.001978     0.000867   0.001251   0.001862     0.000757   0.001119   0.001873
Variance    2.89E-07   3.09E-07   3.77E-07     2.67E-07   3.21E-07   3.82E-07     2.32E-07   3.14E-07   3.89E-07

Table 10: Generational distance metric
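For reference, one common definition of the generational distance is $GD = \sqrt{\sum_{i=1}^{n} d_i^2}\,/\,n$, where $d_i$ is the Euclidean distance from the ith obtained solution to its nearest point on the reference front. The sketch below implements this form; whether [30] uses exactly this normalization, and whether the objectives were scaled beforehand, is not stated in the text and is therefore an assumption. The function name and the example fronts are hypothetical.

```python
import math
from typing import List, Sequence


def generational_distance(front: List[Sequence[float]],
                          reference: List[Sequence[float]]) -> float:
    """GD = sqrt(sum of squared nearest-point distances) / number of obtained solutions."""
    if not front or not reference:
        raise ValueError("both fronts must be non-empty")
    total = 0.0
    for solution in front:
        nearest = min(math.dist(solution, ref) for ref in reference)
        total += nearest * nearest
    return math.sqrt(total) / len(front)


# Hypothetical obtained and reference fronts (not data from the paper).
obtained = [(3.0, 4.0), (0.0, 0.0)]
reference = [(0.0, 0.0)]
print(generational_distance(obtained, reference))
```

A value of zero means that every obtained solution lies on the reference front; smaller values indicate better convergence, which is how Table 10 is read.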

Diversity in solutions: Box-and-whisker plots have been used in the previous section for each bi-objective scheduling problem to measure the spread and the evenness of the distribution of the solutions obtained by the three algorithms in the search space. The spread metric [26, 30] is also used to determine the mean and variance of the diversity metric obtained using all three algorithms. It is observed from the boxplots and the spread metric (Table 11) that the non-dominated solutions obtained by MOBFOA are better in terms of diversity than those obtained by OMOPSO and NSGA-II in all three bi-objective case studies.

            Case1                              Case2                              Case3
            MOBFOA     OMOPSO     NSGA-II      MOBFOA     OMOPSO     NSGA-II      MOBFOA     OMOPSO     NSGA-II
Mean        0.356986   0.445923   0.386629     0.384133   0.426253   0.476836     0.379886   0.423623   0.416629
Variance    0.004410   0.005650   0.006256     0.003222   0.004649   0.006981     0.003471   0.006390   0.006866

Table 11: Spread metric

Statistical Analysis: Knowles and Corne [31] extended the analysis of multi-objective optimization algorithms to comparative studies involving more than two algorithms by using two statistics, Unbeaten and Beats all, obtained from pairwise comparisons. Unbeaten is the percentage of the solutions of the Pareto front where the algorithm is not beaten by any other algorithm, and Beats all is the percentage of the solutions where the algorithm beats all other algorithms. In order to perform this statistical analysis, the Mostats5 toolbox [31] is used. The statistical analysis determines the attainment surfaces of the three MOOAs throughout the fitness space to show the percentage by which each algorithm outperforms the other algorithms.

The results in Table 12 show that MOBFOA was unbeaten in 81% of the fitness space covered by the three algorithms, while in 82.5% of the fitness space it outperformed OMOPSO and NSGA-II. The proposed MOBFOA is the best in performance and OMOPSO is the second, as it outperformed NSGA-II in 20.3% of the fitness space. NSGA-II has performed well in part of the fitness space but has not outperformed MOBFOA and OMOPSO at the 95% confidence level.

                 MOBFOA   OMOPSO   NSGA-II
Unbeaten (%)     81       38.6     15.72
Beats All (%)    82.57    20.3     0

Table 12: Table depicting statistical analysis of MOOAs

Computational Complexity: Table 13 shows the computational time of the three MOOAs considered in our study during the three bi-objective simulation cases. NSGA-II takes the minimum time for computation. OMOPSO is the second best in terms of computational time. However, it takes more time than NSGA-II, as it uses a separate archive to store the non-dominated solutions and applies mutation by dividing the population into three different segments. MOBFOA takes the maximum amount of computation time among these three MOOAs. It uses a fitness assignment technique for the selection of bacteria, which is computationally more expensive than the crowding distance technique used in NSGA-II and OMOPSO.

             Case1                          Case2                          Case3
             MOBFOA   OMOPSO   NSGA-II      MOBFOA   OMOPSO   NSGA-II      MOBFOA   OMOPSO   NSGA-II
Mean         242.61   217.53   155.72       214.61   197.53   140.72       242.61   217.53   155.72
Std. Dev.    2.93     2.87     2.65         3.03     2.63     2.43         2.79     2.18     1.65

Table 13: Table depicting computational time taken by MOOAs (in seconds)

8. Conclusion and future directions

In this paper, we have proposed a new MOBFOA, which enables visualization of the solution space and progressively evolves the possible solutions to the scheduling problem by adapting certain predefined conditions. Our proposed approach uses a ranking strategy to select the leaders (solutions that direct the search), along with crowding distance, which preserves diversity in the solutions. We have used an adaptive chemotaxis approach to avoid premature convergence of MOBFOA and to ensure that the proposed approach converges towards the true Pareto-optimal front. The fitness assignment strategy prevents the loss of good solutions when MOBFOA evolves the solutions for the next generation. The bi-objective grid scheduling case study shows that the non-dominated solutions obtained by MOBFOA are better than those obtained by OMOPSO and NSGA-II in terms of both convergence and diversity, but at the expense of computational time. Future research could include further modifications to the chemotaxis step to analyse its impact on the convergence of MOBFOA. The fitness assignment based selection process can be improved to reduce the computational complexity of MOBFOA. A comparative study between the proposed MOBFOA and other optimization techniques such as SPEA2 and SMPSO could also be pursued in terms of convergence towards the true Pareto-optimal front, diversity in solutions, and computational complexity.

References

[1] V. Rajaraman, Grid computing, Resonance 21 (5) (2016) 401–415.

[2] A. Wadhonkar, D. Theng, A survey on different scheduling algorithms in cloud computing, in: Proceedings of International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB16), IEEE, 2016.

[3] M. Kaur, S. Kadam, Discovery of resources using MADM approaches for parallel and distributed computing, Engineering Science and Technology, an International Journal, Elsevier 20 (3) (2017) 1013–1024.

[4] Vijindra, S. Shenai, Survey on scheduling issues in cloud computing, Procedia Engineering 38 (2012) 2881–2888.

[5] A. Certa, G. Galante, T. Lupo, G. Passannanti, Determination of Pareto frontier in multi-objective maintenance optimization, Reliability Engineering and System Safety 96 (7) (2011) 861–867.

[6] G. Chiandussi, M. Codegone, S. Ferrero, F. E. Varesio, Comparison of multiobjective optimization methodologies for engineering applications, Computers and Mathematics with Applications 63 (5) (2012) 912–942.

[7] Y. Kessaci, N. Melab, E. Talbi, Multi-level and multi-objective survey on cloud scheduling, in: Proceedings of IEEE International Parallel and Distributed Processing Symposium Workshops, IEEE, 2014.

[8] Q. Bai, Analysis of particle swarm optimization algorithm, Computer and Information Science 3 (1) (2010) 180–184.

[9] C. A. C. Coello, M. Reyes-Sierra, Multi-objective particle swarm optimizers: A survey of the state-of-the-art, International Journal of Computational Intelligence Research 2 (3) (2006) 287–308.

[10] M. R. Sierra, C. A. C. Coello, Improving PSO-based multi-objective optimization using crowding, mutation and e-dominance, in: Evolutionary Multi-Criterion Optimization (EMO 2005), Springer, Germany, 2005, pp. 505–519.

[11] J. Carretero, F. Xhafa, Use of genetic algorithms for scheduling jobs in large scale grid applications, Technological and Economic Development of Economy 12 (2006) 11–17.

[12] C. Grosan, A. Abraham, Multiobjective evolutionary algorithms for scheduling jobs on computational grids, International Conference on Applied Computing (2007) 459–463.

[13] F. Xhafa, J. Carretero, A tabu search algorithm for scheduling independent jobs in computational grids, Computing and Informatics 28 (2009) 1001–1014.

[14] A. G. Delavar, J. Bayrampoor, A. Boroujeni, A. Broumandnia, Task scheduling in grid environment with ant colony method for cost and time, International Journal of Computer Science, Engineering and Applications 2 (5) (2012) 1–12.

[15] G. Pan, Y. Xu, A. Ouyang, G. Zheng, An improved artificial chemical reaction optimization algorithm for job scheduling problem in grid computing environments, Journal of Computational and Theoretical Nanoscience 12 (7) (2015) 1300–1310.

[16] R. Salimi, N. Bazrkar, M. Nemati, Task scheduling for computational grids using NSGA II with fuzzy variance based crossover, Advances in Computing 3 (2) (2013) 22–29.

[17] D. Paul, S. K. Aggarwal, Multi-objective evolution based dynamic job scheduler in grid, in: Eighth International Conference on Complex, Intelligent and Software Intensive Systems, Vol. 7, IEEE, 2014, pp. 359–366.

[18] M. Kaur, Elitist multi-objective bacterial foraging evolutionary algorithm for multi-criteria based grid scheduling problem, in: Proceedings of IEEE International Conference on Internet of Things and Applications (IOTA), 2016.

[19] S. Selvi, D. Manimegalai, Multiobjective variable neighborhood search algorithm for scheduling independent jobs on computational grid, Egyptian Informatics Journal 16 (2) (2015) 199–212.

[20] F. Zhang, J. Cao, K. Li, S. U. Khan, K. Hwang, Multi-objective scheduling of many tasks in cloud platforms, Future Generation Computer Systems 37 (2014) 309–320.

[21] R. K. Jena, Multi objective task scheduling in cloud environment using nested PSO framework, Procedia Computer Science 57 (2015) 1219–1227.

[22] A. Razaque, N. R. Vennapusa, N. Soni, G. S. Janapati, khilesh Reddy Vangala, Task scheduling in cloud computing, in: Proceedings of IEEE Conference on Systems, Applications and Technology Conference (LISAT), IEEE, 2016, pp. 359–366.

[23] R. K. Jena, Task scheduling in cloud environment: A multi-objective ABC framework, Journal of Information and Optimization Sciences 38 (2017) 1–19.

[24] G. shun Yao, Y. sheng Ding, K. rong Hao, Task scheduling in cloud environment: A multi-objective ABC framework, Multi-objective workflow scheduling in cloud system based on cooperative multi-swarm optimization algorithm 24 (5) (2017) 1050–1062.

[25] K. M. Passino, Biomimicry of bacterial foraging for distributed optimization and control, Control Systems Magazine 22 (2002) 52–67.

[26] K. Deb, A. Pratap, S. Agarwal, Fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary Computation 6 (2002) 182–197.

[27] E. Zitzler, M. Laumanns, L. Thiele, SPEA2: Improving the strength Pareto evolutionary algorithm for multiobjective optimization, in: Evolutionary Methods for Design Optimization and Control with Applications to Industrial Problems, International Center for Numerical Methods in Engineering, 2001, pp. 95–100.

[28] S. Parsa, R. E. Maleki, RASA: a new grid task scheduling algorithm, World Applied Sciences Journal 7 (2) (2009) 152–160.

[29] K. M. Sirasala, Evolving optimal solutions by nature inspired algorithms, Thesis (2012).

[30] D. A. Van Veldhuizen, Multiobjective evolutionary algorithms: Classifications, analyses, and new innovations, Ph.D. thesis (1999).

[31] D. Knowles, W. Corne, Approximating the non-dominated front using the Pareto archived evolution strategy, Journal of Evolutionary Computation 8 (2) (2000) 149–172.