Asymptotic scheduling for many task computing in Big Data platforms




Information Sciences xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Journal homepage: www.elsevier.com/locate/ins

Asymptotic scheduling for many task computing in Big Data platforms

Andrei Sfrent, Florin Pop *

Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Romania

Article info

Article history: Received 9 July 2014; Received in revised form 16 March 2015; Accepted 20 March 2015; Available online xxxx

Keywords: Asymptotic scheduling; Many-task computing; Cloud computing; Big Data platforms; Simulation

Abstract

Due to the advancement of technology, the datasets processed nowadays in modern computer clusters extend beyond the petabyte scale – the 4 detectors of the Large Hadron Collider at CERN produced several petabytes of data in 2011. Large scale computing solutions are increasingly used for genome sequencing tasks in the Human Genome Project. In the context of Big Data platforms, efficient scheduling algorithms play an essential role. This paper deals with the problem of scheduling a set of jobs across a set of machines and specifically analyzes the behavior of the system at very high loads, which is specific to Big Data processing. We show that under certain conditions we can easily discover the best scheduling algorithm, prove its optimality and compute its asymptotic throughput. We present a simulation infrastructure designed especially for building/analyzing different types of scenarios. This allows us to extract scheduling metrics for three different algorithms (the asymptotically optimal one, FCFS and a traditional GA-based algorithm) in order to compare their performance. We focus on the transition period from low incoming job rates to very high load and back. Interestingly, all three algorithms experience poor performance over the transition periods. Since the Asymptotically Optimal algorithm makes the assumption of an infinite number of jobs, it can be used after the transition, when the job buffers are saturated. As the other scheduling algorithms do a better job under reduced load, we combine them into a single hybrid algorithm and empirically determine the best switch point, offering in this way an asymptotic scheduling mechanism for many task computing used in Big Data processing platforms.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

The motivation behind solving scheduling problems is the current need to efficiently process very large data sets. At CERN, for instance, ATLAS and the three other main detectors at the LHC produced 13 petabytes of data in 2011 [5]. At this scale, even minor optimizations in the scheduling algorithms will result in major improvements in terms of processed data. In the field of meteorology we also find data sets ranging from terabytes to petabytes [25]. Some of the world's supercomputers are working solely with the purpose of predicting life threatening events such as hurricanes, storms and earthquakes. Examples of applications addressing the problem of Big Data processing are: clustering and data mining [15,19], prediction and analytics [26,27], geographic and satellite data processing [11], etc. There is, indeed, a lot of focus on optimization at all

* Corresponding author. E-mail addresses: [email protected] (A. Sfrent), fl[email protected] (F. Pop). http://dx.doi.org/10.1016/j.ins.2015.03.053 0020-0255/© 2015 Elsevier Inc. All rights reserved.

Please cite this article in press as: A. Sfrent, F. Pop, Asymptotic scheduling for many task computing in Big Data platforms, Inform. Sci. (2015), http://dx.doi.org/10.1016/j.ins.2015.03.053



possible levels: from the hardware platforms and scheduling software to the algorithms that are used to actually process the data [1]. From the economical point of view, good scheduling techniques help production costs decrease by efficiently taking advantage of the available resources. On certain occasions, providing a reliable scheduling framework that is able to deal with a very high volume of jobs per time unit and to maximize the value of the executed jobs is directly linked to income. Even major companies have trouble from time to time when users generate too many requests that have to be processed in a short amount of time. The evolution of modern scheduling algorithms has been closely related to discoveries in the area of artificial intelligence, as genetic algorithms, for instance, are largely used nowadays to solve this type of problem. Moreover, GAs are often combined with heuristics that were empirically proved to yield better results. As simplicity is a prerequisite for reliability (Edsger W. Dijkstra), our goal is to design a simple algorithm that performs optimally under certain conditions. Such algorithms may be combined into powerful hybrids that are able to solve the exact instance of the scheduling problem that occurs in our system. The general scheduling problem, known as the Job Shop Problem (JSP), is a problem in discrete (combinatorial) optimization; it is notoriously hard to solve and has been discussed many times in the literature [3,43]. The problem may be stated as follows: we are given a set of jobs, each of which consists of multiple operations, each operation needing to be executed on a specific machine, and a set of machines for these jobs to run on; find a planning of the jobs on machines so that all of them complete in the minimum possible amount of time. This problem is, in fact, a generalization of the TSP (Traveling Salesman Problem) and it is known to be NP-hard.
This paper addresses the following scheduling problem: a finite set of jobs is given and each job consists of the execution of an undetermined number of tasks. A task is a sequence of specific operations and its length depends on the machine on which it is scheduled to run. We have a cluster consisting of a limited number of heterogeneous machines, one machine can process a specified number of operations at a time and preemption is not allowed. Our goal is to maximize the total number of jobs that are processed up to a certain point. Our model is a version of the JSP – there are always only three stages for each job (Read, Process, Write) and all of them must be scheduled on the same machine. The scheduling problem consists of pairing jobs with machines in such a way that the total value is maximized. We are confronting two separate issues here: the predictability of the system (the capacity of knowing the job rates ahead) and the 0/1 Knapsack problem. The predictability of the system has been discussed multiple times in the literature and several techniques have been developed in order to accurately predict the load of the system or the number of tasks that will arrive at a future point in time. Such approaches include ANNs (artificial neural networks) [14] and advanced statistical models. In our approach, we remove the prediction issue by assuming well known task rates for each job (we can also see this as a good prediction model whose details are left unspecified). This paper describes the building of a hybrid algorithm to solve the scheduling problem with respect to our model. In Section 2 an overview of related work is presented. In Section 3 we introduce our model and some of the restrictions we impose on the system. The asymptotically optimal algorithm is presented in Section 4, together with a mathematical demonstration of its optimality under heavy load circumstances.
In Section 5 we approach the problem of building a hybrid algorithm and investigate the optimal switch points. The simulation framework we built in order to assess the performance of our schedulers is described in detail in Section 6. Lastly, Section 7 presents the results of our experiments and a brief comparison between several scheduling algorithms. We also include, in Section 8, conclusions derived from our work and future research directions.

2. Related work

The Job Shop Problem is one of the fundamental problems in the field of job scheduling. Because it is hard to provide exact solutions even for small instances of this problem, most of the research is focused on developing approximation algorithms and heuristics. Some authors have successfully used genetics-based learning systems and even hybrids that combine the advantages of both immune and genetic algorithms [16,35]. Other papers include further approaches such as incomplete search algorithms [4] and ANNs (artificial neural networks) [42]. In [42] the author maps the problem to a Hopfield Network built in such a way that minimizing its energy function leads to a good solution to the scheduling problem. Another scheduling approach was proposed in [37] for hypergraph-based data link layer scheduling for reliable packet delivery in wireless sensing and control networks with end-to-end delay constraints. Easier problems in the NP set (such as the Flow Shop Problem) have been solved by adapting existing approximation algorithms for the Traveling Salesman Problem. Garey showed back in 1976 that the Flow Shop Problem is NP-complete for m ≥ 3 (where m is the number of machines) [10]. This means there is a known polynomial reduction from FSP to TSP and any approximation algorithm for TSP can be used to solve FSP as well. As a side note, the m < 3 condition makes the polynomial-time algorithm unusable in modern computer clusters. Let's consider the case of only one machine. Let's also consider a fraction of our scheduling time T as being a knapsack with a total weight capacity of T units. Each job represents an object type and the actual objects are called tasks. Each job (object) takes a certain amount of time T_j to execute on the specified machine (the object's weight). Every task that is completely executed within the time T has an assigned value (the object's value).
From the 0/1 Knapsack we learn that we are dealing


with an NP-complete problem. This means there is no known algorithm that can efficiently solve our scheduling problem (unless P = NP). The algorithms that try to solve this problem fall into two categories:

- Dynamic programming is used to provide exact solutions to 0/1 Knapsack in pseudo-polynomial time for discrete, strictly positive weights. There are no known algorithms that can exactly solve this problem for the real-valued case with a computational complexity that is less than exponential.
- Approximation algorithms: there is a wide range of approximation algorithms for this problem. Some of them are based on approximation algorithms for other NP problems, such as TSP (traveling salesman problem) or 3SAT (3-variable Boolean satisfiability problem). Other approaches include optimization techniques based on genetic algorithms or artificial neural networks (for instance Hopfield Networks).

There is some focus on researching hybrid algorithms for solving the scheduling problem as well, but most of the papers do not usually combine two completely different algorithms; rather, they combine different techniques that solve parts of the same problem (for instance, genetic algorithms to optimize the actual scheduling and neural networks to predict the load of the system so that information can be fed into the fitness function of the genetic algorithm), the scheduling optimization being oriented on makespan and/or flowtime [2,22,38]. There is a lot of focus on developing high performance genetic algorithms to find solutions to scheduling problems. GA approaches can be seen as probabilistic searches in the solution space and various heuristics have been implemented to get closer to the optimal solutions. For instance, PSO (Particle Swarm Optimization) may be used to generate a better initial population for the genetic algorithms [18,33].
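As a concrete illustration of the pseudo-polynomial dynamic program mentioned above, here is a generic sketch (our own, not code from the paper) for integer weights:

```python
def knapsack_01(weights, values, capacity):
    """Exact 0/1 knapsack via dynamic programming. O(n * capacity) time,
    which is pseudo-polynomial: capacity is exponential in its bit length."""
    best = [0] * (capacity + 1)  # best[c] = max value achievable with capacity c
    for w, v in zip(weights, values):
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

# e.g. three tasks with running times 3, 4 and 5 and unit value each,
# in a scheduling window of length 7: at most two tasks fit (3 + 4).
print(knapsack_01([3, 4, 5], [1, 1, 1], 7))  # -> 2
```

In the scheduling analogy above, the window length T plays the role of the capacity, task running times are the weights, and each completed task contributes its value.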
Rui Zhang published a paper on a similar idea: a combination between simulated annealing and a genetic algorithm; the authors also build an inference system in order to extract information from the system, which is later used to guide the optimization [40]. Scalable and multi-objective scheduling solutions for many tasks in Cloud platforms were proposed in [7,17,20,32,39]. The problem of dynamic rescheduling based on multiple heuristics was studied and presented in [21,24,34]. However, solutions for the general scheduling problem are not always suitable for real systems. Sinnen raises this issue in one of his research papers, saying "Experiments revealed the inappropriateness of this classic model to obtain accurate and efficient schedules for real systems" [31]. He also states that his model, which takes into account processor involvement in communication, has a much higher efficiency. Some of the theoretically efficient algorithms present weaknesses when tested on real systems due to the approximation of certain parameters (setup time, network communication, bandwidth) [13]. Another reference for our model is ADJSA (Adaptable Dynamic Job Scheduling Approach), which introduces a new signal into the scheduling algorithm: the job's historical information (i.e. from recent executions). A prediction model is continuously adjusted in accordance with "dynamic and real-time factors of the Grid" [8,36]. This can be viewed as a hybrid algorithm consisting of a prediction part and a decision part. In an article from 2007 a team designed a system that combines several heuristics to score problem constraints and feeds the scores into a decision module that devises a scheduling plan [6]. The solution described in [12] proposes static scheduling and some of the issues its authors raise coincide with ours. The fundamental problem with such schedulers is the capacity of the base model to account for every possible outcome.
In this paper, we use a combination between FCFS and an asymptotically optimal algorithm (at high job rates) in order to achieve better performance. We also focus on the mathematical determination of the exact switch points with respect to our assumptions. The precomputed points are, in fact, part of a static scheduling plan (we will follow the switch plan no matter what the state of the cluster is at that time).

3. Proposed model for asymptotic scheduling

This section formally states the model we are going to use: we define two entities, Jobs and Machines, and describe their properties. A Job is described by three parameters and the job set has the following formal description:

J = {j | j = (D, μ, t_p)}    (1)

where

- D is the amount of data the job has to read from disk before it can start processing;
- μ is the fraction of the data D which the job produces as an output after the processing has been done. We can also imagine these two parameters as computing time needed by the machine to prepare data (the total amount of data would be D(1 + μ));
- t_p is the processing time needed by the job to complete on any machine. We chose this parameter to be machine independent, although it would make little difference. In fact, d_ij, the time a specific machine needs in order to execute a certain task, depends only on the (Job, Machine) pair and is not influenced in any way by the rest of the state of our system.

The Machine set has the following formal description:

M = {m | m = (s_r, s_w)}    (2)


Table 1
Notations used in this paper.

Notation    Explanation
⌊x⌋         The integral part of x.
|H|         Cardinality of the set H.
j → i       Job j ∈ J is to be scheduled on machine i ∈ M.

where

- s_r is the time needed by a machine to read a unit of data from the disk. For a job j ∈ J, the expected time needed to read all the data needed by the job is D_j s_r;
- s_w is the time needed by a machine to write a unit of data to the disk. For a job j ∈ J, the expected time needed to write out all the data produced by the job is μ D_j s_w.

Note that it is possible to develop a more complex model that still fits our needs. The only restriction for building such a model (or validating an existing one) is that we need to be able to compute the time needed for one job to run on a certain machine depending solely on the machine's and job's parameters. This is needed because our problem has to be in some way reducible to an instance of the 0/1 Knapsack Problem. If the model is changed in such a way that it introduces further constraints, it may prevent the theory in this paper from being applicable. We are using throughout the paper the term "cluster" to denote a set of (possibly heterogeneous) machines that are connected in a network and can be viewed, in certain respects, as a single system. This is the model of a datacenter, which is specific to Big Data processing. This system has a special requirement: the existence of job buffers. A job buffer can be viewed as a way of storing jobs for later scheduling. The definition does not restrict in any way the physical implementation of the buffers. There may be special machines that take care of this, specialized memory systems, etc., but we will consider that any operation performed on them is an O(1) operation and takes a negligible amount of time. Other operations such as querying the state of a machine (which can be either busy or available), one scheduling step, and starting/ending a job are considered to be atomic and negligible as well. The mathematical notations used in this paper are presented in Table 1.

4. Asymptotically optimal scheduling

A solution to a particular scheduling problem consists of an algorithm that is able to devise a certain arrangement of the available jobs on a given set of machines so that a specific metric is optimized. In our case, we are interested in the throughput of the system: we want to achieve the highest possible number of completed tasks per time unit.
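To make the model concrete, here is a minimal sketch in Python (the class and field names are our own, not the paper's); the execution time follows the read/process/write accounting described above, which the paper denotes d_ij:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    D: float    # amount of data read from disk before processing
    mu: float   # fraction of D produced as output (total data = D * (1 + mu))
    t_p: float  # machine-independent processing time

@dataclass(frozen=True)
class Machine:
    s_r: float  # time to read one unit of data
    s_w: float  # time to write one unit of data

def exec_time(job: Job, machine: Machine) -> float:
    """Time for one task of `job` on `machine`: read, process and write
    all happen on the same machine (d_ij in the paper's notation)."""
    return job.D * (machine.s_r + job.mu * machine.s_w) + job.t_p

print(exec_time(Job(D=10.0, mu=0.5, t_p=2.0), Machine(s_r=0.1, s_w=0.2)))  # -> 4.0
```

Note that `exec_time` depends only on the (Job, Machine) pair, which is exactly the restriction the model imposes.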
There are many scheduling algorithms – some of them are simple, such as "First come first served" (FCFS), which manages a job queue and pops one job at a time, scheduling it on one of the available processing units; on the other hand, some use more sophisticated approaches to traverse and analyze the problem space and find a better solution: for instance, reinforcement learning (a machine learning technique) has been used to automatically "learn" a good resource utilization that is afterward fed into a genetic algorithm [9]. Another interesting technique is to devise algorithms that perform really well under certain conditions (such as heavy load, low job rates, or slowly changing job frequencies) and then combine all of them under a single, powerful scheduler. The crucial piece of code in such cases is the one that chooses between the different available scheduling techniques. We need a reliable mechanism for identifying and scoring each set of conditions as they appear (using Markov chain techniques [29,30]). We construct in this paper a hybrid algorithm like the ones discussed in the paragraph above. We will consider that our system already uses a scheduler of its own, so we will identify a set of conditions and build an algorithm that performs almost optimally under these specific conditions. There is, nevertheless, a big difference in our approach: we will not try to also build an empirical method that differentiates between sets of conditions, but will rely instead on mathematics to deliver the specific points in time when we need to switch between the two schedulers. We will approach the problem of finding an asymptotically optimal algorithm for scheduling. Note that we are talking about asymptotically optimal scheduling at very high loads. That means we will consider an infinite number of jobs of each type; we will then present a scheduling algorithm and prove its optimality under these conditions.
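The "crucial piece of code" that chooses between schedulers can be reduced to a small dispatch loop. The sketch below is our own illustration (the scheduler callables and the `switch_points` argument are hypothetical); it switches between a general-purpose scheduler and a heavy-load one at precomputed time intervals, as opposed to probing the system empirically:

```python
def make_hybrid(normal_step, heavy_step, switch_points):
    """Return a scheduling step that delegates to `heavy_step` inside the
    precomputed [start, end) heavy-load intervals and to `normal_step`
    elsewhere. `switch_points` is a list of (start, end) pairs."""
    def step(t, cluster_state):
        heavy = any(start <= t < end for start, end in switch_points)
        chosen = heavy_step if heavy else normal_step
        return chosen(t, cluster_state)
    return step

# toy schedulers that just report which policy ran
fcfs = lambda t, state: "fcfs"
asymptotic = lambda t, state: "asymptotic"

hybrid = make_hybrid(fcfs, asymptotic, switch_points=[(100, 500)])
print(hybrid(50, None))   # -> fcfs
print(hybrid(200, None))  # -> asymptotic
```

The static switch plan is followed regardless of the cluster state at time t, which matches the paper's precomputed-switch-point approach.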
As Serafini stated in his paper, the asymptotic bound of scheduling algorithms has not received enough attention from researchers. He established certain limits for the general scheduling problem in [28] and, because of the simplicity of our model, we are able to reach them. The immediate relation between the entities defined in our model is given by the amount of time needed for a machine i ∈ M to complete a certain task j ∈ J. We define d_ij as follows:

d_ij = D^(j) (s_r^(i) + μ^(j) s_w^(i)) + t_p^(j)    (3)



We can see this equation as a compressed version of our model. We will not be using the individual parameters for Jobs/Machines from this point on, but will instead use d_ij. This would be impossible, for instance, with a model derived from the Job Shop problem, where operations are allowed to be executed on different machines (the only thing that matters is their relative order inside a job). As we specified that a job must read some data, process it and then write some output on the same machine, blending all the operations and considering them as one is possible – we can now drop some of the details and proceed using the information that d_ij encapsulates. The total number of jobs of type j ∈ J that can be executed on the machine i ∈ M in T time units is

N_ij(T) = ⌊T / d_ij⌋    (4)

N_ij(T) = ⌊T / (D^(j) (s_r^(i) + μ^(j) s_w^(i)) + t_p^(j))⌋    (5)

Throughout this paper we will work mainly with continuous functions, so the result above is of little interest to us. Also, we are interested in the case when T → ∞. Let us define n_ij, the rate at which jobs are executed on a certain machine (following the same notation). Thus,

n_ij = lim_{T→∞} N_ij(T) / T    (6)

The following inequality holds:

lim_{T→∞} (T/d_ij − 1) / T ≤ n_ij ≤ lim_{T→∞} (T/d_ij + 1) / T    (7)

We will independently solve the left and right parts of the inequality:

lim_{T→∞} (T/d_ij − 1) / T = lim_{T→∞} (1/d_ij − 1/T) = lim_{T→∞} 1/d_ij − lim_{T→∞} 1/T = 1/d_ij − 0 = 1/d_ij    (8)

lim_{T→∞} (T/d_ij + 1) / T = lim_{T→∞} (1/d_ij + 1/T) = lim_{T→∞} 1/d_ij + lim_{T→∞} 1/T = 1/d_ij + 0 = 1/d_ij    (9)

Thus, asymptotically, n_ij = 1/d_ij = const. This will later give us a good approximation of the throughput of the entire system. It is an approximation because of the integral part being dropped (it only provides an asymptotic behavior – the longer an algorithm keeps scheduling the job j on the machine i, the closer the throughput gets to the value above).

4.1. The best asymptotically optimal algorithm

Since we are now scheduling under high load conditions, we can assume we have an infinite amount of tasks of each type. This means we do not need to worry so much about specific jobs and we can look at our scheduling problem from each machine's point of view. Let us define the following set of jobs that are executed on the ith machine, H_i:

H_i = {j_1, j_2, j_3, ..., j_n},    |H_i| = n    (10)

The time needed for this set of jobs to complete is given by:

T_i = Σ_{j ∈ H_i} d_ij    (11)

Let j_{i,min} be that j ∈ J for which the following equation holds:

d_{i j_{i,min}} = min_{j ∈ J} d_ij =: d_i    (12)

Intuitively, we will be scheduling on each machine the same job – the one that gives us the highest throughput on that particular machine. Note that we are only interested in optimizing the total number of scheduled jobs; in other words, the cost of scheduling j on i is d_ij and the value we get is 1 (after completion, no job has a greater value than the others). With the definition above we are now able to build our new schedule, namely the set H'_i, so that it has the same number of jobs as H_i (|H'_i| = |H_i| = n):

H'_i = {j_{i,min}, j_{i,min}, ..., j_{i,min}}    (13)

The time needed for this new set of jobs to complete is given by:


T'_i = Σ_{j ∈ H'_i} d_ij = Σ_{j ∈ H'_i} d_i = n d_i    (14)

We also know that d_i ≤ d_ij, ∀j ∈ J, which leads us to the following conclusion:

n d_i ≤ Σ_{j ∈ H_i} d_ij  ⟹  T'_i ≤ T_i    (15)

Let's analyze this result. By minimizing the time for the same amount of jobs to complete, we also maximized the asymptotic throughput. Also, since we did not enforce any restrictions on the elements of H_i (it was arbitrarily chosen), the H'_i scheduling should be at least as good as any other possible scheduling of n jobs on the ith machine. Thus, we found the best asymptotically optimal scheduling algorithm. We can now compute another interesting result for our model: its throughput. Let's see how many jobs we will be able to schedule using this algorithm given a certain amount of time T. Using the notations above:

N_{i j_{i,min}} = ⌊T / d_i⌋    (16)

We are interested in the asymptotic behavior of this function, so we push T to the limit:

n_i = lim_{T→∞} N_{i j_{i,min}} / T = lim_{T→∞} ⌊T/d_i⌋ / T = 1/d_i    (17)

Remember that we only took into account a single machine, so all the results hold in the case of a single machine. In reality we will deal with clusters that consist of hundreds or thousands of machines. The simplicity of the algorithm, combined with the fact that our model has only independent jobs whose operations are bound to be executed on the same machine and in the same order (read, process, write), makes it easy for us to compute the full throughput of the cluster as:

n = Σ_{i ∈ M} n_i = Σ_{i ∈ M} 1/d_i    (18)

Again, this result is only asymptotically correct. In a real world application, this is only an approximation. The real value gets closer to the approximation as we use the algorithm, and for a sufficiently long period the difference is negligible.

4.2. Practical implementation details

In this section we are going to explore some of the alternatives we have for implementing this scheduling method. The straightforward algorithm is rather simple. We will show that by pre-processing d_ij we get a better running time. Throughout this section we will use the following notations across all algorithms and complexity analyses:

- i, a machine, an element of the M set;
- j, a job, an element of the J set;
- m, the total number of machines (that is, m = |M|);
- n, the total number of jobs (that is, n = |J|).

There is one important element that is missing from the algorithms below: checking the availability of the jobs/machines. This was intentional – we have removed these details in order to make the code more readable and our intentions more clearly expressed.

4.2.1. Brute force

Algorithm 1. Brute force.
1: for all machines i ∈ M do
2:     d_i ← ∞
3:     for all jobs j ∈ J do
4:         d_ij = D^(j) (s_r^(i) + μ^(j) s_w^(i)) + t_p^(j)
5:         d_i = min{d_i, d_ij}
6:     end for
7:     schedule j → i such that d_ij = d_i
8: end for
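Algorithm 1 translates almost directly into executable form. The sketch below is our own illustration, with an `exec_time` callback standing in for line 4 of the pseudocode and with availability checks omitted, as in the pseudocode:

```python
def brute_force_step(machines, jobs, exec_time):
    """One call of Algorithm 1: for every machine, scan all jobs for the
    minimum execution time d_ij and schedule that job on it.
    Costs O(m*n) per call, and the call is repeated at every time step."""
    plan = {}
    for i in machines:
        d_i, best = float("inf"), None
        for j in jobs:
            d_ij = exec_time(j, i)
            if d_ij < d_i:           # d_i = min{d_i, d_ij}
                d_i, best = d_ij, j
        plan[i] = best               # schedule j -> i such that d_ij = d_i
    return plan

# toy instance with execution times given directly as a table
times = {("m1", "a"): 3.0, ("m1", "b"): 2.0, ("m2", "a"): 1.0, ("m2", "b"): 4.0}
plan = brute_force_step(["m1", "m2"], ["a", "b"], lambda j, i: times[(i, j)])
print(plan)  # -> {'m1': 'b', 'm2': 'a'}
```

Since the inner minimization does not depend on the time step, hoisting it out of the per-step call and caching each machine's best job — as Algorithm 2 does — reduces the per-step cost to O(m).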



This is a trivial implementation and we can quickly find its running time. Computations inside the inner loop take place in constant time, O(1), and will always be executed exactly n times. The outer for loop may be executed, in the most unfavorable case, at most m times, so the resulting running time of this algorithm is O(mn). Over a period of time T, the scheduler code is called T times, so the total time spent inside the scheduler is O(mnT). It is really important to lower this complexity because in a real world application the time spent in the scheduler should be minimized as much as possible in order to make the scheduler as responsive as possible. By its nature, and by the fact that the two loops may be interchanged with only minor modifications (without affecting the running time), this algorithm does not take advantage of any of the special cases that might occur. It is also clear that computations in the inner loop are unnecessarily repeated across consecutive calls to the scheduling algorithm. This is a key observation and can be used to improve the algorithm's performance, making it linear in the number of machines. The optimization is implemented in the next subsection.

4.2.2. Preprocessing of d_i

Algorithm 2. Preprocessing of d_i.
1: for all machines i ∈ M do
2:     d_i ← ∞
3:     for all jobs j ∈ J do
4:         d_ij = D^(j) (s_r^(i) + μ^(j) s_w^(i)) + t_p^(j)
5:         d_i = min{d_i, d_ij}
6:     end for
7: end for
8: for all machines i ∈ M do
9:     schedule j → i such that d_ij = d_i
10: end for

This algorithm consists of two parts: preprocessing and the actual scheduling. Informally, the preprocessing step answers the question "what is the best job that we can run on a certain machine?". Ideally, this would be implemented as an initialization step that is called only once, at the beginning of the scheduler's life. After finding out the best job for each machine, the actual scheduling part (the one that is being called at each moment of time) has to check every free machine and schedule the precomputed job (if available) on it. The second part of the algorithm takes O(m) time because it has to iterate through all the machines in order to check if scheduling a job is possible at that time. The first part of the algorithm is more interesting. Its sequential complexity has the order O(mn), but because it is an initialization step, it can be computed in parallel on all of our machines. This way, the complexity lowers to O(n). The total running time of the algorithm is O(m + n) if we consider only one time step. Over a period of T units, we end up with O(n + Tm), which is, asymptotically, n times better than the performance of the brute force algorithm.

5. Hybrid scheduling approach

5.1. Schedulers and job rates

In this section we will talk about scheduling from the point of view of individual job types and define 3 functions that we will be using throughout the rest of this section. Informally, these functions have to do with the rates at which jobs either come, are scheduled or complete at a certain point in time. We will use continuous functions, considering only their asymptotic behavior.
This is a good approximation, since the frequency of change in our model is considered to be fairly small. In one of the next sections we will take a closer look at this approximation and see how well it fits a discrete simulation. Let's define the following three functions:

f_j : R+ → R+,    j ∈ J    (19)

This is the rate at which a job j ∈ J arrives to our cluster. The function will most probably be experimentally derived, either via interpolation or approximation using a set of samples. Our simulation framework provides the ability to introduce arbitrary counters to probe certain events and can, at least theoretically, be used to estimate f_j.
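For illustration, one simple way such a rate function could be estimated from probe data (our own sketch; the paper does not prescribe a particular estimator) is a sliding-window count over observed arrival timestamps:

```python
def arrival_rate(timestamps, t, window=10.0):
    """Estimate f_j(t) as the number of arrivals of job type j observed
    in the interval (t - window, t], divided by the window length."""
    count = sum(1 for s in timestamps if t - window < s <= t)
    return count / window

# 5 arrivals in the last 10 time units -> estimated rate of 0.5 jobs/unit
stamps = [91.0, 93.5, 95.2, 97.8, 99.9]
print(arrival_rate(stamps, t=100.0))  # -> 0.5
```

A smaller window tracks changes in the rate faster but is noisier; interpolating such point estimates yields a continuous approximation of f_j.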

h_j : R+ → R+,    j ∈ J    (20)



This is the rate at which jobs j ∈ J are being scheduled at a certain point in time by our asymptotically optimal algorithm. This is a general function that describes our algorithm and it has no definition restrictions (it should correctly estimate the momentary throughput even if the heavy load conditions are not met). This function may also be approximated (everything from f_j still applies), but an approximation that covers its entire domain is not really useful. Thus, we will later introduce the definition of another function family that derives from h_j, namely h_j^H, which expresses the behavior of the scheduling algorithm under the "heavy load" conditions and, what's more, is time independent (as we have shown earlier).

$g_j : \mathbb{R}_+ \to \mathbb{R}_+, \quad j \in J \qquad (21)$

This is the rate at which jobs $j \in J$ are being scheduled at a certain point in time by a normal scheduler. Remember that the solution deals with combining two different scheduling algorithms to produce better results. Hence, $g_j$ is the function that characterizes the "usual" scheduler, the one that we will be using when the "heavy load" conditions are unmet. We will later derive mathematical instruments for computing $g_j$ asymptotically for a FCFS scheduler. Under some conditions, mathematical approaches are impractical. We may, once again, use the simulation framework we built to probe it and use the collected data to build a probabilistic model for our scheduler. As we have said before, let's introduce a function that describes our technique under the conditions we are interested in (under heavy load). We will use the $h_j^H$ notation to denote this optimal scheduling rate.

$h_j^H : \mathbb{R}_+ \to \mathbb{R}_+, \quad j \in J \qquad (22)$

The $\mathbb{R}_+$ domain we are using here does not mean the algorithm described by this function has this behavior everywhere, but that it has this behavior whenever the conditions are met. That is, if the conditions are met throughout $\mathbb{R}_+$, our algorithm will sustain this throughput over the entire domain. From now on, we will use the $j \to i$ notation to show that a certain job $j$ is being scheduled by the algorithm on machine $i$ (where $j \in J$ and $i \in M$). Also, $D_j$ is the unordered set of all processing times of a job $j$ on any machine on which it is scheduled. Thus,

$D_j = \{ d_{ij} \mid i \in M,\ j \to i \} \qquad (23)$

We now have all the data and notations that we need to give a formal definition of the $h_j^H$ function and compute its value:

$h_j^H(t) = \sum_{d \in D_j} \frac{1}{d} \qquad (24)$

From Eqs. (23) and (24) we deduce that $h_j^H(t)$ is constant in time. From now on, we will use $h_j^H$ as a constant notation for $h_j^H(t)$. Using this we could also compute various metrics, such as the total number of jobs that can be scheduled in a certain amount of time, between, let's say, $T_1$ and $T_2$, in the whole cluster:



$N = \sum_{j \in J} \int_{T_1}^{T_2} h_j^H \, dt \qquad (25)$

Since $h_j^H$ is constant, we can move it outside the integral:

$N = \sum_{j \in J} h_j^H \int_{T_1}^{T_2} dt \qquad (26)$

After solving the integral and substituting the formula for $h_j^H$ we get

$N = (T_2 - T_1) \sum_{j \in J} \sum_{d \in D_j} \frac{1}{d} \qquad (27)$

Although we will not use this result in the form we stated it, it shows the power of having mathematical definitions for scheduling algorithms. This usually leads to lower computational complexities because we do not need to simulate everything that happens inside the cluster to compute metrics that we can later use for improving the scheduling. Mathematics also gives us trustworthy and bug-free results that we can use for building better algorithms on top of them. In the next section we will try to answer the question "when can we consider the heavy load conditions to have been met so we can switch to our asymptotically optimal algorithm?". One could say this question is easily answerable from the inside of the scheduling algorithm, at runtime, just by checking those conditions. The trick here is that given a lot of tasks and machines we need to spend some computational power at each step just to perform this check. The computational complexity of having this check at runtime is certainly asymptotically worse than O(1), and we can reduce it to O(1) if we precompute, given a correct model, all the interesting timestamps we need.
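As an illustration of computing such metrics directly from the model, Eqs. (24) and (27) can be evaluated in a few lines. The processing-time sets below are illustrative assumptions, not the paper's data:

```python
# Sketch of Eqs. (24) and (27): asymptotic per-type throughput h_j^H and the
# total number N of jobs schedulable between T1 and T2.

def h_star(D_j):
    # Eq. (24): h_j^H = sum over d in D_j of 1/d
    return sum(1.0 / d for d in D_j)

def total_jobs(D, T1, T2):
    # Eq. (27): N = (T2 - T1) * sum_j sum_{d in D_j} 1/d
    return (T2 - T1) * sum(h_star(D_j) for D_j in D)

D = [{4, 5}, {2}]            # assumed D_j sets (processing times per job type)
print(h_star(D[0]))          # 1/4 + 1/5
print(total_jobs(D, 0, 100))
```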


5.2. Switching algorithms: g → h at $T_1$

We will now approach the main topic of our model: switching algorithms. The main idea here is that we want to use the right tool for the job. We have a scheduling algorithm that we normally use because it performs well on average, and we called it $g_j$. However, if we identify a set of conditions under which another algorithm optimally schedules our jobs, we will want to make use of this observation. We already demonstrated that $h_j^H$ is the best we can achieve at very high loads, when we have an infinite amount of jobs at our disposal, so let's mathematically determine (in terms of our model, knowing the functions $f_j$, $g_j$ and $h_j^H$) a point in time (let's call it $T_1$) when we are able to successfully apply $h_j^H$ in order to get better results. Note that there will exist some gray areas before $T_1$: the asymptotically optimal algorithm may perform worse on average than the other algorithm and at some point (closer to $T_1$) it becomes better. Ideally, we would determine all the solutions of an integral equation and choose the one that is closest to $T_1$, when the situation stabilizes and $h_j$ is better than $g_j$ until it achieves the $h_j^H$ throughput. We will not deal with this problem at this time. Let's just determine the moment when we are sure that we have enough jobs to optimize the throughput. As the job rates increase, jobs will be stored in buffers/queues in our cluster. We call this accumulation. The accumulation rate is defined as

$a_j : \mathbb{R}_+ \to \mathbb{R}_+ \qquad (28)$

$a_j(t) = f_j(t) - g_j(t) \qquad (29)$

The formula uses $g_j$ because we are now situated before $T_1$ (so the switch has not happened yet). It simply shows that the accumulation is, in fact, produced by all those tasks our algorithm has been unable to schedule. Because of the limited job storage capacity of our cluster, not all of these jobs will be scheduled. Some of them will simply be lost. If we imagine a scenario where users send some queries that we cannot execute (and ignore instead), some of those users who had their queries ignored will probably resend them at a later time. This scenario is completely covered by our model because we can treat those as new queries that come from users (that is, unrelated to the fact that we ignored them earlier). All of this must be modeled by the $f_j$ function. The next step for us is to count how many of these jobs accumulate over a period of time. Without restricting generality, we will consider that our analysis starts from $t = 0$, when $g_j$ was able to schedule every job that it came across in a sufficiently small amount of time (that is, at the beginning the equation $g_j(t) = f_j(t)$ holds). This is true for most real clusters: one should have enough machines and a powerful enough scheduler so that under normal loads all incoming requests can be fulfilled. The total accumulation from $t = 0$ to $t = T$ is

$A_j(T) = \int_0^T a_j(t)\,dt = \int_0^T (f_j(t) - g_j(t))\,dt \qquad (30)$

In most cases, before $T_1$, we will deal with two different states of the system. We already briefly described the first one above: $g_j(t) = f_j(t)$. The contribution of this state to the integral equals 0. In the second state, the "heavy load" conditions we need for our scheduler have not yet been met, so we are still using $g_j$. In this state, $A_j$ will monotonically increase until all the buffers in the cluster are fully saturated. We say that our conditions are met when

$A_j(T) \geq h_j^H, \quad \forall j \in J \qquad (31)$

For each individual task we compute the latest $T_{1j}$ for which the following equation holds:

$A_j(T_{1j}) = h_j^H, \quad \forall j \in J \qquad (32)$

which formally gives us $T_1$ as

$T_1 = \max \{ T_{1j} \mid A_j(T_{1j}) = h_j^H \} \qquad (33)$

5.3. Switching algorithms: h → g at $T_2$

We are now situated at a point in time $t > T_1$ and we are using $h_j$ to schedule jobs. At this moment all the job buffers in the cluster are fully saturated. As the $f_j$ functions have high values, a lot of jobs continue to arrive and the scheduling algorithm (remember, $h_j(t) = h_j^H$) is not able to fulfill all requests. From a throughput point of view we are scheduling optimally – there is no other technique that could help us do a better job under these conditions. Before we can continue, we need to introduce another member of our model, a characteristic of the cluster itself (although it may also be seen as being part of the job description): buffers. We can imagine buffers as storage boxes where jobs accumulate whenever our scheduling capacity is lower than the rate at which new jobs arrive. The buffers will hold jobs for us so we can schedule them later, when resources are available. From a mathematical point of view, we are interested in the capacity of the cluster to store jobs $j \in J$, so $B_j \in \mathbb{N}^*$. We can deduce from $B_j \in \mathbb{N}^*$ two individual


constraints. First of all, we need $B_j > 0$, because without any storage capacity we cannot even schedule that particular job – we need to store it somewhere so we can inform the scheduling algorithm about its existence. Secondly, we cannot model anything closely related to real world applications if we do not impose the $B_j < \infty$ restriction. It is practically impossible to hold an infinite amount of information (besides, it is impossible to have that infinite amount of information already generated at a certain point in time so that we could ask ourselves where to store it). Next, let's move further with our experiment and lower the values of $f_j$ so that they start to monotonically decrease. While doing so, there will be a job dependent moment $T_{0j}$ for which

$f_j(T_{0j}) = h_j^H \qquad (34)$

We are able to use $h_j^H$ in the equation above because as long as $f_j(t) \geq h_j^H$ we still have the "heavy load" conditions met (that is, for a particular job). Conversely, when $f_j(t)$ drops under $h_j^H$, the conditions are no longer met if we do not take buffers into account. Bringing buffers into the equation, we can still employ optimal scheduling for a short period of time (the difference of tasks we need will be supplied by the buffers rather than by our users). A graphical analysis will show that this time is usually really short, but there are certain conditions under which it can last longer, so if we want an accurate model, we need to take it into account. The equation below builds on top of the last one, studying the behavior of the scheduling for $t > T_{0j}$:

$B_j + \int_{T_{0j}}^{T} f_j(t)\,dt - \int_{T_{0j}}^{T} h_j^H\,dt \geq h_j^H \qquad (35)$

Setting $T = T_{2j}$ leads us to:

$B_j + \int_{T_{0j}}^{T_{2j}} f_j(t)\,dt - \int_{T_{0j}}^{T_{2j}} h_j^H\,dt = h_j^H \qquad (36)$

This can be further simplified by observing that $h_j^H$ is constant in time:

$B_j + \int_{T_{0j}}^{T_{2j}} f_j(t)\,dt - (T_{2j} - T_{0j})\, h_j^H = h_j^H \qquad (37)$

The second moment of time we were looking for, $T_2$, can then be obtained by looking at the $T_{2j}$ values and choosing the minimum:

$T_2 = \min_{j \in J} \{ T_{2j} \} \qquad (38)$

After $T_2$, the first buffer stores fewer jobs than our asymptotically optimal algorithm needs to be efficient, so the hybrid algorithm needs to switch back to the scheduler denoted by the $g_j$ function family.

6. Simulation framework and experimental results

We present in this section the MTS2 (Many Task Scheduling Simulator) framework. The purpose of the MTS2 framework is to simulate the events that happen inside a cluster in order to experimentally check/validate our results. It can be used for a broad range of simulations – even if it was originally developed for the simulation of scheduling algorithms, it can also be used, for instance, to plot mathematical functions. This framework provides building blocks that can be used to implement mathematical models for simulation. It was designed to be fast and scalable: for instance, the simulation of one million steps on the test machine took under 60 s (using the FCFS scheduler). This means that, on average, each simulation step takes about 60 μs. For the purpose of this paper, performance is critical: we study the asymptotic behavior of scheduling algorithms under extreme conditions. The main simulation that has been done to show the properties discussed in the previous section runs a scheduling algorithm in a cluster with two thousand machines, a storage capacity of several thousands of jobs and job rates that would quickly fill the buffers if left unscheduled.

6.1. MTS2 framework design

There were two key elements that were taken into account while designing the MTS2 framework: efficiency and sampling. First of all, let's talk about efficiency. Later in this paper we will show some results that we were able to produce using our simulation software, and some of the parameters are impressive.
We were able to simulate in only 10 s the key processes that take place inside a cluster: jobs arrive, they are temporarily stored in job-specific buffers, and a scheduling algorithm takes the current state of the system, selects a subset of jobs and pairs them with empty machines. We were able to repeat these steps one million times, resulting in an average time of just 10 μs per step. Better performance may result from porting this framework to a GPU architecture. Scheduling algorithms have been tested multiple times on GPU clusters; such examples can be found in [23,41].


The way we achieve this performance is by ensuring O(1) operation whenever we need to access related objects. For instance, Jobs have internal pointers to the Buffers where they are stored in the cluster; they also have pointers to the machine they are running on and to their related counters. We chose not to rely too much on the (possibly efficient) implementations of collections in the Standard Template Library, but to use direct pointers as much as possible, even though the computational complexity would remain the same (O(1)). Unfortunately, by maintaining all these internal links between components we are increasing the complexity of the system as a whole. Whenever we were faced with the trade-off between performance and low coupling, we chose performance. The components are far less reusable given the current design, but it is still really flexible with respect to testing scheduling algorithms: a scheduling algorithm gets all the information about the state of the cluster it might need and it is not tied to a certain Job or Machine model in any way. The second key element that influenced the MTS2 design was sampling and probing simplicity. We started from the idea that it does not matter how well an experiment looks or performs if it is hard to get data out of it. The current design allows one-line changes for new metrics. The class that helps us do so is called Counter. This is the main type of object that we use for extracting data out of our experiments. Inside the simulation, any component is able to declare a Counter and it will be instantly visible everywhere. Almost every object has a step() function that does the actual simulation and an increment_counters() function that updates specific counters for each object.
For instance, the step() function for a Machine consumes a unit of time from the job it needs to process at the moment, while the increment_counters() function may set a counter's value to 0 or 1 depending on whether the machine is busy or not. A metric that we can extract from this simple counter is "How much time was the machine free?". Low values tell us we are not using the machines effectively or we only had a small amount of jobs to be scheduled. The counters are implemented in such a way that all the user needs to do is declare one and increment/set its value at the appropriate times; the simulation framework will take care of sampling and recording the values. The simulation basically runs in two steps that aggregate the two points above. The first step is to run the actual simulation for a tiny amount of time (one time unit – we are using the term tiny because we usually run these simulations over several million time units). The second step is to sample the counters (internally, we call this operation "checkpointing"). The sampling can be fine tuned so that we impose a certain period of time over which the counters are being sampled, and we can even set a specific sampling rate. In the next sections, we will be focusing on the technical details of the simulation framework in a bottom-up manner. We will start with a description of the basic model (Jobs, Machines and Buffers), then move to the Scheduling class that encapsulates all of the scheduling logic. Next, we will build up to the Cluster and towards the end we will talk a bit about experiments and the main simulation code. At the end of this section we will explain how the mathematical model has been mapped to the framework components, present a short list of the most relevant counters and discuss the information we managed to gather.

6.2. Counter smoothing

Counter smoothing is an important piece of functionality we have built into the Counter class; it is used to smooth rapidly changing functions for the purpose of plotting. We are rarely interested in the instantaneous value of the functions because, and this is especially true at high change frequencies, it is more useful to analyze the trends rather than individual values. Moreover, it is sometimes hard to infer the trend from discrete points at a high frequency of change. For exemplification we will use the FCFS scheduler on some test data with and without smoothing and then discuss the results (the running conditions of the algorithm are exactly the same for both simulations, except for the smoothing feature activation). Fig. 1 shows the instantaneous throughput of a FCFS algorithm. There is a lot of variation in the throughput and that makes it hard for us to observe its trend. We can get rid of the superfluous information by activating Counter smoothing on this function. What Counter smoothing does is average the values back over a given number of steps. In the next example we are using this feature in order to get a more analyzable graph. In Fig. 2 the trend of the scheduler's throughput is more obvious and we can estimate it at about 150 executed jobs per time unit with less effort.

6.3. Jobs, machines and buffers

These three items form the basic layer of our experiment. The most important thing to note about these three classes is the direct mapping of the data members to the mathematical model definitions. The implementation lets us define various job types and machines via configuration files. It is possible to load these configuration files into an experiment by using the following two parameters:

- jobs points to the configuration file that contains the job definitions. Every job definition is stored in CSV format.
For instance, the first part of the definition of one of the jobs we use in our experiment is (4, 0.7, 8), which translates into the formal description $j \in J$, $j = (D = 4, \mu = 0.7, t_p = 8)$. This is also the place where the storage capacity for each job is defined. For our simulation we used a uniform size across all buffers – every one of them is able to store two thousand jobs;


Fig. 1. FCFS throughput without counter smoothing. (Axes: Time (steps) vs. Job count; series: f0, f1, f2, throughput.)

Fig. 2. FCFS throughput with counter smoothing over 1000 time units. (Axes: Time (steps) vs. Job count; series: f0, f1, f2, throughput.)
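The smoothing used for Fig. 2 – averaging the raw counter samples back over a fixed window – can be sketched as follows (a hypothetical helper; the window size is a free parameter):

```python
# Sketch of Counter smoothing: average raw samples over fixed-size windows
# so that the trend, rather than the instantaneous value, is plotted.

def smooth(samples, window):
    out = []
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        out.append(sum(chunk) / len(chunk))
    return out

raw = [0, 1, 0, 1, 0, 1, 0, 1]    # a rapidly oscillating counter
print(smooth(raw, 4))             # -> [0.5, 0.5]
```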

- machines points to the configuration file that contains the machine definitions. These are kept in CSV format as well. A machine definition of (4, 1) translates into $m \in M$, $m = (s_r = 4, s_w = 1)$.

In our specific experiment we have two more parameters that we can set at the job level:

- ft1: the absolute moment in time when job rates start to increase (controls the beginning of the "heavy load" conditions);
- ft2: the absolute moment in time when job rates return to their initial levels (controls the end of the "heavy load" conditions).
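Loading such CSV definitions can be sketched in a few lines. The exact file format of MTS2 is not public; this sketch assumes one "D,mu,t_p" or "s_r,s_w" tuple per line:

```python
# Hypothetical loader for the job/machine configuration files described above.

def parse_jobs(text):
    jobs = []
    for line in text.strip().splitlines():
        D, mu, t_p = (float(x) for x in line.split(","))
        jobs.append({"D": D, "mu": mu, "t_p": t_p})
    return jobs

def parse_machines(text):
    return [tuple(float(x) for x in line.split(","))
            for line in text.strip().splitlines()]

print(parse_jobs("4,0.7,8"))        # -> [{'D': 4.0, 'mu': 0.7, 't_p': 8.0}]
print(parse_machines("4,1"))        # -> [(4.0, 1.0)]
```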

6.4. Cluster and scheduling algorithms

The Cluster class is the one that brings everything together and basically builds the experiment foundation. It deals with the management of jobs and the collections of machines and buffers. At this level we can start to talk about scheduling algorithms: the Cluster class is responsible for creating the scheduler we want to use. We implemented four scheduling algorithms for this paper with the purpose of testing our model: NoScheduler, AOScheduler, FCFSScheduler and GAScheduler. The first of them, NoScheduler, is not a real scheduling algorithm – it is a mock we use to test the integration of the scheduling mechanism into the Cluster. AOScheduler is the algorithm that has an asymptotically optimal behavior under very high load conditions (denoted $h_j$ in our mathematical model). The other two algorithms (denoted $g_j$ in our model) serve for performance comparisons: we implemented a "first come first served" algorithm (FCFSScheduler) and one based on genetic algorithms (GAScheduler). At the cluster and scheduling algorithms level, we defined the following parameters:

- primary_scheduler: the primary scheduler we want to run in our cluster. The equivalent of this primary scheduler in our model is the $g_j$ function set;
- secondary_scheduler: the secondary scheduler we want to run in our cluster. The equivalent of this secondary scheduler in our model is the $h_j$ function set. This is optional, and if no secondary scheduler is specified, the experiment will only run the primary scheduler throughout the experiment;
- secondary_scheduler_start: the moment in time when we want the first switch to happen (formally, this is the equivalent of an experimental $T_1$ at which g → h should take place);
- secondary_scheduler_end: the moment in time when we want the second switch to happen (formally, this is the equivalent of an experimental $T_2$ at which h → g should take place).

6.5. Experiments and main simulation

The Experiment class is an object designed to wrap our model – it connects the simulation and sampling logic to the model itself. It is a really thin layer, as the class has only one method, but it is important because it draws a frontier between the simulation algorithm and the data it operates on. We are now reaching the topmost piece of code of our framework: the main simulation logic. This code is responsible for parsing command line arguments and setting all the parameters we passed to the application. At this level, we can control some time and sampling related parameters:

- experiment: the experiment class we want to run. This parameter helps us to choose from a series of different experiments.
For instance, in our simulation we have built an experiment called ClusterExperiment;

- start: the moment of time that is considered the start of the simulation;
- stop: the moment of time that is considered the end of the simulation. Our logic will count from start to stop and call the step() method on the Experiment;
- start_sampling: the moment of time when we want to start probing the experiment for samples;
- end_sampling: the moment of time when we want to stop probing the experiment for samples. These two parameters are useful in a context where we have a long simulation but are actually interested only in a small part of it and need to see that part in more detail;
- sampling_rate: the rate at which we want the system to probe the experiment for samples. The value is expressed in single time units. For instance, if we want to sample every 10 steps, the sampling_rate parameter would be set to 10.

7. Experimental results

The main goal of this experiment is to use a hybrid algorithm (a combination of the "first come first served" algorithm and the asymptotically optimal algorithm we previously presented) in order to optimize the total number of jobs that can be executed in a certain time period. We will compare these results with a more complex algorithm that uses a genetic approach to build better solutions. We want to show that simple but highly specialized algorithms may lead to better results than a single complex algorithm. First of all, we need our experiment to be as fine grained as possible, so we will simulate 1 million time steps. One time step of the simulation is considered to be a full cycle of the following operations:

1. jobs come to the cluster (according to their rate function);
2. buffers are filled and every job that does not fit is dropped;
3. the scheduler is called; it analyzes the buffers and machines and devises a schedule;
4. machines that have a job associated with them run exactly one time step of their corresponding job;
5. all objects involved in the experiment are queried for counters;
6. counters are logged to a log file.
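One such cycle can be sketched as follows (all data structures are simplified stand-ins for the MTS2 classes; the FCFS policy and the parameter values are illustrative, not the paper's implementation):

```python
# Sketch of one simulation time step: arrivals, buffering with drops,
# scheduling, one unit of execution, and counter updates.

def step(t, rates, buffers, capacity, machines, schedule, counters):
    # 1-2. jobs arrive; anything over the buffer capacity is dropped
    for j, rate in enumerate(rates):
        buffers[j] = min(buffers[j] + rate(t), capacity)
    # 3. the scheduler pairs buffered jobs with free machines
    schedule(buffers, machines)
    # 4. each busy machine runs its job for exactly one time unit
    for m in machines:
        if m["remaining"] > 0:
            m["remaining"] -= 1
            if m["remaining"] == 0:
                counters["done"] += 1
    # 5-6. counters would be sampled and logged here

def fcfs(buffers, machines):
    # toy FCFS: give the first non-empty buffer's job to any idle machine
    for m in machines:
        if m["remaining"] == 0:
            for j, n in enumerate(buffers):
                if n > 0:
                    buffers[j] -= 1
                    m["remaining"] = m["job_len"][j]
                    break

machines = [{"remaining": 0, "job_len": [3, 5]} for _ in range(2)]
counters = {"done": 0}
buffers = [0, 0]
for t in range(10):
    step(t, [lambda t: 1, lambda t: 0], buffers, 1000, machines, fcfs, counters)
print(counters["done"])   # jobs completed over 10 steps
```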

Every operation in this list is considered to be instantaneous, except for the actual running of the jobs on machines (step 4), which takes a full time unit to complete. This way, we do not take into account the running time of the scheduling algorithm. Because we want to put our asymptotically optimal algorithm to the test we need to create a "high load" segment in the simulation. This has been achieved by using three different types of job-rate functions (for a graphical representation, see Fig. 3):

- normal job rates: the functions deliver few jobs; the rates are tuned in such a way that all the jobs can be scheduled and executed, so that the buffers are always empty. FCFS, for example, has been found to exhibit this behavior;


Fig. 3. Job rates – number of jobs that arrive to the cluster per time unit. (Axes: Time (steps) vs. Job count; series: f0, f1, f2.)

- high job rates: the function starts to increase and then stabilizes at a maximum level. This has been done by using the sigmoid function because it provides a simple way of setting its horizontal asymptote and its growth rate;
- normal job rates: the functions deliver few jobs. This is the same function as the first one, except for the sudden drop at the beginning.

This way we can focus on optimizing for the $T_1$ and $T_2$ time points discussed in the previous section. In order to make the results more visible, we need high enough rates for the normal conditions. We have 3 machine types and 1500 units of each type, a total of 4500 machines. The normal rates have an average of 90 jobs per time unit. Each job needs an average of 31 time units to complete. We have used counter smoothing with an averaging period of 1000 time units. Our system is capable of storing a maximum of one million jobs for each of the three job types we have. Running this experiment takes, on average, one minute on the test machine we used for development.

7.1. Comparison between FCFS and AO

We ran our simulation under the conditions described above using only the FCFS scheduler. Here is a fragment from the output of the simulation framework:

[MTS2-MSG main(96)]: Simulation took 46.184 s
[MTS2-MSG main(97)]: Average time per step 0.046184 ms
[MTS2-MSG main(98)]: Total scheduled jobs: 107,111,671

This time, we ran our simulation using only the AO scheduler. The simulation output was:

[MTS2-MSG main(96)]: Simulation took 64.634 s
[MTS2-MSG main(97)]: Average time per step 0.064634 ms
[MTS2-MSG main(98)]: Total scheduled jobs: 110,587,911

We can draw some conclusions directly from this output:

- The AO scheduler seems to be slower than FCFS. In fact, algorithmic complexity is not the only cause of the timing difference between the two runs. The biggest contribution comes from the fact that AO manages to schedule more jobs, which then have to be simulated, so more time is spent in the Machine::step() method.
- AO does a better job at scheduling overall. The AO algorithm managed to schedule three and a half million more jobs than FCFS. This is due to the optimality of the algorithm in the high load area. We will see later that it is possible to combine the two algorithms and yield an even better result.
- Each step of the simulation takes about 50 μs. This is due to the hardware oriented optimizations we implemented in the simulation framework.

Figs. 4 and 5 give us some more information about the evolution of the FCFS and AO schedulers during the simulation. They show the throughput of each of the algorithms in time, as scheduled jobs per time unit. In the first part of the simulation, the general trend of both algorithms is almost the same. Both of them experience a flat area, at around 90 jobs per time unit,


Fig. 4. FCFS scheduler evolution. (Axes: Time (steps) vs. Job count; series: f0, f1, f2, throughput.)

Fig. 5. AO scheduler evolution. (Axes: Time (steps) vs. Job count; series: f0, f1, f2, throughput.)

and then increase until they reach some limit. The first flat area is due to the "normal conditions" that occur in our system at that point. Both algorithms are able to completely schedule all the jobs that come, so the throughput is exactly the same. Note that we have used counter smoothing for FCFS to show this behavior. This is not needed for AO because it is not influenced by factors such as the relative order of the jobs that come into our system. At t ≈ 330,000 both throughputs start to increase, as more jobs start to come to the cluster. They stabilize at a certain level, which we can see as the asymptotic scheduling limit of the algorithm under heavy load conditions, when we can consider an infinite number of tasks. We can clearly see that this limit is about 155 jobs/time unit for FCFS and almost 170 jobs/time unit for the asymptotically optimal algorithm. Under these conditions AO produces a scheduling that is 9.7% better. At t ≈ 660,000 job rates drop to the initial values. The heavy load period is now over. Note that the throughput does not drop immediately: the second flat area is still sustained by the buffers. Because FCFS schedules jobs by the time they arrived at the cluster and the buffers were fully saturated over the high load period, it uniformly consumes the jobs in the buffers and, when they are empty, there is a sudden drop in the throughput, which returns to its initial values. On the other hand, AO exhibits a slightly different behavior. It drops, around t ≈ 700,000, to a certain level and is able to sustain this level a little longer, until t ≈ 770,000, when it drops to its initial level. In order to understand what happens in this area, we need to take a look at the job buffers. Fig. 6 shows us the three buffer loads. For instance, we can see that shortly after the job rates start to increase the buffer loads start to rise.
They reach saturation at one million stored jobs, at which point the cluster ignores any job


Fig. 6. Job buffer evolution during AO scheduler run (job count over time in steps; series: buffer0, buffer1, buffer2).

that arrives, because it has no further storage capacity. On the right side, as the pressure on the buffers is released, we can see that buffer0 takes more time to empty. This happens because the first job type (indexed 0) takes longer to complete on its assigned machines, and it explains the curious flat area in Fig. 5: the AO algorithm is able to keep scheduling jobs of type 0 a little longer than jobs of types 1 and 2, using all the machines available for this task. This is exactly the area where the AO algorithm behaves poorly. A comparison between the time points where the throughputs experience sudden drops gives another sign of the superiority of AO over FCFS: AO consumes the buffers faster than FCFS, and the graph shows the sudden drop point to the left of t = 700,000 for AO and to the right of t = 700,000 for FCFS.
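The drain-out behavior can be illustrated with a minimal discrete-time sketch. The numbers below are assumptions for illustration only: we reuse the per-step AO scheduling counters (42, 65, 60) reported in Section 7.3 as service rates and pick a hypothetical post-peak arrival rate of 30, which is not the simulator's exact parameterization.

```python
def drain_times(initial_load, service_rate, arrival_rate):
    """Steps until each saturated buffer empties (one entry per job type)."""
    times = []
    for load, mu, lam in zip(initial_load, service_rate, arrival_rate):
        net = mu - lam                     # net jobs drained per step
        # ceiling division; a buffer with no net drain never empties
        times.append(float('inf') if net <= 0 else -(-load // net))
    return times

# Saturation level from Fig. 6 (one million jobs per buffer); with these
# assumed rates, buffer 0 takes the longest to empty, matching the plot.
print(drain_times([10**6] * 3, [42, 65, 60], [30, 30, 30]))
```

With these assumed rates the slow service of type 0 dominates, which is exactly why buffer0 sustains the extra flat area in Fig. 5.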

7.2. Hybrid algorithm

We now move on to our main subject: a hybrid that uses the asymptotically optimal algorithm under high load conditions. To produce the following simulation we switched from FCFS to AO at t = 300,000 and back to FCFS at t = 700,000. Note that this is just a guess: the two switching points were only approximated graphically from the plots in the previous section. Here is the output of the simulation:

[MTS2-MSG] main(96): Simulation took 49.614 s
[MTS2-MSG] main(97): Average time per step 0.049614 ms
[MTS2-MSG] main(98): Total scheduled jobs: 110,589,710

As we can see, the total number of scheduled jobs improved by only 0.002% compared to using AO alone. It is, nevertheless, a sign that by tuning the T_1 and T_2 parameters we can reach larger improvements. Let us take a look at the graph shown in Fig. 7. The first part of the simulation looks familiar. The second flat area has now risen to about 170 jobs per time unit, which is the maximum possible value. In the second part, around the second switch point (T_2 = 700,000), the hybrid algorithm exhibits a strange behavior. The sudden drop before t = 700,000 is due to AO, which quickly consumes the buffers for jobs 1 and 2. Recall from the previous section (Fig. 6) that buffer 0 still stores many jobs of type 0. We find ourselves in an unfortunate situation for AO: buffers 1 and 2 are empty and the job rates for their respective types are low, so most of the machines that process types 1 and 2 are available. Even though these resources are available, the AO algorithm does not use them, by design, for processing jobs of type 0. FCFS does not restrict machine usage in any way, so an FCFS scheduling will probably do a better job in this case. This is visible in Fig. 7: the peak that comes immediately after changing algorithms at T_2 is due to the ability of FCFS to use the available resources to process jobs of type 0. Fig. 8 shows in more detail the buffer load evolution near the switch point. Buffers 1 and 2 are cleared before the switch point; buffer 0 has a less steep descent, and at t = 700,000, when the algorithms are switched, FCFS starts to consume the jobs in buffer 0 at a higher speed than AO.
Ideally, the hybrid algorithm should switch back to FCFS earlier, in order to use the resources that become available as soon as the jobs stored in buffers 1 and 2 are consumed.
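The dispatch logic of such a hybrid can be sketched as follows; T1 and T2 are the approximated switch points used in this simulation, and the scheduler arguments are placeholders for the real FCFS and AO implementations, not the paper's actual code.

```python
# Approximated switch points from the simulation above.
T1, T2 = 300_000, 700_000

def pick_scheduler(t, fcfs, ao):
    """Return the scheduler to use at time step t."""
    if T1 <= t < T2:
        return ao      # heavy load window: buffers saturated, AO is optimal
    return fcfs        # otherwise FCFS, which may use any machine for any job
```

Because the window check is a pair of constant comparisons, switching costs O(1) per step at runtime.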

Please cite this article in press as: A. Sfrent, F. Pop, Asymptotic scheduling for many task computing in Big Data platforms, Inform. Sci. (2015), http://dx.doi.org/10.1016/j.ins.2015.03.053


Fig. 7. Hybrid algorithm; switch points have been approximated (job count over time in steps; series: f0, f1, f2, throughput).

Fig. 8. Hybrid algorithm: buffer evolution around T_2 (job count over time in steps; series: buffer0, buffer1, buffer2).

7.3. Finding the best switch point

First of all, we need to run another simulation to determine the exact throughput of the asymptotically optimal algorithm throughout the high load period. The results are shown below:

[MTS2-MSG] Counter details: (inst sched job 0, 42)
[MTS2-MSG] Counter details: (inst sched job 1, 65)
[MTS2-MSG] Counter details: (inst sched job 2, 60)

Using this information, we can now compute the first and the second switch moments. Experimentally, we have found these to be T_1 = 417,955 and T_2 = 693,176 in our case. The main simulation has the following output:

[MTS2-MSG] main(96): Simulation took 70.823 s
[MTS2-MSG] main(97): Average time per step 0.070823 ms
[MTS2-MSG] main(98): Total scheduled jobs: 110,798,000

Let us compute the optimization ratio of this algorithm with respect to the FCFS-only and AO-only simulations. We want to express the extra scheduled jobs as a percentage of the original value, thus


p_ALG = (N_h - N_ALG) / N_ALG        (39)

where N_h is the total number of jobs scheduled by the hybrid algorithm and N_ALG is the total number of jobs scheduled by the FCFS and AO algorithms, respectively.

- p_FCFS = (110,798,000 - 107,111,671) / 107,111,671 = 0.034416 ≈ 3.44%.
- p_AO = (110,798,000 - 110,587,911) / 110,587,911 = 0.0018997 ≈ 0.19%.

In the next section we will test our hybrid algorithm against a genetic algorithm based scheduler.

7.4. Genetic algorithm based scheduler

We designed a genetic algorithm that solves the scheduling problem defined by our model, in order to see how well it behaves compared to the hybrid algorithm of the previous section. Below we present the output of the main simulation:

[MTS2-MSG] main(100): Simulation took 540.793 s
[MTS2-MSG] main(101): Average time per step 0.540793 ms
[MTS2-MSG] main(102): Total scheduled jobs: 109,706,454

The first thing to notice is the long time it took for the simulation to complete with the GA scheduler, which is due to the complexity of the genetic algorithm. We used a small population (10 individuals) and only one generation per time unit. The fitness function optimizes for throughput: it computes the average time it takes for all machines to schedule their assigned jobs and targets the minimum value (by doing this, it actually targets the maximum throughput, which is, asymptotically, the inverse of the average). We can also see that the genetic algorithm was able to schedule more jobs than the plain FCFS algorithm, but fewer than the optimized hybrid algorithm. Nevertheless, a genetic approach may result in better maintainability of the scheduling software, because it uses only a single procedure that fits almost any situation (by its nature). We have been unable to produce better results with other fitness functions. The next section presents a side by side comparison between all the algorithms we implemented.

7.5. Algorithms comparison

In this section we present a structured view of the results of our simulations. Table 2 comments:

- FCFS. Baseline implementation;
- AO. The use of the Asymptotically Optimal algorithm yields better results, but its use outside the heavy load conditions makes it slightly less efficient, especially around T_2;

Table 2. Algorithm comparison – full simulation.

Scheduling algorithm     | Total (jobs) | Difference to baseline (jobs) | (%)  | Throughput (jobs/time)
FCFS                     | 107,111,671  | 0                             | 0    | 107.11
AO                       | 110,587,911  | 3,476,240                     | 3.14 | 110.59
Hybrid (approximation)   | 110,589,710  | 3,478,039                     | 3.15 | 110.59
Hybrid (optimal switch)  | 110,798,000  | 3,686,329                     | 3.33 | 110.80
Genetic algorithm        | 109,706,454  | 2,594,783                     | 2.42 | 109.71
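The optimization ratios reported in Section 7.3 can be checked directly against Eq. (39), using the job totals from Table 2; this is a verification sketch, not part of the simulator.

```python
def optimization_ratio(n_hybrid, n_alg):
    """Eq. (39): extra scheduled jobs as a fraction of the baseline total."""
    return (n_hybrid - n_alg) / n_alg

# Hybrid (optimal switch) vs. the FCFS-only and AO-only totals.
p_fcfs = optimization_ratio(110_798_000, 107_111_671)   # ~3.44%
p_ao = optimization_ratio(110_798_000, 110_587_911)     # ~0.19%
```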

Table 3. Algorithm comparison – high load conditions.

Scheduling algorithm     | Total (jobs) | Difference to baseline (jobs) | (%)  | Throughput (jobs/time)
FCFS                     | 55,713,921   | 0                             | 0    | 139.28
AO                       | 59,233,272   | 3,519,351                     | 6.32 | 148.08
Hybrid (approximation)   | 59,233,441   | 3,519,520                     | 6.32 | 148.08
Hybrid (optimal switch)  | 59,774,606   | 4,060,685                     | 7.29 | 149.44
Genetic algorithm        | 58,033,225   | 2,319,304                     | 4.16 | 145.08
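The throughput-oriented fitness described in Section 7.4 can be sketched as follows; the data layout and names are illustrative assumptions, not the paper's implementation.

```python
# One GA individual assigns each buffered job to a machine; its fitness is
# the average per-machine completion time. Minimizing this average
# asymptotically maximizes throughput (the inverse of the average).
def fitness(assignment, job_cost, n_machines):
    """Mean finish time across machines for one GA individual.

    assignment[i] is the machine index for job i; job_cost[i] is the
    processing time of job i on its assigned machine.
    """
    finish = [0.0] * n_machines
    for job, machine in enumerate(assignment):
        finish[machine] += job_cost[job]   # accumulate load per machine
    return sum(finish) / n_machines        # lower is better
```

A GA would evaluate this for every individual in the population each time unit, which is why the simulation above is roughly ten times slower than the other schedulers.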



- Hybrid (approximation). Because we only used a graphical approximation of the switch points, we did not get any noticeable improvement;
- Hybrid (optimal switch). This hybrid algorithm makes use of our mathematical method of finding the best switch points. It has the best performance of all the tested combinations;
- Genetic algorithm. This serves for comparison with the best hybrid algorithm. The monolithic nature of this algorithm makes it easier to maintain, at the expense of slightly worse performance.

Table 3 illustrates the behavior of all the algorithms during the high load phase. Table 3 comments:

- FCFS. Baseline implementation;
- AO. The asymptotically optimal algorithm performs better than FCFS. From this data it becomes more visible that Hybrid (approximation) does not take advantage of FCFS's ability to make better use of the available resources outside the heavy load phase;
- Hybrid (optimal switch). This algorithm efficiently combines AO and FCFS and outperforms all the others, having the highest throughput;
- Genetic algorithm. The genetic algorithm performs well under heavy load, but its slow convergence results in a lower number of scheduled jobs. Due to its higher complexity it needs more time to output a schedule, so it may not be suitable for systems with fast scheduling demands.

8. Conclusion

In this paper we built an algorithm that works very well at high job rates. We showed that the scheduling problem becomes simpler if we consider the case of an infinite amount of jobs, and that it is possible to compute the asymptotic throughput of the best algorithm. We designed, in our simulation framework, an experiment that consists of three different phases. In the first one, job rates are at normal levels (the levels that would normally occur in our cluster).
In the second phase, job rates increase rapidly until the total number of requests arriving at the cluster exceeds our scheduling capacity (no matter what scheduling algorithm we use). In the third, the cluster experiences a sudden drop in job rates, but the buffers are still saturated and it takes some time until they are completely emptied. During the first and third phases, FCFS and AO showed similar throughput performance. AO was more predictable, while for FCFS we needed counter smoothing with an averaging period of 1000 time steps. In fact, in the first phase both algorithms were able to successfully schedule all incoming jobs (with a more efficient resource utilization by AO). As expected, during the second phase AO yields a better performance than FCFS. Based on our mathematical model, we built a hybrid algorithm: the algorithm representing the g_j function family has been combined with AO (which represents the h_j function family). The idea is as follows: we know that AO is able to efficiently use the resources at very high job rates (we provided an asymptotic approximation of the throughput) and that the other algorithm is able to handle the rest of the scheduling demands. In this case it is worth computing the exact time points (referred to throughout the paper as T_1 and T_2) at which we can safely switch the algorithms. This provides an O(1) method of dealing with the switch at runtime. This mathematical method has drawbacks. First of all, it is hard to build a completely accurate model for the following two function families:

- f_j. The job rate function family is probably our biggest issue. It is hard to know in advance how many jobs will arrive at the cluster at a specific time. We could build an approximation from the data collected so far, but sometimes we deal with highly dynamic systems where long term validity is not ensured.
For instance, even one of the major companies selling cloud based services had trouble serving all its users in 2008, when a Black Friday offer generated an unexpectedly high number of requests. In that particular case, there was too little specific data to accurately estimate the impact of the offer;
- g_j. This set of functions often depends on a certain cluster configuration and on the f_j. Should any of these change, the g_j family needs to be recomputed (or re-approximated), usually by simulation.

The implementation of a hybrid algorithm introduces the idea of having specialized algorithms for certain conditions and a generic scheduler for the rest of the time. Using this approach, we were able to increase the throughput in our simulation. The genetic algorithm we implemented was not good enough to schedule more jobs than the hybrid, but a single scheduling algorithm has the big advantage of maintainability. Within our model it is fairly simple to detect the optimal conditions for AO, but for more sophisticated hybrids the task of differentiating between many special conditions could turn out to be hard. Future work in this field may include mathematical tools that allow us to assess the asymptotic behavior of scheduling algorithms, in order to provide a more objective way of comparing different algorithms. This paper does not provide error

estimation for the real case, as all the math is done asymptotically (for instance, the throughputs denoted by the h_j functions


hold as t → ∞). This could also be addressed in the future, so that the approximations resemble the real environment more closely. It would also be interesting to add the idea of provisioning (adding resources to the system in real time), as modern clusters are able to continue serving users while accepting new resources.

Acknowledgments

The research presented in this paper is supported by the projects: "SideSTEP – Scheduling Methods for Dynamic Distributed Systems: a self-* approach", ID: PN-II-CT-RO-FR-2012-1-0084; the CyberWater grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012; MobiWay: Mobility Beyond Individualism: an Integrated Platform for Intelligent Transportation Systems of Tomorrow – PN-II-PT-PCCA-2013-4-0321; and clueFarm: Information system based on cloud services accessible through mobile devices, to increase product quality and business development farms – PN-II-PT-PCCA-2013-4-0870. We would like to thank the reviewers for their time and expertise, constructive comments and valuable insight.
