Security, energy, and performance-aware resource allocation mechanisms for computational grids

Joanna Kołodziej a,∗, Samee Ullah Khan b, Lizhe Wang c, Marek Kisiel-Dorohinicki d, Sajjad A. Madani e, Ewa Niewiadomska-Szynkiewicz f, Albert Y. Zomaya g, Cheng-Zhong Xu h

a Institute of Computer Science, Cracow University of Technology, ul. Warszawska 24, 31-155 Cracow, Poland
b NDSU-CIIT Green Computing and Communications Laboratory, North Dakota State University, Fargo, ND 58108, USA
c Center for Earth Observation and Digital Earth, Chinese Academy of Sciences, Beijing, China
d AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Cracow, Poland
e COMSATS Institute of Information Technology (CIIT), Abbottabad 22060, Pakistan
f Institute of Control and Computation Engineering, Warsaw University of Technology, ul. Nowowiejska 15/19, Warsaw, Poland
g School of Information Technologies, University of Sydney, Sydney, NSW 2006, Australia
h Department of Electrical and Computer Engineering, Wayne State University, Detroit, MI, USA
Article history: Received 16 February 2012; Received in revised form 23 July 2012; Accepted 14 September 2012.

Keywords: Distributed cyber physical systems; Secure computational grid; Resource reliability; Scheduling; Energy optimization; Dynamic voltage scaling; Evolutionary algorithm

Abstract. Distributed Cyber Physical Systems (DCPSs) are networks of computing systems that utilize information from their physical surroundings to provide important services, such as smart health, energy-efficient grid and cloud computing, and smart security-aware grids. Ensuring energy efficiency, thermal safety, and long-term uninterrupted computing operation increases the scalability and sustainability of these infrastructures. Achieving this goal often requires researchers to harness an understanding of the interactions between the computing equipment and its physical surroundings. Modeling these interactions can be computationally challenging given the resources on hand and the operating requirements of such systems. In this paper, we define independent batch scheduling in a Computational Grid (CG) as a three-objective global optimization problem with makespan, flowtime and energy consumption as the main scheduling criteria, minimized under different security constraints. We use the Dynamic Voltage Scaling (DVS) methodology for reducing the cumulative energy utilized by the system resources. We develop six genetic-based single- and multi-population metaheuristics for solving the considered optimization problem. The effectiveness of these algorithms has been empirically verified in two different grid architectural scenarios, in both static and dynamic modes.
1. Introduction

The concepts of today's cyber physical systems have grown far beyond the original models of supercomputing centers and conventional distributed computing systems, such as grids and clusters. Future generation Computational Grids (CGs) are designed as multilevel and multilayer platforms with a large number of components of various types. These components are not restricted to the conventional computational devices and data centers. In fact,
modern grids must provide a wide range of services according to the requirements of individual users. Due to the high heterogeneity of these users and resources, the grid managers in one locality (geographical or managerial) might have only limited control over the system components [1]. Another problem is the disproportion between resource availability and resource provisioning in highly distributed cyber physical systems. This disproportion can imply a severe increase in energy consumption during the execution of grid applications and the realization of tasks submitted by the grid end users [2]. All these issues make the development of intelligent, scalable resource and task management techniques essential in today's grid computing. Scheduling in highly heterogeneous cyber physical systems may be considered as a family of NP-complete optimization problems. Depending on the restrictions imposed by the application needs, the complexity of the problem is determined by the number of objectives to be optimized (single vs. multi-objective), the type of the environment (static vs. dynamic), the
processing mode (immediate vs. batch), and task interrelations (independence vs. dependency). Current efforts in grid computing research focus on the design of new grid schedulers that can efficiently optimize the standard scheduling objectives, such as makespan, flowtime and resource utilization, but can also fulfill the security requirements of the grid users and minimize the energy consumed by all of the system components. These schedulers should also capture the complexity of the whole system and provide meaningful measures for a wide range of grid applications and services. Energy-efficient and security-aware scheduling in CGs therefore becomes a complex research and engineering endeavor, mainly due to the different priorities and preferences of the grid users and resource owners. The computational intractability of the scheduling problems in today's grids calls for heuristic approaches as effective means for designing risk-resilient, energy-aware grid schedulers by trading off the different scheduling requirements, constraints and scenarios. Recent literature leverages the capability of Genetic Algorithms (GAs) to provide satisfactory green computing solutions as well as security-aware scheduling by tackling the aforementioned scheduling challenges. In this paper we address the problem of minimizing the energy consumed in the scheduling and execution of a batch of independent tasks submitted to the grid environment. This energy consumption is monitored in different grid scenarios based on the security requirements specified by the grid users. Formally, this scheduling issue is defined as a multi-objective Independent Batch Job Scheduling problem in CGs; it illustrates the simplest scheduling scenario, which, however, can fairly model many real-life approaches. The contribution of this paper is threefold:
• The definition of a generic independent batch scheduling model in CGs with two conventional scheduling criteria, namely Makespan and Flowtime, and, additionally, security and the cumulative energy consumed in the system as new scheduling objective functions.
• The development of six single- and multi-population energy-aware GA-based schedulers working in risky and secure scheduling scenarios. The schedulers are integrated within the Sim-G-Batch grid simulator.
• The evaluation of the proposed schedulers in static and dynamic grid scenarios and the measurement of their performance by observing the Makespan, Flowtime, failure and relative energy consumption improvement rates.

The generic scheduling model presented in this paper is an extended version of our previously developed and published models [3,4] for energy-aware and risk-resilient scheduling in very simple grid architectures. The extension consists in the aggregation of the security and energy criteria with the conventional scheduling objective functions, and in the development of multi-population GA-based security- and energy-aware schedulers. Additionally, we have developed a security- and energy-aware Sim-G-Batch grid toolkit for the simulation of all considered scheduling scenarios and grid infrastructures. This software is composed of two basic modules, namely the simulator and scheduler units, which can be easily updated and restructured to adapt the whole application to various types of multicriteria scheduling problems solved by a wide class of metaheuristics.

The paper is organized as follows. State-of-the-art security-aware and energy-aware grid scheduling with genetic-based approaches is presented in Section 2. The generic architecture of the security-aware grid is defined in Section 3. The main scheduling attributes and scheduling scenarios are specified in Section 4. Section 5 defines the framework used for the
implementation of the multi-population genetic schedulers for the empirical evaluation. The results of this evaluation, along with a brief characterization of the grid simulator, are presented in Section 6. The paper is concluded in Section 7.

2. Related work

Energy-aware management of large-scale dynamic cyber physical systems has become a popular research issue in the last decade [5]. The proposed management methodologies can be classified into two main groups, namely static management technologies and dynamic management methods [6,7]. Static methods usually work at the hardware level and aim at replacing the physical computational nodes or their components with green-energy devices with low-power chips, batteries and, recently, nano-processors [8]. The number of idle nodes can be significantly reduced in such systems, but, on the other hand, full knowledge of a stable system configuration is necessary in this case [9]. The cumulative energy utilized by a large-scale cyber physical system can be effectively reduced by modulating the power supplied to its computational and data storage devices. This effect can be achieved by equipping the computational nodes with Dynamic Voltage and Frequency Scaling (DVFS) modules [10]. DVFS-based modeling has recently become a key dynamic power management method supporting energy-efficient scheduling and resource and data management in grids and large-scale data systems [11]. The replication and storage of huge amounts of data in data centers and data server farms is usually needed to provide the desired reliability of the whole system. Low-power hardware techniques must in this case be supported by intelligent decision making, resource management and task scheduling mechanisms. The configuration of the decision variables as well as of the distributed system components is a challenging issue if energy optimization is carried out according to the users' and resource providers' requirements and conditions. Khan and Ahmad in [2] propose a game-theoretical solution for the simultaneous optimization of system performance and energy consumption in large data centers. They simulate a simple non-cooperative game of agents that represent the data servers. The Nash equilibrium states of the game, in their case, guarantee system-wide performance. The authors have extended and improved their generic model by including some level of cooperation among the agents. All these game scenarios have made their approach an effective solution for the verification of energy proportionality in data centers [12,13], memory-aware computation, data-intensive computation in grids [14,15], and, finally, energy-efficient grid scheduling [10,11]. Another example of a game-theoretical energy-aware resource management system in a highly distributed computational environment (computational grids) is the bargaining game presented in [16]. In this model the resource brokers bargain with resource providers for a lower access price and a longer usage duration. The negotiation process is guided by the end-users' requirements (e.g., deadlines) and can be conducted directly between buyers (end-users) and sellers (resource owners). This example illustrates very well the economic-like relations among the system users and local cluster managers. In most of the above-mentioned DVFS approaches, scheduling has been defined as a classical or dynamic load balancing problem. In such cases linear, dynamic and goal programming are the major optimization techniques (see e.g. [11,12,17,15]).
However, these methods can be ineffective in the case of incomplete, imprecise, fragmentary or overloaded data, which complicates the specification of proper system evaluation criteria, assignment scores, resource availability, and the final collective decisions of the users. In this case, heuristic approaches seem to be an effective means for designing multi-criteria grid or cloud
schedulers by trading off the various preferences and goals of the system users and the resource and service managers. Although genetic-based metaheuristics are very popular global optimization methods and can work very well in highly parametrized environments, there are not many examples of their application to energy-aware scheduling and resource management in cyber physical distributed systems. Shen et al. in [18,19] adapted a conventional single-population genetic cloud and grid scheduler to the simultaneous optimization of energy utilization and tasks' deadlines by including a shadow price mechanism for the evaluation of the individuals in the genetic populations. The ''shadow price'' for a task-machine pair is defined as the average energy consumption per instruction for a processor that can operate at different voltage levels. A Multi-objective Parallel Genetic Algorithm (MOPGA) hybridized with energy-conscious scheduling heuristics (ECS) has been applied by Kessaci et al. in [20] and Mezmaz et al. in [21] to the optimization of the Makespan (as the dominant criterion) and the cumulative energy consumption in classical embedded and cloud systems. The general framework of MOPGA defined in these papers is based on the multi-start genetic algorithm and island models [22]. The voltages and frequencies of the processors are scaled at 16 discrete levels, and the genes in the GA chromosomes are defined by the task-processor labels and the processor voltage. The tasks submitted by the end-users are assumed to be parallel applications modeled by Directed Acyclic Graphs (DAGs). The problem of the simultaneous optimization of several conventional scheduling criteria and energy utilization becomes much more challenging when data processing protection, secure system access and the individual preferences and security requirements of the system's users must be taken into account. Many well-known scheduling approaches for grid and cloud computing largely ignore the security factor, with only a handful of exceptions. The main reason may be the high cost of providing effective authentication procedures in grids and private clouds [23]. The concept of Virtual Organizations does not solve the problem of protecting the personal data of the system users, especially when that data must be replicated between geographically dispersed clusters. Some efforts have been made to develop intelligent methodologies for monitoring the execution of the system's applications and detecting resource failures due to security restrictions [24]. Abawajy et al. have shown in [25] that simple replication of the submitted tasks can keep the probability of satisfying the security requirements and of successful job execution at a high level. However, this model does not seem to be a good solution for optimizing the energy consumed in the whole global system. In grid computing, most of the developed security and task abortion mechanisms are implemented as external procedures separated from the core of the scheduling system. In [26] the authors define security as an additional scheduling criterion in online grid scheduling and aggregate this criterion with conventional scheduling objective functions such as Makespan and resource utilization. In [27,26] the authors considered the risky and insecure conditions in online scheduling in CGs caused by software vulnerability and distrusted security policies. They apply the game model introduced in [28] for simulating the resource owners' selfish behavior.
The results presented in [26] are extended by Wu et al. in [29]. The authors consider the heterogeneity of fault-tolerance mechanisms in security-assured grid scheduling and define four types of genetic-based online schedulers for the simulation of fault-tolerance mechanisms. Other game-theoretical approaches are presented in [30,4]. In these papers the scheduling problem is interpreted as a difficult decision problem for grid users working at different levels of the system. The users' decisions and the fundamental features arising in their behavior, such as cooperativeness, trustfulness and symmetric
and asymmetric roles, were modeled based on the game theory paradigm. Although the users' preferences and strategies and several conventional scheduling criteria are widely analyzed in the above-mentioned publications, the problem of energy conservation is not addressed there and no green technologies are implemented.

3. Generic model of a secure grid

Although the CG remains the most popular grid environment, today's grid systems have grown far beyond their original intention. Modern grid infrastructures can benefit various complex applications, such as collaborative engineering, data processing and exploration, and e-Science applications. A grid system is usually modeled as a multi-layer architecture with a hierarchical management system that consists of two or three levels, depending on the system knowledge and the access to data, system services and resources. Buyya et al. in [31] define the following four layers of grid components:
• grid ‘‘fabric’’ layer composed of the grid resources, services, and local resource administration systems,
• grid core middleware, which provides services related to security and access management, scheduling and remote task submission,
• grid user layer, which is defined as the set of grid end-users, resource and service providers and system administrators, and
• grid applications layer.

The grid middleware and user layers play the most important role in multi-criterion scheduling and resource management. The fabric layer can be designed as a set of various local computational machines, networks and computational clusters (grid sites). All these layers can be easily identified in each grid cluster, which is the main local component of the whole global infrastructure. The global grid management system can therefore be defined as a compromise between centralized and decentralized resource and service management systems, where the scheduling and resource allocation decisions are made at the global, inter-site and intra-site levels. The generic model of the two- and three-level management structure in a grid cluster is presented in Fig. 1. Two-level systems consist of the global level with a meta-broker, and the intra-site levels, where the resource sites are managed by the resource owners and local resource and service providers. The grid end-users submit their applications to the meta-broker, who, based on the recommendations and the information on resource availability and the current state of the cluster, is responsible for mapping the tasks to machines [32]. The meta-brokers from different clusters can collaborate and exchange information about the end-users' requirements and, in the case of machine failures observed during the execution of assigned tasks, the tasks can be sent to another cluster and re-scheduled. In a three-level system, there is a middle inter-site level between the meta-broker (which in this case plays the role of the cluster's global scheduler) and the inter-site grid resource alliance [28]. At this level, the local grid managers collect information about resource reliability from the resource owners, define ''grid site reputation indexes'' and forward them to the global scheduler. At the global level the scheduler assigns the tasks adaptively, using the scheduling algorithms and the resources suggested on the basis of their availability. In the case of security-aware scheduling, all system managers at the global and inter-site levels must additionally analyze the security requirements, defined as security demand parameters for the execution of tasks, and the requests of the CG users for trustful resources available within the system. The local system managers or local grid site managers define the ''reputation'' trust level indexes of the machines and send their recommendations to
the level managers, who provide the scheduling and monitor the resource allocation. The trust level and security demand parameters are generated by aggregating several scheduling and system attributes. These parameters depend heavily on the security policy, accumulated resource or grid cluster ''reputation'', self-defense capability, attack history, special users' requirements, and peer authentication [26]. Song et al. in [27] have developed a fuzzy-logic trust model, where all scheduling and system attributes are aggregated into simple scalar parameters. The task security demand in this model is supplied by the user's programs as requests for authentication, data encryption, access control, etc. We used this model for the specification of new characteristics of the tasks and resources in the grid system, namely the security demand and trust level vectors, which are defined in the next section.

Fig. 1. Two-level (a) and three-level (b) hierarchical architecture of clusters in computational grids.

4. Problem statement, scheduling criteria, and objective functions

Scheduling in large-scale distributed cyber physical systems, such as computational grids, can be considered as a family of various global optimization problems defined with respect to different properties of the underlying system environment and various requirements of the users. To achieve the desired performance of the whole system, both the users' preferences and the environment constraints must be ''embedded'' into the scheduling mechanism [33,34]. There are three main scheduling attributes that must be specified to define a particular task-machine mapping problem in CGs, namely:

• static or dynamic grid environment: in the first case the numbers of tasks and resources remain unchanged, while in the dynamic scenario tasks and resources may be added to or removed from the system;
• batch or immediate task processing policy: the grid jobs can be scheduled as sets of tasks (batches) or immediately after their submission to the system by the end-users; and
• tasks' interrelations: tasks may be submitted and computed independently in the grid system, or considered as parallel applications with priority constraints and interrelations among the application components (usually modeled by Directed Acyclic Graphs (DAGs)).

To the best of our knowledge there is no universal notation for the classification of scheduling problems in distributed cyber systems. In [35] the authors extended the standard Graham's [36] and Brucker's [37] classifications of scheduling problems and proposed the following three-component basic notation:

α | β | γ,   (1)

where α characterizes the resource ''fabric'' layer, β specifies the main scheduling attributes, and γ denotes the scheduling criteria. In this paper we address the Independent Batch Scheduling problem, where tasks are grouped into batches and can be executed independently in a hierarchically structured static or dynamic grid environment. Based on Eq. (1), an instance of this problem can be expressed as follows:

α = Rm | β = [{batch, indep, (stat, dyn)}] | γ = (objectives, hier),   (2)

where:

• Rm – in Graham's notation, indicates that tasks are mapped onto (parallel) resources of various speeds,¹
• batch – means that tasks are processed in batch mode,
• indep – means that there are no interrelations among the submitted tasks and all users submit their tasks independently of each other,
• (stat, dyn) – indicates the static and dynamic grid scheduling modes,
• objectives – denotes the set of scheduling objective functions defined for the considered problem,
• hier – indicates that the scheduling objectives are optimized in hierarchical mode.

¹ In independent grid scheduling it is usually assumed that each task may be assigned to just one machine.

In this paper the objectives set is composed of three main scheduling objective functions, namely:

• C_max – Makespan – the dominant scheduling criterion,
• C_sum – Flowtime – the Quality of Service (QoS) criterion, and
• E_I (E_II) – total energy consumption (E_I or E_II is selected depending on the scheduling scenario, see Section 4.4).

Independent batch scheduling can be very useful in illustrating many real-life scheduling approaches, such as the processing of large data in banking or medical systems, or data mining in physics and bio-informatics, among many others.

4.1. Expected time to compute (ETC) matrix model

Independent batch scheduling can be implemented by using the Expected Time to Compute (ETC) matrix model [38]. Let us first denote by n and m the numbers of tasks in a given batch and of machines available in the system. We will use N = {1, . . . , n} and M = {1, . . . , m} as the notation for the sets of task and machine labels, respectively. A task in our security-aware scheduling model can be specified as a monolithic application, a parallel application with a very complex internal structure, or a meta-task consisting of various independent components. The general characteristics of a task j, j ∈ N, are given by the following two parameters:
• wl_j – the task workload, expressed in Millions of Instructions (MI);
• sd_j – the security demand parameter.

We denote by WL = [wl_1, . . . , wl_n] and SD = [sd_1, . . . , sd_n] the workload and security demand vectors for all tasks in the batch. The values of the security demand parameters are real fractions within the range [0, 1], where the value 0 refers to the lowest and the value 1 to the highest security requirements for the task execution. Similar characteristics can be provided for the grid machines, for which the following parameters must be specified:

• cc_i – the computing capacity of machine i, expressed in Millions of Instructions Per Second (MIPS);
• ready_i – the ready time of machine i, which expresses the time needed for reloading machine i after finishing the last assigned task;
• tl_i, tl_i ∈ [0, 1] – the trust level parameter of machine i; it plays the role of the ''reputation'' index of machine i in the sense of its reliability in secure task execution.

We denote by CC = [cc_1, . . . , cc_m] and ready_times = [ready_1, . . . , ready_m] the computing capacity and ready time vectors. The term ''machine'' refers to a single- or multiprocessor computing unit or even to a local small-area network. A task can be successfully completed on a resource when the following security assurance condition is satisfied: sd_j ≤ tl_i for the given task-machine pair (j, i). For each pair (j, i) of task-machine labels, the coordinates of the WL and CC vectors can be used for an approximation of the completion time of task j on machine i. These completion times are denoted by ETC[j][i] in the ETC matrix model and are calculated in the following way:

ETC[j][i] = wl_j / cc_i.   (3)

All ETC[j][i] parameters are interpreted as the elements of an ETC matrix, ETC = [ETC[j][i]]_{n×m}. The elements in the rows of the ETC matrix define the estimated completion times of a given task on different machines, and the elements in a column of this matrix are interpreted as the approximate completion times of different tasks on a given machine. The coordinates of the WL and CC vectors can be generated by using Gaussian distributions [39] with some global parameters expressing the distribution (heterogeneity) of tasks and machines in the system.
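As a simple illustration (not code from the paper's Sim-G-Batch simulator), the ETC model of Eq. (3) can be tabulated directly from the workload and computing-capacity vectors; the numeric values below are placeholders.

```python
def expected_time_to_compute(wl, cc):
    """ETC[j][i] = wl_j / cc_i (Eq. (3)); rows correspond to tasks, columns to machines."""
    return [[wl_j / cc_i for cc_i in cc] for wl_j in wl]

# two tasks (workloads in MI) on three machines (capacities in MIPS)
etc = expected_time_to_compute(wl=[4.0e8, 1.0e8], cc=[4000.0, 5000.0, 6000.0])
print(etc[0])   # estimated completion times of task 0 on each machine
```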
4.2. Schedule representation and main scheduling criteria

The ETC matrix model is very useful for the formal definition of the standard scheduling objective functions, namely Makespan and Flowtime, in terms of the completion times of the machines. Let us first introduce the two main representations of schedules in CGs. The schedules in these representations are encoded as vectors of machine or task labels. The machine labels are used in the direct encoding method, where the schedule is defined in the following way:

S = [i_1, . . . , i_n]^T,   (4)

where i_j ∈ M denotes the label of the machine on which the task labeled by j is executed. In some cases, however, it is convenient to define the schedule as a permutation vector of the tasks in the batch, that is to say:

Sch = [Sch_1, . . . , Sch_n]^T,   (5)

where Sch_i ∈ N, i = 1, . . . , n. This type of encoding is called a permutation-based encoding scheme. The schedule Sch is the vector of labels of the tasks assigned to the machines. For each machine, the labels of the tasks assigned to it are sorted in ascending order by the completion times of the tasks. In this representation some additional information about the numbers of tasks assigned to each machine is required. The total number of tasks assigned to a machine i is denoted by \widehat{Sch}_i and is interpreted as the i-th coordinate of an assignment vector \widehat{Sch} = [\widehat{Sch}_1, . . . , \widehat{Sch}_m]^T, which in fact defines the loads of the grid machines. We used both encoding methods for the specification of the main scheduling objectives and the genetic operations in the GA-based schedulers (see Section 5).

Formal definitions of all considered scheduling criteria are based on the concept of the completion times of the machines. The completion time of machine i is denoted by completion[i] and is interpreted as the sum of the ready time parameter of this machine and the cumulative execution time of all tasks actually assigned to this machine, that is to say:

completion[i] = ready_i + \sum_{j \in Tasks(i)} ETC[j][i],   (6)
where Tasks(i) is the set of tasks assigned to machine i. The Makespan C_max is expressed as the maximal completion time over all machines, that is:

C_max = \max_{i \in M} completion[i].   (7)

The cumulative Flowtime for a given schedule is defined as the sum of the workflows of the sequences of tasks on the machines, that is to say:

C_sum = \sum_{i \in M} ( ready_i + \sum_{j \in Sorted[i]} ETC[j][i] ),   (8)

where Sorted[i] denotes the set of tasks assigned to machine i, sorted in ascending order by the corresponding ETC values.
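The two criteria above are straightforward to evaluate from the direct encoding of Eq. (4). The following sketch is illustrative only (the function names are not taken from the paper's software) and implements Eqs. (6)-(8) literally.

```python
def completion_times(schedule, etc, ready):
    """schedule[j] = machine assigned to task j (direct encoding, Eq. (4))."""
    comp = list(ready)                    # completion[i] starts at ready_i (Eq. (6))
    for j, i in enumerate(schedule):
        comp[i] += etc[j][i]
    return comp

def makespan(schedule, etc, ready):
    return max(completion_times(schedule, etc, ready))   # Eq. (7)

def flowtime(schedule, etc, ready):
    # Eq. (8) read literally: the sum over machines of ready_i plus the
    # cumulative execution time of the tasks assigned to machine i.
    return sum(completion_times(schedule, etc, ready))
```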
An extended list of other conventional scheduling criteria defined in terms of completion times and the ETC matrix model can be found in [40].

4.3. Security conditions

A grid cluster or a grid resource may not be accessible to the global meta-scheduler when it is infected by intrusions or malicious attacks. The security requirements of the end-users can also be exorbitant. In such cases, failures of machines during the execution of tasks can be observed. Based on the security demand and trust level parameters, the probabilities of such failures can be effectively estimated. Let us denote by Pr_f the Machine Failure Probability matrix, whose elements are interpreted as the probabilities of failures of the machines during the task executions due to high security restrictions. These probabilities are denoted by Pr_f[j][i] and are calculated by using the negative exponential distribution function, that is to say:
Pr_f[j][i] =
  0,                            sd_j ≤ tl_i,
  1 − e^{−α(sd_j − tl_i)},      sd_j > tl_i,      (9)
where α is interpreted as a failure coefficient and is a global parameter of the model. The scheduler has two options for initializing its work: (a) to analyze the Machine Failure Probability matrix in order to minimize the failure probabilities for the task-machine pairs; or (b) to perform conventional scheduling without any preliminary analysis of the security conditions, abort the task scheduling in the case of a machine failure, and reschedule the affected task on another resource. These two strategies define two different scenarios of security-aware grid scheduling, namely the secure and the risky scenario.
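A minimal sketch of the failure model of Eq. (9); the value α = 3 used in the example is the failure coefficient later listed in Table 2, everything else is illustrative.

```python
import math

def failure_probability(sd, tl, alpha=3.0):
    """Machine failure probability Pr_f[j][i] of Eq. (9) for a task with
    security demand sd scheduled on a machine with trust level tl."""
    if sd <= tl:
        return 0.0
    return 1.0 - math.exp(-alpha * (sd - tl))

# a demanding task on a weakly trusted machine
print(failure_probability(sd=0.9, tl=0.4))   # ~0.78
```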
Secure scenario. In the secure scenario, the security assurance condition is verified for all task-machine pairs before the allocation of the physical resources. This verification usually delays the whole process and may also result in an extension of the predicted execution times of the tasks on the machines. Although the preliminary analysis of the security conditions may significantly prevent machine failures, there is of course no guarantee that each task will be successfully executed. The local resource managers and cluster schedulers aim to minimize the values of the failure probabilities for the task-machine pairs. The verification of the security conditions and the possible failures of the machines² generate some additional ''cost'' of scheduling, expressed in time units. The completion time of machine i is denoted by completion_s[i] and can be defined as follows:

completion_s[i] = ready_i + \sum_{j \in Tasks(i)} (1 + Pr_f[j][i]) · ETC[j][i].   (10)

The formulas for calculating the values of the Makespan and Flowtime functions can in this case be expressed in the following way:

C_max(sec) = \max_{i \in M} completion_s[i],   (11)

C_sum(sec) = \sum_{i \in M} ( ready_i + \sum_{j \in Sorted[i]} (1 + Pr_f[j][i]) · ETC[j][i] ).   (12)

Risky scenario. In the risky mode, the prior verification of all security assurance and resource reliability conditions is ignored. In this case, the expected probabilities of failures of the machines during the execution of the assigned tasks are higher than in the secure mode, and many tasks may have to be re-scheduled. This re-scheduling procedure is realized similarly to the main scheduling process in the secure mode. The cumulative completion time of machine i (i ∈ M) can in this case be defined as follows:

completion_r[i] = completion[i] + completion_{s_res}[i],   (13)

where completion[i] is calculated by using Eq. (6) for the tasks primarily assigned to machine i, and completion_{s_res}[i] is the completion time of machine i calculated by using Eq. (10) for the tasks re-assigned to machine i from the other resources (rescheduled tasks).³ The formulas for the Makespan and Flowtime in this scenario are defined in the following way:

C_max(risk) = \max_{i \in M} completion_r[i],   (14)

C_sum(risk) = \sum_{i \in M} ( ready_i + \sum_{j \in Sorted[i]} ETC[j][i] + \sum_{j \in Sorted_res[i]} (1 + Pr_f[j][i]) · ETC[j][i] ),   (15)

where Sorted_res[i] is the set of rescheduled tasks assigned to machine i, sorted in ascending order by the corresponding ETC values.

We assume in this work the hierarchical optimization mode with Makespan as the dominant scheduling criterion. In this case, the Flowtime function is minimized in both the secure and the risky scenario subject to the following constraints:

• in the secure scenario:

ready_i + \sum_{j \in Sorted[i]} (1 + Pr_f[j][i]) · ETC[j][i] ≤ C_max(sec)   ∀ i ∈ M;   (16)

• in the risky scenario:

ready_i + \sum_{j \in Sorted[i]} ETC[j][i] + \sum_{j \in Sorted_res[i]} (1 + Pr_f[j][i]) · ETC[j][i] ≤ C_max(risk)   ∀ i ∈ M.   (17)

² In such a case the task must be rescheduled or is moved to the system task backlog queue, where it can be scheduled in the next batch.
³ The component completion_{s_res}[i] in Eq. (13) is not calculated when all tasks are successfully executed on the grid machines.
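A small sketch of how the security-aware completion times of Eqs. (10) and (13) could be evaluated; whether the ready time should be counted again for the rescheduled tasks is not spelled out in the text, so the sketch counts it only once, and all function names are illustrative.

```python
def secure_completion(tasks, i, etc, ready, prf):
    """Eq. (10): ready_i plus (1 + Pr_f[j][i]) * ETC[j][i] for every assigned task j."""
    return ready[i] + sum((1.0 + prf[j][i]) * etc[j][i] for j in tasks)

def risky_completion(primary, rescheduled, i, etc, ready, prf):
    """Eq. (13): plain completion time (Eq. (6)) for the primary tasks plus a
    secure-style term for the tasks re-assigned to machine i."""
    base = ready[i] + sum(etc[j][i] for j in primary)                  # Eq. (6)
    resub = sum((1.0 + prf[j][i]) * etc[j][i] for j in rescheduled)    # Eq. (10), ready_i counted once
    return base + resub

def secure_makespan(assignment, etc, ready, prf):
    """Eq. (11); assignment maps machine index -> list of assigned task indices."""
    return max(secure_completion(tasks, i, etc, ready, prf)
               for i, tasks in assignment.items())
```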
4.4. DVFS-based energy model

The energy utilization model used for simulating the energy-aware grid scheduling presented in this paper is based on the Dynamic Voltage and Frequency Scaling (DVFS) hardware technique [41], used for modulating the voltage supplies and frequencies of the grid computational devices. The DVFS concept is primarily based on the power consumption model employed in complementary metal-oxide semiconductor (CMOS) logic circuits [42], where the capacitive power Pow of a given machine depends on the voltage supply and the machine frequency. Formally, this dependency relation can be defined as follows:

Pow = A · C · v² · f,   (18)

where A is the number of switches per clock cycle, C is the total capacitance load of the machine, v is the supply voltage and f is the frequency of the machine. The energy consumed by this machine for the computation of task j can then be derived using the following formula:

E_j = \int_0^{F[j]} Pow(t) dt,   (19)

where F[j] is the time of finalizing task j on the machine. It can be observed that a reduction of the supply voltage and operational frequency of the machine results in a reduction of the energy utilized by this machine. DVFS is a hardware technique, so each machine in the system must be equipped with a DVFS module [43].

Table 1
DVFS levels for three machine classes.

Level | Class I            | Class II           | Class III
      | Volt. | Rel. freq. | Volt. | Rel. freq. | Volt. | Rel. freq.
0     | 1.5   | 1.0        | 2.2   | 1.0        | 1.75  | 1.0
1     | 1.4   | 0.9        | 1.9   | 0.85       | 1.4   | 0.8
2     | 1.3   | 0.8        | 1.6   | 0.65       | 1.2   | 0.6
3     | 1.2   | 0.7        | 1.3   | 0.50       | 0.9   | 0.4
4     | 1.1   | 0.6        | 1.0   | 0.35       |       |
5     | 1.0   | 0.5        |       |            |       |
6     | 0.9   | 0.4        |       |            |       |

Table 1 shows exemplary values of the voltage and frequency parameters for 16 DVFS levels and three ''energetic'' categories of machines. These data illustrate typical realistic scenarios well and have been used in many research and engineering projects for the evaluation of load balancing and scheduling methodologies in data centers, grids and, recently, also cloud systems [12,2,15]. We used these data in the empirical analysis presented in Section 6.

Let us denote by s_i the energetic class of machine i. Based on the data presented in Table 1, each energetic class can be represented by the following meta-column vector of the frequency and voltage
parameters at the DVFS levels specified for this class, that is to say:

s_i = [(v_{s_0}(i), f_{s_0}(i)); . . . ; (v_{s_{l(max)}}(i), f_{s_{l(max)}}(i))]^T,   (20)

where v_{s_l}(i) refers to the voltage supply for machine i at level s_l, f_{s_l}(i) is a scaling parameter for the frequency of the machine at the same level s_l, and l_max is the number of levels in the class s_i. The parameters {f_{s_0}(i), . . . , f_{s_{l(max)}}(i)} define the relative frequencies of machine i at the levels s_0, . . . , s_{l(max)} and are expressed as scalar parameters in the range [0, 1]. The reduction of the machine frequency and its supply voltage can lead to an extension of the computation times of the tasks executed on that machine. For a given task-machine pair (j, i), the completion times of task j on machine i at the different DVFS levels in the class s_i can be interpreted as the coordinates of a vector \widehat{ETC}[j][i], which is defined in the following way:

\widehat{ETC}[j][i] = [ (1/f_{s_0}(i)) · ETC[j][i], . . . , (1/f_{s_{l(max)}}(i)) · ETC[j][i] ],   (21)

where ETC[j][i] is the expected completion time of task j on machine i (see Eq. (3)). Based on Eqs. (18), (19) and (21), the energy utilized for completing task j on machine i at level s_l can be defined as the scalar product of the number of switches per clock cycle, the total capacitance load, the frequency and the squared voltage at level s_l, and the estimated completion time of task j on machine i. That is to say:

E_{ji}(s_l) = γ · (f_{s_l}(i))_j · f · [(v_{s_l}(i))_j]² · \widehat{ETC}[j][i][s_l],   (22)

where γ = A · C is a constant parameter for a given machine class, (v_{s_l}(i))_j is the voltage supply value for class s_i and machine i at level s_l for computing task j, and (f_{s_l}(i))_j is the corresponding relative frequency of machine i. The cumulative energy utilized by machine i for the completion of all tasks assigned to this machine is defined in the following way:

E_i = γ · f · [ \sum_{j \in Tasks(i)} \sum_{l \in \widehat{L}_i} ([(v_{s_l}(i))_j]² · ETC[j][i]) + [v_{s_{max}}(i)]² · ready_i + f_{s_{min}}(i) · [v_{s_{min}}(i)]² · Idle[i] ],   (23)

where Idle[i] denotes the idle time of machine i, and \widehat{L}_i denotes the subset of DVFS levels used for the tasks assigned to machine i. All additional machine frequency transition overheads are ignored. These overheads usually take a negligible amount of time (e.g., 10–150 ms, see [44]) and do not bear down on the overall ETC model with an active ''energetic'' module. The average cumulative energy utilized by the whole grid system for the completion of all tasks from a given batch is defined as the average of the energies E_i calculated over all machines active in the scheduling process, that is:

E_total = ( \sum_{i=1}^{m} E_i ) / m.   (24)
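For concreteness, a sketch of the per-task and per-machine energy terms of Eqs. (21)-(23), using the Class I column of Table 1; γ = A·C and the nominal frequency f are placeholder constants, not values from the paper.

```python
# (relative frequency, supply voltage) per DVFS level, Class I of Table 1
CLASS_I = [(1.0, 1.5), (0.9, 1.4), (0.8, 1.3), (0.7, 1.2),
           (0.6, 1.1), (0.5, 1.0), (0.4, 0.9)]

GAMMA = 1.0e-9   # gamma = A * C, placeholder value
F_NOM = 1.0e9    # nominal machine frequency f, placeholder value

def task_energy(etc_ji, level, dvfs=CLASS_I, gamma=GAMMA, f=F_NOM):
    """Eq. (22): the completion time is first stretched by 1/f_scaling (Eq. (21))."""
    f_scale, volt = dvfs[level]
    stretched = etc_ji / f_scale
    return gamma * f_scale * f * volt ** 2 * stretched

def machine_energy(tasks, ready, idle, dvfs=CLASS_I, gamma=GAMMA, f=F_NOM):
    """Eq. (23): tasks is a list of (ETC[j][i], chosen DVFS level) pairs; the ready
    period is charged at the maximal level and the idle period at the minimal level."""
    busy = sum(task_energy(t, lvl, dvfs, gamma, f) for t, lvl in tasks)
    _, v_max = dvfs[0]           # level 0: highest voltage/frequency
    f_min, v_min = dvfs[-1]      # last level: lowest voltage/frequency
    return busy + gamma * f * (v_max ** 2 * ready + f_min * v_min ** 2 * idle)
```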
We used this model for the specification of two energy-aware scheduling scenarios and for the definition of the scheduling criteria.

4.4.1. Aggregation of the scheduling objectives: two energy-aware scheduling scenarios

Two main energy-aware scheduling scenarios are considered in this work for the analysis of the effectiveness of modulating the power supply and operating frequencies of the machines, namely:

(1) Max–Min Mode, where each machine works at the maximal DVFS level during the execution and computation of tasks, and works at the minimal DVFS level in the idle periods;
(2) Modular Power Supply Mode, where each machine can work at various DVFS levels during the task executions, and works at the minimal DVFS level in the idle periods.

In the former, the consumption of energy depends on the ''energetic'' class of the system devices or services, defined as ''machines'' (resources) in the system. No modifications of the conventional scheduling procedures and standard scheduling objectives, such as Makespan and Flowtime, are needed. In the latter, the optimal power supply level can be specified for each machine, and the energy consumption can subsequently be reduced by diminishing the power supply in the machines while preserving the deadline constraints for the main tasks.

The procedures for the calculation and security-aware optimization of the Makespan, Flowtime and cumulative energy utilized by the system differ in the two aforementioned scheduling scenarios and can be realized as follows. The minimization of the Makespan and Flowtime constitutes the first two steps of the hierarchical scheduling optimization procedure. In Max–Min Mode, the completion time, Makespan and Flowtime are calculated by using Eqs. (13)–(15) in the risky scenario, and Eqs. (10)–(12) in the secure scenario. The idle time of machine i working in Max–Min Mode can be expressed as the difference between the Makespan and the completion time of this machine, that is to say:

Idle_ris[i] = C_max(risk) − completion_r[i]   (25)

in the risky scenario, and

Idle_sec[i] = C_max(sec) − completion_s[i]   (26)

in the secure mode. The calculation of the idle factor for the machine with the maximal completion time (Makespan) is ignored.

In Modular Power Supply Mode, a DVFS level s_l must be specified for each task-machine pair. The formulas for computing the completion times in the secure and risky scenarios at the level s_l can be defined as follows:

completion_s^{II}[i] = ready_i + \sum_{j \in Tasks(i)} (1/f_{s_l}(i)) · (1 + Pr_f[j][i]) · ETC[j][i]   (27)

in the secure mode, and

completion_r^{II}[i] = completion^{II}[i] + completion_{s_res}^{II}[i]   (28)

in the risky mode, where completion_{s_res}^{II}[i] is the completion time calculated for the rescheduled tasks on machine i, and completion^{II}[i] is calculated as follows:

completion^{II}[i] = ready_i + \sum_{j \in Tasks(i)} (1/f_{s_l}(i)) · ETC[j][i].   (29)

We used these formulas for the specification of the Makespan values (C_max(risk))_{II} and (C_max(sec))_{II} in the risky and secure modes, respectively. The Flowtime can be calculated from Eqs. (12) and (15) by replacing the ETC[j][i] components by (1/f_{s_l}(i)) · ETC[j][i]. Finally, the formulas for the idle times can be expressed as follows:

Idle_{II}(sec)[i] = (C_max(sec))_{II} − completion_s^{II}[i],   (30)

Idle_{II}(risk)[i] = (C_max(risk))_{II} − completion_r^{II}[i].   (31)
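A brief sketch of the idle-time bookkeeping of Eqs. (25)-(26) and (30)-(31) and of the Modular-mode secure completion time of Eq. (27); representing the chosen DVFS scaling as a per task-machine matrix f_scale is an assumption about the data layout, not notation from the paper.

```python
def idle_times(completions, cmax):
    """Eqs. (25)-(26), (30)-(31): idle time of each machine is the gap between
    the makespan and that machine's own completion time."""
    return [cmax - c for c in completions]

def modular_secure_completion(tasks, i, etc, ready, prf, f_scale):
    """Eq. (27): like Eq. (10), but every ETC entry is stretched by the relative
    frequency f_scale[j][i] chosen for that task-machine pair."""
    return ready[i] + sum((1.0 + prf[j][i]) * etc[j][i] / f_scale[j][i]
                          for j in tasks)
```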
The third criterion of the scheduling problem addressed in this paper is the minimization of the average energy utilized by the
machines in the system. In Max–Min Mode and the risky scenario, this energy is defined as follows:

E_I(risk) = (1/m) · \sum_{i=1}^{m} γ · completion_r[i] · f · [v_{s_{max}}(i)]² + (1/m) · \sum_{i=1}^{m} γ · f_{s_{min}}(i) · [v_{s_{min}}(i)]² · Idle_ris[i].   (32)

A similar formula can be derived for the average energy utilized in the secure mode, E_I(sec), by replacing completion_r[i] and Idle_ris[i] in Eq. (32) by completion_s[i] and Idle_sec[i] (see Eqs. (10) and (26)). In Modular Power Supply Mode the average cumulative energy is given by Eq. (24):

E_II = E_total = ( \sum_{i=1}^{m} E_i ) / m,   (33)
where E_i is calculated by modifying Eq. (23) according to Eqs. (28)–(31). In all cases the energy is minimized under the assumption that the optimal Makespan and Flowtime values achieved in the first two steps of the process cannot increase.

5. Security-aware genetic-based batch schedulers and the grid simulator

Heuristic methods are well known for their robustness and have been applied successfully to scheduling problems and general combinatorial optimization problems in a variety of fields [45,33,32,34]. Therefore they can be considered good candidates for effective schedulers in CGs and in the wider class of DCP systems. These methods can tackle the various scheduling attributes and the additional energy and security aspects. In this work, we consider the following six representative examples of single- and multi-population genetic-based risk-resilient grid schedulers:

• GA(R) – single-population GA scheduler working in the risky mode;
• GA(S) – single-population GA scheduler working in the secure mode;
• HGS-Sched(R) – multi-population hierarchical GA scheduler working in the risky mode;
• HGS-Sched(S) – multi-population hierarchical GA scheduler working in the secure mode;
• IGA(R) – multi-population island GA scheduler working in the risky mode;
• IGA(S) – multi-population island GA scheduler working in the secure mode.

5.1. Single-population schedulers

The GA(R) and GA(S) algorithms were selected as the most efficient single-population security-aware and energy-aware grid schedulers evaluated empirically in similar grid scenarios in a series of our previous publications [3,30,4]; in those works, however, the experiments were carried out separately for the security and energy criteria. The genetic engines in these single-population schedulers are based on the framework of the classical genetic algorithms used in combinatorial optimization [46,34]. An initial population is generated by using the MCT + LJFR-SJFR method, in which all but two individuals are generated randomly. Those two individuals are created by using the Longest Job to Fastest Resource–Shortest Job to Fastest Resource (LJFR–SJFR) and Minimum Completion Time (MCT) heuristics [47]. In both the risky and the secure implementation of the schedulers we used Linear Ranking selection, Cycle Crossover (CX), Rebalancing Mutation (R) and the Struggle (ST) population replacement mechanism. The crossover and mutation operators are well known and commonly used in grid scheduling; a detailed description of these techniques can be found in [48].

The main aim of using the Struggle replacement mechanism is to eliminate the premature convergence of the schedulers to local optima. In this method, a ''similarity'' relation for the individuals must be specified. This relation determines the ''similarity classes'' of individuals. The worse individuals in a class can then be replaced by the best ones, if such a replacement improves the fitness value. The similarity relation among the individuals is usually defined as a distance limit for the schedules representing those individuals in the schedule domain, interpreted as a metric space. This distance can be measured according to the classical Euclidean metric, as demonstrated in [47]. In this work, we used the Mahalanobis distance [49] for measuring the distances between schedules, according to the following formula:

sim_e(S¹; S²) = \sqrt{ \sum_{j=1}^{n} (S¹[j] − S²[j])² / σ_P² },   (34)

where σ_P is the standard deviation of S¹[j] over the population P. The struggle strategy has been shown to be very effective in solving several large-scale multi-objective problems (see e.g. [50,51]). However, the cost of the implementation of the distance-based ''similarity'' relation can be very high if the distances for each possible pair of chromosomes must be measured and then compared. This cost can be effectively reduced by a hash technique [52], in which a hash table with a task-resource allocation key is created. The value of this key, denoted by K, is calculated as the sum of the absolute values of the differences between each position and its predecessor in the direct representation of the schedule vector (reading the schedule vector in a circular way). The hash function f_hash is defined as follows:

f_hash(K) =
  0,                                   K < K_min,
  N · (K − K_min)/(K_max − K_min),     K_min ≤ K < K_max,
  N − 1,                               K ≥ K_max,        (35)

where K_min and K_max correspond, respectively, to the smallest and the largest value of K in the population, and N is the population size.
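A small sketch of the allocation key and of the hash function of Eq. (35); truncating the middle branch to an integer bucket index is an assumption, since the text does not state the rounding rule.

```python
def allocation_key(schedule):
    """Key K: sum of absolute differences between each position of the direct
    schedule vector and its predecessor, read circularly (index -1 wraps around)."""
    return sum(abs(schedule[j] - schedule[j - 1]) for j in range(len(schedule)))

def f_hash(k, k_min, k_max, pop_size):
    """Eq. (35): map the key K onto one of pop_size hash buckets."""
    if k < k_min:
        return 0
    if k >= k_max:
        return pop_size - 1
    return int(pop_size * (k - k_min) / (k_max - k_min))   # assumed truncation
```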
5.2. Multi-population schedulers

Although the struggle mechanism can reduce the effect of the premature convergence of single-population genetic-based grid schedulers, achieving an effective exploration of the search space with one, often large, population can be difficult in most realistic scenarios. The performance of classical mono-population genetic algorithms designed for large-scale combinatorial optimization problems, such as grid scheduling, can be improved by dividing the global population into several small sub-populations and by running an independent search process (an independent instance of the same genetic algorithm) in each of them separately. This is the main paradigm of the island model (Island Genetic Algorithm, IGA), which is a well-known and broadly used example of a multi-population genetic strategy [22]. The sub-populations of individuals in this model are called ''islands'' or ''demes''. The processes activated in the different demes are synchronized from time to time by the migration procedure, in which, after a fixed
number of iterations of each genetic algorithm in each deme, a part of each population migrates to the neighboring island, usually according to the standard ring topology. We denote by it_d the period between two migrations, and by mig the relative amount of migrating individuals in each population. The mig parameter is called the migration rate, and is calculated as follows:

mig = (m_deme / deme) · 100%,   (36)

where deme is the size of the sub-population in IGA and m_deme is the number of migrating individuals in each deme.
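An illustrative sketch of a ring-topology migration step between the demes; the choice of which individuals migrate is not specified in the text, so the sketch picks them at random, and all names are placeholders.

```python
import random

def migration_rate(m_deme, deme):
    """Eq. (36): percentage of individuals that leave each island per migration."""
    return 100.0 * m_deme / deme

def ring_migration(islands, m_deme, seed=0):
    """Move m_deme randomly chosen individuals from every island to its successor
    on a ring; islands is a list of lists of individuals (schedules)."""
    rng = random.Random(seed)
    outgoing = []
    for isl in islands:
        picked = sorted(rng.sample(range(len(isl)), m_deme), reverse=True)
        outgoing.append([isl[p] for p in picked])
        for p in picked:          # remove migrants, highest index first
            del isl[p]
    for k, migrants in enumerate(outgoing):
        islands[(k + 1) % len(islands)].extend(migrants)
    return islands
```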
The main drawback of applying the IGA algorithm to grid scheduling can be the ''blind'' procedure of decomposing the global population into the system of islands. Some sub-populations may be generated in regions containing no promising solutions; the algorithms activated there are ineffective schedulers, yet they increase the complexity, and in fact the execution time, of the whole island strategy. This problem can be partially solved by the Hierarchic Genetic Scheduler (HGS-Sched), which activates the search processes based on the results of a preliminary scanning of wide regions of the search space. The main concept of HGS-Sched is the generation of a multi-level decision tree, where the branches are interpreted as models of dependent evolutionary processes. The search process in HGS-Sched is initialized by activating a scheduler with a low search accuracy. This scheduler is the main module of the entire strategy and is responsible for the ''management'' of the general structure of the tree⁴ and for the exploration of new regions in the optimization domain. The accuracy of the search in the HGS-Sched branches is defined by the branch degree parameter, and depends on the mutation rate and, in fact, on the size of the population, which can be different in branches of different degrees. Fig. 2 depicts a simple graphical representation of a 3-level structure of HGS-Sched. New branches are sprouted from the parental ones by using the sprouting procedure (SO) after the execution of an α-generation evolutionary process (the α-periodic metaepoch Met_α). The extension of the tree is managed by the Branch Comparison (BC) operator. The detailed definition and interpretation of all HGS-Sched operators can be found in [34,52].

⁴ The scheduler with the lowest accuracy of search is called the core of the HGS tree structure.

Fig. 2. 3 levels of HGS-Sched tree structure.

The genetic engines in all islands of the IGA algorithm and in all branches of the HGS-Sched strategy are based on the same single-population genetic algorithm. We used the GA(R) and GA(S) algorithms for the risky and secure implementations of the hierarchical and island multi-population schedulers.

5.3. Security- and energy-aware Sim-G-Batch grid simulator: basic concept
Simulation seems to be the most effective method for a comprehensive analysis of scheduling algorithms in large-scale distributed cyber physical systems, such as grid or cloud environments. It simplifies the study of the schedulers' performance and avoids the overhead of coordinating the resources, which usually arises in real-life grid or cloud scenarios. Simulation is also effective when working with difficult and highly parametrized problems; in such cases a considerable number of independent runs is needed to ensure statistically significant results. Using simulators for the evaluation of grid schedulers is justified mainly by the high complexity of the grid environment. For simulating the grid environment we developed the Sim-G-Batch grid simulator, which is based on the discrete event-based model HyperSim-G [53] and facilitates the evaluation of different scheduling heuristics under a variety of scheduling criteria across several grid scenarios. These scenarios are defined by the configuration of the security conditions for scheduling and for the access to the grid resources, the grid size, the energy utilization parameters, and the system dynamics. The simulator allows the flexible activation or deactivation of all of the scheduling criteria and modules, and works with a mixture of meta-heuristic schedulers. The main concept of the energy-aware and security-aware version of the Sim-G-Batch simulator is presented in Fig. 3. The work of the simulator is initialized by the definition of the scheduling instance based on the values of the main task and machine parameters, namely (a) the workload and computing capacity parameters, (b) the ready time vector, and (c) the security demand and trust level vectors for all tasks and machines (see Section 4.1). For simulating all considered power supply scenarios, the following ''energy'' attributes must be additionally configured:
• machine category specification parameters (number of classes, maximal computational capacity value, computational capacity range for each class, machine operational speed parameter for each class, etc.);
• the DVFS levels matrix for the machine categories.

The simulator is highly parametrized in order to reflect numerous realistic grid scenarios. All schedulers are decoupled from the simulator's main body and are implemented as external libraries. The Sim-G-Batch software was written in C++ for Linux Ubuntu 10.10.⁵ We used this simulator for a simple empirical evaluation of the six genetic schedulers defined in Section 5 in large-scale static and dynamic grid environments.

6. Empirical analysis
5 The codes of the benchmarks and meta-heuristics are available upon request to Joanna Kołodziej (www.joannakolodziej.org).
Fig. 3. General flowchart of the security- and energy-aware Sim-G-Batch simulator.

Table 2
Configuration of the grid simulator for static and dynamic scheduling.

                          Small grid                   Large grid
Static case
  Nb. of hosts            64                           256
  Resource cap.           N(5000, 875)                 N(5000, 875)
  Total nb. of tasks      1024                         4096
  Workload of tasks       N(250000000, 43750000)       N(250000000, 43750000)
Dynamic case
  Init. hosts             64                           256
  Max. hosts              70                           264
  Min. hosts              50                           240
  Resource cap.           N(5000, 875)                 N(5000, 875)
  Add host                N(625000, 93750)             N(562500, 84375)
  Delete host             N(437500, 65625)             N(437500, 65625)
  Init. tasks             768                          3072
  Total tasks             1024                         4096
  Workload                N(250000000, 43750000)       N(250000000, 43750000)
Both cases
  Security demands sd_j   U[0.6; 0.9]
  Trust levels tl_i       U[0.3; 1]
  Failure coefficient α   3

Table 3
GA settings for static and dynamic benchmarks.

Parameter               Value
Evolution steps         15 ∗ m
Pop. size (pop_size)    3 ∗ (log2(m) − 1)
Intermediate pop.       (pop_size)/3
Cross probab.           0.9
Mutation probab.        0.25
max_time_to_spend       40 s (static) / 50 s (dynamic)
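A sketch (not the HyperSim-G generator itself) of how a static instance with the Table 2 distributions could be sampled; only the marginal distributions are reproduced, and any truncation or rounding used by the real simulator is unknown.

```python
import random

def sample_static_instance(n_tasks, n_hosts, seed=0):
    """Draw workloads, capacities, security demands and trust levels with the
    Table 2 distributions, then derive the ETC matrix of Eq. (3)."""
    rng = random.Random(seed)
    cc = [rng.gauss(5000.0, 875.0) for _ in range(n_hosts)]      # Resource cap.
    wl = [rng.gauss(2.5e8, 4.375e7) for _ in range(n_tasks)]     # Workload of tasks
    sd = [rng.uniform(0.6, 0.9) for _ in range(n_tasks)]         # sd_j ~ U[0.6, 0.9]
    tl = [rng.uniform(0.3, 1.0) for _ in range(n_hosts)]         # tl_i ~ U[0.3, 1]
    etc = [[wl[j] / cc[i] for i in range(n_hosts)] for j in range(n_tasks)]
    return wl, cc, sd, tl, etc

# small-grid static case of Table 2
wl, cc, sd, tl, etc = sample_static_instance(n_tasks=1024, n_hosts=64)
```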
The main aim of our empirical analysis is to compare the effectiveness of multi-population and single-population genetic-based schedulers in large-scale static and dynamic scheduling in a grid environment. Although the implementation and execution of the basic procedures of the multi-population strategies is rather time demanding compared with the single-population techniques, we expect that the exploration and exploitation of the search space can be more effective when many moderately sized populations are distributed over a wider region of the scheduling domain. The performance of all six GA-based meta-heuristics defined in Section 5 has been analyzed in two grid size scenarios: a Small Grid with 64 hosts and 1024 tasks, and a Large Grid with 256 hosts and 4096 tasks. The capacities of the resources and the workloads of the tasks are randomly generated from Gaussian distributions.
The sample values of the key input parameters used in the experiments are presented in Table 2. Most of these parameters (excluding the numbers of tasks and machines) were tuned in the empirical analyses presented in [27,26,28] and in recent publications of the authors and their collaborators [3,54,30,34,55]. The machines in our experiments can work at 16 DVFS levels and are categorized into three ''energetic'' resource classes, Class I, Class II, and Class III, as presented in Table 1. The configurations of the key parameters for the risky and secure implementations of the GA, IGA and Green-HGS-Sched meta-heuristics are presented in Tables 3–5. The relative performance of all schedulers has been quantified with the following four metrics:
• Makespan: the dominant scheduling criterion, which can be calculated in various ways depending on the security and energy criteria (see Sections 4.2–4.4);
• Flowtime: the QoS criterion, calculated, similarly to Makespan, in various ways depending on the security and energy criteria;
• Relative Energy Consumption Improvement rate (RECI), expressed as follows:

    RECI = ((E_II - E_I) / E_II) · 100%,        (37)

where E_II and E_I are defined in Eqs. (24) and (32), respectively;
• Machine Failure Rate (Fail_r), defined as follows:

    Fail_r = (n_fail / n) · 100%,               (38)

where n_fail is the number of unfinished tasks, which must be rescheduled.
6 We use the notation U(a, b) and N(a, b) for the uniform and Gaussian probability distributions, respectively.
Table 4
HGS-Sched settings for static and dynamic benchmarks.

Parameter                                                    Value
Period_of_metaepoch                                          10 * n
Maximal number of metaepochs in the core (nb_of_metaepochs)  10
Degrees of branches (t)                                      0 and 1
Population size in the core (r_pop_size)                     3 * ⌈3 * (log2(n) - 1)/10⌉
Population size in the sprouted branches (b_pop_size)        ⌈3 * (log2(n) - 1)/10⌉
Intermediate pop. in the core                                abs((r_pop_size)/3)
Intermediate pop. in the sprouted branch                     abs((b_pop_size)/3)
Cross probab.                                                0.9
Mutation probab. in core                                     0.4
Mutation probab. in the sprouted branches                    0.25
max_time_to_spend                                            50 s (static) / 80 s (dynamic)

Table 5
Configuration of the IGA algorithm.

Parameter                     Value
itd                           10 * n
mig                           5%
Number of islands (demes)     10
Cross probab.                 0.9
Mutation probab.              0.25
max_time_to_spend             50 s (static) / 80 s (dynamic)

Table 6
Average Makespan results in the large-scale static and dynamic instances ([±s.d.], s.d. = standard deviation).

Strategy        Static instances                                        Dynamic instances
                Small grid                   Large grid                  Small grid                   Large grid
GA (R)          4689266.131 [±265830.863]    4509951.114 [±289227.664]   4507897.021 [±275379.743]    4651142.093 [±320904.724]
GA (S)          4311127.632 [±276998.049]    4380160.166 [±282839.469]   4371274.191 [±237932.564]    4362364.259 [±292104.936]
HGS-Sched (R)   4207360.747 [±221065.141]    4303132.059 [±348320.006]   4433896.361 [±308969.891]    4354685.093 [±295031.864]
HGS-Sched (S)   4108354.055 [±247242.576]    4139736.276 [±257457.196]   4214293.172 [±338025.569]    4283408.536 [±222714.525]
IGA (R)         4366655.375 [±289042.319]    4395596.988 [±261247.362]   4324908.717 [±320366.959]    4347895.97 [±226417.748]
IGA (S)         4311221.617 [±373554.947]    4382375.414 [±166236.414]   4297869.438 [±248889.545]    4407165.141 [±299326.898]

Table 8
Average Fail_r results in the large-scale static and dynamic instances ([±s.d.], s.d. = standard deviation; values in %).

Strategy        Static instances                  Dynamic instances
                Small grid       Large grid       Small grid       Large grid
GA (R)          30.77 [±8.79]    32.17 [±13.31]   40.76 [±14.07]   40.55 [±13.50]
GA (S)          22.31 [±8.91]    29.26 [±13.08]   29.61 [±13.32]   29.60 [±11.02]
HGS-Sched (R)   25.75 [±9.87]    26.59 [±11.67]   27.79 [±11.71]   32.29 [±12.03]
HGS-Sched (S)   17.23 [±8.68]    20.39 [±9.43]    20.50 [±10.60]   22.80 [±8.74]
IGA (R)         28.30 [±11.86]   26.67 [±9.03]    31.59 [±13.33]   28.33 [±8.67]
IGA (S)         20.63 [±9.65]    20.42 [±9.02]    20.53 [±9.21]    23.23 [±9.16]
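As a minimal illustration of how the two derived metrics are obtained from the simulator output, the sketch below evaluates Eqs. (37) and (38). The energy values E_I and E_II (Eqs. (24) and (32)) are assumed to be supplied by the simulator; we read E_I as the energy consumed in the modular power supply mode and E_II as the reference (Min-Max) mode, and all names and numbers are hypothetical rather than part of the Sim-G-Batch API.

// Sketch of the derived performance metrics, Eqs. (37)-(38).
#include <iostream>

double reci(double e_modular /* E_I */, double e_minmax /* E_II */) {
  return (e_minmax - e_modular) / e_minmax * 100.0;   // Eq. (37)
}

double fail_rate(int n_failed, int n_tasks) {
  return 100.0 * n_failed / n_tasks;                  // Eq. (38)
}

int main() {
  // Hypothetical values for one simulator run with 1024 tasks.
  std::cout << "RECI   = " << reci(7.2e6, 9.6e6) << " %\n";    // prints 25
  std::cout << "Fail_r = " << fail_rate(176, 1024) << " %\n";  // prints ~17.19
  return 0;
}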
6.1. Results

Each experiment was repeated 30 times under the same configuration of operators and parameters. Tables 6–9 present the best results for all four scheduler performance metrics achieved in each run of the simulator, averaged over the number of runs. The values of Makespan and Flowtime are expressed in general, arbitrary time units, not necessarily seconds or their fractions. The statistical significance of the presented results has been verified with the standard Student's t-test for means [49] in order to compare the average values of the four performance metrics defined above. The possible outcomes of the t-test are the acceptance or rejection of the null hypothesis (H0), which states that any differences in the results are purely random; an erroneous rejection of the null hypothesis constitutes a Type 1 error. The predefined confidence level C.L. for the Student's test was 95%. Tables 10–13 present the probabilities of Type 1 errors (P-values) [49], which indicate whether the null hypothesis should be accepted or rejected. The difference in results is considered statistically significant if the P-value is not greater than 0.05 (the P-value is 1 for the base, i.e. best, result). It can be observed that the secure version of the hierarchical strategy, HGS-Sched(S), generated the best results for the Makespan, Flowtime and Fail_r metrics in all considered scheduling and grid scenarios. In the case of Makespan optimization in the static scenario, the effectiveness of the single-population GA(S) algorithm is better than that of the island model, which is rather surprising. The main reason may be that in such a case the island strategy does not improve the quality of search of the single-population scheduling mechanism implemented in each deme of IGA when the global or best local solutions of the problem are located very close to each other.
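The sketch below illustrates the significance check described above: it computes a two-sample Student's t statistic from summary statistics (mean, standard deviation, 30 runs per configuration). Converting t into a two-tailed P-value additionally requires the CDF of the t-distribution (e.g., from a statistics library), which is omitted here; the input values are hypothetical and only of the order of the entries in Table 6.

// Pooled-variance two-sample t statistic for equal sample sizes.
#include <cmath>
#include <cstdio>

double t_statistic(double mean1, double sd1, double mean2, double sd2, int n) {
  double pooled_var = (sd1 * sd1 + sd2 * sd2) / 2.0;   // pooled variance, equal n
  return (mean1 - mean2) / std::sqrt(2.0 * pooled_var / n);
}

int main() {
  // Hypothetical example: best strategy vs. a competitor, 30 runs each.
  double t = t_statistic(4.108e6, 2.47e5, 4.689e6, 2.66e5, 30);
  std::printf("t = %.3f with %d degrees of freedom\n", t, 2 * 30 - 2);
  return 0;
}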
Table 7
Average Flowtime results in the large-scale static and dynamic instances ([±s.d.], s.d. = standard deviation).

Strategy        Static instances                                                Dynamic instances
                Small grid                       Large grid                      Small grid                       Large grid
GA (R)          4466754857.90 [±174326439.22]    8640800941.68 [±189540400.47]   4642996587.18 [±252880424.73]    8816738303.84 [±225432113.87]
GA (S)          4406611017.93 [±219130307.87]    8580831738.06 [±205759079.97]   4494655727.96 [±231604602.24]    8632499230.56 [±219965301.67]
HGS-Sched (R)   4362796736.04 [±253617207.62]    8508566945.88 [±233037141.88]   4371783830.08 [±192367946.42]    8492042635.11 [±225443809.72]
HGS-Sched (S)   4259830293.79 [±192760951.81]    8266260949.54 [±202925292.54]   4310297306.92 [±240553913.47]    8380100642.32 [±202724627.75]
IGA (R)         4397443169.51 [±229978217.79]    8535510637.65 [±223548995.71]   4514639234.91 [±280156916.77]    8471638151.26 [±166084470.76]
IGA (S)         4369439682.35 [±240695659.00]    8338883269.26 [±181970313.93]   4333021181.13 [±250462801.13]    8415134908.62 [±300664752.32]
Table 9
Average RECI results in the large-scale static and dynamic instances ([±s.d.], s.d. = standard deviation; values in %).

Strategy        Static instances                  Dynamic instances
                Small grid       Large grid       Small grid       Large grid
GA (R)          23.27 [±10.85]   19.04 [±9.12]    23.35 [±8.46]    18.76 [±7.82]
GA (S)          22.18 [±16.59]   22.71 [±18.79]   16.63 [±6.41]    15.72 [±7.87]
HGS-Sched (R)   18.41 [±7.62]    23.17 [±10.71]   24.26 [±10.50]   26.80 [±6.60]
HGS-Sched (S)   20.85 [±7.63]    15.73 [±7.98]    19.45 [±8.58]    20.93 [±8.37]
IGA (R)         22.49 [±10.90]   22.40 [±10.01]   20.60 [±9.70]    23.47 [±8.67]
IGA (S)         20.28 [±8.31]    20.12 [±9.75]    19.18 [±10.08]   18.16 [±9.28]
Table 10
Two-tailed P-value results for Makespan in the static and dynamic instances.

Strategy        Static instances            Dynamic instances
                Small grid    Large grid    Small grid    Large grid
GA (R)          0.0086        0.5196        0.0532        0.0274
GA (S)          0.0468        0.0042        0.4682        0.4148
HGS-Sched (R)   0.0187        0.1722        0.0512        0.4644
HGS-Sched (S)   1             1             1             1
IGA (R)         0.1174        0.0918        0.3032        0.3912
IGA (S)         0.4532        0.6768        0.5132        0.5174

Table 11
Two-tailed P-value results for Flowtime in the static and dynamic instances.

Strategy        Static instances            Dynamic instances
                Small grid    Large grid    Small grid    Large grid
GA (R)          0.0046        0.0002        0.0024        0.0002
GA (S)          0.0632        0.0013        0.0333        0.0054
HGS-Sched (R)   0.2312        0.0094        0.3386        0.1508
HGS-Sched (S)   1             1             1             1
IGA (R)         0.0910        0.0042        0.0464        0.1154
IGA (S)         0.1832        0.2386        0.7806        0.7210

Table 12
Two-tailed P-value results for Fail_r in the static and dynamic instances.

Strategy        Static instances            Dynamic instances
                Small grid    Large grid    Small grid    Large grid
GA (R)          0.0008        0.0906        0.0938        0.0824
GA (S)          0.1048        0.0894        0.1854        0.0828
HGS-Sched (R)   0.0232        0.1924        0.2838        0.0342
HGS-Sched (S)   1             1             1             1
IGA (R)         0.0162        0.0974        0.0896        0.0344
IGA (S)         0.2941        0.7416        0.3238        0.0452

Table 13
Two-tailed P-value results for RECI in the static and dynamic instances.

Strategy        Static instances            Dynamic instances
                Small grid    Large grid    Small grid    Large grid
GA (R)          1             0.235         0.7416        0.0413
GA (S)          0.9393        0.4724        0.0044        0.0716
HGS-Sched (R)   0.0746        1             1             1
HGS-Sched (S)   0.3128        0.0694        0.1175        0.0538
IGA (R)         0.8267        0.9242        0.2634        0.2554
IGA (S)         0.28496       0.4226        0.1454        0.0564
It can also be observed that, in all the considered scheduling scenarios, the secure implementations of the algorithms in the same class are more effective in Makespan, Flowtime and Fail_r minimization than their risky counterparts, namely GA(R), HGS-Sched(R) and IGA(R).
Table 14
Average number of genetic epochs needed for finding the best solutions ([±s.d.], s.d. = standard deviation).

Strategy        Static instances                      Dynamic instances
                Small grid         Large grid         Small grid         Large grid
GA (R)          19778 [±2233.80]   80468 [±8055.71]   18789 [±9608.64]   80759 [±2449.40]
GA (S)          19683 [±2087.96]   79294 [±4623.67]   20377 [±8662.43]   80344 [±9599.21]
HGS-Sched (R)   15303 [±9992.81]   69576 [±4207.50]   14594 [±218.23]    71366 [±2608.06]
HGS-Sched (S)   11929 [±1805.38]   63189 [±3218.40]   13935 [±8038.40]   64009 [±5300.22]
IGA (R)         16913 [±2197.85]   75430 [±6574.47]   15350 [±414.08]    77352 [±7574.74]
IGA (S)         17452 [±7874.37]   75035 [±5269.50]   15416 [±8244.82]   78325 [±2217.25]
The situation is different in the optimization of energy consumption. All relative energy consumption improvement rates are positive, which means that keeping the machines busy for a longer time and modulating their power supply do not lead to violations of the task deadline constraints, but improve the load balancing of the machines and reduce the energy utilization in the system. The comparison of the results achieved by the secure and risky versions of the GA, HGS-Sched and IGA meta-heuristics shows that, in the case of energy optimization, the HGS-Sched(R) and GA(R) algorithms achieved the best (highest) values of the RECI parameter. This may lead to the conclusion that the differences between the Modular Power Supply and Min–Max power modes are not so significant when the security procedures are active; lowering the supply power in such a case does not improve the energy management. In the risky scenario both multi-population schedulers perform very well in the modular mode. In this case the Makespan and Flowtime values are rather large, so lowering the supply power can significantly reduce the energy consumed in the system without violating the task deadline limits. The good results achieved by the hierarchical scheduler do not come at the price of a high complexity of the strategy or a long execution time needed for generating good solutions. In Table 14 we compare the average number of genetic epochs needed for finding the best schedules in the 3-step hierarchical optimization procedure, with Makespan, Flowtime and energy consumption as components of the cumulative objective function, in the risky and secure scheduling scenarios. It can be observed that the hierarchical scheduler needs 20%–35% fewer genetic epochs for approximating the best solutions than the island and single-population meta-heuristics in all considered scenarios. This makes the strategy well adapted to solving challenging multi-criteria scheduling problems in highly parametrized dynamic environments. The simple analysis of the performance of the six genetic meta-heuristics provided in this section is sufficient for comparing the effectiveness of the considered schedulers in the optimization of each performance metric separately. A cumulative performance analysis of the schedulers can be provided by generating Kiviat radar graphs.

6.1.1. Kiviat graphs

In order to measure the relative scheduler performances under the combination of the four performance measures specified in Section 5.3, we use four-dimensional Kiviat graphs [39] with the Makespan, Flowtime, Fail_r and RECI parameters. Each Kiviat graph has four orthogonal dimensions, each representing one of the performance metrics.
Fig. 4. The Kiviat graphs for all schedulers in the static and dynamic scenarios: a-Maximal Makespan, b-Maximal Flowtime, c-Maximal Fail_r, d-Maximal RECI; a1-Average Makespan, b1-Average Flowtime, c1-Average Fail_r, d1-Average RECI.
The scales and ranges of the four measures are specified in Fig. 4. The circle center represents zero for all measures. The range of Makespan, Flowtime, Fail_r and RECI is from zero to the largest value observed in the experiments (denoted by a, b, c, d on the plot). These Kiviat graphs provide a holistic view of the capabilities of each algorithm. Based on visualizing these graphs, we can select the best approach under practical constraints in real-life applications. Specifically, the smaller the area of the quadrangle in the center of the Kiviat graph, the better the Quality of Grid Services (QoGS) provided by a scheduling algorithm. The parameters in the center of each circle indicate the QoGS measure of each scheduler in each scheduling scenario, defined as follows:

    QoGS = (quadrangle_area / circle_area) · 100%.        (39)
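A minimal sketch of Eq. (39) for a single scheduler is given below. It normalizes each metric by the largest value observed in the experiments (the a, b, c, d of Fig. 4), so that every axis of the Kiviat graph has unit radius, and uses the fact that the diagonals of the quadrangle are perpendicular. All names and example values are ours; whether RECI should be inverted so that smaller is uniformly better is not stated explicitly in the text.

// QoGS measure of Eq. (39) for one scheduler, computed from its four metrics
// and the per-axis maxima used to scale the Kiviat graph.
#include <cstdio>

const double PI = 3.14159265358979323846;

double qogs_percent(double makespan, double flowtime, double failr, double reci,
                    double max_makespan, double max_flowtime,
                    double max_failr, double max_reci) {
  double v1 = makespan / max_makespan;     // normalized radius on axis 1
  double v2 = flowtime / max_flowtime;     // axis 2
  double v3 = failr / max_failr;           // axis 3
  double v4 = reci / max_reci;             // axis 4
  // Quadrangle with vertices on four orthogonal axes: perpendicular diagonals,
  // so area = (v1 + v3) * (v2 + v4) / 2.
  double quadrangle_area = 0.5 * (v1 + v3) * (v2 + v4);
  double circle_area = PI;                 // unit-radius circle
  return quadrangle_area / circle_area * 100.0;   // Eq. (39)
}

int main() {
  // Hypothetical example values.
  std::printf("QoGS = %.1f %%\n",
              qogs_percent(4.1e6, 8.3e9, 20.4, 20.9,
                           4.7e6, 8.8e9, 40.8, 26.8));
  return 0;
}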
It follows from the results that the best (smallest) QoGS parameters are achieved by both implementations of the island algorithm. The values of this parameter for IGA are between 25% and 35% in both the risky and secure scenarios. The worst (highest) values of QoGS have been generated by the HGS-Sched scheduler, which rather surprised us in light of the high effectiveness of this scheduler in the optimization of the two major criteria, namely Makespan and Flowtime. The main reason for this result may be the high stability of the HGS-Sched algorithm: the standard deviations measured for this algorithm are relatively small compared with the other methods (especially IGA). The analysis of the Kiviat graphs shows how different the evaluation results of the same meta-heuristics can be when all scheduling criteria are optimized jointly, compared with the results analyzed for each criterion separately.

7. Conclusions

The categorization of the grid scheduling problems presented in this paper allows us to look at the old scheduling models from a contemporary and unique perspective. Two new scheduling criteria, namely security and energy consumption, usually considered as separate optimization problems, are embedded in the proposed scheduling models. The simulated grid scenarios in such cases better illustrate realistic systems, in which a large number of variables, numerous objectives, constraints, and business rules, all contributing in various ways, must be analyzed.
Multi-criteria grid scheduling can be seen as a family of complex optimization, management and decision making problems with several, often conflicting, criteria, which makes many standard approaches ineffective. Conventional (exact) algorithms can usually work effectively only under some selected criteria and generate just partial solutions of the overall problem. Therefore, it remains an open question how to integrate these partial solutions to achieve a global optimum. Meta-heuristics, due to their robustness and high scalability, are able to tackle the various scheduling attributes and criteria. The simple empirical analysis of the six genetic-based schedulers presented in this paper shows their various properties and their different effectiveness in the exploration of the search space. The large and rather stable populations used in simple single-population schedulers often concentrate around local solutions of the problem, and the ''escape'' from the basins of attraction of such solutions can be difficult for these populations. This problem can be addressed by dividing the large population into several clusters (sub-populations) and implementing a standard genetic algorithm in each of them separately, as is done in the island model. Each sub-population in IGA is much smaller than the global population in the conventional GA scheduler and can be better dispersed around the local optima under the same mutation rate. One of the main drawbacks of the island model is its complexity: randomly generated sub-populations can also be activated in the ''flat'' regions of the optimization landscape, where no promising solutions are located. This problem is partially solved by the hierarchic schedulers, in which the large global population is dynamically created in different phases of the search process by genetic algorithms with various search accuracies. Many small populations are generated in the regions where promising solutions appear, previously briefly scanned by another, less accurate process. We have observed the high effectiveness of the hierarchical scheduler in the reduction of the Makespan and Flowtime values compared with the other algorithms. HGS-Sched works very well in the secure scenario and the complexity of the algorithm is lower than that of the other techniques. However, it seems that the relatively fast optimization of the major scheduling criteria by the hierarchical scheduler is achieved through efficient load balancing of the machines; in such a case the energy utilized by the system is not significantly reduced by the modular power supply. This can be the reason for the worse results of the overall evaluation of HGS-Sched compared with the island model. All models presented in this paper are in fact not restricted to grid systems. They may easily be adapted to cloud environments, where security awareness and intelligent power management are among the hottest research issues.

Acknowledgments

Joanna Kołodziej's and Marek Kisiel-Dorohinicki's work was partially supported by the ''Biologically inspired mechanisms in planning and management of dynamic environments'' grant of the Polish National Science Centre, No. N N516 500039. Samee U. Khan's work was partly supported by the Young International Scientist Fellowship of the Chinese Academy of Sciences (Grant No. 2011Y2GA01). Ewa Niewiadomska-Szynkiewicz's work was partially supported by the National Centre for Research and Development (NCBiR) grant No. O R00 0091 11.

References

[1] C. Dobre, Monitoring and controlling grid systems, in: Nikolaos Preve (Ed.), Towards a Global Interconnected Infrastructure, Springer, Berlin, 2011.
[2] S.U. Khan, I. Ahmad, A cooperative game theoretical technique for joint optimization of energy consumption and response time in computational grids, IEEE Transactions on Parallel and Distributed Systems 20 (3) (2009) 346–360. [3] J. Kołodziej, S. Khan, F. Xhafa, Genetic algorithms for energy-aware scheduling in computational grids, in: Proc. of the 6th IEEE International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC2011), 26–28.10.2011, Barcelona, Spain, 2011, pp. 17–24. [4] J. Kołodziej, F. Xhafa, Integration of task abortion and security requirements in ga-based meta-heuristics for independent batch grid scheduling, Computers and Mathematics with Applications 63 (2012) 350–364. [5] A. Beloglazov, R. Buyya, Y. Lee, A.Y. Zomaya, A taxonomy and survey of energy-efficient data centers and cloud computing systems, Advances in Computers 82 (2011) 47–111. [6] G.L. Valentini, W. Lassonde, S.U. Khan, N. Min-Allah, S.A. Madani, J. Li, L. Zhang, L. Wang, N. Ghani, J. Kołodziej, H. Li, A.Y. Zomaya, C.-Z. Xu, P. Balaji, A. Vishnu, F. Pinel, J.E. Pecero, D. Kliazovich, P. Bouvry, An overview of energy efficiency techniques in cluster computing systems, Cluster Computing (2011) http://dx.doi.org/10.1007/s10586-011-0171-x. [7] J. Kołodziej, S.U. Khan, A.Y. Zomaya, A taxonomy of evolutionary-inspired solutions for energy optimization: problems and intelligent resolution techniques, in: Advances in Intelligent Modelling and Simulation: Artificial Intelligence-based Models and Techniques in Scalable Computing, in: Studies in Computational Intelligence, vol. 422, Springer-Verlag, Berlin, Heidelberg, 2012, pp. 215–233 (Chapter 10). [8] A.M. Caulfield, L.M. Grupp, S. Swanson, Gordon: using flash memory to build fast, power-efficient clusters for data-intensive applications, in: Proc. of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '09, 2009. [9] High-density computing: A 240-processor beowulf in one cubic meter, in: Proc. of the IEEE/ACM SC2002 Conference, Baltimore, Maryland, 2002. [10] A.Y. Zomaya, Energy-aware scheduling and resource allocation for large-scale distributed systems, in: 11th IEEE International Conference on High Performance Computing and Communications, HPCC, Seoul, Korea, 2009. [11] Y.C. Lee, A.Y. Zomaya, Minimizing energy consumption for precedence-constrained applications using dynamic voltage scaling, in: Proc. of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid CCGrid, Shanghai, China, 2009, pp. 92–99. [12] S. Khan, A self-adaptive weighted sum technique for the joint optimization of performance and power consumption in data centers, in: Proc. of the 22nd International Conference on Parallel and Distributed Computing and Communication Systems, PDCCS, USA, 2009, pp. 13–18. [13] K. Guzek, J.E. Pecero, B. Dorrosoro, P. Bouvry, S.U. Khan, A cellular genetic algorithm for scheduling applications and energy-aware communication optimization, in: Proc. of the ACM/IEEE/IFIP Int. Conf. on High Performance Computing and Simulation, HPCS, Caen, France, 2010, pp. 241–248. [14] F. Pinel, J. Pecero, P. Bouvry, S.U. Khan, Memory-aware green scheduling on multi-core processors, in: Proc. of the 39th IEEE International Conference on Parallel Processing, ICPP, 2010, pp. 485–488. [15] D. Kliazovich, P. Bouvry, S.U. Khan, Dens: data center energy-efficient network-aware scheduling, in: Proc.
of ACM/IEEE International Conference on Green Computing and Communications, GreenCom, Hangzhou, China, December 2010, 2010, pp. 69–75. [16] R. Subrata, A. Zomaya, B. Landfeldt, Cooperative power-aware scheduling in grid computing environments, Journal of Parallel and Distributed Computing 70 (2010) 84–91. [17] S. Khan, A goal programming approach for the joint optimization of energy consumption and response time in computational grids, in: Proc. of the 28th IEEE International Performance Computing and Communications Conference, IPCCC, Phoenix, AZ, USA, 2009, pp. 410–417. [18] G. Shen, Y. Zhang, A new evolutionary algorithm using shadow price guided operators, Applied Soft Computing 11 (2) (2011) 1983–1992. [19] G. Shen, Y. Zhang, A shadow price guided genetic algorithm for energy aware task scheduling on cloud computers, in: Proc. of the 3rd International Conference on Swarm Intelligence, ICSI-2011, 2011, pp. 522–529. [20] Y. Kessaci, M. Mezmaz, N. Melab, E.-G. Talbi, D. Tuyttens, Parallel evolutionary algorithms for energy aware scheduling, in: P. Bouvry, H. Gonzalez-Velez, J. Kołodziej (Eds.), Intelligent Decisions Systems in Large-Scale Distributed Environments, in: Studies in Computational Intelligence, vol. 362, Springer Vlg, 2011 (Chapter 4). [21] M. Mezmaz, N. Melab, Y. Kessaci, Y. Lee, E.-G. Talbi, A. Zomaya, D. Tuyttens, A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems, Journal of Parallel and Distributed Computing (2011) http://dx.doi.org/10.1016/j.jpdc.2011.04.007. [22] D. Whitley, S. Rana, R. Heckendorn, The island model genetic algorithm: on separability, population size and convergence, Journal of Computing and Information Technology 7 (1998) 33–47. [23] M. Humphrey, M. Thompson, Security implications of typical grid computing usage scenarios, in: Proc. of the Conf. on High Performance Distributed Computing, 2001. [24] S. Hwang, C. Kesselman, A flexible framework for fault tolerance in the grid, Journal of Grid Computing 1 (3) (2003) 251–272.
[25] J. Abawajy, An efficient adaptive scheduling policy for high performance computing, Future Generation Computer Systems 25 (3) (2009) 364–370. [26] S. Song, K. Hwang, Y. Kwok, Risk-resilient heuristics and genetic algorithms for security-assured grid job scheduling, IEEE Transactions on Computers 55 (6) (2006) 703–719. [27] S. Song, K. Hwang, Y. Kwok, Trusted grid computing with security binding and trust integration, Journal of Grid Computing 3 (1–2) (2005) 53–73. [28] Y.-K. Kwok, K. Hwang, S. Song, Selfish grids: game-theoretic modeling and nas/psa benchmark evaluation, IEEE Transactions on Parallel and Distributed Systems 18 (5) (2007) 1–16. [29] C.-C. Wu, R.-Y. Sun, An integrated security-aware job scheduling strategy for large-scale computational grids, Future Generation Computer Systems 26 (2) (2010) 198–206. [30] J. Kołodziej, F. Xhafa, Meeting security and user behaviour requirements in grid scheduling, Simulation Modelling Practice and Theory 19 (1) (2011) 213–226. [31] R. Buyya, M. Murshed, Gridsim: a toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing, Concurrency and Computation: Practice and Experience 14 (13–15) (2002) 1175–1220. [32] S. Garg, R. Buyya, H. Segel, Scheduling parallel applications on utility grids: time and cost trade-off management, 2009. [33] A. Abraham, R. Buyya, B. Nath, Nature's heuristics for scheduling jobs on computational grids, in: Proc. of the 8th IEEE International Conference on Advanced Computing and Communications, India, 2000, pp. 45–52. [34] J. Kołodziej, F. Xhafa, Enhancing the genetic-based scheduling in computational grids by a structured hierarchical population, Future Generation Computer Systems 27 (2011) 1035–1046. [35] D. Klusaček, H. Rudová, Efficient grid scheduling through the incremental schedule-based approach, Computational Intelligence 27 (1) (2011) 4–22. [36] R. Graham, E. Lawler, J. Lenstra, A. Rinnooy Kan, Optimization and approximation in deterministic sequencing and scheduling: a survey, Annals of Discrete Mathematics 5 (1979) 287–326. [37] P. Brucker, Scheduling Algorithms, Springer Vlg, 2007. [38] S. Ali, H. Siegel, M. Maheswaran, D. Hensgen, Task execution time modeling for heterogeneous computing systems, in: Proc. of the Workshop on Heterogeneous Computing, 2000, pp. 185–199. [39] L. Lapin, Probability and Statistics for Modern Engineering, second ed., 1998. [40] F. Xhafa, A. Abraham, Computational models and heuristic methods for grid scheduling problems, Future Generation Computer Systems 26 (2010) 608–621. [41] R. Ge, X. Feng, K. Cameron, Performance-constrained distributed dvs scheduling for scientific applications on power-aware clusters, in: Proc. of the 2005 ACM/IEEE Conference on Supercomputing, 2005. [42] R.J. Baker, CMOS: Circuit Design, Layout, and Simulation, second ed., Wiley, 2008. [43] J. Lorch, A. Smith, Improving dynamic voltage scaling algorithms with pace, in: Proc. of the 2001 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2001, pp. 50–61. [44] R. Min, T. Furrer, A. Chandrakasan, Dynamic voltage scaling techniques for distributed microsensor networks, in: Proc. IEEE Workshop on VLSI, 2000, pp. 43–46. [45] H. Aguirre, K. Tanaka, Working principles, behavior, and performance of moeas on mnk-landscapes, European Journal of Operational Research 181 (2007) 1670–1690. [46] Z.
Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs, Springer Vlg, 1992. [47] F. Xhafa, J. Carretero, A. Abraham, Genetic algorithm based schedulers for grid computing systems, International Journal of Innovative Computing, Information and Control 3 (5) (2007) 1053–1071. [48] F. Xhafa, J. Carretero, Experimental study of ga-based schedulers in dynamic distributed computing environments, in: Alba, et al. (Eds.), Optimization Techniques for Solving Complex Problems, Wiley, 2009 (Chapter 24). [49] P.S. Mann, Introductory Statistics, seventh ed., Wiley, 2010. [50] M. Bartschi Wall, A genetic algorithm for resource-constrained scheduling. Ph.D. Thesis, Massachusetts Institute of Technology, MA, 1996. [51] T. Grueninger, Multimodal optimization using genetic algorithms, Technical Report, Department of Mechanical Engineering, MIT, Cambridge, MA, 1997. [52] J. Kołodziej, S. Khan, Multi-level hierarchical genetic-based scheduling of independent jobs in dynamic heterogeneous grid environment, Information Sciences 214 (2012) 1–19. [53] F. Xhafa, J. Carretero, L. Barolli, A. Durresi, Requirements for an event-based simulation package for grid systems, Journal of Interconnection Networks 8 (2) (2007) 163–178. [54] J. Kołodziej, M. Rybarski, An application of hierarchical genetic strategy in sequential scheduling of permutated independent jobs, Evolutionary Computation and Global Optimization, in: Lectures on Electronics Jarosław Arabas (Ed.), vol. 1, Warsaw University of Technology, 2009, pp. 95– 103. [55] J. Kołodziej, F. Xhafa, Modern approaches to modelling user requirements on resource and task allocation in hierarchical computational grids, International Journal on Applied Mathematics and Computer Science 21 (2) (2011) 243–257.
Joanna Kołodziej graduated in Mathematics from the Jagiellonian University in Cracow in 1992, where she also obtained her Ph.D. in Computer Science in 2004. She is employed at Cracow University of Technology as an Assistant Professor. She has served and is currently serving as PC Co-Chair, General Co-Chair and IPC member of several international conferences and workshops, including PPSN 2010, ECMS 2011, CISIS 2011, 3PGCIC 2011, CISSE 2006, CEC 2008, IACS 2008–2009, ICAART 2009–2010. Dr. Kołodziej is the Managing Editor of the IJSSC journal and serves as an editorial board member and guest editor of several peer-reviewed international journals. For more information, please visit: http://www.joannakolodziej.org/.
Samee Ullah Khan is Assistant Professor of Electrical and Computer Engineering at the North Dakota State University, Fargo, ND, USA. Prof. Khan has extensively worked on the general topic of resource allocation in autonomous heterogeneous distributed computing systems. Recently, he has been actively conducting cutting edge research on energy-efficient computations and communications. A total of 111 (journal: 40, conference: 51, book chapter: 12, editorial: 5, technical report: 3) publications are attributed to his name. For more information, please visit: http://sameekhan.org/.
Lizhe Wang is a full professor at the Center for Earth Observation and Digital Earth, Chinese Academy of Sciences, China. He also holds a ''ChuTian Scholar'' Chair Professor position at the School of Computer, China University of Geosciences. Dr. Wang received his Bachelor's and Master's degrees in Engineering from Tsinghua University, China, and his Doctor of Engineering degree from the University of Karlsruhe, Germany. His research interests include high performance computing, data-intensive computing, and Grid/Cloud computing.
Marek Kisiel-Dorohinicki received his Ph.D. in 2001 at AGH University of Science and Technology in Cracow. He works as an assistant professor at the Department of Computer Science of AGH-UST. His research focuses on intelligent software systems, particularly utilizing agent technology and evolutionary algorithms, but also other soft computing techniques.
Sajjad Ahmad Madani works at the COMSATS Institute of Information Technology (CIIT), Abbottabad Campus, as an Associate Professor. He joined CIIT in August 2008 as an Assistant Professor. Prior to that, he was with the Institute of Computer Technology from 2005 to 2008 as a guest researcher, where he carried out his Ph.D. research. He holds a B.Sc. in Civil Engineering from UET Peshawar and was awarded a gold medal for his outstanding academic performance. His areas of interest include low-power wireless sensor networks and the application of industrial informatics to electrical energy networks. He has published more than 35 papers in international conferences and journals.
Ewa Niewiadomska-Szynkiewicz is a professor of control and information engineering at the Warsaw University of Technology and head of the Complex Systems Group. She is also the Director for Research of the Research and Academic Computer Network (NASK). She is the author or coauthor of three books and over 120 journal and conference papers. Her research interests focus on complex systems modeling, control and optimization, computer simulation, global optimization, parallel computation, and computer networks. She has been involved in a number of research projects, including EU projects, has coordinated the group's activities, and has managed the organization of a number of national and international conferences.
Albert Y. Zomaya is currently the Chair Professor of High Performance Computing and Networking in the School of Information Technologies, The University of Sydney. He is the author/co-author of seven books, more than 400 papers, and the editor of nine books and 11 conference proceedings. He is the Editor in Chief of the IEEE Transactions on Computers and serves as an associate editor for 19 leading journals. Professor Zomaya is the recipient of the IEEE TCPP Outstanding Service Award and the IEEE TCSC Medal for Excellence in Scalable Computing, both in 2011.
Cheng-Zhong Xu received the Ph.D. degree in computer science from the University of Hong Kong in 1993. He is currently a professor in the Department of Electrical and Computer Engineering of Wayne State University, and the director of the Cloud and Internet Computing Laboratory (CIC) and Sun's Center of Excellence in Open Source Computing and Applications (OSCA). His research interest is mainly in scalable distributed and parallel systems and wireless embedded computing devices. He has published two books and more than 160 articles in peer-reviewed journals and conferences in these areas. He is a senior
member of the IEEE.