Expert Systems With Applications 125 (2019) 369–377
Resource selection in computational grids based on learning automata
Alireza Enami, Javad Akbari Torkestani∗, Abbas Karimi
Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak, Iran
Article info
Article history: Received 8 December 2018; Revised 29 January 2019; Accepted 30 January 2019; Available online 31 January 2019.
Keywords: Grid computing; Learning automata; Task scheduling; Resource selection
Abstract
Grid computing, in simplest terms, is distributed computing that has reached a higher level of evolution. The Grid scheduler is a central part of the Grid: it generates an assignment of jobs to resources using the resource information provided by the Grid information service. Since the problems raised in the resource management system are NP-hard, classical methods such as dynamic programming are useful only for small problem instances. Heuristic algorithms, which can produce efficient schedules in acceptable time even for large instances of the problem, are therefore promising methods for solving the scheduling problem. The resource scheduling process in the Grid consists of three main phases: resource discovery, resource selection and job execution. In this paper, we propose an algorithm based on learning automata for resource selection in computational Grids. In this algorithm, decisions are made based on the list of resources discovered in the resource discovery phase; a resource is selected based on the predicted time to execute or complete the job and is then sent to the next phase, namely the execution phase. The efficiency of the proposed algorithm is evaluated through several simulation experiments under different Grid scenarios. The obtained results are compared with several existing methods in terms of the average turn-around time, average response time and throughput.
1. Introduction

Scientific problems today are so complex that huge computing power is required to solve them. The term "Grid" emerged in the mid-1990s, when it was proposed for distributed computing systems that provide computing services on demand, much like power and water supply networks (Amiri, Keshavarz, Ohshima, and Komaki, 2014; Ferreira et al., 2003). Ian Foster has defined Grid computing as "coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations." A virtual organization is a set of entities and institutions governed by a set of shared rules. This kind of virtualization is only possible through the use of open standards, which ensure that the resources required by applications are provided to them transparently and appropriately (Amalarethinam and Muthulakshmi, 2011; Foster, Kesselman, and Tuecke, 2001; Haque, Alhashmi, and Parthiban, 2011; Qureshi et al., 2014; Sangwan and Khurana, 2012). Generally, Grid systems are utilized to increase the utilization of heterogeneous systems with the aim of optimizing workload management (Kumari and Kumar, 2015).
∗ Corresponding author. E-mail addresses: [email protected] (A. Enami), [email protected] (J. Akbari Torkestani), [email protected] (A. Karimi).
https://doi.org/10.1016/j.eswa.2019.01.076
This facilitates the use of computational resources as they exist within an organization, such as servers, network nodes, storage elements or data backups. These resources are collected together as a set to produce a robust computing environment. Grid computing enables independent users and organizations to utilize the untapped CPU cycles, databases, scientific tools and storage elements of millions of computer systems located in the global network with minimal access cost. End users pay little or no fee to use these resources. Grid computing is similar to the electrical power grid, in which several power companies take part. It provides a model for sharing data and computing resources regardless of their location and origin. Grid technology has also emanated from existing technologies such as distributed computing, the World Wide Web (WWW), various cryptography applications and the Internet. Grid users usually submit their jobs to the Grid operating system via an interface; the Grid system then decides on locating suitable and available computing resources that can serve the needs of the users (Argungu, A, and Omar, 2015; Dong, 2007; Nagariya and Mishra, 2013; Sood, Kour, and Kumar, 2016). A complete study of the Grid has been carried out in (Foster and Kesselman, 2003; Kandagatla, 2003; Krauter, Buyya, and Maheswaran, 2002a, 2002b). Grid computing is emerging as a promising next-generation problem-solving environment for science, engineering and research. It has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing and innovative applications.
Fig. 1. Types of resources in the Grid.
Grid computing is a problem-solving environment for utilizing unused resources and maximizing resource capability; it is also an innovative approach that leverages existing IT infrastructure to optimize compute resources and manage data and computing workloads (Kanzariya and Patel, 2013; Krawczyk and Bubendorfer, 2008; Vickrey, 1961). The concepts used in the rest of the paper are described below (Abba, Zakaria, and Haron, 2012; Xhafa and Abraham, 2010).

Task: a computational unit (typically a program and possibly associated data) to run on a Grid node. Although there is no unique definition of the task concept in the literature, a task is usually considered an indivisible schedulable unit. Tasks can be independent (or loosely coupled), or there can be dependencies between them (Grid workflows).

Job: a computational activity made up of several tasks that could require different processing capabilities and could have different resource requirements (CPU, number of nodes, memory, software libraries, etc.) and constraints, usually expressed within the job description. In the simplest case, a job consists of just one task.

Application: the software for solving a problem in a computational infrastructure; it may require splitting the computation into jobs, or it may be a "monolithic" application. In the latter case, the whole application is allocated to a computational node, which is usually referred to as application deployment. Applications can have different resource requirements and constraints, usually expressed within the application description.

Resource: a basic computational entity (computational device or service) on which tasks, jobs and applications are scheduled, allocated and processed. Resources have their own characteristics, such as CPU characteristics, memory and software. Several parameters are usually associated with a resource, among them the processing speed and workload, which change over time. Moreover, resources may belong to different administrative domains, implying different usage and access policies. Resources are of several types (Fig. 1) (Navimipour, Rahmani, Navin, and Hosseinzadeh, 2014; Qureshi et al., 2014).

The rest of the article is organized as follows. Section 2 deals with the definition of resource allocation and its stages. The literature is reviewed in Section 3. In Section 4, learning automata and learning automata with a variable action set are described. In Section 5, the selection algorithm based on learning automata is proposed. In Section 6, the results of the proposed algorithm are presented, and finally, the conclusion is provided in Section 7.

2. Resource allocation

Resource allocation is one of the most important system services and should always be available to meet the goals of Grid computing. A common problem in Grid computing is to select the
best resource for running a specific program; users also need to reserve the resources required to run their programs on the Grid (Kanzariya and Patel, 2013; Nagariya and Mishra, 2013). Resource allocation is the process of distributing limited available resources among jobs based on predefined rules (Ismail, 2007; Qureshi et al., 2014; Wu, Ye, and Zhang, 2005). Resource allocation mechanisms play an essential role in the scheduling process in Grid systems, and their efficiency determines the quality of the service provided to users (Dakkak, Arif, and Nor, 2015). The resource allocation process in the Grid consists of four main phases (Batista and da Fonseca, 2007; Qureshi et al., 2014; Yousif, Abdullah, Latiff, and Bashir, 2011):

1. Scheduling: Resource scheduling is the process of mapping jobs to resources and contains three main phases (Nabrzyski, Schopf, and Weglarz, 2004): (1) Resource discovery: available resources are searched and a list of resources is generated. (2) Resource selection: the best resource is selected, based on quality of service, from the list produced in the previous step. (3) Job execution: the job is sent to the selected resource(s) for execution.
2. Code transfer: this phase involves transferring the job code to the desired resource for execution.
3. Data transmission: this phase is closely related to the previous phase; job execution starts once code transfer and data transmission are complete.
4. Monitoring: the process of gathering information about the status and features of the given resources, the jobs being executed, and the resources that are available and eligible, so that they can be used or reserved for the future.

3. Related works

Resource selection in a Grid system means selecting a resource or a set of resources to execute jobs. A number of mechanisms and algorithms have been introduced in this regard. Vijayakumar and WahidhaBanu (2008) proposed a secure algorithm for resource selection and job scheduling based on a trust factor (TF). The TF is calculated for each resource based on its self-protection capability and a reputation weight obtained from the user community on its past behavior. Liu, Xu, and Ma (2007) developed a reputation model, together with a reputation control module, for the economic Grid, described their internal structure, and provided a reputation-driven min-min algorithm for resource selection. Toporkov, Toporkova, Bobchenkov, and Yemelyanov (2011) presented multi-variant search algorithms for scheduling. The algorithms have linear complexity and reach optimal execution of a set of jobs by creating different execution configurations. Kavitha and
Sankaranarayanan (2013) proposed a dynamic service level agreement (SLA) framework that monitors the execution of work as well as the use of the resources by predicting the characteristics of the load on the resources. Bawa and Sharma (2012) provided a reliability-based approach to select resources in a Grid environment: a reliability factor (RF) is calculated, and jobs are sent to resources that have higher RFs. Kavitha and Sankaranarayanan (2011) proposed an algorithm to select resources based on the secure execution of a resource and a defined quality of service. The selected resources come from a secure list that meets user requirements such as computational power, shorter response time and lower budget; the selection strategy is based on the user's quality of service parameters so that the performance requirements and resource efficiency are met. Doulamis, Kokkinos, and Varvarigos (2014) proposed an algorithm based on the scheduling requirements, namely the start and end times. The purpose of assigning jobs to resources in this algorithm is to minimize the violation of scheduling requirements and to increase resource efficiency; the algorithm utilizes graph concepts and applies spectral clustering through normalized cuts. Various factors have been proposed for resource scheduling in the Grid environment, such as total time, global and local assignment policies, and resource efficiency. Nandagopal and Uthariaraj (2011) proposed a multi-criteria resource selection (MCRS) algorithm in which several conditions are used to select a resource to run the job, including the processing power, workload and network bandwidth of the resource. A bidding-based resource selection approach can be used to avoid single-point-of-failure and server-overload issues in distributed systems. Although this method lacks a global information system for optimal decision making, the applicants have access to partial information provided by the resource providers. Wang, Chen, Hsu, and Lee (2010) proposed a set of non-reserved bidding-based deterministic and probabilistic heuristic algorithms for resource selection aiming at minimizing the turn-around time. In that work, several information levels about the competitors are designed, in order to determine which applicants can make better decisions given the available information about the status of the competitors' jobs. The resource selection mechanisms in that work are based on two main algorithms: minimum execution time (MET) and minimum completion time (MCT). In these algorithms, the processing time of all jobs queued at a resource is expressed as the resource availability, which is defined as the earliest time at which the resource can complete the execution of all jobs previously assigned to it. In MET, each job is assigned to a resource based solely on the resource processing speed and the job size, without considering the resource availability. In contrast, MCT assigns each job to the resource that has the minimum completion time for that job, taking the resource availability into account.
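To make the two baseline rules concrete, the following sketch (our own illustration, not code from Wang et al., 2010) contrasts them; the Resource fields `speed` and `available_at` are assumed names.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    speed: float          # processing speed, e.g. in MIPS
    available_at: float   # earliest time the resource finishes its queued jobs

def met_select(resources, job_size):
    """MET: minimum execution time, ignoring how busy each resource already is."""
    return min(resources, key=lambda r: job_size / r.speed)

def mct_select(resources, job_size, now=0.0):
    """MCT: minimum completion time, i.e. queue availability plus execution time."""
    return min(resources, key=lambda r: max(r.available_at, now) + job_size / r.speed)

pool = [Resource("r1", speed=1000, available_at=5.0),
        Resource("r2", speed=600, available_at=0.0)]
print(met_select(pool, job_size=3000).name)   # r1: fastest processor
print(mct_select(pool, job_size=3000).name)   # r2: earlier completion despite lower speed
```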
4. Learning automata

A learning automaton selects its actions from a finite action set according to a probability vector p(n) and updates this vector based on the responses β(n) it receives from a random environment for the selected actions α(n). The learning algorithm T is written as p(n + 1) = T(p(n), α(n), β(n)). If T is a linear operator, the reinforcement learning algorithm is called linear; otherwise, it is called nonlinear (Thathachar and Harita, 1987; Thathachar and Narendra, 1989). The main idea of all learning algorithms is as follows: if the learning automaton chooses α_i in the nth iteration and receives a desirable response from the environment, the probability p_i(n) of the action α_i increases and the probabilities of the other
actions decrease. Conversely, in the case of an undesirable response from the environment, the probability of the action α_i decreases and the probabilities of the other actions increase. In either case, the changes are made in such a way that the sum of the p_i(n) remains constant and equal to 1. If the action α_i is selected in the nth step, then in the (n + 1)th step we have:

A. The desirable response from the environment
$$p_i(n+1) = p_i(n) + a\,[1 - p_i(n)], \qquad p_j(n+1) = (1 - a)\,p_j(n), \quad \forall j \neq i$$   (4–1)
B. The undesirable response from the environment
$$p_i(n+1) = (1 - b)\,p_i(n), \qquad p_j(n+1) = \frac{b}{r-1} + (1 - b)\,p_j(n), \quad \forall j \neq i$$   (4–2)
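A minimal sketch of these updates (our own illustration; reward = True applies (4–1), reward = False applies (4–2), and setting b = 0 yields the L_RI scheme discussed below):

```python
import random

def select_action(p):
    """Sample an action index according to the probability vector p."""
    return random.choices(range(len(p)), weights=p, k=1)[0]

def update(p, i, reward, a=0.01, b=0.01):
    """Return the updated probability vector: Eq. (4-1) for a desirable response,
    Eq. (4-2) for an undesirable one; i is the index of the selected action."""
    r = len(p)
    if reward:
        return [p[j] + a * (1 - p[j]) if j == i else (1 - a) * p[j] for j in range(r)]
    return [(1 - b) * p[j] if j == i else b / (r - 1) + (1 - b) * p[j] for j in range(r)]
```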
Depending on the values of a and b in the above relations, three modes can be considered. If a and b are equal, the learning automaton is called linear reward-penalty (L_RP). When b is equal to zero, the learning automaton is called linear reward-inaction (L_RI). If b is much smaller than a (b ≪ a), the learning automaton is called linear reward-epsilon-penalty (L_RεP). Nonlinear updating methods have also been introduced (Lakshmivarahan, 2012; Lakshmivarahan and Thathachar, 1972; Lakshmivarahan and Thathachar, 1973), but they show no improvement over the linear methods.

4.1. Learning automata with a variable action set

A learning automaton normally has a fixed number of actions; however, in some applications there is a need for automata with a variable number of actions (Thathachar and Harita, 1987). At time n, such an automaton selects its action from only one non-empty subset V(n) of its actions, called the set of active actions. The selection of the set V(n) is done randomly by an external agent. The automaton works as follows. To select an action at time n, it first computes K(n), the sum of the probabilities of its active actions, and then computes the scaled vector according to (4–3). The automaton then randomly selects an action from its active action set according to this scaled vector and applies it to the environment. If the selected action is α_i, then after receiving the environmental response the automaton updates the scaled probability vector according to (4–4) if it receives a reward and according to (4–5) if it receives a penalty (in the P-model environment) (Lakshmivarahan and Thathachar, 1973).
$$\hat{p}_i(n) = \text{prob}\big[\alpha(n) = \alpha_i \mid V(n) \text{ is the set of active actions},\ \alpha_i \in V(n)\big] = \frac{p_i(n)}{K(n)}$$   (4–3)
A. The desirable response from the environment
$$\hat{p}_i(n+1) = \hat{p}_i(n) + a\,[1 - \hat{p}_i(n)], \quad \alpha(n) = \alpha_i; \qquad \hat{p}_j(n+1) = (1 - a)\,\hat{p}_j(n), \quad \forall j \neq i$$   (4–4)
B. The undesirable response from the environment
$$\hat{p}_i(n+1) = (1 - b)\,\hat{p}_i(n), \quad \alpha(n) = \alpha_i; \qquad \hat{p}_j(n+1) = \frac{b}{r-1} + (1 - b)\,\hat{p}_j(n), \quad \forall j \neq i$$   (4–5)
Then, the automaton maps the updated scaled probabilities back to the full probability vector p(n + 1) as follows:
$$p_j(n+1) = \hat{p}_j(n+1)\,K(n) \quad \text{for all } j,\ \alpha_j \in V(n); \qquad p_j(n+1) = p_j(n) \quad \text{for all } j,\ \alpha_j \notin V(n)$$   (4–6)
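The following sketch (our own illustration) puts (4–3) to (4–6) together; the active set V(n) is passed in as a list of action indices, and r in (4–5) is taken as the number of active actions so that the scaled probabilities still sum to one.

```python
import random

class VariableActionLA:
    """Learning automaton with a variable action set, following Eqs. (4-3) to (4-6)."""

    def __init__(self, num_actions, a=0.01, b=0.01):
        self.a, self.b = a, b
        self.p = [1.0 / num_actions] * num_actions   # full probability vector p(n)

    def choose(self, active):
        """Scale the probabilities over the active set V(n) (Eq. 4-3) and sample an action."""
        K = sum(self.p[j] for j in active)
        p_hat = {j: self.p[j] / K for j in active}
        chosen = random.choices(list(p_hat), weights=list(p_hat.values()), k=1)[0]
        return chosen, p_hat, K

    def update(self, chosen, p_hat, K, active, reward):
        """Update the scaled vector by Eq. (4-4) or (4-5), then rescale it back with Eq. (4-6)."""
        r = len(active)                              # r is taken as |V(n)| here
        for j in active:
            if reward:                               # Eq. (4-4)
                p_hat[j] = (p_hat[j] + self.a * (1 - p_hat[j]) if j == chosen
                            else (1 - self.a) * p_hat[j])
            else:                                    # Eq. (4-5)
                p_hat[j] = ((1 - self.b) * p_hat[j] if j == chosen
                            else self.b / (r - 1) + (1 - self.b) * p_hat[j])
        for j in active:                             # Eq. (4-6): inactive actions keep p_j(n)
            self.p[j] = p_hat[j] * K
```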
5. Resource selection algorithm

Considering the challenging features of the Grid, including the dynamic structure of the computational Grid, the high heterogeneity of resources and jobs, the existence of local schedulers and the large
scale of the Grid system, heuristic solutions are undoubtedly the best way to solve the resource selection problem in the Grid system. The features of these solutions include (Singh, Chhabra, and GNDU, 2015; Xhafa and Abraham, 2010):

1. Heuristic solutions are well understood.
2. There is no need for optimal solutions.
3. They are effective heuristics in a short time.
4. They deal with the multi-objective nature of the problem.
5. They are appropriate for batch and periodic scheduling.
6. They are appropriate for decentralized solutions.
7. They can be hybridized with other approaches.
8. They support the design of robust Grid schedulers.
9. Libraries and frameworks for meta-heuristics are available.
Learning automata and their hybrid models can be considered a suitable model for solving the above problem because of the following features:

1. Learning automata are able to adapt themselves to environmental changes. This feature makes them very suitable for Grid environments with a high degree of dynamism (Che, Li, and Lin, 1998; Howell, Frost, Gordon, and Wu, 1997).
2. In addition to very low computational requirements, learning automata impose only a small communication cost when interacting with the environment. This distinguishes learning automata as a suitable alternative to other models in environments with energy and bandwidth constraints (Asnaashari and Meybodi, 2007; El-Osery, Baird, and Abd-Almageed, 2005; Thathachar and Narendra, 1989).
3. By interacting with each other, learning automata can model the distributed nature of Grid environments and, thanks to their learning ability and adaptability, simulate the changing behavioral patterns of the nodes in relation to each other and to the environment (Atlasis, Saltouros, and Vasilakos, 1998; Beigy and Meybodi, 2006; Billard and Lakshmivarahan, 1998; Economides and Silvester, 1988; Economides, 1997; Mason and Gu, 1986; Srikantakumar and Narendra, 1982).
4. By interacting with each other, learning automata can converge to the globally optimal answer based only on local decisions when solving optimization problems. Therefore, learning automata-based algorithms are an appropriate choice for the Grid, as they avoid the lag that results from aggregating or disseminating information in centralized algorithms (Beigy, 2004; Beigy and Meybodi, 2004; Hariri, Rastegar, Navi, Zamani, and Meybodi, 2005; Rastegar and Meybodi, 2004; Rastegar, Arasteh, Hariri, and Meybodi, 2004; Rastegar, Meybodi, and Hariri, 2006).
5. Learning automata gather the information they need for decision-making from the environment in which they are located, in an iterative process and over time. Accordingly, learning automata-based algorithms tolerate possible errors, and their occurrence affects the algorithm's performance less than it does other algorithms (Billard and Lakshmivarahan, 1998).
5.1. The proposed algorithm

As mentioned in Section 2, available resources are first searched and a list of resources is provided in the resource discovery phase. Then, the best resource is selected in the resource selection phase, based on quality of service terms, from the list created in the previous step. Hence, in the proposed algorithm, the decision is made over the resources extracted in the resource discovery phase, and the selected resource is provided to the execution phase after the proposed algorithm has run.

We use learning automata to select resources: formulas 4–4 to 4–6 are used for reward, penalty and updating the automaton, with a = b = 0.01, and r equal to the number of actions (resources). The data structures required to run the algorithm are as follows:

1. Resource Available Predicted Time: RAPT is a 1 × j array, where j is the number of resources available for executing the specified job. In other words, RAPT[1][i] is the predicted time at which the ith resource executes (or completes) the job.
2. Resource Available Probability list: RAP is a 1 × j array, where j is the number of resources available to execute the specified job. In other words, RAP[1][i] is the probability of selecting the ith resource for executing or completing the specified job.

The algorithm uses the job times predicted by the resources as well as the list of available resources extracted in the resource discovery phase. Two parameters are utilized as the predicted job time: the predicted time to execute the job and the predicted time to complete the job. The flowchart of the proposed algorithm is shown in Fig. 2. In this algorithm, the resources extracted in the resource discovery phase are first listed in the RAPT array, so that each element of the array contains the predicted time of the given job on the corresponding resource. Then, the RAP array is initialized based on formula 4–3. If no resource is extracted or discovered in the resource discovery phase, the RAPT array has length zero, the given job is blocked, and the execution of the algorithm is terminated. Otherwise, one of the resources is selected according to the RAP array and stored in the variable R. If RAPT[R] is smaller than the times of the other resources in the RAPT array, the selected resource is rewarded and provided to the execution phase as the selected resource, the other resources are penalized, and the RAP array is updated according to formulas 4–4 and 4–6; otherwise, if the selected resource does not have a smaller time than the other resources in RAPT, the selected resource is penalized, the other resources are rewarded, the RAP array is updated according to formulas 4–5 and 4–6, and the algorithm is run again from the resource selection stage over the RAP array.

The pseudo code of the proposed algorithm is shown in Fig. 3. It contains a procedure that performs the resource selection phase. In the main body of the algorithm, the resources extracted in the resource discovery phase (if any) are stored in the RAPT array; the RAP array, containing the selection probabilities of the discovered resources, is initialized according to formula 4–3; and the Resource_Selection procedure is then called with the two inputs RAPT and RAP. As described in the previous paragraphs, if no resource is extracted in the resource discovery phase, the RAPT array is empty, the job is blocked, and the algorithm stops running. However, if a resource is extracted in the discovery phase and the length of the RAPT array is not zero, then one of the resources is selected according to the RAP array. If the selected resource has a smaller time than the other resources in the RAPT array, the selected resource is rewarded, the other resources are penalized, the RAP array is updated according to formulas 4–4 and 4–6, and the selected resource is provided to the execution phase.
However, if the selected resource does not have a smaller time than the other resources in RAPT, the selected resource is penalized, the other resources are rewarded, the RAP array is updated according to formulas 4–5 and 4–6, and the algorithm runs again from the beginning of the FOR loop.
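A compact sketch of this loop, reflecting our reading of the flowchart in Fig. 2 and the pseudo code in Fig. 3 (with a = b = 0.01 as stated above; the iteration cap is a safeguard added in this sketch, not part of the original description):

```python
import random

def select_resource(rapt, a=0.01, b=0.01, max_iters=10_000):
    """rapt[i] is the predicted execution (or completion) time of the job on the
    i-th discovered resource. Returns the index of the selected resource, or
    None when no resource was discovered (the job is blocked)."""
    j = len(rapt)
    if j == 0:
        return None                          # empty RAPT: block the job
    rap = [1.0 / j] * j                      # RAP initialised as in Eq. (4-3)
    best = min(rapt)
    for _ in range(max_iters):               # safeguard cap added in this sketch
        i = random.choices(range(j), weights=rap, k=1)[0]
        if rapt[i] <= best:                  # smallest predicted time: reward it (Eqs. 4-4, 4-6)
            rap = [p + a * (1 - p) if k == i else (1 - a) * p
                   for k, p in enumerate(rap)]
            return i                         # hand this resource to the execution phase
        rap = [(1 - b) * p if k == i else b / (j - 1) + (1 - b) * p   # penalise it (Eqs. 4-5, 4-6)
               for k, p in enumerate(rap)]
    return min(range(j), key=rapt.__getitem__)
```

Passing the predicted execution times gives the execution-time variant of the algorithm, and passing the predicted completion times gives the completion-time variant.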
Fig. 2. The flowchart of the proposed algorithm.
6. Results

To evaluate the proposed algorithm, a computational Grid environment is simulated comprising three Grid systems (small scale, medium scale and large scale). The small scale Grid system consists of 16 nodes, 128 processing elements and 50 users, and the total number of jobs provided by the Grid users is 1000; the medium scale Grid system includes 32 nodes, 256 processing elements and 100 users, and the total number of jobs provided by the Grid users is 2000; and the large scale Grid system includes 128 nodes, 1024 processing elements and 400 users, and the total number of jobs provided by the Grid users is 8000. It is also assumed that the nodes have the same number of processing elements in all systems. The computational capacity of the processing elements is obtained from a Gaussian probability distribution with a mean of 1000 MIPS and a variance of 150. The nominal bandwidth of the link connecting two nodes is 100 Mbps. The execution time is generated by a normal distribution with a mean of 500 MI and a variance of 100. Each job consists of a number of tasks: each job is divided into k tasks, where k is randomly chosen from the set U{1,2,3,4}. For each user, the new job production rate follows a Poisson distribution with an average rate chosen from {5,10,15,20}. To improve the accuracy of the reported results, each test is repeated 50 times independently and the average results are presented. The simulation parameters are summarized in Table 1. Two different implementations of the proposed algorithm are evaluated: LA-RS-E, which uses the predicted time to execute jobs on each resource, and LA-RS-C, which uses the predicted time to complete jobs on each resource.
Table 1. Simulation parameters.

| Simulation parameter | Description | Value |
| --- | --- | --- |
| Number of grid nodes | Small scale / Medium scale / Large scale | 16 / 32 / 128 |
| Number of processors | Small scale / Medium scale / Large scale | 128 / 256 / 1024 |
| Number of users | Small scale / Medium scale / Large scale | 50 / 100 / 400 |
| Total number of tasks | Small scale / Medium scale / Large scale | 1000 / 2000 / 8000 |
| Number of tasks of each job (k) | Uniform distribution | U{1,2,3,4} |
| The job production rate | Poisson distribution | P{5,10,15,20} |
| Nominal bandwidth | — | 100 Mbps |
| Execution time | Normal distribution | N(500,100) MI |
| Processor computing capacity | Normal distribution | N(1000,150) MIPS |
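For illustration only, the following sketch generates a synthetic workload in the spirit of Table 1 (it is our own helper, independent of the GridSim toolkit; the second argument of random.gauss is used as the spread, so adjust it if the table's second parameter is meant strictly as a variance):

```python
import random

def make_workload(nodes=16, pe_per_node=8, users=50, jobs_total=1000,
                  rate_choices=(5, 10, 15, 20)):
    """Synthetic small-scale setup by default (16 nodes x 8 PEs = 128 processing elements)."""
    # processing-element capacities drawn around 1000 MIPS
    capacities = [max(1.0, random.gauss(1000, 150)) for _ in range(nodes * pe_per_node)]
    jobs = []
    for _ in range(jobs_total):
        k = random.choice([1, 2, 3, 4])                                    # tasks per job ~ U{1,2,3,4}
        jobs.append([max(1.0, random.gauss(500, 100)) for _ in range(k)])  # task sizes in MI
    rates = [random.choice(rate_choices) for _ in range(users)]            # Poisson rate per user
    return capacities, jobs, rates
```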
To demonstrate the efficiency of the proposed algorithm, the obtained results are compared with the MET (Wang et al., 2010), MCT (Wang et al., 2010) and MCRS (Nandagopal and Uthariaraj, 2011) algorithms, in terms of the average turn-around time, average response time and throughput. A random selection algorithm, denoted RANDOM, is also implemented. The algorithms are simulated using GridSim, a Java-based discrete-event Grid simulation toolkit. This toolkit provides facilities for modeling and simulating Grid resources and Grid users with different capabilities and configurations. Simulating the resource selection mechanism in the GridSim environment requires the modeling and creation of GridSim resources and of applications that model jobs (Sulistio, Cibej, Venugopal, Robic, and Buyya, 2008).
Fig. 3. The pseudo code of the proposed algorithm.
6.1. Average turn-around time and average response time

Reducing the turn-around time is one of the most important and general optimization goals; the turn-around time can be defined as the time interval between accepting a job and completing it. This time includes the execution time and the time spent waiting for resources. Another important optimization criterion is the average response time. The response time is the interval between submitting a job and the start of the response reception. A process usually begins to produce output while the job is still being processed; hence, from the user's point of view this criterion matters more than the turn-around time, and reducing and minimizing it reduces the average response time of the Grid system. Fig. 4 shows the average turn-around time and average response time of the different algorithms under different Grid scales. In the RANDOM algorithm, resource selection is performed randomly and no additional information such as job execution time or job completion time is used. Therefore, this algorithm has poor results compared to the other algorithms, and this inefficiency becomes more apparent as the Grid scale increases. For instance, the differences between the RANDOM and LA-RS-C algorithms in the small, medium and large scales are 173, 181 and 450 units, respectively, in terms of the average turn-around time, and 153, 175 and 390 units, respectively, in terms of the average response time. According to Fig. 4(c), the difference between the two criteria of the average turn-around time and the average response time
increases in the RANDOM algorithm as the Grid scale grows. Also, only in the large scale Grid is this difference larger than that of all the other algorithms. The MET and LA-RS-E algorithms use the predicted time to execute jobs on the resources and outperform the RANDOM algorithm. However, they are less efficient than the other algorithms because they do not take waiting and job transfer times into account. In terms of the average turn-around time, the LA-RS-E algorithm outperforms the MET algorithm in the small and large scales by 18 and 40 units, respectively, whereas the MET algorithm is better in the medium scale by 10 units. In terms of the average response time, the LA-RS-E algorithm outperforms the MET algorithm in the small and large scale Grids by 11 and 28 units, respectively, whereas the MET algorithm is better in the medium scale by 15 units. According to Fig. 4(c), the difference between the two criteria of the average turn-around time and average response time is greater in the MET algorithm than in the LA-RS-E algorithm at all scales and grows with the Grid scale. Also, this difference in the MET algorithm is smaller than in the other algorithms in the small scale Grid, and in the LA-RS-E algorithm this difference is smaller than in the other algorithms in the medium and large scale Grids. The MCRS algorithm uses parameters such as processing power, workload and network bandwidth to select resources and does not consider parameters such as execution time and completion time; nevertheless, it outperforms the RANDOM, MET and LA-RS-E algorithms but has higher average turn-around and
Fig. 4. (a) Average turn-around time for different algorithms under different Grid scales, (b) Average response time for different algorithms under different Grid scales. (c) Turn-around time and average response time for different algorithms under different Grid scales.
response time than the MCT and LA-RS-C algorithms. For example, in terms of the average turn-around time in the large scale Grid, the MCRS algorithm outperforms the RANDOM, MET and LA-RS-E algorithms by 400, 70 and 30 units, respectively; however, it is worse than the MCT and LA-RS-C algorithms by 37 and 50 units, respectively. The difference between the two criteria of the average turn-around time and average response time increases with the Grid scale in the MCRS algorithm. Also, this difference is larger than that of all the other algorithms in the medium scale Grid.
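For reference, the two criteria discussed above, together with the throughput used in Section 6.2, can be computed from per-job timestamps as in the following sketch (the field names are our own assumptions):

```python
def metrics(jobs, makespan):
    """jobs: list of dicts with 'submit', 'first_output' and 'finish' timestamps;
    makespan: total simulated time. Returns the three evaluation criteria."""
    n = len(jobs)
    avg_turnaround = sum(j["finish"] - j["submit"] for j in jobs) / n
    avg_response = sum(j["first_output"] - j["submit"] for j in jobs) / n
    throughput = n / makespan              # jobs processed per unit time
    return avg_turnaround, avg_response, throughput
```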
Fig. 5. Throughput for different algorithms under different Grid scales.
Due to the use of the completion time prediction parameter, the MCT and LA-RS-C algorithms outperform the other algorithms. However, the LA-RS-C algorithm shows a relative improvement over the MCT algorithm. For example, the differences between the MCT and LA-RS-C algorithms in the small, medium and large scales are 3, 5 and 13 units, respectively, in terms of the average turn-around time, and 0, 3 and 11 units, respectively, in terms of the average response time. The difference between the two criteria of the average turn-around time and average response time increases with the Grid scale in both algorithms. Also, in the LA-RS-C algorithm, this difference is smaller than in all the other algorithms only in the small scale Grid.

6.2. Throughput

Throughput is another important criterion that indicates the number of jobs processed per unit time. The simulation results are presented in Fig. 5. As mentioned earlier, information such as execution time and completion time is not used in the RANDOM algorithm, and resource selection is performed randomly. Therefore, it has poor results compared to the other algorithms. For instance, the differences between the RANDOM and LA-RS-C algorithms are 240, 1000 and 4600 units in the small, medium and large scales, respectively. According to the results, it is clear that this difference increases and the efficiency of the RANDOM algorithm decreases as the Grid scale increases. Due to the use of the predicted time to execute jobs, the MET and LA-RS-E algorithms outperform the RANDOM algorithm. However, as noted in the previous sections, they are not as efficient as the other algorithms because they ignore waiting and job transfer times. The LA-RS-E algorithm outperforms the MET algorithm in the medium and large scales by 400 and 1100 units, respectively, whereas the MET algorithm is better in the small scale Grid by 50 units. The MCRS algorithm outperforms the RANDOM, MET and LA-RS-E algorithms and has a lower throughput than the MCT and LA-RS-C algorithms. For example, the MCRS algorithm outperforms the RANDOM, MET and LA-RS-E algorithms by 3700, 1300 and 200 units, respectively, in the large scale Grid, and is worse than the MCT and LA-RS-C algorithms by 700 and 900 units, respectively, in the large scale Grid. These results are obtained because the MCRS algorithm uses parameters such as processing power, workload and network bandwidth to select resources and does not consider parameters such as execution time and completion time.
The MCT and LA-RS-C algorithms outperform the other algorithms due to the use of the completion time prediction parameter. As is clear from the results, the MCT algorithm outperforms the LA-RS-C algorithm by 100 units of difference in the medium scale Grid, and the LA-RS-C algorithm outperforms by 20 and 200 units of difference in the small and large scales, respectively.

7. Conclusion

The present study proposed a learning automata-based algorithm to solve the resource selection problem in computational Grids. The job times predicted by the resources are used in this algorithm. Two different implementations of the algorithm are performed based on the predicted time: the predicted time to execute jobs (LA-RS-E) and the predicted time to complete jobs (LA-RS-C). The data structures used in this algorithm are the job times predicted by the available resources extracted in the resource discovery phase (RAPT) and the list of selection probabilities of the available resources (RAP). Decision-making in this algorithm is based on the list of resources extracted in the resource discovery phase, and the selected resource is sent to the next stage, namely the execution phase, after being selected based on the shortest predicted time. To demonstrate the efficiency of the proposed algorithm, several simulations have been performed on three Grid scales: small, medium and large. Finally, the results of the proposed algorithm were compared with the results of the RANDOM, MET, MCT and MCRS algorithms. According to the obtained results, the proposed algorithm outperforms the above-mentioned algorithms in terms of average turn-around time, average response time and throughput.

Credit authorship contribution statement

Alireza Enami: Methodology, Software, Investigation, Resources, Writing - original draft. Javad Akbari Torkestani: Conceptualization, Methodology, Writing - review & editing. Abbas Karimi: Validation, Resources.

References

Abba, H. A., Zakaria, N. B., & Haron, N. (2012). Grid resource allocation: a review. Research Journal of Information Technology, 4, 38–55. Amalarethinam, D. G., & Muthulakshmi, P. (2011). An overview of the scheduling policies and algorithms in grid computing. International Journal of Research and Reviews in Computer Science, 2, 280–294. Amiri, E., Keshavarz, H., Ohshima, N., & Komaki, S. (2014). Resource allocation in grid: a review. Procedia-Social and Behavioral Sciences, 129, 436–440.
Argungu, S, M., A, S., & Omar, M, H. (2015). Survey on job scheduling mechanism in grid environment. ARPN Journal of Engineering and Applied Sciences, 10, 6654–6661. Asnaashari, M., & Meybodi, M. R. (2007). Irregular cellular learning automata and its application to clustering in sensor networks. In Proceedings of 15th Conference on Electrical Engineering (15th ICEE), Volume on Communication, Telecommunication Research Center. Atlasis, A. F., Saltouros, M. P., & Vasilakos, A. V. (1998). On the use of a stochastic estimator learning algorithm to the ATM routing problem: a methodology. Computer Communications, 21, 538–546. Batista, D. M., & da Fonseca, N. L. (2007). A brief survey on resource allocation in service oriented grids. In Globecom Workshops, 2007 IEEE (pp. 1–5). IEEE. Bawa, R. K., & Sharma, G. (2012). Reliable resource selection in grid environment arXiv:1204.1516. Beigy, H. (2004). Intelligent channel assignment in cellular networks: A learning automata approach. Beigy, H., & Meybodi, M. R. (2004). A mathematical framework for cellular learning automata. Advances in Complex Systems, 7, 295–319. Beigy, H., & Meybodi, M. R. (2006). Utilizing distributed learning automata to solve stochastic shortest path problems. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 14, 591–615. Billard, E., & Lakshmivarahan, S. (1998). Simulation of period-doubling behavior in distributed learning automata. In Proceedings of the 1998 ACM symposium on Applied Computing (pp. 690–695). ACM. Che, H., Li, S.-q., & Lin, A. (1998). Adaptive resource management for flow-based IP/ATM hybrid switching systems. IEEE/ACM Transactions on Networking (TON), 6, 544–557. Dakkak, O., Arif, S., & Nor, S. A. (2015). Resource allocation mechanism in computational grid: a Survey. ARPN Journal of Engineering and Applied Sciences, 10, 6662–6667. Dong, F. (2007). A taxonomy of task scheduling algorithms in the Grid. Parallel processing letters, 17, 439–454. Doulamis, N. D., Kokkinos, P., & Varvarigos, E. (2014). Resource selection for tasks with time requirements using spectral clustering. IEEE Transactions on Computers, 63, 461–474. Economides, A., & Silvester, J. (1988). Optimal routing in a network with unreliable links. In Proc. of IEEE Computer Networking Symposium (pp. 288–297). Economides, A. A. (1997). Real-time traffic allocation using learning automata. In Systems, Man, and Cybernetics, 1997. Computational Cybernetics and Simulation., 1997 IEEE International Conference on: 4 (pp. 3307–3312). IEEE. El-Osery, A. I., Baird, D., & Abd-Almageed, W. (2005). A learning automata based power management for ad-hoc networks. In Systems, Man and Cybernetics, 2005 IEEE International Conference on: 4 (pp. 3569–3573). IEEE. Ferreira, L., Berstis, V., Armstrong, J., Kendzierski, M., Neukoetter, A., & Takagi, M. (2003). Introduction to grid computing with globus. Ibm Redbooks, 9, 1–37. Foster, I., & Kesselman, C. (2003). The grid 2: blueprint for a new computing infrastructure. San Francisco: Elsevier. Foster, I., Kesselman, C., & Tuecke, S. (2001). The anatomy of the grid: enabling scalable virtual organizations. The International Journal of High Performance Computing Applications, 15, 200–222. Haque, A., Alhashmi, S. M., & Parthiban, R. (2011). A survey of economic models in grid computing. Future Generation Computer Systems, 27, 1056–1069. Hariri, A., Rastegar, R., Navi, K., Zamani, M. S., & Meybodi, M.
R. (2005). Cellular learning automata based evolutionary computing (CLA-EC) for intrinsic hardware evolution. In Evolvable Hardware, 2005. Proceedings. 2005 NASA/DoD Conference on (pp. 294–297). IEEE. Howell, M. N., Frost, G. P., Gordon, T. J., & Wu, Q. H. (1997). Continuous action reinforcement learning applied to vehicle suspension control. Mechatronics, 7, 263–276. Ismail, L. (2007). Dynamic resource allocation mechanisms for grid computing environment. In Testbeds and Research Infrastructure for the Development of Networks and Communities, 2007. TridentCom 2007. 3rd International Conference on (pp. 1–5). IEEE. Kandagatla, C. (2003). Survey and taxonomy of grid resource management systems. Austin: University of Texas. Kanzariya, D., & Patel, S. (2013). Survey on resource allocation in grid. International Journal of Engineering and Innovative Technology (IJEIT), 2, 129–133. Kavitha, G., & Sankaranarayanan, V. (2011). Resource selection in computational grid based on User QoS and trust. IJCSNS International Journal of Computer Science and Network Security, 11, 214–221. Kavitha, G., & Sankaranarayanan, V. (2013). A guaranteed service resource selection framework for computational grids. International Journal of Grid and Distributed Computing, 6, 29–42. Krauter, K., Buyya, R., & Maheswaran, M. (2002a). A taxonomy and survey of grid resource management systems. Technical Report: University of Manitoba (TR-2000/18) and Monash University (TR-2000/80). Krauter, K., Buyya, R., & Maheswaran, M. (2002b). A taxonomy and survey of grid resource management systems for distributed computing. Software: Practice and Experience, 32, 135–164.
Krawczyk, S., & Bubendorfer, K. (2008). Grid resource allocation: allocation mechanisms and utilisation patterns. In Proceedings of the sixth Australasian workshop on Grid computing and e-research: 82 (pp. 73–81). Australian Computer Society, Inc. Kumari, S., & Kumar, G. (2015). Survey on Job scheduling algorithms in grid computing. International Journal of Computer Applications, 115, 17–20. Lakshmivarahan, S. (2012). Learning algorithms theory and Applications: theory and applications. New York: Springer Science & Business Media. Lakshmivarahan, S., & Thathachar, M. (1972). Optimal non-linear reinforcement schemes for stochastic automata. Information Sciences, 4(2), 121–128. Lakshmivarahan, S., & Thathachar, M. (1973). Absolutely expedient learning algorithms for stochastic automata. In IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans (pp. 281–286). Liu, G‘., Xu, Y., & Ma, S. (2007). A new resource selection approach based on reputation driven Min-min algorithm in the grid economy. In Proceedings of the 2007 Asian technology information program’s (ATIP’s) 3rd workshop on High performance computing in China: Solution approaches to impediments for high performance computing (pp. 160–167). ACM. Mason, L., & Gu, X. (1986). Learning automata models for adaptive flow control in packet-switching networks. In Adaptive and learning systems (pp. 213–227). Boston: Springer. Nabrzyski, J., Schopf, J. M., & Weglarz, J. (2004). Grid resource management: state of the art and future trends: 64. New York: Springer Science & Business Media. Nagariya, S., & Mishra, M. (2013). Resource scheduling in grid computing: a Survey. Global Journal of Computer Science and Technology, 13(2), 9–12. Nandagopal, M., & Uthariaraj, R. (2011). Performance analysis of resource selection algorithms in grid computing environment. Journal of Computer Science, 7, 493. Navimipour, N. J., Rahmani, A. M., Navin, A. H., & Hosseinzadeh, M. (2014). Resource discovery mechanisms in grid systems: a survey. Journal of Network and Computer Applications, 41, 389–410. Qureshi, M. B., Dehnavi, M. M., Min-Allah, N., Qureshi, M. S., Hussain, H., Rentifis, I., et al. (2014). Survey on grid resource allocation mechanisms. Journal of Grid Computing, 12, 399–441. Rastegar, R., Arasteh, A., Hariri, A., & Meybodi, M. R. (2004). A fuzzy clustering algorithm using cellular learning automata based evolutionary algorithm. In Hybrid Intelligent Systems, 2004. HIS’04. Fourth International Conference on (pp. 310–314). IEEE. Rastegar, R., & Meybodi, M. (2004). A new evolutionary computing model based on cellular learning automata. In Cybernetics and Intelligent Systems, 2004 IEEE Conference on: 1 (pp. 433–438). IEEE. Rastegar, R., Meybodi, M. R., & Hariri, A. (2006). A new fine-grained evolutionary algorithm based on cellular learning automata. International Journal of Hybrid Intelligent Systems, 3, 83–98. Sangwan, S., & Khurana, D. (2012). Resource allocation in grid computing. International Journal of Computer Science and Management Studies, 12, 127–130. Singh, K., Chhabra, A., & GNDU, A. (2015). A survey of evolutionary heuristic algorithm for job scheduling in grid computing. International Journal of Computer Science and Mobile Computing, 4, 611–616. Sood, D., Kour, H., & Kumar, S. (2016). Survey of computing technologies: distributed, utility, cluster, grid and cloud computing. Journal of Network Communications and Emerging Technologies (JNCET) www. jncet. org, 6, 99– 102. Srikantakumar, P., & Narendra, K. (1982). 
A learning model for routing in telephone networks. SIAM Journal on Control and Optimization, 20, 34–57. Sulistio, A., Cibej, U., Venugopal, S., Robic, B., & Buyya, R. (2008). A toolkit for modelling and simulating data Grids: an extension to GridSim. Concurrency and computation: Practice and experience, 20, 1591–1609. Thathachar, M., & Harita, B. R. (1987). Learning automata with changing number of actions. IEEE Transactions on systems, man, and cybernetics, 17, 1095–1100. Thathachar, M., & Narendra, K. (1989). Learning Automata: an introduction. Toporkov, V., Toporkova, A., Bobchenkov, A., & Yemelyanov, D. (2011). Resource selection algorithms for economic scheduling in distributed systems. Procedia Computer Science, 4, 2267–2276. Vickrey, W. (1961). Counterspeculation, auctions, and competitive sealed tenders. The Journal of finance, 16, 8–37. Vijayakumar, V., & WahidhaBanu, R. (2008). Trust and reputation aware security for resource selection in grid computing. In Security Technology, 2008. SECTECH’08. International Conference on (pp. 121–124). IEEE. Wang, C.-M., Chen, H.-M., Hsu, C.-C., & Lee, J. (2010). Dynamic resource selection heuristics for a non-reserved bidding-based Grid environment. Future Generation Computer Systems, 26, 183–197. Wu, T., Ye, N., & Zhang, D. (2005). Comparison of distributed methods for resource allocation. International Journal of Production Research, 43, 515–536. Xhafa, F., & Abraham, A. (2010). Computational models and heuristic methods for Grid scheduling problems. Future Generation Computer Systems, 26, 608– 621. Yousif, A., Abdullah, A. H., Latiff, M. S. A., & Bashir, M. B. (2011). A Taxonomy of Grid Resource Selection Mechanisms. International Journal of Grid and Distributed Computing, 4, 107–117.