Journal of Network and Computer Applications 35 (2012) 2028–2036
A distributed resource discovery algorithm for P2P grids
Javad Akbari Torkestani
Department of Computer Engineering, Arak Branch, Islamic Azad University, Arak, Iran
Article info
Article history: Received 11 April 2012; received in revised form 3 August 2012; accepted 3 August 2012; available online 10 August 2012
Keywords: Grid; P2P grid; Resource discovery; Resource allocation; Learning automata
Abstract
The centralized or hierarchical administration used by classical grid resource discovery approaches cannot efficiently manage highly dynamic, large-scale grid environments. A peer-to-peer (P2P) overlay offers a dynamic, scalable, and decentralized alternative for grids. Structured P2P methods do not fully support multi-attribute range queries, and unstructured P2P resource discovery methods suffer from the network-wide broadcast storm problem. In this paper, a decentralized learning automata-based resource discovery algorithm is proposed for large-scale P2P grids. The proposed method supports multi-attribute range queries and forwards the resource queries through the shortest path ending at the grid peers that are more likely to have the requested resource. Several simulation experiments are conducted to show the efficiency of the proposed algorithm. Numerical results reveal the superiority of the proposed model over the other methods in terms of the average hop count, average hit ratio, and control message overhead. © 2012 Elsevier Ltd. All rights reserved.
1. Introduction
Grid systems interconnect a collection of heterogeneous and autonomous systems from multiple, geographically distributed administrative domains to make possible the sharing of existing resources. Grid is a broad concept that has often been identified with the parallel systems of the 1970s, the large-scale cluster systems of the 1980s, and the distributed systems of the 1990s. Grids therefore inherit much from the traditional computing models. However, they have distinguishing characteristics such as large scale, heterogeneity and diversity, autonomy, dynamicity and open-endedness, and task complexity (Yu et al., 2005). Grid systems are mostly based on a centralized or hierarchical administration (Deng et al., 2009). Traditional centralized or hierarchical grid architectures are unable to effectively manage large-scale, heterogeneous, and highly dynamic grid resources (Deng et al., 2009). The characteristics of the P2P architecture (e.g., distributed administration and large scale) enable it to cope with the dynamicity, scalability, and availability problems of the grids. In P2P networks, unlike the traditional client-server models, each peer can act simultaneously as a client and as a server. Depending on how the peers are organized and on the communication protocols, P2P systems are mainly subdivided into structured and unstructured classes. The former uses a rigid structure to interconnect the peers, while the latter lets the peers join or leave at random.
1084-8045/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jnca.2012.08.001
Furthermore, hybrid approaches have also been proposed to keep the benefits and to overcome the drawbacks of both classes (Trunfio et al., 2007). File sharing, real-time data streaming, and cycle stealing are some well-known representative services provided by P2P networks (Deng et al., 2009). Generally, grid and P2P systems are both resource sharing environments with different advantages. Integrating the grid system with the philosophy and techniques of the P2P architecture is a promising approach, called P2P grid, to alleviate the disadvantages of the traditional grid systems (Merz and Gorunova, 2007; Kocak and Lacks, 2012; Deng et al., 2009). The main objective of resource sharing systems is to pool together the software and hardware resources from multiple administrative domains and to provide a huge collection of available resources that can be assigned to user applications. Therefore, resource management is an integral part of these systems, and resource discovery is a key service that locates the system resources across a large-scale distributed system (Trunfio et al., 2007). Classical approaches to the grid resource discovery problem are generally based on centralized or hierarchical architectures. This may shorten the average response time of local requests. However, centralized and hierarchical architectures make the grid system inefficient and susceptible to failure as the grid size grows rapidly in distributed environments (Trunfio et al., 2007; Kocak and Lacks, 2012). Centralized and hierarchical resource discovery approaches suffer from a single point of failure, performance bottlenecks in highly dynamic systems, and a lack of scalability in large-scale distributed systems (Deng et al., 2009). Therefore, designing efficient resource discovery algorithms is a crucial problem of great importance. P2P
overlay technology emerged as a scalable alternative to the traditional grid systems, offering several advantages over the centralized approaches (Trunfio et al., 2007; Merz and Gorunova, 2007). P2P grid exploits the synergy between the grid system and the P2P network to efficiently manage the grid resources and services in large-scale distributed environments. The literature argues that grid and P2P systems will eventually converge (Trunfio et al., 2007). Resource location and discovery in P2P grids has attracted research interest in recent years, and several studies have been conducted. Following the division of P2P systems into structured and unstructured architectures, P2P grid resource discovery approaches are likewise grouped into structured and unstructured. The unstructured underlying P2P architectures proposed in the literature for grid resource discovery are generally classified as flat P2P networks (Iamnitchi and Foster, 2003; Talia and Trunfio, 2005), tree-based overlays (Marzolla et al., 2005), and cluster-based networks (Mastroianni et al., 2005a, 2005b; Puppin et al., 2005). The following briefly reviews some well-known architectures. A fully decentralized P2P architecture was proposed by Iamnitchi and Foster (2003) for resource discovery in grid systems. In this architecture, the resource discovery process is divided into four parts: membership protocol, overlay construction, preprocessing, and request processing. Resource information is stored in one or more peers. A user sends its request to the local peer. The peer checks its table to see if it has a matching resource description. If so, it responds to the user. Otherwise, it forwards the request to another peer. This process repeats until the requested resource is found or the TTL of the request expires. Talia and Trunfio (2005) proposed a P2P architecture for resource discovery in OGSA-compliant grids. The proposed model has two layers.
The lower layer is composed of a number of hierarchical index services, and the upper one is a P2P layer including peer services and contact services. Each index service publishes the resource information of its own virtual organization. Peer services are responsible for resource discovery, and contact services for organizing the peer services in a P2P network. Mastroianni et al. (2005a, 2005b) designed a regional resource information service for P2P grids based on the super-peer concept. This model comprises two types of peer: super-peer and regular peer. Each super-peer is associated with a number of local regular peers. Super-peers are connected by an overlay P2P network. A regular peer sends its request to its local super-peer. The super-peer returns the response if it finds a local peer providing the requested service. Otherwise, it forwards the request to its neighboring super-peers. Another super-peer based resource discovery scheme was proposed by Puppin et al. (2005). In this scheme, grid nodes are partitioned into clusters, each having one or more super-peers. This model includes two main components: agent and aggregator. Each aggregator plays the role of a super-peer responsible for data collection, query processing and forwarding, and information indexing. A P2P network connects the neighboring super-peers. In each cluster, the agent publishes the information of the provided resources. A tree structure-based grid resource discovery approach was proposed by Marzolla et al. (2005). Structured P2P grid resource discovery approaches are generally based on a distributed indexing service known as the hashing technique. To maintain the rigid structure, structured P2P systems require a self-organization mechanism that imposes a heavy burden on the systems (Trunfio et al., 2007). Cai et al.
(2003) proposed a multi-attribute addressable network for grid information services called MAAN, which is an extension of the structured P2P Chord system (Stoica et al., 2001). Andrzejak and Xu (2002) designed scalable, efficient range queries for grid information services based on an extension of the distributed
hash table (DHT) CAN system (Ratnasamy et al., 2001). In DHT-based systems, to locate a resource (or to map the file attribute to the network address), the resource is initially associated with an ID (key) by using a hash table. Then, a lookup function calculates the value of the key (or where the resource is stored). A decentralized single-dimensional DHT-based information discovery technique supporting multi-attribute queries was proposed by Schmidt and Parashar (2003). In this method, each resource having multiple attributes is mapped to the node whose ID is obtained by interleaving the binary representations of the attribute values. Ratnasamy et al. (2003) presented a load distribution approach based on a uniform hash function. Besides the underlying DHT, a binary tree structured overlay is also used to allow efficient range query resolution. Each resource is registered only at the leaf node whose range contains its attribute value. Spence et al. (2003) extended the Pastry (Rowstron and Druschel, 2001) indexing and routing system. This model allows multidimensional search by preparing a separate Pastry ring for each resource attribute. Key values are managed in a tree-like structure whose leaves are the nodes. Non-leaf nodes summarize the range of values of their children. To find the responsible node, the key value of the tree-like structure is mapped into the Pastry ring structure. Merz and Gorunova (2007) proposed a hybrid fault-tolerant resource discovery mechanism for P2P grid environments combining efficient Chord-like spanning tree algorithms and robust epidemic algorithms (Eugster et al., 2004). Deng et al. (2009) designed an ACO (Ant Colony Optimization) based resource discovery algorithm for large-scale P2P grid systems. The proposed ACO-based algorithm avoids the notorious global flooding problem by sending the packets along the routes that are frequently travelled by the ants. This considerably reduces the network load.
Moreover, the proposed method supports multi-attribute range queries. To improve the grid performance, this algorithm can use multiple ants searching for the resources in parallel. Kocak and Lacks (2012) proposed a resource discovery protocol in which the network routers are in charge of the resource discovery process. In this protocol, besides the routing table, each router is equipped with a resource table. The resource table maps the IP addresses to the available computing resource values. Discovery packets are encapsulated within the TCP/IP packets and look up the resource tables to find the requested resources. As mentioned earlier, unstructured P2P resource discovery methods suffer from the network-wide broadcast storm problem. Flooding the resource queries makes the unstructured approaches inappropriate for current large-scale grid systems. On the other hand, structured methods do not perform well in highly dynamic networks or with multi-attribute range queries. In this paper, a decentralized resource discovery algorithm is proposed for large-scale P2P grids to relieve the problems of the previous methods. Taking advantage of learning automata, the proposed resource discovery algorithm finds the shortest path (the path with the minimum hop count) connecting the user to the resource providing peer. In this method, the communication link that each peer uses to route the query toward the resource provider is selected at random by its automaton. If the route selected at a given stage is shorter than the average length of the routes selected so far, the algorithm rewards the selected route; otherwise, the route is penalized. Therefore, as the proposed algorithm proceeds, it converges to the route with the minimum expected length. The proposed algorithm supports the high dynamicity of scalable P2P grids, where the peers frequently and unpredictably join, leave, and rejoin the system.
To show the performance of the proposed resource discovery algorithm, several simulation experiments are conducted under several grid scenarios. The results of the proposed algorithm are compared with those of
KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009). Simulation results show that the proposed algorithm outperforms the other methods in terms of the average hop count, average hit ratio, and control message overhead. The rest of the paper is organized as follows. In the next section, learning automata theory is briefly reviewed. In Section 3, a learning automata-based algorithm is proposed for resource discovery in P2P grid environments. In Section 4, the performance of the proposed algorithm is evaluated through simulation experiments, and Section 5 concludes the paper.
2. Learning automata theory
A learning automaton (Narendra and Thathachar, 1989; Thathachar and Harita, 1987) is an adaptive decision-making unit that improves its performance by learning how to choose the optimal action from a finite set of allowed actions through repeated interactions with a random environment. The action is chosen at random based on a probability distribution kept over the action-set, and at each instant the chosen action serves as the input to the random environment. The environment in turn responds to the taken action with a reinforcement signal. The action probability vector is updated based on the reinforcement feedback from the environment. The objective of a learning automaton is to find the optimal action from the action-set so that the average penalty received from the environment is minimized. Learning automata have been found to be useful in systems where incomplete information about the environment exists. They have also been shown to perform well in complex, dynamic, and random environments with a large amount of uncertainty. Learning automata have a wide variety of applications in combinatorial optimization problems (Akbari Torkestani, 2012, 2012j; Akbari Torkestani and Meybodi, 2012), computer and communication networks (Akbari Torkestani and Meybodi, 2011a, 2011b; Akbari Torkestani, 2012h, 2012b, 2012f, 2012g, 2012e), grid computing (Akbari Torkestani, 2012a), and Web engineering (Akbari Torkestani, 2012c, 2012d, 2012k). Fig. 1 shows the relationship between the learning automaton and the random environment. The environment can be described by a triple $\langle \underline{\alpha}, \underline{\beta}, \underline{c} \rangle$, where $\underline{\alpha} = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ represents the finite set of inputs, $\underline{\beta} = \{\beta_1, \beta_2, \ldots, \beta_m\}$ denotes the set of values that can be taken by the reinforcement signal, and $\underline{c} = \{c_1, c_2, \ldots, c_r\}$ denotes the set of penalty probabilities, where element $c_i$ is associated with action $\alpha_i$.
If the penalty probabilities are constant, the random environment is said to be a stationary random environment, and if they vary with time, the environment is called a non-stationary environment. Depending on the nature of the reinforcement signal $\underline{\beta}$, environments can be classified into P-model, Q-model, and S-model. Environments in which the reinforcement signal can take only the two binary values 0 and 1 are referred to as P-model environments. In another class of environments, the reinforcement signal can take a finite number of values in the interval [0,1]; such an environment is referred to as a Q-model environment. In S-model environments, the reinforcement signal lies in the interval [a,b].

Fig. 1. The relationship between the learning automaton and its random environment. (The learning automaton applies action $\alpha$ to the random environment, which returns the reinforcement signal $\beta$.)

Learning automata can be classified into two main families (Narendra and Thathachar, 1989): fixed structure learning automata and variable structure learning automata. Variable structure learning automata are represented by a triple $\langle \underline{\beta}, \underline{\alpha}, T \rangle$, where $\underline{\beta}$ is the set of inputs, $\underline{\alpha}$ is the set of actions, and $T$ is the learning algorithm. The learning algorithm is a recurrence relation that is used to modify the action probability vector. Let $\alpha_i(k) \in \underline{\alpha}$ and $\underline{p}(k)$ denote the action selected by the learning automaton and the probability vector defined over the action set at instant $k$, respectively. Let $a$ and $b$ denote the reward and penalty parameters, which determine the amount of increase and decrease of the action probabilities, respectively. Let $r$ be the number of actions that can be taken by the learning automaton. At each instant $k$, the action probability vector $\underline{p}(k)$ is updated by the linear learning algorithm given in Eq. (1) if the selected action $\alpha_i(k)$ is rewarded by the random environment, and it is updated as given in Eq. (2) if the taken action is penalized.

$$p_j(k+1) = \begin{cases} p_j(k) + a\,[1 - p_j(k)] & j = i \\ (1-a)\,p_j(k) & \forall j \neq i \end{cases} \qquad (1)$$

$$p_j(k+1) = \begin{cases} (1-b)\,p_j(k) & j = i \\ \dfrac{b}{r-1} + (1-b)\,p_j(k) & \forall j \neq i \end{cases} \qquad (2)$$
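The two linear update rules can be illustrated with a minimal Python sketch. It is only an illustration of Eqs. (1) and (2) as stated above; the function names `reward` and `penalize` and the list representation of the probability vector are assumptions made here, not part of the original paper.

```python
from typing import List


def reward(p: List[float], i: int, a: float) -> List[float]:
    """Eq. (1): increase the probability of the rewarded action i,
    and shrink every other probability by the factor (1 - a)."""
    return [pj + a * (1.0 - pj) if j == i else (1.0 - a) * pj
            for j, pj in enumerate(p)]


def penalize(p: List[float], i: int, b: float) -> List[float]:
    """Eq. (2): shrink the probability of the penalized action i and
    give each of the other r - 1 actions an extra share b / (r - 1).
    Requires at least two actions (r >= 2)."""
    r = len(p)
    return [(1.0 - b) * pj if j == i else b / (r - 1) + (1.0 - b) * pj
            for j, pj in enumerate(p)]


# Four equiprobable actions; action 0 is rewarded or penalized once.
p = [0.25, 0.25, 0.25, 0.25]
p_rewarded = reward(p, 0, a=0.1)
p_penalized = penalize(p, 0, b=0.1)
```

Both rules keep the vector normalized: the mass added to (or taken from) action i is exactly the mass removed from (or distributed over) the remaining actions.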
If $a = b$, the recurrence Eqs. (1) and (2) are called the linear reward-penalty ($L_{R-P}$) algorithm; if $a \gg b$, the given equations are called linear reward-$\epsilon$ penalty ($L_{R-\epsilon P}$); and finally, if $b = 0$, they are called linear reward-inaction ($L_{R-I}$). In the latter case, the action probability vector remains unchanged when the taken action is penalized by the environment. A variable action-set learning automaton is an automaton in which the number of actions available at each instant changes with time. It has been shown in Thathachar and Harita (1987) that a learning automaton with a changing number of actions is absolutely expedient and also $\epsilon$-optimal when the reinforcement scheme is $L_{R-I}$. Such an automaton has a finite set of $r$ actions, $\underline{\alpha} = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$. $A = \{A_1, A_2, \ldots, A_m\}$ denotes the set of action subsets, and $A(k) \subseteq \underline{\alpha}$ is the subset of all the actions that can be chosen by the learning automaton at each instant $k$. The selection of the particular action subset is made randomly by an external agency according to the probability distribution $C(k) = \{C_1(k), C_2(k), \ldots, C_m(k)\}$ defined over the possible subsets of the actions, where $C_i(k) = \mathrm{prob}[A(k) = A_i \mid A_i \in A,\ 1 \le i \le 2^r - 1]$. $\hat{p}_i(k) = \mathrm{prob}[\alpha(k) = \alpha_i \mid A(k),\ \alpha_i \in A(k)]$ denotes the probability of choosing action $\alpha_i$, conditioned on the event that the action subset $A(k)$ has already been selected and $\alpha_i \in A(k)$. The scaled probability $\hat{p}_i(k)$ is defined as

$$\hat{p}_i(k) = \frac{p_i(k)}{K(k)} \qquad (3)$$

where $K(k) = \sum_{\alpha_i \in A(k)} p_i(k)$ is the sum of the probabilities of the actions in subset $A(k)$, and $p_i(k) = \mathrm{prob}[\alpha(k) = \alpha_i]$. The procedure of choosing an action and updating the action probabilities in a variable action-set learning automaton can be described as follows. Let $A(k)$ be the action subset selected at instant $k$. Before choosing an action, the probabilities of all the actions in the selected subset are scaled as defined in Eq. (3). The automaton then randomly selects one of its possible actions according to the scaled action probability vector $\hat{p}(k)$. Depending on the response received from the environment, the learning automaton updates its scaled action probability vector. Note that only the probabilities of the available actions are updated. Finally, the probability vector of the actions of the chosen subset is rescaled as $p_i(k+1) = \hat{p}_i(k+1) \cdot K(k)$, for all $\alpha_i \in A(k)$. The absolute expediency and $\epsilon$-optimality of the method described above have been proved in Thathachar and Harita (1987).
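The scale, update, and rescale cycle of a variable action-set automaton can be sketched as follows. This is a minimal illustration of Eq. (3) and the rescaling step under the $L_{R-I}$ scheme; the dictionary representation and the helper names `scale`, `reward_scaled`, and `rescale` are assumptions introduced here for illustration.

```python
from typing import Dict, Set, Tuple


def scale(p: Dict[str, float], active: Set[str]) -> Tuple[Dict[str, float], float]:
    """Eq. (3): divide the probabilities of the active actions by their sum K(k)."""
    K = sum(p[x] for x in active)
    return {x: p[x] / K for x in active}, K


def reward_scaled(p_hat: Dict[str, float], chosen: str, a: float) -> Dict[str, float]:
    """Reward step of Eq. (1), applied to the scaled vector only."""
    return {x: v + a * (1.0 - v) if x == chosen else (1.0 - a) * v
            for x, v in p_hat.items()}


def rescale(p: Dict[str, float], p_hat_new: Dict[str, float], K: float) -> Dict[str, float]:
    """Write the updated scaled probabilities back: p_i(k+1) = p_hat_i(k+1) * K(k).
    Actions outside the active subset keep their old probabilities."""
    out = dict(p)
    for x, v in p_hat_new.items():
        out[x] = v * K
    return out


# Three actions, of which only n1 and n2 are available at this instant.
p = {"n1": 0.5, "n2": 0.3, "n3": 0.2}
p_hat, K = scale(p, {"n1", "n2"})            # K = 0.8, p_hat = {n1: 0.625, n2: 0.375}
p_hat_new = reward_scaled(p_hat, "n2", a=0.2)  # n2 was chosen and rewarded
p_new = rescale(p, p_hat_new, K)
```

In this example the unavailable action n3 keeps its probability of 0.2, and the remaining mass K(k) = 0.8 is redistributed between n1 and n2.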
3. Resource discovery algorithm
Resource management is one of the key design issues in grid systems. Classic grid resource discovery methods are generally administered in a hierarchical or centralized manner, while grid environments are highly dynamic, large scale, and naturally distributed. A peer-to-peer overlay network is a distributed, dynamic, and scalable approach to connecting the grid nodes. Existing P2P resource discovery approaches are generally classified into unstructured and structured. The former suffers from the network-wide broadcast storm problem, and the latter does not fully support multi-attribute range queries. The aim of this paper is to design a learning automata-based resource discovery algorithm for P2P grids to cope with the problems of the previous structured and unstructured methods. Let graph $G\langle P, L \rangle$ denote the topology graph of the P2P network, where $P = \{p_1, p_2, \ldots, p_n\}$ denotes the set of peers, and $L \subseteq P \times P$ denotes the set of communication links connecting the peers. Let $R = \{r_1, r_2, \ldots, r_m\}$ denote the set of available resource types. In this method, a network of learning automata $A = \{A_1, A_2, \ldots, A_n\}$ isomorphic to the P2P network graph $G\langle P, L \rangle$ is formed by assigning a learning automaton $A_i$ to each peer $p_i$. Each learning automaton comprises $m$ action-sets (and action probability vectors), where $m$ denotes the number of available resource types. The collection of action-sets of automaton $A_i$ is denoted as $\underline{\alpha}_i = \{\underline{\alpha}_{ik} \mid 1 \le k \le m\}$. Let $\underline{\alpha}_{ik} \in \underline{\alpha}_i$ denote the action-set of the $k$th action probability vector of learning automaton $A_i$ assigned to peer $p_i$. The action-set $\underline{\alpha}_{ik}$ includes an action $\alpha_{ik}^j$ for each neighboring peer $p_j$ of peer $p_i$. Let $D_i$ denote the number of peers that are directly connected to (are neighbors of) peer $p_i$. Hence, for each action probability vector $1 \le k \le m$, action-set $\underline{\alpha}_{ik} = \{\alpha_{ik}^j \mid \forall (p_i, p_j) \in L\}$ includes $D_i$ different actions.
Selection of action $\alpha_{ik}^j$ means that automaton $A_i$ selects the connection $(p_i, p_j)$ to forward the query message for resource $r_k$. Let $p_{ik}^j \in \underline{p}_{ik}$ denote the probability that automaton $A_i$ chooses action $\alpha_{ik}^j$ (or communication link $(p_i, p_j)$) to locate the peer providing resource $r_k$. For each action-set $\underline{\alpha}_{ik}$, all actions (communication links) are initially chosen with the same probability $1/D_i$. This is because the learning automaton has no a priori knowledge of the resource location, so at first it selects the links impartially at random.
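The resulting data structure, one probability vector per peer and per resource type with uniform initial values 1/D_i, might be represented as in the following sketch. The nested-dictionary layout and the name `init_automata` are illustrative assumptions; the paper does not prescribe a concrete representation.

```python
from typing import Dict, List


def init_automata(neighbors: Dict[str, List[str]],
                  resource_types: List[str]) -> Dict[str, Dict[str, Dict[str, float]]]:
    """Build probs[peer][resource][neighbor] with the uniform value 1 / D_i,
    where D_i is the number of neighbors of the peer.
    A peer without neighbors simply gets empty action-sets."""
    return {
        peer: {rk: {nb: 1.0 / len(nbs) for nb in nbs} for rk in resource_types}
        for peer, nbs in neighbors.items()
    }


# Hypothetical three-peer overlay with two resource types.
neighbors = {"p1": ["p2", "p3"], "p2": ["p1"], "p3": ["p1"]}
probs = init_automata(neighbors, ["cpu", "disk"])
```

Here `probs["p1"]["cpu"]` holds the probabilities with which peer p1 forwards a CPU query to p2 or p3, initially 0.5 each.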
A P2P grid is a highly dynamic, scalable environment where the peers frequently and unpredictably enter, depart, and rejoin the system. Under such circumstances, the topology of the P2P network, and consequently that of the isomorphic network of learning automata, changes frequently. The action probability vectors must be updated upon a topological change. When a peer $p_j$ joins the P2P system, it sends an ERQ (Enter ReQuest) message to all its neighboring peers. The ERQ message includes the sender ID, resource information, and neighbor IDs. Upon receiving the ERQ message, each neighboring peer $p_i$ calls procedure ERQ($p_i$) shown in Fig. 2. In this procedure, each neighboring peer $p_i$ checks the information of the resources provided by the newly joined peer. If the newly arrived peer brings one or more new resource types into the system, each neighboring peer $p_i$ creates a new action-set for every new resource type. In this case, since all peers must be aware of the new resource types, the ERQ message is flooded within the network. To do so, each peer resends the received ERQ message to its neighboring peers until the TTL of the ERQ message expires. Regardless of whether the new peer provides a new resource type, each neighboring peer $p_i$ must update all its action-sets and action probability vectors by adding a new action $\alpha_{ik}^j$ for each resource $k$, as shown in Lines 04–10 of Fig. 2. When a peer $p_j$ decides to leave the P2P grid system, it sends a DRQ (Departure ReQuest) message to all its neighboring peers. Each neighboring peer $p_i$ calls procedure DRQ($p_i$) shown in Fig. 3 as soon as it receives a DRQ message. Neighboring peer $p_i$ removes the action corresponding to the leaving peer $p_j$ from all of its action-sets. To do this, the choice probabilities of all remaining actions must be increased in proportion to the choice probability of the removed action (see Lines 04–09 of Fig. 3).
For each resource $r_k \in R$ provided by leaving peer $p_j$, action-set $\underline{\alpha}_{ik}$ of all learning automata $A_i \in A$ must be removed if no other peer can provide such a resource. In this case, each peer that receives the DRQ message resends it to all its neighboring peers. In the proposed method, the action-sets play the role of the resource and routing tables that are used in the other approaches. The aim of procedures ERQ and DRQ is to keep the routing and resource information of the grid system up to date. Sending a DRQ message is not mandatory to remove a peer from the grid system. The action-sets of the learning automata are updated by removing the resources that are provided by the leaving peer as soon as one routing query fails to reach the leaving peer. The first neighboring peer that cannot connect to the leaving peer generates the DRQ message. In this way, the proposed resource discovery mechanism tolerates peer failures.
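The effect of a join or a departure on a single action probability vector can be sketched as below. The pseudo code of Figs. 2 and 3 is not reproduced in this text, so both functions are assumptions: `remove_action` renormalizes the remaining probabilities so that each grows in proportion to its current value, which is one reading of the rule stated above, and `add_action` assumes (hypothetically) that a new neighbor receives probability 1/(D+1) while the existing probabilities are scaled by D/(D+1).

```python
from typing import Dict


def remove_action(p: Dict[str, float], leaving: str) -> Dict[str, float]:
    """Drop the action of a departed neighbor and renormalize the rest.
    ASSUMPTION: proportional renormalization; Fig. 3 may differ in detail."""
    rest = {x: v for x, v in p.items() if x != leaving}
    total = sum(rest.values())
    if not rest:
        return {}
    if total == 0.0:
        # Degenerate case: all mass was on the removed action.
        return {x: 1.0 / len(rest) for x in rest}
    return {x: v / total for x, v in rest.items()}


def add_action(p: Dict[str, float], joining: str) -> Dict[str, float]:
    """Add an action for a newly joined neighbor.
    ASSUMPTION: the new action gets 1/(D+1) and the D existing
    probabilities are scaled by D/(D+1); Fig. 2 may differ in detail."""
    d = len(p)
    out = {x: v * d / (d + 1) for x, v in p.items()}
    out[joining] = 1.0 / (d + 1)
    return out


# Peer p1 forwards queries for one resource type to p2, p3, or p4.
p = {"p2": 0.5, "p3": 0.3, "p4": 0.2}
after_leave = remove_action(p, "p4")
after_join = add_action({"p2": 0.5, "p3": 0.5}, "p5")
```

In a full implementation the same update would be applied to every action probability vector of the neighboring peer, one per resource type.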
Fig. 2. Pseudo code of procedure ERQ (enter request).
Fig. 3. Pseudo code of procedure DRQ (departure request).
Fig. 4. Pseudo code of procedure RSQ (resource query).
When a user asks the P2P grid system for a resource of type $r_k$, its resource query is initially submitted to its local peer $p_i$ as an RSQ (ReSource Query) message. The RSQ message includes the source peer $p_d$, the travelled path $G_k^s$, the receiver ID rec_id, the path length $L_k^s$, and the dynamic validity threshold $v_k^s$. Source peer $p_d$ is the peer to which the resource query is submitted. Travelled path $G_k^s$ is a stack structure comprising the IDs of the peers traversed at stage $s$ to locate resource $r_k$. At each stage $s$, each activated peer must append its ID to the travelled path $G_k^s$ by a 'PUSH' operation ($G_k^s \leftarrow G_k^s + ID_{p_i}$). Path length $L_k^s$ is defined as the number of peers traversed at stage $s$ to locate the requested resource. Dynamic validity threshold $v_k^s$ denotes the average number of peers traversed during the $s-1$ earlier stages to find the resource location. Validity threshold $v_k^s$ is initially (i.e., for the first stage) set to the number of peers. In the RSQ message that is received at source $p_d$, stack $G_k^s$ is empty, receiver ID rec_id is the ID of $p_d$, and path length $L_k^s$ is set to zero.
Upon receiving an RSQ message at each stage $s$, each peer $p_i$ calls procedure RSQ($p_i$, $s$, $k$) shown in Fig. 4. Procedure RSQ($p_i$, $s$, $k$) is run locally at each peer $p_i$ for locating resource $r_k$ at stage $s$. In this procedure, source peer $p_d$ first checks its resource table, stored as the action-sets, to see if resource $r_k$ is provided by the P2P grid system. If that is not the case (i.e., if $\underline{\alpha}_{ik} \notin \underline{\alpha}_i$), peer $p_d$ returns an error message stating that the requested resource is not available and terminates the procedure. Otherwise, every peer $p_i$ increments path length $L_k^s$ by one and checks its available resources to see if it is able to provide the requested resource itself. If so, peer $p_i$ (which is hereafter called the resource providing peer and denoted as $p_{d'}$) returns its location by an RLC (Resource LoCation) message to the source peer to which the user submitted its resource request. The RLC message is composed of five parts: the resource providing peer $p_{d'}$, the receiver ID rec_id, the path $G_k^s$ including the peers travelled from the source peer to the resource providing peer in stack order, the updated validity threshold $v_k^{s+1}$, and the reinforcement signal $\beta_k^s$.
Reinforcement signal $\beta_k^s$ is used to update the internal state of the activated learning automata based on the optimality of the path travelled to locate the resource. To set the reinforcement signal $\beta_k^s$, resource providing peer $p_{d'}$ compares path length $L_k^s$ with validity threshold $v_k^s$. If $L_k^s \le v_k^s$, then $\beta_k^s$ is set to zero and all learning automata corresponding to the peers included in $G_k^s$ are rewarded. Otherwise, it is set to one and all these learning automata are penalized. At each stage $s$, the validity threshold is updated as the running average of the path lengths found so far:

$$v_k^{s+1} = \frac{(s-1)\,v_k^s + L_k^s}{s} \qquad (4)$$

If, on the other hand, peer $p_i$ cannot provide the requested resource, it activates its corresponding automaton $A_i$. Learning automaton $A_i$ updates action-set $\underline{\alpha}_{ik}$ and action probability vector $\underline{p}_{ik}$ by temporarily disabling the actions corresponding to the peers selected so far (included in $G_k^s$), as described earlier in Section 2 and procedure DRQ. This avoids loops and repeated peers in $G_k^s$. Then, learning automaton $A_i$ randomly chooses one of its possible actions from $\underline{\alpha}_{ik}$ based on $\underline{p}_{ik}$, if any. If there is no action left in action-set $\underline{\alpha}_{ik}$, travelled path $G_k^s$ is traced back to find a peer with a non-empty action-set. This is done by sending a TRB (TRacing Back) message to the peer appended to stack $G_k^s$ before the current peer. This peer resumes the resource discovery process and chooses one of its possible actions from its non-empty action-set $\underline{\alpha}_{ik}$. Let us assume that automaton $A_i$ chooses action $\alpha_{ik}^j$. This implies that peer $p_j$ is the next peer to which the task of resource location is entrusted. The selected action is temporarily removed from the action-set $\underline{\alpha}_{ik}$. Peer $p_i$ sends an RSQ message to peer $p_j$ through communication link $(p_i, p_j)$. This process continues until the resource providing peer $p_{d'}$ is found. As mentioned earlier, resource providing peer $p_{d'}$ sends the location of the requested resource to the user along traversed path $G_k^s$ by an RLC message.
To do so, the resource providing peer $p_{d'}$ extracts the peer appended to stack $G_k^s$ before itself (e.g., peer $p_i$) and sends an RLC message to it. Upon receiving an RLC message, each peer $p_i$ calls procedure RLC shown in Fig. 5. In this procedure, the reinforcement signal $\beta_k^s$ is first checked, and the internal state of automaton $A_i$ is updated by applying Eq. (1) to $\underline{p}_{ik}$ if $\beta_k^s$ is zero and Eq. (2) otherwise. After updating the action probability vector, the action-set must be restored by enabling the disabled actions. Then, peer $p_i$ extracts the peer appended to stack $G_k^s$ before itself (e.g., $p_j$) and sends an RLC message to it (see Lines 07–09 of Fig. 5). This procedure repeats until the RLC message is received at source peer $p_d$. When the RLC message is received at source peer $p_d$, the resource discovery process is over and source peer $p_d$ can be connected to resource providing peer $p_{d'}$ through $G_k^s$. Upon receiving a TRB message, peer $p_i$ calls procedure TRB($p_i$, $s$, $k$). In this procedure, learning automaton $A_i$ checks its action-set to see if it is empty. If so, peer $p_i$ decrements the path length $L_k^s$ by one and sends a TRB message to the peer $p_j$ that was added to $G_k^s$ before peer $p_i$. This process is repeated until a peer with a non-empty action-set is found. In this case, the learning automaton corresponding to the found peer selects one of its actions at random according to $\underline{p}_{ik}$ and resumes the resource discovery process by sending an RSQ message to the selected peer (see Lines 07–09 of Fig. 6).
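The interplay of query forwarding, tracing back, and the reward or penalty decision can be illustrated with a centralized Python sketch for a single resource type. This is a simplified simulation under several assumptions, not the distributed message-passing procedures of Figs. 4 to 6: the five-peer graph, the function names, and the parameter values are hypothetical, the path length is taken as the number of peers on the final path, and Eqs. (1) and (2) are applied to the full probability vector rather than to the scaled one.

```python
import random
from typing import Dict, List, Optional, Set

Graph = Dict[str, List[str]]
Probs = Dict[str, Dict[str, float]]  # probs[peer][neighbor], one resource type


def pick(p: Dict[str, float], allowed: List[str], rng: random.Random) -> str:
    """Sample one enabled action; using the raw probabilities as weights
    is equivalent to sampling from the scaled vector of Eq. (3)."""
    weights = [p[x] for x in allowed]
    if sum(weights) <= 0.0:
        return rng.choice(allowed)
    return rng.choices(allowed, weights=weights)[0]


def find_route(graph: Graph, probs: Probs, providers: Set[str],
               source: str, rng: random.Random) -> Optional[List[str]]:
    """Forward the query (RSQ) from the source; trace back (TRB) on dead ends."""
    path = [source]                               # travelled path, used as a stack
    tried: Dict[str, Set[str]] = {source: set()}  # actions temporarily removed
    while path:
        peer = path[-1]
        if peer in providers:
            return path
        allowed = [n for n in graph[peer]
                   if n not in path and n not in tried[peer]]
        if not allowed:
            path.pop()                            # no action left: trace back
            continue
        nxt = pick(probs[peer], allowed, rng)
        tried[peer].add(nxt)
        tried[nxt] = set()
        path.append(nxt)
    return None


def update(p: Dict[str, float], chosen: str, rewarded: bool,
           a: float, b: float) -> None:
    """Apply Eq. (1) on reward and Eq. (2) on penalty, in place."""
    r = len(p)
    for x in list(p):
        if rewarded:
            p[x] = p[x] + a * (1 - p[x]) if x == chosen else (1 - a) * p[x]
        elif r > 1:
            p[x] = (1 - b) * p[x] if x == chosen else b / (r - 1) + (1 - b) * p[x]


# Hypothetical overlay: peer E provides the resource, queries start at A.
graph: Graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D", "E"],
                "D": ["B", "C", "E"], "E": ["C", "D"]}
providers = {"E"}
probs: Probs = {u: {v: 1.0 / len(nb) for v in nb} for u, nb in graph.items()}
rng = random.Random(7)
threshold = float(len(graph))   # initial validity threshold: number of peers
paths: List[List[str]] = []
for stage in range(1, 201):
    path = find_route(graph, probs, providers, "A", rng)
    if path is None:
        break
    rewarded = len(path) <= threshold           # reinforcement signal is 0
    for peer, nxt in zip(path, path[1:]):       # RLC: update along the path
        update(probs[peer], nxt, rewarded, a=0.1, b=0.0)
    threshold = ((stage - 1) * threshold + len(path)) / stage  # Eq. (4)
    paths.append(path)
```

With b = 0 the sketch uses the reward-inaction scheme, so only paths that are no longer than the running average change the probabilities. After enough stages, `probs["A"]` can be inspected to see which first hop the source peer has come to prefer.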
4. Experimental results
In this section, several simulation experiments are performed to investigate the efficiency of the proposed resource discovery algorithm, called LARD (short for Learning Automata-based Resource Discovery), under three grid sizes: small, medium, and large scale P2P grids. The small scale P2P grid system is composed of 256 peers and 1024 resources of 4 different resource types (each resource type having four classes). The medium scale P2P grid system is composed of 2048 peers and 8192 resources. The large scale P2P grid system is composed of 16,384 peers and 65,536 resources. In real scenarios, large scale P2P grids may include tens of
Fig. 5. Pseudo code of procedure RLC (resource location).
Fig. 6. The pseudo code of procedure TRB (tracing back).
thousands of peers or even more. However, in this paper, large scale grid systems are composed of 16,384 peers. Resources are of 4 different types: CPU, memory, disk, and operating system. CPU, memory, and disk can each have one of four capacities: low, moderate, high, and very high. The operating system can also be of four different types on different machines. Therefore, grid resources are of 4 different types and 16 classes. Resources are evenly and randomly distributed among the peers. 1024, 8192, and 65,536 resource queries are submitted to randomly chosen peers of the small, medium, and large scale systems, respectively. Queries are for different resource types selected at random.

P2P network topologies are generated as follows. For small, medium, and large scale systems, peers are randomly and evenly distributed within a square simulation area of size 250 × 250, 1000 × 1000, and 4000 × 4000 units, respectively. Neighboring peers are connected if the Euclidean distance between them is less than or equal to 20, 40, and 80 units for small, medium, and large scale P2P grid systems, respectively. The nominal bandwidth of the link connecting every two peers is assumed to be 10 Mbps. To improve the precision of the reported results, each experiment is independently repeated 50 times and the obtained results are averaged over these runs.

The performance of the proposed resource discovery algorithm is compared with that of KL (a resource discovery method proposed by Kocak and Lacks (2012) in which the network routers are responsible for locating the grid resources) and DWC (an ant colony-based resource discovery algorithm proposed by Deng et al. (2009)) in terms of the following metrics of interest.
Hop count: This metric is defined as the average number of peers that are traversed to locate the requested resource. Hop count is affected by the network routing mechanism, resource distribution, and prior knowledge of the resource location.
Hit ratio: This metric is defined as the percentage of successful resource discoveries. A resource discovery is successful if at least one resource providing peer can be found for the requested resource before the TTL expires.
Control message overhead: This metric is defined as the number of (extra) control messages required for the resource discovery process. It is measured as the number of control messages that must be sent per second.
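The topology construction and two of the metrics described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the simulator used for the reported results: the function names and the (found, hops) record format are hypothetical, the hit ratio is returned as a fraction rather than a percentage, and averaging the hop count over successful queries only is an assumption.

```python
import math
import random


def generate_topology(num_peers, side, radius, rng):
    """Place peers uniformly at random in a side x side square and
    connect every pair whose Euclidean distance is at most `radius`."""
    points = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(num_peers)]
    neighbors = {i: set() for i in range(num_peers)}
    for i in range(num_peers):
        for j in range(i + 1, num_peers):
            if math.dist(points[i], points[j]) <= radius:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return points, neighbors


def summarize(outcomes):
    """Compute (average hop count, hit ratio) from (found, hops) records.

    Assumption: hop count is averaged over successful queries only.
    """
    hits = [hops for found, hops in outcomes if found]
    hit_ratio = len(hits) / len(outcomes) if outcomes else 0.0
    avg_hops = sum(hits) / len(hits) if hits else float("nan")
    return avg_hops, hit_ratio


# Small scale setting from the text: 256 peers, 250 x 250 area, radius 20.
points, neighbors = generate_topology(256, 250, 20, random.Random(1))

# Hypothetical query outcomes, only to show the record format.
avg_hops, hit_ratio = summarize([(True, 4), (True, 6), (False, 0)])
```

The pairwise distance check is quadratic in the number of peers, which is acceptable for the small scale setting; a spatial grid index would be a natural replacement for the large scale one.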
In our experiments, the learning algorithm is $L_{R-P}$ with equal reward and penalty parameters (the learning rate). The effectiveness of the proposed algorithm depends directly on the choice of a proper learning rate. By choosing the learning rate properly, a trade-off can be made between the cost of the algorithm (control message overhead) and the solution optimality (hit ratio and hop count). Depending on the nature of the application, different learning rates can be chosen. If an application sacrifices cost in favor of solution optimality, a small learning rate is preferred; otherwise, a larger one can be chosen. Several experiments were initially conducted to determine the best value of the learning rate. To this end, different learning rates ranging from 0.05 to 0.5 were tested on small, medium, and large scale P2P grids. The best results were achieved when the learning rate was set to 0.075, 0.080, and 0.090 for small, medium, and large scale systems, respectively. Therefore, the learning rate is set to these values for the corresponding P2P grid scales in the remaining experiments.
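As an illustration of how the learning rate enters the update, the following Python sketch applies a linear reward-penalty step with a single shared parameter. It assumes that Eqs. (1) and (2) take the standard $L_{R-P}$ form from the learning automata literature; the function name and the four-action example are hypothetical.

```python
def lrp_update(probs, chosen, beta, rate):
    """One linear reward-penalty step with equal reward and penalty parameters.

    probs  : current action probability vector (sums to 1)
    chosen : index of the action that was selected
    beta   : reinforcement signal, 0 = reward, 1 = penalty
    rate   : learning rate shared by the reward and penalty updates
    """
    r = len(probs)
    if r < 2:
        return list(probs)  # a single action always keeps probability 1
    updated = []
    for j, p in enumerate(probs):
        if beta == 0:
            # Reward: move probability mass toward the chosen action.
            updated.append(p + rate * (1 - p) if j == chosen else (1 - rate) * p)
        else:
            # Penalty: move probability mass away from the chosen action.
            updated.append((1 - rate) * p if j == chosen
                           else rate / (r - 1) + (1 - rate) * p)
    return updated


# Four equally likely neighbours, learning rate 0.08 (medium scale setting).
rewarded = lrp_update([0.25] * 4, chosen=0, beta=0, rate=0.08)
penalized = lrp_update([0.25] * 4, chosen=0, beta=1, rate=0.08)
```

With these inputs a reward raises the chosen action from 0.25 to 0.31, while a penalty lowers it to 0.23; in both cases the vector still sums to one. A smaller rate makes each step smaller, so convergence needs more queries but is less sensitive to a single unusually long or short path.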
4.1. Hop count
The aim of this experiment is to show the ability of the different resource discovery algorithms to locate the nearest peer providing the requested resource. Fig. 7 compares the average hop count of the proposed resource discovery algorithm with that of KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009) for different grid scale scenarios. The results in this figure show that the average hop count increases as the system scale (network size) increases. One possible reason is that the resources are distributed over a wider area, so the distance (number of hops) between the user and the resource increases. The results shown in Fig. 7 are averaged over the number of resource queries submitted to the system. Each experiment is repeated 50 times and the results are also averaged over the number of runs. Comparing the results given in Fig. 7, it is clear that the proposed algorithm significantly outperforms the other algorithms in terms of the number of hops: KL (Kocak and Lacks, 2012) lags far behind LARD, and DWC (Deng et al., 2009) has the largest hop count. One reason is that the proposed algorithm avoids cycles and redundant peers in the constructed path. The results also show that the gap between the proposed algorithm and the other methods becomes more significant as the system scale increases. Contrary to KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009), no significant growth can be seen in the number of hops of the proposed algorithm as the network size increases. This is because the proposed algorithm is fully distributed and independent of the network size.

Fig. 7. Average hop count under different grid scale. [Axes: Grid Size (Small Scale, Medium Scale, Large Scale) vs. Hop Count; series: KL, DWC, LARD.]

4.2. Hit ratio
Hit ratio, which represents the rate of successful discoveries, is a very important measure of the effectiveness of a resource discovery algorithm. This set of experiments is performed to investigate the hit ratio of the different algorithms under different grid scales. The obtained results are shown in Fig. 8. From these results, it is clear that the hit ratio of the proposed resource discovery algorithm is higher than that of the other approaches. Comparing the results shown in Fig. 8, we find that the gap between the proposed algorithm and the other methods becomes larger as the network size grows. This shows the higher scalability of LARD. By taking advantage of learning automata, the proposed method is able to memorize the shortest path toward the resource. This path is stored as the probability vectors of the learning automata. When a path is constructed to connect a requesting peer to a resource provider, it can also be used by the intermediate peers for queries for the same resource. Among the different possible paths toward the same resource, LARD converges to the shortest one. That is why LARD selects the more probable paths and has a higher hit ratio.
Fig. 8. Average hit ratio under different grid scale. [Axes: Grid Size (Small Scale, Medium Scale, Large Scale) vs. Hit Ratio; series: KL, DWC, LARD.]
4.3. Control message overhead
These experiments are conducted to measure and compare the control message overhead of the different resource discovery mechanisms. The experimental results are depicted in Fig. 9. Comparing the results shown in this figure, it can be seen that the proposed algorithm LARD has the lowest rate of control message overhead and DWC (Deng et al., 2009) has the highest one. KL (Kocak and Lacks, 2012) encapsulates the resource discovery packets within TCP/IP packets, so it imposes considerably fewer extra control packets on the system than DWC (Deng et al., 2009). As mentioned earlier, the main objective of the proposed algorithm is to alleviate the impact of the network-wide broadcast storm problem (to reduce the number of broadcasts). The proposed algorithm sends the resource query messages only to the peers that have the requested resources with a much higher probability. As the proposed algorithm proceeds, the resource queries are forwarded along the shortest paths connecting the peers with a probability as close to one as possible. This considerably decreases the rate of extra message overhead required for resource discoveries. As shown in Fig. 9, the rate of control message overhead increases as the grid becomes larger. This is expected because the hop count, and hence the number of rebroadcasts, increases as the P2P network size increases.

Fig. 9. Control message overhead under different grid scale. [Axes: Grid Size (Small Scale, Medium Scale, Large Scale) vs. Control Message Overhead; series: KL, DWC, LARD.]

5. Conclusion
In this paper, a decentralized learning automata-based resource discovery algorithm was proposed for large-scale unstructured P2P grids. This algorithm was designed to relieve the negative impacts of the global flooding problem on the network performance and to support multi-attribute range queries. In this method, the resource queries are forwarded through the shortest paths ending at the grid peers more likely to have the requested resources. In the proposed algorithm, each peer is equipped with a learning automaton, and the network of learning automata is responsible for routing the query toward the resource provider through the shortest path. The proposed algorithm supports the high dynamicity of scalable P2P grids. Several simulation experiments were conducted on small, medium, and large scale P2P grid environments to show the performance of the proposed resource discovery algorithm. The obtained results were compared with those of KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009) in terms of average hop count, average hit ratio, and control message overhead. Numerical results showed that the proposed algorithm outperformed KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009) in small, medium, and large scale grids alike. The wider gap between the proposed algorithm and the others in hop count, hit ratio, and message overhead for large scale grid environments shows the higher scalability of the proposed algorithm as compared to KL (Kocak and Lacks, 2012) and DWC (Deng et al., 2009).

References
Akbari Torkestani J. A new approach to the job scheduling problem in computational grids. Cluster Computing, in press, 2012a.
Akbari Torkestani J. LAAP: a learning automata-based adaptive polling scheme for clustered wireless ad-hoc networks. Wireless Personal Communications, in press, 2012b.
Akbari Torkestani J. An adaptive learning automata-based ranking function discovery algorithm. Journal of Intelligent Information Systems, in press, 2012c.
Akbari Torkestani J. An adaptive focused web crawling algorithm based on learning automata. Applied Intelligence, in press, 2012d.
Akbari Torkestani J. Backbone formation in wireless sensor networks. Sensors and Actuators A: Physical, in press, 2012e.
Akbari Torkestani J. Mobility prediction in mobile wireless networks. Journal of Network and Computer Applications 2012f;35:1633–45.
Akbari Torkestani J. A stable virtual backbone for wireless MANETS. Telecommunication Systems Journal, in press, 2012g.
Akbari Torkestani J. An adaptive backbone formation algorithm for wireless sensor networks. Computer Communications 2012h;35:1333–44.
Akbari Torkestani J. Degree constrained minimum spanning tree problem in stochastic graph. Journal of Cybernetics and Systems 2012i;43(1):1–21.
Akbari Torkestani J. An adaptive heuristic to the bounded-diameter minimum spanning tree problem. Soft Computing, in press, 2012j.
Akbari Torkestani J. An adaptive learning to rank algorithm: learning automata approach. Decision Support Systems, in press, 2012k.
Akbari Torkestani J, Meybodi MR. LLACA: an adaptive localized clustering algorithm for wireless ad hoc networks based on learning automata. Journal of Computers & Electrical Engineering 2011a;37:461–74.
Akbari Torkestani J, Meybodi MR. A link stability-based multicast routing protocol for wireless mobile ad hoc networks. Journal of Network and Computer Applications 2011b;34(4):1429–40.
Akbari Torkestani J, Meybodi MR. Finding minimum weight connected dominating set in stochastic graph based on learning automata. Information Sciences 2012;200:57–77.
Andrzejak A, Xu Z. Scalable, efficient range queries for grid information services. In: Proceedings of 2nd international conference on P2P computing, pp. 33–40, 2002.
Cai M, Frank M, Chen J, Szekely P. MAAN: a multi-attribute addressable network for grid information services. In: Proceedings of 4th international workshop on grid computing, pp. 184–191, 2003.
Deng Y, Wang F, Ciura A. Ant colony optimization inspired resource discovery in P2P grid systems. Journal of Supercomputing 2009;49:4–21.
Eugster PT, Guerraoui R, Kermarrec AM, Massoulie L. From epidemics to distributed computing. IEEE Computer 2004;37(5):60–7.
Iamnitchi A, Foster IT. A P2P approach to resource location in grid environments. In: Weglarz J, Nabrzyski J, Schopf J, Stroinski M, editors. Grid resource management. Kluwer; 2003.
Kocak T, Lacks D. Design and analysis of a distributed grid resource discovery protocol. Cluster Computing 2012;15(1):37–52.
Marzolla M, Mordacchini M, Orlando S. Resource discovery in a dynamic grid environment. In: Proceedings of DEXA workshop, pp. 356–360, 2005.
Mastroianni C, Talia D, Verta O. A super-peer model for building resource discovery services in grids: design and simulation analysis. In: Proceedings of European grid conference, LNCS, vol. 3470, pp. 132–143, 2005a.
Mastroianni C, Talia D, Verta O. A super-peer model for resource discovery services in large-scale grids. Future Generation Computer Systems 2005b;21:1235–48.
Merz P, Gorunova K. Fault-tolerant resource discovery in P2P grids. Journal of Grid Computing 2007;5:319–35.
Narendra KS, Thathachar MAL. Learning automata: an introduction. New York: Prentice-Hall; 1989.
Puppin D, Moncelli S, Baraglia R, Tonelotto N, Silvestri F. A grid information service based on P2P. In: Proceedings of 11th Euro-Par conference, LNCS, vol. 3648, pp. 454–464, 2005.
Ratnasamy S, Hellerstein JM, Shenker S. Range queries over DHTs. IRB-TR-03-009, Intel Corporation, 2003.
Ratnasamy S, Francis P, Handley M, Karp RM, Shenker S. A scalable content-addressable network. In: Proceedings of ACM SIGCOMM 2001 conference on applications, technologies, architectures, and protocols for computer communication, pp. 161–172, 2001.
Rowstron A, Druschel P. Pastry: scalable, decentralized object location and routing for large scale P2P systems. In: Proceedings of IFIP/ACM international conference on distributed systems platforms, middleware, LNCS, vol. 2218, pp. 329–350, 2001.
Schmidt C, Parashar M. Flexible information discovery in decentralized distributed systems. In: Proceedings of 12th international symposium on high-performance distributed computing, pp. 226–235, 2003.
Spence D, Harris T. XenoSearch: distributed resource discovery in the XenoServer open platform. In: Proceedings of the 12th IEEE international symposium on high performance distributed computing, pp. 216–225, 2003.
Stoica I, Morris R, Karger DR, Frans Kaashoek M, Balakrishnan H. Chord: a scalable P2P lookup service for internet applications. In: Proceedings of ACM SIGCOMM 2001 conference on applications, technologies, architectures, and protocols for computer communication, pp. 149–160, 2001.
Talia D, Trunfio P. P2P protocols and grid services for resource discovery on grids. In: Grandinetti L, editor. Grid computing: the new frontier of high performance computing, advances in parallel computing, vol. 14. Elsevier Science; 2005. p. 83–105.
Thathachar MAL, Harita BR. Learning automata with changing number of actions. IEEE Transactions on Systems, Man, and Cybernetics 1987;SMC-17:1095–100.
Trunfio P, Talia D, Papadakis H, Fragopoulou P, Mordacchini M, Pennanen M, Popov K, Vlassov V, Haridi S. P2P resource discovery in grids: models and systems. Future Generation Computer Systems 2007;23:864–78.
Yu H, Bai X, Marinescu DC. Workflow management and resource discovery for an intelligent grid. Parallel Computing 2005;31:797–811.