A scalable group communication mechanism for mobile agents


Journal of Network and Computer Applications 30 (2007) 186–208 www.elsevier.com/locate/jnca

Hojjat Jafarpour (a, corresponding author), Nasser Yazdani (b), Navid Bazzaz-zadeh (b)

(a) EECS Department, University of California, Irvine, CA 02697, USA
(b) ECE Department, University of Tehran, Tehran, Iran

Received 24 April 2005; received in revised form 24 August 2005; accepted 8 September 2005

Abstract

Many multi-agent applications based on mobile agents require message propagation among a group of agents. A fast and scalable group communication mechanism can considerably improve the performance of these applications. Unfortunately, most existing approaches do not scale well and disseminate messages slowly as the number of agents grows. In this paper, we propose Sama, a new group communication mechanism that speeds up message delivery to a group of mobile agents on a heterogeneous internetwork. The main contribution of Sama is the efficient distribution and parallelization of message propagation, which yields scalable, high-speed message delivery to group members. Sama uses message dispatcher objects (MDOs), which are stationary agents on each host, to propagate messages concurrently. The proposed mechanism is independent of agent locations and transparently delivers messages to the group using a constant number of remote messages. It also transparently recovers from host failures. We also present a Hop-Ring protocol that considerably improves the performance of message dissemination in Sama. Our experimental results show that message propagation in Sama is significantly faster than in previously proposed methods.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Mobile agents; Group communication; Multicast

Corresponding author. Tel.: +1 949 7250703; fax: +1 949 7250703.

E-mail address: [email protected] (H. Jafarpour).
1084-8045/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jnca.2005.09.002


1. Introduction

Mobile agents have introduced an attractive model for distributed systems. They are executing programs that can migrate, at times of their own choosing, from one machine to another in a heterogeneous network. Their ability to migrate and perform their tasks locally considerably reduces network load and execution time (Fuggetta et al., 1998). Mobile agents have been used in various distributed applications such as wireless computing (Spyrou et al., 2004), distributed information retrieval (Brewington et al., 1999), network management (Cabri et al., 2001) and e-commerce (Dasgupta et al., 1999). Mobile agent development platforms provide facilities for developing agent-based applications; examples include Voyager (Recursion Software Inc, 2003), Aglets (IBM Japan Research Group Aglets Workbench) and Grasshopper (2001).

In many multi-agent applications, agents need to communicate with each other by exchanging messages. Different models for agent communication have been proposed, including broadcasting, forwarding and central server (Wojiehowski, 2001). A more specific type of communication, used in many multi-agent systems, is group communication, in which a message is delivered to a number of receivers that form a group. In the mobile agent realm, mobility of the group members introduces new challenges for message dissemination. Clearly, in scenarios that fully exploit mobility, where objects move rapidly and autonomously or their migration is not tightly controlled, most conventional group communication techniques are inapplicable. Scalability of group communication mechanisms is also a critical factor in large-scale agent systems. Examples of large-scale mobile agent systems that need communication between agents include e-business applications and Internet-wide data warehouses (Wijngaards et al., 2002). An efficient communication mechanism can therefore considerably improve the performance of these systems.

In this paper, we propose a fast and scalable group communication mechanism for mobile agents. Our approach, called Sama, considerably speeds up message delivery to the group members by parallelizing message dissemination using an efficient algorithm. The main goal of Sama is to recruit hosts in the system to propagate messages concurrently. It distributes the load of message propagation among network nodes and uses a constant number of remote messages. Sama uses message dispatcher objects (MDOs), which are objects on each host, to parallelize the message dissemination process. It also propagates messages among group members faster than previously proposed mechanisms for mobile agents. Our approach detects failures in the system and recovers from them automatically. We also propose a new Hop-Ring protocol for our group communication mechanism that considerably increases the performance of message dissemination for large-scale multi-agent systems on the Internet. Although it is designed for mobile agents, Sama can also be used as a message multicasting mechanism for ordinary distributed systems by assuming that agents are objects with fixed locations (Khuller and Kim, 2004).

We have implemented Sama in the Java programming language and evaluated it using the ModelNet emulator. We also simulated the message propagation algorithm using ns-2 (Network Simulator). Our experimental results show that Sama performs very well in message propagation among large groups and that using the Hop-Ring protocol considerably improves multicasting performance.

The remainder of the paper is structured as follows. Section 2 reviews the related work and describes the problems that arise when using it. In Section 3, we propose our group


communication mechanism. In this section, the system model and message propagation algorithm of Sama are presented. Section 4 describes how Sama detects failures and recovers from them. We present our Hop-Ring protocol for Sama in Section 5. Section 6 presents some characteristics of the mechanism. Our experimental results are presented in Section 7. Finally, Section 8 concludes the paper.

2. Related work

Significant research has been done on group communication in distributed systems (Chockler et al., 2001). Many mechanisms have been proposed for such systems; however, they do not consider mobility of the group members. Since delivery of data to a group of agents is required in many multi-agent systems, some group communication mechanisms for mobile agents have also been developed. Generally, we can classify these mechanisms into two categories:

1. Mechanisms that depend on agent locations and restrict migration of agents.
2. Mechanisms that are independent of agent locations, in which agents can migrate autonomously.

In the first category, an agent can be reached using a stationary proxy that knows the agent's current location. Upon migration, an agent has to inform its corresponding proxy about its new location. This requirement restricts the agent's autonomy and forces it to communicate with its proxy whenever it changes its location, which makes agent implementation more complicated. In the second category, on the other hand, there is no proxy for agents and they can migrate freely. This approach allows a high rate of migration and autonomy for agents in the system and is more desirable.

Mobile Process Groups (Assis Silva and Macêdo, 2001) and Voyager Spaces (Recursion Software Inc, 2003) are mechanisms from the first category. Mobile Process Groups are process groups that support migrating processes. Each process installs a view, which is a mapping between all processes and their locations.
This implies that each process knows all other group members and their locations. It also forces the agents to maintain consistent views of the system and update them whenever a process migrates from one host to another, which is clearly costly in a large-scale system with highly mobile processes. In this approach, message propagation is not transparent to the mobile processes, which means the sender process must know all group members. When a mobile process wants to send a message to the group members, it sequentially sends the message to all agents in its installed view. This approach is too slow and does not scale well to large groups.

The Voyager mobile agent platform also provides a group communication mechanism for agent groups. It uses a specialized architecture with spaces and subspaces to deliver messages (Recursion Software Inc, 2003). In Voyager, a space is a logical container that can span multiple virtual machines across the network. The Subspace class is the basic element and building block of a space. Each space is responsible for a subgroup of agents. Users connect the subspaces using different topologies. A message is sent into a space by publishing it into one of its subspaces. It is then cloned into all neighboring subspaces. In addition, the message is delivered to every agent that the subspace is responsible for. As the message propagates, it leaves behind a marker unique to that message which prevents the


message from being re-propagated into subspaces that have already received it. The drawback of this mechanism is that it sends many unnecessary messages and consumes high bandwidth when a large number of subspaces are connected. Indeed, many nodes might receive a message several times. Because members of a subspace can migrate to different locations, the number of remote messages can also increase rapidly.

The mechanism proposed in Murphy and Picco (2002) and group communication for mobile agents using IP multicast belong to the location-independent category. In Murphy and Picco (2002), a group communication mechanism based on reliable communication in a fault-free environment is proposed. The mechanism attempts to deliver a message to every agent using a method similar to the distributed snapshot. When a message is sent to a group of agents, it is propagated to all nodes and along all links in the network, and agents whose identifiers match the message target accept the message. Sending the message along all links makes this approach very slow when there is a path between any pair of nodes. In this approach, a message may be delivered several times to an agent. The most important weakness of the model is its assumption of a fault-free environment, which makes it vulnerable in environments like the Internet where the failure probability is not negligible.

In Hartroth and Hofmann (1998), a group communication mechanism for mobile agents based on IP multicast has been proposed. The method captures the inherent agility of mobile agents in a scheme of dynamically adapted multicast groups. When an agent migrates to a new location, the group is changed and the new location is added to the group. If no agent remains at the previous location, that location is discarded from the group. The method uses IP Multicast (Deering) as the infrastructure for message propagation. Unfortunately, only a small fraction of Internet routers support IP Multicast.
This considerably restricts the applicability of the method.

The closest work to our mechanism is an event propagation mechanism proposed in McCormick et al. (2000), which is similar to the event model of Java. The method uses 'EventTransceiverServers' to distribute messages over the network. However, the sender must send the message to the 'EventTransceiverServers' sequentially, which is inefficient in large-scale systems.

As can be seen, the previously proposed mechanisms suffer from slow message dissemination at large scale and from assumptions such as a fault-free environment or the availability of specific services such as IP Multicast in the underlying network. Our mechanism, Sama, is a location-independent mechanism. We do not assume proxies for mobile agents, and a message sender does not need to know the group members. Sama delivers messages to a group of mobile agents without knowing their locations and does not restrict their migration. In the next section we describe the main concepts of Sama.

3. Sama group communication mechanism

3.1. System model

Sama is an application-level group communication mechanism. We assume a heterogeneous internetwork model such as the Internet, where there exists at least one path between every two machines. In order to accept mobile agents, an agent server must be running on each host in the system. We also assume that the underlying mobile agent framework provides communication features among system components. All communications are done at the application level and use techniques such as remote method


invocation. For instance, in our implementation we use Voyager's messaging service for communication (Recursion Software Inc, 2003). For the sake of simplicity, we assume each host can send one message every time unit. With small changes to the algorithm discussed in the next subsection, Sama can exploit the capability of hosts to send more than one message per time unit to increase the speed of message propagation. We also suppose that the following parameters are available for our mechanism:

 

• Maximum message transfer time (MMTT)
• Maximum agent migration time (MAMT)

MMTT is the maximum amount of time it takes a message to be transferred between two hosts. We can calculate MMTT using the round-trip application-level delay between hosts in the system. Since the calculated value can vary considerably on the Internet, we use a reasonable approximation of the upper bound of this value. MAMT is the maximum time it takes an agent to migrate from one host to another. MAMT can be calculated in the same way as MMTT.

3.1.1. Message dispatcher object (MDO)

MDOs are the main components of our mechanism; they route and deliver messages to mobile agents. There is one MDO on each host. MDOs can be regarded as a part of the agent servers running on every host. In our implementation, each MDO is an object on an agent server. Before the mechanism starts its work, the system administrator creates MDOs on all hosts using an MDO generator program. Each MDO has the following parameters:

    

• MDO List
• Message Storage Queue
• List of the Local Group Members
• MMTT
• MAMT
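The parameter list above maps naturally onto a small record type. The following Python sketch is only an illustration of that state, not the authors' Java implementation: the class name `MDO`, the field names, and the `store` and `purge_expired` helpers are hypothetical, and the sketch assumes that every message is stored with the same timeout so that the queue stays ordered by expiry time.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MDO:
    """Hypothetical per-host state of a message dispatcher object."""
    position: int    # index of this MDO, identical in every MDO list
    mdo_list: list   # (mdo_id, ip_address) pairs, same order on all hosts
    mmtt: float      # maximum message transfer time
    mamt: float      # maximum agent migration time
    message_queue: deque = field(default_factory=deque)  # (expiry, message)
    local_members: set = field(default_factory=set)      # local group members

    def store(self, message, now, timeout):
        # Cache an incoming message together with its expiry time.
        self.message_queue.append((now + timeout, message))

    def purge_expired(self, now):
        # Discard messages whose timeout has passed (assumes FIFO expiry order).
        while self.message_queue and self.message_queue[0][0] <= now:
            self.message_queue.popleft()
```

How the timeout is derived from MMTT and MAMT is left to the caller here, since the text gives that bound separately.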

Each MDO knows all MDOs and their locations on the network. MDOs store this information in their MDO lists, which can be a typical data structure such as an array. Since we store only the minimum information required to reach an MDO, which consists of two numbers (the MDO identifier and its IP address), the size of the MDO list scales well to a large number of MDOs. The position of each MDO is the same in all MDO lists in the system. Each MDO also has a message storage queue that is used to cache incoming messages. A timeout value, calculated using MMTT and MAMT, is assigned to each incoming message. MDOs use these timeout values to discard messages from their message storage queues. They also keep a list of their local mobile agents that are group members. Using this list, MDOs deliver incoming messages to their local group members. This structure is sufficient for best-effort message propagation among group members. MDOs also provide facilities for mobile agents to join, leave, register with or unregister from a group. Agents use the register and unregister methods of MDOs when they want to migrate to another host. Before migration, an agent unregisters itself from the list of local group members of the MDO of the source host. Then it migrates to the destination host. After


Fig. 1. A sample system with eight hosts with agent servers.

migration, the agent registers itself with the MDO of the destination host and receives all messages that it could not receive during its migration. Mobile agents also know their local MDO and send messages to the group by passing them to it. Fig. 1 depicts a sample system. The system consists of eight hosts with agent servers running on them. There is one MDO on each host, and there are agents on hosts 0, 1, 3, 4, 5 and 6. There are no agents on hosts 2 and 7; however, agents can migrate to these hosts too. The hosts are connected through the Internet.

3.2. Message propagation mechanism

When a mobile agent wants to send a message to the group, it does not send the message to the members directly. Instead, it delivers the message to its local MDO, and the message propagation process starts. This process has two main phases. In the first phase, MDOs propagate the message among themselves. In the second phase, each MDO delivers the message to its local mobile agents using its List of the Local Group Members. Message dissemination among MDOs is done concurrently using a binomial tree structure. The main idea can be described as follows. When the first MDO receives a message, it first sends the message to one other MDO. Now two MDOs have the message: the sender MDO and the MDO that has just received it. Thus, the second MDO can also participate in the message propagation process. In the next step, both MDOs send the message to two other MDOs, and the number of MDOs that have the message becomes four. Then all four MDOs send the message to four other MDOs, and this process continues until all MDOs have received the message. The number of MDOs that have received the message is therefore doubled after each step, and upon receiving the message each MDO participates in message delivery (see footnote 1). In fact, the mechanism tries to recruit

Footnote 1: Recall the assumption that each host can send one message in each time unit. This assumption is not necessary; later we describe how our mechanism can exploit the capability of sending multiple messages per time unit.


Fig. 2. Message propagation tree among MDOs.

as many hosts as it can to participate in message propagation, making this process concurrent and fast. The result of message dissemination among MDOs using the described method is a binomial tree whose nodes are the MDOs. Fig. 2 depicts a sample tree for a system with 16 MDOs. In this figure we suppose that an agent on host 0 sends a message to the group. It therefore passes the message to its local MDO, MDO0. This MDO starts the message propagation process and in the first step sends the message to MDO8. In the second step, MDO0 and MDO8 send the message to MDO4 and MDO12 simultaneously. In the third step, the four MDOs that have the message concurrently send it to MDO2, MDO6, MDO10 and MDO14. Finally, in the fourth step, the remaining eight MDOs receive the message concurrently from the eight MDOs that already have it, so after four steps all 16 MDOs have the message and start delivering it to their local agents. Message propagation can start from any of the MDOs. In fact, the MDO whose local agent sends a message to the group is the MDO that starts message propagation among MDOs. To construct the binomial tree and determine the MDO that should receive the message in each step, every MDO runs a distributed tree-generating algorithm using its MDO List. The algorithm is shown in Fig. 3. The customized MDO List in the algorithm is the sub-list of the main MDO List that is used in each step. As an example of the execution of the algorithm, assume a system with 16 MDOs in which MDO0 wants to send a message. According to the algorithm, the boundaries of its customized MDO list are (1, 15). It finds the median MDO in this list, which is MDO8 since (1 + ((15 - 1) mod 16)/2) mod 16 = 8, and sends the message to this MDO, as depicted in Fig. 2. MDO8 then starts running the algorithm with (9, 15) as boundaries, and MDO0 restarts the algorithm with (1, 7) as boundaries.
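The halving procedure in this example can be traced with a short simulation. The following Python sketch is an illustration under the one-message-per-time-unit assumption, not the authors' Java implementation; the function name `propagate` and its return values are hypothetical, and the single-MDO case is handled separately as an added assumption.

```python
from collections import deque


def propagate(n, start):
    """Simulate the tree-generating algorithm for n MDOs.

    Returns (parent, step): parent[i] is the MDO that sent the message to
    MDO i, and step[i] is the time unit in which MDO i receives it.
    """
    parent = {start: None}
    step = {start: 0}
    if n == 1:  # assumption: a lone MDO has nobody to send to
        return parent, step
    # Work item: (MDO position, boundary a, boundary b, time it got the message)
    pending = deque([(start, (start + 1) % n, (start - 1) % n, 0)])
    while pending:
        p, a, b, t = pending.popleft()
        while True:
            if (b - a) % n < 2:
                # At most two MDOs left: send to a and b in turn, then finish.
                for target in dict.fromkeys((a, b)):  # drops b when a == b
                    t += 1
                    parent[target], step[target] = p, t
                break
            m = (a + ((b - a) % n) // 2) % n   # median of the customized list
            t += 1
            parent[m], step[m] = p, t
            pending.append((m, (m + 1) % n, b, t))  # second half goes to m
            b = (m - 1) % n                          # keep the first half
    return parent, step
```

For `propagate(16, 0)` the sketch reproduces the walk-through above: MDO8 receives the message in step 1, MDO4 and MDO12 in step 2, and every MDO holds it after step 4.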
As Fig. 2 shows, in each step the number of MDOs that hold the message is doubled, and the number of communication steps required to disseminate a message among all MDOs is the ceiling of log2 n, where n is the


Every MDO performs the following steps after receiving a message. Suppose the number of MDOs is n and (a, b) are the boundaries of the Customized MDO List.

1. Get the message and the boundaries for the MDO list, and calculate the customized MDO List.
2. If there are no boundaries:
   a. If you are the starter MDO and your position in the MDO list is p, then set the boundaries as a = (p + 1) mod n and b = (p - 1) mod n.
   b. Else finish.
3. If (b - a) mod n < 2, send the message to the MDOs that are at indices a and b, and then finish.
4. Find the median component of the customized MDO list. Assume that its position is m; then m = (a + ((b - a) mod n) / 2) mod n.
5. Calculate the boundaries of the new customized MDO list, which is the first half of the list, as follows: a = a, b = (m - 1) mod n.
6. Send the message and the following boundaries to the median MDO: a = (m + 1) mod n, b = b (the upper boundary as it was before step 5).
7. Go to step 1.

Fig. 3. Tree generating algorithm that is run in every MDO upon receiving a message.

number of MDOs in the system. More details about the algorithm can be found in Jafarpour and Yazdani (2003a, b).

The proposed model is sufficient for best-effort message delivery. If more reliability is needed, however, Sama can provide an acknowledgement and timeout mechanism to ensure reception of a message by all group members. In this optional feature, MDOs that are leaves of the message dissemination tree, after receiving a message and delivering it to their local group members, inform their parent MDOs in the tree by sending an acknowledgement. Every interior node also sends an acknowledgement to its parent MDO after delivering the message to its local group members and receiving acknowledgements from all of its child MDOs. This acknowledgement informs the parent that the message has been propagated correctly among all MDOs located in the sub-tree rooted at the child MDO. Finally, reception of acknowledgements from all of its child MDOs by the root MDO, which is the sender MDO, indicates that all MDOs have received the message successfully. MDOs store the message for a limited period of time to ensure that migrating group members have also received it.

With some changes to the algorithm, Sama can also exploit the capability of hosts to send more than one message in a time unit. Suppose each host can send three messages simultaneously. Then, instead of dividing the customized list into two sections, the algorithm divides it into four sections and sends the message to three MDOs simultaneously.
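The effect of sending several messages per time unit can be estimated with a simple count. The sketch below is an idealized illustration, assuming that every MDO holding the message reaches `sends_per_step` new MDOs in each step; the function name and parameter names are hypothetical and do not come from the paper.

```python
def dissemination_steps(n_mdos, sends_per_step=1):
    """Number of steps until all n_mdos MDOs hold the message."""
    holders, steps = 1, 0
    while holders < n_mdos:
        # Each holder recruits sends_per_step new MDOs in one step.
        holders *= sends_per_step + 1
        steps += 1
    return steps
```

Under this assumption, 16 MDOs need four steps with one message per time unit and two steps with three messages per time unit, because the number of holders then quadruples in each step.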


In the above description, we assumed that the number of MDOs in the system does not change. If a new MDO wants to join the system, it connects to one of the existing MDOs and informs it about the join. This MDO adds the new one to its MDO list and informs all other MDOs, including the new MDO, about the new MDO using Sama's propagation algorithm with its updated MDO list. Each MDO that receives the information about the new MDO adds it to its MDO list and helps to inform other MDOs using its updated MDO list and the same message propagation method. At the end of this process, all MDOs, including the new one, have the same MDO list. When an MDO wants to leave the system, it likewise informs one of the MDOs about the leave, and this MDO informs all other MDOs to update their MDO lists using the same mechanism.

4. Recovery from host failure

Unlike other group communication mechanisms for mobile agents, which provide no solution for failures and simply ignore them, Sama detects failures in the system and automatically recovers from them. This is a unique feature of our message propagation mechanism. As mentioned, an optional acknowledgement mechanism can be used after message propagation among MDOs for more reliability. Furthermore, MDOs can store messages for a limited period of time to ensure that migrating group members have also received them. This period of time is calculated using the MMTT and MAMT values and should not be less than the following amount (see Section 6):

$$\mathrm{MMTT} \times \log_2 N_{\mathrm{MDOs}} + \mathrm{MAMT} \tag{1}$$

In formula (1), N_MDOs denotes the number of MDOs in the system. Before sending a message to its child MDOs, each MDO sets a timer. The child MDOs should send their acknowledgements before the timer in their parent expires. If an MDO has not received an acknowledgement from one or more of its child MDOs when the timer expires, it infers that there is a failure in the sub-tree rooted at that child MDO. The timer value is calculated using the MMTT and the first Customized MDO List in the tree generating algorithm. The value is a reasonable estimate of the time within which all child MDOs should send their acknowledgements to their parent MDO. MDOs use the following formula to calculate the timer value for their child MDOs in the tree generation algorithm:

$$\mathrm{Timer} > 2 \times \mathrm{MMTT} \times \log_2(\text{First Customized List Size}) + \mathrm{LMDT} \times N_{\mathrm{Agents}} \tag{2}$$

In formula (2), LMDT (local message delivery time) is the maximum amount of time it takes an MDO to deliver the message to one of its local group members, and N_Agents is the number of group members. Formula (2) can be added to the tree generation algorithm. Using these parameters, Sama can detect host failures in the system and recover from them (Jafarpour and Yazdani). Based on the location of an MDO in the propagation tree and the failure time, we categorize host failures in the system into three groups:

1. Host failure before and during receiving a message
2. Host failure after receiving a message and before sending acknowledgement
3. Sender MDO's (see footnote 2) host failure after sending a message and before receiving its acknowledgements.
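The two timing bounds in formulas (1) and (2) are straightforward to evaluate. The following Python sketch is a direct transcription for illustration only; the function and parameter names are hypothetical, all times are assumed to be in the same unit, and formula (2) gives a strict lower bound, so an actual timer would be set somewhat above the returned value.

```python
import math


def min_retention_time(mmtt, mamt, n_mdos):
    """Formula (1): minimum time an MDO should keep a message."""
    return mmtt * math.log2(n_mdos) + mamt


def ack_timer_lower_bound(mmtt, lmdt, first_list_size, n_agents):
    """Formula (2): the acknowledgement timer must exceed this value."""
    return 2 * mmtt * math.log2(first_list_size) + lmdt * n_agents
```

For example, with MMTT = 0.5, MAMT = 2.0 and 16 MDOs, formula (1) gives 0.5 × 4 + 2.0 = 4.0 time units.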

4.1. Host failure before receiving a message

As mentioned, communications in Sama are performed using application layer mechanisms such as remote method invocation. Our mechanism interprets a failure to establish this kind of connection as the failure of the receiver host and excludes the failed node from the message propagation tree as follows. The parent MDO of the failed host detects the failure while sending a message to the failed MDO. After detecting the failure, the parent MDO first informs its own parent about the failure and requests it to reset its timer. Every MDO that receives a timer reset request resets its timer and then sends a timer reset request to its own parent MDO, until the request reaches the root of the tree. Then the MDO that detected the failure sends the message to the MDO located next to the failed MDO in the Customized MDO List. After sending the message, the detector MDO removes the failed MDO from its MDO List and generates a correction message to inform all other MDOs about the failure. The correction message is propagated among all MDOs using another message propagation tree rooted at the failure detector. Upon receiving the correction message, every MDO updates its MDO List by removing the failed MDO and then executes the tree generating algorithm to construct the new message propagation tree using the updated MDO List. Fig. 4 shows the message propagation tree during recovery from the failure of Host 12 in a system with 16 MDOs when MDO0 starts the sending process. As can be observed there, after detecting the failure of Host 12, MDO8 sends the message to MDO13, which is located next to the failed MDO in the MDO List. After the correction message has been propagated and the failed MDO has been removed from all MDO Lists, the message propagation tree is updated as shown in Fig. 5.

4.2. Host failure after receiving a message and before sending acknowledgement

To recover from the second category of failures, Sama uses timers.
As explained previously, each MDO sets a timer before sending a message to its child MDOs. If the MDO has not received an ACK from all of its child MDOs before the timeout, it concludes that there is a failure and identifies the MDO that has not sent its ACK. The MDO first resets its timer and then sends a timer reset request to its parent MDO. As described in Section 4.1, all MDOs on the path from the detector to the proxy MDO at the root of the tree reset their timers when they receive requests from their child MDOs. The MDO then resends the message to the failed MDO. The failure is now similar to the first category, and the mechanism tries to exclude the failed host using the method described in the previous subsection. In this kind of recovery, MDOs can detect duplicate messages using the message sequence numbers.
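Duplicate detection by sequence number can be sketched in a few lines. The paper does not specify the data structure, so the following Python example is an assumption-laden illustration: it supposes that each message carries the identifier of its originating MDO and a per-origin sequence number, and the class name `DuplicateFilter` and method `is_new` are hypothetical.

```python
class DuplicateFilter:
    """Remembers (origin, sequence number) pairs that were already seen."""

    def __init__(self):
        self._seen = set()

    def is_new(self, origin_id, seq):
        # Return True the first time a message is seen, False for resends.
        key = (origin_id, seq)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

An MDO would consult such a filter before delivering a resent message to its local group members, so that a message repeated during recovery is delivered only once. A real implementation would also need to bound the memory of seen pairs, for instance by discarding entries when the corresponding message timeout expires.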

Footnote 2: By sender MDO we mean the first MDO that starts message propagation, which is also the root of the message propagation tree.


Fig. 4. Message propagation tree during recovery from failure of Host 12.

Fig. 5. Message propagation tree after recovery from failure of Host 12.

4.3. Sender MDO's host failure after sending a message and before receiving its acknowledgements

The third form of host failure is detected by the child MDOs of the failed sender MDO. If the host of the sender MDO (root MDO) fails before receiving


Fig. 6. Message propagation tree after recovery from failure of Host 0.

acknowledgements from all of its child MDOs, the child MDOs can detect its failure when they try to send their acknowledgements. The first child MDO that detects the failure selects itself as the new root, deletes the failed MDO from its MDO List and sends a correction message to all other MDOs to update their MDO Lists. It then resends the message using the updated message propagation tree. Duplicate messages can be detected in MDOs through their sequence numbers. Fig. 6 shows the updated message propagation tree when MDO1 detects the failure of MDO0.

In the case of multiple host failures, Sama uses a combination of the described methods for recovery. Our implementation of Sama automatically recovers from all of the mentioned failure categories using the described recovery methods. If the network is partitioned, each partition can continue its work by assuming that the hosts in the other partition have failed and recovering from these failures using the described methods. Sama treats temporary failures as a leave operation followed by a join operation. When a host disconnects or fails for a limited time, Sama excludes its MDO from the system. When the node rejoins the system, Sama treats it as a new MDO and performs the process of adding this MDO to the system. More details can be found in Jafarpour and Yazdani (2004).

5. Improving Sama using a Hop-Ring protocol

Sama does not assume any special message dissemination feature provided by the underlying networks, such as broadcast or IP Multicast. Therefore, even if such services exist in some parts of our heterogeneous internetwork, Sama does not use them. To improve efficiency and exploit network capabilities, we propose a new Hop-Ring protocol for Sama (Jafarpour et al., 2005). This protocol tries to use capabilities such as broadcast or IP multicast as much as it can. The Hop-Ring protocol was introduced in the Spread group communication mechanism (Amir and Stanton, 1998). The main idea is to


use underlying network features such as broadcast or IP multicast wherever they are available. In this protocol, message dissemination is done in two levels. In the first level, the Hop protocol, a tree is generated among sites, which can be LANs or collections of LANs supporting the IP multicast protocol. In the second level, messages are propagated within sites via their message dissemination features, such as broadcast or IP multicast. Spread tries to find the optimum tree among sites in the Hop protocol using Dijkstra’s shortest path algorithm. To do this, the mechanism needs metrics to decide how to connect the sites into the most efficient tree. However, many of these metrics change dramatically in large-scale heterogeneous networks, so the tree among sites must be reconstructed to maintain its efficiency. This is difficult in large-scale systems that may contain hundreds of sites. We use the Hop-Ring idea to improve the performance of our group communication service for mobile agents. The Hop-Ring protocol in our mechanism is used to propagate messages among MDOs. To do this, we replace the tree generation method of the Hop protocol with our own tree generation algorithm. The idea is to use Sama’s tree generation algorithm to generate a message propagation tree among sites and then disseminate messages among the MDOs on each site using that site’s message propagation features. To apply this idea in our mechanism, we have to make some changes to our system model.

5.1. The improved system model

The new system model for our group communication service is similar to the previous one. There is a Mobile Agent Server and an MDO on every host to which group members can migrate. Unlike the previous model, in which MDOs store MDO information in a one-dimensional array, in the new model MDOs store this information in a two-dimensional array.
In this array structure, the first dimension indexes the sites and the second dimension lists the MDOs on each site. The MDO information array is the same for all MDOs in the system. Fig. 7 shows a sample MDO information array. As mentioned, sites can be LANs or collections of LANs that support the IP multicast protocol. In each site, one of the MDOs is the proxy of that site, and messages that should be delivered to the group members located in that site are sent to this MDO. Communication between Proxy MDOs is done using point-to-point application-layer methods such as RMI. Communication between a Proxy MDO and the other MDOs in the same site is done using features of that site such as broadcast or IP Multicast. Every MDO also knows MMTT, MAMT and the list of its local group members.

Fig. 7. Two-dimensional MDO information storage. The first MDO for each site is its Proxy MDO.

5.2. Message propagation process

Message propagation in our new Hop-Ring protocol is done in three phases. In the first phase, a message is disseminated among Proxy MDOs, so that every site receives the message. In the second phase, the Proxy MDO of each site disseminates the message among the MDOs located in the same site using available features such as broadcast or IP Multicast. In the last phase, every MDO delivers the message to its local mobile agents. The propagation process in the Hop-Ring protocol starts from the Proxy MDO of the sender’s site. When a message is to be sent to the group, it is first delivered to the Proxy MDO in the site of the sender agent. This MDO assigns a sequence number to the message and then starts disseminating it among the other Proxy MDOs. Message dissemination in this phase uses the binomial tree structure described in the previous section. All Proxy MDOs participate in tree generation by executing the tree generation algorithm. However, in this phase, MDOs use the first row of the MDO information array instead of the previously used MDO List. After the message has been disseminated among the Proxy MDOs, each Proxy MDO sends the message to all other MDOs in its own site using features provided by the underlying network, such as broadcast or IP multicast. Many group communication mechanisms based on IP multicast have been proposed for LANs or WANs. These mechanisms provide high-speed and reliable message dissemination among group members on a LAN. Among them are Horus (van Renesse et al., 1996), Totem (Moser et al., 1996) and Transis (Dolev and Malki, 1996).
To disseminate messages among MDOs in a site, our mechanism can use any of these mechanisms. After the message has been disseminated among all MDOs, they deliver it to their local mobile agent group members. To provide more reliability, we can use an acknowledgement mechanism similar to the one in the previous section. All Proxy MDOs send acknowledgements to their parent nodes in the binomial tree, and when the sender’s Proxy MDO has received acknowledgements from all of its children in the tree, the mechanism guarantees that all group members have received the message. Fig. 8 depicts message propagation in a sample system with 8 sites. In this figure, there is one Proxy MDO for each site and messages are disseminated among them using Sama’s message propagation tree.

5.3. Membership management

Membership management in our improved group communication protocol is similar to the previous one. All MDOs provide join and leave methods that can be called by mobile agents. When a mobile agent calls the join method of an MDO, the MDO adds the agent to its local group member list, and the agent then receives incoming messages from its local MDO. When a group member wants to leave the group, it calls the leave method of its local MDO. The MDO then removes the agent from its local group member list, and the agent receives no further messages.
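The join and leave operations described above can be illustrated with a minimal Python sketch. It is not the Java implementation evaluated in this paper: the class name `MDO`, the `deliver` method and the use of callbacks to stand in for mobile agents are assumptions made only for this example.

```python
class MDO:
    """Toy message dispatcher object that keeps a local group member list."""

    def __init__(self, name):
        self.name = name
        self.members = {}  # agent id -> callback invoked on local delivery

    def join(self, agent_id, on_message):
        # Add the agent to the local group member list.
        self.members[agent_id] = on_message

    def leave(self, agent_id):
        # Remove the agent; it receives no messages delivered after this call.
        self.members.pop(agent_id, None)

    def deliver(self, message):
        # Hand an incoming group message to every current local member.
        for on_message in list(self.members.values()):
            on_message(message)
        return len(self.members)


# Two hypothetical agents collect what they receive in plain lists.
inbox_a, inbox_b = [], []
mdo = MDO("MDO0")
mdo.join("agent-a", inbox_a.append)
mdo.join("agent-b", inbox_b.append)
mdo.deliver("m1")
mdo.leave("agent-b")
mdo.deliver("m2")
```

In this sketch the agent that left before the second delivery holds only the first message, while the remaining member holds both.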


Fig. 8. Message propagation process in a sample system with 8 sites.
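The first two phases of the propagation process can be traced with a short Python sketch. It assumes a common binomial broadcast tree in which node i forwards to i + 2^k for every power of two below its lowest set bit; this shape is consistent with the channels (0, 8), (8, 12), (12, 14) and (14, 15) used in the 16-MDO example of Section 6.1, but it is not necessarily identical to Sama's tree generation algorithm. The function names, the rotation that places the sender's site at the root, and the log format are hypothetical, and the recursion only records who sends to whom rather than modelling concurrency.

```python
def children(i, n):
    """Children of node i in a binomial broadcast tree over nodes 0..n-1."""
    # The lowest set bit of i bounds its subtree; the root (0) spans all nodes.
    limit = (i & -i) if i else 1 << (n - 1).bit_length()
    kids = []
    step = limit >> 1
    while step >= 1:
        if i + step < n:
            kids.append(i + step)
        step >>= 1
    return kids


def propagate(sites, sender_site):
    """Trace one message; sites[s][0] is assumed to be the Proxy MDO of site s."""
    n = len(sites)
    log = []

    def handle(rel):
        proxy = sites[(sender_site + rel) % n][0]
        # Phase 2: the proxy reaches the other MDOs of its own site
        # (stands in for broadcast or IP multicast inside the site).
        for mdo in sites[(sender_site + rel) % n][1:]:
            log.append(("site", proxy, mdo))
        # Phase 1: the proxy forwards to its child proxies in the tree.
        for child in children(rel, n):
            log.append(("remote", proxy, sites[(sender_site + child) % n][0]))
            handle(child)

    handle(0)
    return log


# Eight hypothetical sites, each with a proxy and one further MDO.
sites = [[f"P{s}", f"M{s}"] for s in range(8)]
log = propagate(sites, sender_site=3)
remote = [entry for entry in log if entry[0] == "remote"]
```

With eight sites, the trace contains one point-to-point message per non-sender proxy; phase 3 (local delivery to agents) is left out of the sketch.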

5.4. Recovery from failures

Recovery from host failure in the improved group communication mechanism differs slightly from the previous one. In the new protocol, we use the previously proposed host-failure recovery mechanism for Proxy MDOs. Therefore, the failures of Proxy MDOs are categorized into three classes. However, after a failed Proxy MDO is deleted from the two-dimensional MDO information array, the next MDO in the column of the failed MDO is chosen as the new Proxy MDO for that site. Consequently, the structure of the tree for message dissemination among Proxy MDOs does not change. If there is no other MDO in the site, the site is removed from the MDO information array, and in this case the tree is updated after the empty site is deleted. The two-dimensional array structure in Fig. 7 shows a sample MDO information array. If an MDO that is not a Proxy MDO fails, the failure is detected by the Ring protocol, the Proxy MDO of its site informs the other Proxy MDOs about the failure through the same group communication mechanism, and all MDOs remove the failed MDO from their MDO information arrays. In this case, too, the tree does not change.
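The effect of these rules on the two-dimensional MDO information array can be sketched in Python. The array is modelled here as a list of lists whose first element per site is the Proxy MDO; the function name `remove_failed_mdo` and the returned flag are assumptions for illustration, and failure detection itself is not modelled.

```python
def remove_failed_mdo(sites, failed):
    """Return (updated array, tree_changed) after deleting one failed MDO.

    sites is a list of lists; sites[s][0] is assumed to be the proxy of site s.
    """
    updated = []
    tree_changed = False
    for site in sites:
        remaining = [mdo for mdo in site if mdo != failed]
        if remaining:
            # If the proxy failed, the next MDO of the site moves to index 0
            # and becomes the new proxy; the proxy tree keeps its shape.
            updated.append(remaining)
        else:
            # The site has no MDO left, so it is dropped from the array and
            # the tree among proxies has to be regenerated.
            tree_changed = True
    return updated, tree_changed


# Hypothetical array with three sites.
info = [["A0", "A1"], ["B0"], ["C0", "C1", "C2"]]
after_proxy_failure = remove_failed_mdo(info, "A0")
after_last_mdo_failure = remove_failed_mdo(info, "B0")
after_member_failure = remove_failed_mdo(info, "C1")
```

Only the second case, in which the failed MDO was the last one in its site, reports a change to the tree.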

6. The main characteristics of Sama

In this section, we review some of the main characteristics of the Sama group communication mechanism.


6.1. Message delivery for highly mobile agents

Sama delivers messages to all group members, assuming that it has the MMTT and MAMT values. Murphy and Picco (2002) present scenarios showing that, even in a fault-free environment, highly mobile agents might not receive messages successfully. To solve this problem, Sama stores messages in MDOs for a limited period of time to ensure that every agent receives the message from at least one MDO. MDOs store messages in their Message Storage Queues. If the number of MDOs in the system is denoted by $N_{MDOs}$, the amount of time each message should be stored in every MDO is

$$\text{Message Storage Time} > \text{MMTT} \times \log_2(N_{MDOs}) + \text{MAMT} \qquad (3)$$

To illustrate formula (3), assume that in the system shown in Fig. 2 with 16 MDOs, the message transfer time for each of the following channels is MMTT, and that the other links in the system have a very small message transfer time that can be ignored: (0, 8), (8, 12), (12, 14) and (14, 15). Consequently, MDO0 is the first receiver of the message, MDO15 is the last receiver among the MDOs, and the message travels from 0 to 15 in 4 × MMTT. If an agent migrates from host 15 to host 0 just before MDO15 receives the message, it should receive the message from the MDO on host 0. MDO0 should therefore hold a copy of the message so that it can deliver it to incoming agents that have not received it yet. If MDO0 stores the message for at least (4 × MMTT + MAMT), it can deliver the message to the newly arriving agent. After this amount of time, there is no need to store the message and the MDO can discard it from its queue. Fig. 9 shows the scenario and the agent migration path.

Fig. 9. An agent migrates from host 15 to 0 before receiving message from MDO15.
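The storage rule of formula (3) can be made concrete with a small Python sketch of a Message Storage Queue. The class and method names, the explicit `now` timestamps and the idea that an arriving agent reports the last sequence number it has seen are assumptions introduced for illustration; the timing values are arbitrary and are read here as milliseconds.

```python
import math


def storage_time(mmtt, mamt, n_mdos):
    # Bound from formula (3): MMTT * log2(N_MDOs) + MAMT.
    return mmtt * math.log2(n_mdos) + mamt


class MessageStorageQueue:
    """Keeps each message until the bound of formula (3) has been exceeded."""

    def __init__(self, mmtt, mamt, n_mdos):
        self.hold = storage_time(mmtt, mamt, n_mdos)
        self.entries = []  # (sequence number, message, time of reception)

    def store(self, seq, message, now):
        self.entries.append((seq, message, now))

    def expire(self, now):
        # Discard messages that have been held longer than the bound.
        self.entries = [e for e in self.entries if now - e[2] <= self.hold]

    def pending_for(self, last_seq_seen, now):
        # Messages that a newly arrived agent has not received yet.
        self.expire(now)
        return [(seq, msg) for seq, msg, _ in self.entries if seq > last_seq_seen]


# 16 MDOs with MMTT = 50 and MAMT = 30 give a bound of 4 * 50 + 30 = 230.
queue = MessageStorageQueue(mmtt=50, mamt=30, n_mdos=16)
queue.store(7, "m7", now=0)
late_arrival = queue.pending_for(last_seq_seen=6, now=200)
too_late = queue.pending_for(last_seq_seen=6, now=231)
```

An agent that arrives within the bound still obtains the stored message, whereas a lookup after the bound finds the queue empty.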


6.2. Constant number of remote messages

This is one of the most important advantages of Sama compared to other mechanisms. Sending remote messages takes considerably more time than sending local messages, especially on the Internet. As can be inferred from the algorithm, each MDO receives the message once and then delivers it locally to its corresponding agents. Consequently, the number of remote messages is equal to the number of MDOs in the system. For instance, the number of remote messages for the previous example is 16. This number can grow rapidly for other proposed mechanisms.

6.3. Message delivery independent of agent locations

An important characteristic of our group communication mechanism is that it delivers messages to the group members independently of their locations. This approach does not restrict agents’ migration or their independence (Murphy and Picco, 2002).

6.4. Message delivery time

Our approach reduces message delivery time in large-scale mobile agent systems. As mentioned before, message delivery among MDOs takes logarithmic time, which considerably improves message propagation speed. The maximum amount of time it takes to disseminate a message among all MDOs is given by formula (4).

$$\text{Maximum Time for MDOs} = \text{MMTT} \times \log_2(N_{MDOs}) \qquad (4)$$

After this amount of time, all MDOs have an instance of the message. Let the local message delivery time (LMDT) be the time needed to deliver a local message and let $N_{Agents}$ be the number of all agents in the system. The upper bound of the message delivery time in the worst case, when all agents are located on the host with the slowest path from the source of the message, is then given by formula (5).

$$\text{Maximum Delivery Time} = \text{MMTT} \times \log_2(N_{MDOs}) + \text{LMDT} \times N_{Agents} + \text{MAMT} \qquad (5)$$
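As a quick numeric check of formulas (4) and (5), the following Python sketch evaluates both bounds. The function names and the parameter values (MMTT = 50, LMDT = 1, MAMT = 30, read as milliseconds) are illustrative assumptions, not measurements from this paper.

```python
import math


def max_time_for_mdos(mmtt, n_mdos):
    # Formula (4): time until every MDO holds a copy of the message.
    return mmtt * math.log2(n_mdos)


def max_delivery_time(mmtt, lmdt, mamt, n_mdos, n_agents):
    # Formula (5): worst case with all agents on the slowest-to-reach host.
    return max_time_for_mdos(mmtt, n_mdos) + lmdt * n_agents + mamt


t16 = max_time_for_mdos(mmtt=50, n_mdos=16)  # 4 tree levels
t32 = max_time_for_mdos(mmtt=50, n_mdos=32)  # 5 tree levels
worst = max_delivery_time(mmtt=50, lmdt=1, mamt=30, n_mdos=16, n_agents=64)
```

Doubling the number of MDOs from 16 to 32 adds only one MMTT to the bound of formula (4), which is the logarithmic growth the formula expresses.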

After the time given by formula (5), all MDOs have delivered the message to their local agents. To ensure that migrating agents have also received the message, the message storage time in formula (3) must be added to this amount.

6.5. Transparent message delivery

Transparent message delivery to a group is another property of our mechanism. By transparent message delivery we mean that the sender does not need to know the group members: it just sends a message to the group, and the mechanism delivers the message to all group agents. This characteristic makes the implementation of the sender and the agents easy.


6.6. Open and closed groups

Groups can be open or closed. In an open group, agents that are not group members can send messages to the group members. In contrast, in a closed group only group members can send messages to the group. To support an open group, we can make one of the MDOs accessible from the outside as the proxy of the group and send messages through this MDO.

6.7. Easy configuration

Sama can be easily reconfigured. MDOs can be implemented as mobile agents, and the system administrator can change the MDO structure and redeploy the MDOs among mobile agent servers. This capability makes Sama more customizable and easier to manage.

7. Experimental results

We have implemented our mechanism in Java. We used Voyager’s messaging features, which provide communication using techniques similar to Remote Method Invocation. We compared Sama to the Mobile Process Groups approach for different numbers of agents. The test configuration consisted of 16 hosts in three 100 Mbps Ethernet networks. We chose a 20 KB string message to be disseminated among agents and measured the average message propagation time among group members. The measured time is the time between sending the message and receiving acknowledgements from all group members. Fig. 10 shows that our proposed mechanism is considerably faster than Mobile Process Groups. Since Sama uses a constant number of remote messages and disseminates messages concurrently, its propagation time remains almost the same for different numbers of group members. In the Mobile Process Groups approach, however, message propagation time increases linearly with group size.

[Fig. 10 plot: propagation time (ms, axis from 0 to 1800) versus number of agents (4 to 16) for Mobile Process Groups and Sama.]

Fig. 10. The message delivery time in Sama and mobile process groups based on the number of agents in a system with 16 hosts.


For a better evaluation of our implementation, we also tested it using ModelNet. ModelNet is a cluster-based emulation environment for the evaluation of distributed services. Application code runs on end hosts within a cluster that routes packets through a core. For each packet, this core introduces delay, bandwidth and loss characteristics according to a target topology. ModelNet also emulates each hop in the topology and thus tracks the effects of congestion and competition among packets. Therefore, it provides a very accurate environment for emulating the execution of our mechanism on a wide-area network like the Internet. We used Georgia Tech’s topology generator (GT ITM) to build a network topology for our test-bed. After setting up ModelNet, we evaluated simplified versions of the Sama and Mobile Process Groups implementations, which do not use acknowledgements, on the test-bed. We compared the two mechanisms with respect to the number of group members and the message size. Fig. 11 shows the emulation results for different message sizes. In this experiment, we emulated a system with 64 hosts and one agent on each host. Since messages are usually small events or notifications, we tested three message sizes (100, 1000 and 10000 bytes). As can be seen, in all cases Sama disseminates messages significantly faster than Mobile Process Groups. The results also show that message size does not have a considerable impact on dissemination time. The next experiment varied the number of group members. We considered four cases, with 16, 32, 48 and 64 group members. Fig. 12 shows the message propagation time among group members in these four scenarios. As expected, the propagation time for Sama remains the same for different numbers of group members. In Mobile Process Groups, however, propagation time increases linearly with the number of group members, which is undesirable in large-scale systems with hundreds of group members.
This also validates our results in Fig. 10.

[Fig. 11 plot: propagation time (ms) versus message size (100, 1000 and 10000 bytes) for Mobile Process Groups and Sama. Data labels recovered from the plot: 50220, 50243, 50446 and 20472, 20479, 20577.]

Fig. 11. Message propagation time for Sama and mobile process groups for different message sizes in a system with 64 hosts.

[Fig. 12 plot: propagation time (ms) versus number of group members (16, 32, 48 and 64) for Mobile Process Groups and Sama.]

Fig. 12. Message propagation time for Sama and mobile process groups for different number of group members.

We have also evaluated the improved Sama mechanism that uses the Hop-Ring protocol and compared it with the previous one. We simulated both of our group communication mechanisms and Mobile Process Groups with the ns-2 network simulator (Network Simulator). For this simulation, we wrote a new Agent class in C++ for ns-2. The new class plays the role of the MDOs in our mechanism, and there is one of these agents on each host that should receive the messages. These agents implement the tree generation algorithm, and messages are disseminated among them through the binomial tree. Our sample internetwork consists of eight LANs connected through a group of interconnected routers. Fig. 13 shows a screenshot of the simulation.

Fig. 13. A screen snapshot of the simulation.

The message dissemination time considered in our simulation is the time from the start of message propagation by the sender’s Proxy MDO until that Proxy MDO has received all acknowledgements from its child MDOs in the tree. We ran the simulation for four different cases. We started with one MDO in every LAN and then increased the number of MDOs in each LAN to two, four and eight. Simulation results are shown in Figs. 14 and 15.

Fig. 14. Message propagation time for Sama and improved Sama.

Fig. 15. Message propagation time for three approaches.

Fig. 14 depicts the message propagation time for improved Sama, which uses the Hop-Ring protocol, in comparison with Sama. As can be seen, the Hop-Ring protocol yields a considerable improvement. Fig. 15 shows the message propagation time for Mobile Process Groups, Sama and improved Sama with Hop-Ring. Again, Sama clearly outperforms Mobile Process Groups, especially in groups with a large number of members. It can also be observed that the improved version of Sama performs well compared to the previous version and has almost the same message propagation time in all of the cases.


The reason for this is that our improved mechanism does not do any extra work as the number of MDOs increases.

8. Conclusions

We proposed Sama, a distributed and scalable application-level group communication mechanism for large-scale mobile agent applications. Sama uses Message Dispatcher Objects (MDOs), which are special objects on agent servers, to parallelize and speed up message propagation. Sama does not assume any special network model and can be used on heterogeneous networks like the Internet. The proposed mechanism automatically detects failures in the network and recovers from them. We also proposed a Hop-Ring protocol to improve the performance of Sama and exploit the capabilities of underlying networks whenever they are available. Our experimental results show that Sama reduces message dissemination time considerably. As a further improvement, we are considering generating an optimum binomial tree in the tree generation phase. To do this, we need to know network metrics such as the round trip time for each pair of nodes. However, finding the optimal tree is an NP-hard problem for heterogeneous networks (Khuller and Kim, 2004). We are working on this problem and are trying to devise a heuristic that generates a near-optimum binomial tree for a fully connected graph.

References

Amir Y, Stanton J. The Spread Wide Area Group Communication System. Johns Hopkins University, MD. Technical Report CNDS-98-4, 1998.
Assis Silva FM, Macêdo RJA. Reliable Communication for Mobile Agents with Mobile Groups. In: The Proceedings of the Workshop on Software Engineering and Mobility (co-located with IEEE/ACM ICSE 2001). Toronto, Ontario, Canada, 2001.
Brewington B, Gray R, Moizumi K, Kotz D, Cybenko G, Rus D. Mobile agents in distributed information retrieval. In: Intelligent Information Agents 1999:355–95.
Cabri G, Leonardi L, Zambonelli F. Mobile agent coordination for distributed network management. J Network Syst Manage 2001;9(4):435–56.
Chockler GV, Keidar I, Vitenberg R. Group communication specifications: a comprehensive study. ACM Comput Survey 2001;33(4).
Spyrou C, Samaras G, Pitoura E, Evripidou P. Mobile agents for wireless computing: the convergence of wireless computational models with mobile-agent technologies. Mobile Network Appl 2004;9(5):517–28.
Dasgupta P, Narasimhan N, Moser LE, Melliar-Smith PM. MAgNET: Mobile agents for networked electronic trading. IEEE Trans Knowledge Data Eng 1999;24(6):509–25.
Deering S. Host Extensions for IP Multicasting. IETF. RFC 1112.
Dolev D, Malki D. The Transis approach to high availability cluster communication. Commun ACM 1996;39(4).
Fuggetta A, Picco GP, Vigna G. Understanding code mobility. IEEE Trans Software Eng 1998;24(5).
Grasshopper, Release 2.2, Basics and Concepts (Revision 1.0), March 2001. At URL: http://www.Grasshopper.de
GT ITM: Georgia Tech Internetwork Topology Models. At URL: http://www.cc.gatech.edu/fac/Ellen.Zegura/graphs.html
Hartroth A, Hofmann M. Using IP multicast to improve communication in large-scale mobile agent systems. In: Proceedings of 31st Annual Hawaii International Conference on System Sciences (HICSS), vol. VII, Hawaii, 1998; pp. 64–73.
IBM Japan Research Group Aglets Workbench. At URL: www.aglets.trl.ibm.co.jp
Jafarpour H, Yazdani N. Sama: a scalable group communication mechanism for mobile agents. In: Proceedings of SNPD2003. Germany: Lubeck; 2003a. p. 506–11.
Jafarpour H, Yazdani N. A fast group communication mechanism for large-scale distributed objects. In: Lecture Notes in Computer Science 2003b;2889:1036–44.
Jafarpour H, Yazdani N. Provision of recovery from host failure for Sama group communication middleware for mobile agents. In: Proceedings of SERA2004. Los Angeles, CA, 2004. p. 242–7.
Jafarpour H, Yazdani N, Bazazzadeh N. Improving Sama group communication mechanism for mobile agents via a hop-ring protocol. In: The 2005 International Conference on Internet Computing (ICOMP’05), Las Vegas, NV, 2005.
Khuller S, Kim Y-A. On broadcasting in heterogeneous networks. In: ACM-SIAM symposium on discrete algorithms. Louisiana: New Orleans; 2004. p. 1011–20.
McCormick J, Chacón D, McGrath S, Stoneking C. A distributed event messaging system for mobile agent communication. Lockheed Martin Advanced Technology Laboratories, Technical Report TR-01-02, 2000.
ModelNet. At URL: http://issg.cs.duke.edu/modelnet.html
Moser LE, Melliar-Smith PM, Agarwal DA, Budhia RK, Lingley-Papadopoulos CA. Totem: a fault-tolerant multicast group communication system. Commun ACM 1996;39(4).
Murphy AL, Picco GP. Reliable communication for highly mobile agents. J Autonom Agents Multi-Agent Syst 2002:81–100.
Network Simulator (ns-2). At URL: http://www.isi.edu/nsnam/ns/
Recursion Software, Inc. Voyager ORB Developer’s Guide, 2003. At URL: www.recursionsw.com, 2003.
van Renesse R, Birman K, Maffeis S. Horus: a flexible group communication system. Commun ACM 1996;39(4).
Wijngaards NJE, Overeinder BJ, van Steen M, Brazier FMT. Supporting internet-scale multi-agent systems. Data Knowledge Eng 2002;41(2–3):229–45.
Wojciechowski PT. Algorithms for Location-Independent Communication between Mobile Agents. Département Systèmes de Communication, EPFL Technical Report DSC-2001/13, 2001.