Constructing application-layer multicast trees for minimum-delay message distribution

Information Sciences xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Information Sciences journal homepage: www.elsevier.com/locate/ins

Hwa-Chun Lin *, Tsung-Ming Lin, Cheng-Feng Wu

Department of Computer Science, National Tsing Hua University, Hsinchu 30013, Taiwan

Article info

Article history: Received 23 March 2009; received in revised form 25 March 2014; accepted 29 March 2014; available online xxxx

Keywords: Minimum-delay multicast; Heterogeneous nodal delay; Application-layer multicast

Abstract

This paper considers the problem of disseminating messages from a source to multiple destinations in an overlay network in which nodal delays for processing messages are taken into account in addition to communication delays. The objective is to find a multicast tree that delivers a message from a source to multiple destinations in the minimum delay time. This problem is referred to as the minimum-delay multicast problem with nodal processing delays. This paper develops several heuristic algorithms that are based on the principle of iteratively selecting one of the destination nodes not yet on the current multicast tree and attaching the selected node to it until all destination nodes are on the multicast tree. A new delay measure called the reception-and-processing delay is introduced as one of the delay measures used in selecting a destination node to attach to the current multicast tree. This paper finds that the least-delay path from the current multicast tree to a destination node not yet on the tree may intersect or overlap with the current multicast tree if a commonly used method is employed to find the least-delay path. An efficient method is devised to avoid this undesirable phenomenon. The performances of the four heuristic algorithms and the effects of various characteristics of the overlay networks on those performances are studied via simulations.

© 2014 Elsevier Inc. All rights reserved.

* Corresponding author. Tel.: +886 3 5731071; fax: +886 3 5723694. E-mail address: [email protected] (H.-C. Lin).

http://dx.doi.org/10.1016/j.ins.2014.03.130
0020-0255/© 2014 Elsevier Inc. All rights reserved.

Please cite this article in press as: H.-C. Lin et al., Constructing application-layer multicast trees for minimum-delay message distribution, Inform. Sci. (2014), http://dx.doi.org/10.1016/j.ins.2014.03.130

1. Introduction

A number of applications, such as multicast video/audio streaming, distance teaching/learning with multicast support, and multicast data transfer, require dissemination of messages from a source to multiple destinations in a fast and efficient manner. Using a multicast tree to deliver messages from a source to a number of destinations is a common technique for point-to-multipoint or group communications. To achieve fast delivery of a message, it is desirable to construct a multicast tree such that the time interval measured from the beginning of the message delivery until the message is received by all destinations is as short as possible.

Multicast technologies that replicate messages at routers are known as IP-layer multicast. A survey of IP-layer multicast routing algorithms and protocols can be found in [19]. IP-layer multicast has not been widely deployed due to various deployment issues such as router migration, group management, and security [8]. The lack of IP-layer multicast support has led to numerous research and development activities on application-layer multicast (ALM), in which message replication is performed at end hosts. Comprehensive surveys of ALM can be found in [1,13,23]. Shifting the functionality of message replication from routers to end hosts has the advantage of easier deployment of multicast services. The disadvantage of using application-layer multicast to provide multicast services is increased consumption of the processing and network resources of the end hosts that perform message replication for other end hosts. This drawback is balanced by the benefit of easy deployment of multicast services.

Most research on multicast takes only the communication delays or costs on the links into account. Some studies also include nodal processing delays in their models. For the purpose of disseminating messages to multiple destinations in minimum delay time, the models considered in [2–6,10,11,14,16–18] take into account the nodal processing delays for sending messages from one node to other nodes. The basic model with nodal processing delays is the telephone model [10,11,16,17], in which all nodes communicate synchronously and each node sends or receives a single message in one time unit or one communication round. The problem of finding a scheme to send a message from a node to all other nodes in the minimum number of rounds is known as the telephone broadcast problem, which is NP-complete [12]. The model presented in [5] takes the processing delays for receiving messages into account. The model described in [18] generalizes the model in [5] such that the delay for processing a message depends on the size of the message. More general models are the postal model [2] and the LogP model [6,14]. In these two models, a sending node is allowed to send a message to another node before the message is completely received by the current receiving node, and a receiving node is free at the beginning stage of the delivery process.
Both the postal model and the LogP model assume that the delays for sending the same message between different sending and receiving nodes are the same. The heterogeneous postal model [3] generalizes the postal model such that the delays for sending the same message may differ for different sending and receiving nodes. The overlay model in [4] is similar to the heterogeneous postal model [3]. The heterogeneous postal model [3] incorporates the sending and receiving delays in the communication delay, whereas the overlay model [4] separates the sending delay from the communication delay. The sending delay is referred to as the processing delay in the overlay model [4].

This paper adopts the overlay model described in [4]. The objective is to find a multicast tree that disseminates a message from a source node to a set of destination nodes in minimum delay time. The problems of finding minimum-delay multicast schemes using these more general models are NP-complete since these models include the telephone model as a special case. An approximation algorithm has been presented in [3] for the minimum-delay multicast problem that employs the postal model. An approximation algorithm and a heuristic algorithm have been developed in [4] for the minimum-delay multicast problem that incorporates the overlay model.

This paper concentrates on developing heuristic algorithms and studying how various characteristics of the overlay networks affect the performances of the heuristic algorithms. Starting with the source node as the initial tree, the heuristic algorithms studied in this paper are based on the principle of adding one destination node at a time to the current multicast tree until all destination nodes are on the multicast tree. This paper identifies a pitfall that may cause intersections or overlaps of the current multicast tree and the least-delay path to a destination node while trying to attach a destination node to the current multicast tree.
To get around this problem, a procedure is devised to contract the current multicast tree into a single super node before calculating the least-delay path to a destination node. A typical delay measure for selecting one of the destination nodes to attach to the current multicast tree is the delay until a node receives the message. This paper introduces a new delay measure, called the reception-and-processing delay, that takes into account the nodal processing delay of the candidate destination node in order to capture the effect of that delay when one or more child nodes are later attached to the candidate destination node. Our simulation results show that a heuristic algorithm that takes into account the nodal processing delays of the candidate destination nodes and iteratively adds the destination node with the least reception-and-processing delay to the current multicast tree yields the lowest average overall multicast delay among the heuristic algorithms studied in this paper for a wide range of overlay network characteristics.

The rest of this paper is organized as follows. The minimum-delay multicast problem with nodal processing delays is defined in the next section. The heuristic algorithms studied in this paper are described in Section 3. The performances of the heuristic algorithms are studied via simulations in Section 4. Finally, some concluding remarks are given in Section 5.

2. The problem

Let $G = (V, E)$ be a connected directed graph, where $V$ is the set of vertices and $E$ is the set of directed edges. Each vertex represents a host (or node), and each directed edge represents a directed communication channel from one host to another, which can be a TCP connection or a UDP connection. If there is a directed edge from node $u$ to node $v$, there is also a directed edge from node $v$ to node $u$. If the hosts are connected by an underlying overlay network, the existence of a pair of directed edges between two hosts depends on whether there is a communication link between them in the overlay network. In general, the topology of an overlay network can be arbitrary. Thus, the graph $G$ may have an arbitrary topology.

Each directed edge $(u, v) \in E$ from node $u$ to node $v$ is associated with a delay $c_{uv}$ representing the communication delay for sending a message from node $u$ to node $v$. The communication delay $c_{uv}$ depends on the path taken by the message from node $u$ to node $v$. Measurements of Internet delays can be found in [15,20,22]. Most Internet delays are on the order of milliseconds, ranging from 1 ms to hundreds of milliseconds [15,20,22].

This paper considers multicast applications in which a message may incur a non-negligible processing delay at a host (or node). Media transcoding is an example of an operation that may be performed on the messages at a host. Measurements of software processing delays for media transcoding at end hosts can be found in [9,21]. Duysburgh et al. [9] proposed a multicast service for delivering layered audio with media transcoding. In their experiments, a computer with a 1 GHz AMD-K7 processor takes 3.64 ms, 3.65 ms, 3.77 ms, and 4.07 ms, respectively, to transcode a 10 ms voice fragment from PCM16, G.711 μ-law, G.726 4 bits, and G.728 encoded data into G.729 encoded data. Shorfuzzaman et al. [21] designed a video multicast system that performs requantization transcoding on the DCT (discrete cosine transform) coefficients of MPEG-1 video streams based on the requirements and available bandwidths of the downstream receivers in the multicast tree. Experiments were carried out to measure a number of performance data using an Intel IXP1200-based implementation. The per-frame processing times for transcoding 300 Kbps video streams into 200 Kbps streams are in the range of 10–50 ms for a single-threaded implementation of the transcoder. Compared with the Internet delays, the processing delays for media transcoding are non-negligible.

Another property of the multicast applications considered in this paper is that message processing at each node is serialized.
Messages generated or arriving at a host for the same multicast application may need to be processed sequentially for a number of reasons. First, a host usually has a single active NIC (network interface card) at any given time, so sending multiple messages through the NIC must be serialized. Second, a host may have only a single CPU that executes programs sequentially, so messages are processed sequentially. Third, the multicast application software may be written as a single-threaded program, so messages are processed by the software sequentially.

Each node $u \in V$ is associated with a processing delay $p_u$ representing the delay for processing a message at node $u$. To send a message from node $u$ to multiple other nodes, say nodes $v_1, v_2, \ldots, v_k$, in the listed order, the message is first processed by node $u$, incurring a processing delay of $p_u$. The message is then transmitted over edge $(u, v_1)$ to node $v_1$, incurring a communication delay of $c_{uv_1}$. Immediately after the message has been processed for sending to node $v_1$, node $u$ is available for processing a second copy of the message for sending to node $v_2$. After the second copy is processed, it is transmitted over edge $(u, v_2)$ to node $v_2$, incurring a communication delay of $c_{uv_2}$. The time interval for processing the second copy overlaps with the time interval for sending the previous copy of the message over edge $(u, v_1)$ to node $v_1$. We assume that all nodes are non-lazy; i.e., once a node has received a message that it needs to distribute, it does not become idle until it has finished distributing the message.

Suppose that node $u$ starts delivering the message at time 0. Then, node $u$ will be available at time $k p_u$ and node $v_i$ ($i = 1, 2, \ldots, k$) will receive the message at time $i p_u + c_{uv_i}$. The time at which node $u$ becomes available is called the ready time of node $u$. The delay for node $v$ to receive the message is referred to as the reception delay of node $v$.
Note that the time instant at which each node receives the message will be different if the nodes $v_1, v_2, \ldots, v_k$ are ordered differently.

Given a connected directed graph $G = (V, E)$, a source node $s \in V$, and a set of destination nodes $M \subseteq V$, the goal is to find an ordered tree $T$, rooted at $s$ and containing all destination nodes in $M$, to send a message from the source node $s$ to all the nodes in the destination set $M$ such that the time at which the message has been received by all destination nodes is as early as possible. In other words, the delay until all destination nodes have received the message is minimized.
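The delay model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the paper: the function name `reception_times` and the dictionary-based inputs (`children`, `p`, `c`) are hypothetical, and every non-source node is assumed to start processing as soon as it receives the message (the non-lazy assumption).

```python
from collections import deque


def reception_times(children, p, c, source):
    """Reception time of every node in an ordered multicast tree.

    children[u] is the ordered list of u's children, p[u] is the processing
    delay of u, and c[(u, v)] is the communication delay of edge (u, v).
    All three containers are illustrative, not taken from the paper.
    """
    recv = {source: 0}  # the source starts delivering at time 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for i, v in enumerate(children.get(u, []), start=1):
            # The i-th copy is ready after i processing delays at u,
            # then travels over edge (u, v).
            recv[v] = recv[u] + i * p[u] + c[(u, v)]
            queue.append(v)
    return recv


# Made-up example: s serves a before b, and a forwards to d.
children = {"s": ["a", "b"], "a": ["d"]}
p = {"s": 2, "a": 3, "b": 1, "d": 1}
c = {("s", "a"): 5, ("s", "b"): 1, ("a", "d"): 4}
times = reception_times(children, p, c, "s")  # a: 7, b: 5, d: 14
overall_delay = max(times.values())           # 14
```

Swapping the order of the children of `s` to `["b", "a"]` changes the result to b: 3, a: 9, d: 16, which illustrates why the tree must be ordered.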

3. Heuristic algorithms

3.1. Skeleton algorithm

This paper develops several heuristic algorithms that are based on the following skeleton algorithm. Starting with the source node $s$ as the initial multicast tree, the destination nodes are added to the current multicast tree one by one until all destination nodes are on the multicast tree. The skeleton algorithm is given as follows:

1. Set node $s$ as the initial multicast tree.
2. Repeat the following steps until all destination nodes are on the multicast tree:
   (a) Find the points of attachment: Consider each destination node not yet on the current multicast tree as a candidate node and find the location on the current multicast tree to attach the candidate node based on a delay measure and a decision rule. This decision rule will be referred to as the attachment rule.
   (b) Select a destination node: Choose one of the destination nodes not yet on the current multicast tree based on the same delay measure used in step (2a) and another decision rule. This decision rule will be referred to as the selection rule.
   (c) Attach the selected node to the current multicast tree.
3. Calculate the optimal ordering for the multicast tree.

After adding all destination nodes to the multicast tree, the procedure devised in [4] is used to calculate the optimal ordering for the multicast tree. Given an unordered tree, the optimal order for the children of each node can be easily calculated using a recursive procedure developed in [4]. The recursive procedure takes a bottom-up approach that first calculates the optimal order for the children of the sub-trees at the lowest level. It then calculates the optimal order for the children of the sub-trees at the next lowest level, and so on, until the order for the children of the root is calculated.

3.2. The delay measure

The delay measure used in steps (2a) and (2b) of the skeleton algorithm plays an important role. The delay until a destination node receives the message, defined as the reception delay, is used as the delay measure in the heuristic algorithm developed in [4]. Let $T$ be the current multicast tree and let $\phi$ be a path from tree $T$ to node $u$. We use the notation $r(u, T, \phi)$ to denote the reception delay of node $u$ when it is attached to tree $T$ through path $\phi$.

This paper introduces a new delay measure that takes into account the processing delay of the destination node under consideration, so that it captures the effect of the processing delay in case one or more child nodes are later attached to that node. Specifically, the new delay measure is the sum of the reception delay and the processing delay of the destination node under consideration, which will be referred to as the reception-and-processing delay. Let $r_p(u, T, \phi)$ denote the reception-and-processing delay of node $u$ when it is attached to tree $T$ through path $\phi$. The reception-and-processing delay of node $u$, $r_p(u, T, \phi)$, is calculated as follows:

$$r_p(u, T, \phi) = r(u, T, \phi) + p_u \tag{1}$$

3.3. The attachment rule

The purpose of the attachment rule in step (2a) of the skeleton algorithm is to find the location on the current multicast tree to attach a destination node under consideration. Recall that the objective is to minimize the delay until all destination nodes have received the message. When each destination node is considered individually, it is reasonable, for the benefit of that node, to attach it to the current tree such that the value of its delay measure is minimum. Thus, the attachment rule in step (2a) of the heuristic algorithms studied in this paper is to minimize either the reception delay or the reception-and-processing delay. Let $\Phi$ be the set of all possible paths from tree $T$ to node $u$. Let $r_{\min}(u, T)$ and $r_{p\min}(u, T)$ denote, respectively, the minimum possible reception delay and reception-and-processing delay of node $u$ when node $u$ is attached to tree $T$. Then, $r_{\min}(u, T)$ and $r_{p\min}(u, T)$ are respectively defined as follows:

$$r_{\min}(u, T) = \min_{\phi \in \Phi} r(u, T, \phi) \tag{2}$$

$$r_{p\min}(u, T) = \min_{\phi \in \Phi} r_p(u, T, \phi) \tag{3}$$

Note that

$$r_{p\min}(u, T) = r_{\min}(u, T) + p_u \tag{4}$$
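Eq. (4) holds because the processing delay $p_u$ does not depend on the chosen path. A short derivation from the definitions in Eqs. (1)–(3), writing $\Phi$ for the set of all possible paths from tree $T$ to node $u$, may make the step explicit:

```latex
\begin{align*}
r_{p\min}(u, T)
  &= \min_{\phi \in \Phi} r_p(u, T, \phi)
     && \text{by Eq. (3)} \\
  &= \min_{\phi \in \Phi} \bigl[ r(u, T, \phi) + p_u \bigr]
     && \text{by Eq. (1)} \\
  &= \Bigl[ \min_{\phi \in \Phi} r(u, T, \phi) \Bigr] + p_u
     && \text{since } p_u \text{ does not depend on } \phi \\
  &= r_{\min}(u, T) + p_u
     && \text{by Eq. (2)}
\end{align*}
```

In particular, the same path attains both minima, so the attachment point of a candidate node is the same under either delay measure; only the selection among candidates can differ.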

3.4. The selection rule

The purpose of the selection rule in step (2b) of the skeleton algorithm is to select one of the destination nodes not yet on the multicast tree in order to attach it to the multicast tree. The aim is that, after all destination nodes are attached to the multicast tree, the delay until all destination nodes have received the message is short. The decision to choose one destination node at the current iteration of the algorithm may affect the decisions made at the following iterations. Thus, the choice of the decision rule in step (2b) is critical in trying to minimize the overall delay. Two options are considered for the selection rule in step (2b) of the skeleton algorithm. One of the options is to choose the destination node with the smallest $r_{\min}(\cdot, T)$ or $r_{p\min}(\cdot, T)$, depending on the delay measure used, which will be referred to as the smallest-first rule. Suppose that node $v$ is the selected node. Then

$$r_{\min}(v, T) = \min_{u \in M - V[T]} r_{\min}(u, T) \tag{5}$$

or

$$r_{p\min}(v, T) = \min_{u \in M - V[T]} r_{p\min}(u, T) \tag{6}$$

depending on the delay measure used, where $V[T]$ is the set of nodes on tree $T$. The reason for using this option for the selection rule is the hope that selecting the destination node with the smallest $r_{\min}(\cdot, T)$ or $r_{p\min}(\cdot, T)$ at each iteration will also lead to a short overall delay. Intuitively, selecting the destination node with the smallest $r_{\min}(\cdot, T)$ or $r_{p\min}(\cdot, T)$ in step (2b) at each iteration will result in connecting destination nodes with smaller values of the delay measure to the tree before those with larger values. A destination node attached to the multicast tree at an early iteration is likely to have one or more child nodes. Thus, a delay measure that takes into account the processing delay of each destination node is expected to yield a smaller overall delay.

The other option for the selection rule is to choose the destination node with the largest $r_{\min}(\cdot, T)$ or $r_{p\min}(\cdot, T)$, depending on the delay measure used, which will be referred to as the largest-first rule. Suppose that node $v$ is the selected node. Then

$$r_{\min}(v, T) = \max_{u \in M - V[T]} r_{\min}(u, T) \tag{7}$$

or

$$r_{p\min}(v, T) = \max_{u \in M - V[T]} r_{p\min}(u, T) \tag{8}$$

depending on the delay measure used. The rationale behind this option for the selection rule is that sending the message to destination nodes with larger delays first will hopefully reduce the reception delays of these nodes and hence the overall delay.

3.5. The heuristic algorithms

Fixing the delay measure and the decision rules in the second step of the skeleton algorithm defines a heuristic algorithm. Recall that we have described two delay measures, namely the reception delay and the reception-and-processing delay, and two selection rules, namely the smallest-first and largest-first rules. The two delay measures and the two selection rules are combined to obtain the following four heuristic algorithms:

1. Smallest reception delay first (SRDF) algorithm.
2. Largest reception delay first (LRDF) algorithm.
3. Smallest reception-and-processing delay first (SRPDF) algorithm.
4. Largest reception-and-processing delay first (LRPDF) algorithm.
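The way the four algorithms differ only in step (2b) can be sketched as follows. This is a hypothetical illustration, not the authors' code: the function `select_destination`, the string labels for the measure and the rule, and the tie-breaking by sorted node label are all assumptions. The input `r_min` is taken to hold $r_{\min}(u, T)$ for every destination $u$ not yet on the tree.

```python
def select_destination(r_min, p, measure="reception", rule="smallest"):
    """Pick the next destination node to attach (step 2b of the skeleton).

    r_min maps each candidate u to r_min(u, T); p maps u to p_u.
    Ties are broken by node label, which is an arbitrary choice.
    """
    def value(u):
        # Eq. (4): the reception-and-processing delay adds p_u.
        extra = p[u] if measure == "reception_and_processing" else 0
        return r_min[u] + extra

    pick = min if rule == "smallest" else max
    return pick(sorted(r_min), key=value)


# The four heuristics as (delay measure, selection rule) pairs.
ALGORITHMS = {
    "SRDF": ("reception", "smallest"),
    "LRDF": ("reception", "largest"),
    "SRPDF": ("reception_and_processing", "smallest"),
    "LRPDF": ("reception_and_processing", "largest"),
}

# Made-up candidate values for illustration only.
r_min = {"a": 5, "b": 7, "c": 6}
p = {"a": 9, "b": 1, "c": 3}
choices = {name: select_destination(r_min, p, *cfg)
           for name, cfg in ALGORITHMS.items()}
# SRDF picks a, LRDF picks b, SRPDF picks b, LRPDF picks a
```

With these made-up numbers, node `a` is the nearest candidate but has a large processing delay, so SRDF and SRPDF disagree on which node to attach first.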

The LRDF algorithm is the same as the heuristic algorithm developed in [4] except that a different procedure is used to find the minimum reception delay $r_{\min}(u, T)$ for each destination node $u$ not on the current multicast tree $T$. What remains to be described are the details of finding the minimum reception delay $r_{\min}(u, T)$ or the minimum reception-and-processing delay $r_{p\min}(u, T)$ for each destination node $u$ not on the current multicast tree $T$. These details are given in the following.

3.6. Procedure for finding $r_{\min}(u, T)$ and $r_{p\min}(u, T)$

It is sufficient to find $r_{\min}(u, T)$ for each destination node $u$ since $r_{p\min}(u, T)$ is simply $r_{\min}(u, T) + p_u$. A commonly used method to find the minimum possible reception delay $r_{\min}(u, T)$ for each destination node $u$ not yet on the current multicast tree $T$ is as follows:

1. Calculate the ready time of each node $x$ on the tree $T$. Recall that the ready time of node $x$ is defined as the time at which node $x$ becomes available. Let the ready time of node $x$ on the tree $T$ be denoted as $t(x, T)$. (Note that the ready time of each node on the tree $T$ can be calculated in the process of constructing the tree.)
2. For each node $x$ on the tree $T$, find the least-delay path from node $x$ to each destination node $u$ not yet on the multicast tree $T$. The delay of a path from node $x$ to node $u$ is the sum of the processing delays and communication delays of the nodes and links along the path. Let the delay of the least-delay path from node $x$ to node $u$ be $t(x, u)$.
3. Calculate $r_{\min}(u, T)$ for each destination node $u$ not yet on tree $T$ as follows:

$$r_{\min}(u, T) = \min_{x \in T} \left[ t(x, T) + t(x, u) \right] \tag{9}$$
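The three steps of this commonly used method can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in [4]: the names `least_delays` and `r_min_common`, the adjacency-dictionary graph format, and the convention that the path delay $t(x, u)$ excludes the processing delay of the destination $u$ itself are all assumptions made for illustration.

```python
import heapq


def least_delays(adj, p, src):
    """Dijkstra from src, where edge (a, b) costs p[a] + c_ab.

    adj[a] maps each neighbour b to the communication delay c_ab.
    """
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, a = heapq.heappop(heap)
        if d > dist.get(a, float("inf")):
            continue  # stale heap entry
        for b, c_ab in adj.get(a, {}).items():
            nd = d + p[a] + c_ab
            if nd < dist.get(b, float("inf")):
                dist[b] = nd
                heapq.heappush(heap, (nd, b))
    return dist


def r_min_common(adj, p, ready, destinations):
    """Eq. (9): minimum over tree nodes x of t(x, T) + t(x, u).

    ready maps every node x on the tree to its ready time t(x, T).
    """
    best = {u: float("inf") for u in destinations}
    for x, t_x in ready.items():
        dist = least_delays(adj, p, x)  # one Dijkstra run per tree node
        for u in destinations:
            if u in dist:
                best[u] = min(best[u], t_x + dist[u])
    return best
```

Note that nothing in this sketch stops a least-delay path that starts at $x$ from passing through other nodes that are already on the tree.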

This commonly used method was employed in calculating $r_{\min}(u, T)$ in the heuristic algorithm developed in [4]. However, the method has a pitfall: the least-delay path from a node $x$ on the tree $T$ to a destination node $u$ not yet on the tree may intersect or partially overlap with the tree $T$. Fig. 1 shows an example of an intersection between the newly added least-delay path and the current multicast tree, and Fig. 2 shows an example of an overlap between the newly added least-delay path and the current multicast tree. Both figures display the multicast tree at the end of each iteration of the SRDF algorithm when the above commonly used method is used to find the minimum possible reception delay of each node not on the current multicast tree.

Intersection of the branches of a multicast tree may cause a message to arrive at an intersecting node that is busy processing a copy of the same message from a different parent node. In this case, the message received later may need to wait until the intersecting node is free instead of being processed immediately. Thus, the actual reception delay of the message at any child node of the intersecting node may not be the same as the calculated reception delay at that node. Similarly, a partial overlap between the current multicast tree and the newly added least-delay path implies that the calculated reception delay at a node not yet on the tree may not be the actual reception delay.

Fig. 1. Multicast tree after each iteration using the commonly used method for finding the minimum possible reception delay of each node not on the current multicast tree in the SRDF algorithm. Path $v \to w \to u$ intersects with the multicast tree.

Fig. 2. Multicast tree after each iteration using the commonly used method for finding the minimum possible reception delay of each node not on the current multicast tree in the SRDF algorithm. Path $s \to w \to v$ overlaps with the multicast tree.

Consider sending a message from node $s$ to nodes $u$, $v$, $w$, and $y$ using the multicast tree in Fig. 1(e). By applying the method in [4] for finding the optimal ordering, the optimal order for disseminating the message is calculated to be as follows. Node $s$ first sends a copy of the message to nodes $v$ and $u$ using path $s \to v \to w \to u$ and then sends a second copy of the message to nodes $w$ and $y$ using path $s \to w \to y$. Node $s$ starts processing and sending the message at time 0. The second copy of the message arrives at node $w$ at time 15, before the first copy, which arrives at node $w$ at time 16. While the second copy is being processed by node $w$, the first copy has to wait until time 21 before it can be processed. The second copy will arrive at node $y$ at time 22. The first copy will arrive at node $u$ at time 30, which is larger than the largest calculated reception delay, 25, in Fig. 1(e). It takes 30 time units to deliver the message to all destination nodes.

Suppose that a message is to be delivered from node $s$ to nodes $u$ and $v$ using the multicast tree in Fig. 2(c). The optimal order for disseminating the message is to first send a copy of the message to node $u$ using path $s \to w \to u$ and then send a second copy of the message to node $v$ using path $s \to w \to v$. Node $s$ starts processing and sending the message at time 0. The first copy of the message arrives at node $w$ at time 7. The second copy of the message arrives at node $w$ at time 12. The second copy needs to wait until time 15 before it can be processed by node $w$ since the first copy is being processed. The first copy will be received by node $u$ at time 21. The second copy will be received by node $v$ at time 28, which is greater than the largest calculated reception delay, 26, in Fig. 2(c). It takes 28 time units to deliver the message to all destination nodes.
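The queueing effect in the Fig. 2(c) example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the helper `serve_fifo` is hypothetical, and the values $p_w = 8$, $c_{wu} = 6$, and $c_{wv} = 5$ are not stated in the text but are inferred from the quoted times (arrivals at $w$ at times 7 and 12, a wait until time 15, and receptions at times 21 and 28).

```python
def serve_fifo(arrivals, service):
    """(start, finish) times for copies served one at a time in arrival order."""
    free_at = 0
    schedule = []
    for t in sorted(arrivals):
        start = max(t, free_at)   # wait if the node is still busy
        free_at = start + service
        schedule.append((start, free_at))
    return schedule


# Copies reach node w at times 7 and 12, as stated in the text.
# p_w = 8, c_wu = 6 and c_wv = 5 are inferred values, not given explicitly.
(first_start, first_done), (second_start, second_done) = serve_fifo([7, 12], service=8)
recv_u = first_done + 6    # first copy reaches u
recv_v = second_done + 5   # second copy reaches v
```

Under these inferred values the second copy starts at time 15 rather than 12, and the two receptions come out at times 21 and 28, matching the figures quoted above.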

Intersections and partial overlaps between the current multicast tree and the newly added least-delay path are undesirable. One way to get around the above problem is to modify step (2) of the commonly used method for finding $r_{\min}(u, T)$ as follows:

2. For each node $x$ on the tree $T$, remove all nodes on the tree $T$ except node $x$ and remove all edges among the nodes on the tree. Then, find the least-delay path from node $x$ to each destination node $u$ not yet on the multicast tree $T$.

This modified method is not efficient because it generates a new graph for each node $x$ on the tree $T$ and then uses that graph to find the least-delay path to each destination node $u$ not yet on the multicast tree $T$. The computational complexity of this modified method is dominated by that of the modified step (2), which is $O(|V|(|V| + |E|) \log |V|)$, where $O((|V| + |E|) \log |V|)$ is the computational complexity of Dijkstra's algorithm [7] for finding the least-delay paths to the destination nodes not yet on the current multicast tree.

In this paper, a more efficient procedure is devised to avoid the above problem in order to find $r_{\min}(u, T)$. The procedure takes the graph $G$ as input, assigns appropriate costs to the nodes and edges, contracts the current tree $T$ into a single super node, and calculates the least-cost paths from the super node to the nodes not on the tree $T$. Thus, only one new graph is generated, by contracting $T$ into a single super node, and Dijkstra's algorithm needs to be performed only once in order to find the least-cost paths from the super node to the nodes not on the tree $T$. Details of the procedure are given as follows:

1. Calculate the ready time $t(x, T)$ of each node $x$ on the current multicast tree $T$.
2. For each edge $(x, y)$, where $x \in T$ and $y \in G - T$, assign $t(x, T) + p(x) + c(x, y)$ as the cost of edge $(x, y)$. For each edge $(x, y)$, where $x, y \in G - T$, assign $p(x) + c(x, y)$ as the cost of edge $(x, y)$.
3. For each node $y \in G - T$, if there are two or more nodes, say $x_1, x_2, \ldots, x_k \in T$, on tree $T$ such that each node has an edge to node $y$, find the node with the minimum edge cost to node $y$. Let $(x_j, y)$ be the edge with minimum edge cost. Keep edge $(x_j, y)$ and remove edge $(y, x_j)$ and edges $(x_i, y)$ and $(y, x_i)$ for $i = 1, \ldots, j - 1, j + 1, \ldots, k$.
4. Contract tree $T$ into a single super node. Assign zero cost to the super node.
5. For each node $x \in G - T$, assign zero cost to node $x$.
6. Calculate the least-cost paths from the super node to the nodes in $G - T$ using Dijkstra's algorithm.

The cost of the least-cost path from the super node to a node, say node $u$, not on the multicast tree is the minimum possible reception delay $r_{\min}(u, T)$ of node $u$. The minimum possible reception-and-processing delay $r_{p\min}(u, T)$ of node $u$ is calculated as $r_{\min}(u, T) + p_u$. The computational complexity of the above procedure is dominated by that of step (6), which is $O((|V| + |E|) \log |V|)$.
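The contraction procedure can be sketched in Python as a single Dijkstra run. This is a sketch under stated assumptions rather than the authors' implementation: the function name `r_min_contracted` and the adjacency-dictionary input are hypothetical, and the super node is represented implicitly by seeding the priority queue with the cheapest tree-to-outside edge of every neighbouring node, which has the same effect as steps 3 to 5.

```python
import heapq


def r_min_contracted(adj, p, ready):
    """Least-cost paths from the contracted tree to every node off the tree.

    adj[a] maps each neighbour b to the communication delay c(a, b),
    p[a] is the processing delay of a, and ready[x] is the ready time
    t(x, T) of every node x on the current tree T.
    """
    tree = set(ready)
    inf = float("inf")
    dist = {}
    # Steps 2-5: an edge leaving the tree costs t(x, T) + p(x) + c(x, y);
    # each off-tree node y keeps only its cheapest such edge.
    for x in tree:
        for y, c_xy in adj.get(x, {}).items():
            if y not in tree:
                cost = ready[x] + p[x] + c_xy
                if cost < dist.get(y, inf):
                    dist[y] = cost
    heap = [(d, y) for y, d in dist.items()]
    heapq.heapify(heap)
    # Step 6: Dijkstra over off-tree nodes, with edge cost p(a) + c(a, b).
    while heap:
        d, a = heapq.heappop(heap)
        if d > dist[a]:
            continue  # stale heap entry
        for b, c_ab in adj.get(a, {}).items():
            if b in tree:
                continue  # edges back into the tree were removed
            nd = d + p[a] + c_ab
            if nd < dist.get(b, inf):
                dist[b] = nd
                heapq.heappush(heap, (nd, b))
    return dist  # dist[u] plays the role of r_min(u, T)


# Made-up example: s and w are on the tree, v and u are not.
adj = {"s": {"w": 2, "v": 3}, "w": {"s": 2, "u": 6, "v": 5},
       "v": {"s": 3, "w": 1, "u": 9}, "u": {}}
p = {"s": 5, "w": 8, "v": 1, "u": 2}
ready = {"s": 10, "w": 15}
r_min = r_min_contracted(adj, p, ready)  # v: 18, u: 28
```

In this made-up example, node `u` is reached through `v` at cost 28 rather than directly from `w` at cost 29, and the path never re-enters the tree, so the computed value matches the delay the tree would actually deliver.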

Fig. 3. Multicast tree after each iteration using the method devised in this paper for finding the minimum possible reception delay of each node not on the current multicast tree in the SRDF algorithm. The graph in (b) is the graph obtained after assigning costs to the edges and contracting the initial tree into a single super node. Each of the graphs in (c), (d), and (e) is the graph obtained after assigning costs to the edges and contracting the current tree into a single super node. The graph in (f) is the final multicast tree.

Please cite this article in press as: H.-C. Lin et al., Constructing application-layer multicast trees for minimum-delay message distribution, Inform. Sci. (2014), http://dx.doi.org/10.1016/j.ins.2014.03.130


Fig. 3 shows the steps for constructing a multicast tree in the graph in Fig. 1(a) using the SRDF algorithm and applying the above procedure to find $r_{\min}(u, T)$ for each destination node $u$ not on the current multicast tree. The graph in Fig. 3(b) is the graph obtained after assigning costs to the edges and contracting the initial tree into a single super node. Each of the graphs in Fig. 3(c)–(e) is the graph obtained after assigning costs to the edges and contracting the current tree into a single super node. The graph in Fig. 3(f) is the final multicast tree. The optimal ordering for the multicast tree in Fig. 3(f) is to first send a copy of the message from node s to nodes v and u via path s–v–u, and then to send a second copy from node s to nodes w and y via path s–w–y. Node s starts processing and sending the message at time 0. It takes 26 time units for the first copy to arrive at node u and 22 time units for the second copy to arrive at node y. Thus, it takes 26 time units to deliver the message to all destination nodes. Recall that it takes 30 time units to deliver the message to all destination nodes using the multicast tree in Fig. 1(e). This shows that using the above procedure to find $r_{\min}(u, T)$ for each destination node $u$ not on the current multicast tree can shorten the overall message delivery delay.

4. Simulation study

Simulations are used to study the performances of the four heuristic algorithms, namely SRDF, LRDF, SRPDF, and LRPDF, obtained by combining the two delay measures (reception delay and reception-and-processing delay) with the two selection rules (smallest-first and largest-first). Note that the LRDF algorithm is not exactly the same as the heuristic algorithm proposed in [4] since the procedure for finding $r_{\min}(u, T)$ is different.

4.1. Simulation model

Random graphs are generated to represent the topologies of overlay networks. The random graphs are generated in the following manner so that the effects of the characteristics of the overlay networks on the performances of the heuristic algorithms can be studied conveniently. For a given number of nodes in a random graph, an edge probability $q$ is used to determine whether a pair of directed edges exists between any two nodes. Given any two nodes, a pair of directed edges exists between them with probability $q$, and no edge exists between them with probability $1 - q$. The communication delay of a directed edge is selected randomly between $\bar{c} - \bar{c}/2$ and $\bar{c} + \bar{c}/2$, where $\bar{c}$ is the average communication delay of the edges. The communication delays of the two directed edges are selected independently. The processing delay of a node is chosen randomly between $\bar{p} - \bar{p}/2$ and $\bar{p} + \bar{p}/2$, where $\bar{p}$ is the average processing delay of the nodes.

The values of the parameters used in our simulations are selected as follows unless the performances of the heuristic algorithms are being studied for the variation of a specific parameter. The number of nodes $|V|$ in a random graph is 100. The edge probability $q$ is 0.25. The average processing delay $\bar{p}$ and the average communication delay $\bar{c}$ are both 1. The node with the lowest ID is selected as the source node, and the rest of the nodes are destination nodes; i.e., $|M| = |V| - 1$.

For a given set of values of the parameters, one hundred random graphs are generated. Each of the four heuristic algorithms is applied to find a multicast tree on each of the 100 random graphs. For each multicast tree, the delay between the time instant at which the source node starts delivering a message and the time instant at which all destination nodes have received the message is calculated. Each data point in the figures to be presented is the average of the 100 delays of the 100 multicast trees. The 90% confidence intervals are also calculated and plotted in the figures.

4.2. Simulation results

The effects of the parameters (including the average edge degree of the nodes, the number of nodes in the network, the number of destination nodes, the processing delay, and the communication delay) on the performances of the four heuristic algorithms are studied. The results are described in the following.

4.2.1. Comparison for various average edge degrees

First, we study how the average edge degree of the nodes affects the performances of the four heuristic algorithms. The average edge degree of the nodes ranges from 2.5 to 50. For each value of the average edge degree, the edge probability $q$ is selected to yield that value. The average multicast delays produced by each of the four heuristic algorithms for various average edge degrees of the nodes in the networks are shown in Fig. 4. The following effects of the average edge degree on the multicast delays can be observed from the figure:

• When the average edge degree of the nodes is small, the constructed multicast tree tends to be deep, and hence its delay is large. As the average edge degree increases, the constructed multicast trees are expected to have smaller depth. Thus, in general, the multicast delay also decreases.
• The four heuristic algorithms yield similar average multicast delays when the average edge degree of the nodes is small (5 or less). When the average edge degree is 10 or more, the differences among the average multicast delays produced by the four algorithms become significant.
• When the average edge degree is 10 or more, the SRDF and SRPDF algorithms yield lower average multicast delays than the LRDF and LRPDF algorithms. This is because the SRDF and SRPDF algorithms try to keep the delay of the multicast tree small while adding each destination node to the multicast tree.

• When the average edge degree is 10 or more, the SRPDF algorithm produces lower average multicast delays than the SRDF algorithm. In other words, taking into account the processing delays of the destination nodes while selecting one of them for adding to the current multicast tree effectively reduces the delays of its children nodes (if any) and in turn reduces the delay of the multicast tree.
• When the average edge degree is 10 or more, the LRPDF algorithm yields higher average multicast delays than the LRDF algorithm. The reason is as follows. The purpose of taking into account the processing delays of the destination nodes while selecting one of them for adding to the current multicast tree is to reduce the delays of the children nodes. However, the destination node selected by the LRPDF algorithm may have a larger processing delay, since the LRPDF algorithm selects the destination node with the largest reception-and-processing delay for adding to the current multicast tree. Thus, the children nodes of the selected destination node will experience longer processing delays at the selected destination node. As a result, the delay of the multicast tree becomes larger.

Fig. 4. Performance comparison for various average edge degrees of the nodes in the networks, $|V| = 100$, $|M| = |V| - 1$, $\bar{p} = 1$, $\bar{c} = 1$. (Plot of delay versus average edge degree, from 2.5 to 50, for the LRPDF, LRDF, SRDF, and SRPDF algorithms.)

4.2.2. Comparison for various numbers of nodes $|V|$

Next, we study the effect of the number of nodes in the network on the multicast delay for a fixed average edge degree of 15. An average edge degree of 15 is achieved by properly choosing the edge probability while generating the random graphs. The number of nodes ranges from 20 to 100. Simulations are not performed for networks of size 10 since the number of nodes needs to be at least 16 in order to have an average edge degree of 15. Fig. 5 shows the average multicast delays produced by each of the four heuristic algorithms for various numbers of nodes in the networks with a fixed average edge degree of 15.
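The relation between the edge probability and the average edge degree is not spelled out in the text. A minimal Python sketch of the random-graph model of Section 4.1 is given here under the assumption that each node has $|V| - 1$ potential neighbours and that a pair of directed edges counts as one neighbour, so that $q = d / (|V| - 1)$ for a target average edge degree $d$; this assumption is consistent with the 16-node lower bound for $d = 15$ stated above. The function names, the uniform sampling of delays, and the use of integer node IDs are illustrative choices, not details taken from the paper.

```python
import random


def edge_probability(avg_degree, n):
    """Edge probability q giving an expected average edge degree of
    avg_degree in an n-node graph (assumed relation: q = d / (n - 1))."""
    if n < 2 or not 0 <= avg_degree <= n - 1:
        raise ValueError("average edge degree must lie in [0, n - 1]")
    return avg_degree / (n - 1)


def random_overlay(n, avg_degree, p_bar=1.0, c_bar=1.0, rng=None):
    """Generate one random overlay network.

    Returns (succ, p) where succ[x][y] is the communication delay of the
    directed edge (x, y) and p[x] is the processing delay of node x.
    Node 0 (the lowest ID) can then be taken as the source.
    """
    rng = rng or random.Random()
    q = edge_probability(avg_degree, n)
    # Processing delays drawn from [p_bar - p_bar/2, p_bar + p_bar/2].
    p = {v: rng.uniform(p_bar - p_bar / 2, p_bar + p_bar / 2) for v in range(n)}
    succ = {v: {} for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < q:
                # A pair of directed edges; the two delays are independent.
                succ[u][v] = rng.uniform(c_bar - c_bar / 2, c_bar + c_bar / 2)
                succ[v][u] = rng.uniform(c_bar - c_bar / 2, c_bar + c_bar / 2)
    return succ, p
```

Under this assumed relation, `edge_probability(15, 16)` evaluates to 1.0, i.e. a complete graph, and any smaller network cannot reach an average edge degree of 15, which is why a 10-node network is excluded.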
The following observations can be made from Fig. 5:

• In general, the average multicast delay increases as the number of nodes increases since the message needs to be delivered to more destination nodes.
• Regardless of the number of nodes in the networks,
  – the SRDF and SRPDF algorithms always yield lower average multicast delays than the LRDF and LRPDF algorithms;
  – the SRPDF algorithm always produces lower average multicast delays than the SRDF algorithm;
  – the LRPDF algorithm always yields higher average multicast delays than the LRDF algorithm.

Fig. 5. Performance comparison for various numbers of nodes in the networks with a fixed average edge degree of 15, $|M| = |V| - 1$, $\bar{p} = 1$, $\bar{c} = 1$. (Plot of delay versus number of nodes $|V|$, from 20 to 100, for the LRPDF, LRDF, SRDF, and SRPDF algorithms.)

4.2.3. Comparison for various sizes of multicast group

The multicast group of a multicast session includes the source node and the destination nodes; i.e., the size of the multicast group is $|M| + 1$. In this comparison, we study the effect of the size of the multicast group on the multicast delay for a fixed average edge degree of the nodes. The edge probability $q$ is selected such that the average edge degree of the nodes is 15. The size of the multicast group ranges from 10 to 100. Fig. 6 shows the average multicast delays produced by each of the four heuristic algorithms for various sizes of multicast group with a fixed average edge degree of 15. The following observations can be made from the figure:

• In general, the average multicast delay increases as the size of the multicast group increases since the message needs to be delivered to more destination nodes.
• The four heuristic algorithms yield similar average multicast delays when the size of the multicast group is small (10). As the size of the multicast group increases beyond 10, the differences among the average multicast delays produced by the four algorithms become significant.

Fig. 6. Performance comparison for various sizes of multicast group in the networks with a fixed average edge degree of 15, $|V| = 100$, $\bar{p} = 1$, $\bar{c} = 1$. (Plot of delay versus size of multicast group, from 10 to 100, for the LRPDF, LRDF, SRDF, and SRPDF algorithms.)

4.2.4. Comparison for various ratios of $\bar{p}$ to $\bar{p} + \bar{c}$

In this comparison, we study the effects of the ratio of $\bar{p}$ to $\bar{p} + \bar{c}$ on the multicast delay. The average multicast delays produced by each of the four heuristic algorithms for various ratios of $\bar{p}$ to $\bar{p} + \bar{c}$ are depicted in Fig. 7. The sum of the average processing delay $\bar{p}$ and the average communication delay $\bar{c}$ is 10; i.e., $\bar{p} + \bar{c} = 10$. The ratio of $\bar{p}$ to $\bar{p} + \bar{c}$ ranges from 0.1 to 0.9. The following observations can be made from the figure:

• The LRDF and LRPDF algorithms yield similar average multicast delays when $\bar{p}/(\bar{p} + \bar{c})$ is small. The SRDF and SRPDF algorithms also produce about the same average multicast delays when $\bar{p}/(\bar{p} + \bar{c})$ is small. The reason is that, when the processing delays are relatively small, it makes little difference whether the processing delays of the destination nodes are taken into account while selecting one of them for adding to the current multicast tree.
• The delay improvement of the SRPDF algorithm over the SRDF algorithm increases as $\bar{p}/(\bar{p} + \bar{c})$ increases. This is because the processing delays become more significant compared to the communication delays when $\bar{p}/(\bar{p} + \bar{c})$ increases. Thus, the effectiveness of taking into account the processing delays of the destination nodes while selecting one of them for adding to the current multicast tree becomes more evident.

Fig. 7. Performance comparison for various ratios of $\bar{p}$ to $\bar{p} + \bar{c}$, $|V| = 100$, $|M| = |V| - 1$, $q = 0.25$, $\bar{p} + \bar{c} = 10$. (Plot of delay versus the ratio, from 0.1 to 0.9, for the LRPDF, LRDF, SRDF, and SRPDF algorithms.)

4.2.5. Comparison for various ranges of processing delays and communication delays

In this comparison, our purpose is to study the effects of the variations of the processing delays and communication delays on the multicast delays. In our simulations, the values of the processing delays and communication delays are randomly selected within their respective ranges, so the variations of the processing delays and communication delays increase as the ranges widen. The average processing delay and communication delay are fixed at 1.0; i.e., $\bar{p} = 1$ and $\bar{c} = 1$. The width of the range varies from 0 to 2.0, that is, from the range [1.0, 1.0] to the range [0.0, 2.0]. The average multicast delays produced by each of the four heuristic algorithms for various ranges of processing delays and communication delays are shown in Figs. 8 and 9, respectively.

The following observations can be made from Fig. 8:

• The average multicast delays produced by the SRDF and SRPDF algorithms decrease as the variation of processing delays increases. The SRDF and SRPDF algorithms place those nodes with smaller delays from the source node closer to the source node such that their children nodes will experience smaller delays for receiving messages. With larger variation of processing delays, the effectiveness of the smallest-first selection rule in the SRDF and SRPDF algorithms becomes more evident.
• The average multicast delays produced by the LRDF and LRPDF algorithms also tend to decrease as the variation of processing delays increases, especially when the variation is large.
• When the variation of processing delays is small, the SRDF and SRPDF algorithms yield similar average multicast delays. This is expected, since the only difference between the SRDF and SRPDF algorithms is whether the processing delays of the destination nodes are taken into account while selecting one of them for adding to the current multicast tree. This is also the case for the LRDF and LRPDF algorithms.
• The improvement of the SRPDF algorithm over the SRDF algorithm increases as the variation of processing delays increases. This is likewise because the SRPDF algorithm takes into account the processing delays of the destination nodes while selecting one of them for adding to the current multicast tree.

The following observations can be made from Fig. 9:

• In general, the average multicast delays produced by the four algorithms decrease as the variation of communication delays increases.
• Regardless of the degree of variation of the communication delays,
  – the SRDF and SRPDF algorithms always yield lower average multicast delays than the LRDF and LRPDF algorithms;
  – the SRPDF algorithm always produces lower average multicast delays than the SRDF algorithm;
  – the LRPDF algorithm always yields higher average multicast delays than the LRDF algorithm.

Fig. 8. Performance comparison for various widths of processing-delay ranges, $|V| = 100$, $|M| = |V| - 1$, $q = 0.25$, $\bar{p} = 1$, $\bar{c} = 1$. (Plot of delay versus range of processing delays, from [1.0, 1.0] to [0.0, 2.0], for the LRPDF, LRDF, SRDF, and SRPDF algorithms.)

Fig. 9. Performance comparison for various widths of communication-delay ranges, $|V| = 100$, $|M| = |V| - 1$, $q = 0.25$, $\bar{p} = 1$, $\bar{c} = 1$. (Plot of delay versus range of communication delays, from [1.0, 1.0] to [0.0, 2.0], for the LRPDF, LRDF, SRDF, and SRPDF algorithms.)

5. Conclusions

This paper studies the minimum-delay multicast problem with heterogeneous nodal processing delays. Four heuristic algorithms, namely SRDF, SRPDF, LRDF, and LRPDF, have been developed by combining two delay measures (reception delay and reception-and-processing delay) with two rules (smallest-first and largest-first) for selecting one of the destination nodes for attaching to the current multicast tree. The reception-and-processing delay is a new delay measure introduced in this paper to capture the effect of the nodal processing delay when one or more children nodes are later attached to a destination node. In the process of developing the heuristic algorithms, a pitfall has been identified that may cause intersections or overlaps between the current multicast tree and the least-delay path to a destination node. An efficient procedure has been devised to avoid this problem by contracting the current multicast tree into a single super node before calculating the least-delay paths from the current multicast tree to the destination nodes. Simulations have been performed to study the average multicast delays produced by the four heuristic algorithms.
The main results obtained from our simulations are as follows:

• For wide ranges of the characteristics of the overlay networks (including the average edge degree of the nodes, the number of nodes in the networks, the size of the multicast group, the ratio between average processing delay and average communication delay, and the variations of the processing delays and communication delays),
  – the SRPDF algorithm, which takes into account the nodal processing delays of the destination nodes and iteratively adds the destination node with the smallest reception-and-processing delay to the current multicast tree, yields the lowest average multicast delay;
  – the SRDF and SRPDF algorithms produce lower average multicast delays than the LRDF and LRPDF algorithms.
• The LRDF and LRPDF algorithms yield slightly lower average multicast delays than the SRDF and SRPDF algorithms when the average edge degree of the nodes is small (5 or less).
• The improvement of the SRPDF algorithm over the other three algorithms increases as the ratio of the average processing delay to the average communication delay increases.
• The improvement of the SRPDF algorithm over the other three algorithms increases as the variation of the nodal processing delays increases.
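For reference, the way the four heuristics differ in their selection step can be summarized in a few lines of Python. This is only an illustrative sketch: the function name and the dictionary inputs are hypothetical, and the values of $r_{\min}(u, T)$ are assumed to have been computed beforehand for the destination nodes not yet on the tree.

```python
def select_destination(candidates, r_min, p, algorithm):
    """Pick the next destination node to attach to the current tree.

    candidates: ordered sequence of destination nodes not yet on the tree
    r_min[u]  : minimum possible reception delay r_min(u, T)
    p[u]      : processing delay of node u
    algorithm : one of "SRDF", "LRDF", "SRPDF", "LRPDF"
    """
    if algorithm not in ("SRDF", "LRDF", "SRPDF", "LRPDF"):
        raise ValueError("unknown algorithm: " + algorithm)
    # Delay measure: reception delay, or reception-and-processing delay.
    use_processing = algorithm in ("SRPDF", "LRPDF")
    # Selection rule: smallest-first or largest-first.
    pick = min if algorithm.startswith("S") else max

    def measure(u):
        return r_min[u] + (p[u] if use_processing else 0.0)

    return pick(candidates, key=measure)
```

With reception delays of 5, 4, 9 and 8 and processing delays of 1, 3, 0.5 and 2 for four candidate nodes a, b, c and d, the sketch selects b under SRDF, a under SRPDF, c under LRDF and d under LRPDF, which shows how the processing delay can change the selected node under either rule.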

Acknowledgments

This research was supported by the National Science Council, Taiwan, under Grant NSC95-2221-E-007-048-MY3.


References

[1] S. Banerjee, B. Bhattacharjee, A Comparative Study of Application Layer Multicast Protocols.
[2] A. Bar-Noy, S. Kipnis, Designing broadcasting algorithms in the postal model for message passing systems, Math. Syst. Theor. 27 (5) (1994) 431–452.
[3] A. Bar-Noy, S. Guha, J. Naor, B. Schieber, Message multicasting in heterogeneous networks, SIAM J. Comput. 30 (2) (2001) 347–358.
[4] E. Brosh, A. Levin, Y. Shavitt, Approximation and heuristic algorithms for minimum-delay application layer multicast trees, IEEE/ACM Trans. Netw. 15 (2) (2007) 473–484.
[5] I. Cidon, I. Gopal, S. Kutten, New models and algorithms for future networks, IEEE Trans. Inf. Theor. 41 (3) (1995) 769–780.
[6] D.E. Culler, R.M. Karp, D.A. Patterson, A. Sahay, K.E. Schauser, E. Santos, R. Subramonian, T. von Eicken, LogP: towards a realistic model of parallel computation, in: Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993. Also appears as TR No. UCB/CS/92 713.
[7] E.W. Dijkstra, A note on two problems in connexion with graphs, Numer. Math. 1 (1959) 269–271.
[8] C. Diot, B.N. Levine, B. Lyles, H. Kassem, D. Balensiefen, Deployment issues for the IP multicast service and architecture, IEEE Netw. 14 (1) (2000) 78–88.
[9] B. Duysburgh, T. Lambrecht, F. De Turck, B. Dhoedt, P. Demeester, An active networking based service for media transcoding in multicast sessions, IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 34 (1) (2004) 19–31.
[10] M. Elkin, G. Kortsarz, A combinatorial logarithmic approximation algorithm for the directed telephone broadcast problem, in: Proc. ACM Symp. Theory of Computing (STOC), Montreal, QC, Canada, 2002, pp. 438–447.
[11] M. Elkin, G. Kortsarz, Sublogarithmic approximation algorithm for the undirected telephone broadcast problem: a path out of a jungle, in: Proc. ACM Symp. Discrete Algorithms (SODA), Baltimore, MD, 2003, pp. 76–85.
[12] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, W.H. Freeman and Company, 1979.
[13] M. Hosseini, D.T. Ahmed, S. Shirmohammadi, N.D. Georganas, A survey of application-layer multicast protocols, IEEE Commun. Surv. Tutorials 9 (3) (2007) 58–74.
[14] R. Karp, A. Sahay, E. Santos, K.E. Schauser, Optimal broadcast and summation in the LogP model, in: Proceedings of the Fifth Symposium on Parallel Algorithms and Architectures (SPAA), Association for Computing Machinery, New York, 1993.
[15] S. Kaune, K. Pussep, C. Leng, A. Kovacevic, G. Tyson, R. Steinmetz, Modeling the internet delay space based on geographical locations, in: Proceedings of the 17th IEEE Euromicro International Conference on Parallel, Distributed and Network-based Processing, February 2009, pp. 301–310.
[16] G. Kortsarz, D. Peleg, Approximation algorithm for minimum time broadcast, SIAM J. Discrete Math. 8 (1995) 401–427.
[17] R. Ravi, Rapid rumor ramification: approximating the minimum broadcasting time, in: Proc. 35th IEEE Symp. Foundations of Computer Science, 1994, pp. 202–213.
[18] D. Raz, Y. Shavitt, New models and algorithms for programmable networks, Comput. Netw. 38 (3) (2002) 311–326.
[19] L.H. Sahasrabuddhe, B. Mukherjee, Multicast routing algorithms and protocols: a tutorial, IEEE Netw. 14 (1) (2000) 90–102.
[20] S. Saroiu, P.K. Gummadi, S.D. Gribble, A measurement study of peer-to-peer file sharing systems, in: Proceedings of SPIE, Multimedia Computing and Networking, vol. 4673, December 2002, pp. 156–170.
[21] M. Shorfuzzaman, R. Eskicioglu, P. Graham, Video transcoding using network processors to support dynamically adaptive video multicast, in: Proceedings of 20th IEEE International Conference on Advanced Information Networking and Applications (AINA 2006), 2006, pp. 6–12.
[22] S. Sundaresan, W. De Donato, N. Feamster, R. Teixeira, S. Crawford, A. Pescapè, Measuring home broadband performance, Commun. ACM 55 (11) (2012) 100–109.
[23] C.K. Yeo, B.S. Lee, M.H. Er, A survey of application level multicast techniques, Comput. Commun. 27 (15) (2004) 1547–1568.
