Computer Communications 29 (2006) 1936–1947 www.elsevier.com/locate/comcom
Multicast-based inference for topology and network-internal loss performance from end-to-end measurements Hui Tian *, Hong Shen Department of Computing and Mathematics, Manchester Metropolitan University, Oxford Road, Manchester, M1 5GD, UK Graduate School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Ishikawa 923-1292, Japan Received 11 April 2005; received in revised form 9 November 2005; accepted 6 December 2005 Available online 5 January 2006
Abstract The use of multicast traffic as measurement probes is effective to infer network-internal characteristics. In this paper, we propose novel approaches to infer multicast network topology and link loss performance from end-to-end measurements. First, we present a new algorithm, binary hamming distance classification algorithm (BHC), that identifies multicast network topology based on the hamming distance of the sequences on receipt/loss of probe packets maintained at each pair of nodes. It is proved by analysis and simulation that BHC can infer the topology at a higher accuracy and efficiency than the previous algorithms with a finite number of probe packets. We also propose a new statistical approach to infer network-internal link loss performance based on the inferred topology. The inferred link loss rate is proved to be consistent with the real loss rate as the number of probe packets tends to infinity. Our new approach makes it possible to infer multicast network topology and loss performance simultaneously. We extend our algorithms for both multicast topology and loss performance inference in binary trees to general trees, and present a new method of loss rate-based scheme for general tree topology inference so that the inferred topology can correctly converge to the true topology which was difficult to achieve previously. 2005 Elsevier B.V. All rights reserved. Keywords: Multicast network; Topology inference; Hamming distance; Loss rate
1. Introduction For the efficient use of network resources, multicast has become one of the most popular forms of communication. Knowledge of multicast network topology can significantly facilitate resource management, and can be applied to build schemes for loss recovery and congestion control in the context of multicast sessions supporting heterogeneous receivers [11,12]. Accordingly, how to measure the network-internal performance accurately is another important factor in successful design, control and management of networks. End-to-end measurements-based multicast inference has been identified as a very efficient and effective approach to obtain important information on network-in-
*
Corresponding author. Tel.: +81 9020906628; fax: +81 761511360. E-mail address:
[email protected] (H. Tian).
0140-3664/$ - see front matter 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.comcom.2005.12.003
ternal characteristics such as topology and loss performance. Much research on multicast topology inference from end-to-end measurements can be found in [2,6–9,14]. The key idea underlying this approach is that receivers sharing common paths on the multicast tree associated with a given source will see correlations in their packet losses or delays. The multicast tree can thus be inferred based on the shared loss or delay statistics on transmitted probe packets. The main advantage of the approach lies in its applicability to infer multicast trees with no requirement of support from internal nodes. It permits, however, identification of a logical multicast topology rather than the actual physical topology. In previous work as [7,9], a long path with no branches would be identified as a single logical link. Thus, all single-child nodes are deleted in the inferred topology. It is obvious that in practice this may not be an appropriate inference of the actual topology because there may exist
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
many nodes with only one child in the actual topology. Moreover, the prevalent method to estimate correlation used in [2,7–9,15,16] for siblings identification may produce false results in some cases to be discussed in Section 3. In this paper, we propose the binary hamming distance classification (BHC) algorithm which infers the multicast topology based on end-to-end observations of packet losses and hop count by effective comparison of the hamming distances among all pairs of sequences on received probe packets maintained at individual nodes. It can not only correct false correlation existing in the previous algorithms, but also identify the nodes and links failed to be inferred by the previous algorithms. With priori knowledge of multicast network topology, a lot of work has been done on inferring other network-internal characteristic such as packet loss rates on individual links and link delay [3–6,10,13]. In this paper, we propose a novel and simple method to estimate the loss rates on all individual links in the multicast network using only statistical observations at receivers. Incorporating with the procedure of inferring topology by the BHC algorithm, we can estimate the link loss rate simultaneously while inferring the multicast topology. We also extend the approach for link loss rate inference to general trees, and further show that the result of loss rate inference can be used to infer multicast topology that correctly converges to the true topology for general trees which is often very complicated as discussed in [9]. The paper is organized as follows. In Section 2, the mathematical model of multicast network is introduced. Section 3 proposes the BHC algorithm and presents the analysis and simulation results. Section 4 presents the loss performance inference. The analysis and simulation on the loss performance inference algorithm are also given in Section 4. Section 5 extends both kinds of inference to the general tree case and proposes a loss rate-based scheme for topology inference. Section 6 concludes the paper. 2. Mathematical model of multicast network We begin with description of the mathematical model for the real multicast network as presented in [7,9]. In this model, the physical multicast tree is represented by a tree comprising actual network elements (the nodes) and communication links connecting them. Let T = (V, L) denote a multicast tree with node set V and link set L. The root node 0 is the source of probe packets, and R V denotes the set of leaf nodes representing the receivers. A link is said to be internal if neither of its endpoints is the root or a leaf node. Each non-leaf node k has a set of descendent nodes d (k) = {di (k) j 1 6 i 6 nk}, and each non-root node k has a parent node p (k). The link (p (k), k) 2 L is denoted by link k. Write j p k if j is descended from k, k = pr (j) if j is r-level descended from k, where r is a positive integer. Let a (U) denote the nearest common ancestor of a node set U V. Nodes in U are said to be siblings if they have the same parent, i.e., if p (k) = a (U), "k 2 U. The subtree of T
1937
Fig. 1. A multicast tree model.
rooted at k is denoted by T (k) = (V (k), L (k)), and the receiver set R (k) = R \ V (k). Fig. 1 shows an example of multicast tree model. For each link an independent Bernoulli loss model is assumed with each probe packet being successfully transmitted across link k with probability pk. Thus, the progress of each probe packet down the tree is described by an independent copy of a stochastic process X = (Xk)k2V as follows. X0 = 1, Xk = 1 if the probing packet reaches node k 2 V and 0 otherwise. If Xk = 0, Xj = 0, "j 2 d (k). Otherwise, P[Xj = 1jXk = 1] = pj and P[Xj = 0jXk = 1] = 1 pj = aj. Define p0 = 1. The pair (T, p) is called a loss tree. Let PT,p denote the distribution ðiÞ of X on the loss tree (T, p), X k denote the loss measurement of node k for the ith probe packet. For n probe packets, the 0–1 sequence maintained on the node k is ðnÞ denoted by fX k g; k 2 V . If 0 < pk < 1, "k 2 Vn{0}, the loss tree is said to be a canonical tree. Any tree (T, p) in non-canonical form can be reduced to a canonical tree [9]. Henceforth, only canonical loss trees are considered in this paper. 3. Multicast network topology inference In this section, we first consider topology inference for multicast network in binary tree form for simplicity. We begin with discussing the problem in the previous algorithm for multicast topology inference. We then introduce the hamming distance approach to overcome the existing problem and propose our Binary Hamming Distance Classification (BHC) algorithm. Due to its simplicity and efficiency, the hamming distance approach plays an important role on identifying siblings in the BHC algorithm. Thereafter, comparison of inference accuracy between hamming distance and estimation methods used previously is given. Our simulation result justifies the superiority of the BHC algorithm. 3.1. Existing approach for siblings classification In the previous work such as [7–9,15], network topology was inferred by identification of siblings from end-to-end measurements. Thus, identification of each siblings pair is the key for topology inference. Many previous approaches used notation A (i, j) to determine whether nodes i and j are
1938
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
siblings. We call these approaches A-approach for simplicity in this paper. When a probe packet is sent from the root of the multicast tree, several copies are generated at each router encountered. When a copy of probe packet i reaches node ðiÞ ðiÞ k, X k is set to 1. Otherwise, X k is set to 0, indicating that node k does not receive any copy of probe packet i. For ðiÞ those internal routers, X k is 1 if the ith probe packet reachðiÞ es it and 0 otherwise. In practice, X k of internal nodes is ðiÞ obtained by _l2RðkÞ X l because the internal node is said to receive a probe packet surely if any receiver descended from it receives the probe packets. Thus, each node obtains ðiÞ a ‘‘0–1’’ sequence fX k g, 0 < i < n, k 2 V. Because the paths from the root to the receivers are different, each sequence maintained by any node is, more or less, different from others. We can thus compare the correlation between the sequences so as to reconstruct the multicast network topology and infer the internal link characteristics. In A-approach, A (k) = Pjkpj is defined as the product of the probabilities of successful transmission on all the links between the root 0 to node k is widely used in topology inference from measured end-to-end loss. Let A (i, j) be the product of the probabilities of successful transmission on each link between the root 0 to the nearest common ancestor node of receivers i and j. Previous work showed that if A (i, j) is minimized, nodes i and j are identified as siblings. That is, nodes i and j are siblings if the following expression is minimized: Aði; jÞ ¼
P ½_l2RðiÞ X l ¼ 1P ½_l2RðjÞ X l ¼ 1 . P ½_l2RðiÞ X l ¼ _l2RðjÞ X l ¼ 1
ð1Þ
Because each probability in Eq. (1) is impossible to obtain in practice, A (i, j) is estimated by A(n) (i, j) as used in all previous work: Pn
ðnÞ
A ði; jÞ ¼ ðmÞ
ðmÞ Pn ðmÞ m¼1 X j m¼1 X i Pn ðmÞ ðmÞ n m¼1 X i X j ðmÞ
;
ð2Þ
where X k ¼ _l2RðkÞ X l . Each probability in Eq. (1) is estimated by the observation from packets, e.g., Pnn probe ðmÞ P[l2R(i)Xl = 1] is estimated by m¼1 X i =n. As n goes to infinity, A(n) (i, j) is consistent with A (i, j), i, j 2 V. However, with a finite number of probe packets, the estimation may cause obvious mistakes as shown in Fig. 2 when determining whether two nodes are siblings. According to the estimated value A(n) (i, j), if receiver j loses many probe packets, the value of A(n) (i, j) will be very
small, even smaller than the value between i and its actual sibling, e.g., A(7) (a, c) < A(7) (a, b) in Fig. 2. Thus, A-approach will misclassify nodes a and c as siblings. Even when the number of probe packet increases, such bias cannot be eliminated completely. Because the estimated probability in Eq. (2) does not equal to the true probability unless the number of probe packets is infinite. In order to reduce the bias with a finite number of probe packets, we propose our hamming distance classification approach. 3.2. Hamming distance classification approach The hamming distance between nodes u and v is defined as the number of different bits between their sequences, which is given in the following equation, where ‘‘¯’’ is the exclusive-OR operator. n X ðmÞ H d ðu; vÞ ¼ X ðmÞ ð3Þ u Xv . m¼1
For instance, the hamming distances of each pair of sequences in Fig. 2 are computed. Different hamming distances between different node pairs can be used to determine which pair of nodes are siblings. As illustrated in Fig. 2, Hd (a, c) > Hd (a, b) is congruent with the relationship between the nodes because nodes a and b are siblings. However, as we have discussed in Section 3.1, A-approach failed to identify the relationship of nodes a, b and c with seven probe packets in Fig. 2. The hamming distance approach distinguishes the siblings and non-siblings with different values successfully. However, A(n) (i,j) does not give the differences between the pair of sequences with 7 probe packets in Fig. 2. We will discuss the reason that our hamming distance approach is superior to A-approach in Section 3.4. In the multicast network, the nearer two nodes are located, the more similar two ‘‘0–1’’ sequences they maintained are. The reason for such phenomenon is that the probe packets from the root to the receivers may pass many common links. If the receivers are siblings, the paths the probe packets pass from the root to their parent nodes are the same. Therefore, the correlation of two ‘‘0–1’’ sequences between a node and its siblings is greater than that of all other pairs of sequences between the node and its non-sibling node. In order to infer the topology of the multicast network, all the siblings need to be identified correctly. That is, we should find out the similarity of each pair of sequences.
Fig. 2. Comparison of Hamming distance (Hd (Æ, Æ)) and the previous A-approach (A(n) (Æ, Æ)) for each pair node.
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
The problem of identifying siblings in the multicast network can be stated as marking out the similarity and dissimilarity of all pairs of bit sequences. For a bit sequence, hamming distance is the simplest and most efficient method to identify the similarity and dissimilarity among different sequences. Therefore, we use hamming distance to identify siblings for multicast network topology inference. It should be noted that our proposed approach generally works well regardless of temporal correlation between the losses in actual networks. No matter when we send the probe packets and how many probe packets we use to infer the network topology, the hamming distance approach can always classify siblings correctly. In contrast, the previous A-approach is affected strongly by the temporal correlation of collected data. Different data collected in different time periods may cause different siblings classifications, especially when a small number of probe packets is used. 3.3. Description of the BHC algorithm We present the BHC algorithm based on the hamming distance classification approach in this subsection. BHC infers the multicast topology from the node with the maximum hop count and proceeds in a bottom-up fashion. Since true siblings must have the same value of hop count, the set of all receivers is classified to different sets according to their values of hop count at the beginning. In the procedure of identifying siblings, only comparison between the sets with the same value of hop count is necessary, which in turn, avoids redundant comparisons between nodes with different values of hop count. For the nodes with the same value of hop count, a node pair is identified as siblings if the hamming distance is not only minimal among all hamming distances of all node pairs in the node set, but also less than a given threshold. Thus, by repeatedly identifying each sibling pair from the node set with the maximal hop count to the node set with minimal hop count, the multicast tree can be reconstructed. Fig. 3 shows the characteristics of the BHC algorithm and its main differences from other algorithms. The detailed BHC algorithm is described as follows. Same as in our previous work [15], we associate each node v with a hop count v. hop. The v.hop values of leaf nodes can be obtained by simply reading the TTL values of the probe packets. For internal nodes, their v.hop values can
1939
be computed by degression from the v.hop values of leaf nodes during the topology inference procedure. 1. Input: The set of receivers R, number of probe packðiÞ i¼1;...;n ets n, observed sequences at receivers ðX k Þk2R ; 2. R 0 : = R, V 0 : = ;, L 0 : = ;, h = maxk2R (k.hop), Wm = ;, (m = 1,. . .,h); //V 0 is the set of discovered nodes; L 0 is the set of discovered links, Wm is a set of nodes with hop count value m, m is initialized as the maximum value of hop count for all nodes in R.// 3. for k 2 R, do 4. Wk.hop: = Wk.hop¨{k} //Classify the receivers into different groups according to their hop count values.// 5. while m > 1 do 6. while Wm „ ; do 7. Let u be the first element in Wm; search for v 2 Wm to minimize Hd (u, v), (u „ v); 8. if Hd (u, v) > dm then S = {u}, Set r to be u’s virtual parent node; //Initially, sibling nodes set S := ;, dm is a given classification threshold of level m. If minimal hamming distance is still greater than dm, u does not have siblings.// 9. else S = {u, v}, Set r to be u and v’s virtual parent node; //u and v are siblings, r is denoted as their virtual parent node.// ðiÞ 10. for i = 1, . . ., n do X ðiÞ r :¼ _l2S X l ; 11. r.hop: = m 1; 12. V 0 : = V 0 ¨S; Wm: = Wm n S; Wm1: = Wm1¨{r}; // The discovered two siblings are added to the discovered node set V 0 and excluded from original set Wm, their parent node is added to the node set Wm1.// 13. for each l 2 S, L 0 : = L 0 ¨{(r,l)}; 14. S = ;; 15. m: = m 1; 16. V 0 : = V 0 ¨{0}; L 0 : = L 0 ¨{0,r}; //Include root node and the link from root node to his child node to the discovered node set and link set.// 17. Output: Inferred topology (V 0 , L 0 ). First, all the receivers are classified into different node sets Wm (1 6 m 6 h) according to their values of hop count. Inference begins with identifying siblings in the node set with the maximum value of hop count. The hamming distances of each node pair in Wm are calculated. The node pair is identified to be siblings if its hamming distance is minimal and less than the threshold. Remove the siblings from the node set with the hop count being m and add
Fig. 3. Differences between BHC and HBLT, BLT.
1940
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
the parent node into the node set with hop count reduced by 1. The ‘‘0–1’’ sequence of the parent node is obtained by ‘‘OR’’ operation of those of the siblings. When all nodes in Wm are grouped decrease hop count value by 1. Repeat the same procedure among the nodes in the node set Wm1. The algorithm ends when the hop count becomes 1. It should be noted that Levenshtein distance [1] may also be used for determining the similarity and dissimilarity between two strings. While Hamming distance is used for strings of the same length, Levenshtein distance is defined for strings of arbitrary lengths and hence more sophisticated. Levenshtein distance can be regarded as a generalization of the hamming distance. In our BHC algorithm, the bit sequences required to be compared are of the same length because n probe packets are sent and each node in the network has a record about whether each probe packet has been received or not – if the ith probe packet is received by a node, the corresponding digit in the bit sequence of this node is 1, otherwise it is 0. Clearly, hamming distance suffices for our requirement. We therefore choose to use hamming distance for simplicity and efficiency. 3.4. Comparison on inference accuracy of both approaches Our algorithm and the previous algorithms can get the consistent result with the real multicast network as the number of probe packets increases to infinity. With a finite number of probe packets, we will show that our BHC algorithm can obtain accurate results with a higher probability, because the hamming distance is found to be superior to A(n) (i, j) used in [7–9,15] in many cases. Definition 1. Let s1 and s2 be two nodes that have a common parent node i, s3 be a node for which node i is not its parent, Wk is the node set with the hop count k. Define s (i) as follows: sðiÞ ¼ fðs1 ; s2 ; s3 Þ : 8s1 ; s2 ; s3 2 W k ; 1 6 k 6 hg. Definition 2. For (s1, s2, s3) 2 s (i), let DH (s1, s2, s3) be the difference of hamming distance between non-siblings and siblings, and DA (s1, s2, s3) be the difference of A(n) (Æ, Æ) between non-siblings and siblings. That is, DH ðs1 ; s2 ; s3 Þ ¼ H d ðs1 ; s3 Þ H d ðs1 ; s2 Þ; DA ðs1 ; s2 ; s3 Þ ¼ AðnÞ ðs1 ; s3 Þ AðnÞ ðs1 ; s2 Þ. Lemma 1. The sufficient conditions for correctly identifying nodes s1 and s2 as siblings are 0 < minðs1 ;s2 ;s3 Þ2sðiÞ DH ðs1 ; s2 ; s3 Þ and H (s1, s2) < dm. For A-approach, the same condition must be satisfied by replacing DH (s1, s2, s3) with DA (s1, s2, s3), and H (s1, s2) with A(n) (s1, s2). Lemma 1 holds because the hamming distance or A(n) (Æ, Æ) of a node and its non-sibling nodes should be greater than that of it and its siblings. We denote by n1si the number of probe packets transmitted from the root to node si successfully, and by n1si sj the
number of probe packets transmitted successfully from the root to both nodes si and sj at the same time, i = 1, 2, 3. Lemma 2. For (s1, s2, s3) 2 s (i), if inequality (4) holds, the hamming distance approach can identify the siblings while A(n) (Æ, Æ) cannot; if inequality (5) holds, A(n) (Æ, Æ) can identify the siblings while the hamming distance approach cannot; in all other cases, both approaches can identify siblings correctly. n1s s 1 1 ðns2 n1s3 Þ < n1s1 s2 n1s1 s3 6 11 3 ðn1s2 n1s3 Þ; 2 n s3
ð4Þ
n1s1 s3 1 1 ðns2 n1s3 Þ < n1s1 s2 n1s1 s3 6 ðn1s2 n1s3 Þ; 1 2 n s3
ð5Þ
Proof. Since H d ðs1 ; s2 Þ ¼ n1s1 þ n1s2 2n1s1 s2 and AðnÞ ðs1 ; s2 Þ ¼
n1s n1s 1
2
nn1s
1 s2
, we have,
DH ðs1 ; s2 ; s3 Þ ¼n1s1 þ n1s3 2n1s1 s3 ðn1s1 þ n1s2 2n1s1 s2 Þ ¼ðn1s3 þ 2ðn1s1 s2 n1s1 s3 ÞÞ n1s2 ;
ð6Þ
and DA ðs1 ; s2 ; s3 Þ ¼ ¼
n1s1 n1s3 n1s1 n1s2 n n1s1 s3 n n1s1 s2 n1s1 n n1s1 s2 ððn1s3 þ
n1s3 ðn1s1 s2 n1s1 s3 ÞÞ n1s2 Þ. n1s1 s3
ð7Þ
From Lemma 1, we know that nodes s1 and s2 will be identified as siblings if DH (s1, s2, s3) < 0 using the hamming distance approach for any (s1, s2, s3) 2 s (i). If (s1, s2, s3) results in DH (s1, s2, s3) < 0, the hamming distance approach will fail to identify s1 and s2 as siblings correctly. Similar conditions holds for A(n) (Æ, Æ) approach. Therefore, we can conclude that if any (s1, s2, s3) 2 s (i) results in DA (s1, s2, s3) < 0 while DH (s1, s2, s3) > 0, the hamming distance approach is superior to the A(n) (Æ, Æ) approach in siblings identification, and vice versa. Lemma 2 describes the complete conditions based on Eqs. (6) and (7). h Lemma 2 shows when the hamming distance approach or A-approach can identify nodes s1 and s2 as siblings correctly. If the probability that inequality (4) holds is greater than the probability that inequality (5) holds, the hamming distance approach works better than previous A-approach. However, obtaining the exact probabilities for inequality (4) and (5) to hold is very difficult because both probabilities vary with different network connections and conditions. Thus, we only analyze the cases that inequality (4) holds and inequality (5) holds, respectively, and give an example through which we can have a clear view on which approach has a better performance in inference accuracy. First, we should note that that in most cases, both the hamming distance approach and A-approach can identify siblings correctly according to our analysis on the different
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
receiving cases of nodes s1, s2 and s3. When links s1 and s2 are in similar conditions, the hamming distance approach can identify siblings correctly if A-approach can do so. However, we also find that sometimes A-approach cannot identify siblings correctly while the hamming distance approach can. When both nodes s1 and s2 receive almost all probe packets while node s3 loses many probe packets, and n1s1 s3 happens to be equal to n1s3 , the hamming distance approach can identify siblings correctly while A-approach cannot. The above is also true when both nodes s1 and s2 lose many probe packets while s3 receives almost all probe n1 packets, and ns11s3 < 12. In very few cases, the hamming diss tance approach3 may not identify siblings correctly while A-approach can. This might happen only when links s1 and s2 are in different conditions which result in dissimilar sequences on nodes s1 and s2. In a multicast network, only a few siblings links may exhibit completely different performances among all siblings-link pairs. Even if under this condition, it can be noted that in most cases the hamming distance approach can identify siblings correctly as A-approach does. Thus, from all cases discussed, we can see that the occurrence of the hamming distance approach outperforming A-approach is more frequent than that of the opposite situation. As to the exact probabilities that inequalities (4) and (5) hold, we suppose a multicast network with 5–27% links losing packets severely where the ratios can represent most cases in the real multicast networks. Less than 5% links suffering severe packet losses only occurs in some applications. Multicast networks with more than 10% links losing packets often result in unacceptable performance because most group receivers can be affected not to receive packets accurately. Therefore, the multicast network with 5–27% links losing packets severely as shown in Fig. 4 choose reasonable ratios of ill-performing links to represent general cases in practice. Different cases are discussed in this network and the probabilities of inequality (4) and (5) holding in each case are calculated, respectively. We find that the probabilities vary with different link status. As we have discussed before, the probabilities also vary with different network topologies. Fig. 4 illustrates clearly that the probability that the hamming distance approach
1941
succeeds but A-approach fails is greater than the probability in the opposite situation for this network. This conclusion may be extended to any network in general case. The simulation in the network given in Section 3.5 also supports this conclusion. Though the probabilities that inequalities (4) and (5) hold are different as the network condition varies, we have seen from the above analysis that with a finite number probe packets, the hamming distance approach is more likely to work out the accurate topology than A-approach. In another word, due to the greater probability of the hamming distance approach outperforming A-approach in siblings identification, we may conclude that the use of the hamming distance approach in the BHC algorithm can result in a better performance than use of A-approach in inference accuracy in our tested networks. 3.5. Simulation on the BHC algorithm In this section, we validate the BHC algorithm by comparing it with HBLT and BLT. As given in [9], BLT provides a good combination of inference accuracy and computational efficiency comparing among many other topology inference algorithms in [7–9]. HBLT proposed in [15,16] applies A-approach as BLT, however, it incorporates hop count into topology inference. So comparison among these three algorithms can sufficiently justify the superiority of the BHC algorithm. We implement all the algorithms in the network topology shown in Fig. 5 by network simulator ns. Node 0 is the sender, nodes 1–10 are receivers. All internal links are configured with different capacities ranging from 0.1 to 5 Mbit/ s. The link 0 fi 0 0 is set at 5 Mbit/s. The root node 0 generates probe packets in a 20 K–2 Mbit/s stream. Every probe packet comprises one UDP packet with 1000bytes. Due to the low capacity and heavy traffic load of links, multicast probe packets can be delayed or lost. Then for each receiver, the collected data is set to 1 if the probe packet is received, otherwise 0. Both algorithms work on the collected ‘‘0–1’’ sequences. From TTL field of the probe packets obtained from receivers, hop count required by BHC and HBLT can easily be obtained.
Fig. 4. Comparison on probabilities that inequality (4) or (5) holds in a certain multicast network.
1942
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
Fig. 5. A multicast network topology.
To describe how close the inferred topology is to the original physical network, we define similarity degree function as follows: Definition 3. Define similarity degree to be a Æ s + b Æ p, where s denotes the ratio of the number of nodes whose siblings are identified correctly to the total number of nodes, p denotes the ratio of the number of nodes whose parent nodes are inferred accurately to the total number of nodes. a, b are the weights of these two factors. When all the inferred descendants of a node are similar to the real descendant of the node, we say that the node is inferred correctly. Let a = 0.5 and b = 0.5, then the compared result is shown in Fig. 6. Figs. 6(a) and (b) are obtained by changing different links’ capacities. Both simulation results show that BHC requires fewer probe packets than HBLT and BLT to infer the accurate topology constantly. BLT shows the lowest
4. Link-level loss performance inference When we infer the multicast network topology by the BHC algorithm, a ‘‘0–1’’ sequence is obtained for each node in the network including those internal nodes. In this section we propose an approach to infer the loss rate for each link simultaneously when inferring multicast topology, taking advantage of these ‘‘0–1’’ sequences. 4.1. Approach to loss performance inference Multicast networks in the binary tree form are firstly considered. According to the BHC algorithm, we know the sequence of internal nodes can be deduced by the following equation. ðnÞ
ðnÞ
X i ¼ _l2RðiÞ X l .
ð8Þ
b 1
1
0.8
0.8
Similarity degree
Similarity degree
a
efficiency in topology inference. Since both HBLT and BLT applies A-approach, we can conclude that hamming distance classification approach used in BHC helps to infer the topology more efficiently and accurately. Figs. 6(a) and (b) also validate our analysis on siblings identification by the hamming distance approach and Aapproach. It shows that the occurrence of that the hamming distance approach outperforming A-approach is more frequent than that of the opposite situation. This supports the conclusion we have drawn in Section 3.4, that is, the probability that hamming distance approach outperforming A-approach should be greater than the probability of the opposite situation. This also indicates why BHC works out better performance than HBLT and BLT in our simulated network. In the followed simulation, we compare the performance of the BHC algorithm with previous algorithms in a larger network than above as show in Fig. 7(a). From Fig. 7(b), it is clear that BHC can infer the correct topology with the least number of probe packets.
0.6
0.4
0.6
0.4
BHC
BHC 0.2
HBLT
0.2
HBLT
BLT 0 1 10
2
10
3
10
Number of probe packets
BLT 4
10
0 1 10
2
10
3
10
Number of probe packets
Fig. 6. Similarity degree comparison among BLT, HBLT and BHC when a = 0.5, b = 0.5.
4
10
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
1943
Fig. 7. Similarity degree comparison among BLT, HBLT and BHC when a = 0.5, b = 0.5.
In the sequence of node i, a bit of 0 means the correspondent probe packet is lost on the path from the root to node i. Comparing the sequence of node i with that of its siblings j, if node j receives a probe packet, and i has not received it, we consider the probe packet is lost on link i. Thus, it can be determined that those bits of 0’s in the sequence of node i are caused by loss in link i if the correspondent bits in the sequence of node j is 1. Among those bits whose correspondent values of both node i and j are 0 (denote the number of these bits by n0ij ), we know that some bits are resulted by packets lost in the path from the root to their parent node (denote this number by n0ijp ), other bits are resulted by packets lost in both the links directly connecting the siblings i and j at the same time. We use c to denote the ratio of n0ijp to n0ij : c ¼
n0ijp n0ij
.
The probability of a probe packet transmitted through link i successfully can be estimated as the ratio of the number of accepted probe packets at node i to the number of accepted probe packets at its parent node. As mentioned previously, the number of probe packets reaching the parent node of siblings can be expressed as n1i þ n1j n1ij þ c n0ij , where n1i þ n1j n1ij denote the number of 1’s in ‘‘0–1’’ sequence of the parent node. Then ^pi can be expressed as the following equations. n1i ^ pi ¼ 1 ni þ n1j n1ij þ c n0ij ," n n n n X X X X ðkÞ ðkÞ ðkÞ ðkÞ ðkÞ Xi Xi þ Xj Xi ^ Xj ¼ k¼1
k¼1 n X
k¼1 ðkÞ
Let ps be the probability of transmitting probe packets successfully from the root to the parent node of node pair i and j as shown in Fig. 8. First, we give the following lemma to estimate ps on which calculation of the loss rate on link i is based. Lemma 3. ps can be estimated by the following formula ^ps ¼
ðnÞ
ð9Þ Because,
lim n1i =n ¼ ps pi ;
n!1
lim n1j =n ¼ ps pj ;
where and denote the numbers of probe packets transmitted successfully trough link i and through both links i and j, respectively; n0i and n0ij denote the numbers of lost probe packets on link i and on both links i and j, respectively.
n!1
Xi _ Xj
;
k¼1
n1i
n1ij
Pn ðmÞ ðiÞ m¼1 X ðmÞ ðjÞ m¼1 X Pn n m¼1 X ðmÞ ðiÞ X ðmÞ ðjÞ
Pn
A ði; jÞ ¼
n!1
n
ð11Þ
Proof. We knew that A (i, j) can be estimated by A(n) (i, j):
ð10Þ
þ c
n1i n1j . n n1ij
k¼1
!#
ðkÞ
Fig. 8. A transmission model.
lim n1ij =n ¼ ps pi pj ;
we have, lim AðnÞ ði; jÞ ¼ lim
n!1
n!1
n1i n1j ¼ ps . n n1ij
¼
n1i n1j . n n1ij
ð12Þ
1944
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
Therefore, with a finite number of probe packets we can n1i n1j to estimate ps . h use nn 1 ij
Lemma 4. If node i and j are siblings in the real multicast network, the loss rate on link i and j denoted by ^ ai and ^aj , respectively, can be estimated on both of their ‘‘0–1’’ sequences as follows: Pn ðkÞ ðkÞ Xi ^ Xj ^i ¼ 1 p^i ¼ 1 k¼1 a ; ð13Þ Pn ðkÞ k¼1 X j Pn ðkÞ ðkÞ Xi ^ Xj ^j ¼ 1 p^j ¼ 1 k¼1 . ð14Þ a Pn ðkÞ k¼1 X i Proof. Since nodes i and j are siblings in the real multicast network, the number of lost probe packets observed at i and j at the same time composes of those lost on the link from the root to i and j’s parent node and those lost on both links i and j. n0ij ¼ nð1 ps Þ þ nps ð1 pi Þð1 pj Þ. We denote c n0ij in Eq. (9) by C and equate it to the second factor of the above expression: C ¼ nps ð1 pi Þð1 pj Þ ¼ n0ij nð1 ps Þ. Since n0ij by its physical meaning can also be obtained by the following equation. n0ij ¼ n ðn1i þ n1j n1ij Þ; and ps can be estimated by ðn1i n1j Þ=ðn n1ij Þ according to Lemma 1, we can easily derive the following formula for C. C¼
n1i n1j ðn1i þ n1j n1ij Þ. n1ij
ð15Þ
Thus, replacing C in Eq. (9) with Eq. (15), we get the following equation to estimate pi. n1ij n1i ¼ p^i ¼ 1 ¼ ni þ n1j n1ij þ C n1j
Pn
ðkÞ ðkÞ k¼1 X i ^ X j Pn ðkÞ k¼1 X j
.
Likewise, pj can be estimated as follows: Pn ðkÞ ðkÞ n1ij Xi ^ Xj p^j ¼ 1 ¼ k¼1 . Pn ðkÞ ni k¼1 X i Thus, loss rates of the link i and j can be determined by Eqs. (13) and (14). h When n increases to infinity, lim n1ij =n ¼ ps pi pj ;
n!1
lim n1j =n ¼ ps pj .
n!1
Clearly, limn!1 p^i ¼ pi . Therefore, it can be concluded that the loss rate estimated by Lemma 4 is consistent with the real loss rate for each link as the number of probe packets goes to infinity.
Fig. 9. Evaluation on the inferred loss rates.
4.2. Algorithm on loss inference Since we have deduced the formula on the loss rate of a link with prior knowledge of the sequence maintained by the node, we can infer loss performance of all links in a multicast network. In the procedure of topology inference given in Section 3.2, the ‘‘0–1’’ sequence for each internal node is obtained. So it becomes possible to infer the link loss rate with the help of our deduced result. We therefore extend the BHC algorithm to infer topology and all links’ loss lates in the multicast network simultaneously. The extension can be done by modifying Step 8 and 9 in the BHC algorithm as follows: 8. if Hd (u, v) > dm then S = {u}, set r to be the virtual parent node of u,au = 0;//au is the loss rate of link u.// 9. else {S = {u, v}, set r to be the virtual parent node of u and v; Pn 9(a). P for i = 1 Pto nCalculate n1u ¼ i¼1 X ðiÞ u , n n ðiÞ ðiÞ 1 n1v ¼ i¼1 X ðiÞ , n ¼ X X ; i¼1 u uv v v n1 9(b). Calculate loss rates of link u and v, ^au ¼ 1 nuv1 , v n1uv ^av ¼ 1 n1 ;}Change Step 17 in BHC to u 0 0 17. Output: Inferred topology and loss rate ðV ; L ; ^ aÞ In order to evaluate the performance of the algorithm for link loss rate inference, we compare the inferred loss rates with the real statistical loss rates as follows. The simulation has been done in the same network as depicted in Fig. 5. We set link 7 to lose packets with a probability of 0.2, links 9 and 10 with probability of 0.5. From the simulation results shown in Fig. 9, it can be easily seen that the inferred loss rates approach to the real loss rates as the number of probe packets increases. 5. Topology and loss performance inference for general trees The general trees are considered in this section. We extend the BHC algorithm for topology inference and the approach for link loss rate inference to general trees.
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
1945
5.1. Topology inference for general trees
5.2. Loss performance inference for general trees
It is more complicated to infer the topology for general trees than the binary case. We introduce a threshold into grouping the siblings set S. The set S is grouped if the hamming distance between any pair of nodes in S is sufficiently close to being minimal. The grouping step starts by finding a pair of nodes {u, v} that has the minimum hamming distance in S, then adjoining further elements to it provided the following inequality is satisfied:
Based on the priori knowledge of the inferred topology and ‘‘0–1’’ sequences for each node discussed above, we extend the approach in Section 4.1 to infer loss performance for general trees. Similar to Lemma 4, the link loss rate of general trees can be estimated by the following Lemma.
H d ðu; v0 Þð1 Þ < H d ðu; vÞ.
ð16Þ
Thus, we replace line 9 of the BHC algorithm by the following steps so that topology inference for general trees can be performed.
Lemma 5. The loss rate of link s1 can be estimated by ^ as1 . ^as1 ¼ 1 ^ps1 ¼ 1
n1s1 s2 s3 sm ; n1s2 s3 sm
ð17Þ
where s1, s2, . . . , sm are siblings. Proof. As n increases to infinity, lim n1s1 s2 s3 =n ¼ ps ps1 ps2 ps3 ;
n!1
9a. else {S = {u, v}; 9b. while there exists v 0 2 WmnS such that Hd (u, v 0 )(1 ) < Hd (u, v) do 9c. S: = S [ {v 0 };} 9d. Set r to be the virtual parent node of all identified siblings in S. We also use classification threshold dm for identification of siblings in general trees in a similar way as used in the algorithm of Section 3.3. We can set dm to a given experience value for a network with binary tree topology due to the structure simplicity. However, for general trees, we set dm to be nk lg m in order to reduce the ratio of misclassification of siblings. Here m is the level of nodes computed from the root, n is the total number of nodes in the multicast network, and k is the estimated expected number of branches of the multicast network. We set dm to be nk lg m because we want dm to be linearly proportional to the number of nodes and to the logarithm of the level, and inversely proportional to the branches of each level in the multicast tree. The value may also need to be adjusted according to the real condition of the network. As presented in [9], the violation of the condition described in inequality (16) has the interpretation that the ancestor a (U) is separated from a ({u, v}) by a link with loss rate at least . The convergence of the inferred topology to the true topology is mainly influenced by . If only is less than the internal link loss rates, the inferred topology will be convergent to the true topology. However, the internal link loss rates are unknown in advance. A small value of is more likely to satisfy the above condition but at the cost of slow convergence. A large value of , on the other hand, is more likely to result in systematically removing links with small loss rates. Thus, it is convergent to a wrong topology. In order to choose an appropriate to obtain more accurate and complete topology practically, we will propose the scheme of incorporating the link loss rate into topology inference in Section 5.3 based on the loss performance inference for general trees.
lim n1s2 s3 =n ¼ ps ps2 ps3 .
n!1
Thus, limn!1
n1s
1 s2 s3
n1s
2 s3
¼ limn!1 ^pi ¼ pi .
h
As the number of probe packets goes to infinity, the loss rate estimated by ^pi is consistent with the real loss rate for each link in general trees. We have done the simulation for topology inference and loss rates inference in general trees with sizes ranging from 20 nodes to 200 nodes. Fig. 10 gives the average results on these general trees. 5.3. Loss rate-based topology inference As discussed in Section 5.1, parameter is used for grouping siblings in general trees, which requires to be less than all internal link loss rates such that the inferred topology can be correctly convergent to the true topology. However, the internal link loss rates are unknown in advance, which may result in wrongly inferred topologies containing only high loss rate links as discussed in [9] about failure inference. Thus, with the help of the approach to loss rate inference proposed in section 5.2, a scheme is built to adjust timely based on the inferred loss rates. In the procedure of grouping siblings, the loss rates of links are computed. Errors in grouping may exist due to an inappropriate choice of . So when the minimum loss rate among all links is less than , we change the value of to be the minimum loss rate, and group the siblings again according to the modified . The procedure is described as follows. 1. 2. 3. 4. 5.
Initiate , ^amin . While ^amin < do ¼ ^amin Group siblings according to inequality (16). Compute the loss rates based on inferred topology, denote the minimized loss rate as ^amin .
1946
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
Fig. 10. Simulation on a general tree network. (a) Topology inference by HBLT and BHC when a = 0.5, b = 0.5 (b) Comparison between inferred loss rates and real link loss rates of are 0.3 and 0.5, respectively.
In the above procedure, we initially set to be smaller than ^ amin that has a value between 0.5 and 1. So some pseudo siblings are grouped firstly, which will result in large loss rates, whose minimum value is less than . Then all the nodes will be grouped again with a smaller , some pseudo siblings are removed in this case, which consequently diminishes the minimum loss rate inferred in the newly grouped tree. The nodes are grouped again due to ^ amin < . The procedure is repeated until is adjusted to an appropriate value that satisfies the condition that is smaller than the loss rates of all links in the network. Thus, the inferred topology can quickly converge to the true topology with the appropriate . The following Lemma gives the relationship between and the link loss rate. Lemma 6. A small can remove some pseudo siblings, and result in decreased loss rates of the links. Proof. Assume that s1 is identified to have m siblings including itself and some possible pseudo siblings. Then, let ps1 ðmÞ denote the probability of successfully transmission on link s1 with the hypothetical m siblings. ps1 ðmÞ ¼
n1s1 s2 s3 sm . n1s2 s3 sm
If a smaller causes a node to be removed from its siblings set according to inequality (16), ps1 ðm 1Þ ¼
n1s1 s2 s3 sm1 þ x ; n1s2 s3 sm1 þ y
where x and y are integers, and x < y due to the removal of the node whose ‘‘0–1’’ sequence is much different from those of other nodes. The minimum hamming distance among all pairs of sequences maintained by the removed node and other nodes is far from those pairs among other siblings, which leads x to be less than y, and both are greater than 0. Thus, ps1 ðm 1Þ < ps1 ðmÞ.
So a smaller leads the adjusting procedure to remove some pseudo siblings and thus results in decreased loss rates of the links. h By adjusting the loss rate-based topology inference can thus be made more accurate than those inferred in [9,17].
6. Conclusion An improved algorithm for multicast network topology inference, binary hamming distance classification algorithm (BHC), has been proposed in this paper. With a finite number of probe packets, the topology inferred by BHC has been shown to be more accurate than those inferred by the previous algorithms. It has also been shown that BHC significantly outperforms the previous algorithms in efficiency according to the implementation results of the BHC, HBLT and BLT algorithms in our simulated networks. Since all algorithms discussed in this paper are based on end-to-end loss measurement, they may not work very efficiently when inferring topology for a multicast network with few losses. In this case, same as other work in the literature [7,9], only when the number of probe packets is very large, the inferred topology may approach to the accurate one. To avoid use of an excessive number of probe packets for topology inference, we hope to extend our approach to inference based on other types of measurements such as delay measurement in the future, because for multicast networks with few losses, delay-based topology inference would provide a more efficient method. Based on the data collected for topology inference in this paper, we have also proposed a novel approach to inferring the network-internal loss performance, making it possible for the first time to infer multicast topology and link loss rate simultaneously. The loss performance inferred by our approach has been shown to be consistent with the real loss performance in the multicast network.
H. Tian, H. Shen / Computer Communications 29 (2006) 1936–1947
Both approaches have been extended to the general trees. By incorporating approaches for loss performance inference based on topology inference, we have proposed a technique that can improve the inferred topology to one that is correctly convergent to the true topology. Acknowledgments This work was supported by the 21st Century COE Program of Ministry of Education, Culture, Sports, Science and Technology, and by Japan Society for the Promotion of Science (JSPS) Grant-in-Aid for Scientific Research under its General Research Scheme (B) Grant No. 14380139. References [1] M.J. Atallah (Ed.), Algorithms and Theory of Computation Handbook, CRC Press LLC, Boca Raton, FL, 1999, pp. 14–35. [2] R. Caceres, N.G. Duffield, J. Horowitz, F.L. Presti, D. Towsley, Loss based inference of multicast network topology, in: Proceedings of IEEE Conference on Decision and Control, Phoenix, AZ, 1999. [3] R. Caceres, N.G. Duffield, J. Horowitz, D. Towsley, Multicast-based inference of network-internal loss characteristics, IEEE Trans. Inform. Theory 45 (7) (1999) 2462–2480. [4] R. Careres, N.G. Duffield, H. Horowitz, D. Towsley, Statistical inference for internal link parameters in a network, in: Proceedings of the 1998 Annual Meeting of the American Statistical Association Annual Meeting. [5] R. Careres, N.G. Duffield, H. Horowitz, D. Towsley, T. Bu, Multicast-based inference of network internal loss characteristics: accuracy of packet loss estimation, in: Proceedings of INFOCOM, New York, 1999, pp. 371–379. [6] R. Castro, M. Coates, R. Nowak, Maximum likelihood identification from end-to-end measurements, Proceedings of the DIMACS Workshop on Internet Measurement, Mapping and Modeling, DIMACS Center, Rutgers University, Piscataway, NJ, 2002. [7] N.G. Duffield, J. Horowitz, F.L. Presti, D. Towsley, Multicast topology inference from end-to-end measurements, Advances in Performance Analysis 3 (2000) 207–226. [8] N. Duffield, J. Horowitz, F.L. Presti, Adaptive multicast topology inference, in: Proceedings of IEEE INFOCOM, Anchorage, Alaska, USA, April 2001, pp. 1636–1645. [9] N.G. Duffield, J. Horowitz, F.L. Presti, D. Towsley, Multicast topology inference from measured end-to-end loss, IEEE Trans. Inform. Theory 48 (1) (2002) 26–45. [10] N.G. Duffield, F.F. Presti, Multicast inference of packet delay variance at interior network links, in: Proceedings of IEEE INFOCOM, Tel Aviv, Israel, 2000, pp. 1351–1360. [11] B.N. Levine, S. Paul, J.J. Garcia-Luna-Aceves, Organizing multicast receivers deterministically according to packet-loss correlation, in: Proceedings of ACM Multimedia’98, Bristol, UK, 1998, pp. 201–210.
1947
[12] D. Li, D.R. Cheriton, Oters (on-tree efficient recovery using subcasting): a reliable multicast protocol, in: Proceedings of the 6th IEEE International Conference on Network Protocols (ICNP’98), Austin, TX, USA, 1998, pp. 237–245. [13] F.L. Presti, N.G. Duffield, J. Horowitz, D. Towsley, Multicast-based inference of network-internal delay distributions, IEEE/ACM Trans. Network. 10 (6) (2002) 761–775. [14] S. Ratanasamy, S. McCanne, Inference of multicast routing tree topologies and bottleneck bandwidths using end-to-end measurements, in: Proceedings of the IEEE INFOCOM, New York, USA, 1999, pp. 353–360. [15] H. Tian, H. Shen, An improved algorithm of multicast topology inference from end-to-end measurements, in: Proceedings of the 5th International Symposium on High Performance Computing ISHPCV, Tokyo, Japan, Oct. 2003, pp. 376–384. [16] H. Tian, H. Shen, Analysis on binary loss tree classification with hop count for multicast topology discovery, in: Proceedings of IEEE CCNC’04, Las Vegas, USA, 2004. [17] H. Tian, H. Shen, Hamming distance and hop count based multicast network topology inference, in: Proceedings of IEEE AINA’2005, Taiwan, China, 2005, pp. 267–272.
Hui Tian is currently a Lecturer in Manchester Metropolitan University, UK. She received B.Eng. and M.Eng. degrees in telecommunications from Xidian University, China, and has completed her Ph.D. work in Computer Science in Japan Advanced Institute of Science and Technology. Her research interest includes network performance evaluation, telecommunications and wireless sensor networks.
Hong Shen received his B.Eng. degree from Beijing University of Science and Technology, M.Eng. degree from University of Science and Technology of China, Ph.Lic. and Ph.D. degrees from Abo Akademi University, Finland, all in Computer Science. He is currently a professor in Japan Advanced Institute of Science and Technology and also Professor of Computer Science in Manchester Metropolitan University. Previously he was Professor of Computer Science at Griffith University, Australia. His main research interests lie in parallel and distributed computing, algorithms, high performance networks, data mining and multimedia systems. He has published over 200 papers, with more than 80 papers in international journals. He has served on editorial boards of 7 international journals, and chaired several international conferences.