Edge-based traffic engineering for OSPF networks

Computer Networks 48 (2005) 605–625 www.elsevier.com/locate/comnet Edge-based traﬃc engineering for OSPF networks q Jun Wang *, Yaling Yang, Li Xia...


Computer Networks 48 (2005) 605–625 www.elsevier.com/locate/comnet

Edge-based traﬃc engineering for OSPF networks

q

Jun Wang *, Yaling Yang, Li Xiao, Klara Nahrstedt Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801-2302, United States Received 22 February 2004; received in revised form 12 October 2004; accepted 1 November 2004 Available online 22 December 2004 Responsible Editor: I. Matta

Abstract

This paper proposes and evaluates a novel, edge-based approach, which we call the k-set Traffic Engineering (TE) method, to perform traffic engineering in OSPF networks by partitioning traffic into k uneven traffic sets. The traffic partitioning and splitting take place only at network edges, leaving the core simple. We prove that if k is large enough, the k-set TE method achieves the general optimum of traffic engineering, which otherwise requires full-mesh overlaying and arbitrary traffic splitting, as in MPLS. We give an upper bound on the smallest k that achieves this general optimum. In addition, we provide a constant worst-case performance bound for the case where k is smaller than the optimal k. Finding the optimal traffic splitting and routing for a given k is NP-hard. Therefore, we present a heuristic algorithm to handle the problem. The performance of the k-set TE method together with the proposed heuristic algorithm is evaluated by simulation. The results confirm that a fairly small k (2 or 4) can achieve near-optimal traffic engineering. Overall, the k-set TE method provides a simple and efficient solution for load balancing in OSPF networks. It follows the "smart edge, simple core" design rule of the Internet. It is also able to keep "the same path for the same flow," which is desirable and beneficial to TCP applications. © 2004 Elsevier B.V. All rights reserved.

Keywords: Traffic engineering; OSPF; Edge-based; Traffic set; Mathematical programming/optimization

q This work was supported by NSF Grant under contract number NSF ANI 00-73802 and NSF CISE Grant under contract number NSF EIA 99-72884. Any opinions, ﬁndings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reﬂect the views of the National Science Foundation. * Corresponding author. E-mail addresses: [email protected] (J. Wang), [email protected] (Y. Yang), [email protected] (L. Xiao), [email protected] (K. Nahrstedt).

1. Introduction

Traffic engineering is essential for today's Internet Service Providers (ISPs) because of the rapid growth of the network and the increasing demands coming from end users and new applications. The major task of traffic engineering is to find appropriate routing and traffic allocation schemes

1389-1286/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.comnet.2004.11.008


J. Wang et al. / Computer Networks 48 (2005) 605–625

for given physical networks and user traffic demands, so that the traffic load is balanced and the overall network performance is optimized. One way to enable traffic engineering is to deploy new flow-based, connection-oriented protocols, such as the Multi-Protocol Label Switching (MPLS) protocol, where traffic engineering is easy to implement. However, destination-based hop-by-hop routing protocols, such as the Open Shortest Path First (OSPF) protocol [1], are still the most commonly used intra-domain routing protocols in today's Internet. On one hand, OSPF is simple, robust, and highly scalable. On the other hand, it is believed that OSPF may lead to congestion, and hence to poor performance, when no traffic engineering is applied. Therefore, traffic engineering in OSPF networks is extremely important and meaningful, and a good traffic engineering solution on top of OSPF can both improve network performance and leverage the widespread deployment of OSPF.

Some of the latest research results show that the Equal Cost Multi-Path (ECMP) feature¹ in OSPF can help to achieve the traffic engineering purpose. Ideally, if we assume that traffic can be split arbitrarily, then by properly setting link weights [2], we can achieve load balancing in OSPF that is comparable to that of MPLS networks [3]. However, in reality, ECMP only supports even traffic splitting, which is not enough to approximate the general optimal result as in MPLS. Therefore, Sridharan et al. proposed an enhanced method in [4], where only a subset of the next hops of the equal cost paths between a source and a destination is used for distributing traffic. By carefully choosing the next hop subsets, this method mimics uneven traffic splitting and better approximates the general optimal result. Some other methods, such as in [5], try to achieve better traffic engineering by combining the OSPF ECMP and MPLS techniques. We will summarize and compare the existing traffic engineering approaches in detail in Section 2.

¹ The ECMP feature allows traffic to be distributed equally among multiple next hops of the equal cost paths between a source and a destination [1].

All existing approaches for OSPF traffic engineering are based on ECMP and link weight manipulation. Besides the even splitting constraint, another significant disadvantage of ECMP is that the packets from a source to a destination may no longer travel along the same path, even if they belong to a single TCP flow. Since multiple paths may have very different delays and jitters (to obtain multiple equal cost paths and thus trigger ECMP, we may not be able to take delays into account when we set link weights), TCP flows may suffer from bad performance due to out-of-order packet delivery. Furthermore, some routers in a network may not turn on the ECMP feature. Therefore, a big challenge is to achieve better traffic engineering in OSPF networks without the use of ECMP. We address this challenge in this paper by pushing traffic engineering decisions to the network edge.

Our approach is inspired by the Differentiated Service (DiffServ) scheme [6,7]. DiffServ was originally proposed as a cost-effective and scalable solution to provide better end-to-end QoS to applications. The basic idea of DiffServ fits the "smart edge, simple core" design rule of the Internet very well; i.e., push complex operations (such as packet classification, profiling, advanced weighted scheduling, etc.) to the network edge and keep the core simple. On one hand, because edge nodes have detailed flow information, they can perform finer-grained control on packets and provide partitioning along the flow boundaries. On the other hand, keeping the network core simple lets the system scale well. Unlike DiffServ, whose original goal is QoS, our goal is traffic load balancing. Our approach is called the k-set edge-based traffic engineering method (k-set TE method or simply k-set method for short). We intentionally use the term "traffic set" instead of "traffic class" to differentiate our traffic partition from the service class concept in the QoS domain.

In this paper, we define the traffic set as the traffic fraction that is split at the network edge for traffic engineering and load balancing purposes. As we will see later, the packets in one traffic set, flowing from the source to the destination, follow one simple path, and no further splitting is allowed along the path. This means that our


approach partitions traffic into k uneven traffic sets according to certain ratios² and routes these k traffic sets in proper ways to balance the traffic load in the entire network. Since the traffic splitting is pushed to the edge where flow information is available, we can perform uneven traffic splitting to get better near-optimal results, while strictly following the "simple core" rule of the Internet. More specifically, in order to implement the route differentiation and traffic splitting among multiple traffic sets, we take advantage of the Type of Service (ToS) field [8] or the Differentiated Service (DS) field [6,9] in IP headers and the QoS routing table extension in OSPF [2], where different next hops for different traffic sets can be recorded. Because our approach can be incorporated into existing protocols or their extensions, no substantial change is needed to deploy it except for the packet classification and traffic allocation at network edges.

In this paper, we also prove that for any given network, there always exists a large enough k which achieves the general optimum of traffic engineering.³ Moreover, this optimal k is less than or equal to the total number of links in the given network. If k is fixed and smaller than the optimal k, then the question of how to partition and route traffic to achieve the optimal result under the k-set constraint (i.e., splitting only at edges and hop-by-hop routing for each traffic set) constitutes the k-set TE problem, which is NP-hard. Note that this k-set optimum is not the general optimum, but the optimum under the k-set constraint. We give a theoretical upper bound on the gap between the k-set optimum and the general optimum. Experimental results show that the k-set optimum is an excellent approximation to the general optimum even with a very small k. We also present a fast heuristic solution, based on Dijkstra's algorithm and linear programming, to approximate the k-set optimum. Performance is evaluated via simulations. Results show that our k-set TE method approximates the general optimum very effectively even when k = 2 or 4, making our approach simple, efficient, and practical.

² The traffic partition is packet-wise. However, since we perform the partition only at network edges, we can enforce the partition ratios and follow flow boundaries at the same time.

³ In this paper, we define the general optimum of traffic engineering to be the best load allocation we can get by using the optimal general routing or explicit flow-based routing as in MPLS. In this case, each flow is optimally distributed over all paths between a source and a destination [3,4,10]. The general optimum is equivalent to the optimum that is obtained by using shortest paths with arbitrary traffic splitting [10]. Furthermore, it can be obtained by solving linear programs (see details in Section 3.3).

The rest of the paper is organized as follows. Section 2 summarizes and compares existing traffic engineering research, followed by the introduction to some fundamentals of traffic engineering in Section 3. Section 4 gives an overview of our k-set TE method. Then, in Section 5, we formulate the k-set traffic splitting and allocation problem using mixed integer programming and propose our heuristic solution to the problem. The performance issues are investigated in Section 6 by simulation. Finally, Section 7 concludes this paper.

2. Related work

As mentioned in Section 1, traffic engineering is of great importance in the Internet. Existing traffic engineering techniques can be roughly classified into two categories in terms of the underlying routing paradigms they assume. One category is based on flow-based explicit routing (also called constraint-based routing). Traffic engineering solutions in this category mainly focus on flow-based networks, such as MPLS networks [11–13]. The other category is based on destination-based hop-by-hop routing. Approaches in this category assume that the underlying networks use the traditional hop-by-hop forwarding paradigm, such as OSPF networks [1].

The major difference between these two categories is the traffic engineering granularity. Methods in the first category aim at finding optimal routing and traffic allocation schemes for individual flows. Because each flow can explicitly select its own paths, finer control granularity and better results (with respect to balancing traffic load) can be obtained. However, as argued in [3,4,14], such methods have significant limitations because of the disadvantages of the flow-based protocols (e.g., MPLS) on which they rely. First, MPLS is not


yet widely deployed. Second, MPLS is more complex and less robust than OSPF. Third, MPLS is less scalable due to the large size of its routing tables and state information.

The approaches of the second category do not assume the existence of MPLS. They rely on the most widely used intra-domain routing protocols in today's Internet, such as OSPF or IS-IS. Forwarding decisions are solely based on the destination addresses in packets' IP headers, and routing table construction is based on Dijkstra's algorithm. In fact, if we look at each destination node and if there is no traffic splitting, the routes from all source nodes to the destination form a spanning tree rooted at the destination [15,16].

The existing approaches, as well as our k-set TE method, are compared in Table 1 with respect to different criteria. In the table, "Ar-sp," "Un-sp," "Ev-sp," "Ev-sp, ext.," and "Lmt-sp" stand for Arbitrarily splittable, Un-splittable, Evenly splittable, Evenly splittable with subset selecting extension, and Limited splittable, respectively. The flow-based approaches have more precise control and can achieve the general optimum of traffic engineering, but their scalability is low because of the "N-square" problem [10]. Some other disadvantages of the flow-based approaches have already been discussed above. In the destination-based category, the arbitrarily splittable approach, which is equivalent to the general optimal method in the flow-based category, enjoys the finest control, but there is no existing protocol that supports it. The evenly splittable approach with subset selecting extension, proposed recently by Sridharan et al. in [4], has fine-grained control and can approximate the arbitrarily splittable approach's performance. However, it has to explicitly change every router's routing table to rule out some feasible next hops for each destination, thus introducing "hand-crafting" overhead. Moreover, since traffic may split at any location of the network, packets from the same flow may follow very different paths, leading to out-of-order delivery problems that deteriorate the performance of TCP flows. The original evenly splittable approach suffers from the same problem. The unsplittable approach does not have this problem, but it cannot achieve good traffic engineering results because its traffic control is too coarse.

Compared with all these existing approaches, our k-set method has the following advantages: (1) Its traffic control granularity is tunable by changing the number of traffic sets that it uses. (2) It does not require any significant changes to the existing protocols, so it is practical and applicable. (3) Due to the edge-only traffic splitting, it can force one flow's packets to follow the same path, thereby eliminating the TCP performance deterioration problem. (4) Finally, it follows the "smart edge, simple core" design rule of the Internet and therefore retains the scalability and stability of the current Internet.

Table 1
Comparisons between different traffic engineering approaches (including our approach)

Criterion                                 | Flow-based explicit routing | Destination-based hop-by-hop forwarding            | Traffic-set based hop-by-hop
Splittability                             | Ar-sp        | Un-sp        | Ar-sp    | Ev-sp, ext. | Ev-sp      | Un-sp        | Lmt-sp
Control granularity                       | Finest       | Medium       | Fine     | Fine        | Medium     | Coarse       | Tunable
Applicability (protocol support)          | Yes (MPLS)   | Yes (MPLS)   | No       | Yes (OSPF)  | Yes (OSPF) | Yes (OSPF)   | Yes (OSPF)
Scalability                               | Low          | Low          | Medium   | High        | High       | High         | High
Traffic from same flow follows same path? | No           | Yes          | No       | No          | No         | Yes          | Yes
Splitting location (edge or core nodes)   | Any          | None         | Any      | Any         | Any        | None         | Edge only
Routing complexity                        | P            | NPC          | P        | NPC         | NPC        | P/NPC        | NPC
Reference(s)                              | [11]         | [11]         | [10]     | [4]         | [3,4,10]   | [1,2,15,17]  | This paper


3. Traffic engineering in OSPF networks: fundamentals

In this section, a network model will be presented. Based on the network model, some fundamentals of traffic engineering (TE) will be introduced, which include the commonly used cost and objective functions for quantitatively evaluating and comparing different TE methods [3,4]. We will also define the general optimum of traffic engineering and introduce a method to obtain the general optimum based on linear programming. The general optimum will be used as a benchmark later when our k-set TE method is evaluated.

3.1. Network model

A network is defined as a directed graph $G = (V, E)$, where $V$ is the node set and $E$ is the link set. $V_e \subseteq V$ is the edge node set, which contains all edge nodes⁴ in the network. All the other nodes are core nodes. We use $(x, y)$ to denote a link from node $x$ to node $y$. For any link $(x, y) \in E$, $c(x, y)$ represents its capacity and $f(x, y)$ represents the amount of traffic load that actually goes through the link. Then the utilization⁵ of link $(x, y)$, denoted by $u(x, y)$, is defined as $u(x, y) = f(x, y)/c(x, y)$. Furthermore, we use $D$ to denote the traffic demand matrix among edge nodes. For any $x, y \in V$, $D(x, y) = d_{xy}$ gives the traffic demand from $x$ to $y$. If $x$ and $y$ are not edge nodes, i.e., $x \notin V_e$ or $y \notin V_e$, then $d_{xy} = 0$. We assume that $D$ is given.⁶

⁴ Edge nodes are the nodes generating and absorbing traffic. Edge nodes have better information of individual flows. Therefore, they are capable of performing fine-grained control on traffic.

⁵ In some papers, link utilization is also called the relative congestion of that link. Note that here link utilization could be larger than 1 because the amount of traffic demand that is put onto a link could exceed the link's capacity. When a link's utilization is larger than 1, it means that the link is overloaded or congested, and some portion of the traffic on this link will be dropped. Note that how to handle such congestion is the task of congestion control or queue management, and is out of the scope of this paper. Our load balancing goal is to minimize the possibility that some links are overloaded due to unbalanced traffic allocation/routing in the network. Therefore, when we pre-compute the traffic allocation during the traffic engineering procedure, if a link's utilization is close to or exceeds 1, we will assign a very large cost/weight on that link to discourage the further use of it.

⁶ How to get such traffic matrices is out of our research scope in this paper. Good references are [18–20].

3.2. Cost function and optimization objective

Being independent of the specific traffic engineering techniques, both cost function and optimization objective can be defined in different ways. For comparability, we use the commonly used objective and cost functions as in [3,4]. For any link $e \in E$, suppose its capacity is $c(e)$ and the total traffic load on it is $f(e)$. (Then $u(e) = f(e)/c(e)$ is the utilization on link $e$.) The objective function $\Phi$ and the piece-wise linear cost function $\phi$ are defined as follows:

$$\Phi = \sum_{e \in E} \phi(u(e)),$$

where for all $e \in E$, $\phi(0) = 0$ and

$$\phi'(x) = \begin{cases}
1 & \text{for } 0 \le x < 1/3, \\
3 & \text{for } 1/3 \le x < 2/3, \\
10 & \text{for } 2/3 \le x < 9/10, \\
70 & \text{for } 9/10 \le x < 1, \\
500 & \text{for } 1 \le x < 11/10, \\
5000 & \text{for } 11/10 \le x < \infty.
\end{cases} \qquad (1)$$

Fig. 1. Link cost $\phi(u(e))$ as function of link utilization $u(e)$. [Plot omitted: horizontal axis, link utilization $u(e)$ from 0 to 1.2; vertical axis, link cost from 0 to 60.]
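The cost and objective functions of Eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `link_cost` and `total_cost` and the two-link example are assumptions made here, not part of the paper. Exact fractions are used so that the breakpoints 1/3, 2/3, 9/10 and 11/10 are represented without rounding.

```python
from fractions import Fraction as F

# (segment start, slope) pairs of phi'(x) as given in Eq. (1).
SEGMENTS = [
    (F(0), 1),
    (F(1, 3), 3),
    (F(2, 3), 10),
    (F(9, 10), 70),
    (F(1), 500),
    (F(11, 10), 5000),
]

def link_cost(u):
    """phi(u): integrate the piecewise-constant slope phi'(x) from 0 to u."""
    u = F(u)
    cost = F(0)
    for idx, (start, slope) in enumerate(SEGMENTS):
        if u <= start:
            break
        # The last segment is unbounded; earlier ones end where the next starts.
        end = SEGMENTS[idx + 1][0] if idx + 1 < len(SEGMENTS) else None
        upper = u if end is None or u < end else end
        cost += slope * (upper - start)
    return cost

def total_cost(loads, capacities):
    """Phi: sum of phi(f(e) / c(e)) over all links e."""
    return sum(link_cost(F(loads[e]) / F(capacities[e])) for e in capacities)

# Hypothetical two-link example: utilizations 1/2 and 9/10.
example_phi = total_cost({("s", "a"): 50, ("a", "t"): 90},
                         {("s", "a"): 100, ("a", "t"): 100})
```

In this example the lightly loaded link contributes 5/6 and the link at 90% utilization contributes 11/3, which shows how quickly the penalty grows as utilization approaches 1.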


To better illustrate how the cost function evolves with link utilization, the curve is shown in Fig. 1. The basic idea is that the higher the link utilization is, the more expensive the link becomes. We impose very heavy penalties on a link when its utilization approaches 1.0 to prevent the link from being overloaded. For our optimization objective, we first sum up the link costs of all links, then minimize the sum. Note that different optimal results may exist if we use different traffic engineering techniques. For example, if we use flow-based traffic engineering (such as in MPLS), we can achieve the best result, because the flow-based technique has the finest control and the fewest constraints in terms of how traffic is split and routed.

3.3. General optimum with arbitrary traffic splitting

If we are allowed to split and allocate traffic freely among any paths between sources and destinations without any additional constraints, we can obtain the best traffic engineering solution, which is called the general optimum (OPT for short) [3,4]. The total cost under the general optimum solution is denoted by $\Phi^*$. Note that the general optimum will be the target that we want to approach and compare against when the k-set constraints are imposed later. Given a network $G = (V, E)$, the edge node set $V_e$, the traffic demand matrix $D = \{d_{it}\}$, and the cost and objective functions defined above in Eq. (1), we can obtain the general optimum ($\Phi^*$) by solving the following linear program. Note that although the general optimum can theoretically be obtained in polynomial time, in reality it is not achievable in OSPF networks even with ECMP, because traffic in such networks cannot be split and allocated freely.

Find variables $f_{ij}^t$ which satisfy

$$\Phi^* = \min \sum_{(i,j) \in E} \phi_{ij}$$

subject to

$$\begin{aligned}
&\sum_{j=1}^{n} f_{ij}^t - \sum_{j=1}^{n} f_{ji}^t = d_{it}, && i \in V,\ t \in V_e,\ i \ne t, \\
&u_{ij} = \sum_{t \in V_e} f_{ij}^t / c_{ij}, && (i,j) \in E, \\
&f_{ij}^t \ge 0, && (i,j) \in E,\ t \in V,
\end{aligned} \qquad (2)$$

$$\begin{aligned}
&\phi_{ij} \ge u_{ij}, && (i,j) \in E, \\
&\phi_{ij} \ge 3u_{ij} - \tfrac{2}{3}, && (i,j) \in E, \\
&\phi_{ij} \ge 10u_{ij} - \tfrac{16}{3}, && (i,j) \in E, \\
&\phi_{ij} \ge 70u_{ij} - \tfrac{178}{3}, && (i,j) \in E, \\
&\phi_{ij} \ge 500u_{ij} - \tfrac{1468}{3}, && (i,j) \in E, \\
&\phi_{ij} \ge 5000u_{ij} - \tfrac{16318}{3}, && (i,j) \in E.
\end{aligned} \qquad (3)$$

Note that the constraints in (3) define the cost on each link, which is an implementation of the cost function defined in Eq. (1). Solving this linear program gives a traffic allocation $\{f_{ij}^t\}$ on each link $(i, j)$ such that the objective function is minimized. We can also obtain the corresponding link weights for constructing shortest path routing by solving the dual of the linear program [10]. However, we are only interested in the optimal cost $\Phi^*$ for comparison with our k-set TE method later.
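To see why the six inequalities in (3) implement Eq. (1), note that the cost function is convex and piecewise linear, so its value at any utilization is the largest of the six line values; a minimizing linear program then pushes each link cost down onto that maximum. The following Python sketch checks this numerically. The helper names are assumptions introduced here for illustration; the intercepts are recomputed from the breakpoints of Eq. (1) by requiring consecutive segments to meet.

```python
from fractions import Fraction as F

# (slope, intercept) pairs read off the constraints in (3):
# phi_ij >= slope * u_ij - intercept.
LINES = [
    (1, F(0)),
    (3, F(2, 3)),
    (10, F(16, 3)),
    (70, F(178, 3)),
    (500, F(1468, 3)),
    (5000, F(16318, 3)),
]

def cost_from_constraints(u):
    """Smallest value satisfying all six inequalities in (3) at utilization u."""
    u = F(u)
    return max(slope * u - intercept for slope, intercept in LINES)

def intercepts_from_breakpoints():
    """Recompute the intercepts so that consecutive segments of Eq. (1)
    take the same value at each breakpoint."""
    breakpoints = [F(1, 3), F(2, 3), F(9, 10), F(1), F(11, 10)]
    slopes = [1, 3, 10, 70, 500, 5000]
    intercepts = [F(0)]
    for bp, prev_slope, slope in zip(breakpoints, slopes, slopes[1:]):
        value_at_bp = prev_slope * bp - intercepts[-1]
        intercepts.append(slope * bp - value_at_bp)
    return intercepts
```

Running `intercepts_from_breakpoints()` reproduces the constants 2/3, 16/3, 178/3, 1468/3 and 16318/3 that appear in (3).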

4. k-Set TE method: overview

Traffic cannot be split and routed freely in OSPF networks. OSPF ECMP is not good enough to approximate the general optimum (OPT) because of the limitation of equal splitting. The basic idea of our k-set TE method is that we (1) virtually divide a given physical network into k overlays, (2) partition the traffic into k uneven traffic sets only at edge nodes and identify these k traffic sets by using the ToS/DS field in IP headers, and (3) route the traffic sets independently over the k virtual overlays on top of the given physical network. Note that the traffic splitting and allocation happen only at edge nodes, but such splitting and allocation can be uneven among traffic sets to achieve better load balancing. Since the edge nodes have much less traffic to handle than the core nodes do, uneven traffic splitting is feasible at the network edge. (This is exactly the same argument as used by the DiffServ model for pushing intelligence and complexity to the network edge.) Moreover, since the edge nodes have flow information, they can partition traffic more precisely following flow boundaries. No further traffic splitting or re-allocation is allowed in the core. If we look at only one traffic set at a time, say Traffic Set 1, it is routed exactly the same way as in an OSPF network without ECMP splitting. That is, for every destination node t, the routes that Traffic Set 1 travels from all the sources to t form a shortest path spanning tree rooted at t.

Fig. 2 shows simple examples of the k-set TE method and the general optimum (OPT) method. In the given original topology, s and t are the only two edge nodes. Suppose the traffic demand is from s to t. Then, the OPT method with arbitrary splitting can use at least five distinct routes to split the traffic. The routes are ⟨s, a, t⟩, ⟨s, b, t⟩, ⟨s, c, t⟩, ⟨s, c, b, t⟩, and ⟨s, c, d, t⟩. There are two splitting locations: s and c (a core node). If we use the k-set TE method, then the possible results are shown in the two bottom sub-figures, with k = 2 and 4, respectively. In the case where k = 2, we have to choose only two distinct routes to perform traffic engineering. One possible choice is ⟨s, a, t⟩ and ⟨s, c, t⟩. Similarly, if k = 4, then we can split traffic into four routes: ⟨s, a, t⟩, ⟨s, b, t⟩, ⟨s, c, t⟩, and ⟨s, c, d, t⟩, as shown in the sub-figures. In both cases of the k-set TE method, only one splitting location, s, exists, which is an edge node. Note that in the case of k = 4, it looks like a split happens at node c, but it does not. This is because the two traffic sets, routed independently on ⟨s, c, t⟩ and ⟨s, c, d, t⟩, have already been partitioned and allocated by node s before they reach c. That is, although the two traffic sets physically share the same link (s, c), they are separated virtually.

Fig. 2. Simple example to show k-set TE method and OPT method. [Figure omitted: four panels on nodes s, a, b, c, d, t, titled "Original topology", "OPT method (arbitrary splitting)", "k-set method (k = 2)", and "k-set method (k = 4)"; a marker indicates the location of traffic splitting.]

From this example, we can observe that the OPT method (with arbitrary splitting) may achieve the best TE result because it can split traffic into more streams and at any place in a network, thus having the finest control granularity. We can also imagine that the larger the k we use, the better the result the k-set TE method may achieve. Several questions may arise. Is it possible for the k-set TE method to achieve the same result as the general optimum? If so, how large should k be? If a small k is used, then what is the best result that the k-set method can achieve? We call this best result the k-set optimum and denote its cost by $\Phi_k^*$. (Note that this k-set optimum is not the general optimum, but the optimum under the k-set constraint.) Finally, what is the performance gap between the k-set optimum and the general optimum (i.e., the difference between $\Phi^*$ and $\Phi_k^*$)? We will focus on these questions next in Section 5.

Briefly, our target of traffic engineering is the general optimum $\Phi^*$. As we have discussed in Section 3.3, $\Phi^*$ can be achieved theoretically by solving a linear program. Since this method requires arbitrary traffic splitting, it is infeasible in real OSPF networks because only even traffic splitting is allowed in such networks. By using our k-set TE method for a given k, we expect that $\Phi_k^*$ is as close to $\Phi^*$ as possible. Actually, we will show in Theorem 1 (Section 5) that if k is sufficiently large, then the k-set TE method can achieve the general optimum (i.e., $\Phi_k^* = \Phi^*$ if k is sufficiently large).
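The Fig. 2 example can be mimicked with a small Python sketch. Several things here are assumptions made for illustration only: the directed link list is inferred from the five routes named in the text (the figure itself is not reproduced), and the per-overlay link weights and the amounts 40/30/20/10 allocated to the four traffic sets are hypothetical. Each traffic set is routed by Dijkstra's algorithm on its own overlay, so it follows a single path, and the only split happens at the edge node s.

```python
import heapq

# Directed links inferred from the routes listed for Fig. 2 (an assumption).
LINKS = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t"), ("s", "c"),
         ("c", "t"), ("c", "b"), ("c", "d"), ("d", "t")]

def shortest_path(weights, src, dst):
    """Dijkstra over LINKS with the given per-link weights; returns one path."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for (x, y) in LINKS:
            if x != node:
                continue
            nd = d + weights[(x, y)]
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                prev[y] = x
                heapq.heappush(heap, (nd, y))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def route_k_sets(set_weights, set_demands, src="s", dst="t"):
    """Route each traffic set on its own overlay and add up the link loads."""
    loads = {link: 0 for link in LINKS}
    paths = []
    for weights, amount in zip(set_weights, set_demands):
        path = shortest_path(weights, src, dst)
        paths.append(path)
        for link in zip(path, path[1:]):
            loads[link] += amount
    return paths, loads

def overlay(cheap_route):
    """Hypothetical weight setting: weight 1 on one route's links, 10 elsewhere."""
    cheap = set(zip(cheap_route, cheap_route[1:]))
    return {link: (1 if link in cheap else 10) for link in LINKS}

# k = 4: one overlay per route named in the text, with an uneven split at s.
routes = [["s", "a", "t"], ["s", "b", "t"], ["s", "c", "t"], ["s", "c", "d", "t"]]
paths, loads = route_k_sets([overlay(r) for r in routes], [40, 30, 20, 10])
```

With these assumed numbers, link (s, c) carries 30 units, the sum of the third and fourth traffic sets, even though node c never makes a splitting decision: each set already knows its own next hop at c.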
Otherwise, if k is not sufficiently large, we will study the gap between $\Phi^*$ and $\Phi_k^*$, and will give a worst-case constant bound on $\Phi_k^*/\Phi^*$ in Theorem 2. Our bound is tight and is the same as the bound given by Fortz and Thorup in [3]. Furthermore, if k is small, we will show that searching for the k-set optimum $\Phi_k^*$ is NP-hard. Therefore, we have devised a heuristic algorithm to approximate $\Phi_k^*$ for a small k (Section 5.3). We denote the result of the heuristic algorithm by $\hat{\Phi}_k$. That is, $\hat{\Phi}_k$ is the direct approximation to $\Phi_k^*$. Since the gap between $\Phi^*$ and $\Phi_k^*$ is very small in real networks (Section 5.2), we expect that through approaching $\Phi_k^*$, $\hat{\Phi}_k$⁷ is also a good approximation to the general optimum $\Phi^*$ (our final target). Finally, we summarize the relationship between $\Phi^*$, $\Phi_k^*$, and $\hat{\Phi}_k$ in Fig. 3. $\Phi_k^*$ can be regarded as a bridge from $\hat{\Phi}_k$ to $\Phi^*$.

Fig. 3. Roadmap of the k-set TE method.

⁷ The performance will be evaluated in Section 6 by simulations.

5. k-Set traffic engineering method

In this section, we will elaborate on the k-set TE method by answering the questions raised in the previous section (Section 4). We will first formulate the problem of how to achieve the k-set optimum for a given k, which is called the k-set TE problem and is proven to be NP-hard. Then, we will study the questions associated with the number of traffic sets, k. Finally, we will present our heuristic solution to the k-set TE problem, followed by some implementation issues and discussions.

5.1. k-Set TE problem formulation

The k-set TE problem is the fundamental problem behind our k-set TE method. Under the k-set constraint, the problem is how to split traffic at each source node and how to route these k traffic sets so that the objective $\Phi$, defined in Eq. (1), is minimized. We use $\Phi_k^*$ to denote the minimal $\Phi$ under the k-set constraint. (Note that $\Phi_k^*$ is optimal under the k-set TE method and is not necessarily the general optimum $\Phi^*$.) In the next section, we will determine the number of traffic sets, k, and the gap between $\Phi_k^*$ and the general optimum $\Phi^*$ achieved by using the TE method proposed in [10], where traffic is arbitrarily splittable.

The k-set TE problem is NP-hard. To see this, consider a special case of the k-set TE problem, where k = 1 and there is only one destination node. Even this special case is NP-hard because it is exactly a variation of the minimal-cost Steiner tree problem, which is a well-known NP-hard problem.

The k-set TE problem can be mathematically formulated as a mixed integer programming problem. Given an IP network $G = (V, E)$ and the demand matrix $D = \{d_{st} \mid s, t \in V\}$,⁸ and assuming $V = \{1, 2, \ldots, n\}$ is the set of nodes, we define the following three groups of decision variables in Eq. (4), where $T_t^a$ denotes the spanning tree for Traffic Set $a$, rooted at $t$ and covering all edge nodes in the network $G$:

$$\begin{aligned}
&d_{it}^a = \text{traffic (demand) allocation to Traffic Set } a \text{ at edge node } i, \quad \sum_{a=1}^{k} d_{it}^a = d_{it}, \quad \forall i, t \in V_e; \\
&x_{ijt}^a = \begin{cases} 1, & (i,j) \in E,\ t \in V_e,\ (i,j) \in T_t^a, \\ 0, & \text{otherwise}; \end{cases} \\
&f_{ijt}^a = \text{traffic fraction of Traffic Set } a \text{ on link } (i,j) \text{ for destination } t, \quad \forall (i,j) \in E,\ t \in V_e.
\end{aligned} \qquad (4)$$

⁸ If nodes s, t are not edge nodes, then $d_{st} = 0$.

Note that $x_{ijt}^a$ are binary variables, and $f_{ijt}^a = 0$ if $x_{ijt}^a = 0$. Finally, based on the objective and cost functions given in Eq. (1), the k-set TE problem is formulated as follows:


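Find variables $d_{it}^a$, $x_{ijt}^a$, and $f_{ijt}^a$ which satisfy

$$\Phi_k^* = \min \sum_{(i,j) \in E} \phi_{ij} \qquad (5)$$

subject to

$$\sum_{j=1}^{n} x_{ijt}^a = 1, \quad (i,j) \in E,\ t \in V_e,\ i \ne t,\ a = 1, \ldots, k, \qquad (6)$$

$$\sum_{j=1}^{n} f_{ijt}^a - \sum_{j=1}^{n} f_{jit}^a = d_{it}^a, \quad i \in V,\ t \in V_e,\ a = 1, \ldots, k, \qquad (7)$$

$$\sum_{a=1}^{k} d_{it}^a = d_{it}, \quad i \in V,\ t \in V_e, \qquad (8)$$

$$\begin{aligned}
&f_{ijt}^a \ge 0,\ d_{it}^a \ge 0, && (i,j) \in E,\ t \in V_e,\ a = 1, \ldots, k, \\
&f_{ijt}^a = 0 \ \text{if}\ x_{ijt}^a = 0, && (i,j) \in E,\ t \in V_e,\ a = 1, \ldots, k, \\
&x_{ijt}^a \in \{0, 1\}, && (i,j) \in E,\ t \in V_e,\ a = 1, \ldots, k,
\end{aligned}$$

$$u_{ij} = \sum_{t \in V_e} \sum_{a=1}^{k} \frac{f_{ijt}^a}{c_{ij}}. \qquad (9)$$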
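As an illustration of how a candidate solution relates to constraints (6), (8) and (9), the following Python sketch evaluates a hand-made solution on a hypothetical three-node network. The data layout (`next_hop`, `alloc`) and the function name are assumptions chosen here for readability; this is a feasibility check of a given solution, not the mixed integer program itself and not the paper's heuristic.

```python
def evaluate_k_set_solution(next_hop, alloc, demand, capacity):
    """Check a candidate k-set solution and return the link utilizations.

    next_hop[a][t][i] -> j   one next hop per node, set and destination,
                             i.e. x^a_{ijt} = 1 as required by constraint (6)
    alloc[a][(i, t)]         d^a_{it}, the share of d_{it} given to set a
    demand[(i, t)]           d_{it}
    capacity[(i, j)]         c_{ij}
    """
    k = len(alloc)
    # Constraint (8): the k allocations of every demand add up to the demand.
    for pair, d in demand.items():
        if sum(alloc[a].get(pair, 0) for a in range(k)) != d:
            raise ValueError(f"allocations for {pair} do not add up to the demand")
    load = {link: 0 for link in capacity}
    for a in range(k):
        for (i, t), amount in alloc[a].items():
            node, hops = i, 0
            # Follow the single next hop of set a towards t; flow conservation
            # holds by construction because nothing is split on the way.
            while node != t:
                nxt = next_hop[a][t][node]
                load[(node, nxt)] += amount
                node, hops = nxt, hops + 1
                if hops > len(capacity):
                    raise ValueError("next hops loop instead of forming a tree")
    # Eq. (9): utilization of each link.
    return {link: load[link] / capacity[link] for link in capacity}

# Hypothetical instance: demand of 8 from node 1 to node 3, split 5 + 3 over k = 2 sets.
capacity = {(1, 2): 10, (2, 3): 10, (1, 3): 10}
demand = {(1, 3): 8}
next_hop = [{3: {1: 3, 2: 3}},   # set 1 goes directly 1 -> 3
            {3: {1: 2, 2: 3}}]   # set 2 goes 1 -> 2 -> 3
alloc = [{(1, 3): 5}, {(1, 3): 3}]
util = evaluate_k_set_solution(next_hop, alloc, demand, capacity)
```

Storing exactly one next hop per node, traffic set and destination is one simple way to make the tree structure explicit in code; the binary variables of the formulation express the same restriction.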

Here we omit the link cost constraints, which are similar to Eq. (3). In Eq. (5), we let $\phi_{ij} = \phi(u_{ij})$. Constraints (6) state that every node has only one outgoing link (next hop) to forward each traffic set. Constraints (7) are flow conservation constraints. Together, (6) and (7) enforce the tree structure towards each destination for each traffic set. Constraints (8) enforce the proper traffic splitting at each edge node. Eq. (9) defines the utilization of each link.

5.2. How many traffic sets do we need?

In this subsection, our focus is on how the value of k affects the performance of the k-set TE method. Intuitively, the more traffic sets we use, the better the result we get. However, due to scalability and applicability factors, this number should be kept small. The natural questions are then: (1) Can we achieve, using the k-set method with a certain k, the general optimal result of arbitrary traffic splitting? If so, what should the optimal k be? (2) How many traffic sets do we need to achieve a reasonably good sub-optimal result? We will address these questions as follows.


First, we prove theoretically that the k-set TE method can achieve $\Phi^*$ (the general optimum of arbitrary traffic splitting) if k is sufficiently large.

Theorem 1. For any given topology G = (V, E) and traffic matrix D, there always exist large enough k's and corresponding traffic allocation strategies such that the k-set method achieves the general optimum $\Phi^*$ (defined in Eq. (2)). Furthermore, the smallest such optimal k is less than or equal to $|E|$. That is, we can always find a $k \le |E|$ such that $\Phi^*_k = \Phi^*$.

Proof. The proof is given in Appendix A. □

Although we have proved that an optimal k can always be found to achieve $\Phi^*_k = \Phi^*$, this k might be too large to be practical. Given a fixed small k, since $\Phi^*_k$ is the best approximation to $\Phi^*$ obtainable with the k-set TE method, we would like to know whether $\Phi^*_k/\Phi^*$ (i.e., the gap between the k-set optimum and the general optimum) is bounded by a constant.

Theorem 2. Given a fixed k (smaller than the value that achieves the general optimum), if the cost and objective functions in Eq. (1) are used, then $\Phi^*_k/\Phi^* \le 5000$, and the upper bound is tight.

Proof. The proof is given in Appendix B. □
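As a rough numerical illustration of Theorem 2 (not part of the original proof), consider the equal-capacity construction of Appendix B with hypothetical values $|E| = 12$, $k = 2$, $c = 1$ and $d = 12/5$. The sketch assumes that Eq. (1) is the piecewise-linear cost of Fortz and Thorup [3], with slopes 1, 3, 10, 70, 500, 5000 and breakpoints 1/3, 2/3, 9/10, 1, 11/10; this matches the slope range 1 to 5000 quoted in Appendix B, but the breakpoints are an assumption here.

```latex
\begin{align*}
\Phi^* &= |E|\,\phi\!\left(\tfrac{d}{c|E|}\right) = 12\,\phi\!\left(\tfrac{1}{5}\right) = 12\cdot\tfrac{1}{5} = \tfrac{12}{5},\\
\phi\!\left(\tfrac{6}{5}\right) &= 1\cdot\tfrac{1}{3} + 3\cdot\tfrac{1}{3} + 10\cdot\tfrac{7}{30} + 70\cdot\tfrac{1}{10} + 500\cdot\tfrac{1}{10} + 5000\cdot\tfrac{1}{10} = \tfrac{1682}{3},\\
\Phi^*_k &= k\,\phi\!\left(\tfrac{d}{kc}\right) = 2\,\phi\!\left(\tfrac{6}{5}\right) = \tfrac{3364}{3},\\
\frac{\Phi^*_k}{\Phi^*} &= \tfrac{3364}{3}\cdot\tfrac{5}{12} = \tfrac{4205}{9} \approx 467 \le 5000.
\end{align*}
```

In this instance the two links available to the k-set method are pushed into the steepest cost segment while the twelve links of the general optimum stay in the cheapest one, which is the mechanism behind the worst-case ratio.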

The constant bound obtained in Theorem 2 is a theoretical worst-case upper bound, which equals the bound given by Fortz and Thorup in [3]. In practice, the difference between $\Phi^*_k$ and $\Phi^*$ may not be that large. In what follows, we test the actual gap between $\Phi^*_k$ and $\Phi^*$ in real networks by solving both the mixed integer program in Eqs. (5)–(9) and the linear program in Eq. (2). We generate a network topology using the BRITE [21] topology generator with the Waxman model. The topology consists of 20 nodes, 14 of which are edge nodes (70% of the total). The number of links is 80. Link capacities are randomly generated in [10, 1024]. The traffic matrix is randomly generated using Method 1 in Section 6.1. In summary, the setup is the same as that of simulation Case 1 in Section 6.1 (see Table 2),


Table 2. Setup of the three simulation cases

Case | # Nodes | # Edge nodes | # Links | Link capacities  | Traffic matrix                       | k
1    | 100     | 70           | 400     | Unif(10,1024)    | Unif(1,maxCap(x, y)) (Method 1)      | 2, 4
2    | 100     | 70           | 400     | Fixed to 100 (a) | Unif(1,maxCap(x, y)) (Method 1)      | 2, 4
3.1  | 200     | 100          | 800     | Unif(10,1024)    | Unif(1,maxCap(x, y)) (Method 1)      | 2, 4
3.2  | 200     | 100          | 800     | Unif(10,1024)    | Randomly, with hot spots (Method 2)  | 2, 4

(a) The exact number here is not important because we will scale the traffic matrix accordingly.

Fig. 4. Performance comparison between general optimal solution and k-set method, with k = 1, 2, 4 (20 nodes, 14 edge nodes, 80 links). (a) Cost in real values and (b) normalized by OPT, in logarithm. [Both panels plot OPT, 1-set, 2-set and 4-set against the total traffic demand (normalized by the total capacity); the y-axes are (a) cost $\phi$ and (b) $\log(\phi/\phi^*)$.]

except for the smaller network size. We use ILOG CPLEX [22] as the optimization solver.

Fig. 4 shows the results of the experiments. The x-axis represents the overall input traffic load, scaled and normalized by the total capacity (the sum of the capacities of all links). In Fig. 4a, the curves of $\Phi^*$, $\Phi^*_2$ and $\Phi^*_4$ collapse together. To show the differences more clearly, Fig. 4b plots the cost normalized by $\Phi^*$, in logarithm (i.e., the curves represent $\log(\Phi^*_k/\Phi^*)$, k = 1, 2, 4). Even so, the differences are still very difficult to discern. In fact, our data show that $\Phi^*_4$ is almost the same as $\Phi^*$. The results clearly show that in a realistic network of small size, a fairly small k, say 2 or 4, achieves very good performance as long as we can find the optimal solutions to the k-set TE problem. However, since this problem is NP-hard, it is not practical to obtain such optimal solutions for large-scale networks. Therefore, we propose a heuristic algorithm in Section 5.3 to search for good near-optimal solutions quickly. Its performance in large-scale networks is studied by simulation in Section 6.

5.3. Heuristic solution to k-set TE problem

In the previous subsection, we have shown that if we find the optimal solutions to the k-set TE problem, then the k-set TE method approaches the general optimal traffic allocation very well, even when k is very small, for example, k = 2 or 4 (only 1 or 2 ToS/DS bits in the IP header). For large-scale networks, however, it is not practical to find such optimal solutions because the problem is NP-hard. Therefore, we design a heuristic solution to this problem in this subsection.

Our heuristic solution consists of two steps. The basic idea is first to search for k near-optimal overlays, one for each traffic set, and then to optimize the traffic allocation among these overlays.

Step 1: Search for the k near-optimal virtual overlays (routing solution), one for each traffic set. Given a topology G = (V, E), the edge node set $V_e$ and the traffic demand matrix $D = \{d_{ij}\}$, the following pseudo-code presents the algorithm sketch. Since we use a basic Dijkstra-based shortest path algorithm, the Inverse Capacity Shortest-Path algorithm (InvCap for short), as our search engine, we call our algorithm k-set-InvCap (Algorithm 1).

Algorithm 1. k-SET-INVCAP(G, Ve, D, k) {Find k virtual overlays}
  B ← 0 {Init bandwidth usage matrix B = {b_ij} to 0}
  W ← {1/c_ij} {Init link weight to reciprocal of link capacity: 1/c_ij}
  for a = 1 to k do
    P_a ← DIJKSTRA(G, Ve, W) {Find all paths for ath overlay}
    UPDATE-LINK-USAGE(B, D/2^a, P_a) {Subtract this overlay from network}
    W ← {1/(c_ij − b_ij)}
  end for

In Algorithm 1, $B = \{b_{ij}\}$ and $W = \{w_{ij}\}$ represent the link bandwidth usage matrix and the link weight matrix, respectively (recall that $b_{ij}$ is the bandwidth usage on link (i, j) and $w_{ij}$ is the weight of link (i, j)). The function call DIJKSTRA(G, Ve, W) returns a virtual overlay among all edge nodes, consisting of the shortest paths among them with respect to the given link weight matrix W. The function call UPDATE-LINK-USAGE(B, D′, P) increases each link's bandwidth usage by routing the given demand matrix D′ onto the given overlay P. That is, for any edge nodes s and t, there exists a path $p_{st} \in P$, and we increase the bandwidth usage of each on-path link by the given demand from s to t (i.e., $d'_{st}$). As shown in the algorithm, we in fact divide the traffic demand into k matrices in an exponential way:

$$D = \frac{D}{2} + \frac{D}{4} + \cdots + \frac{D}{2^{k-1}} + \frac{D}{2^{k-1}}.$$
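The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based graph representation, the use of one shortest-path tree per destination, and the choice to give exhausted links an infinite weight are all assumptions made for this example.

```python
import heapq
from fractions import Fraction


def dijkstra_tree(nodes, weights, dest):
    """Next hop of every node toward dest under weights {(i, j): w}."""
    incoming = {v: [] for v in nodes}
    for (i, j), w in weights.items():
        incoming[j].append((i, w))
    dist = {dest: 0}
    next_hop = {}
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, w in incoming[v]:
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                next_hop[u] = v
                heapq.heappush(heap, (nd, u))
    return next_hop


def k_set_invcap(nodes, capacity, edge_nodes, demand, k):
    """Return k overlays; overlays[a][t][v] is the next hop of v toward t."""
    usage = {link: Fraction(0) for link in capacity}               # B <- 0
    weights = {link: Fraction(1) / c for link, c in capacity.items()}  # W <- {1/c_ij}
    overlays = []
    for a in range(1, k + 1):
        trees = {t: dijkstra_tree(nodes, weights, t) for t in edge_nodes}
        overlays.append(trees)
        for (s, t), d in demand.items():      # route D / 2^a on this overlay
            v = s
            while v != t:
                nxt = trees[t][v]
                usage[(v, nxt)] += Fraction(d) / 2 ** a
                v = nxt
        # W <- {1/(c_ij - b_ij)}; exhausted links become unusable (assumption)
        weights = {
            link: 1 / (capacity[link] - usage[link])
            if capacity[link] > usage[link] else float("inf")
            for link in capacity
        }
    return overlays


# Hypothetical 3-node example: the direct link is preferred first, the detour second.
nodes = ["a", "s", "t"]
capacity = {("s", "t"): 2, ("s", "a"): 2, ("a", "t"): 2}
overlays = k_set_invcap(nodes, capacity, ["s", "t"], {("s", "t"): 3}, k=2)
```

In this toy instance the first overlay routes s directly to t; after half of the demand is placed there, the residual capacity of the direct link drops and the second overlay detours through a.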

At each iteration, we find an overlay and update the link usage by deploying the corresponding fraction of the traffic demand onto that overlay. The algorithm we use to find the overlay at each iteration is simply Dijkstra's algorithm with each link weight equal to the reciprocal of the link's residual bandwidth; i.e., if $w_{ij}$ is the weight of (i, j), then $w_{ij} = 1/(c_{ij} - b_{ij})$. Note that the computation of the weight matrices is based solely on the link capacities $\{c_{ij}\}$ and the traffic demand matrix D; it is static and has nothing to do with the traffic allocation. Therefore, the static weight matrices can be assigned to the links and used to construct the routing table at each node in a decentralized manner (see the more detailed discussion in Section 5.5.3).

Step 2: Determine the traffic allocation among the k traffic sets based on the virtual overlays (routing solution) found in Step 1. We have now found the k near-optimal virtual overlays for the traffic sets. What we want next is the best traffic allocation among these traffic sets. That is, given fixed $x^a_{ijt}$, find variables $d^a_{it}$, $f^a_{ijt}$ which satisfy

$$
\begin{aligned}
\hat{\Phi}_k = \min \; & \sum_{(i,j)\in E} \phi_{ij} \\
\text{subject to}\quad
& \sum_{j=1}^{n} f^a_{ijt} - \sum_{j=1}^{n} f^a_{jit} = d^a_{it}, && i \in V,\ t \in V_e,\ a = 1,\ldots,k, \\
& d^a_{it} \ge 0,\quad f^a_{ijt} \ge 0, && (i,j)\in E,\ t\in V_e,\ a=1,\ldots,k, \\
& \sum_{a=1}^{k} d^a_{it} = d_{it}, && i\in V,\ t\in V_e, \\
& f^a_{ijt} = 0 \ \text{if}\ x^a_{ijt} = 0, && (i,j)\in E,\ t\in V_e,\ a=1,\ldots,k, \\
& u_{ij} = \sum_{t\in V_e}\sum_{a=1}^{k} \frac{f^a_{ijt}}{c_{ij}}.
\end{aligned}
\tag{10}
$$

Again, we omit from Eq. (10) the link cost constraints, which are similar to Eq. (3). Note that Eq. (10) is a linear program, not a mixed integer program, because the $x^a_{ijt}$ are now fixed constants (0 or 1). At this step, we find the traffic allocation $\{d^a_{it}\}$ for each node i. Note that $\sum_{a=1}^{k} d^a_{it} = d_{it}$, where $d_{it}$ is the total traffic demand from i to t. If i is a core node, then $\{d^a_{it}\}$ is always equal to 0, because no traffic is generated at i, i.e., $d_{it} = 0$ for any destination t. Network operation should ensure that every edge node complies with the traffic allocation $\{d^a_{it}\}$.
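To make the Step 2 objective concrete, the sketch below evaluates the total cost of Eq. (10) for fixed overlays and a candidate allocation, and then scans a coarse grid of allocations on a tiny hypothetical network. It is a stand-in for the linear program, not a replacement for an LP solver such as CPLEX. The cost function is assumed to be the Fortz-Thorup piecewise-linear function (slopes 1 to 5000); the breakpoints, the helper names and the example topology are assumptions of this illustration.

```python
from fractions import Fraction as F

# Assumed form of Eq. (1): Fortz-Thorup piecewise-linear link cost.
BREAKS = [F(0), F(1, 3), F(2, 3), F(9, 10), F(1), F(11, 10)]
SLOPES = [1, 3, 10, 70, 500, 5000]


def phi(u):
    """Cost of one link at utilization u, accumulated segment by segment."""
    cost = F(0)
    for idx, slope in enumerate(SLOPES):
        lo = BREAKS[idx]
        if u <= lo:
            break
        hi = BREAKS[idx + 1] if idx + 1 < len(BREAKS) else u
        cost += slope * (min(u, hi) - lo)
    return cost


def total_cost(capacity, overlays, allocation):
    """Objective of Eq. (10) for fixed overlays and a candidate allocation.

    overlays[a][(s, t)]   -> list of links on the path used by traffic set a
    allocation[a][(s, t)] -> demand placed on traffic set a for the pair (s, t)
    """
    load = {link: F(0) for link in capacity}
    for paths, shares in zip(overlays, allocation):
        for pair, amount in shares.items():
            for link in paths[pair]:
                load[link] += amount
    return sum(phi(load[link] / capacity[link]) for link in capacity)


# Hypothetical instance: a capacity-2 direct link and a two-hop detour of capacity 1.
capacity = {("s", "t"): 2, ("s", "a"): 1, ("a", "t"): 1}
overlays = [{("s", "t"): [("s", "t")]}, {("s", "t"): [("s", "a"), ("a", "t")]}]
demand = F(2)


def cost_of(share):
    """Total cost when `share` goes on traffic set 1 and the rest on set 2."""
    split = [{("s", "t"): share}, {("s", "t"): demand - share}]
    return total_cost(capacity, overlays, split)


# Coarse search over the share sent on traffic set 1, in steps of 1/12.
best_share = min((F(i, 12) for i in range(25)), key=cost_of)
best_cost = cost_of(best_share)
```

Because the objective is convex and piecewise linear in the allocation, the grid minimum here coincides with a breakpoint of the cost function; a real deployment would solve the linear program directly.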

5.4. Example

Fig. 5 illustrates how the heuristic algorithm works. In the original topology, there are 6 nodes in total, 3 of which are edge nodes (s1, s2, t). Each link has a capacity of 1 except c(k, t) = 2. $d_{s_1 t} = 1$ and $d_{s_2 t} = 1$ are the only traffic demands. At Step 1, we first use the Inverse Capacity Shortest-Path (InvCap) scheme to find the overlay for Traffic Set 1, shown in the upper middle part of the figure. Then we update the bandwidth usage on each link of the Traffic Set 1 overlay by increasing the usage by 1/2. On the "residual" topology, we apply InvCap once again and find the second overlay, for Traffic Set 2, shown in the lower middle part of the figure. As we can see, both overlays have a tree structure. At Step 2, we optimize the traffic allocation using the two overlays found at Step 1. By solving the linear program given in Eq. (10), we have $d^1_{s_1 t} = 2/3$, $d^2_{s_1 t} = 1/3$, $d^1_{s_2 t} = 2/3$, and $d^2_{s_2 t} = 1/3$, as shown in the right-hand graph. The total cost is $3\phi(2/3) + 4\phi(1/3) = 5\tfrac{1}{3}$. It is easy to verify that this is exactly the general optimum.

Fig. 5. Example of how k-set heuristic algorithm works. [Panels: original topology with link capacities; Step 1: searching for overlays (overlay for Traffic Set 1 and, on the residual capacities, overlay for Traffic Set 2); Step 2: optimizing traffic allocation on given overlays, with $d^1_{s_1 t} = 2/3$, $d^2_{s_1 t} = 1/3$, $d^1_{s_2 t} = 2/3$, $d^2_{s_2 t} = 1/3$.]

5.5. Implementation issues and discussions

5.5.1. Use of bits in ToS/DS field

The ToS field was first defined in RFC 791 [8], which describes one entire byte (8 bits) in an IP header. Let us write it as [b0b1b2b3b4b5b6b7]. The three most significant bits of the ToS field, [b0b1b2], are called the precedence, which defines the priority or importance of an IP packet. For example, Precedence 7 ([b0b1b2] = [1 1 1]) refers to "Network Control," Precedence 6 ([b0b1b2] = [1 1 0]) refers to "Internetwork Control," and Precedence 0 ([b0b1b2] = [0 0 0]) refers to "Routine." The rest of the ToS field is used as follows: [b3] specifies delay (0 = Normal; 1 = Low), [b4] specifies throughput (0 = Normal; 1 = High), [b5] specifies reliability (0 = Normal; 1 = High), and [b6b7] is reserved for future use. Since the one-byte ToS field has been almost completely unused since it was defined almost 20 years ago, DiffServ was proposed and the ToS field was redefined and superseded by the Differentiated Services (DS) field to support multiple service classes [6,9]. Basically, the DiffServ standard maintains backward compatibility with RFC 791, but allows more efficient use of [b3b4b5]. That is, Precedences 7 and 6 remain the same; Precedence 5 is redefined as "Expedited Forwarding (EF);" Precedences 4–1 define four "Assured Forwarding (AF)" classes; and Precedence 0 identifies the best-effort traffic. In addition,

[b3b4] adds further priority granularity to the AF classes by specifying different dropping probabilities. Finally, [b5] is always 0 and [b6b7] is still ignored.

As we can see, both the ToS and the DS definitions ignore the last two bits ([b6b7]). We can use these two bits to identify the traffic sets in our k-set TE method. On one hand, since these two bits are not interpreted even by DiffServ-enabled routers, our k-set TE method will not interfere with DiffServ or any other existing router configurations. On the other hand, since we focus only on an intra-domain environment, configuring routers to recognize the traffic sets (the unused bits in the ToS/DS field) is implementable. Finally, although only 2 bits can be used (supporting up to 4 traffic sets), we will show in Section 6 that 4 traffic sets are sufficient.

Another choice is to use [b3b4b5] when [b0b1b2] = [0 0 0]; i.e., we use [b3b4b5] to identify traffic sets only for best-effort packets. This is because these three bits are not (or rarely) used by best-effort traffic. In this case, we can split only best-effort traffic to achieve load balancing. However, in light of the dominant amount of best-effort traffic in today's Internet, it is still a feasible choice.

5.5.2. Heuristic algorithms for the k-set TE problem

In Section 5.3, we proposed one heuristic algorithm for the k-set TE problem. Since the routes (overlays) and traffic allocation are pre-computed by the network operator [3], we may apply more advanced heuristic search techniques, such as simulated annealing, tabu search, or genetic algorithms, to handle the problem. However, according to our simulation results, the performance of the heuristic algorithm is not that critical. As long as the resultant k overlays are fairly good, the overall performance of the k-set TE method (with respect to the cost $\hat{\Phi}_k$) will be sufficiently close to the general optimum $\Phi^*$. This is due to the optimization of the traffic allocation at Step 2, after the k overlays are found.

5.5.3. Traffic allocation and routing table computation

Our k-set TE method pre-computes the traffic allocation (splitting ratios at each edge node) after finding the k virtual overlays (one for each traffic set). The network operator should distribute these splitting ratios to the edge nodes. We suppose that each edge node is responsible for enforcing its splitting ratios. Core nodes do not need to know the traffic allocation because no traffic splitting takes place there. Their responsibility is to classify packets into traffic sets and forward them to the corresponding next hops based on both the destination addresses and the ToS/DS field. This multi-field packet classification and routing is feasible due to the latest advances in IP router design [23,24].

There are two ways to handle the routing table construction. Recall that all the routes are pre-computed by the network operator. The first way is therefore to distribute the routing tables directly to each node in the network. However, this centralized method may not scale well. It may also handle link failures poorly, because when failures happen, all the routing tables need to be recomputed centrally by the network operator and re-distributed to all the routers.

In light of these drawbacks of the centralized approach, the second way is based on a decentralized mechanism. Note that in our k-set TE method, we find the routes/overlays before we allocate traffic onto them (see footnote 9). More importantly, only static weights are used to find the routes/overlays. As we can see in Algorithm 1, only the link capacities $\{c_{ij}\}$ and the traffic matrix $D = \{d_{ij}\}$, which are assumed to remain static, are needed to determine the k sets of link weights used to find the k overlays and the associated routes. Therefore, the network operator can assign the k sets of weights (which are all static) to the links in the network. The link weights are broadcast among nodes as in a classic OSPF network. The only difference is that k weights are associated with each link instead of just one as in the classic OSPF network.
Then each node can reconstruct the routes for each overlay (by shortest-path search based on the appropriate set of weights) and build its own routing tables accordingly. This decentralized method can handle link failures as the classic OSPF network does. Note that this method does require a consistent tie-breaking rule at every node in order to reconstruct the original overlays/routes, but this requirement is not difficult to meet. For example, node IDs can be used for this purpose, assuming that they are unique within the same network domain.

Footnote 9: This is opposite to the method proposed in [10], where traffic allocations are first determined and then link weights are resolved by solving associated linear programs.

5.5.4. Coexistence with legacy nodes

In the k-set TE method, we assume that all nodes can interpret the ToS/DS field in packets' IP headers and find the correct next hops based on both the destination addresses and these bits. Legacy nodes, however, may not be able to do so. If there are legacy nodes in a network, the k-set TE method should be used more carefully. One way to handle this issue is to add more constraints to our linear programs to accommodate such legacy nodes. The performance might not be as good as in a network without legacy nodes, though. Another interesting issue associated with coexistence is how to find the critical nodes on which to deploy our new method to obtain the maximal performance gain, if we are allowed to deploy the k-set method on only part of the nodes in a network. We will study both issues in our future work.
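The marking scheme of Section 5.5.1 and the per-set forwarding of Section 5.5.3 can be sketched as plain bit manipulation. This is an illustrative sketch only: the function names, the encoding of traffic set a as the value a − 1 in bits [b6b7], and the table layout are assumptions, and it presumes, as the text does, that these two bits are free within the domain.

```python
TRAFFIC_SET_MASK = 0b00000011  # [b6 b7]: the two least-significant bits of the byte


def mark_traffic_set(tos_byte, traffic_set):
    """Return the ToS/DS byte with traffic_set (0..3) written into [b6 b7]."""
    if not 0 <= tos_byte <= 0xFF:
        raise ValueError("ToS/DS field is a single byte")
    if not 0 <= traffic_set <= 3:
        raise ValueError("two bits support at most 4 traffic sets")
    return (tos_byte & ~TRAFFIC_SET_MASK & 0xFF) | traffic_set


def read_traffic_set(tos_byte):
    """Extract the traffic set index carried in [b6 b7]."""
    return tos_byte & TRAFFIC_SET_MASK


def next_hop(routing_tables, destination, tos_byte):
    """Core-node lookup keyed by (traffic set, destination).

    routing_tables[a][destination] is the next hop for traffic set index a.
    """
    return routing_tables[read_traffic_set(tos_byte)][destination]


# Hypothetical usage: an EF-like byte (precedence 5) marked with traffic set index 2.
marked = mark_traffic_set(0b10111000, 2)
tables = [{"t": "i"}, {"t": "j"}, {"t": "k"}, {"t": "i"}]
hop = next_hop(tables, "t", marked)
```

The upper six bits are left untouched, which mirrors the argument in the text that the marking does not disturb the precedence or DiffServ interpretation of the byte.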

6. Performance evaluation of the heuristic k-set TE method

6.1. Simulation setup

We use the topology generator BRITE [21] to randomly generate our network topologies, where the Waxman model is used and the nodes are placed according to the heavy-tail distribution. Edge nodes are randomly selected from the entire node set. Link capacities are randomly generated in the interval [10, 1024].

The traffic demand generation among all the edge nodes is more complicated. We use two different methods in our study: one is based on random demand generation using uniform distributions; the other uses random demand generation with modeling of hot spots [3]. The second method is supposed to be more realistic and to have larger variation.

Method 1. In this method, for each pair of edge nodes x and y, we first perform a Dijkstra-based routing to find the capacity of the widest path from x to y, denoted by maxCap(x, y). Then the traffic demand from x to y is randomly generated within [1.0, maxCap(x, y)] following a uniform distribution.

Method 2. We use the demand generating method proposed by Fortz and Thorup in [3]. Briefly, for each node x, we generate two random numbers $O_x, D_x \in [0, 1]$. Further, for each pair (x, y) of nodes, we pick a random number $C_{(x,y)} \in [0, 1]$. Then the traffic demand from x to y is determined by

$$\alpha\, O_x D_y C_{(x,y)}\, e^{-d(x,y)/2\Delta},$$

where $\alpha$ is a scaling parameter, d(x, y) is the Euclidean distance from x to y, and $\Delta$ is the largest Euclidean distance between any pair of nodes. "The $O_x$ and $D_y$ model that different nodes can be more or less active senders and receivers, thus modeling hot spots on the net" [3]. The factor $e^{-d(x,y)/2\Delta}$ models the fact that there is higher demand between close pairs of nodes. This method generates traffic matrices with higher variations.

To simulate different traffic demand loads, we first use one of the above methods to generate a base traffic matrix. Then we scale the base traffic matrix up or down by different ratios to generate traffic matrices with the desired load. The same scaling scheme was also used in [3] and [4].

Since the distribution of link capacities is critical in traffic engineering, we design the first two simulation cases by varying the assignment of link capacities. In both cases, the total number of nodes is 100, 70% of which are edge nodes. The edge nodes are randomly selected from the entire node set. The number of links is 400. In the first case, the link capacities are randomly generated in the interval [10, 1024]. We use this case to mimic a very heterogeneous link capacity environment. In the second case, we let all the links be equally capacitated, which is an extreme case of very high homogeneity. From the proof of Theorem 2, we suspected that our approach favors heterogeneous networks and might not work very well in a homogeneous environment. However, the simulation results show clearly that our k-set TE method performs consistently well in both cases, even when k is small. We test k = 2 and k = 4 for each case. When k = 4, the k-set TE method almost achieves the general optimum $\Phi^*$ (where arbitrary traffic splitting is allowed).

In the third simulation case, we increase the number of nodes, edge nodes, and links to 200, 100, and 800, respectively, to study the performance in a larger-scale network. In this case, we also test the performance under two different traffic matrix setups. Table 2 summarizes the three simulation cases that we focus on.

To evaluate the performance, we are interested not only in the overall cost of the network $\Phi$ (which is the objective to optimize), but also in the maximum and variance of link utilization. We find that besides $\Phi$, the variance of link utilization is also a good metric for evaluating how well a TE approach performs. A good approach should be able to balance traffic among all links; hence the utilization variance should be small. Again, we use ILOG CPLEX as the tool to solve the traffic allocation linear program at Step 2 in Section 5.3.

6.2. Simulation results and analysis

6.2.1. Case 1. A heterogeneous environment with link capacities randomly generated within [10, 1024]

Fig. 6 shows the simulation results of Case 1. The x-axis represents the overall input traffic load, scaled and normalized by the total capacity (the sum of the capacities of all links). In this case, we first observe that the hop-count shortest path method ("Hop-cnt SP" in Fig. 6) performs extremely badly. The max link utilization goes beyond 1, even when the traffic demand is very light. Tracing the result, we find that there are a few "stringent" links in the network that have very small capacities but are in critical positions; i.e., they are on shortest paths between some edge nodes. Therefore, even when the traffic load is light, these stringent links are congested, making the overall cost and max link utilization large. This result demonstrates the effectiveness and necessity of traffic engineering.

Fig. 6a presents the cost $\Phi$ versus the scaled traffic load. For comparison, we also show the result of the Inverse Capacity OSPF (InvCap) in the graph. It is equivalent to the k-set method with k = 1. All curves have an exponential-like shape because of the cost function we use (Eq. (1)). As we can see, our k-set method (k = 2, 4) consistently outperforms both Hop-cnt SP and InvCap. Moreover, with k = 4, it approximates the general optimum. Even when k = 2, its performance is still acceptable. Fig. 6b and c show the maximum and variance of link utilization, respectively. They also confirm the effectiveness of the k-set method. Lastly, recall that the extra overhead incurred by the k-set method is only 1 or 2 bits in the IP header if k = 2 or 4.

6.2.2. Case 2. A homogeneous environment with equal link capacities

We design this simulation case because we suspected that the k-set method might degrade if link capacities are homogeneously distributed. Here we consider the extreme case where all links have the same capacity. Then Hop-count SP and InvCap are equivalent because all links have the same weight in terms of the reciprocal of their capacities. This is confirmed in Fig. 7, where both methods produce the same curves with respect to all metrics. As in Case 1, the k-set method successfully approximates the general optimum under all traffic load conditions with respect to the cost, the max link utilization, and the variance of link utilization. All results in the three sub-figures verify that the k-set method is still effective even when links have homogeneous capacities.

6.2.3. Case 3. A larger scale network with link capacities randomly generated within [10, 1024]

We study performance issues on larger-scale networks in this part. The topology setup is the

Fig. 6. BRITE 100 nodes, 70 edge nodes and 400 links, very heterogeneous link capacities. (a) Cost ($\Phi$), (b) maximal link utilization and (c) variance of link utilization. [Each panel plots OPT, Hop-cnt SP, InvCap, 2-set and 4-set against the total traffic demand (normalized by the total capacity).]

same as in Case 1 except that the number of nodes, the number of edge nodes and the number of links are increased to 200, 100, and 800, respectively. Link capacities are randomly picked in [10, 1024]. In order to examine the performance of the k-set TE method under different traffic distributions, we design two sub-cases (Case 3.1 and Case 3.2) in which different traffic demand generation methods are used. In Case 3.1, the traffic demand between each pair of edge nodes, x and y, is generated uniformly at random within the interval [1.0, maxCap(x, y)] (see Method 1, Section 6.1), while in Case 3.2, the more sophisticated scheme is used, with larger variations and the modeling of hot spots (Method 2, Section 6.1). Fig. 8a and b depict the simulation results for Case 3.1 and Case 3.2, respectively. As we can observe from both figures, the k-set TE method achieves results very close to the general optimum (OPT) when k is set to 2 or 4, no matter which traffic distribution is used. We can then conclude that our k-set TE method performs consistently well even in large-scale networks under different traffic distributions.
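The hot-spot demand model used in Case 3.2 (Method 2 of Section 6.1) and the load scaling step can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and data layout are hypothetical, node positions are given as 2-D coordinates, Python's generator draws from [0, 1) rather than [0, 1], and "load" is taken to mean total demand divided by total link capacity.

```python
import math
import random


def hot_spot_demands(positions, alpha, rng):
    """demand(x, y) = alpha * O_x * D_y * C_xy * exp(-d(x, y) / (2 * Delta))."""
    nodes = list(positions)
    sender = {x: rng.random() for x in nodes}     # O_x: activity as a sender
    receiver = {x: rng.random() for x in nodes}   # D_x: activity as a receiver
    # Delta: the largest Euclidean distance between any pair of nodes.
    delta = max(math.dist(positions[x], positions[y]) for x in nodes for y in nodes)
    demand = {}
    for x in nodes:
        for y in nodes:
            if x == y:
                continue
            c_xy = rng.random()                   # per-pair random factor C_(x,y)
            distance = math.dist(positions[x], positions[y])
            demand[(x, y)] = (alpha * sender[x] * receiver[y] * c_xy
                              * math.exp(-distance / (2 * delta)))
    return demand


def scale_to_load(demand, total_capacity, load):
    """Rescale a base matrix so that total demand / total capacity equals load."""
    factor = load * total_capacity / sum(demand.values())
    return {pair: value * factor for pair, value in demand.items()}


# Hypothetical usage with three nodes on a line and a seeded generator.
positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (6.0, 8.0)}
base = hot_spot_demands(positions, alpha=100.0, rng=random.Random(1))
scaled = scale_to_load(base, total_capacity=500.0, load=0.2)
```

Scaling a single base matrix, rather than regenerating demands for every load level, keeps the relative demand pattern fixed across the points of each curve.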

Fig. 7. BRITE 100 nodes, 70 edge nodes and 400 links, equal link capacities. (a) Cost ($\Phi$), (b) maximal link utilization and (c) variance of link utilization. [Each panel plots OPT, Hop-cnt SP, InvCap, 2-set and 4-set against the total traffic demand (normalized by the total capacity).]

6.2.4. Some discussions

(1) An interesting observation from Figs. 6b and 7b is that the general optimal method (OPT) does not always achieve the optimal max link utilization. This is reasonable because the general optimal method is optimal only with respect to the overall network cost $\Phi$, not the max link utilization. Fig. 9 illustrates a very simple but convincing example. If every link has capacity 1 and the only traffic demand, from s to t, is 1, then the general optimal TE solution is to put 1/3 of the traffic onto ⟨s, a, t⟩ and 2/3 onto ⟨s, t⟩. The total cost is 2 ($\Phi = 2\phi(1/3) + \phi(2/3) = 2$) and the max link utilization is 0.667. However, if we put 1/2 of the traffic onto each path, then the total cost is 2.5 ($\Phi = 3\phi(1/2) = 2.5$), but we get a lower max link utilization of 0.5. In fact, this holds not only for the max link utilization: for some other metrics, such as the average link utilization, the OPT method does not necessarily achieve optimal results either.

(2) All simulation results show that the k-set method can achieve performance very close

Fig. 8. Larger scale network: BRITE 200 nodes, 100 edge nodes and 800 links, heterogeneous link capacities, with different traffic distributions. (a) Cost ($\Phi$): the traffic demands among edge nodes are randomly generated according to a uniform distribution. (b) Cost ($\Phi$): the traffic demands among edge nodes are randomly generated with hot spots [3] (larger variations). [Both panels plot OPT, Hop-cnt SP, InvCap, 2-set and 4-set against the total traffic demand (normalized by the total capacity).]

Fig. 9. Example to show why general optimal method does not achieve optimal max link utilization. [Left, optimal w.r.t. cost: 0.667 on ⟨s, t⟩ and 0.333 on each link of ⟨s, a, t⟩. Right, non-optimal w.r.t. cost: 0.5 on every link.]

to the general optimum when k = 4. This strongly suggests that in real OSPF networks, k = 4 is large enough to obtain good near-optimal performance and no larger k is needed.

(3) Finally, in terms of route computation and traffic allocation, the k-set method enjoys very low overhead. At Step 1 (searching overlays), it takes only k times as long as Dijkstra's algorithm to converge; at Step 2 (traffic allocation), it takes only polynomial time to solve the linear program. In our 100-node cases, it typically takes seconds for both steps to converge. The worst case of Step 2 takes several minutes. Our computation platform was a SUN Blade-1000 workstation with a 600 MHz SPARCV9 CPU.
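The arithmetic of the Fig. 9 example in item (1), and the total cost quoted for the Fig. 5 example in Section 5.4, can be checked with a few lines of Python. The sketch assumes that Eq. (1) is the Fortz-Thorup piecewise-linear cost with the breakpoints listed below; the function and variable names are illustrative only.

```python
from fractions import Fraction as F

# Assumed Eq. (1): (segment start, slope) pairs of the piecewise-linear cost.
SEGMENTS = [(F(0), 1), (F(1, 3), 3), (F(2, 3), 10),
            (F(9, 10), 70), (F(1), 500), (F(11, 10), 5000)]


def phi(u):
    """Integrate the slope from 0 up to utilization u."""
    cost = F(0)
    for idx, (start, slope) in enumerate(SEGMENTS):
        end = SEGMENTS[idx + 1][0] if idx + 1 < len(SEGMENTS) else u
        cost += slope * max(F(0), min(u, end) - start)
    return cost


def evaluate(share_direct):
    """Fig. 9: unit-capacity links (s,t), (s,a), (a,t) and unit demand from s to t."""
    share_detour = 1 - share_direct
    utilizations = [share_direct, share_detour, share_detour]
    return sum(phi(u) for u in utilizations), max(utilizations)


cost_opt, max_util_opt = evaluate(F(2, 3))     # cost-optimal split
cost_even, max_util_even = evaluate(F(1, 2))   # even split
fig5_cost = 3 * phi(F(2, 3)) + 4 * phi(F(1, 3))  # total cost quoted in Section 5.4
```

Under these assumptions the cost-optimal split has the lower cost but the higher maximum utilization, which is exactly the trade-off the item describes.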

7. Conclusion and future work

Traffic engineering is crucial for load balancing and QoS provisioning in today's Internet, especially in OSPF networks. Almost all existing approaches rely on the ECMP feature of OSPF and split traffic anywhere in a network. In this paper, we propose a novel, edge-based traffic engineering approach for OSPF networks, which classifies and splits traffic only at the network edge instead of in the core. Its performance was studied both theoretically and by simulation. The results showed clearly that our approach performs well even with a very small number of traffic sets (2 or 4).

Finally, we summarize our contributions in this paper: (1) We presented a new approach to traffic engineering in OSPF networks. It is simple, flexible and efficient. It strictly follows the "smart edge, simple core" design rule of the Internet. It is also beneficial to TCP applications because it keeps the "same path for same flow". (2) We proved that our k-set approach achieves the general optimum if k is large enough. On the other hand, if k is small, we provided a constant worst-case performance bound. (3) We developed a new heuristic algorithm to solve the k-set traffic allocation and routing problem behind our new approach. Simulation results confirmed its effectiveness.

For future work, we would like to investigate the two-fold coexistence problem with legacy nodes in a network. On one hand, we will modify our approach to accommodate legacy nodes where the new approach is not yet deployed. Performance will be evaluated in the new scenario. On the other hand, if we are allowed to choose only some of the nodes in a network to deploy our new approach, we will find the critical nodes that achieve the maximal performance gain.

Acknowledgment

The authors would like to thank William Yurcik, National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign, for his valuable comments and discussion.

Appendix A. Proof of Theorem 1

One straightforward observation is given in Lemma 1.

Lemma 1. If we can use k traffic sets to achieve the general optimal traffic allocation (i.e., $\Phi^*_k = \Phi^*$), then we can always use any k′ > k traffic sets to achieve the general optimum, too.

This is because we can pick any traffic set and split it further into two traffic sets virtually. Physically, however, these two traffic sets follow the same routes and the overall traffic allocation remains intact. Thus, the general optimum is still achieved by the k + 1 traffic sets. We can iterate this splitting process until we reach k′ traffic sets.


Next, we show that the k-set TE method can achieve $U$ (the general optimum under arbitrary traffic splitting) if k is large enough.

Proof of Theorem 1. First, we prove the single-destination case. Assume that the general optimal traffic allocation is given and that there is only one destination node t. We can then construct a traffic graph (t-graph) by eliminating from the original topology all links that carry no traffic allocation. We then perform the following iterative operations. We pick any source node, say $s_1$, and find a path from $s_1$ to t. Such a path must exist because otherwise the flow conservation rule would be broken. Along the path we find the "bottleneck" link(s), say $e_1$, where the traffic allocation is the smallest among all the on-path links. We assign the path and this traffic allocation to a traffic set, say Traffic Set 1, and reduce the traffic allocation on each on-path link by the same amount (so the flow conservation rule still holds after the reduction). The allocation on the "bottleneck" link is reduced to 0, so the link is eliminated from the t-graph. We iterate this process until all links are eliminated from the t-graph. At that point, the traffic sets we have found together reproduce the general optimal traffic allocation. At each iteration we assign a new traffic set and eliminate at least one link. Since there are at most $|E|$ links in the t-graph, the number of traffic sets used is at most $|E|$.

Now let us consider multiple destinations. Since we are under the destination-based routing paradigm, the traffic allocations and routes for different destinations do not interfere with each other. Therefore, we can repeat the above process for one destination at a time. We may then find that for destination $t_i$, $k(t_i)$ traffic sets are needed to achieve the general optimum. By Lemma 1, $\max_i k(t_i)$ traffic sets are therefore able to achieve the general optimum for the entire network. Finally, because $k(t_i) \le |E|$, we are guaranteed to find a $k \le |E|$ that achieves the general optimum. □

Alternatively, we can prove the multi-destination case by adding a virtual destination node T to the t-graph and repeating the proof for the single-destination case.
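The iterative construction used in the single-destination case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `decompose_to_traffic_sets` and the `link_flow` dictionary are hypothetical, the allocation is assumed to be acyclic, to satisfy flow conservation, and to use integer (exactly representable) amounts. The sketch also starts each path at a node that currently receives no traffic, which is an added assumption that keeps the bottleneck amount within that node's own demand.

```python
from collections import defaultdict


def decompose_to_traffic_sets(link_flow, dest):
    """Split a single-destination link allocation into (path, amount) pairs.

    link_flow maps a directed link (u, v) to the traffic it carries toward dest.
    Assumes an acyclic allocation that obeys flow conservation (hypothetical input).
    """
    # The "t-graph": keep only links that carry traffic.
    flow = {link: amount for link, amount in link_flow.items() if amount > 0}
    traffic_sets = []
    while flow:
        out_links = defaultdict(list)
        has_inflow = set()
        for (u, v) in flow:
            out_links[u].append(v)
            has_inflow.add(v)
        # Start at a node that sends traffic but receives none (a pure source).
        start = next(u for u in out_links if u not in has_inflow)
        # Follow links that still carry traffic until the destination is reached.
        path = [start]
        while path[-1] != dest:
            path.append(out_links[path[-1]][0])
        links = list(zip(path, path[1:]))
        # The bottleneck link carries the smallest remaining allocation on the path.
        amount = min(flow[link] for link in links)
        for link in links:
            flow[link] -= amount
            if flow[link] == 0:
                del flow[link]  # at least the bottleneck link is eliminated
        traffic_sets.append((path, amount))
    return traffic_sets


# Hypothetical allocation toward destination "t" over five links.
example = {("s1", "a"): 3, ("s2", "a"): 2, ("a", "t"): 4, ("a", "b"): 1, ("b", "t"): 1}
for path, amount in decompose_to_traffic_sets(example, "t"):
    print(" -> ".join(path), amount)
```

Because every pass removes at least one link from the t-graph, the loop runs at most $|E|$ times, which mirrors the counting argument in the proof.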


[Figure B.1, two panels. Panel "TE with arbitrary splitting": all the |E| edges are used to balance the traffic from s to t. Panel "k-class TE method": only k edges (out of |E|) can be used to balance the traffic from s to t.]

Fig. B.1. Worst case scenario where only k out of |E| edges can be used by the k-set TE method to balance traffic.
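The scenario of Fig. B.1 can be explored numerically with a small sketch. The code below compares the total cost when the demand is spread evenly over all |E| equal-capacity links against the cost when only k of them can be used. It is a hedged illustration: the names `link_cost`, `total_cost` and `cost_ratio` are hypothetical, and the breakpoints of the piecewise-linear cost are an assumption borrowed from the cost function of Fortz and Thorup [3], since only the smallest slope (1) and the largest slope (5000) of Eq. (1) are stated in this part of the paper.

```python
# Assumed (utilization breakpoint, slope) pairs; only the slopes 1 and 5000
# are confirmed by the text, the rest follow Fortz and Thorup [3].
BREAKPOINTS = [(0.0, 1), (1 / 3, 3), (2 / 3, 10), (0.9, 70), (1.0, 500), (1.1, 5000)]


def link_cost(x):
    """Cost u(x) of one link at utilization x, with u(0) = 0."""
    cost = 0.0
    for i, (start, slope) in enumerate(BREAKPOINTS):
        if x <= start:
            break
        end = BREAKPOINTS[i + 1][0] if i + 1 < len(BREAKPOINTS) else float("inf")
        # Add the part of [start, end) that lies below x, weighted by its slope.
        cost += slope * (min(x, end) - start)
    return cost


def total_cost(num_links, demand, capacity):
    """Total cost when demand is split evenly over num_links parallel links."""
    return num_links * link_cost(demand / (num_links * capacity))


def cost_ratio(k, num_edges, demand, capacity):
    """Ratio U_k / U for the parallel-link scenario of Fig. B.1."""
    return total_cost(k, demand, capacity) / total_cost(num_edges, demand, capacity)


# Hypothetical numbers: 10 unit-capacity links, demand 3, only k = 2 usable links.
print(cost_ratio(2, 10, 3.0, 1.0))
```

Under these assumptions the ratio stays between 1 and 5000 for any choice of k, |E|, d and c, because every slope of the assumed cost function lies in that range.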

Appendix B. Proof of Theorem 2

We now prove that the k-set TE method has a worst-case performance bound and that the bound is tight. We prove it only for the single-source single-destination (SSSD) case, because all other cases can be converted into the SSSD case by adding one virtual source S, one virtual destination T, and a set of virtual links that connect all sources to S and all destinations to T.

Proof of Theorem 2. We construct the SSSD case shown in Fig. B.1, where all links have the same capacity c. Assume that the cost and objective functions given in Eq. (1) are used and that the total demand from s to t is d. In the general optimal traffic engineering with arbitrary splitting, the total cost will be

$$U = |E|\,u\!\left(\frac{d}{c|E|}\right) \ge |E|\,\frac{d}{c|E|} = \frac{d}{c},$$

because according to Eq. (1), the smallest value of the piecewise slope of the cost function is 1 (i.e., $u'(x) \ge 1$ and $u(0) = 0$). Similarly, we can calculate the cost if the k-set TE method is used:

$$U_k = k\,u\!\left(\frac{d}{kc}\right) \le 5000\,k\,\frac{d}{kc} = 5000\,\frac{d}{c},$$

because according to Eq. (1), the largest value of the piecewise slope of the cost function is 5000 (i.e., $u'(x) \le 5000$ and $u(0) = 0$). It follows that the worst case performance bound is 5000, i.e.,

$$\frac{U_k}{U} \le 5000.$$

It is easy to verify that this is the worst case scenario, because if the links have different capacities or are shared by multiple paths, then our k-set TE method can obtain better results by choosing better paths (with larger capacities). Finally, since the worst case scenario is reachable, we have proved both the upper bound and the tightness of the bound. □

References

[1] J. Moy, OSPF Version 2, RFC 2328, April 1998.
[2] G. Apostolopoulos, D. Williams, S. Kamat, R. Guerin, A. Orda, T. Przygienda, QoS Routing Mechanisms and OSPF Extensions, RFC 2676, August 1999.
[3] B. Fortz, M. Thorup, Internet traffic engineering by optimizing OSPF weights, in: Proceedings of IEEE INFOCOM 2000, Tel-Aviv, Israel, 2000.
[4] A. Sridharan, R. Guerin, C. Diot, Achieving near-optimal traffic engineering solutions for current OSPF/IS-IS networks, in: IEEE INFOCOM 2003, San Francisco, CA, 2003.
[5] A. Riedl, Optimized routing adaptation in IP networks utilizing OSPF and MPLS, in: Proceedings of IEEE ICC 2003, Anchorage, Alaska, 2003.
[6] S. Blake et al., An Architecture for Differentiated Services, RFC 2475.
[7] K. Nichols, V. Jacobson, L. Zhang, A Two-bit Differentiated Services Architecture for the Internet, RFC 2638.
[8] Internet Protocol, RFC 791.
[9] F. Baker, D. Black, S. Blake, K. Nichols, Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers, RFC 2474.
[10] Y. Wang, Z. Wang, L. Zhang, Internet traffic engineering without full mesh overlaying, in: Proceedings of INFOCOM 2001, 2001.
[11] Y. Wang, Z. Wang, Explicit routing algorithms for internet traffic engineering, in: Proceedings of ICCCN'99, 1999.
[12] D. Awduche et al., Requirements for Traffic Engineering Over MPLS, RFC 2702, September 1999.
[13] J. Wang, S. Patek, H. Wang, J. Liebeherr, Traffic engineering with AIMD in MPLS networks, in: Proceedings of the Protocols for High-Speed Networks Workshop, 2002, pp. 192–210, citeseer.nj.nec.com/512747.html.
[14] S. Yilmaz, I. Matta, On the scalability-performance tradeoffs in MPLS and IP routing, in: Proceedings of SPIE: Scalability and Traffic Control in IP Networks, 2002.
[15] J. Wang, K. Nahrstedt, Hop-by-hop routing algorithms for premium traffic, ACM Computer Communication Review 32 (5) (2002) 73–88.
[16] J. Wang, K. Nahrstedt, Hop-by-hop routing algorithms for premium-class traffic in DiffServ networks, in: Proceedings of IEEE Infocom 2002, 2002.
[17] J. Wang, L. Xiao, K.-S. Lui, K. Nahrstedt, Bandwidth sensitive routing in DiffServ networks with heterogeneous bandwidth requirements, in: Proceedings of IEEE ICC 2003, Anchorage, Alaska, 2003.
[18] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, Deriving traffic demands for operational IP networks: Methodology and experience, IEEE/ACM Transactions on Networking 9 (2001) 265–278.
[19] Y. Zhang, M. Roughan, C. Lund, D. Donoho, An information-theoretic approach to traffic matrix estimation, in: Proceedings of ACM SIGCOMM'03, Karlsruhe, Germany, 2003.
[20] Y. Zhang, M. Roughan, N. Duffield, A. Greenberg, Fast accurate computation of large-scale IP traffic matrices from link loads, in: Proceedings of ACM SIGMETRICS'03, San Diego, CA, USA, 2003.
[21] A. Medina, A. Lakhina, I. Matta, J. Byers, Brite: an approach to universal topology generation, in: Proceedings of the International Workshop on MASCOTS '01, Cincinnati, Ohio, 2001.
[22] ILOG CPLEX, in: http://www.ilog.com/products/cplex/.
[23] P. Gupta, S. Lin, N. McKeown, Routing lookups in hardware at memory access speeds, in: IEEE Infocom 1998, San Francisco, vol. 3, 1998, pp. 1240–1247.
[24] P. Gupta, N. McKeown, Packet classification on multiple fields, ACM Computer Communication Review 29 (1999) 147–160.

Jun Wang received the B.S. and M.Engr. degrees in computer science and technology from Tsinghua University, Beijing, China, and the Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign in 2003. From 2003 to 2004, he was a postdoctoral associate with the National Center for Supercomputing Applications (NCSA) and the Department of Computer Science at the University of Illinois at Urbana-Champaign. He is currently a senior security engineer at NCSA, University of Illinois at Urbana-Champaign. His research interests include computer networks and data communications, network survivability and security, network QoS, multimedia systems, and distributed systems.

Yaling Yang received the B.S. degree in telecommunications from the University of Electronic Science and Technology of China, Chengdu, Sichuan, China, in 1999. She is currently a Ph.D. student at University of Illinois at Urbana-Champaign. Her research interests include network QoS and resource management.

Li Xiao (S'02) received his B.S. and M.Eng. degrees in automatic control from Tsinghua University, Beijing, China. He also received the M.S. degree in computer science from the University of Illinois at Urbana-Champaign, where he is currently a Ph.D. candidate. His research interests are computer networks and data communication, with a focus on network routing, QoS and network resilience.

Klara Nahrstedt is an associate professor at the University of Illinois at Urbana-Champaign, Computer Science Department. Her research interests are directed towards multimedia middleware systems, quality of service (QoS), QoS routing, QoS-aware resource management in distributed multimedia systems, and multimedia security. She is the coauthor of the widely used multimedia book "Multimedia: Computing, Communications and Applications" published by Prentice Hall, and a recipient of the Early NSF Career Award, the Junior Xerox Award, and the IEEE Communication Society Leonard Abraham Award for Research Achievements. Since 2001, she has been the editor-in-chief of the ACM/Springer Multimedia Systems Journal and the Ralph and Catherine Fisher Associate Professor. She received her BA in mathematics from Humboldt University, Berlin, in 1984, and her M.Sc. degree in numerical analysis from the same university in 1985. She was a research scientist in the Institute for Informatik in Berlin until 1990. In 1995 she received her Ph.D. from the University of Pennsylvania in the Department of Computer and Information Science.
