ITC 18 / J. Charzinski, R. Lehnert and P. Tran-Gia (Editors) 9 2003 ElsevierScience B.V. All rights reserved.
Optimal LSP Placement with
QoS C o n s t r a i n t s
11
in D i f f S e r v / M P L S
Networks J. M. Garcia ~, A. Rachdi~and O. Brun ~* ~LAAS - CNRS, 7 Avenue du Colonel Roche, 31077 Toulouse Cedex 4 E-mail: { brun, jmg, marachdi }~laas.fr A key problem in the design of communication networks is the planning of bandwidth allocation to communication demands. Shortest path routing does not lead to good network performances due to the difficulty to obtain a good load-balancing in the network. The ability to define LSP routes in MPLS core networks allows traffic engineerings approaches and precise optimisation of bandwidth utilisation and QoS. Moreover the optimisation process can also provides optimal backup routes in case of link failures. In this paper we propose a new 3-level optimisation approach: ILSP (Iterative Loading Shortest Paths), OLS (Optimal Load Sharing) and ACO (Ant Colony Optimisation). This method is applied to the problem of optimal placement of primary and backup LSPs in DiffServ/MPLS networks.
1. I n t r o d u c t i o n With the recent evolution of the Internet into a global communication infrastructure, there is a growing need to support more sophisticated services than the traditional besteffort service. In order to meet a broad range of Quality-of-Service (QoS) requirements, the next generation Internet will combine the Multi Protocol Label Switching technique (MPLS) and the Differentiated Services architecture (DiffServ). Together, MPLS and DiffServ provide scalable QoS support by aggregating individual IP flows into aggregate flows and then providing different service levels to those aggregate flows according to packets class of service. In MPLS, labels are attached to IP packets at an MPLS edge router. These labels are then used by core network MPLS routers to send the IP packets on predefined paths, called Label Switched Paths (LSP), according to their QoS requirements. The LSP between two routers can be the same as the layer 3 hop-by-hop route. The LSP can also be obtained using constraint-based routing which considers not only network topology, but also resource availability of the links and possibily other policies specified by the network administrator (e.g. avoid satellite links for delay-sensitive flows). The ingress MPLS *This work was supported by the "R~seauNational de Recherche en T~l~communication" (RNRT) within the OPIUM and ESQUIMAUX projects.
12 router can also specify an explicit route (ER-LSP) for the LSP. Finally, the ingess router can map different traffics onto different LSPs depending on the Forwarding Equivalence Class (FEC) and on the class of service (COS) The ability to setup off-line computed ER-LSPs is one of the most useful features of MPLS since it allows to route traffics on optimal paths taking into account not only administrative constraints, but also traffic engineering rules and QoS requirements. In this paper, we address the problem of the optimal LSP placement. This problem can be stated as follows: 9 G i v e n a network of N nodes and M links represented by a graph G = [X, U], the link bandwidth capacities C~ for all u E U and a set of K LSP placement demands, where each demand k = 1 , . . . , K is defined by its source node s(k), its destination node t(k), its class of service c(k) and its bandwidth requirement d k, 9 F i n d one and only one route for each demand k = 1 . . . K so that the bandwidth requirements of the demands and the administrative constraints are simultaneously satisfied within the resource capacities of the links. In other words, if we let p(k) be the number of paths in G having endpoints s(k) and t(k), and P~ be the ith path between source node s(k) and destination node t(k), for all i -- 1 . . . p(k), then the problem amounts to finding a set of mono-routing decision variables xjk E {0,1} such that constraints on link capacity and flow conservation are satisfied: K p(k) E
k=l j=l
p(k)
k d k <_ C ~
,
u=l..
.
M
and
~xj
j=l
k dk = dk
,
k = 1
...
K
(1)
where we have used the same notation to denote the path pjk and the vector pk defined by p k ( u ) = l if u E pk and 0 otherwise. This combinatorial problem is known to be NP-hard [1]. Several heuristics were developed to solve similar problems for virtual paths placement in ATM networks [2,4-6,3,7,1]. With respect to these prior works, the main advantage of the 3-level LSP placement algorithm presented in this article is that it takes into account the LSPs QoS requirements and provide an even distribution of traffic on network links. Moreover, another important feature of this algorithm is that in case of allocation failure of one or more LSP (lack of bandwidth), explanations and correcting measures are provided. 2. T h e T h r e e - L e v e l L S P P l a c e m e n t A l g o r i t h m Let x k = (Xjk)j=l...p(k) and x = (xk)k=l...K. The vector x represent a solution of the problem. Moreover, let yk _ X~j pk(u)xj k d k and y~ = x~K=I yk. Then yk is the capacity used by LSP k on link u, while y~ denotes the whole capacity used by all LSPs on link u. We define the cost of a solution x as follows: M
-
F(v,,, G ) u--1
(2)
13 where the function F is given by (A1 and A2 > > A1 are two adjustable parameters):
F(y,C)-{AlY2 Aly2
+A2(y-C)
if 2 if
y<_C y>C
(3)
The problems amounts to find a solution x which minimize F(x) and satisfy the constraints. The algorithm has three main steps: ILSP, OLS and ACO.
2.1. Iterative Loading Shortest Paths (ILSP) The efficiency of any LSP placement algorithm will be greatly reduced by the huge number of paths connecting the end-points of each demand in a real-sized network. Therefore, we propose to select only a subset of candidate paths for each LSP k = 1 . . . K. Let "Pk be the set of candidate paths for LSP k. The general idea of the path selection algorithm we have used is to progressively load the network with fractions of demands in R iterations (e.g. R = 10). At each iteration j, a candidate path P~, is found for each LSP k. At the beginning of each iteration j, we have t2j = {k I k = 1 . . . K}. During iteration j, the following steps are repeated K times: 1. Select a LSP k E f~j according to a uniform distribution, 2. Route demand dk/R on its shortest path pjk according to the metric defined by F(yu, Cu) (Dijkstra's algorithm). This path is the path which leads to the smallest cost increase. 3. Do Pk := T'k U {pk, } and ftj "= aj - {k}. 4. Update link loads Yu by propagating demand dk/R along path pk,, i.e. increment Yu by dk/R for all link u E P~,. This path selection algorithm allows to greatly reduce the number of paths since it yields at most R paths by demand (the same path can be discovered several times). 2.2. O p t i m a l l o a d - s h a r i n g r o u t i n g (OLS) The optimal load-sharing routing can be formulated as follows: M
F(x) = ~ F(yu, Cu)
Minimize
u--1
subject to"
(4)
p(k) k--1
E j=l
Xj
k>0
Xj__
,
j=l
k--1. ...
. .
K
p(k) , k = l
...
K
where p(k) - 7)kl has been computed in the ILSP step. The problem 4 can be solved using any feasible directions method, for instance the projected gradient algorithm. It is very easy to show that iteration q of the projected gradient algorithm proceeds as follows:
a(q + xj
1) = xj(q) k
{
0C
- L, (p(k) - l) ox]
(5) iCj
14 The projected gradient algorithm allows to determine very efficiently the optimal loadsharing routing solution ( ~ ) (optimal with respect to the subset of selected paths). However, this solution is not a feasible solution of the LSP placement problem since LSPs are routed on several routes. Nevertheless, this load-sharing solution provides a good starting point to an ACO algorithm. Moreover, it allows to reduce the combinatory of the problem by removing the paths pk such that =kxj- 0 from the set of candidate paths Pk.
2.3. Ant Colony Optimization (ACO) The foraging behavior of real ants has been used by several authors to define a new population-based heuristic approach to combinatorial optimization problems, allowing positive feedback (equivalent to pheromone) to be used as the primary search mechanism [8-10]. In our ACO algorithm, ants recursively build solutions to the LSP placement problem by a random exploration of the solution space. The initial probability distribution of ants is initialised by the optimal load-sharing values x~ found in the OLS step. During iteration t, each ant 1 = 1 , . . . , m builds a solution in K steps by routing demands on a LSP by LSP basis, starting from a randomly chosen LSP and moving from LSP to LSP according to a probabilistic transition rule. More precisely, during iteration t ant 1 iterates on the following steps until all flows are routed (at the beginning of iteration t,
~ [ = {k l k = l . . . K } ) .
1. Random selection of LSP k E gt~ to be routed according to a uniform distribution. 2. Random selection of a path to route LSP k:
Plant l routes LSP k on path pk] = p(k)
(6)
i=1
where:
9 rlk = 1/~ue~k F(yu + d k, Cu) is the increase in cost due to the routing of LSP k on path P~. This heuristic information represents the heuristic desirability of routing LSP k on path pk.
9 Tk(t) is the amount of virtual pheromone trail associated to the routing of LSP k on path pjk. Pheromone trail is updated at the end of each iteration t and is intented to represent the learned desirability of routing LSP k on path pk. It reflects the experience acquired by ants during problem solving. The two adjustable parameters a and/3 control the relative weights of trail intensity ~'k(t) and visibility r/k. 3. Update of link loads by propagating the demand d k along the randomly selected path
P/it(t).
15 Each ant 1 repeats K times the previous steps to build a solution. Once all ants have build a solution, the amounts of virtual pheromone trail associated to the routing decisions are updated in the following way:
m Tk (t + 1) -- (1 -- p) Tk (t) + ~ ATk'' (t)
(7)
l=1
where p is the coefficient of decay of the trail intensity (to ensure an efcient solution space exploration), and AT~ 't (t) is the quantity of pheromone associated by ant 1 to the routing of LSP k on path pjk; It depends on the cost of the solution build by this ant:
i Q/EMr(
dq, Cu)
E
if
Pf:'(t)= pk (s)
0
otherwise
where the denominator is the cost of the solution build by ant 1 and Q is a parameter (as noted in [8] for the TSP problem, the value of Q only weakly influences the final result; it is set to the value of the cost of the optimal load-sharing routing solution). Initially, the pheromone trail T~(0) are set to To ~ , where To is a small positive constant and --kxjis the fraction of LSP k routed on path pjk by the optimal load-sharing solution. 3. M P L S QoS M o d e l w i t h DiffServ R o u t e r s The previous model can be extended to a problem where LSPs are subject to QoS constraints. The QoS constraints described below deal with end-to-end delays and end-to-end losses in a DiffServ-capable domain. Let us denote c = 1 . . . 4 the four traffic classes that are scheduled on DiffServ router interfaces. Class c = 1 corresponds to the EF (Expedited Forwarding) DiffServ traffic class. Classes c = 2, 3, 4 correspond to DiffServ classes AF1, AF2, AF3 (Assured Forwarding) and Best-Effort.
h : m .... \
~
- l
}
-
]
OutputInterface2 Output Link2
Figure 1. Queueing model of a network resource.
In the following we assume that 4 queues are used for each interface. As depicted in figure 1, packets arriving at a router are classified according to their MPLS label and/or to their
16 DSCP field (Differentiated Service Code Point) and forwarded to the appropriate queue of the output interface. The queues have different scheduling priorities. Class 1 packets are forwarded to the first queue which is scheduled with priority queueing (PQ). Classes 2, 3 and 4 packets are scheduled using the W F Q algorithm with respective weights a2, a3, a4. Let demand k belongs to a class c traffic. Let T k and B k be the end-to-end delay and loss for demand k. Under an independance assumption, T k and B k can be computed in terms of delays and losses on each router interface belonging to path j as follows:
Tk = E Xjk E T~c and J
B ~ = 1 - ~ xjk I I
uel%k
J
(1 - B~)
(9)
uePf
If T ~ and B~ denote respectively the bounds on end-to-end delay and loss (maximum acceptable values) for class c, then T k and B ~ have to satisfy 9 T k _< T ~(k) and B k _< ~ ( k ) . Let y~C = ~kec ~j/~eP~ k xjk d k denote the total rate of the traffic belonging to class c and going through link (or interface) u and y~ = y~ + y~ + y3 + y4 the total traffic rate on that link. Let z~ denote the average load of the queueing system associated to class c and link interface u (average number of packets), and let zu = zu1 + z~2 + z~3 + z~4 be the total average load of interface u. Recent measures on traffic aggregation (voice and video codecs) within the EQUIMAUX projet have shown that a superposition of more than 15 traffics can be well approximated by Poisson traffics under nominal utilisation (less than 40%). Such an approximation is well suited to core network modeling. Therefore, in the following, simple M / M / 1 / N formulas will be used to evaluate z~ and z~ and the corresponding values T~. In order to compute packet loss probabilities, a rough approximation is to assume a global buffer. Let N~ be the buffer size of link u; we get:
B~ = Bu = (1 - p) . pN/(1 -- pN+l) Vc
where p = yu/C~
(10)
Now, let p~ - 1 _ y~(1 1 - Bu)/C~, and ~ 9 = y~,(1 - Bu)/C~ for i = 2 , 3, 4, where C~r = C~ _ y~ is the residual bandwidth for W F Q classes. If we define ~ = ~2~=2 4 P'u and"
22'3'4 : ~ / ( 1 - ~ )
,
22,3 = (~2 + p-~)/(1 _ p~ -_2 - p~) ~3
2~'4 = (Z~ + ~ ) / ( 1 - Z~ - ~ 4 ) , 2~ ,4
=
(~ + ~)/(1
~3
then the delay for Priority class 1 traffic at interface u is: yul
i c - z~/y~,
i where z~1 - p,~/(1 - p~)
and the delays for W F Q classes at interface u is given by (see [11,12]):
T~ = z~/y~ 2 2 with
-_2 z~2 = o~2pu/(1 - ~ 2 ) + ( 1 - c~2) [~2,3,4_ ~3,4]
(11)
17 T2
3 u3 with z~3 = ct3~3/(1 zu/y
=
T: :
4 4 z~/y~
with
z~4 =
_~4 a4p~/(1
Pu)W3 _~_ (1
-
- Pu) -_4 +
- oL3) [ 2 2,3,4 - 2 2'4]
(1 __ OL4) [22,3,4 -- 22'3]
(12)
New penalty cost functions similar to the capacity cost function F(y, C) are introduced in the criterion to take into account QoS on end-to-end delay (D~) and loss (L~) for each class c = 1, 2, 3, 4. The new QoS optimisation problem to be solved can be stated as follows: M
Minimize
K
K
F(x) - ~ F(y~, C~) + ~ Dc(k)(T ~(k), -Tc) + ~ Lk(B k, -~c) u=l
subject to:
k=l
k=l
(13)
p(k) E
X jk
_ 1
,
k-1..
K
j=l X jk_ _> 0
j-1
...
p(k) , k = l
...
K
The gradient of functions Fc(T c, ...) with respect to routing variables xjk , in the projected gradient step of the algorithm is not developed here. 4. B a c k u p r o u t e s of p r o t e c t e d L S P s Some LSPs can be protected. For such LSPs, it is not only required to find optimal primary routes but also backup routes. The primary and the backup routes have to be node-disjoint. In this section, we describe an extension of the ILSP-OLS-ACO algorithm which allows the optimal placement of primary and backup LSPs in DiffServ/MPLS networks. The ILSP procedure still proceeds in R iterations. But now the steps done at iteration j are the followings (f~j - {k I k = 1 . . . K} at the beginning of iteration j)" 1. Select a LSP k E ftj according to a uniform distribution, 2. Find the shortest path P~, according to the metric defined by F(yu, Cu) for demand dk/R, 3. Do ftj := ftj - {k}. 4. If LSP k is not protected, then Pk := Pk U {P~, }. If it is protected, check the existence of at least one node-disjoint path and add P~, to Pk only if such a path exists. 5. If the path P~, has been added to Pk, update link loads yu by propagating demand dk/R along path P~,. At the end of this modified ILSP procedure, each LSP has at most R candidate paths. For each candidate path of a protected LSP, there exist a node-disjoint path. However, it may happen for some protected LSPs that no candidate path has been selected because at each iteration j no disjoint path was found for P~,. In such a case, those LSPs cannot
18
be protected and they are not routed (they are not considered in the following). Once the set of candidate paths have been built for each L S P , the OLS and ACO steps of the algorithm proceed as descibed in sections 2.2 and 2.3. Thus, an optimal primary path pk is computed for each LSP. After that, a new procedure is introduced to compute the backup routes. During this procedure, the following steps are done : 1. Sort links u by decreasing number of protected bandwidth (or another criterion). The protected bandwidth of link u is defined as the sum of the bandwidth requirements of all the protected LSPs routed throught link u (i.e. u E p,k). 2. For each link u considered in the above mentionned order, do: 2a. Remove link u from the network (breakdown simulation), 2b. Remove all LSPs routed throught link u from the network, i.e. decrement yv by dk for all link v E p,k for each LSP k such that u E p,k. 2c. Let f u be the set of protected LSPs removed from the network due to the failure of link u. Sort flu by decreasing LSP priority as a first criterion, and by decreasing bandwidth requirement as a second criterion. 2d. For each protected LSP k E flu considered in the above mentionned order, do: - Find the shortest path ~k according to the metric defined by F(yv, Cv) for demand d k (using Dijkstra's algorithm). The path ~k becomes the backup route for LSP k, while pk is its primary route. Nk
- Update link loads Yv by propagating demand d k along path Pj,. - Do gt~ := f u - ( k } . 2e. Depropagate protected LSPs concerned by the failure of link u from backup routes. 2f. Propagate all LSPs concerned by the failure of link u on their primary path p,k. At the end of the above greedy algorithm, it is guaranteed that a backup route has been found for each protected LSP. This algorithm is very fast and allow to estimate the cost of each link failure. Moreover, QoS constraints of LSPs can also be taken into account to compute LSP backup routes using the model described in section 3. 5. R e s u l t s 5.1. B a n d w i d t h A l l o c a t i o n R e s u l t s In this section, we present the results obtained on several testbed networks with the ILSPOLS-ACO algorithm without optimizing QoS of LSPs. These results are compared with those obtained using the pCALC algorithm (also known as C S P F - Constrained Shortest Path First). pCALC is an online algorithm which dynamically computes a path for each demand one at a time. It uses informations on available resources obtained using extensions of classical IGP such as ISIS-TE or OSPF-TE. The pCALC algorithm first prune off the links that do not have the required bandwidth or affinity. Then it computes a shortest path in the remaining topology between ingress and egress routers. If several shortest paths are available, the path selected is the one with the largest residual bandwidth. Results obtained with pCALC have been obtained by sorting LSPs by decreasing
19 priority as a first criterion, and by decreasing bandwidth requirement as a second criterion and by routing the LSPs in this order. Table 1 presents the testbed networks we have used for these comparisons. Results obtained are given in table 2. The first two columns of table 2 give the cost of the solution as defined by eq. 2 and the computation time required to obtain it. The other two columns give the number of LSP placement failures and the total residual bandwidth. From the results of table 2 it can be seen that the ILSP-OLS-ACO approach is more efficient than pCALC to place LSP and moreover provides more residual bandwidth that will be available for new LSP placement. Table 1 TestBed Networks. Test1 Test2 Test3 Test4 Test5 Test6
# Nodes 5 15 15 30 30 30
~ Links 6 21 22 75 75 75
# LSPs 5 49 49 92 435 947
~ Variables 7 102 93 383 1154 3048
# Solutions 4 ~ 1012 ~ 1012 ~ 1050 ~ 10 TM > 10333
Table 2 Comparison of ILSP-OLS-ACO with pCALC algorithm. Cost Test1 Test2 Test4 Test5 Test6
9.3 2010.7 5014.9 3754.3 17425.1
ILSP-OLS-ACO Time # not Total (sec) routed Residual BW 0.02 0 2 1.11 0 327 12.29 0 1280 100.09 0 1328 124.28 0 5570
pCALC Cost 9.3 1952.6 5999.5 4866.9 18025.5
Time (sec) 0.02 0.05 0.20 0.58 1.20
~ not routed 0 3 3 0 68
Total Residual BW 2 388 1244 1324 7170
5.2. B a n d w i d t h Allocation Results with QoS We optimised the new cost function with QoS constraints on delays and used the Test2 testbed network. Results are presented in table 3 and show that the ILSP-OLS-ACO algorithm with the new cost function (delay upper bounds for the four traffic classes) improves the number of LSPs that satisfy the QoS constraints. Table 3 Introduction of QoS in the ILSP-OLS-ACO algorithm.
Test2 Test5
I L S P - O L S ' A C O (no QoS) ~ L S P out avg. out Total Res. of QoS of QoS Bandwidth 26 5.7% 327 41 1.25% 3525
ILSP-OLS-ACO with QoS ~ L S P out avg. out Total Res. of QoS of QoS Bandwidth 8 1.9% 307 22 1.5 % 3495
5.3. Backup routes In this section, we present the results obtained on test3 network with the ILSP-OLSACO algorithm in case of a single link failure (without optimizing QoS of LSPs). We assume that all LSPs are protected LSPs. Let us consider two link failures. The links have been chosen in such a way that the number of LSPs carried by the link is maximum.
20 We compare the cost of the ILSP-OLS-ACO algorithm with the cost of the pCALC algorithm. The cost of the pCALC algorithm is computed by routing all the LSPs in the reduced network (without the failed link) while the cost of ILSP-OLS-ACO is computed by re-routing only the LSP concerned by the failure on their backup routes. Table 4 show that the re-routing solution provided by ILSP-OLS-ACO allows to obtain more residual bandwidth than the complete routing done by pCALC. Table 4 Comparison of ILSP-OLS-ACO with pCALC algorithm in case of link failure. Link 1 Link 2
Cost 1729.5 1953.2
pCALC Total Residual BW 501 481
ILSP-OLS-ACO Cost TotM Residual BW 1487.2 524 1816.4 500
6. Conclusion We have presented a new approach to the LSP placement problem and experimental results obtained on several networks. The proposed 3-level optimisation method provides an efficient approach to mono-routing problems allowing LSPs placement with QoS constraints. Moreover, a better utilization of network resources can be achieved compared to a greedy approach like pCALC : more LSPs are placed and the residual capacity after placement is increased. The proposed algorithm also allows the computation of backup routes for the protected LSPs which provide an efficient re-routing solution in case of link failure. REFERENCES
1. C. Frei and B. Faltings, Bandwidth allocation heuristics in communication networks ALGOTEL'99, 53-58, Roscoff, France, 1999. 2. S. Cosares and I. Saniee, An optimization problem related to balancing loads on S O N E T rings Telecommunication Systems, 3:165-181, 1994. 3. M. Minoux, Planification d court et ~ moyen terme d'un rdseau de tdldcommunications Annales des T~l~communications, Vol. 29, No 11-12, 1974. 4. G.R. Ash Dynamic Routing in Telecommunications Networks McGraw Hill, 1998. 5. P. Chanas et al. Routing Virtual Paths in A TM Networks J. of Heuristics, 1999. 6. P. Chanas Rdseaux A TM: conception et optimisation PhD thesis, CNET, 1998. 7. S. Kirkpatrick, C.D. Gelatt and M. P. Vecchi Optimization by simulated annealing Science 220(4598):671-680, 1983. 8. E. Bonabeau, M. Dorigo and G. Theraulaz Swarm Intelligence - From Natural to Artificial Systems Oxford University Press, 1999. 9. D. Come et al. Eds. New Ideas in Optimization McGraw Hill, 1999. 10. R. Schoonderwoerd, O. E. Holland, J. L. Brutten and L. J. M. RothKrantz Ant based load balancing in telecommunication networks Adaptative Behaviour, 5(2), 1997. 11. C. Bockstal, JM Garcia Moddlisation de Rdseaux IP Multiservices 4~me Congr~s des Doctorants, EDSYS, 2003. 12. JM Garcia et al. Moddlisation des Mdcanismes DiffServ dans les Coeurs de Rdseaux MPLS RNRT Research project ESQUIMAUX, LAAS Research Report, 2002.