Optical Switching and Networking 5 (2008) 170–176 www.elsevier.com/locate/osn
A novel approach to provision differentiated services in survivable IP-over-WDM networks
Smita Rai a, Lei Song b, Cicek Cavdar c, Dragos Andrei c, Biswanath Mukherjee c,∗
a Cisco Systems Inc., San Jose, CA 95134, United States
b Yahoo! Inc., Sunnyvale, CA 94089, United States
c University of California, One Shields Ave, Davis, CA 95616, United States
Received 4 December 2007; received in revised form 21 December 2007; accepted 24 January 2008 Available online 16 February 2008
Abstract
IP-over-WDM networks are starting to replace legacy telecommunications infrastructure, and they form a promising solution for next-generation networks (NGNs). Survivability of an IP-over-WDM network is gaining increasing interest from both the Internet research community and service providers (SPs). We consider a novel static bandwidth-provisioning algorithm to support differentiated services in a survivable IP-over-WDM network. We propose and investigate the characteristics of both integer linear program (ILP) and heuristic approaches to solve this problem. In the heuristic method, we propose backup reprovisioning to ensure network resilience against single-node or multiple-link failures. Illustrative examples compare and evaluate the performance of the two methods in terms of capacity-usage efficiency and computation time.
© 2008 Elsevier B.V. All rights reserved.
Keywords: Differentiated service provisioning; IP-over-WDM networks; Survivability; ILP
1. Introduction

Internet Protocol (IP) is becoming the convergence layer for packet-based integrated services for voice, Internet, and video data. Given the increasing penetration of optical fiber infrastructure using wavelength-division multiplexing (WDM), IP-over-WDM networks are starting to replace legacy telecommunications infrastructure, and they form a promising solution for next-generation networks (NGNs). Since a fault in such a network, such as a fiber cut, can lead to huge data and revenue loss, survivability of an IP-over-WDM network is gaining increasing interest from both the Internet research community and service providers (SPs), in order to support important services and to ensure resilience against network failures.

∗ Corresponding author. E-mail addresses: [email protected] (S. Rai), [email protected] (L. Song), [email protected] (C. Cavdar), [email protected] (D. Andrei), [email protected] (B. Mukherjee).
1573-4277/$ - see front matter doi:10.1016/j.osn.2008.01.007

1.1. Related work

The IP-over-WDM architecture is an attractive solution for the future Internet. Several recent works investigate aspects related to this architecture: survivable provisioning in IP-over-WDM networks is tackled in [1–4]. The work in [1] proposes a multilayer protection scheme for IP-over-WDM networks, with the aim of achieving a tradeoff between blocking performance and signaling overhead. The authors propose strategies based on traffic requests and network
policy. It also proposes an adaptive approach, which provides limited protection while considering network performance and signaling. The work in [2] tackles the problem of survivable routing in IP-over-WDM networks. In order to provide failure restoration at the IP layer, the IP topology must be mapped on the WDM topology such that, if a failure occurs in the optical layer, the IP topology remains connected: this is called a survivable mapping. The authors first introduce a method for determining whether a survivable mapping exists, then show how to trace and strengthen the vulnerable areas in the topology, and finally give a scalable algorithm to find a survivable mapping. In [3], the authors propose a scheme based on recovery at the WDM layer, where backup resource sharing between the IP and WDM layers is used to improve network utilization. In [4], two classes of services – Fully Protected (FP) and Best-Effort Protected (BEP) – are provided to end-users. The BEP traffic runs over the "excess" bandwidth in the core network, such that it does not affect the service offered to FP users.
1.2. Our contribution

Our study considers three typical categories of differentiated services in an IP-over-WDM network [5], namely wavelength traffic, Guaranteed-Bandwidth (GB) IP traffic, and Best-Effort (BE) IP traffic. IP traffic accounts for 75% of the total, which is representative of modern networks. We assume 40 Gbps fiber trunks in the WDM layer. In addition, we make the following assumptions: (i) Static demands. (ii) Traffic demands are aggregated at their ingress nodes and are provisioned with wavelength lightpaths. (iii) IP utilization levels: for Guaranteed-Bandwidth (GB) services, the data rates "remain close to the contracted amount" [5]; consequently, we assume that 100% utilization is possible. For Best-Effort (BE) traffic, we start from a target utilization of 85% (a typical value in a practical network) for core IP network engineering. Our approach contains a coordinated inter-layer IP/OL (OL = optical layer) restoration strategy that relies on rapid backup reprovisioning to deal with failures and traffic surges.

In this study, the designs of the Layer-1 (WDM) and Layer-3 (IP) networks were done separately. The Layer-3 design was done first; the resulting 40 Gbps trunks between routers were then handed off to the Layer-1 design, where they were combined with the wavelength demands in the traffic matrix. As a simplifying assumption for this study, we segregated the Best-Effort (BE) IP traffic and the Guaranteed-Bandwidth (GB) IP traffic on separate 40 Gbps connections; this was done with little effect on utilization, and it obviated the need to do any Layer-3 design for the GB IP traffic.

1.3. Organization

The rest of the paper is organized as follows. Section 2 describes the IP layer design and aggregated traffic distribution. The WDM layer design is studied in Section 3, where a static bandwidth-provisioning algorithm is proposed to support differentiated services, and both ILP and heuristic implementations are presented. We compare the two approaches via simulation in Section 4. Section 5 concludes the paper.

2. IP layer design and traffic distribution

2.1. IP layer design
Both wavelength and GB IP traffic demands are supported by wavelength provisioning, using shortest-path routing. The IP layer design for the Best-Effort (BE) traffic component of the network load combines shortest-path routing with optimized cut-through lightpath determination. We use a typical US nationwide network (24 nodes and 43 bi-directional links) in the study. The network topology is shown in Fig. 1, with link lengths (in kilometers) marked.

Fig. 1. An example nationwide network.

For a given BE traffic matrix, all node-pair demands are routed on the shortest path through the network, and their paths and path loads are recorded. A hop-by-hop network design is then generated. This design assumes that each demand passes through a router at every intermediate site on its path (resulting in a network where all lightpaths are single hop). From the hop-by-hop design, the aggregate traffic load on every fiber span in the network is determined.

2.2. Traffic distribution

We assume a 40-40-20 traffic distribution pattern, which is representative of modern networks [5]. A 40-40-20 distribution corresponds to 40% traffic between "large" nodes (e.g., big cities), 40% traffic between "large" and "small" nodes, and 20% traffic between "small" nodes. Adjustments were made to accommodate the nodal degree distribution. A source–destination demand may require multiple wavelengths (1, 2, 4, or 8), with a maximum of 8 in this study. An aggregate point-to-point demand set is then created by assuming that the demand between nodes A and B is proportional to the product of the bandwidth entering/leaving the network at the two nodes based on their "size" ("large" or "small"), and then adjusting for the "large"/"small" distribution. In determining the actual traffic distribution, the number of wavelength services and the amount of GB IP services are first determined. Traffic demands are then assigned to node pairs, starting with the largest services first. Services in a given restoration class (1:1 shared-path protection in our study) [5] are assigned only if both nodes are of appropriate nodal degree. The remaining node pairs are assigned BE IP traffic according to the aggregate distribution.
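The hop-by-hop design of Section 2.1 and the 40-40-20 split described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names, the adjacency-dict data layout, and the use of ingress-bandwidth products as intra-class weights are our own assumptions based on the text.

```python
import heapq
from collections import defaultdict

def shortest_path(adj, src, dst):
    """Dijkstra over an adjacency dict {u: {v: length_km}}; returns a node list."""
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (d + w, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def span_loads(adj, demands):
    """Route each (src, dst, gbps) demand on its shortest path and accumulate
    the aggregate load on every fiber span (the hop-by-hop design)."""
    load = defaultdict(float)
    for src, dst, gbps in demands:
        path = shortest_path(adj, src, dst)
        for u, v in zip(path, path[1:]):
            load[frozenset((u, v))] += gbps   # undirected span
    return load

def split_40_40_20(pairs, bw, large, total_gbps):
    """Distribute total_gbps so that large-large pairs carry 40%, large-small
    40%, and small-small 20%; within each class a pair's share is proportional
    to the product of the bandwidths entering/leaving at its endpoints."""
    shares = {2: 0.40, 1: 0.40, 0: 0.20}      # keyed by number of "large" endpoints
    buckets = defaultdict(list)
    for a, b in pairs:
        buckets[(a in large) + (b in large)].append((a, b))
    demand = {}
    for k, members in buckets.items():
        weight = sum(bw[a] * bw[b] for a, b in members)
        for a, b in members:
            demand[(a, b)] = shares[k] * total_gbps * bw[a] * bw[b] / weight
    return demand
```

The paper's further adjustments for nodal degree and demand ordering would sit on top of such a sketch.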
3. WDM layer design

3.1. Problem statement

We are given a graph G = (V, E) and the number of wavelengths available on each link, λ : E → Z+. We may have other functions, such as cost/distance, defined on links, C : E → Q+. Our aim is to route connection requests between node pairs (s, d) so as to guarantee failure recovery and to maximize sharing of backup bandwidth.

3.2. Link-vector technique to maximize backup sharing

The idea of a link vector has been widely applied in various studies (e.g., the conflict vector in [6]) to identify the sharing potential between backup paths. Essentially, the idea is to associate a vector with each link in the network, identifying the number of backup wavelengths to be reserved on this link to protect against failures of other links. The link vector v_e for link e can be represented as an integer set {v_e^{e'} | ∀e' ∈ E, 0 ≤ v_e^{e'} ≤ λ(e')}, where E is the set of links; λ(e') specifies the number of wavelengths on link e'; and v_e^{e'} specifies the number of working lightpaths that traverse link e' and are protected by link e (i.e., their corresponding backup lightpaths traverse link e). The link vector captures the necessary information on the sharing potential offered by each link through a simple data structure. The number of wavelengths that need to be reserved for backup lightpaths on link e is thus v_e* = max_{e'} v_e^{e'}. Therefore, using the link vector, we can simply reserve v_e* wavelengths on link e as backup wavelengths. We apply the link-vector technique in the proposed bandwidth-provisioning algorithm.

3.3. ILP approach

We develop a mathematical formulation of the WDM layer design problem using the link-vector technique, and the formulation turns out to be an integer linear program (ILP). Our ILP model is much simpler and more efficient than previously-developed models for shared-path protection [7]; it scales to larger networks (24 nodes) than the previous model (10 nodes). In our formulation, the number of variables grows with the product of the number of links and the number of node pairs requesting connections. In comparison, in the ILP formulation in [7], the variables are possible routes connecting each node pair, so the number of variables tends to grow exponentially with network connectivity and dimension. Therefore,
Table 1
ILP notations

p_{ij}^{sd}        Equals 1 if the working path for traffic between (s, d) is routed through link (i, j).
Λ^{sd}             Traffic demand between nodes (s, d).
W                  Number of wavelengths.
B_{ij}             Maximum backup capacity to be reserved for sharing in the backup pool on link (i, j). (Equivalent to v_e* in the link-vector technique.)
N_{ij}^{xy}        Amount of backup capacity to be reserved on link (i, j) to protect working paths crossing link (x, y). (Equivalent to v_e^{e'} in the link-vector technique.)
δ_{ij}^{sd⟨xy⟩}    Equals 1 if the backup path for traffic between (s, d) is routed through (i, j) when the working path crossing link (x, y) fails.
the number of variables is much smaller in our model than in previous work.

We define pHops and bHops as the number of working and backup hops that need to be assigned for a given static traffic demand set. The set includes traffic of the three categories described above and follows the 40-40-20 distribution pattern. Note that we assign equal weights to primary capacity and backup capacity in the objective function. In practice, weights can be assigned flexibly to show the impact of the working bandwidth, the backup bandwidth, or both. To formally state the problem, we use the notation in Table 1.

Objective: Minimize 0.5 pHops + 0.5 bHops

Subject to:

1. Primary path flow conservation over the physical topology:
   Σ_j p_{sj}^{sd} = Λ^{sd}
   Σ_i p_{id}^{sd} = Λ^{sd}
   Σ_i p_{is}^{sd} = Σ_j p_{dj}^{sd} = 0
   Σ_i p_{ik}^{sd} = Σ_j p_{kj}^{sd}, if k ≠ s, d.
2. Capacity constraint: primary and backup loads cannot exceed the capacity of link (i, j):
   Σ_{s,d} p_{ij}^{sd} + B_{ij} ≤ W.
3. Backup path flow conservation over the physical topology if link (x, y) fails:
   Σ_j δ_{sj}^{sd⟨xy⟩} = p_{xy}^{sd}
   Σ_i δ_{id}^{sd⟨xy⟩} = p_{xy}^{sd}
   Σ_i δ_{is}^{sd⟨xy⟩} = Σ_j δ_{dj}^{sd⟨xy⟩} = 0
   Σ_i δ_{ik}^{sd⟨xy⟩} = Σ_j δ_{kj}^{sd⟨xy⟩}, if k ≠ s, d.
   Backup and primary paths must be link-disjoint:
   ∀(s, d): Σ_{i,j} δ_{ij}^{sd⟨ij⟩} = 0.
4. Required backup capacity on link (i, j) when link (x, y) fails:
   N_{ij}^{xy} = Σ_{s,d} δ_{ij}^{sd⟨xy⟩}.
5. Maximum backup capacity needed (maximum backup load) on link (i, j):
   ∀(x, y): B_{ij} ≥ N_{ij}^{xy}.

3.4. Heuristic approach

Considering the scalability problem of an ILP approach for large networks, we propose a heuristic approach with enhanced handling of both link failures and single-node failures. In our heuristic algorithm, on the basis of the link vector, we adjust the cost of links to maximize backup sharing. After we find a primary path l_w, we assign a low cost to those links e for which v_e^{e'} < v_e* for all e' ∈ l_w. Such a link is an ideal candidate for serving on the backup path, since it allows the connection provisioned on l_w to share the pool of backup wavelengths on e without increasing the number of reserved backup wavelengths. While finding paths, we also take into account the current load on a link, along with its distance, as a cost metric, so that we can route around heavily-loaded links. The computational complexity of the algorithm is O(|E|^2). More details of the algorithm can be found in [8].

This step-by-step approach to calculating primary and backup paths may fall into trap situations. To handle this case, we use a k-disjoint-path algorithm [9] to find a set of diverse paths (k = 2 in our study). We modify link costs to reflect the current load on the link as well as its distance. Note that, when we find k diverse paths simultaneously, we cannot use different cost metrics simultaneously on different links; hence we cannot find backup paths sharing as much capacity as
possible with other primary paths, but this approach is needed to resolve "traps" [8]. Since the connection requests are assumed to be static for this network study, we choose the order of provisioning as follows:

A. Wavelength services are provisioned first, in decreasing order of the number of wavelengths required. Since all wavelength services, whether they request 2, 4, or 8 wavelengths, must be routed together, we route them first: requests for 8 wavelengths are routed first, followed by those for 4 and 2 wavelengths.
B. Guaranteed-Bandwidth IP requests are then assigned primary and backup paths.
C. Finally, the Best-Effort IP traffic is routed and provided with primary and backup paths.

Within this ordering, a connection requiring more backups is given higher priority than one requiring fewer backups.

The above technique guarantees link-disjointedness. However, problems may arise when a node fails, in the following two scenarios:

A. The primary and backup share a node but no links. This shared node will have degree ≥ 4.
B. Two or more primaries with a node in common (but link-disjoint, node degree ≥ 4) share a wavelength on a link for their backups. In this case, the failure of this node will lead to contention for the backup wavelength.

A problem occurs only when the shared node serves as transit on all the primaries; if it is a source or destination, then we need not reserve more wavelengths. For connections with single-failure guarantees, we find node-disjoint primary and backup paths. For other connections, with, say, an x-failure guarantee, we find x + 1 link-disjoint paths. We simulate all single-node failures and check whether there is enough backup capacity to resolve contention for wavelengths that had previously been reserved assuming only link failures.
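The single-node-failure simulation just described can be sketched as follows. This is an illustrative reconstruction under our own assumptions about the data layout (connections as tuples of primary node lists and backup link lists, and a per-link backup pool equal to the v_e* values reserved for link failures); it is not the authors' implementation.

```python
from collections import defaultdict

def backup_shortfall(connections, backup_pool):
    """connections: list of (primary_nodes, backup_links) tuples.
    backup_pool[e]: wavelengths reserved on link e assuming link failures only.
    Simulate each single-node failure: all connections whose primary transits
    the failed node invoke their backups at once, so count the backup
    wavelengths they would need on each link and report any link where that
    exceeds the reserved pool (i.e., contention requiring reprovisioning)."""
    nodes = {n for prim, _ in connections for n in prim}
    shortfall = defaultdict(int)
    for n in nodes:
        needed = defaultdict(int)
        for prim, backup in connections:
            if n in prim[1:-1]:          # node is transit on this primary
                for e in backup:
                    needed[e] += 1
        for e, k in needed.items():
            if k > backup_pool.get(e, 0):
                shortfall[e] = max(shortfall[e], k - backup_pool.get(e, 0))
    return dict(shortfall)
```

A nonzero result flags the links where extra backup wavelengths (or backup reprovisioning) would be needed to survive a node failure.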
Note that, for connections with protection requirements higher than a single failure, since there is no node with degree ≥ 6 in the network, at least one path will survive a node failure, and we can reprovision the backups to fortify the connection against further failures. In this case, the resource overbuild (spare capacity ratio) [5] will be lower, since the only nodes common to link-disjoint paths will be nodes with degree ≥ 4. By simulating their failures, we would also take out the primaries (as well as backups) originating/terminating at them, giving us room for backup re-assignment.
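The link-vector-based cost adjustment from the start of Section 3.4 can be sketched as below. The function and variable names are hypothetical; `vec[e]` plays the role of the link vector v_e, and the near-zero cost `eps` stands in for whatever "low cost" the heuristic assigns to shareable links.

```python
def backup_link_costs(links, vec, primary_path, base_cost, eps=1e-3):
    """Link costs for the backup-path search of a connection whose primary
    traverses the links in primary_path.  vec[e][ep] counts the working
    lightpaths on link ep whose backups use link e.  A link e with
    vec[e][ep] < v_e* for every primary link ep can absorb the new backup
    into wavelengths already reserved, so it is made very cheap."""
    costs = {}
    for e in links:
        v_star = max(vec[e].values(), default=0)   # v_e* = max over e' of v_e^{e'}
        if all(vec[e].get(ep, 0) < v_star for ep in primary_path):
            costs[e] = eps            # backup sharing possible on e
        else:
            costs[e] = base_cost[e]   # e would need a new backup wavelength
    for ep in primary_path:
        costs[ep] = float("inf")      # backup must be link-disjoint from primary
    return costs
```

Running a shortest-path search with these costs steers the backup onto links whose already-reserved backup pool can be shared, which is exactly the effect the heuristic aims for.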
We perform wavelength assignment on the working paths after the routing phase. For the backup paths, the wavelength to be assigned depends on the exact failure location, since connections share a pool of backup wavelengths on any link. We propose to study wavelength-allocation algorithms for backup paths as part of future work. We note the following key features of our design, which can offer very good quality of service:

A. The fill factor of Best-Effort IP traffic on a wavelength is 85%.
B. We provide a single-failure guarantee for Best-Effort IP traffic at the wavelength layer.
C. By simulating node failures and reprovisioning backups, we use capacity much more efficiently than a static approach.
D. Simultaneous link failures are supported. For node failures, backup-bandwidth reprovisioning is necessary.

4. Illustrative numerical examples

In this section, we evaluate through simulation the performance of both the ILP and heuristic implementations of the proposed static bandwidth-provisioning algorithm. The network topology shown in Fig. 1 is used in this study. The performance results are similar for different network topologies (not shown here) and load inputs. The traffic from the three categories is randomly generated with different bandwidth granularities following the 40-40-20 distribution pattern. Traffic demands are uniformly distributed among all valid source–destination pairs in each distribution pattern. We assume that 25% of the traffic is wavelength traffic and the rest is IP traffic [5]. Among the IP traffic, GB and BE demands follow a 50-50 distribution. All traffic demands are provided with 1:1 shared-path protection. We aim to statically provision all traffic with no packet loss in the network; therefore, each fiber is allowed to add wavelengths when necessary to ensure zero blocking probability. We observe that the maximum number of wavelengths on a link ranges from 21 to 128, depending on the demand set size.
Table 2 shows the execution times of the ILP and heuristic approaches with different demand sets (i.e., with different network loads). The results indicate that, as expected, the ILP achieves better performance on link-load balancing than the heuristic approach, especially at heavy loads; however, it suffers from long execution times. For a traffic set of 200 demands, the ILP approach took about 50 min,
Fig. 2. Primary capacity length of ILP and heuristic methods.

Table 2
Execution times of ILP and heuristic approaches

Demand set size    Execution time (ILP) (s)    Execution time (heuristic) (s)
40                 159                          <1
60                 288                          <1
80                 557                          <1
100                879                          <1
120                856                          <1
150                760                          <1
200                3005                         <2
while the heuristic takes a much shorter time to obtain solutions on a platform with a 2 GHz Pentium-4 processor and 1 GB RAM.

Next, we investigate the resource-usage efficiency of the two methods. We use the capacity length (the product of the number of wavelengths used on a link and the link length in km), summed over all links, to evaluate capacity-optimization performance. Capacity length indicates the overall bandwidth resource allocation in a network. Figs. 2–4 show the primary capacity length, backup capacity length, and total capacity length of the two methods with different demand sets, respectively. We observe that the ILP offers better capacity savings than the heuristic. It can be seen from Fig. 2 that the ILP achieves a 48%–54% improvement in bandwidth allocated for primary paths over the heuristic. The results are similar for backup bandwidth (26%–52%, Fig. 3) and total wavelength (44%–50%, Fig. 4) allocations. The heuristic is fixed-alternate-routing-based, so its performance is not as good as that of the ILP, where routing is not limited to a set of candidate routes. Therefore, the ILP can achieve global optimization and consume fewer resources than the heuristic.
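The capacity-length metric used above is simple to compute from a finished design; a minimal sketch (with our own illustrative function and parameter names) is:

```python
def capacity_length(wavelengths_used, length_km):
    """Capacity length of a network design: for each link, the number of
    wavelengths used times the link length in km, summed over all links."""
    return sum(n * length_km[e] for e, n in wavelengths_used.items())
```

Applied separately to the primary wavelengths, the backup wavelengths, and their union, this yields the three quantities compared in Figs. 2–4.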
Furthermore, the ILP outperforms the heuristic by a larger margin as the network load gets heavier. One possible reason is that, in the heuristic, demands are routed sequentially, so the order in which connections are routed affects the capacity lengths. The impact of the demand order grows when more demands need to be routed, i.e., when the network is more heavily loaded. Again, we observe that the ILP and heuristic approaches exhibit a tradeoff between optimal bandwidth utilization and computation complexity.

5. Conclusion

This work is intended as an early study on the survivability of IP-over-WDM networks. Based on a typical IP-over-WDM backbone mesh network and differentiated traffic categories, we studied a novel static bandwidth-provisioning algorithm to support three categories of demands: wavelength traffic, Guaranteed-Bandwidth (GB) IP traffic, and Best-Effort (BE) IP traffic. We applied the powerful link-vector technique for backup-path provisioning to achieve maximal backup-sharing potential. Both ILP and heuristic approaches were presented and discussed. In the heuristic, we further investigated the effects of rapid backup reprovisioning to support 100% failure recovery against single-node or multiple-link failures. We presented illustrative examples to compare the ILP and heuristic approaches in terms of network capacity usage and computation time. Future work will study possible improvements in the ILP's performance by allowing differentiated protection (1:2 and 1:3 shared-path protection schemes) and corresponding comparisons with the heuristic method.
Fig. 3. Backup capacity length of ILP and heuristic methods.
Fig. 4. Total capacity length of ILP and heuristic methods.
Another interesting topic is to optimize the bandwidth on a link to avoid over-utilizing or congesting links in a network.

Acknowledgements

We gratefully acknowledge the comments from the editors and reviewers, which served to improve this paper.

References

[1] K. Ratnam, L. Zhou, M. Gurusamy, Efficient multi-layer operational strategies for survivable IP-over-WDM networks, IEEE Journal on Selected Areas in Communications 24 (8) (2006) 16–31.
[2] M. Kurant, P. Thiran, On survivable routing of mesh topologies in IP-over-WDM networks, in: Proc. IEEE INFOCOM, vol. 2, 2005, pp. 1106–1116.
[3] L. Lei, A. Liu, Y. Ji, A joint resilience scheme with interlayer backup resource sharing in IP over WDM networks, IEEE Communications Magazine 42 (1) (2004) 78–84.
[4] A. Nucci, N. Taft, C. Barakat, P. Thiran, Controlled use of excess backbone bandwidth for providing new services in IP-over-WDM networks, IEEE Journal on Selected Areas in Communications 22 (9) (2004) 1692–1707.
[5] PIP for DARPA's CORONET, http://www.darpa.mil/sto/solicitations/CORONET/index.htm, 2006.
[6] G. Mohan, C.S.R. Murthy, A.K. Somani, Efficient algorithms for routing dependable connections in WDM optical networks, IEEE/ACM Transactions on Networking 9 (5) (2001) 553–566.
[7] L. Sahasrabuddhe, S. Ramamurthy, B. Mukherjee, Fault management in IP-over-WDM networks: WDM protection versus IP restoration, IEEE Journal on Selected Areas in Communications 20 (1) (2002) 21–33.
[8] C. Ou, J. Zhang, H. Zang, L.H. Sahasrabuddhe, B. Mukherjee, New and improved approaches for shared-path protection in WDM mesh networks, IEEE/OSA Journal of Lightwave Technology 22 (5) (2004) 1223–1232.
[9] R. Bhandari, Survivable Networks: Algorithms for Diverse Routing, Kluwer Academic Publishers, 1999.