Simulation Modelling Practice and Theory 12 (2004) 263–285 www.elsevier.com/locate/simpat
Scalable network resource management for large scale Virtual Private Networks
Wei Yu a,*, Jian Wang b
a Department of Computer Science, Texas A&M University, College Station, TX 77840, USA
b Nokia Mobile Phone, Irving, TX 75039, USA
Received 15 April 2003; received in revised form 12 November 2003; accepted 19 February 2004 Available online 5 June 2004
Abstract
A VPN (Virtual Private Network) is a private data network that uses a non-private data network to carry traffic between remote sites. There is currently significant interest in deploying VPN service across IP backbone facilities. In this study, we focus on designing a scalable IP-based VPN architecture and applying a distributed scheme to manage VPN network resources efficiently. Specifically, our contributions include the following: (1) We present a scalable VPN architecture and evaluate different virtual network topologies in terms of VPN maintenance cost and average end-to-end delay, measured by average tunnel length. (2) We take a distributed approach to the VPN bandwidth allocation problem. Several heuristic algorithms for selecting a 'better' route are proposed to improve the rate of successful VPN requests with a given QoS (quality of service) requirement. We conduct extensive performance evaluations of different systems and algorithms. The evaluation shows that, in terms of admission probability, the distributed VPN bandwidth allocation protocol based on local status information and partial network information performs well and comes close to the theoretical optimal result. We note that the optimal solution suffers from a single point of failure and from scalability issues in large VPN deployments.
© 2004 Elsevier B.V. All rights reserved.
Keywords: Virtual Private Networks; Quality-of-service; Network management
* Corresponding author. Address: TBA, 4608 Dalrock Drive, Texas, TX 75024, USA. E-mail address: [email protected] (W. Yu).
1569-190X/$ - see front matter 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.simpat.2004.02.002
W. Yu, J. Wang / Simulation Modelling Practice and Theory 12 (2004) 263–285
1. Introduction
In this paper, we address issues related to scalable VPN architecture and efficient VPN network resource management. A VPN is a private data network that uses a non-private data network to carry traffic between remote sites. In other words, a VPN is an overlay network built on top of a public network infrastructure that provides a set of geographically dispersed users with a dedicated private network. The need for mechanisms that allow organizations to collaborate dynamically, efficiently, and securely is motivated by many applications, such as joint military tasks for critical missions and Internet multimedia cooperation. These applications need participants in different geographical areas to share a simple network topology so that the whole network can be managed easily and securely. For this reason, VPN service will be in high demand from next generation network applications. As more and more enterprises migrate their data networks to VPN service, a single IP service provider network is expected to support a very large number of VPNs. To support large enterprises, a single VPN also needs to accommodate widely varying numbers of branch offices, depending on the size and structure of the enterprise. Clearly, the traditional pipe-based VPN design methodology, which deploys the VPN by setting up VPN pipes among all branches, does not scale to large VPN deployments, because maintaining too many VPN pipes imposes a heavy overhead on VPN network management. In addition, VPN service needs to be dynamic, because VPN members may join or leave the system frequently [1,2]. In this sense, the traditional pipe-based design methodology also has limited scalability for supporting dynamic VPN service.
Because VPN users request a certain QoS, measured by network bandwidth and network delay, VPN service must provide adequate QoS guarantee mechanisms. For these reasons, a scalable VPN topology architecture and efficient network resource management are the two most important issues for the global deployment of VPN service. In this study, we focus on designing a scalable VPN topology architecture and on adopting a distributed approach to manage VPN network resources efficiently. Specifically, our contributions include the following: (1) We present a scalable VPN architecture. Different virtual topologies are analyzed and evaluated in terms of VPN maintenance cost and end-to-end packet delay, measured by the average tunnel length over all VPN source/destination pairs. (2) We adopt a distributed approach to VPN bandwidth allocation. A critical step in setting up a VPN is to select a 'best' route so that the QoS requested by VPN users can be guaranteed. At the same time, the network should be utilized as fully as possible by accepting as many VPN requests as the limited network resources allow. Several route selection algorithms are evaluated and compared with near optimal and shortest path baseline algorithms. Besides these results, we also discuss run-time intra-VPN and inter-VPN resource multiplexing schemes, which can dynamically adapt to changing traffic conditions and achieve better network resource utilization. The rest of the paper is
organized as follows: in Section 2, we present a brief survey of related work. In Section 3, we study a scalable VPN architecture by analyzing different virtual network topologies. In Section 4, we define the VPN bandwidth allocation problem, compare several different approaches, and propose our protocol and algorithms. The performance evaluation in terms of admission probability is presented in Section 5. We summarize the paper and discuss future work in Section 6.
2. Previous work
In the history of VPN development, Frame Relay/ATM-based VPN was the first generation solution for VPN service. This technology is based on multiplexing and switching cells transported on virtual channel (VC) connections. In turn, virtual path (VP) connections allow joint handling of bundled VCs and serve as an effective way of reducing complex signaling and management tasks in the core network layer. Much work has been done on resource management in the ATM-based VPN architecture. For example, Su et al. [3] exploit a statistical multiplexing-based alternative routing approach to improve system utilization; Medi et al. [4] study a network design model that provides QoS by adopting reconfigurable networks that dynamically adapt to the network traffic load; and Lazar et al. [5] consider a multi-user VPN shared by non-cooperative users and propose a game theory-based approach to provide a degree of fairness in the allocation of network bandwidth. Although ATM-based VPN provides virtual connections among hosts and branch networks with guaranteed QoS, its management overhead and inefficient use of network resources have prevented this approach from being widely deployed [6]. In the last several years, many studies of IP-based VPN have been carried out in the IETF [1,2,7,8]. The goal of IP-based VPN is to provide VPN users with a service comparable to a private dedicated network built on leased lines. Thus, VPN service providers need to address the QoS and security issues associated with deploying the VPN over a common IP infrastructure. On one hand, fast packet forwarding schemes such as MPLS and Cisco TAG have made it possible to provide customers with sufficient guaranteed QoS in IP-based networks [9,10]; on the other hand, substantial progress in IP security has made it possible to provide customers with a level of security comparable to that offered by a dedicated leased line [11].
Much research has also been done on fast delivery of VPN features. For example, X-bone, DVPN [12], and the work in [13] provide automatic configuration of the VPN; VPOE [14] and CRATOS [15] provide a virtual network computation environment by extending the traditional network-based VPN; Lim et al. [16] and Braun et al. [17] propose a VPN architecture based on a centralized bandwidth broker to meet QoS requirements; Villela et al. [18] present a programmable spawning networks architecture that is capable of profiling and managing distinct virtual networks with an easy plug-in module called the Genesis programmable kernel; and [19,20] study the hose model, which offers users several advantages for QoS management (ease of specification, flexibility, and multiplexing gain).
Distributed algorithms have been widely used in distributed system and network design. For example, Gerster et al. [21] present distributed algorithms to find the VP layout in an ATM network, Cidon et al. [22] explore three traditional distributed algorithms used in network systems, and Ramaswami et al. [23] propose a distributed algorithm to control an optical network.
3. Scalable overlay VPN architecture
3.1. Architecture design
As mentioned above, the benefit of a VPN is to provide users with a transparent and secure network that makes network management easy. In the following sections, we adopt the following notation: PE denotes the ISP (Internet Service Provider) access device and CE denotes the user access device. Access devices are divided into these two kinds so that both the user and the service provider can control the VPN service. Both PE and CE devices are VPN control devices, and the size of a VPN is measured by the total number of PE and CE devices in the system. A generic VPN architecture needs to satisfy the following four basic requirements: (1) Scalability: the VPN maintenance cost should grow comparatively slowly as the network size increases. (2) Reliability: the network should be able to recover when a fault occurs. (3) Security: information loss and hijacking should be prevented as far as possible. (4) Availability: network and user access devices should have mechanisms to satisfy the end-to-end QoS requirement. These generic requirements must be addressed during the system design phase. Due to space limitations, we focus only on the most important scalability issue: VPN topology design. We consider the following three candidate topologies as the basis of our study: binary hypercube (BH), bidirectional ring (BR), and full-mesh (FM). We also consider the hierarchical scheme, which scales to large network sizes. In this scheme, all VPN PE/CE devices are assigned to different layers, and each layer can deploy its own network topology. We illustrate this with the following example: consider a large company network with over 1000 branches all over the world. If a single-layer full-mesh topology is adopted, each branch needs to manage 999 virtual tunnels connecting it to all other branches.
The overhead of VPN management is then too high, because updating any branch causes the information to be propagated to all other branches. The two-layer hierarchical meshed scheme shown in Fig. 1 is more scalable, in that it has a comparatively small number of tunnels to manage. In this figure, the PEs of the ISP are interconnected by a full mesh or partial mesh of virtual tunnels, and each branch CE device connects only to the nearest PE. The PE can therefore aggregate the tunnels from its connected CEs. This layering follows the same idea as VP/VC in the ATM world, which is an effective way of reducing complex signaling and management tasks in the core network layer. In Fig. 1, the links connecting neighboring devices play the role of VCs, and the virtual connection between two CE devices plays the role of a VP.
Fig. 1. Hierarchical VPN architecture.
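As a rough illustration of the 1000-branch example, the following sketch counts tunnels for a single-layer full mesh and for a two-layer layout in which every CE attaches to one PE and the PEs form a full mesh. The function names and the choice of 20 PEs are assumptions made only for this illustration; they do not come from the paper.

```python
def full_mesh_tunnels(n):
    """Return (tunnels per node, total tunnels) for a single-layer full mesh of n nodes."""
    return n - 1, n * (n - 1) // 2


def two_layer_tunnels(num_ce, num_pe):
    """Return (tunnels per CE, total tunnels) for a two-layer layout.

    Assumption: each CE keeps exactly one tunnel to its nearest PE,
    and the PEs are interconnected by a full mesh.
    """
    access_tunnels = num_ce                      # one CE-to-PE tunnel per branch
    core_tunnels = num_pe * (num_pe - 1) // 2    # full mesh among PEs
    return 1, access_tunnels + core_tunnels


print(full_mesh_tunnels(1000))       # (999, 499500)
print(two_layer_tunnels(1000, 20))   # (1, 1190)
```

Under these assumptions each branch manages one tunnel instead of 999, and the total number of tunnels drops by more than two orders of magnitude.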
From the VPN maintenance point of view, this hierarchical architecture has other benefits: if a new member joins the VPN, its CE device only needs to register with the nearest PE using an authentication protocol. After the new member is accepted, the PE forwards the updated routing information to the other PEs in the same VPN through the VPN routing protocol. Thus, membership updates can propagate to other VPN members at comparatively low cost, measured by the total number of messages propagated in the VPN network. As mentioned earlier, the size of a VPN, measured by the number of PE/CE devices in the system, can be large, and a single ISP needs to support a large number of VPNs, measured by the number of customers. Because the VPN is constructed from virtual tunnels connecting all PE/CE devices, tunnel maintenance is the major overhead of VPN management, for the following reasons: (1) The PE/CE devices associated with a virtual tunnel need to do extra tunnel maintenance work, such as setting and managing the timer for the keep-alive message, monitoring run-time link bandwidth, and maintaining the reserved network resources. (2) The PE/CE device needs to maintain not only the tunnel but also the VPN itself, for example the VPN routing table. For these reasons, we assume that tunnel maintenance cost is the major cost of the VPN system, as discussed in the following section.
3.2. Virtual topology design
3.2.1. Topology analysis
A VPN is an overlay network constructed from a number of network devices in the network. A link connecting VPN network nodes (the PE/CE devices mentioned above) in the VPN topology is called a virtual tunnel, and is itself constructed from a number of physical network devices, as shown in Fig. 2. In this figure, the VPN has three nodes (nodes 1, 2 and 3) and two virtual tunnels (denoted by bold lines): one between nodes 1 and 2, the other between nodes 1 and 3.
These two tunnels are constructed from a series of physical network nodes, i.e., $(1, A, C, F, E, 2)$ and $(1, A, C, F, G, 3)$. The two tunnels have the same weight (5), measured by the physical hop length of the tunnel.
Fig. 2. Network example.
For simplicity, we define the VPN virtual topology $G = (E', V')$, where $E'$ denotes the set of network virtual tunnels and $V'$ denotes the set of VPN PE/CE nodes. In this virtual network topology, the network path between a source and a destination is constructed from a set of virtual tunnels. In the example shown in Fig. 2, the network path between nodes 2 and 3 is constructed from the two tunnels shown in bold. We therefore define the VPN tunnel length as the number of tunnels connecting a VPN source/destination pair. For example, the VPN tunnel length is 2 for VPN node pair 2/3 and 1 for VPN node pair 1/2. Based on the above analysis, we define the following performance metrics for topology design.
• VPN maintenance cost (denoted by $L$). It represents the total number of tunnels in the VPN, which contributes the major part of the VPN maintenance cost for the reasons given in the last section.
• Average tunnel length of all VPN end-to-end paths (denoted by $T$). To forward a packet correctly between different tunnels, the intermediate nodes (all nodes except the source and destination) on a network path of two or more tunnels have to perform packet mapping operations between tunnels, e.g., encapsulation, decapsulation, encryption, and decryption. This packet processing is time-consuming and has the major impact on end-to-end packet delay. We assume that all tunnels have the same delay weight (i.e., 1). Therefore, minimizing the end-to-end message delay reduces to minimizing the tunnel length of the path from the source to the destination. We also assume, for simplicity, that traffic is uniformly distributed over the VPN network. Thus, minimizing the average end-to-end message delay reduces to minimizing the average tunnel length of the VPN network.
• $LT$ product. It gives a cost-benefit ratio, as $L$ represents the cost of VPN maintenance and $1/T$ represents the benefit (network delay performance). Thus, the maximum benefit per unit cost is achieved by minimizing the $LT$ product.
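The three metrics can be computed for any small virtual topology with a breadth-first search. The sketch below is a minimal illustration, assuming a connected topology with unit tunnel weights and averaging $T$ over distinct source/destination pairs; the function name `tunnel_metrics` is hypothetical.

```python
from collections import deque


def tunnel_metrics(tunnels):
    """Return (L, T, L*T) for a connected virtual topology.

    tunnels: iterable of (u, v) pairs, each an undirected virtual tunnel.
    L is the number of tunnels; T is the average tunnel length over all
    ordered pairs of distinct nodes (every tunnel has weight 1).
    """
    edges = {frozenset(t) for t in tunnels}
    adj = {}
    for edge in edges:
        u, v = edge
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    total_hops = 0
    pair_count = 0
    for src in adj:
        # Breadth-first search gives the tunnel length from src to every node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total_hops += sum(dist.values())
        pair_count += len(dist) - 1

    L = len(edges)
    T = total_hops / pair_count
    return L, T, L * T


# The overlay of Fig. 2: tunnels 1-2 and 1-3.
L, T, LT = tunnel_metrics([(1, 2), (1, 3)])
print(L, T, LT)
```

For the Fig. 2 overlay this gives $L = 2$ and $T = 4/3$, since pairs 1/2 and 1/3 have tunnel length 1 and pair 2/3 has tunnel length 2.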
We note that, for some complex systems, benefits naturally do not grow at the same rate as costs. The simple $LT$ product captures the general relationship between cost and benefit for the topology analysis in this paper; more complex cost-benefit formulas are left to future work. Because we adopt the hierarchical approach to VPN topology design, the VPN devices can be assigned to different layers, and each layer can have multiple clusters (each cluster includes a number of VPN devices). To illustrate this, consider a two-layer hierarchical example with 20 nodes: the 20 nodes are divided into four groups of five nodes each. The four groups form a layer 1 topology with four clusters, and each group contributes a leader. The four leaders form a layer 2 topology with a single cluster. We also define the traffic localization factor $R$ as the probability that both the source and the destination of a packet are within the same cluster. Thus, a smaller $R$ means that more of the VPN traffic must cross clusters. With the binary hypercube (BH) (shown in Fig. 3 as a 3-dimensional topology), bidirectional ring (BR), and fully connected mesh (FM) as the basic cluster topologies, the system configuration can be represented by the 3-tuple $\langle T, T, K \rangle$, where $T \in \{BH, BR, FM, \phi\}$ gives the topology of the first and second layer, respectively, $\phi$ denotes not-applicable, and $K$ represents the maximum layer of the hierarchical topology. For example, $\langle BH, \phi, 1 \rangle$ denotes a network topology with only one layer, which uses the BH topology. Generally, we consider the following combinations: (1) Single layer topologies: $\langle BH, \phi, 1 \rangle$, $\langle BR, \phi, 1 \rangle$, $\langle FM, \phi, 1 \rangle$. (2) Hierarchical topologies with two layers: $\langle BH, BH, 2 \rangle$, $\langle BH, BR, 2 \rangle$, $\langle BH, FM, 2 \rangle$. We assume that the sizes of the VPN and of each cluster are $2^D$ and $2^d$, respectively.
• $\langle BH, \phi, 1 \rangle$. As the VPN has $2^D$ nodes and each node connects to $D$ nodes as neighbors,
$L = D\,2^{D-1}$  (1)
$T$ can be described as
$T = \left( \sum_{i=0}^{D} i\,C_D^i \right) / 2^D$  (2)
Fig. 3. Three-dimensional binary hypercube.
• $\langle BR, \phi, 1 \rangle$. As the ring topology has $2^D$ virtual links,
$L = 2^D$  (3)
$T = 2^{D-2}$  (4)
• $\langle FM, \phi, 1 \rangle$. As the full mesh topology has $2^{2D}$ virtual links,
$L = 2^{2D}$  (5)
$T = 1$  (6)
• $\langle BH, BH, 2 \rangle$. In this configuration, BH is deployed at both layers, thus
$L = d\,2^{D-1} + (D - d)\,2^{D-d-1}$  (7)
$T = R D_1 + (1 - R)(2 D_1 + D_2)$  (8)
where $D_1 = d/2$ and $D_2 = (D - d)\,2^{D-d-1} / (2^{D-d} - 1)$.
• $\langle BH, BR, 2 \rangle$. In this configuration, BH and BR are deployed at the first and second layers, respectively. Similarly we have
$L = 2^{D-d}\,d\,2^{d-1} + 2^{D-d}$  (9)
$T = R D_1 + (1 - R)(2 D_1 + D_2)$  (10)
where $D_1 = d/2$ and $D_2 = 2^{D-d-2}$.
• $\langle BH, FM, 2 \rangle$. In this configuration, BH and FM are deployed at the first and second layers, respectively. Similarly we have
$L = 2^{D-d}\,d\,2^{d-1} + 2^{2(D-d)}$  (11)
$T = R D_1 + (1 - R)(2 D_1 + D_2)$  (12)
where $D_1 = d/2$ and $D_2 = 1$.
3.2.2. Performance result
In this section, we evaluate the system performance of the different topologies in terms of the $L$, $T$, and $LT$ metrics defined above. All data are generated for different network sizes. Figs. 4–6 show the $L$, $T$ and $LT$ performance, respectively. From these figures, we make the following observations: the FM topology has the largest cost ($L$ performance) and the best benefit ($T$ performance), whereas the BR topology has the lowest cost ($L$ performance) and the worst benefit ($T$ performance). Although neither the $L$ nor the $T$ performance of BH is the best, its $LT$ performance is much better than that of both the BR and FM topologies. The reason is that BH makes a better trade-off between the cost and the number of tunnels on the path. For this reason, BH should be the first choice for VPN topology design.
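The single-layer closed forms (1)–(6) are easy to evaluate numerically. The sketch below is an illustration only: the function names are hypothetical, and the full-mesh link count is taken as $2^{2D}$, an assumption chosen to match the $2^{2(D-d)}$ term used for the full-mesh layer in Eq. (11).

```python
from math import comb


def bh_single(D):
    """Binary hypercube with 2**D nodes: Eqs. (1) and (2)."""
    L = D * 2 ** (D - 1)
    T = sum(i * comb(D, i) for i in range(D + 1)) / 2 ** D  # equals D / 2
    return L, T, L * T


def br_single(D):
    """Bidirectional ring with 2**D nodes: Eqs. (3) and (4)."""
    L = 2 ** D
    T = 2 ** (D - 2)
    return L, T, L * T


def fm_single(D):
    """Full mesh with 2**D nodes: Eqs. (5) and (6), taking L as 2**(2*D)."""
    L = 2 ** (2 * D)
    T = 1
    return L, T, L * T


D = 10  # 1024 nodes
for name, fn in (("BH", bh_single), ("BR", br_single), ("FM", fm_single)):
    L, T, LT = fn(D)
    print(f"{name}: L={L}, T={T}, LT={LT}")
```

For $D = 10$ this yields $LT = 25600$ for BH, $262144$ for BR and $1048576$ for FM, which agrees with the observation that BH offers the best trade-off at this network size. Note that the sum in Eq. (2) simplifies to $T = D/2$, because $\sum_{i=0}^{D} i\,C_D^i = D\,2^{D-1}$.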
Fig. 4. VPN maintenance cost L for single layer network topology (x-axis: Log(N), number of nodes; y-axis: cost; series: BR_1, FM_1, BH_1).
Fig. 5. Average tunnel length for single layer topology (x-axis: Log(N), number of nodes; y-axis: average tunnel hops; series: BR_1, FM_1, BH_1).
Fig. 6. LT product performance for single layer topology (x-axis: Log(N), number of nodes; y-axis: LT; series: BR_1, FM_1, BH_1).
Figs. 7 and 8 show the $LT$ performance of the two-layer topologies for different values of the traffic localization factor $R$. From these figures, we draw the following conclusions:
Fig. 7. LT product performance for two layer topology ($R = 0.9$) (x-axis: Log(N), number of network nodes; y-axis: LT; series: BH_BH_2, BH_BR_2, BH_FM_2).
Fig. 8. LT product performance for two layer topology ($R = 0.5$) (x-axis: Log(N), number of network nodes; y-axis: LT; series: BH_BH_2, BH_BR_2, BH_FM_2).
(1) The $LT$ performance of $\langle BH, BH, 2 \rangle$ is consistently better than that of the other topology configurations for different $R$. (2) The value of $LT$ increases consistently with network size for all topology candidates. The reason is straightforward: a larger network size implies more tunnels and a longer average tunnel path over all source/destination pairs in the system.
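The two-layer expressions (7)–(12) can be evaluated in the same way. The sketch below is a hedged illustration: the function name `two_layer` and the sample values $D = 10$, $d = 5$, $R = 0.9$ are assumptions chosen for the example, and the first-layer cost is written as $2^{D-d}\,d\,2^{d-1}$ (which equals $d\,2^{D-1}$) for all three configurations.

```python
def two_layer(D, d, R, upper):
    """Return (L, T, L*T) for a two-layer topology with BH clusters.

    D, d  : the VPN has 2**D nodes, each cluster has 2**d nodes (D > d).
    R     : traffic localization factor.
    upper : topology of the second layer, one of "BH", "BR", "FM".
    """
    n = D - d                               # second layer has 2**n leaders
    D1 = d / 2                              # average tunnel length inside a BH cluster
    lower_L = 2 ** n * d * 2 ** (d - 1)     # tunnels inside all clusters
    if upper == "BH":
        upper_L = n * 2 ** (n - 1)
        D2 = n * 2 ** (n - 1) / (2 ** n - 1)
    elif upper == "BR":
        upper_L = 2 ** n
        D2 = 2 ** (n - 2)
    elif upper == "FM":
        upper_L = 2 ** (2 * n)
        D2 = 1
    else:
        raise ValueError("upper must be 'BH', 'BR' or 'FM'")
    L = lower_L + upper_L
    T = R * D1 + (1 - R) * (2 * D1 + D2)
    return L, T, L * T


for upper in ("BH", "BR", "FM"):
    L, T, LT = two_layer(10, 5, 0.9, upper)
    print(f"<BH, {upper}, 2>: L={L}, T={T:.3f}, LT={LT:.1f}")
```

With these sample values the $LT$ product is smallest for $\langle BH, BH, 2 \rangle$, followed by $\langle BH, BR, 2 \rangle$ and $\langle BH, FM, 2 \rangle$, in line with conclusion (1).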
4. VPN bandwidth allocation problem
In the last section, we discussed the scalable VPN architecture from the point of view of the VPN overlay topology. To obtain guaranteed QoS, VPN users need to request network resources such as network bandwidth and CPU resources. In this section, we study the VPN bandwidth allocation problem, which is the most important issue in VPN resource management. We assume that the traffic is CBR (constant bit rate) for simplicity. In general, traffic engineering results such as service curve theory [24] and statistical QoS [25] can be adopted to support VBR (variable bit rate) traffic efficiently.
4.1. Problem definition
We define the physical network as a topology $G = (E, V)$, where $E$ denotes the network link set with size $e$, and $V$ denotes the vertex set with size $v$ (the set of VPN PE/CE devices in the VPN network). Edge $uv$ from node $u$ to node $v$ in set $E$ has link bandwidth $B_{uv}$. Let $G' = (E', V')$ denote a VPN overlay topology on top of network $G$, where $V'$ is a subset of $V$ and $E'$ is the virtual tunnel set constructed from elements of $E$. The vertices in set $V'$ fall into three categories: VPN source, VPN destination, and VPN core. All vertices in $V'$ need to maintain the VPN routing table. In this paper, we consider only network bandwidth as the VPN QoS requirement. Thus, the VPN traffic requirements can be described by a traffic matrix defined over VPN source and destination pairs, i.e., the VPN bandwidth request from node $i$ to node $j$ is defined as $f_{ij}$. The quantity $f_{ij}(uv)$ represents the amount of VPN traffic from node $i$ to node $j$ on edge $uv$. We define the total traffic flow on edge $uv$ to be $f(uv) = \sum_{ij} f_{ij}(uv)$. The VPN bandwidth demand is feasible if $f(uv) \le B_{uv}$ for all edges $uv$, where $B_{uv}$ is the physical link bandwidth of link $uv$.
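The feasibility condition can be checked directly from a per-edge flow assignment. The following sketch is an illustration under stated assumptions: the data layout (dictionaries keyed by source/destination pair and by edge), the function names and the sample numbers are all hypothetical.

```python
def edge_loads(flows):
    """Compute f(uv), the sum over all pairs (i, j) of f_ij(uv), for every edge uv.

    flows: dict mapping a pair (i, j) to a dict {edge (u, v): bandwidth f_ij(uv)}.
    """
    loads = {}
    for per_edge in flows.values():
        for edge, amount in per_edge.items():
            loads[edge] = loads.get(edge, 0) + amount
    return loads


def is_feasible(flows, capacity):
    """Return True if f(uv) <= B_uv on every edge that carries traffic.

    capacity: dict mapping edge (u, v) to its link bandwidth B_uv.
    An edge missing from `capacity` is treated as having zero bandwidth.
    """
    return all(load <= capacity.get(edge, 0)
               for edge, load in edge_loads(flows).items())


# Hypothetical example: two VPN requests share the link (A, C).
capacity = {("1", "A"): 10, ("A", "C"): 10, ("3", "A"): 10}
flows = {
    ("1", "C"): {("1", "A"): 6, ("A", "C"): 6},
    ("3", "C"): {("3", "A"): 5, ("A", "C"): 5},
}
print(edge_loads(flows)[("A", "C")])   # 11
print(is_feasible(flows, capacity))    # False: 11 > 10 on link (A, C)
```

In this example each request fits on its own, but together they exceed the bandwidth of the shared link, so the demand is not feasible with this routing.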
Our objective is to satisfy as many VPN requests as possible subject to the link bandwidth constraints (remark: statistical multiplexing can play an important role in selecting the best overlay topology to utilize network bandwidth efficiently). We discuss different approaches to this problem in the next section.
4.2. Algorithms
4.2.1. Optimal and near optimal solutions
Linear programming is a theoretically optimal algorithm for this problem. Consider a network with $m$ PE/CE devices and $n$ source/destination pairs. We have a total of $\sum_{k=0}^{m-2} (m-2)!/(m-2-k)!$ possible paths, which denote all possible virtual tunnels in the network. To solve this linear programming problem, we must introduce $\sum_{k=0}^{m-2} m!/(m-2-k)!$ variables. As the size of the VPN network increases, the total number of variables grows rapidly and exceeds the available computation capability. The concurrent multi-commodity flow problem involves simultaneously shipping several different commodities from their sources to their destinations in a single network so that the total amount of flow going through each edge is no more than its capacity. An instance of our problem can therefore be formulated as an instance of the concurrent multi-commodity flow problem. We use a well-known fast approximation algorithm called LMPSTT [26]. The basic idea of this algorithm (shown in Appendix A) is: (1) Find an initial solution by using the max-flow algorithm. (2) Starting from the initial solution, reroute an appropriately chosen fraction of the 'bad' commodities onto the edges of a minimal cost flow to generate a solution closer to the optimum. However, both the optimal and near optimal solutions are time-consuming and only appropriate for centralized control. This approach requires a centralized agency to perform the VPN bandwidth allocation for the entire system. The main advantage of this scheme is its simplicity and ease of implementation. However, it is vulnerable to a single point of failure: if the centralized agency goes down, the whole system will crash. Another problem is scalability. In a large network, the centralized agency can become the bottleneck, for example by collecting inaccurate global network information because of network delay. Thus, while this mechanism may be sufficient for small networks, it does not scale adequately to large VPN systems. VPN bandwidth allocation can also be dynamic, as each VPN member can join or leave the network dynamically. When a new user joins the VPN, it triggers the VPN bandwidth allocation.
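To see how quickly the linear program grows, the two sums can be evaluated for small $m$. The sketch below is an illustration only; it assumes the usual reading of these sums, namely that the first counts the loop-free paths between one source/destination pair in a fully connected network of $m$ nodes (with $k$ intermediate nodes), and that the second multiplies this by the $m(m-1)$ ordered pairs. The function names are hypothetical.

```python
from math import factorial


def paths_per_pair(m):
    """Sum over k = 0..m-2 of (m-2)! / (m-2-k)!."""
    return sum(factorial(m - 2) // factorial(m - 2 - k) for k in range(m - 1))


def lp_variables(m):
    """Sum over k = 0..m-2 of m! / (m-2-k)!."""
    return sum(factorial(m) // factorial(m - 2 - k) for k in range(m - 1))


for m in (4, 6, 8, 10, 12):
    print(m, paths_per_pair(m), lp_variables(m))
```

Already for $m = 4$ there are 5 candidate paths per pair and 60 variables, and the counts grow factorially from there, which is why the exact formulation becomes impractical for large VPNs.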
The centralized approach also does not scale for dynamic VPN bandwidth allocation, because it requires a centralized agent to monitor run-time network bandwidth information continuously, at a high management overhead.
4.2.2. Distributed approach
To overcome the scalability problem of the centralized solution, we design and analyze a number of heuristic algorithms based on a distributed approach. In this section, we propose a distributed protocol to solve this problem, which adopts the parallel probing scheme [27,28]. The basic idea of our distributed protocol is as follows: when a source tries to set up a VPN tunnel with a destination, the source starts a route discovery by flooding a requesting message that contains a request id. While the requesting message is flooded, each node adds its own address and available bandwidth to a path table within the requesting message. To eliminate duplicate copies of the requesting message, a node that receives it discards the request if it finds its own address already recorded in the route or if it has already propagated an earlier copy with the same request id. When the destination receives a requesting message, it has several choices: it may simply reverse the recorded route to perform the tunnel reservation, or it may set a timer, wait for more route requesting messages containing other route information for the same request, and select a 'better' route among the available routes to reserve the tunnel. For each VPN$[i]$, there is a traffic matrix from a set of source nodes to a set of destination nodes. We assign a priority ID to each VPN, with $N$ the highest priority and 0 the lowest. Within the traffic matrix of one VPN, we can also assign a priority to each pair of source and destination nodes, i.e., the pair of nodes requiring more bandwidth has higher priority. Therefore, VPN$[i][j]$ denotes the $j$th virtual tunnel of the $i$th VPN, with the following properties:
if $i_1 \ge i_2$, then the priority of VPN$[i_1][\cdot] \ge$ VPN$[i_2][\cdot]$;
if $i_1 \le i_2$, then the priority of VPN$[i_1][\cdot] \le$ VPN$[i_2][\cdot]$;
if $i_1 = i_2$ and $j_1 \ge j_2$, then the priority of VPN$[i_1][j_1] \ge$ VPN$[i_2][j_2]$;
if $i_1 = i_2$ and $j_1 \le j_2$, then the priority of VPN$[i_1][j_1] \le$ VPN$[i_2][j_2]$.
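The priority rules above amount to a lexicographic order on the index pair $(i, j)$. A minimal sketch, assuming each pending request is represented simply as a tuple of VPN index and tunnel index (a representation chosen only for this illustration):

```python
def priority_key(request):
    """Order requests first by VPN index i, then by tunnel index j."""
    vpn_index, tunnel_index = request
    return (vpn_index, tunnel_index)


# Hypothetical requests VPN[i][j] that arrive at a router in the same time slot.
pending = [(1, 2), (3, 0), (1, 5), (2, 1)]

# Highest priority first: larger i wins, ties are broken by larger j.
ordered = sorted(pending, key=priority_key, reverse=True)
print(ordered)   # [(3, 0), (2, 1), (1, 5), (1, 2)]
```

A router that receives several requests at the same time would then consider them for bandwidth reservation in this order.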
Before describing the protocol, we list several assumptions: (1) Once a static traffic bandwidth requirement has been determined, it is assigned a priority ID, which denotes the priority of the VPN. (2) Any node pair can send probe messages for its traffic request into the network at any time; therefore, a router node in the network can receive several request messages in a given time slot. (3) If several request messages arrive at a router at the same time, the message with the highest priority is considered first for bandwidth reservation. (4) When the node that sent a tunnel request receives a failure message or times out, it starts a new timer and retries. The proposed protocol for setting up VPN tunnels consists of three phases: (1) probing phase; (2) route selection phase; (3) acknowledgement phase. The probing phase routes the message to the destination by running the distributed routing algorithm at each network node, the route selection phase runs one of the route selection algorithms at the destination node, and the acknowledgement phase performs the resource reservation.
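The probing and route selection phases can be illustrated with a small single-process simulation. This is a sketch under simplifying assumptions, not the protocol itself: message delivery is modeled by a FIFO queue, links are treated as bidirectional, the forward condition is taken to be "available bandwidth is at least the demand", and the function name, data layout and sample network are hypothetical.

```python
from collections import deque


def flood_probes(links, source, dest, demand):
    """Simulate probe flooding and return the probes that reach dest.

    links: dict mapping (u, v) to the available bandwidth of that link.
    Returns a list of (path, bottleneck bandwidth), in arrival order.
    Assumes source and dest are different nodes.
    """
    adj = {}
    for (u, v), bw in links.items():
        adj.setdefault(u, {})[v] = bw
        adj.setdefault(v, {})[u] = bw

    forwarded = set()                    # nodes that already propagated a copy
    arrived = []
    queue = deque([([source], [])])      # each probe carries Path and B
    while queue:
        path, bandwidths = queue.popleft()
        node = path[-1]
        if node == dest:
            arrived.append((path, min(bandwidths)))
            continue
        if node in forwarded:
            continue                     # duplicate copy of the same request: discard
        forwarded.add(node)
        for nxt, bw in adj.get(node, {}).items():
            if nxt in path or bw < demand:
                continue                 # loop, or forward condition not met
            queue.append((path + [nxt], bandwidths + [bw]))
    return arrived


links = {("S", "A"): 10, ("A", "T"): 4, ("S", "B"): 8, ("B", "C"): 8, ("C", "T"): 8}
probes = flood_probes(links, "S", "T", demand=3)

shortest = min(probes, key=lambda p: len(p[0]))   # fewest hops
widest = max(probes, key=lambda p: p[1])          # largest bottleneck bandwidth
print(shortest)   # (['S', 'A', 'T'], 4)
print(widest)     # (['S', 'B', 'C', 'T'], 8)
```

In this example the destination receives two probes and the two selection rules disagree: the shortest route has a bottleneck of 4, while the longer route offers a bottleneck of 8. This is the kind of trade-off that the route selection phase has to resolve.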
The detailed description of the protocol uses the following variable definitions:
cid[x, y]: the yth tunnel in the xth VPN;
S, T: source and destination node;
K: the node that sends the probe information;
B(I, J): the current available link bandwidth from node I to node J;
I: the network node with ID I ($1 \le I \le N$, where N is the number of network nodes in the network);
R(I, t): the routing table entry of node I indexed by t, which records the neighbor nodes of node I;
Path: records the sequence of nodes along the path from the source node to the current node;
Probe[K, QoS, S, T, cid[x, y], Path, B]: node K's probe message carrying the QoS requirement of the yth tunnel of the xth VPN from source node S to destination node T; Path records the sequence of nodes along the path and B records the sequence of available link bandwidths along the path;
L: counter defining the maximum number of routes that the destination waits for before starting the reserve phase.
VPN Setup Protocol: for the I-th network node

Begin
  While true do
    Block until one or more messages are received in the given interval
    Switch (received message)

    Case 1: Probe[K, QoS, S, T, cid[x,y], Path, B]          // probe phase
      If I ≠ T and I has not forwarded a previously received probe message (I is not in Path) then
        For every neighbor J in R(I, t) with J ≠ K do
          If forward condition (I, J, QoS) is true then
            Add node I to Path and the available link bandwidth B(I, J) to B
            Send J a Probe[I, QoS, S, T, cid[x,y], Path, B] after a random delay
          Endif
        Endfor
      // path selection phase
      Else if I = T and the message is a probe[cid] received by T then
        If the DP_SP(N) algorithm is configured then
          Wait until the N-th probe[cid] message is received
          Choose the route with the minimal hop length based on the probe[Path] information
          Reverse the sequence of nodes recorded in Path
          Send ack[I, QoS, S, T, cid, Path] to the next hop recorded in probe[Path]
        Endif
        If the DP_WP(N) algorithm is configured then
          Wait until the N-th probe[cid] message is received
          Choose the path with the maximal bandwidth based on the probe[Path] information
          Reverse the sequence of nodes recorded in Path
          Send ack[I, QoS, S, T, cid, Path] to the next hop recorded in probe[Path]
        Endif
        If the DP_SW(L) algorithm is configured then
          Wait until the L-th probe[cid] message is received
          Choose the path with the maximal weight based on the probe[Path] information
          Reverse the sequence of nodes recorded in Path
          Send ack[I, QoS, S, T, cid, Path] to the next hop recorded in probe[Path]
        Endif
      Else
        Discard the probe
      Endif

    Case 2: ack[K, QoS, S, T, cid[x,y], Path]                // acknowledge phase
      // Based on the received message set, node I reserves bandwidth according to
      // the priority of cid[x,y] and the current link bandwidth capacity constraints.
      If link (I, K) has enough resources for connection cid[x,y] then
        Reserve resources for connection cid[x,y] by updating B(I, K)
        If I ≠ S then
          Delete the current node I from Path
          Send ack[I, QoS, S, T, cid[x,y], Path] to the next hop recorded in Path
        Else
          // the connection cid[x,y] has been successfully established
        Endif
      Else
        // connection cid[x,y] is rejected due to the failure of resource reservation
        Get the next hop K from Path
        Send failure[I, QoS, S, T, cid[x,y], Path] to K
      Endif

    Case 3: failure[K, QoS, S, T, cid[x,y], Path]
      If I ≠ T then
        Get the next hop K from Path
        Forward failure[I, QoS, S, T, cid[x,y], Path] to K
      Endif

    Endswitch
  Endwhile
End

In the above VPN setup protocol, the route selection algorithm performed by the destination node is the key factor in the overall success rate of VPN tunnel requests. We propose the following heuristic algorithms: (1) ⟨DP_SP, N⟩: The destination waits for N probe messages for the same request and selects the route with the shortest hop length among all N potential paths. (2) ⟨DP_WP, N⟩: The destination waits for N
probe messages for the same request and selects the route with the widest bandwidth among all N potential paths, that is, the route P_i that attains max(min(P_i)), where min(P_i) is the minimum available link bandwidth along route P_i. (3) ⟨DP_SW, N⟩: The destination waits for N probe messages for the same request and chooses the route P_i with the maximal weight W_i, given by the following function:

$$W_i = \max \left| (1 - a)\,\min(P_i) - a\,b\,H_i \right| \qquad (13)$$
where b is a constant, H_i denotes the number of hops of path P_i, and a is the weight used to adjust the trade-off between network bandwidth and path length.

4.2.3. Protocol analysis

The VPN setup protocol described in the last section has the following properties:

Theorem 1. If the tunnel setup protocol establishes a tunnel, the route of the tunnel must be loop free.

Proof. Let P be the set of routes received by destination T, and let P_i ∈ P be the route selected as the best route by our proposed heuristic algorithms. By the construction of the algorithms, the path established for the tunnel must be P_i. If P_i had a loop, there would be a node i ≠ T on P_i that received and forwarded the probe carrying P_i twice. This contradicts the construction of the algorithm, in which a node forwards a probe at most once. □

Theorem 2. If the high priority tunnel cannot be set up, the low priority tunnel for the same source/destination pair requested in the same time slot cannot be set up.

Proof. Suppose VPN[i_1][j_1] cannot set up a tunnel that satisfies its bandwidth requirement. In our algorithm, each node then cannot assign a bandwidth reservation to this priority, so there is no path for the low priority tunnel either. □

Theorem 3. Given a tunnel with a QoS requirement, our protocol will find a path P from source S to destination T for the highest priority VPN, if such a path exists in a given interval.

Proof. Let P be a path S → M → ... → T whose bandwidth is larger than the QoS requirement. Every node on P receives at least one probe from the highest priority VPN, because the upstream node on each link successfully forwards the tunnel request. When the tunnel request arrives, a probe is sent to S to initiate the distributed tunnel setup protocol, so S receives at least one probe. Let (I, J) be a link in P; if I receives a probe, then J must also receive a probe. This is because the bandwidth of (I, J) is larger than the QoS requirement and the priority of the request is the highest. Thus, the forwarding condition is satisfied, which means that when I receives its first probe, it forwards the probe message to J. Therefore, T will eventually receive a probe and an available path P can be found. □
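The probe phase and the three route selection heuristics can be made concrete with a small simulation, which also illustrates Theorems 1 and 3 on a toy network. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `probe_phase` and `select_route` are hypothetical, a FIFO queue stands in for message delays (the random delay is ignored), the forward condition is assumed to be B(I, J) ≥ QoS, priorities are not modelled, and the default values of a and b are arbitrary.

```python
from collections import deque


def probe_phase(links, source, dest, qos, max_routes):
    """Flood probes from source; return up to max_routes candidates seen at dest.

    links maps an undirected link (i, j) to its available bandwidth B(i, j).
    Each candidate is (path, bandwidths), mirroring the Path and B probe fields.
    """
    neighbors = {}
    for i, j in links:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)

    def bw(i, j):
        return links.get((i, j), links.get((j, i), 0.0))

    forwarded = set()                        # nodes that already forwarded a probe
    queue = deque([(source, None, (), ())])  # (node, sender, Path, B)
    candidates = []
    while queue and len(candidates) < max_routes:
        node, sender, path, bws = queue.popleft()
        if node == dest:
            candidates.append((path + (dest,), bws))
            continue
        if node in forwarded:
            continue                         # a node forwards at most one probe
        forwarded.add(node)
        for nxt in sorted(neighbors.get(node, ())):
            if nxt != sender and bw(node, nxt) >= qos:  # assumed forward condition
                queue.append((nxt, node, path + (node,), bws + (bw(node, nxt),)))
    return candidates


def select_route(candidates, algorithm, a=0.5, b=1.0):
    """Pick one candidate using DP_SP, DP_WP, or DP_SW (weight as in Eq. (13))."""
    if algorithm == "DP_SP":   # fewest hops
        return min(candidates, key=lambda c: len(c[1]))
    if algorithm == "DP_WP":   # widest bottleneck bandwidth
        return max(candidates, key=lambda c: min(c[1]))
    if algorithm == "DP_SW":   # weighted combination of bandwidth and hop count
        return max(candidates,
                   key=lambda c: abs((1 - a) * min(c[1]) - a * b * len(c[1])))
    raise ValueError(f"unknown algorithm: {algorithm}")


# Toy network: a short narrow route 1-2-5 and a longer, wider route 1-3-4-5.
links = {(1, 2): 100.0, (2, 5): 10.0, (1, 3): 50.0, (3, 4): 50.0, (4, 5): 50.0}
routes = probe_phase(links, source=1, dest=5, qos=5.0, max_routes=2)
```

On this toy network, DP_SP picks the two-hop route, while DP_WP and DP_SW (with the assumed a and b) pick the wider three-hop route. Because each node forwards at most one probe, no collected path repeats a node, which is the argument of Theorem 1.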
Theorem 4. The time complexity is O(2l) units of time, where l is the length of the tentative path. The total number of messages for one VPN is O(n(e + 2l)), where n is the number of tunnels in the VPN.

Proof. The above algorithm takes a single message round-trip time to establish a connection. Assume that, under normal conditions, a message takes at most one unit of time to traverse one link, including the buffering and processing time at nodes. Then the time complexity is O(2l) units of time, where l is the length of the tentative path. The algorithm sends at most one probe per link in the subnet consisting of all paths from the source to the destination. The total number of probes sent is thus bounded by e, where e is the total number of links in the network. There is at most one ack and one failure message for each link on the tentative path. The total number of ack and failure messages is thus bounded by 2l. Hence, the message complexity (number of control messages) of the algorithm is O(e + 2l) for a single tunnel. Therefore, the total number of messages for one VPN is O(n(e + 2l)). □

4.3. Discussion: run-time resource multiplexing

The VPN design problem mentioned above is based on the static traffic matrix, which is used at VPN setup time. At run time, several mechanisms can be adopted to utilize the network resources more efficiently:

(1) Link tunnel aggregation: some virtual tunnels can perform traffic aggregation according to the VPN topology. For example, when a PE connects several CEs with different tunnels, it can aggregate these tunnels into one tunnel, and high traffic utilization can be achieved by applying statistical traffic modelling theory.
(2) Intra-VPN network resource multiplexing: within the same VPN, some virtual tunnels may be lightly loaded and others overloaded, depending on the dynamic run-time customer traffic. The traffic on the overloaded routes can be rerouted to the lightly loaded routes, so that a higher overall user QoS admission probability can be achieved for the single VPN. This scheme is called intra-VPN resource multiplexing.

(3) Inter-VPN network resource multiplexing: among different VPNs, some may be lightly loaded and others overloaded. The overloaded VPNs can borrow bandwidth from the lightly loaded VPNs, so that higher overall network resource utilization can also be achieved. This scheme is called inter-VPN resource multiplexing.

For intra-VPN multiplexing, each VPN node needs to collect more information within the VPN. For example, to make rerouting feasible, a node needs to know the current bandwidth utilization of the tunnels connected to other VPN nodes. For inter-VPN multiplexing, the nodes in a given VPN need to keep some information about other VPNs, so that the traffic load can also be balanced across VPNs. Both schemes require a VPN tunnel rerouting protocol, which can be obtained by extending the distributed protocol discussed in the last section.
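To illustrate the intra-VPN multiplexing idea, the sketch below shifts excess load from overloaded tunnels to tunnels with spare reserved bandwidth. It is only a toy model under strong assumptions that the text does not specify: the tunnels are treated as interchangeable alternatives within one VPN, the node already knows every tunnel's reservation and load, and a greedy "most spare capacity first" rule is used. The function name `rebalance` and the data layout are hypothetical.

```python
def rebalance(tunnels):
    """Greedy intra-VPN rebalancing sketch.

    tunnels maps a tunnel name to (reserved_bandwidth, current_load).
    Returns (new_loads, unplaced), where unplaced is the excess load that no
    lightly loaded tunnel could absorb.
    """
    loads = {name: load for name, (_, load) in tunnels.items()}
    # Spare capacity of each lightly loaded tunnel.
    spare = {name: res - load for name, (res, load) in tunnels.items() if load < res}
    unplaced = 0.0
    for name, (res, load) in tunnels.items():
        excess = load - res
        if excess <= 0:
            continue                      # this tunnel is not overloaded
        loads[name] = res                 # keep only what the reservation allows
        for other in sorted(spare, key=spare.get, reverse=True):
            if excess <= 0:
                break
            moved = min(excess, spare[other])
            loads[other] += moved
            spare[other] -= moved
            excess -= moved
        unplaced += excess
    return loads, unplaced


# t1 exceeds its reservation by 4 units; t2 has 7 spare units and absorbs them.
new_loads, unplaced = rebalance({"t1": (10, 14), "t2": (10, 3), "t3": (10, 9)})
```

Any load left in `unplaced` corresponds to traffic that intra-VPN rerouting alone cannot accommodate; in the terms of the discussion above, that is where borrowing bandwidth from another VPN (inter-VPN multiplexing) would come into play.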
5. Experimental evaluation

In this section, we evaluate the performance of a system that uses our proposed distributed tunnel setup protocol with different route selection algorithms. We first describe the experimental model and then report the performance results.

5.1. Experimental model

• Network. The network considered in our experiments is the MCI ISP backbone network. Fig. 9 shows the topology of the network. There are 19 nodes interconnected by links. The bandwidth capacity of each link is assumed to be 100 Mbps. Each node is assumed to be a VPN PE/CE router that connects a customer network.
• Traffic model. We assume that VPN requests form a Poisson process. We also assume that each request statically requires 1 Mbps of bandwidth. The source and destination of each VPN request are chosen randomly. In the following simulation, we generate two VPNs: one with higher priority and one with lower priority.
• Evaluation systems. A 2-tuple ⟨A, N⟩ is used to represent the evaluation systems, which run our distributed tunnel setup protocol with different route selection algorithms. In particular, A represents the route selection algorithm used at the destination, where A ∈ {DP_SP, DP_WP, DP_SW}, and N indicates the maximum number of routes for which the destination waits before starting the tunnel reservation phase for the same request. For example, ⟨DP_SP, 2⟩ represents a system in which the SP (shortest path) algorithm is used and the destination waits for two routes before starting the reservation. In other words, of the two routes received by the destination for the same tunnel request, the one with the smaller hop count is selected.
• Baseline systems. For comparison, we also consider two baseline systems. (1) SSP (static shortest path) system: in this baseline system, the route selection always selects the shortest path between the source and destination pair for each VPN tunnel request.
Note that this system differs from DP_SP in that the route of SSP is statically calculated from the topology information. It is easy to see that network congestion is more likely to occur in this system. We expect the systems running the distributed protocol to outperform SSP. (2) Max flow system (MFS): In this baseline system, the tunnel
Fig. 9. Network simulation topology.
setup procedure is assumed to have perfect global dynamic information (GDI) on the network status. This system can run optimal algorithms, such as maximal concurrent flow (shown in Appendix A). The performance of this system therefore serves as the ideal reference.
• Performance metrics. (1) Admission probability (AP): defined as the probability that VPN requests are admitted into the system. The higher the AP value, the better the performance. (2) The counter for destination waits (N): recall that in our system, the destination waits until the Nth route for the same request has been received. N thus determines how much dynamic information the destination collects from the network.

5.2. Experimental results

In this subsection, we report the performance results along with our observations. Due to space limitations, we present only a limited number of cases here. However, we find that the conclusions we draw generally hold for the many other cases we have evaluated. Figs. 10 and 11 show the AP performance versus the VPN tunnel request rate (per second). These figures show how sensitive AP is to the following factors: the arrival rate of VPN tunnel requests, the counter N for which the destination waits, and the choice of algorithm. From the figures, we make the following observations.

• AP is sensitive to the maximum counter N at the destination. The AP value increases as N increases. The reason is that a large N gives the destination a large number of routes to choose from. Consequently, the system has a better chance of selecting a better route with more network information.
• Furthermore, the improvement in AP is significant when N increases from 1 to 3, but increasing N further does not yield much improvement.
• As expected, system MFS outperforms all other systems, and the AP value of the SSP system is the worst in almost all cases of VPN request rates. In the case of very low
Fig. 10. AP performance for the DP_SP and DP_SW algorithms. (Plot of admission probability (AP) versus VPN tunnel request rate; curves: SSP, DP_SP_1, DP_SP_2, DP_SP_3, DP_SP_4, DP_SW_4, MFS.)
Fig. 11. AP performance for the DP_WP and DP_SW algorithms. (Plot of admission probability (AP) versus VPN tunnel request rate; curves: SSP, DP_WP_1, DP_WP_2, DP_WP_3, DP_WP_4, DP_SW_4, MFS.)
VPN request rates, all systems perform almost equally. MFS is the best because it uses perfect global status information to achieve the best solution. The relatively poor performance of SSP supports our argument that SSP does not properly distribute VPN requests in the network.
• The AP values of all three systems (DP_SP, DP_WP, DP_SW) are better than that of SSP and worse than that of MFS. This implies that our distributed tunnel setup protocol works well by properly using the available system knowledge.
• The AP of the DP_SW system is better than that of both DP_SP and DP_WP. The reason is that it combines the advantages of both the DP_SP and DP_WP algorithms; thus, it achieves a better trade-off between network load and route selection. DP_SP and DP_WP are two different heuristic schemes that consider different aspects and achieve similar performance.

6. Conclusion

We have studied a scalable VPN architecture and presented a distributed scheme for VPN network bandwidth management. To the best of our knowledge, this is the first study that addresses these issues. In this study, we focus on designing the scalable VPN architecture and applying the distributed scheme to manage the VPN network bandwidth resource. Specifically, our contributions include the following: (1) We present a scalable VPN architecture. The cost/efficiency of various network topologies has been analyzed and evaluated in terms of VPN management cost and end-to-end message delay, measured by the average virtual tunnel length over all VPN source/destination pairs. (2) We take a distributed approach to solving the VPN bandwidth allocation problem. Several heuristic algorithms for selecting the 'better' route are proposed. We conduct extensive performance evaluations on different architectures and algorithms. The evaluation shows that, in terms of admission probabilities, the distributed VPN setup
protocol based on local status information and partial network information performs well and comes close to the centralized optimal solution, which has single-point-of-failure and scalability issues in large system deployments. Due to space limitations, we discuss the scalability issue only from the network resource management perspective. More work remains to be done, such as analyzing DDoS (Distributed Denial of Service) attacks on the VPN infrastructure and analyzing the performance of VPN routing protocols under a high rate of VPN membership changes.
Appendix A. Revised LMPSTT algorithms [26]

The LMPSTT algorithm (a fast approximation algorithm for multi-commodity flow problems) is a relaxed decision procedure for multi-commodity flow feasibility. Given a multi-commodity flow problem, it can answer whether the problem is feasible and, if so, give a feasible flow for the problem in which every capacity is increased by a factor of 1 + ε.

Definition 1. A multi-commodity flow f and a length function l are ε-optimal if

$$\lambda \le (1 + \varepsilon)\,\frac{\sum_{i=1}^{k} C_i(\lambda)}{\sum_{vw \in E} l(vw)\,u(vw)} \qquad (A.1)$$

where

$$\lambda = \max_{vw \in E} \lambda(vw) \qquad (A.2)$$

and u(vw) is the capacity of edge vw. For a commodity i, let C_i(λ) be the value of a minimum-cost flow f_i satisfying the demands of commodity i, subject to length l and capacities λu(vw); i.e., f_i is a flow that satisfies |f_i(vw)| ≤ λu(vw) and minimizes the cost $C_i(\lambda) = \sum_{vw} |f_i(vw)|\,l(vw)$.

Definition 2. Let f be a multi-commodity flow satisfying capacities λu(vw) and length function l. A commodity i is ε-good if

$$C_i - C_i(\lambda) \le \varepsilon\,C_i + \varepsilon\,\frac{\lambda}{k} \sum_{vw \in E} l(vw)\,u(vw) \qquad (A.3)$$
otherwise, the commodity is ε-bad. In the revised LMPSTT algorithm (Fig. 12), we first find an initial solution by separately routing all the payload traffic using max-flow algorithms, and then, starting from the initial solution, call the DECONGEST(f, ε) procedure to reroute the traffic and produce a new solution that is closer to the optimum. The basic idea of DECONGEST(f, ε) is to reroute an appropriately chosen fraction of the flow of an ε-bad commodity onto the edges of a minimum-cost flow associated with this commodity, in order to reduce congestion.
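The two conditions in Definitions 1 and 2 are simple arithmetic checks once the costs are known, as the sketch below shows. It assumes condition (A.1) reads λ ≤ (1 + ε) Σ_i C_i(λ) / Σ_vw l(vw)u(vw) and condition (A.3) reads C_i − C_i(λ) ≤ ε C_i + ε (λ/k) Σ_vw l(vw)u(vw), with C_i the cost of the current flow of commodity i. The function names are hypothetical, and the minimum-cost values C_i(λ) are taken as given inputs rather than computed by a min-cost flow solver.

```python
def weighted_capacity(length, capacity):
    """Sum over edges vw of l(vw) * u(vw)."""
    return sum(length[e] * capacity[e] for e in capacity)


def is_eps_optimal(lam, min_costs, length, capacity, eps):
    """Condition (A.1): lam <= (1 + eps) * sum_i C_i(lam) / sum_vw l(vw) u(vw)."""
    return lam <= (1 + eps) * sum(min_costs) / weighted_capacity(length, capacity)


def bad_commodities(lam, costs, min_costs, length, capacity, eps):
    """Indices of commodities that violate condition (A.3), i.e. the eps-bad ones."""
    k = len(costs)
    slack = eps * (lam / k) * weighted_capacity(length, capacity)
    return [i for i, (c, c_min) in enumerate(zip(costs, min_costs))
            if c - c_min > eps * c + slack]


# Two edges and two commodities with made-up numbers, for illustration only.
length = {("v", "w"): 1.0, ("w", "x"): 2.0}
capacity = {("v", "w"): 10.0, ("w", "x"): 5.0}
costs = [10.0, 8.0]        # C_i: cost of the current flow of commodity i
min_costs = [9.5, 4.0]     # C_i(lambda): assumed minimum-cost values
bad = bad_commodities(1.0, costs, min_costs, length, capacity, eps=0.1)
```

In this made-up instance only the second commodity is far from its minimum-cost routing, so it is the one a DECONGEST-style step would select for rerouting.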
Fig. 12. DECONGEST procedure for revised LMPSTT algorithm.
References

[1] J. Clercq, O. Paridaens, An architecture for provider provisioned CE-based Virtual Private Networks, Internet draft, draft-ietf-ppvpn-ce-based-02.txt, 2002, work in progress.
[2] A. Nagarajan, Generic requirements for provider provisioned VPN, Internet draft, draft-ietf-ppvpngeneric-reqts-00.txt, 2002, work in progress.
[3] C.F. Su, G.D. Veciana, Statistical multiplexing and mix-dependent alternative routing in multiservice VP networks, IEEE Transactions on Networking 8 (1) (2000).
[4] D. Medhi, Multi-hour, multi-traffic class network design for virtual path-based dynamically reconfigurable wide-area ATM networks, IEEE Transactions on Networking 3 (6) (1995).
[5] A.A. Lazar, A. Orda, D.E. Pendarakis, Virtual path bandwidth allocation in multiuser networks, IEEE Transactions on Networking 5 (6) (1997).
[6] V.J. Friesen, J.J. Harms, Resource management with virtual paths in ATM networks, IEEE Network (Sep.) (1996).
[7] M. Suzuki, J. Sumimoto, A framework for network-based VPNs, Internet draft, draft-suzuki-nbvpnframework-00.txt, 2001, work in progress.
[8] B. Gleeson, A framework for IP based Virtual Private Networks, Internet RFC-2764.
[9] E. Rosen, BGP/MPLS VPNs, Internet RFC-2547.
[10] Cisco Systems Inc., TAG VPN Architecture, www.cisco.com.
[11] S. Kent, R. Atkinson, Security architecture for the Internet protocol, Internet RFC-2401, November 1998.
[12] J. Touch, S. Hotz, X-bone, IEEE Globecom 98, Sydney, Australia, 1998.
[13] R. Isaacs, I. Leslie, Support for resource-assured and dynamic Virtual Private Networks, IEEE Journal on Selected Areas in Communications 19 (3) (2001).
[14] W. Yu et al., An integrated middleware-based solution for supporting secured dynamic-coalition applications in heterogeneous environments, IEEE Transactions on Systems, Man, and Cybernetics (accepted).
[15] D. Ferrari, A virtual network service for integrated-services Internet-works, in: Proceedings of the 7th International Workshop on Network and Operating System Support for Digital Audio and Video, St. Louis, MO, May 1997, pp. 307–311.
[16] L.K. Lim, Design and implementation of a virtual network service, Masters Thesis, Carnegie Mellon University, 1999.
[17] T. Braun, M. Günter, M. Kasumi, I. Khalil, Virtual Private Network architecture, IAM-99-001, CATI, April 1999.
[18] D.A. Villela, A.T. Campbell, J. Vicente, Virtuosity: programmable resource management for spawning networks, Computer Networks, Special Issue on Active Networks 36 (1) (2001) 49–73.
[19] N.G. Duffield, P. Goyal, A. Greenberg, A flexible model for resource management in Virtual Private Networks, ACM Sigcomm (1999).
[20] A. Kumar, R. Rastogi, A. Silberschatz, B. Yener, Algorithms for provisioning Virtual Private Networks in the hose model, IEEE Transactions on Networking 10 (3) (2002).
[21] O. Gerstel, A. Segall, Dynamic maintenance of virtual path layout, in: 4th Annual Joint Conference of the IEEE Computer and Communications Societies, 1995.
[22] I. Cidon, I. Gopal, New models and algorithms for future networks, IEEE Transactions on Information Theory 41 (3) (1995).
[23] R. Ramaswami, A. Segall, Distributed network control for wavelength routed optical networks, IEEE Transactions on Networking 5 (6) (1997) 936–944.
[24] R. Cruz, Quality of service guarantees in virtual circuit switched networks, IEEE Journal on Selected Areas in Communications 13 (6) (1995) 1048–1056.
[25] S. Chen, K. Nahrstedt, Admission control for statistical QoS: theory and practice, IEEE Network 13 (2) (1999) 20–29.
[26] T. Leong, P. Shor, C. Stein, Implementation of combinatorial multicommodity flow algorithms, in: International Conference on Integer Programming and Combinatorial Optimization, 1998.
[27] W. Yu, J.W. Lee, DSR-based energy-aware routing protocols in ad hoc network, in: IEEE International Conference on Wireless Network (ICWN), 2002.
[28] S. Chen, K. Nahrstedt, Distributed QoS routing with imprecise state information, in: Proceedings of 7th IEEE International Conference on Computer, Communications and Networks (ICCCN'98), Lafayette, LA, October 1998, pp. 614–621.