Survivable automatic hidden bypasses in Software-Defined Networks

Computer Networks 133 (2018) 73–89 Contents lists available at ScienceDirect Computer Networks journal homepage: www.elsevier.com/locate/comnet Sur...


Computer Networks 133 (2018) 73–89

Contents lists available at ScienceDirect

Computer Networks journal homepage: www.elsevier.com/locate/comnet

Survivable automatic hidden bypasses in Software-Defined Networks
Piotr Boryło∗, Jerzy Domżał, Robert Wójcik
AGH University of Science and Technology, al. Mickiewicza 30, Krakow 30–059, Poland

Article info

Article history: Received 20 February 2017; Revised 5 October 2017; Accepted 17 January 2018

Keywords: Bypass, Optical, Resilience

Abstract

The paper focuses on the problem of effective and resilient management of network resources under the condition of increasing Internet traffic. Multilayer automatic optical networks are considered as a solution usually indicated as the most appropriate for the future Internet. In this paper, we propose a Survivable Automatic Hidden Bypasses approach to enhance resource utilization and reliability in multilayer optical networks. Our solution uses the Software-Defined Networking concept to automatically create or remove hidden bypasses which are not visible at the network layer. We propose a novel survivability metric which is applied during optical bypass creation in order to handle network failures in a proactive manner. Such an approach contrasts with the most common one, where bypasses are considered as a mechanism to handle network failures. We combine the metric with different restoration schemas and investigate the proposed mechanism under three failure scenarios. The mechanism of hidden bypasses increases throughput and reduces transmission delays, while the novel proactive survivability mechanisms neutralize the negative impact of network failures. © 2018 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: [email protected] (P. Boryło), [email protected] (J. Domżał), [email protected] (R. Wójcik). https://doi.org/10.1016/j.comnet.2018.01.022 1389-1286/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

Global IP traffic is predicted to increase over threefold from 2012 to 2017, reaching 120 exabytes (10^18 bytes) per month in 2017 [1]. This enormous growth puts pressure on network operators to develop robust, reliable and scalable backbones, as well as to manage their network infrastructure efficiently and effectively. A multilayer optical network is the most popular and prominent architecture to meet modern requirements. However, despite many attempts to automate the process, optical networks are usually still managed manually. Optical paths are created by administrators based on the traffic distribution developed for peak hours. Such an approach to network dimensioning and management results in over-provisioning and resource waste during off-peak hours.

The concept of Software-Defined Networking (SDN) is becoming more and more popular in the context of telecommunication network management. SDN assumes that the control plane and the data plane are separated to simplify the management of traffic in the network. At the control plane, usually a central controller decides to which interface packets should be sent at the data plane. This concept is also used in optical networks, especially to improve the effectiveness of routing and wavelength assignment. The advantages of the SDN concept are well suited to current needs, such as on-demand network control, complete knowledge of the network topology, knowledge of network flows, an application-oriented approach (e.g. SDN support for fog and cloud interplay [2]) and the possibility to effectively implement traffic engineering mechanisms in a centralized manner (for example, energy-aware anycast strategies [3,4]).

Several approaches to multilayer optical networks already exist. Solutions may differ with regard to the selection of the egress and ingress nodes of an optical path being established in the network. The first extreme solution, known as opaque, denotes establishing lightpaths between adjacent nodes. Thus, all the data carried by the lightpaths is processed in the electric layer in each router on the path. This solution is also known as non-bypass, as none of the electric layer nodes is bypassed by the optical path. It is also the most efficient for lightly utilized networks. At the other extreme, a full optical mesh with lightpaths between each pair of nodes can be established in the network. This solution is especially recommended for highly utilized networks, which are likely to fully utilize the established optical paths. Optical bypasses are a compromise between the presented solutions. An optical bypass connects two non-adjacent network nodes, skipping optical-electrical-optical conversion in intermediate nodes. A bypass can be established on demand during peak hours, in case of network congestion or to counteract a network failure. Such dynamic bypasses are designed mainly to offload the electric layer, reduce energy consumption and improve network reliability.


Optical paths and bypasses can be broadcast to the electric layer or remain hidden, independently of the architecture. In general, instabilities may occur if the creation of optical paths is advertised to the electric layer. On the other hand, cooperation of both network layers usually results in more efficient resource utilization. Hybrid approaches are also possible and considered. Bypasses can also be classified with regard to the routing algorithms in the optical layer. A direct bypass is established between two nodes as long as there are traffic demands between them, and routing is usually based on the shortest path. This intuitive approach reduces the total number of required transponders and Erbium Doped Fibre Amplifiers (EDFAs). However, when an optical path is established, the whole wavelength is occupied even for very small demands between nodes, resulting in low network resource utilization. On the other hand, multihop bypasses allow demands between different pairs of nodes to share the capacity of a single optical channel. This approach results in more efficient optical resource utilization and a reduction in energy consumption (fewer energy-hungry router ports are needed). As a side effect, the length of electric layer paths may increase and reduce the scale of improvement.

Our approach, presented in this paper, extends the Automatic Hidden Bypasses (AHB) mechanism proposed in [5] with survivability aspects. We use the hidden bypass functionality presented in [6] and add components known from SDN. Our solution takes advantage of SDN capabilities, making the process scalable, effective and completely automated. This means that bypasses are created and torn down based on existing demands in multilayer optical networks. The network decides when and how to create a new bypass as well as which flows should use it. The analysis presented in [5] shows that AHB can provide lower delays and higher throughput. The mechanism yields excellent results in both lightly and heavily loaded networks; however, until now, survivability issues were neglected.

In this paper we focus on mechanisms for provisioning optical bypasses in a resilient manner. We assume that the network is controlled by a centralized controller, consistent with the idea of SDN and the AHB concept. We propose a novel survivability metric considered during optical bypass establishment in order to handle network failures in a proactive manner. We investigate the proposed Survivable Automatic Hidden Bypass (SAHB) mechanism under various failure scenarios: (1) occasional network failures, (2) a disaster, and (3) failure of the critical network link. To the best of our knowledge, this work is the first to analyze in depth the problem of resilient optical bypass establishment in order to improve the survivability of a multilayer network. Such an approach contrasts with the most common one, where bypasses are considered as a mechanism to handle network failures. Additionally, we combine SAHB with different restoration schemas and show that the advantages are significant. The issue is important as it regards survivability in the network architecture indicated as the most prominent solution for future networks with respect to the SDN network management concept.

The rest of this paper is organized as follows. The next section provides a description of and references to works related to resilient bypass provisioning using the SDN concept. Section 3 presents the main contribution of this paper, which is the Survivable Automatic Hidden Bypass concept. The simulation environment, scenarios and results are presented in Section 4. Finally, Section 5 concludes the paper.

2. Related work

A general survey regarding traffic management in multilayer networks utilizing optical bypass mechanisms was provided in our previous work [5]. Some of the described mechanisms allow for

setting up optical bypasses in both, hidden (not visible at the IP layer) and announced (visible at the IP layer) versions. In some cases, it is assumed that a centralized controller is used, while others use a distributed approach. All of the mechanisms were compared to the AHB mechanism proposed in [5]. In this paper, only brief summary of general solutions will be provided as we will focus on resilience mechanisms in the context of optical bypasses. The ﬁrst category of works focuses mainly on comparison between centralized and distributed approaches. In [7], the authors provide evaluation of four dynamic bypass mechanisms in a network without a centralized controller. The simulation results presented in the paper conﬁrm that the best results were obtained by using multihop bypass mechanisms, however, at a cost of creating and removing a high number of bypasses in the network. In [8], each network node monitors traﬃc through a predeﬁned period of time. At the end of the period, if the volume of traﬃc towards a node exceeds a given threshold, the node may create a bypass and reroute traﬃc. Information about the new bypass is not announced to a routing protocol and decision to create a hidden bypass is made locally in a node and the central controller is not used. The authors of [9] conﬁrm advantages of centralized solutions over distributed ones regarding eﬃciency. Another category of papers analyzes the issue from the perspective of differences between hidden and announced bypasses. The authors of [10] present hidden and announced versions of bypasses and show three methods for adapting a virtual topology to current needs. The authors show that the hidden bypass mechanisms have the lowest impact on the established topology, while the number of topology changes depends on the difference between the peak and low load values. 
The author of [11] proposes and analyzes a dynamic algorithm to find a set of bypasses for the Atlanta 15-node network, which is periodically updated every 15–30 min. In [12] the authors present the concept of automatically switched optical bypasses not reported to the IP layer. The central controller decides how the bypasses should be established using integer linear programming for optimization. An example of using the SDN concept in optical networks is presented in [13], where the optical networks are controlled in an efficient way and the reliability of transmission is improved. In [14] the authors explain that the implementation of the OpenFlow controller results in an intelligent control plane for optical networks. The analysis shows that the scalable solution improves reliability of transmission, but none of those papers considers the bypass implementation. An OpenFlow-based unified control plane architecture for optical SDN is proposed in [15] and tailored to cloud services. The authors of [15] demonstrated their mechanism in a testbed built on ADVA fixed reconfigurable optical add-drop multiplexers (ROADMs). The results show that paths can be established in the hardware in a time ranging from a few seconds to dozens of seconds. The operation of the controller takes a few additional seconds, and this solution can also be extended by bypass mechanisms. Provisioning cloud services over software-defined optical networks is also considered in [16] and [17]. The former work adjusts anycast strategies for optical networks to three types of cloud services, while the latter investigates the issue of cooperation between an SDN controller, an optical network and cloud orchestration software. Works considering Flow-Aware Networks (FAN) in the context of bypasses are also presented. A valuable approach to multilayer FAN is found in [18].
The authors propose to extend the original FAN concept by introducing the option to route traﬃc ﬂows through a newly established optical bypass. Mechanisms proposed so far, e.g. in [19], focus only on IP network layer traﬃc management. Three different policies are used to accept the ﬂow for an optical bypass.


Numerous mechanisms which use optical bypasses have been proposed in the literature. They have advantages and disadvantages compared to the AHB algorithm. However, a separate study is needed on resilience mechanisms in the context of optical bypasses. The valuable work [20] indicates different mechanisms in case of network failures as the aim of future studies and as important challenges. Two considered approaches regard (1) the use of dynamic optical bypasses during link failures and (2) the effect of link failures in a network on dynamic optical bypasses. The first approach is more common, thus numerous works can be found in the literature. The multilayer bypass optimization process, including the design and cost impact of several multilayer restoration schemes and simple failure-aware capacity planning, is carefully considered in [21]. Numerous advantages of a multilayer approach to restoration are provided and explained in detail from the perspective of optical, port and IP node failures. Optical layer failures are also considered in [22]. The authors propose a multilayer hybrid mesh restoration scheme which selectively restores a subset of IP links at the optical layer. The remaining IP links are then rerouted at the IP layer, utilizing additional restoration capacity and taking advantage of the fine-granularity resource sharing possible at that layer. The proposed scheme significantly reduces restoration cost by planning restoration resources between the IP layer and the optical layer in a coordinated manner. However, the topic of resilient bypass setting is neglected. In [23], the authors evaluate potential cost savings achieved by multilayer optimization, i.e. the combined benefits of electrical packet grooming, optical bypass, and a novel multilayer resilience scheme. After a failure, the high-QoS traffic will quasi-immediately preempt some of the best-effort traffic, so that during a transient period the best-effort traffic will suffer severe service degradation.
Meanwhile, the optical layer will restore the IP link by rerouting through the optical path. This solution requires interworking between the optical layer and the IP layer, while the bypasses considered in our work are hidden from the electrical layer. Cost-aware and energy-aware aspects are also addressed in some of the papers. Work [24] describes the rationale behind multilayer traffic engineering and quantifies its advantages in terms of cost effectiveness. The evolution of multilayer networks is briefly described and optical bypasses are suggested as the last step of this evolution. Bypasses are considered as a solution to handle network failures in an efficient way, achieving cost and link load reduction. The authors of [25] proposed traffic grooming along with an optical bypass approach to reduce the power consumed by the entire network infrastructure. They also introduced models to evaluate the survivable power ratio and protection switching time. However, they focus on the network planning stage and on ensuring protection, while in our work we focus on survivability under a dynamic traffic engineering approach. In the context of multilayer network resiliency, the SDN concept is also considered. SDN is utilized in [26], where the authors present fast failover and switchover mechanisms to deal with link failure in an OpenFlow-enabled network. The mechanisms are assessed with respect to the recovery time and the sustained time of link congestion. Backup paths are installed by using port down events and group table entries. In [27], the authors also consider the SDN concept to propose a novel cross-layer restoration scheme for data center services provided throughout an IP over optical network. The scheme enhances service restoration responsiveness to the dynamic end-to-end service demand and was quantitatively evaluated in terms of path blocking probability and path restoration latency.
Resilience in data center networking is also considered in papers [28] and [29] but assuming software-deﬁned IP over elastic optical networks. Signiﬁcant improvements are reported in the context of path blocking probability, resilience latency and resource occupation rate.

Fig. 1. Multilayer IP and optical network. (Labels in the figure: IP layer, IP router, logical connection, optical bypass, optical layer, inter-layer connection, physical connection, optical cross-connect.)

Cited works propose optical bypasses as the mechanism to ensure network reliability during link failures, while in our work we focus on the effect of link failures in a network based on dynamic optical bypasses. More precisely, we address the issue of setting up the most resilient bypasses in advance of a failure, as an addition to the AHB mechanism proposed in our previous work. In contrast to the most common approach, where bypasses are investigated as mechanisms to ensure the survivability of multilayer networks, it is hardly possible to find works focused on the issues addressed in this paper.

3. Survivable automatic hidden bypass

This section describes the Survivable Automatic Hidden Bypasses (SAHB) algorithm. An illustration of a core multilayer network is presented in Fig. 1. We assume that each IP router is bound to an optical network switch (OXC), which is typically the case in existing carrier networks. In order to provide the SAHB functionality, we employ traffic offloading with an optical bypass. An optical bypass can be established on demand, and it is transparent to the IP layer, i.e. the IP layer is not aware of the existence of this bypass. In this way the routing tables in routers do not need to be updated. The bypass ingress node must be informed that certain transmissions should be forwarded into the bypass rather than to the interface indicated by the routing table [6]. We assume that optical fibers between OXCs may fail. In case of a failure, flows cannot be transmitted through the broken link; this affects both direct IP connectivity and optical bypasses.

The SAHB algorithm, described in this section, is a traffic offloading method in which optical bypasses are created and torn down automatically based on the network congestion status. We assume that the operator has mechanisms for establishing optical paths in the multilayer network. SAHB is possible due to the potential of SDN and flow-based traffic treatment. It is very important to note that, unlike other typical mechanisms to set up bypasses, in our proposal the traffic matrix does not need to be known in advance. Only a part of the physical resources is available at the IP layer. The rest is reserved for bypass creation and is used only when congestion occurs in the network. This ensures a partial separation between both layers and enables more efficient resource management and survivability assurance. SAHB provides the following functionalities:
• monitoring of every link's congestion status,
• finding an optimal bypass in response to the current congestion status,
• resilience-aware offloading of certain flows through bypasses.


The main advantage of using SAHB is that the operator does not need to use excessive optical resources to guarantee a congestion-free and reliable multilayer network. Instead, we propose using only as many optical resources as necessary in each situation. When the usage of certain links exceeds a given threshold, new optical paths are created. When the demand ceases, resources are freed. This way, not all optical links are used. Moreover, optical paths are established based on the current traffic matrix and can be better tailored to the existing demands. Simultaneously, resilience-aware bypasses are established to reduce the negative impact of optical link failures in a proactive manner. In the remainder of this section, the SAHB algorithm is presented in detail. In particular, the following are presented: the components required to realize the mechanism, the packet processing procedure, the method of calculating an optimal and survivable bypass, and some schemas applicable after failures.

3.1. Components and packet processing procedure

On top of classic networks, SAHB requires additional components that are typical for SDN networks. The first one, the central controller, collects statistics from the nodes and is responsible for finding the optimal optical bypass (see Section 3.2), creating it, tearing it down and handling bypasses traversing failed links. The second component is the Flow Forwarding Table (FFT), present in each IP node and storing information about all the flows that are active on the node's interfaces. The information includes:
• hashed flow ID (the ID is hashed from the usual 5-tuple),
• Flow Time Stamp (the time of the last received packet of this flow),
• TTL (time to live) value in IPv4 or hop limit in IPv6 (stored for the first packet of a flow and used by the controller to find out the exact path),
• bypass interface identifier (the ID of the optical interface on which the bypass originates, set for flows transmitted by the bypass and empty if the flow is forwarded according to the routing table).

The last component is the communication protocol to be used between the central controller and the nodes in the network, e.g. OpenFlow. OpenFlow is able to transmit data between nodes and the central controller. In SAHB, it is necessary to inform the central controller periodically about active flows and their rates, as well as link failures. These three additional components are fairly standard in the SDN context. As the SDN architecture is well established in the community, for further details please refer to [30].

The introduced components are used to implement the packet processing procedure in routers. Algorithm 1 describes the procedure. Firstly, the flow is identified by hashing certain fields in the packet header. Then, it is checked whether the flow is already present in the FFT. If it is not, the flow ID is added to the table, and information about the new flow is sent to the central controller. Note that such information does not need to be sent for every new flow. It is sufficient to collect such data and send it in batches once in a while. After the flow has been added to the FFT, or if it was already present there, its time stamp is updated. The last action in the process is to determine the outgoing interface for the packet. This is performed by checking the bypass interface identifier field in the FFT. If the field is empty, the flow is to be forwarded normally, i.e. according to the routing table. However, when the field is not empty, the flow should be forwarded through a bypass, which is identified by the value in the field. Finally, the packet is forwarded to the appropriate interface. The same procedure is visualized in Fig. 2 from the perspective of operations performed on the FFT. Consecutive actions are shown in bold in the flow chart.

Fig. 2. Actions performed on the FFT.

Algorithm 1. Packet processing procedure.
Require: Arriving packet p
Identify flow f based on the header of p
if ID of f is not on FFT then
    Send information about f to the controller
    Add f to FFT (flow ID)
end if
Update Timestamp of f
if Bypass interface of f is set then
    Get bypass interface i from FFT
else
    Get destination interface i from routing table
end if
Send p to interface i
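The packet processing procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `FftEntry`, `flow_id`, `process_packet`, `routing_lookup` and `notify_controller` are hypothetical, the choice of SHA-1 for hashing the 5-tuple is an assumption, and the controller notification is modelled as a plain callback rather than an OpenFlow message.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# (source IP, destination IP, source port, destination port, protocol)
FiveTuple = Tuple[str, str, int, int, str]


@dataclass
class FftEntry:
    timestamp: float                         # time of the last packet of the flow
    ttl: int                                 # TTL / hop limit of the first packet
    bypass_interface: Optional[str] = None   # None: follow the routing table


def flow_id(five_tuple: FiveTuple) -> str:
    """Hash the 5-tuple into a flow ID (the hash function is an assumption)."""
    return hashlib.sha1(repr(five_tuple).encode()).hexdigest()


def process_packet(
    five_tuple: FiveTuple,
    ttl: int,
    fft: Dict[str, FftEntry],
    routing_lookup: Callable[[str], str],
    notify_controller: Callable[[str], None],
    now: Optional[float] = None,
) -> str:
    """Return the outgoing interface for one packet, updating the FFT."""
    now = time.time() if now is None else now
    fid = flow_id(five_tuple)
    entry = fft.get(fid)
    if entry is None:
        # New flow: report it to the controller and create its FFT row.
        notify_controller(fid)
        entry = FftEntry(timestamp=now, ttl=ttl)
        fft[fid] = entry
    entry.timestamp = now  # refresh the time stamp for new and known flows
    if entry.bypass_interface is not None:
        return entry.bypass_interface      # forward into the hidden bypass
    return routing_lookup(five_tuple[1])   # ordinary IP forwarding
```

In this sketch the controller steers a flow into a bypass simply by filling in `bypass_interface` for its row; the routing table itself is never modified, which matches the hidden character of the bypass.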

3.2. Optimal and survivable bypass calculation

A bypass creation procedure can be triggered by two mechanisms. In the base case, the process of optimal and survivable bypass calculation is triggered when a link in the network crosses a congestion threshold. The other mechanism is related to failure handling and is addressed in Section 3.3. Thus, in this section, only the base case is considered. The calculation algorithm is a main contribution of the paper, as it improves bypass survivability in a proactive manner. By introducing resilience-aware factors into the proposed gain formula, we aim at neutralizing the negative impact of potential network failures. The node which detects the congestion informs the central controller about the situation. Note that in this case congestion does not mean that the link is overburdened. We assume that a link is congested when its


throughput exceeds the ﬁxed threshold, e.g. 80% of the link capacity. Along with the congested link’s ID the node sends information about the rate of each ﬂow that is active on the link. This information, combined with the knowledge the controller already has, is suﬃcient to start the selection of a new bypass. In particular, data obligatory to introduce survivability to the bypass calculation process is available for the network controller. The central controller has the information about all the ﬂows in the network, especially about their exact paths. Additionally, the controller is aware of the currently available optical resources that can be used to form hidden bypasses. The ﬂows’ bit rate can be estimated, e.g. by using techniques proposed in [31]. This effective solution analyzes basic ﬂows’ properties easily obtainable from a sampled stream. The amount of total traﬃc transmitted in a link is observed continuously for the purpose of quick congestion detection. The state and the information about particular ﬂows which is required in the bypass creation process may also be collected constantly. However, to reduce the workload of the routers, the information gathering may be triggered when a link exceeds a pre-congestion threshold which is slightly lower than the congestion threshold. This way, under light loads the information is not gathered, but when there is a threat that congestion might occur, the ﬂow state information starts to be collected. Upon receiving the congestion notiﬁcation, the controller searches for all possible bypass conﬁgurations and chooses the one with the highest gain calculated according to the following formula:

$$\mathrm{gain} = w_1 T - w_2 n_{\lambda} + w_3 n_{IP} - w_4 n_{by} - w_5 T_{by} - w_6 T_{congIP} \tag{1}$$

where:
• T is the amount of traffic that can be pushed into a bypass,
• n_λ is the number of optical links that form a bypass,
• n_IP is the number of IP hops that a bypass traverses,
• n_by is the maximum number of optical bypasses installed on any of the links shared with the bypass,
• T_by is the maximum amount of traffic of optical bypasses installed on any of the links shared with the bypass,
• T_congIP is the sum of IP traffic on all the links that exceeded the pre-congestion threshold and are traversed by the bypass,
• w_1, w_2, w_3, w_4, w_5, w_6 are the weight coefficients.

Six factors are evaluated; three of them were originally proposed in our previous work [5]. The first is the amount of traffic present on the congested link that can be transferred to the assessed bypass (T). The more data can be offloaded, the more visible the effect of the action will be. Another factor that contributes to the gain is the number of IP hops that a bypass traverses (n_IP). All data which travels via a bypass is transmitted optically without electric conversion in the nodes. This means that from the IP layer perspective a bypass is a one-hop jump. This factor is important, as optical-electrical conversion and queuing take time, which is saved by the bypass. The more IP hops a bypass covers, the greater the gain. The third factor, which contributes to the cost rather than the gain (hence the minus sign), is the amount of optical resources that need to be used to create a bypass (n_λ). Obviously, the more resources are required, the less attractive the bypass becomes. In the process of finding a bypass, existing bypasses are also taken into consideration. One of them may be selected as the optimal solution.

Three additional factors are the novel contribution of this paper and are introduced in order to reflect the survivability aspect in the context of optical bypasses. All of them contribute to cost rather than gain (expressed by the minus sign before those factors in the formula).
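Formula (1) can be written down as a small function, which may help when experimenting with weight settings. This is only a sketch: the class name `BypassFactors`, the field names and the sample values are hypothetical, and units are left to the caller.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BypassFactors:
    T: float          # traffic that can be pushed into the bypass
    n_lambda: int     # optical links forming the bypass
    n_ip: int         # IP hops traversed (skipped) by the bypass
    n_by: int         # worst-case number of bypasses on a shared link
    T_by: float       # worst-case bypass traffic on a shared link
    T_cong_ip: float  # IP traffic on traversed pre-congested links


def gain(f: BypassFactors, w: Sequence[float]) -> float:
    """Evaluate formula (1); w holds the six weights w_1..w_6 in order."""
    w1, w2, w3, w4, w5, w6 = w
    return (w1 * f.T - w2 * f.n_lambda + w3 * f.n_ip
            - w4 * f.n_by - w5 * f.T_by - w6 * f.T_cong_ip)


# Hypothetical candidate and two hypothetical weight settings.
candidate = BypassFactors(T=4.0, n_lambda=3, n_ip=2, n_by=2, T_by=5.0, T_cong_ip=6.0)
all_factors = gain(candidate, (1, 1, 1, 1, 1, 1))
first_three_only = gain(candidate, (1, 1, 1, 0, 0, 0))
```

Setting w_4, w_5 and w_6 to zero leaves only the three factors carried over from [5], so the same function can be used to compare a candidate with and without the survivability terms.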
Because the three novel factors enter the formula with a minus sign, the lower the value of a factor, the higher the gain computed for the particular bypass, and thus the higher the preference for this bypass. This results from the properties of the novel factors and the fact that they are expected to reduce the negative impact of future network failures. The general principle is to use link properties to derive path properties, considering the worst case. Therefore, the first factor (n_by) is the highest number of bypasses installed on any of the optical links traversed by the bypass for which the gain is being calculated. The more bypasses traverse a selected link, the more bypasses will be affected if the link fails. In order to spread bypasses evenly over the whole network, we prefer bypasses using optical links traversed by the smallest number of other bypasses. Thus, the number of bypasses affected by a potential link failure is minimized by considering n_by as a cost in the formula (minus sign).

Another factor (T_by) corresponds to the previous one and reflects the total traffic of all optical bypasses on the link for which this value is the highest (worst case) among the links traversed by the bypass for which the gain is being calculated. The rationale behind this factor is that not only the number of bypasses is important, but also the amount of traffic that those bypasses transfer. The more traffic must be handled after a potential failure, the higher the negative impact of the failure. The aim of the factor is to evenly distribute the traffic over the network, and thus to reduce the negative impact of a failure. Again, this is reflected by the minus sign assigned to the T_by factor in the proposed formula.

Finally, as some of the links may not be traversed by optical bypasses but are overloaded in the IP layer, the last factor is introduced (T_congIP). It sums the IP layer traffic on all the links traversed by the bypass which exceeded the pre-congestion threshold. This factor is also treated as a cost which decreases the attractiveness of the particular bypass (minus sign in the proposed formula). The rationale comes from the assumption that not only the traffic being transferred by bypasses must be handled after a failure, but also the IP layer traffic.
As the aim is to reduce the amount of affected traffic, the formula will prefer bypasses that do not traverse links heavily congested in the IP layer. The T_congIP factor reflects the fact that the considered network architecture is multilayer.

Fig. 3 illustrates the meaning of all the factors. The topology was simplified, as the aim is to clearly present how consecutive factors are calculated and not to investigate the properties of the proposed mechanisms. For the same reason, only unidirectional communication is considered. One wavelength is visible to the IP layer between each pair of adjacent routers (black edge), which means that this wavelength terminates in the IP layer at every single node. On two links (R2-R3 and R4-R5) the congestion threshold is exceeded. Two optical bypasses exist in the network, between R1 and R3 (green) and between R2 and R4 (orange). Finally, we consider two bypass candidates to be established in the network, shown as purple and light blue dashed lines.

Firstly, consider the bypass candidate between R2 and R5. This bypass can transport all the IP layer traffic that traverses both the R2 and R5 routers, where it can be injected and retrieved, respectively. Thus, the traffic originating at the R1 or R2 nodes and terminating at R5 can be served. Only the IP traffic traversing the R2-R3 link contributes to the T factor, as only congested links are considered. The bypass traverses the R3 and R4 routers, thus n_IP = 2. A single lambda is occupied on three network edges, R2-R3-R4-R5, which results in n_λ = 3. The highest number of bypasses installed on any of the optical links traversed by the bypass candidate is equal to 2 and concerns the link between R2 and R3 (n_by = 2). Therefore, T_by is equal to the sum of the traffic being handled by the existing bypasses. Finally, T_congIP is the sum of the IP layer traffic on both congested links, as both are traversed by the bypass candidate between R2 and R5.
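The worst-case derivation of path factors from link properties can be sketched in Python for the chain topology of Fig. 3. The function and variable names are hypothetical, and all traffic values are invented for illustration (the paper gives no numbers for this example). The factor T is not computed here, since it would require per-flow path information.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]  # unidirectional edge (from_router, to_router)


def path_links(routers: List[str]) -> List[Link]:
    """Turn a router sequence such as [R2, R3, R4] into its list of links."""
    return list(zip(routers, routers[1:]))


def survivability_factors(
    candidate: List[str],
    bypasses: Dict[str, Tuple[List[str], float]],
    ip_traffic: Dict[Link, float],
    pre_congested: Set[Link],
) -> Dict[str, float]:
    """Derive n_lambda, n_IP, n_by, T_by and T_congIP for one candidate."""
    links = path_links(candidate)
    counts, loads = [], []
    for link in links:
        # Traffic of every existing bypass that also uses this optical link.
        on_link = [load for routers, load in bypasses.values()
                   if link in path_links(routers)]
        counts.append(len(on_link))
        loads.append(sum(on_link))
    return {
        "n_lambda": len(links),
        "n_ip": len(candidate) - 2,         # intermediate routers skipped
        "n_by": max(counts, default=0),     # worst case over traversed links
        "T_by": max(loads, default=0.0),    # worst case over traversed links
        "T_cong_ip": sum(ip_traffic[l] for l in links if l in pre_congested),
    }


# Existing bypasses of Fig. 3 with invented traffic volumes.
bypasses = {"green": (["R1", "R2", "R3"], 2.0), "orange": (["R2", "R3", "R4"], 3.0)}
# Invented IP layer traffic per link; R2-R3 and R4-R5 are the congested links.
ip_traffic = {("R1", "R2"): 5.0, ("R2", "R3"): 9.0, ("R3", "R4"): 4.0, ("R4", "R5"): 8.5}
pre_congested = {("R2", "R3"), ("R4", "R5")}

r2_r5 = survivability_factors(["R2", "R3", "R4", "R5"], bypasses, ip_traffic, pre_congested)
r3_r5 = survivability_factors(["R3", "R4", "R5"], bypasses, ip_traffic, pre_congested)
```

With these inputs the R2-R5 candidate yields n_λ = 3, n_IP = 2 and n_by = 2, in line with the values derived in the text. Note that n_by and T_by are maximized independently, so they may refer to different links of the same candidate.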
Analogously, consider the bypass candidate between R3 and R5, which can transport all the IP layer traffic that traverses both the R3 and R5 routers, where it can be injected and retrieved, respectively. Thus, the IP traffic originating at the R1, R2 or R3 nodes and terminating at R5 can be served. Factor T is equal to 0 as none of the congested links can be offloaded by the bypass candidate. nIP = 1 and nλ = 2, as the bypass traverses only the R4 router and a single wavelength is occupied on two edges. The highest number of bypasses installed on any of the optical links traversed by the bypass candidate is equal to 1 and concerns the link between R3 and R4, thus nby = 1. By analogy, Tby is equal to the traffic being handled by a single bypass. Finally, TcongIP is the IP layer traffic traversing the congested link between R4 and R5, as only this congested link is traversed by the bypass candidate between the R3 and R5 nodes.

Fig. 3. Illustration of factors for optimal and survivable bypass calculation. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In formula (1), all of the factors are meaningless without properly set weights. It is the operator's responsibility to modify these weights accordingly. For example, when there are plentiful spare resources, the greatest focus can be placed on the number of saved IP hops. On the other hand, when resources are scarce, the efficiency of weights w1 and w2 may be important. Weights w1, w2 and w3 are set up based on the experiments performed in our previous work [5]. Simultaneously, various configurations of the weights w4, w5 and w6 are investigated in the simulation experiments which are presented later in the paper as a part of the survivable bypasses assessment.

Once the optimal bypass is found, the request to establish it is sent to the proper network devices (if the bypass does not exist already). Also, a list of flows that are to be forwarded through the bypass is sent to the bypass ingress node. The node adds this information to the respective flows' rows in the FFT. This is how a node knows whether an incoming packet is to be forwarded to a bypass or via a standard IP originated route.

The final action is to remove the unneeded bypass. Each bypass takes only a portion of the traffic that existed at the moment of its creation. Because of this, each bypass is bound to lose rather than gain traffic. Normally, a bypass should be removed when transmission on it ends. This releases resources for later reuse.
However, in order to avoid inefficiency, a bypass can be unset earlier, i.e. when the utilization of its resources falls below a certain threshold, referred to as the bypass remove threshold (BRT). Bypass removal actions are also performed due to link failures, as described in Section 3.3.

The actions taken by the controller and routers do not need to be realized immediately. Nodes in the network observe the traffic load in links in real time, and when a node notices that a link is likely to become congested (by exceeding the pre-congestion threshold) this is reported to the central controller. From this point on, detailed statistics about the flows being served in this link are collected by the router, and when the link becomes congested, they are sent to the central controller. This package of data can be successfully sent between a node and the controller, even if millions of flows are served at the time. As a result, the central controller can begin the procedure of selecting an optimal bypass just after the link becomes congested. However, the list of all possible bypasses is created when the pre-congestion threshold is exceeded. In this way, when the link becomes congested, the central controller already has candidates for bypasses. The key assumption is to properly set the thresholds that trigger these actions.

Even if the number of candidate bypasses is large, the calculation is simple and can be executed quickly. To do this, dedicated resources can be reserved for such operations in the central controller. As we can see, the whole process of selecting an optimal bypass does not need to be performed in a very short time. The central controller has enough time for all operations. Of course, the process of selecting candidate bypasses can be improved by restricting it to a limited set of candidates (e.g. paths having no more than a fixed number of hops, with limited maximum delay, etc.). As a result, the calculation of an optimal bypass can be even shorter. In the scenarios analyzed in this paper the process of selecting bypasses was very short and negligible compared with the overall operation time of the SAHB algorithm. The assumptions about expected time effectiveness are not always valid in case of link failures. This issue is addressed separately in Section 3.3.

There is also no need for synchronization among routers because the traffic sent through bypasses is not visible at the IP layer. Under normal conditions, bypasses are set up to cover a part of the flows' path at the IP layer. This means that after O-E conversion (at the end point of a bypass) packets of flows have to be served again at the IP layer in a node which belongs to the original path. In this way, loops cannot occur. On the other hand, when the topology changes, some bypasses might have been torn down.
The central controller is aware of the current topology and can react accordingly. In such a case, the controller analyzes whether the change affects a bypass and can cause loops. The bypasses which may result in loops have to be re-arranged.

The SAHB mechanism is proposed to be used for bypass selection by the central controller. Of course, this solution may be generalized to other network solutions, such as MPLS, or used to select one optical path from a set of candidates. Other factors can also be taken into consideration when finding a bypass, depending on the operator's needs. Moreover, the SAHB algorithm can cooperate with solutions where, e.g., congestion in links is indicated in a different way. For example, in FAN a link is considered as congested when the coefficient representing the amount of elastic or streaming traffic in this link exceeds a border threshold. FAN was originally proposed in [32] as an architecture to ensure quality of service for transmissions based on flows. However, substantial further work was needed to serve priority traffic under congestion. There are some mechanisms to solve this problem, like the Efficient Congestion Control Mechanism proposed recently in [33]. Such a solution can be more effective when the SAHB algorithm is also implemented. To estimate the flows' throughputs, the two-dimensional model proposed in [34] can be used. This is an efficient method to estimate the average throughputs of elastic flows, which usually consume most of the link bandwidth.
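The threshold-driven interaction between routers and the controller described in this section can be summarised as a small decision function. This is a sketch under stated assumptions, not the authors' implementation: the function and action names are hypothetical, and the default values (pre-congestion threshold 0.7, congestion threshold 0.8, BRT 0.3) are the ones reported for the simulations in Section 4.

```python
def link_action(utilization, pre_congestion=0.7, congestion=0.8):
    """Map an IP link utilization (fraction of capacity) to a router action.

    The action names are hypothetical labels for the steps in the text.
    """
    if utilization >= congestion:
        # Congested: send the collected flow statistics and request a bypass.
        return "request_bypass"
    if utilization > pre_congestion:
        # Likely to become congested: notify the controller and start
        # collecting per-flow statistics; candidates can be listed already.
        return "collect_flow_stats"
    return "monitor"


def should_remove_bypass(bypass_utilization, brt=0.3):
    """A bypass is unset early once its utilization falls below the BRT."""
    return bypass_utilization < brt
```

For instance, a link at 75% utilization only triggers statistics collection, while a bypass whose utilization dropped to 20% would be removed. How a real deployment treats values exactly at a threshold is not specified in the text, so the comparison operators here are an assumption.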

3.3. Mechanisms for failure handling

This work focuses on the proactive approach to setting up bypasses in a survivable manner, as described in Section 3.2. However, the problem of network behaviour and control layer mechanisms handling traffic after a failure must also be addressed. Traditional IP traffic can be rerouted without communication with the SDN controller using some of the well-known techniques, e.g. loop-free alternates or not-via addresses. The IP layer rerouting is out of the scope of this paper; thus, it is assumed that after a failure routing tables are recomputed. However, existing bypasses are also affected by failures. In this case optical link failures must be communicated to the network controller so that the affected bypasses can be immediately handled by the central entity. The communication can be realized using, for example, port down events as proposed in [26].

Three different restoration mechanisms are proposed and investigated in further experiments. Once the controller receives information about a link failure, it must withdraw all the bypasses traversing the failed link. In the base case, referred to hereafter as basicRestoration, the SDN controller relies on the IP layer. Therefore, routing tables are affected by the failure, which may further entail requests for new bypasses as the congestion threshold is exceeded on selected links. The main drawback of this approach is that several link failures may cause inefficient network resource utilization, as optical paths are not rerouted after repairs. Such a situation may last until the bypass removal threshold is exceeded. That is why the dropAll mechanism is proposed. It drops not only the bypasses directly affected by the failure but all of the bypasses. Therefore, the IP layer reroutes the whole traffic after a failure, which may further entail requests for new bypasses. Both solutions suffer from the fact that numerous requests for new bypasses can cause a denial of service issue at the central controller.
In order to avoid this situation the redirectBypass mechanism is proposed. For all the bypasses that are going to be withdrawn, the SDN controller tries to find new optical paths instead of relying on the IP layer. The only drawback concerns the time needed to compute and establish a new bypass. However, as stated in Section 3.2, the whole procedure is quick, and the issue may be further neutralized by preparing a set of backup bypasses for the failure of each link and keeping the list updated according to the changing network conditions.

An independent solution is proposed to handle the IP layer traffic affected by the link failure. The bypassIP mechanism tries to set up a bypass for that traffic so as not to generate additional requests for new bypasses. Its main drawback concerns the fact that non-zero time is needed to establish a bypass. Thus, we propose the following mechanism to avoid dropping packets in a real-life deployment. For the time needed to establish a bypass, the IP layer handles all the packets, but all of the requests for new bypasses are ignored by the SDN controller. Once a bypass to handle the IP layer traffic affected by the link failure is established and flows are directed to the bypass, the controller starts to handle new requests for bypasses again. All the mechanisms handling traffic after a failure event are presented in Fig. 4.

Fig. 4. Traffic handling after failure event.

4. Results

In this section, we present the results of simulation experiments performed in the ns-3 simulator. The source code of SAHB is publicly available at [35]. Numerous simulation runs were performed in the topology presented in Fig. 5 (US backbone network provided by the SDNlib project [36]) in order to show the efficiency of the proposed mechanism. Every core node has the OXC+IP functionality and it is possible to implement up to eight lambdas between OXCs. Some of them may not be visible to nodes at the IP layer (if SAHB is used). The capacity of a link between two nodes at the IP layer is equal to the total capacity of the lambdas implemented between these nodes at the optical layer and reported to the IP layer. For example, if all eight lambdas between two nodes are visible at the IP layer, the capacity of the link between these nodes at the IP layer is equal to eight times the capacity of one lambda. The propagation delay for each link in the network was set to 1 ms. It is assumed that at least a few paths can be established between any nodes and additional lambdas are available in the optical layer. The selected network topology is a realistic one, while SAHB can be implemented in any topology with multiple paths between nodes having enough lambdas in the optical layer.

Fig. 5. Network topology.

Traffic was sent from nodes 0 (Vancouver), 5 (Los Angeles) and 4 (San Francisco) located on the west coast to nodes 30 (Charlotte), 37 (Boston) and 28 (Tampa) located on the east coast. The rationale behind this assumption is to reflect realistic scenarios. For example, intensive virtual machine migration between data centers located on different coasts may be triggered by differences in energy costs. In January 2014, the cost of 1 MWh in the eastern part of the USA reached almost $700, while on the western coast it remained below $55 [37]. Therefore, the ratio between these costs is more than twelvefold. Other possible causes of such a migration may be related to maintenance works, inter data center failures, the follow-the-wind/follow-the-sun concept and many others. Such an approach stresses the network with the SAHB mechanism under its intuitive usage scenario, as bypasses are usually established during peak hours, in case of network congestion or to counteract network failures. The purpose is to offload the electric layer, to reduce energy consumption and to improve network reliability.

40 individual sources were connected to each of the three source nodes and another 40 users were connected to each of the destination nodes. Every 0.01 s an individual source was randomly selected and began a TCP flow transmission to a randomly selected destination user. Flows were generated with a Pareto distribution. The capacity of each lambda was scaled down to 10 Mbit/s, so the generated traffic could be proportionally scaled down, and thus the simulation execution time was reduced. This artifice was applied due to the huge number of performed simulations. Tests were performed to prove that scaling down resources and requests does not impact the results. The mean size of each flow was 5 MB, the shape factor was set to 1.5, and the packet size was set to 1500 B. For every core link eight lambdas were available at the optical level. We used FIFO queues sized to 100 packets and the OSPF routing protocol. The simulation duration was set to 20 s and the warm-up time was 5 s. The simulations were repeated at least 8 times for each configuration. 95% confidence intervals were calculated using Student's t-distribution. We observed several parameters during simulations.
The first group of parameters was introduced in our previous work [5] in order to assess the Automatic Hidden Bypass mechanism. Transmitted data is the mean value of data transmitted during a simulation run. The delay shows the mean packet delay, and the number of hops shows the mean number of links at the IP layer which were traversed during a simulation run. The resource usage means the ratio (presented in %) of available (active) capacity to the total capacity of all links in the network. The number of bypasses describes how many bypasses were set up during a simulation, and bypass length is the mean number of hops at the optical layer for a bypass. The value of bypass activity shows how long on average a bypass was active.

Additionally, we introduce several novel parameters to assess resilience properties of the SAHB mechanism. Number of withdrawn bypasses shows how many bypasses were disconnected due to failures, and the total traffic handled by the affected bypasses is represented by throughput of withdrawn bypasses. Failures also affect the IP layer traffic, and the redirected IP traffic parameter represents the total amount of the IP traffic on all failed links at the moment of failure.

Our aims are to (1) present SAHB properties in comparison to the AHB mechanism, where bypasses were established in a survivability-agnostic manner (the AHB mechanism itself was carefully studied in our previous work [5]), and (2) compare SAHB properties with real-life conventional approaches represented by selected simulation scenarios. We decided not to consider non-bypass solutions, as the advantages of introducing bypasses in multilayer networks were carefully studied in our previous work. Based on that research the following weights were assigned: w1 = 1000, w2 = 1, w3 = 100, keeping in mind that in a real network an operator should decide on these values considering available resources and goals to be achieved.

We analyzed the network with only one lambda visible at the IP layer (the capacity of each link at the IP layer was 10 Mbit/s). The seven other lambdas on each link were used to establish optical bypasses. Each router in the network periodically (every 1 s) informed the central controller about new flows it was serving. As a result, the central controller was able to build complete paths for all flows. When a link in the network crossed the pre-congestion threshold (70% of link capacity), the node corresponding to this outgoing link started to collect throughputs of all flows it was serving in this link.
When the congestion threshold was reached in the link (80% of the link capacity), a call for a bypass was sent to the central controller with statistics for all ﬂows transmitted through this link. We decided to set the congestion threshold to 80% which allowed us not to saturate links in the network based on the analysis which proved that this value was set rationally (delay in communication between controller and routers was acceptable). Studies presented in [5] proved also that it is efﬁcient to assign 0.3 value to BRT. Table 1 summarizes simulation parameters.

Table 1
Topology parameters for investigated networks.

Parameter                                     Value
Nodes sending traffic                         0, 4, 5
Nodes receiving traffic                       28, 30, 37
No. of individual traffic sources per node    40
No. of individual traffic receivers per node  40
Mean time between flows                       0.01 [s]
Flows generation distribution                 Pareto
Capacity of each lambda                       10 [Mbit/s]
Mean size of each flow                        5 [MB]
Shape factor                                  1.5
Packet size                                   1500 [B]
FIFO queue size                               100 [packets]
Routing protocol                              OSPF
Simulation time                               20 [s]
Warm-up period                                5 [s]
Pre-congestion threshold                      70 [%]
Congestion threshold                          80 [%]
BRT                                           0.3
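Two of the numerical settings above lend themselves to a quick worked example: the Pareto flow-size parameters and the 95% confidence intervals. For a Pareto distribution with shape α > 1 and scale x_m, the mean is α·x_m/(α − 1), so a mean of 5 MB with shape 1.5 implies a scale of about 1.67 MB. The sketch below assumes exactly 8 runs per configuration (the paper states "at least 8") and hardcodes the corresponding Student's t critical value; the function names are hypothetical and this is not the ns-3 code used by the authors.

```python
import math
import random
import statistics

T_975_DF7 = 2.365  # two-sided 95% Student's t critical value, 7 degrees of freedom


def pareto_scale(mean, shape):
    """Scale x_m of a Pareto distribution with the given mean and shape > 1."""
    if shape <= 1:
        raise ValueError("the mean is finite only for shape > 1")
    return mean * (shape - 1) / shape


def sample_flow_size_mb(rng, mean=5.0, shape=1.5):
    """Draw one flow size in MB; paretovariate() has unit scale, so rescale."""
    return pareto_scale(mean, shape) * rng.paretovariate(shape)


def ci95_half_width(samples):
    """Half-width of the 95% confidence interval for exactly 8 runs."""
    if len(samples) != 8:
        raise ValueError("critical value is hardcoded for 8 samples")
    return T_975_DF7 * statistics.stdev(samples) / math.sqrt(len(samples))
```

A value reported as "mean ± x" in the result tables would then correspond to `statistics.mean(samples)` and `ci95_half_width(samples)` over the per-run values, assuming 8 runs.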

We considered several cases imposed by variable survivability-related parameters. First of all, different values assigned to w4, w5 and w6 were analyzed:

• Case (a): w4 = 0, w5 = 0, w6 = 0
• Case (b): w4 = 1000, w5 = 0, w6 = 0
• Case (c): w4 = 0, w5 = 100, w6 = 0
• Case (d): w4 = 0, w5 = 0, w6 = 100
• Case (e): w4 = 1000, w5 = 100, w6 = 0
• Case (f): w4 = 1000, w5 = 0, w6 = 100
• Case (g): w4 = 0, w5 = 100, w6 = 100
• Case (h): w4 = 1000, w5 = 100, w6 = 100
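The eight cases are simply all on/off combinations of the three survivability weights. A minimal sketch that generates them in the same order as the list, assuming the enabled values 1000 for w4 and 100 for w5 and w6; the names `ENABLED` and `weight_cases` are hypothetical:

```python
from itertools import product

# Value used for each weight when the corresponding factor is switched on.
ENABLED = {"w4": 1000, "w5": 100, "w6": 100}


def weight_cases():
    """Return {"a": {...}, ..., "h": {...}} in the order used in the paper."""
    # Sort by the number of enabled factors, then w4 before w5 before w6.
    combos = sorted(
        product((True, False), repeat=3),
        key=lambda flags: (sum(flags), tuple(not f for f in flags)),
    )
    cases = {}
    for label, flags in zip("abcdefgh", combos):
        cases[label] = {
            name: (value if on else 0)
            for (name, value), on in zip(ENABLED.items(), flags)
        }
    return cases
```

For example, `weight_cases()["g"]` yields w4 = 0, w5 = 100, w6 = 100. Generating the cases this way makes it easy to iterate over them in a simulation campaign.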

One should note that case (a) represents the survivability-agnostic AHB mechanism. As all resilience-aware weights are set to 0, only reactive survivability mechanisms can be enabled, and this case is therefore considered as a real-life conventional approach. This assumption was made based on comprehensive literature studies. Previous works consider optical bypasses exclusively as a mechanism to offload and handle traffic after failure events, and none of them directly addresses resilience during the bypass establishing process. Such a baseline approach is reflected by case (a) combined with the dropAll, redirectBypass and bypassIP mechanisms. Case (a) can also be considered as a scenario that is possible to implement without utilizing the SDN concept. This is because only the first three factors in Eq. (1) are significant. Those three factors can be gathered and utilized, with some limitations, in a distributed manner.

The rationale behind the other cases is to switch some factors from formula (1) off and on in order to assess their effectiveness. Cases (b)–(d) assume that only a single factor is enabled (nby, Tby and TcongIP, respectively). In cases (e)–(g) combinations of two different factors are considered. Finally, in case (h) all of the factors are enabled. Values of the factors enabled in cases (b)–(h) change so rapidly that attempts to implement those mechanisms in a distributed way (without using SDN) would be extremely challenging. As in the case of weights w1, w2 and w3, the values of the survivability weights should also be adjusted by the network operator to the network conditions and its strategy.

Simultaneously, three flags are defined to reflect the status of the mechanisms proposed in Section 3.3 with the aim of handling traffic after failures: dropAll, redirectBypass and bypassIP. A true value assigned to a flag denotes that the corresponding mechanism is enabled.
The following cases were considered:

• Case (I): dropAll = false, redirectBypass = false, bypassIP = false
• Case (II): dropAll = true, redirectBypass = false, bypassIP = false
• Case (III): dropAll = true, redirectBypass = false, bypassIP = true
• Case (IV): dropAll = false, redirectBypass = true, bypassIP = false
• Case (V): dropAll = false, redirectBypass = false, bypassIP = true
• Case (VI): dropAll = false, redirectBypass = true, bypassIP = true

Base case (I) was described in Section 3.3 as basicRestoration (none of the proposed mechanisms is utilized). As enabling the dropAll and redirectBypass mechanisms together would result in a huge number of requests to the SDN controller, the two cases with both flags set simultaneously were skipped.

In our studies different combinations of weights cases ((a)–(h)) with flags cases ((I)–(VI)) were considered. It should be noted that combinations of case (a) with cases (I)–(VI) reflect approaches where only after-failure mechanisms are enabled. The presented after-failure mechanisms are well known, and thus these combinations should be considered as real-life conventional approaches.

We performed our studies under three different failure scenarios assuming different ranges and parameters of failures. The occasional failures and repairs scenario assumes that both events occur with a defined probability for each link and no more than one link is broken at the same time. The idea is to reflect standard network behaviour. In the disaster scenario, it is assumed that numerous network links fail simultaneously, reflecting some serious issues, e.g. a huge network crash or even natural phenomena like earthquakes. Finally, the last scenario was helpful to assess the proposed solution under the critical failures case. Such an issue may be an effect of a random confluence of events, but it may also result from an intentional and targeted malicious operation. Failure and repair events were generated with an exponential distribution. When analyzing the results one must also notice that the amount of transmitted data was computed under the assumption of scaled down network resources.
Units of [MB] in the table should be changed to [GB] if using realistic 10 Gbit/s rates instead of the scaled down 10 Mbit/s.

Two important conclusions valid for all failure scenarios regard the w4 weight and the dropAll mechanism. Enabling the dropAll mechanism results in complete deterioration of network performance, especially in terms of transmitted data. This results from the fact that the IP layer, with strongly constrained resources, must handle the traffic from all of the bypasses in the network. Unmanageable congestion imposes data loss and increases delay. Simultaneously, if w4 = 1000 then the nby factor becomes significant in the gain Eq. (1), and thus bypasses using optical links traversed by the smallest number of other bypasses are preferred. Such an approach is intuitively reasonable; however, as a side effect, the average bypass length increases significantly, as the controller avoids links with existing bypasses even when there is only a single bypass transmitting a negligible amount of data. Moreover, longer paths are selected even when there is a possibility to reuse one of the existing bypasses. Therefore, to limit the amount of data and for clarity purposes, we do not present detailed results for cases (II) and (III) (where the dropAll mechanism was enabled), nor for cases (b), (e), (f) and (h), which assume w4 = 1000.

4.1. Occasional failures and repairs scenario

Table 2 presents results under the first scenario of occasional failures and repairs. For each parameter 16 results can be noticed, one for each combination of weights cases ((a), (c), (d) and (g)) with after-failure mechanisms cases ((I), (IV), (V), (VI)). In order to analyze the results in a systematic way, we will first compare weights cases under an assumed combination of after-failure mechanisms (this means considering consecutive table rows in a particular column). After that we will compare different after-failure mechanisms assuming a fixed weights case (analogously, consecutive columns in a selected row will be compared for each parameter). Finally, some global conclusions will be drawn. While analyzing the results please note that case (a) is considered as a conventional scenario, as only well-known reactive survivability mechanisms can be enabled and SDN usage is not critical in that case.

Table 2
Network performance under occasional failures and repairs scenario. Case (a): w4 = 0, w5 = 0, w6 = 0; Case (c): w4 = 0, w5 = 100, w6 = 0; Case (d): w4 = 0, w5 = 0, w6 = 100; Case (g): w4 = 0, w5 = 100, w6 = 100. After failure flag cases: letters F and T denote false and true, respectively. Consecutive letters regard dropAll, redirectBypass, and bypassIP mechanisms.

Parameter                               Case  I: FFF         IV: FTF           V: FFT         VI: FTT
Transmitted data [MB]                   (a)   217.86 ± 1.44  256.09 ± 1.29     243.54 ± 1.32  249.39 ± 1.13
                                        (c)   211.02 ± 1.72  236.20.12 ± 1.40  233.66 ± 1.49  247.38 ± 1.10
                                        (d)   220.78 ± 1.5   243.01 ± 1.74     245.27 ± 0.98  246.64 ± 1.53
                                        (g)   210.33 ± 1.59  236.79 ± 1.59     228.11 ± 1.33  241.54 ± 1.75
Delay [ms]                              (a)   43.68 ± 2.34   44.60 ± 2.37      45.51 ± 2.23   45.55 ± 3.17
                                        (c)   45.52 ± 2.07   46.59 ± 1.80      44.75 ± 2.1    47.32 ± 3.76
                                        (d)   43.83 ± 2.18   44.33 ± 2.46      45.16 ± 2.5    46.43 ± 2.37
                                        (g)   45.29 ± 4.26   46.23 ± 1.96      46.63 ± 1.9    47.15 ± 2.36
No. of hops                             (a)   10.81 ± 0.23   10.81 ± 0.21      10.84 ± 0.16   10.88 ± 0.17
                                        (c)   10.89 ± 0.15   10.96 ± 0.16      10.79 ± 0.25   10.94 ± 0.18
                                        (d)   10.85 ± 0.21   10.88 ± 0.19      10.8 ± 0.13    10.85 ± 0.17
                                        (g)   10.93 ± 0.16   10.88 ± 0.13      10.91 ± 0.20   10.96 ± 0.18
No. of bypasses                         (a)   53.25 ± 0.25   65.75 ± 0.48      69.13 ± 0.77   73.75 ± 0.74
                                        (c)   52.37 ± 0.38   68.38 ± 0.61      64.75 ± 0.53   79.13 ± 0.71
                                        (d)   55.38 ± 0.13   63.13 ± 0.50      72.25 ± 0.53   74.63 ± 0.77
                                        (g)   49.25 ± 0.28   64.75 ± 0.51      68.75 ± 0.70   77.00 ± 0.62
Bypass length                           (a)   6.74 ± 0.33    6.71 ± 0.49       6.63 ± 0.45    6.67 ± 0.38
                                        (c)   6.40 ± 0.57    6.08 ± 0.31       6.22 ± 0.38    6.10 ± 0.35
                                        (d)   6.79 ± 0.43    6.71 ± 0.46       6.6 ± 0.29     6.57 ± 0.36
                                        (g)   6.57 ± 0.32    6.36 ± 0.4        6.09 ± 0.32    6.11 ± 0.28
Bypass activity [s]                     (a)   6.53 ± 0.52    6.71 ± 0.49       6.8 ± 0.18     7.08 ± 0.28
                                        (c)   6.48 ± 0.57    7.08 ± 0.39       6.97 ± 0.21    7.00 ± 0.36
                                        (d)   6.44 ± 0.52    7.33 ± 0.58       6.66 ± 0.25    7.14 ± 0.44
                                        (g)   6.72 ± 0.64    7.07 ± 0.5        6.61 ± 0.21    7.01 ± 0.18
No. of withdrawn bypasses               (a)   10.75 ± 0.14   9.89 ± 0.14       12.88 ± 0.18   15.13 ± 0.20
                                        (c)   9.37 ± 0.12    12.00 ± 0.16      11.88 ± 0.15   14.88 ± 0.18
                                        (d)   11.5 ± 0.16    12 ± 0.17         12.63 ± 0.16   15.00 ± 0.16
                                        (g)   10.13 ± 0.10   12.63 ± 0.16      11.5 ± 0.16    14.00 ± 0.16
Throughput of withdrawn bypasses [MB]   (a)   8.51 ± 0.10    9.89 ± 0.14       10.17 ± 0.14   11.79 ± 0.16
                                        (c)   7.43 ± 0.10    8.95 ± 0.12       9.62 ± 0.12    11.44 ± 0.16
                                        (d)   9.08 ± 0.11    9.62 ± 0.13       9.44 ± 0.13    11.63 ± 0.13
                                        (g)   7.93 ± 0.08    9.40 ± 0.12       8.58 ± 0.11    10.77 ± 0.13
Redirected IP traffic [MB]              (a)   37.68 ± 0.48   36.21 ± 0.43      37.58 ± 0.46   37.35 ± 0.44
                                        (c)   36.23 ± 0.40   38.05 ± 0.52      38.81 ± 0.45   38.97 ± 0.47
                                        (d)   37.09 ± 0.46   37.91 ± 0.46      40.51 ± 0.50   40.96 ± 0.50
                                        (g)   34.77 ± 0.43   39.22 ± 0.50      36.98 ± 0.43   39.64 ± 0.50

Assuming case (I) with all after-failure mechanisms disabled, it can be observed that case (c) gives improvements in terms of reliability parameters (number of withdrawn bypasses, throughput of withdrawn bypasses and redirected IP traffic) at the cost of a slight deterioration of network performance (transmitted data and delay). This is reasonable, as case (c) makes Tby contribute to the gain formula, so lightpaths are more evenly distributed over the network. As a side effect a smaller number of bypasses can be established, which decreases network capacity. On the other hand, case (d) increases the impact of the TcongIP parameter on the gain. As a result, network switching capabilities increase at the cost of reliability factors. Finally, the weights assumed in case (g) try to merge both previous cases, but not successfully. Network performance is significantly deteriorated, and simultaneously, reliability factors can be roughly evaluated somewhere between cases (c) and (d). Case

(IV) enables the redirectBypass mechanism. Case (a), which does not introduce any factors to the gain formula, provides the best results in terms of network performance. Moreover, case (c) is the only one that improves the throughput of withdrawn bypasses reliability parameter. Thus, proactive mechanisms should not be combined with the redirectBypass after-failure mechanism alone. That is because all bypasses affected by the failure, even those transmitting a small amount of traffic, will be rerouted in the optical domain instead of being groomed. The effect of inefficient network resource consumption will be amplified once reliability factors contribute to the gain formula and widely distribute lightpaths over the network.

The bypassIP mechanism is enabled in case (V), which, combined with case (d), provides the best results in terms of network performance. Those results are comparable to case (a). That is because case (d) makes TcongIP contribute to the gain formula, so links congested in the electrical layer are avoided by optical paths, and thus the IP traffic can be successfully bypassed utilizing the bypassIP mechanism. All the cases (c), (d) and (g), which introduce proactive survivability mechanisms, outperform case (a) with regard to the reliability parameters. The IP traffic affected by the failure is automatically handled in the optical domain. Simultaneously, all the optical paths, including those created after failures, are optimized by survivable proactive mechanisms. This observation supports the view that cooperation between layers in multilayer networks may bring valuable results.

Finally, case (VI) enables both the redirectBypass and bypassIP after-failure mechanisms. Case (a) provides the best results regarding network performance; however, the differences are much less significant, especially when comparing case (a) with cases (c) and (d). Also in terms of network reliability cases (c), (d) and (g) outperform case (a). This means that enabling bypassIP in addition to redirectBypass makes it reasonable to combine proactive and after-failure survivability mechanisms. The reason is that electric layer traffic affected by the failure may be bypassed utilizing existing bypasses with spare capacity. As a result optical paths are utilized more efficiently, and thus survivable proactive mechanisms may introduce more advantages. As an example, Fig. 6 presents results obtained for different proactive cases under the assumption of combination (V) of after-failure mechanisms. It illustrates the fact that case (d) provides the best results in terms of network performance, simultaneously ensuring reliability at a reasonable level.

Fig. 6. Transmitted data (white) and throughput of withdrawn bypasses (gray) under case (V) of after failure mechanisms.

Now, we will assume consecutive weights cases and compare after-failure mechanisms under this assumption. No matter which proactive mechanisms are enabled (cases (a), (c), (d) or (g)), the throughput and the number of withdrawn bypasses are minimal when all after-failure mechanisms are disabled (case (I)). The reason is that disabling any after-failure mechanism reduces the number of bypasses in the network, which naturally reduces the negative impact of each failure on those bypasses. The only exception can be observed for case (a), where enabling the redirectBypass mechanism reduces the number of withdrawn bypasses.
This results from the fact that in case (a) no reliability factors contribute to the gain formula and lightpaths are less distributed over the network. Thus, the redirectBypass mechanism is able to redirect affected bypasses, reducing the number of withdrawn bypasses.

In terms of network performance, enabling any after failure mechanism brings significant improvements compared to the case with all after failure mechanisms disabled. For case (a), the best results can be observed with the redirectBypass mechanism enabled. The reason is the same as the one provided above for reliability parameters: the redirectBypass mechanism is able to effectively handle bypasses affected by failures, as other bypasses are rather concentrated in selected network areas. For cases (c) and (g), the best network performance is achieved in case (VI), with both the redirectBypass and bypassIP after failure mechanisms enabled. This means that, as long as the Tby factor contributes to the gain formula, spreading optical layer traffic over the network, it is reasonable to simultaneously redirect bypasses in the optical domain and bypass the IP layer traffic when a failure occurs and network performance is the priority. Finally, case (d), which assigns a non-zero weight only to the TcongIP factor in the gain formula, achieves similar network performance for cases (IV), (V) and (VI). This means that if proactive mechanisms direct bypasses so as to avoid links congested in the IP layer, any after failure mechanism may improve performance, as it integrates different multilayer resilience mechanisms.

Globally, the best results in terms of network performance are achieved by the combination of cases (a) and (IV), which means that none of the reliability factors contributes to the gain formula and only redirectBypass is enabled. On the other hand, from the survivability point of view, the combination of cases (c) and (I) outperforms other cases, but at the cost of significant deterioration of network performance. However, there are also combinations that only slightly decrease network performance and achieve significantly improved network survivability. Cases (c) and (d) combined with cases (IV) and (V) are examples that provide a valuable tradeoff between performance and reliability.

4.2. Disaster scenario

Table 3 presents results under a disaster scenario. The format of the table and the approach to the analysis are exactly the same as in the previous scenario. However, some analyses are skipped so as not to repeat those provided for occasional failures. The first important observation is that for any case (I), (IV), (V) or (VI), weights combination (d) improves neither network performance nor survivability. Thus, in a disaster scenario it is not reasonable to make optical bypasses avoid links congested in the IP layer by making TcongIP the only factor that contributes to the gain formula.
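The case labels used throughout this section can be written down as plain configuration data. The sketch below is only an illustration: the weight values and the flag order (dropAll, redirectBypass, bypassIP) follow the legend of Table 3, the pairing of w5 with Tby and of w6 with TcongIP is inferred from how cases (c) and (d) are described in the text, and all Python names (AfterFailureFlags, parse_flags, AFTER_FAILURE_CASES, WEIGHT_CASES) are hypothetical and not taken from the SAHB source code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AfterFailureFlags:
    """Which after failure mechanisms are enabled (illustrative names)."""
    drop_all: bool
    redirect_bypass: bool
    bypass_ip: bool


def parse_flags(code: str) -> AfterFailureFlags:
    """Decode a three-letter code such as 'FTF'.

    Letter order follows the table legend: dropAll, redirectBypass, bypassIP.
    """
    if len(code) != 3 or any(letter not in "FT" for letter in code):
        raise ValueError(f"expected three letters from 'F'/'T', got {code!r}")
    drop_all, redirect_bypass, bypass_ip = (letter == "T" for letter in code)
    return AfterFailureFlags(drop_all, redirect_bypass, bypass_ip)


# After failure cases as labelled in the tables.
AFTER_FAILURE_CASES = {"I": "FFF", "IV": "FTF", "V": "FFT", "VI": "FTT"}

# Weight cases as labelled in the tables. Assumption: w5 scales the Tby
# factor and w6 scales the TcongIP factor in the gain formula.
WEIGHT_CASES = {
    "a": {"w4": 0, "w5": 0, "w6": 0},
    "c": {"w4": 0, "w5": 100, "w6": 0},
    "d": {"w4": 0, "w5": 0, "w6": 100},
    "g": {"w4": 0, "w5": 100, "w6": 100},
}

if __name__ == "__main__":
    for label, code in AFTER_FAILURE_CASES.items():
        print(label, parse_flags(code))
```

Keeping the two dimensions separate mirrors the way the analysis fixes one of them (weights or after failure flags) and varies the other.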
Assuming cases (I) and (IV), weights combinations (c) and (g) introduce a slight deterioration of network performance. However, at the same time, those cases improve network survivability. Therefore, a tradeoff can be observed as long as Tby contributes to the gain formula (alone or accompanied by TcongIP). For cases (V) and (VI), the tradeoff is valid only for weights combination (c), while combination (g) decreases both network performance and survivability. This means that, for the disaster scenario, the mechanism redirecting bypasses affected by the failure can be combined with both cases, (c) and (g). At the same time, assigning a non-zero weight to TcongIP in the gain formula should be avoided if the network controller tries to bypass the IP layer traffic affected by the failure, as both mechanisms compete in handling the affected IP layer traffic. The effect is amplified as numerous failures occur in the network, which is the case under the disaster scenario. As an example, Fig. 7 presents the results obtained for different proactive cases under the assumption of combination (VI) of after failure mechanisms. It illustrates that case (c) provides a slight deterioration in terms of network performance, while simultaneously ensuring the best reliability.

Now, in order to compare different after failure mechanisms, we fix consecutive weights cases. Analogously to the occasional failure scenario, no matter which proactive mechanisms are enabled (cases (a), (c), (d) or (g)), the throughput and the number of withdrawn bypasses are minimal when all after failure mechanisms are disabled (case (I)). The reason is that disabling any after failure mechanism reduces the number of bypasses in the network, which naturally reduces the negative impact of each failure on those bypasses, but also lowers network performance. Thus, enabling any of the after failure mechanisms brings significant improvements in terms of network performance compared to the case with all after failure mechanisms disabled. For cases (a) and (g), the best results can be observed provided that the redirectBypass mechanism is enabled (cases (IV) and (VI)). For case (c), the bypassIP mechanism must be enabled (case (V) or (VI)) to obtain the best results regarding network performance. Finally, all cases (I), (IV), (V) and (VI) provide more or less the same results assuming case (d). However, in some cases those improvements come at the cost of reliability deterioration. In particular, case (VI) has a negative impact on survivability parameters for all cases (a), (c), (d) and (g), as does case (V) for cases (a) and (g).

Globally, considering the improvement of performance balanced against the reliability factors, the most attractive results are obtained when case (c) is combined with the redirectBypass or bypassIP mechanism (case (IV) or (V)). At the same time, the best results in terms of network performance are achieved by the combination of case (a) with cases (IV) and (VI), which means that none of the reliability factors contributes to the gain formula and at least the redirectBypass mechanism is enabled. On the other hand, from the survivability point of view, the combination of cases (c) and (I) outperforms the others, but at the cost of significant deterioration of network performance. All the general conclusions are consistent with those drawn for the occasional failure scenario.

Table 3
Network performance under disaster scenario.
Weights cases: Case (a): w4 = 0, w5 = 0, w6 = 0; Case (c): w4 = 0, w5 = 100, w6 = 0; Case (d): w4 = 0, w5 = 0, w6 = 100; Case (g): w4 = 0, w5 = 100, w6 = 100.
After failure flag cases: letters F and T denote false and true, respectively. Consecutive letters regard the dropAll, redirectBypass, and bypassIP mechanisms.

Parameter                              Case  I: FFF          IV: FTF         V: FFT          VI: FTT
Transmitted data [MB]                  (a)   174.38 ± 2.29   211.09 ± 4.13   203.81 ± 2.78   214.81 ± 3.26
                                       (c)   165.90 ± 2.30   191.12 ± 4.91   196.90 ± 2.36   204.40 ± 3.81
                                       (d)   162.39 ± 2.11   181.60 ± 1.49   179.59 ± 1.81   180.39 ± 3.03
                                       (g)   162.43 ± 2.47   193.39 ± 3.51   186.43 ± 3.3    196.35 ± 1.44
Delay [ms]                             (a)   43.86 ± 2.90    42.21 ± 4.28    44.59 ± 4.89    47.70 ± 4.29
                                       (c)   44.72 ± 2.29    45.04 ± 3.18    43.14 ± 3.77    43.65 ± 4.56
                                       (d)   45.59 ± 3.57    47.88 ± 4.30    46.48 ± 5.20    47.98 ± 5.60
                                       (g)   52.36 ± 3.84    43.03 ± 5.42    47.35 ± 3.93    46.25 ± 5.06
No. of hops                            (a)   11.05 ± 0.69    10.63 ± 0.66    11.30 ± 0.83    11.24 ± 0.93
                                       (c)   11.08 ± 0.64    11.03 ± 0.66    10.70 ± 0.55    10.89 ± 1.06
                                       (d)   11.12 ± 0.70    11.17 ± 0.70    11.86 ± 2.52    11.11 ± 1.84
                                       (g)   11.41 ± 1.03    11.15 ± 0.82    11.78 ± 2.17    11.32 ± 1.27
No. of bypasses                        (a)   46.13 ± 0.42    66.00 ± 1.01    72.83 ± 0.68    79.50 ± 0.69
                                       (c)   42.38 ± 0.43    64.25 ± 0.87    64.75 ± 0.83    82.33 ± 0.71
                                       (d)   42.57 ± 0.38    76.00 ± 0.72    77.80 ± 0.90    79.00 ± 0.82
                                       (g)   44.25 ± 0.31    64.60 ± 0.74    73.25 ± 0.52    86.00 ± 1.17
Bypass length                          (a)   7.16 ± 0.57     7.03 ± 0.67     7.01 ± 0.29     7.16 ± 0.37
                                       (c)   6.89 ± 0.76     6.48 ± 0.48     6.60 ± 0.75     6.32 ± 0.32
                                       (d)   7.26 ± 0.44     6.45 ± 0.52     6.93 ± 0.54     6.02 ± 0.51
                                       (g)   6.77 ± 0.69     6.61 ± 0.61     6.80 ± 0.64     6.74 ± 0.55
Bypass activity [s]                    (a)   6.27 ± 0.46     6.84 ± 0.91     6.69 ± 0.66     7.05 ± 0.91
                                       (c)   6.50 ± 0.75     6.94 ± 1.07     6.26 ± 0.33     6.39 ± 1.08
                                       (d)   6.27 ± 0.46     6.60 ± 0.32     6.20 ± 0.56     6.68 ± 1.02
                                       (g)   6.17 ± 0.61     6.66 ± 0.55     5.92 ± 0.75     6.31 ± 1.04
No. of withdrawn bypasses              (a)   16.63 ± 0.07    24.75 ± 0.40    25.83 ± 0.23    24.25 ± 0.37
                                       (c)   14.00 ± 0.12    20.00 ± 0.29    19.00 ± 0.28    23.67 ± 0.40
                                       (d)   17.71 ± 0.06    26.00 ± 0.26    28.40 ± 0.16    32.00 ± 0.50
                                       (g)   14.88 ± 0.09    22.40 ± 0.27    28.25 ± 0.25    30.00 ± 0.59
Throughput of withdrawn bypasses [MB]  (a)   12.69 ± 0.09    17.27 ± 0.46    20.03 ± 0.35    19.31 ± 0.51
                                       (c)   10.55 ± 0.14    14.66 ± 0.40    14.92 ± 0.41    16.98 ± 0.28
                                       (d)   13.84 ± 0.08    18.72 ± 0.32    22.18 ± 0.27    30.87 ± 0.64
                                       (g)   11.33 ± 0.11    16.87 ± 0.35    22.08 ± 0.44    22.49 ± 0.61
Redirected IP traffic [MB]             (a)   69.87 ± 0.65    56.21 ± 1.14    68.02 ± 1.24    59.78 ± 1.12
                                       (c)   73.36 ± 0.63    59.28 ± 1.11    77.06 ± 1.52    80.66 ± 1.17
                                       (d)   77.99 ± 0.64    81.59 ± 0.62    88.44 ± 0.95    83.29 ± 1.62
                                       (g)   68.52 ± 0.65    65.71 ± 1.13    90.16 ± 0.63    86.21 ± 0.67
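The performance versus survivability tradeoff discussed for the disaster scenario can be made explicit with a small calculation on the mean values of Table 3. The sketch below is an illustration only and is not the authors' evaluation procedure: it uses transmitted data as the performance indicator and the throughput of withdrawn bypasses as the survivability indicator, ignores the confidence intervals, and applies a simple non-dominance (Pareto) filter that the paper itself does not use. All Python names are hypothetical.

```python
# Mean values copied from Table 3 (disaster scenario); confidence intervals omitted.
# Keys: weight case -> after failure case -> value in MB.
TRANSMITTED_MB = {
    "a": {"I": 174.38, "IV": 211.09, "V": 203.81, "VI": 214.81},
    "c": {"I": 165.90, "IV": 191.12, "V": 196.90, "VI": 204.40},
    "d": {"I": 162.39, "IV": 181.60, "V": 179.59, "VI": 180.39},
    "g": {"I": 162.43, "IV": 193.39, "V": 186.43, "VI": 196.35},
}
WITHDRAWN_MB = {
    "a": {"I": 12.69, "IV": 17.27, "V": 20.03, "VI": 19.31},
    "c": {"I": 10.55, "IV": 14.66, "V": 14.92, "VI": 16.98},
    "d": {"I": 13.84, "IV": 18.72, "V": 22.18, "VI": 30.87},
    "g": {"I": 11.33, "IV": 16.87, "V": 22.08, "VI": 22.49},
}


def combinations():
    """Yield ((weight_case, flag_case), transmitted, withdrawn) for every table cell."""
    for weight_case, row in TRANSMITTED_MB.items():
        for flag_case, transmitted in row.items():
            withdrawn = WITHDRAWN_MB[weight_case][flag_case]
            yield (weight_case, flag_case), transmitted, withdrawn


def pareto_front(points):
    """Keep combinations that no other combination beats on both criteria.

    Higher transmitted data is better; lower withdrawn throughput is better.
    """
    points = list(points)
    front = []
    for label, sent, lost in points:
        dominated = any(
            other_sent >= sent and other_lost <= lost
            and (other_sent > sent or other_lost < lost)
            for _, other_sent, other_lost in points
        )
        if not dominated:
            front.append(label)
    return front


best_performance = max(combinations(), key=lambda p: p[1])[0]
best_survivability = min(combinations(), key=lambda p: p[2])[0]
front = pareto_front(combinations())

if __name__ == "__main__":
    print("best performance:", best_performance)
    print("best survivability:", best_survivability)
    print("non-dominated combinations:", front)
```

On these mean values, the highest transmitted data belongs to case (a) with (VI) and the lowest withdrawn throughput to case (c) with (I). The non-dominated set contains case (a) with (I), (IV) and (VI) and case (c) with all four after failure cases, which is consistent with the combinations singled out in the text; no combination involving case (d) or (g) survives the filter.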


Fig. 7. Transmitted data (white) and throughput of withdrawn bypasses (gray) under case (VI) of after failure mechanisms.

4.3. Single critical failure scenario

Finally, Table 4 presents results under a single critical failure scenario. First of all, it must be explained why the reliability parameters (the number of withdrawn bypasses, the throughput of withdrawn bypasses and the redirected IP traffic) are independent of the combination of after failure mechanisms. There is only a single failure during the simulation, so reactive after failure mechanisms cannot change the impact of any future failures.

When comparing weights combinations in terms of reliability parameters, the following conclusions can be drawn. Case (c), which makes Tby contribute to the gain formula, limits the number of withdrawn bypasses and their throughput. The network is even more reliable under case (g), when both factors (Tby and TcongIP) contribute to the gain formula, making bypasses even more distributed over the network. However, those improvements come at the cost of network performance deterioration, especially for cases (c) and (g), when compared to case (a). Case (d) retains network performance at a level comparable to the base case (a).

At the same time, the combinations of after failure mechanisms (cases (I), (IV), (V), (VI)) have a varying impact on network performance. The general conclusion is that enabling any after failure mechanism has a negative impact on network performance, no matter which combination of weights is considered. Moreover, despite the fact that network performance may be retained for case (d), the network operator should carefully analyze the risk of a single critical failure in the infrastructure and assess whether the tradeoff associated with the proposed proactive mechanisms is acceptable in such a case. As an example, Fig. 8 presents results obtained for different proactive cases under the assumption of combination (VI) of after failure mechanisms. It illustrates that case (d) provides a slight deterioration in terms of network performance, while simultaneously ensuring a marginal reliability improvement.
Significant improvements regarding survivability provided by cases (c) and (g) come at the cost of network performance deterioration.

To sum up, we performed simulations under three major failure scenarios. For the occasional and disaster failure scenarios, similar and consistent conclusions were drawn. Thus, it can be stated that the properties of the mechanisms are only slightly dependent on failure intensity. The common conclusions are as follows. The best combinations of mechanisms were selected in terms of network performance, survivability and the tradeoff between these two factors. The best performance may be obtained if none of the factors is introduced to the gain formula and at least the redirectBypass mechanism is enabled. On the other hand, from the survivability point of view, introducing Tby to the gain formula while all after failure mechanisms are disabled outperforms other combinations. However, the best tradeoff between multilayer network performance and reliability is achieved when Tby contributes to the gain formula and the redirectBypass or bypassIP mechanism is enabled. Slightly different conclusions are drawn for the single critical failure scenario, which is based on substantially different assumptions. In this scenario, the network operator must assess even more carefully whether the tradeoff associated with the proposed proactive mechanisms is acceptable given the risk of a single critical failure.

4.4. Practical deployment considerations

To send data between the central controller and routers, a communication protocol, e.g., OpenFlow, needs to be used. We did not measure the amount of traffic sent between the controller and routers. However, it is simple to estimate the maximum amount of such traffic. In the most complex scenario, we generated 3810 flows. As network capacity was scaled down, we assume 10 times more flows for further considerations, which is 38100. A flow ID was stored in memory using 4 bytes. Flow IDs were sent to the controller when they were registered in routers. We had 39 routers in the network, so at most 38100 × 4 B × 39 = 5.94 MB was sent. Moreover, no more than 95 bypasses were set up in the network during a simulation run. If we assume that all flows were active in the congested links, 95 × 38100 × 8 B = 28.96 MB was sent in the network (an additional 4 bytes were needed to send a flow rate to the controller). This value could be twice as large if we assume that similar information was sent back from the controller to routers. As considered in Section 3.3, among the mechanisms handling traffic after a failure event, only dropAll imposes some additional traffic exchange between the controller and network devices.
The impact of the redirectBypass and bypassIP mechanisms is reflected only by the maximum number of bypasses set up in the network during a simulation run and does not affect other performance issues. Thus, applying a scaling factor of 1.4 to reflect the dropAll mechanism overestimates the amount of data being exchanged. In conclusion, at most 87.03 MB was sent between the controller and routers in the network in one simulation run. The number of flows in core networks varies depending on the approach to rule placement in forwarding tables, as well as on flow definition and aggregation. At the same time, our aim is to show that the amount of control traffic is not an issue for the SAHB mechanism even in the worst case. Therefore, additionally multiplying the amount of data by 10 results in less than 1 GB of data. This value is negligible in the context of multilayer core networks operating at 10 Gbit/s rates. Moreover, this is the maximum value, and a lower amount of traffic was certainly transmitted between the controller and routers in each case. Importantly, we did not observe any unfavorable impact of signalling delay or lack of synchronization among routers on setting up and releasing bypasses.

A separate issue concerns controller scalability at the moment of a failure, when multiple requests may fully utilize the controller's processing capacity. The problem is addressed in practical deployments as a more general one, i.e., mitigating a single point of failure and any scalability issues resulting from the centralized nature of the control plane. Therefore, a few solutions have been proposed to distribute the control plane in either flat or hierarchical designs [38]. Most of these approaches can be seamlessly combined with the SAHB mechanism, since even if the control plane is distributed among numerous controllers, it communicates with network devices as a single entity.

Table 4
Network performance under single critical failure scenario.
Weights cases: Case (a): w4 = 0, w5 = 0, w6 = 0; Case (c): w4 = 0, w5 = 100, w6 = 0; Case (d): w4 = 0, w5 = 0, w6 = 100; Case (g): w4 = 0, w5 = 100, w6 = 100.
After failure flag cases: letters F and T denote false and true, respectively. Consecutive letters regard the dropAll, redirectBypass, and bypassIP mechanisms.

Parameter                              Case  I: FFF          IV: FTF         V: FFT          VI: FTT
Transmitted data [MB]                  (a)   250.66 ± 0.36   241.74 ± 0.68   245.42 ± 0.60   244.64 ± 0.58
                                       (c)   232.14 ± 0.11   226.10 ± 1.13   225.90 ± 1.33   222.51 ± 1.32
                                       (d)   248.40 ± 0.32   243.38 ± 0.44   247.37 ± 0.49   243.69 ± 0.70
                                       (g)   219.93 ± 1.12   212.99 ± 1.10   212.31 ± 1.23   218.38 ± 1.04
Delay [ms]                             (a)   46.13 ± 1.80    46.23 ± 2.15    48.01 ± 2.91    48.17 ± 2.56
                                       (c)   45.95 ± 2.70    46.47 ± 2.47    45.59 ± 2.90    47.79 ± 2.95
                                       (d)   45.19 ± 1.14    46.09 ± 1.83    46.47 ± 2.12    46.47 ± 2.57
                                       (g)   46.76 ± 2.36    46.31 ± 2.26    48.09 ± 3.05    46.52 ± 1.98
No. of hops                            (a)   10.93 ± 0.16    10.93 ± 0.08    10.93 ± 0.08    11.01 ± 0.15
                                       (c)   11.04 ± 0.11    11.07 ± 0.11    11.11 ± 0.11    11.07 ± 0.10
                                       (d)   10.90 ± 0.09    10.92 ± 0.10    10.92 ± 0.06    10.94 ± 0.07
                                       (g)   11.12 ± 0.19    11.15 ± 0.18    11.14 ± 0.19    11.15 ± 0.12
No. of bypasses                        (a)   47.13 ± 0.16    48.29 ± 0.17    49.29 ± 0.20    50.38 ± 0.15
                                       (c)   45.13 ± 0.18    44.88 ± 0.31    44.14 ± 0.19    44.57 ± 0.20
                                       (d)   49.38 ± 0.22    49.75 ± 0.19    50.00 ± 0.21    50.63 ± 0.24
                                       (g)   44.33 ± 0.22    46.71 ± 0.15    46.67 ± 0.17    48.50 ± 0.15
Bypass length                          (a)   6.76 ± 0.25     6.67 ± 0.24     6.85 ± 0.34     6.66 ± 0.30
                                       (c)   6.35 ± 0.29     6.37 ± 0.28     6.34 ± 0.33     6.33 ± 0.21
                                       (d)   6.56 ± 0.25     6.69 ± 0.27     6.76 ± 0.29     6.67 ± 0.35
                                       (g)   6.27 ± 0.57     6.09 ± 0.51     6.07 ± 0.59     6.05 ± 0.31
Bypass activity [s]                    (a)   7.73 ± 0.37     7.57 ± 0.35     7.81 ± 0.36     7.72 ± 0.27
                                       (c)   7.82 ± 0.34     8.06 ± 0.60     8.06 ± 0.48     8.13 ± 0.42
                                       (d)   7.55 ± 0.29     7.65 ± 0.34     7.83 ± 0.39     7.88 ± 0.44
                                       (g)   7.99 ± 0.39     7.88 ± 0.37     7.89 ± 0.34     7.92 ± 0.24

Parameter                              Case  All after failure cases (I, IV, V, VI)
No. of withdrawn bypasses              (a)   6.00 ± 0.02
                                       (c)   5.38 ± 0.02
                                       (d)   6.12 ± 0.02
                                       (g)   5.13 ± 0.03
Throughput of withdrawn bypasses [MB]  (a)   4.84 ± 0.04
                                       (c)   4.46 ± 0.04
                                       (d)   4.68 ± 0.04
                                       (g)   4.38 ± 0.05
Redirected IP traffic [MB]             (a)   12.51 ± 0.26
                                       (c)   12.23 ± 0.15
                                       (d)   11.90 ± 1.40
                                       (g)   12.35 ± 0.32

Fig. 8. Transmitted data (white) and throughput of withdrawn bypasses (gray) under case (VI) of after failure mechanisms.
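The control traffic estimate of Section 4.4 can be reproduced step by step. The sketch below is a worked calculation under stated assumptions, not the authors' script: megabytes are taken as decimal (10^6 bytes), and the 1.4 dropAll scaling factor is assumed to apply only to the bypass-related exchange counted in both directions, because that reading comes closest to the 87.03 MB quoted in the text. The variable names are hypothetical.

```python
MB = 1_000_000  # decimal megabytes; assumption that matches the quoted figures

flows = 3810 * 10          # flows in the most complex scenario, scaled up tenfold
routers = 39
flow_id_bytes = 4
flow_rate_bytes = 4
max_bypasses = 95
drop_all_factor = 1.4      # scaling factor attributed to the dropAll mechanism

# Flow IDs reported to the controller when flows are registered in routers.
registration_bytes = flows * flow_id_bytes * routers

# Flow ID plus flow rate for every flow and every bypass, in one direction.
bypass_one_way_bytes = max_bypasses * flows * (flow_id_bytes + flow_rate_bytes)

# Assumption: the same volume is sent back by the controller, and the 1.4
# factor is applied to this bypass-related part only.
bypass_total_bytes = drop_all_factor * 2 * bypass_one_way_bytes

total_bytes = registration_bytes + bypass_total_bytes
worst_case_bytes = 10 * total_bytes  # further tenfold margin used in the text

if __name__ == "__main__":
    print(f"registration:      {registration_bytes / MB:.2f} MB")
    print(f"bypasses, one way: {bypass_one_way_bytes / MB:.2f} MB")
    print(f"total per run:     {total_bytes / MB:.2f} MB")
    print(f"worst case:        {worst_case_bytes / MB:.2f} MB")
```

Under these assumptions the total comes to about 87.02 MB per simulation run; the small gap to the quoted 87.03 MB is consistent with rounding the intermediate values to two decimals. Multiplying by 10 keeps the result below 1 GB, as stated.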


4.5. Interpretation of results

Reliability properties of links are translated to the properties of bypass candidates in order to decrease the negative impact of future network failures in the following way. For all the links traversed by a bypass candidate we considered the worst cases: the highest number of bypasses installed, the amount of traffic of all optical bypasses on the link for which this value is the greatest, and the sum of the IP layer traffic for which the pre-congestion threshold is exceeded. Furthermore, those factors are combined with several restoration mechanisms utilized after failures. Several combinations of mechanisms were investigated, and the results showed that proactive and reactive mechanisms should be considered together. Finally, we proposed novel parameters to assess the methods from the survivability point of view: the number of withdrawn bypasses, the traffic throughput of those bypasses, and the IP layer traffic redirected after a failure.

Advantages are achieved using programmable weights and the possibility to enable or disable the whole set of proactive and reactive mechanisms. Thus, on the one hand, it is always possible to find an efficient combination of values and mechanisms. On the other hand, making an appropriate choice is not an easy task. It may require some additional tools for the network operator to seamlessly find the proper configuration for the production multilayer network. The issue is further amplified by the fact that some of the mechanism combinations may deteriorate network performance or reliability. Thus, we identify this issue as the main aim for future work. One possible solution is to take advantage of the SDN concept in terms of network monitoring, obtain long- and short-term traffic statistics, emulate the production environment in the laboratory infrastructure and, finally, incorporate Survivable Automatic Hidden Bypasses into the production infrastructure.

5. Conclusions

Several methods implementing optical bypasses in multilayer networks have been proposed so far. They assume the use of distributed or centralized algorithms for selecting bypasses, and some of them are resilience-aware. However, to the best of our knowledge, the solution proposed in this paper is the first one that focuses on the bypass establishment process in order to reduce the negative impact of possible future network failures. The main contribution of the paper is related to three factors introduced to the gain formula in order to reflect survivability aspects during the bypass calculation process. The method for automatically setting up survivable hidden bypasses proposed in this paper is a promising solution that can be easily implemented in software-defined multilayer networks. Simulation results confirm that the Survivable Automatic Hidden Bypass mechanism is flexible, adjustable to the network conditions and allows for efficient and resilient transmission in a network. We believe that the proposed mechanisms will be a step forward in solving the problem of automatic and survivable offloading of telecommunication networks, which is especially important in the context of increasing network traffic. The proposed solutions may become a part of the optimization of multilayer software-defined networks, as network operators are searching for scalable and resilient solutions to manage their networks, effectively utilize resources and increase revenues.

Acknowledgements

The research was carried out with the support of the project "High quality, reliable transmission in multilayer optical networks based on the Flow-Aware Networking concept" funded by the Polish National Science Centre under project no. DEC-2011/01/D/ST7/03131.

References

[1] CISCO, Cisco visual networking index: forecast and methodology, 2012–2017, 2013. http://www.davidellis.ca/wp-content/uploads/2012/08/cisco-vni_c11-481360.pdf.
[2] P. Borylo, A. Lason, J. Rzasa, A. Szymanski, A. Jajszczyk, Energy-aware fog and cloud interplay supported by wide area software defined networking, in: Proc. IEEE ICC, Kuala Lumpur, Malaysia, 2016.
[3] P. Borylo, A. Lason, J. Rzasa, A. Szymanski, A. Jajszczyk, Anycast routing for carbon footprint reduction in WDM hybrid power networks with data centers, in: Proc. IEEE ICC, Sydney, Australia, 2014, pp. 3720–3726.
[4] M.D. de Assunção, R. Carpa, L. Lefèvre, O. Glück, On Designing SDN Services for Energy-aware Traffic Engineering, in: Testbeds and Research Infrastructures for the Development of Networks and Communities, Springer International Publishing, 2016, pp. 14–23.
[5] J. Domzal, Z. Dulinski, J. Rzasa, P. Gawlowicz, E. Biernacka, R. Wojcik, Automatic hidden bypasses in software-defined networks, J. Netw. Syst. Manag. (2016), doi:10.1007/s10922-016-9397-5.
[6] M. Caria, M. Chamania, A. Jukan, Trading IP routing stability for energy efficiency: a case for traffic offloading with optical bypass, in: 15th International Conference on Optical Network Design and Modeling (ONDM), 2011, pp. 1–6.
[7] S. Joachim, Efficiency analysis of distributed dynamic optical bypassing heuristics, in: IEEE International Conference on Communications (ICC), 2012, pp. 5951–5956.
[8] M. Ruffini, D. O'Mahony, L. Doyle, Optical IP switching: a Flow-Based approach to distributed cross-Layer provisioning, IEEE/OSA J. Opt. Commun. Netw. 2 (8) (2010) 609–624.
[9] R. Jain, S. Paul, Network virtualization and software defined networking for cloud computing: a survey, IEEE Commun. Mag. 51 (11) (2013) 24–31.
[10] M. Caria, M. Chamania, A. Jukan, A comparative performance study of load adaptive energy saving schemes for IP-Over-WDM networks, IEEE/OSA J. Opt. Commun. Netw. 4 (3) (2012) 152–164.
[11] S. Adibi, S. Erfani, An optimization-heuristic approach to dynamic optical bypassing, in: Photonische Netze (ITG-FB 233), 2012, pp. 41–46.
[12] M. Chamania, M. Caria, A. Jukan, Achieving IP routing stability with optical bypass, Opt. Switch. Netw. 7 (4) (2010) 173–184.
[13] R. Casellas, R. Martinez, R. Munoz, R. Vilalta, L. Liu, T. Tsuritani, I. Morita, Control and management of flexi-grid optical networks with an integrated stateful path computation element and openflow controller [invited], IEEE/OSA J. Opt. Commun. Netw. 5 (10) (2013) A57–A65.
[14] L. Liu, H.Y. Choi, R. Casellas, T. Tsuritani, I. Morita, R. Martinez, R. Munoz, Demonstration of a dynamic transparent optical network employing flexible transmitters/receivers controlled by an openflow-stateless PCE integrated control plane [invited], IEEE/OSA J. Opt. Commun. Netw. 5 (10) (2013) A66–A75.
[15] D. Simeonidou, R. Nejabati, M. Channegowda, Software defined optical networks technology and infrastructure: Enabling software-defined optical network operations, in: Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2013, pp. 1–3.
[16] P. Borylo, A. Lason, J. Rzasa, A. Szymanski, A. Jajszczyk, Fitting green anycast strategies to cloud services in WDM hybrid power networks, in: Proc. IEEE GLOBECOM, Austin, TX, USA, 2014, pp. 2633–2639.
[17] P. Borylo, A. Lason, J. Rzasa, A. Szymanski, A. Jajszczyk, Green cloud provisioning throughout cooperation of a WDM wide area network and a hybrid power IT infrastructure, J. Grid Comput. 14 (1) (2016) 127–151, doi:10.1007/s10723-015-9354-7.
[18] V. Lopez, C. Cardenas, J.A. Hernandez, J. Aracil, M. Gagnaire, Extension of the Flow-Aware Networking (FAN) architecture to the IP over WDM environment, in: 4th International Telecommunication Networking Workshop on QoS in Multiservice IP Networks, 2008, pp. 101–106.
[19] J. Domzal, R. Wojcik, V. Lopez, J. Aracil, A. Jajszczyk, EFMP - a new congestion control mechanism for flow-aware networks, Trans. Emerg. Telecommun. Technol. 25 (11) (2014) 1137–1148.
[20] M. Chamania, A. Jukan, A comparative analysis of the effects of dynamic optical circuit provisioning on IP routing, IEEE/ACM Trans. Netw. 22 (2) (2013) 429–442.
[21] O. Gerstel, C. Filsfils, T. Telkamp, M. Gunkel, M. Horneffer, V. Lopez, A. Mayoral, Multi-layer capacity planning for IP-optical networks, IEEE Commun. Mag. 52 (1) (2014) 44–51.
[22] Q. Zhang, A.L. Chiu, X. Wang, P. Palacharla, M. Sekiya, Restoration design with selective optical bypass in IP-over-optical networks, in: Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC 2012), 2012, pp. 1–3.
[23] A. Autenrieth, M. Neugirg, J.P. Elbers, M. Gunkel, Evaluation of IP-over-DWDM core network architectures with CD-ROADMs using IP protection in combination with optical restoration, in: 14th International Conference on Transparent Optical Networks (ICTON 2012), 2012, pp. 2–5.
[24] J. Enriquez Gabeiras, V. Lopez, J. Aracil, J.P. Fernandez-Palacios, C. Garcia Argos, O. Gonzalez de Dios, F.J. Jimenez Chico, J.A. Hernandez, Is multilayer networking feasible? Opt. Switching Netw. 6 (2) (2009) 129–140.
[25] W. Hou, L. Guo, X. Gong, Survivable power efficiency oriented integrated grooming in green networks, J. Netw. Comput. Appl. 36 (1) (2013) 420–428.
[26] Y.-d. Lin, H.-y. Teng, C.-r. Hsu, C.-c. Liao, Y.-c. Lai, Fast failover and switchover for link failures and congestion in software defined networks, in: IEEE International Conference on Communications (ICC 2016), 2016, pp. 1–6.
[27] H. Yang, L. Cheng, J. Deng, Y. Zhao, J. Zhang, Y. Lee, Optical fiber technology cross-layer restoration with software defined networking based on IP over optical transport networks, Opt. Fiber Technol. 25 (2015) 80–87.
[28] H. Yang, J. Zhang, Y. Zhao, Y. Ji, J. Wu, Y. Xiaosong, J. Han, Y. Lee, Global resources integrated resilience for software defined data center interconnection based on IP over elastic optical network, IEEE Commun. Lett. 18 (10) (2014) 1735–1738.
[29] H. Yang, J. Zhang, Y. Zhao, L. Cheng, J. Wu, Y. Ji, J. Han, Y. Lin, Y. Lee, Service-aware resources integrated resilience for software defined data center networking based on IP over flexi-Grid optical networks, Opt. Fiber Technol. 21 (2015) 93–102.
[30] Open Networking Foundation (ONF), SDN architecture overview, 2013. https://www.opennetworking.org/images/stories/downloads/sdn-resources/technical-reports/SDN-architecture-overview-1.0.pdf.
[31] R. Vilardi, L. Grieco, C. Barakat, G. Boggia, Lightweight enhanced monitoring for high speed networks, Trans. Emerg. Telecommun. Technol. 25 (11) (2014) 1095–1113.
[32] A. Kortebi, S. Oueslati, J.W. Roberts, Cross-protect: implicit service differentiation and admission control, in: Proc. High Performance Switching and Routing, HPSR 2004, Phoenix, AZ, USA, 2004, pp. 56–60.
[33] J. Domzal, R. Wojcik, P. Cholda, R. Stankiewicz, A. Jajszczyk, Efficient congestion control mechanism for flow-aware networks, Int. J. Commun. Syst. (2015). Online first, doi:10.1002/dac.2974.
[34] B.P. Gero, P.L. Palyi, S. Racz, Flow-level performance analysis of a multi-rate system supporting stream and elastic services, Int. J. Commun. Syst. 26 (8) (2013).
[35] Source code of SAHB, 2016. http://www.kt.agh.edu.pl/~jdomzal/bypass/ns-3-survivable-bypass.tar.gz.
[36] S. Orlowski, R. Wessaly, M. Pioro, A. Tomaszewski, SNDlib 1.0—survivable network design library, Net. 55 (3) (2010) 276–286.
[37] J. Rzasa, P. Borylo, A. Lason, A. Szymanski, A. Jajszczyk, Dynamic Power Capping for Multilayer Hybrid Power Networks, in: Proc. IEEE GLOBECOM, Austin, TX, USA, 2014, pp. 2563–2569.
[38] M. Karakus, A. Durresi, A survey: control plane scalability issues and approaches in software-defined networking (SDN), Comput. Netw. 112 (2017) 279–293.


Piotr Boryło is an assistant professor at the Department of Telecommunications, AGH University of Science and Technology. He received his M.S. and Ph.D. degrees in Telecommunications from the same university in 2012 and 2016, respectively. His Ph.D. thesis was titled "Provisioning of Energy-Aware Cloud Services Over Optical Networks". He is interested mainly in traffic engineering and resource provisioning in Software Defined Networks. He has worked on cloud services provisioning over optical networks, energy efficiency in networks (green networking) and data centers, network and data center infrastructure integration, and the fog computing concept. He is involved in several scientific projects and research and development activities.

Jerzy Domżał received the M.S. and Ph.D. degrees in Telecommunications from AGH University of Science and Technology, Krakow, Poland in 2003 and 2009, respectively. He is now an Assistant Professor at the Department of Telecommunications at AGH University of Science and Technology. He is especially interested in optical networks and services for the future Internet. He is an author or co-author of many technical papers, six patent applications and two books. International trainings: Spain, Barcelona, Universitat Politecnica de Catalunya, April 2005; Spain, Madrid, Universidad Autónoma de Madrid, March 2009; Stanford University, USA, May–June 2012.

Robert Wójcik received his M.Sc. and Ph.D. (with honors) degrees in telecommunications from AGH University of Science and Technology, Kraków, Poland in 2006 and 2011, respectively. Currently, he works as an assistant professor at the Department of Telecommunications of AGH. He is the co-author of 10 international journal papers, 3 books, 3 patent applications and a number of conference papers. He has been involved in several international scientific projects, including SmoothIT, (NOE) BONE and Euro-NF. His current research interests focus on multipath routing, Flow-Aware Networking, Quality of Service and Network Neutrality.
