Computer Communications 111 (2017) 142–152
Length Shuffle: Achieving high performance and flexibility for data center networks design

Deshun Li ([email protected]), Yanming Shen (corresponding author, [email protected]), Keqiu Li

School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Dalian 116024, China
Article history: Received 27 December 2016; Revised 23 June 2017; Accepted 1 August 2017; Available online 8 August 2017.
Keywords: Data center network; High performance; Flexibility.
Abstract

Scale-out data centers require the underlying network to provide high performance and design flexibility on commodity switches. Current rigid architectures can provide high performance, but they hardly support flexible network design. Flexible proposals address the elasticity issue of rigid networks; however, the introduction of random connection cannot provide guaranteed network performance. In this paper, we propose Length Shuffle connection to address the performance issue in flexible data center networks. The proposed approach follows a greedy principle that gradually connects the two furthest switches through their available ports. Length Shuffle connection reduces the number of long paths significantly, which results in a small average length of routing path. The proposed connection can work with random connection and regular connection in flexible network construction, and does not interfere with the original routing protocols. Numerical analysis shows the desirable properties of Length Shuffle connection in various dimensions. Extensive experimental results demonstrate the advantages of Length Shuffle connection in data transmission delay and aggregate throughput under different traffic patterns.

© 2017 Elsevier B.V. All rights reserved.
1. Introduction

Data center networks (DCNs) serve as an important infrastructure for public services, such as email, web search, and online retail, as well as for infrastructural services, such as cloud computing and big data processing [7,13,14,33,34,43,46]. Over the past decade, researchers have focused on tackling the challenges of typical tree-based networks, and in particular on the following issues of high performance and design flexibility in data center network design [3,15,18,19,26,36,38,40]:

1. High performance. Many applications running in data centers are data-intensive and time-sensitive. Data-intensive applications, such as MapReduce [14] and Hadoop [7], require high bandwidth to transport large chunks of data among servers. Time-sensitive applications, such as web search, online retail, and advertisement, must deliver data to their destinations within a deadline. A small routing length results in a short transmission delay, which provides more bandwidth for applications and improves network throughput.

2. Design flexibility. Design flexibility has three aspects: arbitrary-sized networks, incremental deployment, and switches with different port counts. Building networks of arbitrary size
and deploying them incrementally is a practical strategy to reduce up-front expenses and respond to growing application demand [2,41]. Due to the asymmetry of server distribution or to switch heterogeneity, an inconsistent number of inter-switch ports should also be taken into consideration in scale-out network connection.

During the past decade, a variety of network constructions have been proposed to meet the above aspects; they fall into rigid and flexible topologies based on their structural characteristics. A rigid network is determined by the number of ports in a switch, which permits network scaling only with a very coarse parameter. For example, the multi-rooted tree based topologies can provide high performance on homogeneous switches [3,15,36], but their structure is completely determined by the number of ports in a switch, and they hardly support incremental deployment. Several flexible solutions have been proposed to support on-demand network scaling and incremental deployment [22,37,38,44,45]. For example, Jellyfish [38] introduces random connection to construct its network; however, the resulting network cannot always provide guaranteed high performance because of the ruleless connection. Notice that the construction of a network is a short-term investment, while the resulting network provides a long-term and regular service. Therefore, it is cost-effective to spend more effort in construction to improve network performance with given equipment.

In this paper, we propose Length Shuffle connection, which follows a greedy principle that connects the two furthest possible
nodes gradually in each interconnection step. Length Shuffle connection can be deployed on commodity top-of-rack (ToR) switches to construct flexible data center networks; it significantly reduces the number of long paths and results in a small average length of routing path. The proposed connection can be deployed together with random and regular connection in flexible network construction, and it does not interfere with the original routing protocols. The incremental deployment of Length Shuffle is also discussed in this paper. Length Shuffle connection demonstrates high adaptability in various aspects, such as switch heterogeneity, hybrid deployment, incremental deployment, and fault tolerance. The numerical experiments show that Length Shuffle connection provides better performance than random connection. For example, the average path length in random connection is 2% ∼ 5% larger than that of Length Shuffle connection on identical equipment. We conduct evaluations on the hybrid network of random and Length Shuffle connection under different traffic patterns. The results show that Length Shuffle connection can significantly reduce data transmission delay and improve throughput under heavy random and incast traffic.

The rest of the paper is organized as follows. Section 2 introduces the related work and Section 3 presents the Length Shuffle connection. Sections 4 and 5 describe the deployment and measurement of the proposed connection. Section 6 gives the evaluation and Section 7 concludes our work.
2. Related work

Scale-out data center networks on commodity switches and servers have been studied for decades, and various novel network architectures have been proposed to tackle the challenges of typical architectures [47]. The proposed designs fall into switch-centric [3,5,15,36,38] and server-centric [9,18,19,26,32] architectures in terms of forwarding intelligence, or into rigid [3,15,29,36] and flexible [4,10,22,37,38,44] architectures in light of the characteristics of their topologies. In this section, we introduce some of the major architectures with respect to rigid and flexible structures.

The multi-rooted tree is widely used in data center network design [3,15,23,36]; it provides large bandwidth and efficient routing protocols on homogeneous switches. Fat-tree [3] provides full bisection bandwidth and a large number of parallel paths between servers. VL2 [15] focuses on the issues of performance isolation, flat addressing, and load balance. Portland [36] studies the migration of virtual machines on multi-rooted tree based networks. Monsoon [16] creates a mesh switching domain by using programmable commodity layer-2 switches, where all servers are connected by a layer-2 network. Recursive mechanisms are used in server-centric network designs, where the scale of the network is determined by the switch port count and the number of levels. Typical architectures include DCell [19], BCube [18], FiConn [26], and HCN [20]. MDCube [42] focuses on the interconnection among modular data centers. CamCube [1,9], DPillar [30], SWCube [27], and FSquare [28] are constructed on 3D Torus, wrapped butterfly, generalized hypercube, and compound Clos structures, respectively. The above architectures focus on high performance on homogeneous switches, and pay less attention to the design flexibility of networks. They support network scaling only with a coarse parameter, which is completely determined either by the number of ports in a switch, or by both the number of levels and the switch port count.

Several flexible solutions have been proposed to support design flexibility in network construction. Scafida [22] and FScafida [21] provide small increments through random interconnection on asymmetric data center networks. LEGUP [12] preserves available ports for future expansion on Clos networks. REWIRE [11] focuses on constructing a network that maximizes performance within a given budget. Small-world data centers [37] are constructed by a hybrid method of regular and random connection, which supports greedy routing protocols. SecondNet [17] is a virtualization architecture that focuses on protocol scalability and resource utilization of the infrastructure network rather than on the structure itself. Jellyfish [38] deploys random connection among switches to support arbitrary-size networks and incremental growth. Jellyfish can support switch heterogeneity, but the random connection cannot optimize network performance with identical equipment. PAST [39] provides a multi-path solution, at the cost of degrading throughput in Jellyfish. S2 deploys multiple rings and random connection to construct data center networks, where the ring connection is coordinates-based. S2 can provide rich bandwidth in the topology and scalability in the protocols. Most of the current flexible networks introduce random connection to improve the flexibility of the architecture, as in Scafida [22], SWDC [37], Jellyfish [38], and S2 [44,45]. The random connection cannot always provide high network performance, and may result in a large network diameter or critical links.

Our work differs from the current flexible proposals in three key aspects. First, the proposed algorithm provides higher performance on identical equipment. Second, the proposed algorithm supports incremental deployment, as well as asymmetrical inter-switch port counts. Third, the proposed Length Shuffle connection can be applied in other flexible architectures to reduce the average length of routing path.
3. Length Shuffle connection

In this section, we introduce Length Shuffle connection, which creates a graph on an arbitrary number of nodes under a degree constraint on each node. Given such nodes, Length Shuffle connection targets a flexible graph with a small number of long paths among the nodes.

3.1. Algorithm of Length Shuffle

Let N denote the number of nodes, and let each node be assigned a unique ID vi taking a value from [1, N]. Node vi has a degree of di, i.e., it can connect to at most di (di ≥ 2) other nodes. Length Shuffle connects all these nodes into a graph under their degree constraints, following a greedy principle that gradually connects the two furthest possible nodes. The distance between any two nodes is defined as the length of the shortest path between them on the current graph of all nodes. In the initialization stage, there is no connection between any two nodes, and the distance between any two nodes is infinite. Let Sv denote the set of nodes which have at least one available degree left at the current stage. Let Dis(va, vb) denote the distance between va and vb, where va ∈ Sv and vb ∈ Sv. Let MaxDis denote the maximum distance between any two nodes in Sv on the current graph. Let Sc denote the set of node pairs, where each element (vi, vj) contains two nodes vi ∈ Sv and vj ∈ Sv with Dis(vi, vj) = MaxDis. The procedure of Length Shuffle connection is shown in Algorithm 1.

Algorithm 1 Length Shuffle connection.
1: get set Sv;
2: update Dis(va, vb) where va ∈ Sv and vb ∈ Sv;
3: obtain set Sc where Dis(vi, vj) = MaxDis;
4: randomly select (vi, vj) in Sc;
5: connect vi and vj;
6: update the degree state of vi and vj;
7: repeat steps 1–6 until |Sv| = 0 or |Sv| = 1;

The procedure first gets the set Sv, in which each node
has at least one available degree for interconnection. Then, the algorithm updates the distances between any two nodes in Sv and obtains the set Sc, in which each node pair (vi, vj) has the largest distance Dis(vi, vj) = MaxDis. It then randomly selects a candidate node pair (vi, vj) in Sc and connects the two nodes vi and vj with an edge. After this, the procedure updates the number of available degrees of the two selected nodes vi and vj. The procedure repeats this process until no available node pair is left.
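To make the procedure concrete, the following Python sketch implements Algorithm 1 under our own naming (length_shuffle, bfs_distances, and the split into two helpers are hypothetical choices, not part of the paper): hop distances are computed by breadth-first search on the current partial graph, and ties among the furthest pairs are broken at random, as in step 4.

from collections import deque
import random

INF = float("inf")

def bfs_distances(adj, src):
    # Hop distance from src to every node of the current (possibly partial) graph.
    dist = {v: INF for v in adj}
    dist[src] = 0
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == INF:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def complete_with_length_shuffle(adj, free):
    # Greedy phase of Algorithm 1: repeatedly connect a furthest pair of nodes
    # that still have free ports, until no such pair is left.
    while True:
        S_v = [v for v in free if free[v] > 0]
        if len(S_v) <= 1:
            break
        candidates, max_dis = [], -1            # the set Sc and MaxDis
        for a in S_v:
            dist = bfs_distances(adj, a)
            for b in S_v:
                if b <= a or b in adj[a]:       # skip self, duplicates, existing links
                    continue
                if dist[b] > max_dis:
                    candidates, max_dis = [(a, b)], dist[b]
                elif dist[b] == max_dis:
                    candidates.append((a, b))
        if not candidates:
            break
        a, b = random.choice(candidates)        # random pick inside Sc
        adj[a].add(b); adj[b].add(a)
        free[a] -= 1; free[b] -= 1
    return adj

def length_shuffle(degrees):
    # degrees: node id (an integer) -> number of available inter-switch ports (>= 2).
    return complete_with_length_shuffle({v: set() for v in degrees}, dict(degrees))

For instance, length_shuffle({v: 10 for v in range(400)}) builds a fabric of 400 nodes with 10 inter-switch ports each, corresponding to the N = 400 setting used later in Section 5.2.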
3.2. Analysis of the Length Shuffle algorithm

Given an arbitrary number of nodes with arbitrary degrees, constructing a graph with the absolute minimum number of long paths among nodes without traversing all possible graphs is a difficult problem; it falls into the same category as the Degree-Diameter problem in graph theory [24]. The Length Shuffle algorithm provides a principle to greedily reduce the number of long paths in the graph, which results in a smaller average path length than random connection.

The process of the Length Shuffle algorithm is convergent and finishes after a finite number of interconnection steps. In the initialization stage, there is no connection between va and vb, and Dis(va, vb) = ∞. From this stage, the operator can connect any two isolated nodes or sub-graphs until all nodes are connected. Once the graph is connected, Dis(va, vb) is finite and does not increase with edge additions. As Length Shuffle connection proceeds, MaxDis between any two possible nodes decreases monotonically. The algorithm finishes when there is no available node pair left in the network.

In Length Shuffle connection, Dis(va, vb) may change with each link interconnection. In the worst case, the distances between all possible node pairs have to be recalculated after every step. This yields the maximum computational work, because the number of links in the resulting graph is T = (∑ di)/2, summing over i = 1, . . ., N. A naive approach is to run a shortest-path computation in each step, such as the Dijkstra algorithm for all pairs with a polynomial complexity of O(N³). The complexity of the proposed algorithm is then the number of links times the complexity of the Dijkstra algorithm, i.e., O(N⁴) for the network connection.

In the implementation of Length Shuffle connection, much of this computation is unnecessary. In the preliminary stage of the unconnected graph, the distance between any two isolated nodes or sub-graphs is infinite. In this stage, the operator can connect any two isolated nodes or sub-graphs until all the nodes are connected, which takes N − 1 links. These links do not require running the Dijkstra algorithm to obtain the maximum distance, which reduces the computational work. Moreover, the maximum distance MaxDis can be reused repeatedly before the next update, because all the possible candidates are contained in the set Sc. After the addition of a link (vi, vj), the operator can update the degree state of vi and vj in the set Sv, and then check the distances of the remaining pairs (vi, vj) ∈ Sc. If MaxDis = Dis(vi, vj) still holds for a pair, we connect the two nodes vi and vj by a link. If Dis(vi, vj) < MaxDis for all remaining pairs (vi, vj) ∈ Sc, we recompute the distances Dis(va, vb) for va ∈ Sv and vb ∈ Sv. The procedure is repeated until there is no available node pair left.
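The first of these savings can be sketched as follows, reusing complete_with_length_shuffle from the sketch above (the naming is again ours): the initial N − 1 links are placed without any distance computation, here simply by chaining the nodes into a path, which is one admissible way of joining isolated nodes and sub-graphs while all inter-component distances are infinite.

import random

def length_shuffle_fast_start(degrees):
    # Place the first N - 1 links without shortest-path computations: while the
    # graph is disconnected, all inter-component distances are infinite, so any
    # way of joining the pieces is consistent with the greedy rule.  Here the
    # nodes are chained into a path (feasible because every di >= 2); the greedy
    # furthest-pair phase then finishes on the connected graph.
    order = list(degrees)
    random.shuffle(order)
    adj = {v: set() for v in degrees}
    free = dict(degrees)
    for u, v in zip(order, order[1:]):
        adj[u].add(v); adj[v].add(u)
        free[u] -= 1; free[v] -= 1
    return complete_with_length_shuffle(adj, free)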
3.3. Properties of Length Shuffle algorithm

Let G(V, E) denote the resulting graph of Length Shuffle connection; we describe some properties of G(V, E) here.

Theorem 1. With available degree di ≥ 2 for i ∈ [1, N], the resulting graph G(V, E) of Length Shuffle connection is connected.
Proof. Assume that the resulting graph G(V, E) is unconnected. Under this assumption, G has at least two connected sub-graphs, denoted by G1(V1, E1) and G2(V2, E2) with V1 ∩ V2 = ∅, and there is no available degree left in either G1 or G2; otherwise there would be a Length Shuffle connection between G1 and G2. The distance between any two nodes of G1 and G2 is infinite. Suppose each node in G1 has no available degree. Since each node has a degree di ≥ 2, there is at least one cycle in G1. Let (vi, vj) denote the last added edge of that cycle. In G1, the distance between vi and vj was finite before the addition of edge (vi, vj). If G2 has an available degree at some node, the edge (vi, vj) in G1 should not have been added by Length Shuffle connection, which contradicts the connection principle. If G2 has no available degree at any node, the situation in G2 is the same as in G1, and there is a cycle in G2 whose last added edge is (vi′, vj′). In this case, the addition of (vi, vj) in G1 and of (vi′, vj′) in G2 contradicts the Length Shuffle connection principle, because the distance between nodes of G1 and G2 is infinite under the assumption. Therefore, with degree di ≥ 2 for each node, the resulting graph G(V, E) of Length Shuffle connection is connected. □

The resulting graph has some other properties. For example, with degree di = 2 for i ∈ [1, N], the resulting graph G(V, E) of Length Shuffle connection is a cycle on all nodes. We omit the discussion of these properties and focus on the connection of data center networks. Intuitively, Length Shuffle connection greedily reduces the number of furthest paths during construction, thereby reducing the average path length among nodes in the resulting graph. Leaving aside the possibility of an unconnected graph under random connection, the resulting topologies of random connection and Length Shuffle connection are hard to tell apart by visual inspection.
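These properties are easy to check empirically with the Algorithm 1 sketch given in Section 3.1 (the helper names are ours):

def is_connected(adj):
    # Reachability check from an arbitrary start node.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

# Theorem 1: with every di >= 2 the resulting graph is connected.
g = length_shuffle({v: 3 for v in range(30)})
assert is_connected(g)

# With di = 2 for every node the result is a single cycle on all nodes.
c = length_shuffle({v: 2 for v in range(12)})
assert is_connected(c) and all(len(neigh) == 2 for neigh in c.values())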
4. Deploying Length Shuffle connection in networks

The current flexible data center architectures are built by random or hybrid connection on commodity top-of-rack switches. Let N denote the number of ToR switches, each of which is assigned a unique ID vi from [1, N]. Let ni denote the number of ports in switch vi, where si ports connect to servers and di ports connect to other switches. In network construction, servers are single-homed, i.e., a server connects to a single switch. We assume that servers have been assigned and connected to their home switches reasonably. In this section, we present the deployment of Length Shuffle connection in data center networks.

4.1. Random and Length Shuffle connection

We first present the difference between random connection and Length Shuffle connection. The architecture resulting from random connection on top-of-rack switches is Jellyfish, which supports arbitrary-size networks and incremental growth. The remarkable advantage of Jellyfish is a shorter routing path length than rigid architectures, which results in a small routing delay and high network throughput. Let LS denote the architecture built by Length Shuffle connection on ToR switches. Fig. 1(a) and (b) visualize the topologies of Jellyfish and LS, respectively. Both topologies consist of 16 servers and 20 4-port switches, the same equipment as a Fat-tree topology on 4-port switches. The server distribution in this example is the same as in Fig. 1(b) of [38], where each server connects to a switch and 4 switches have no servers. Fig. 1(c) shows the path count between servers in Fat-tree, Jellyfish, and LS with identical equipment. As we can see, each of them has a network diameter of 6. However, the number of 6-hop paths is 96 in Fat-tree and 21 in Jellyfish, while it is only 6 in LS.
(a) Jellyfish. (b) LS. (c) Path count:

hops        2    3    4    5    6
Fat-tree    8    0   16    0   96
Jellyfish   0   17   35   47   21
LS          0   21   39   54    6
Fig. 1. Topologies with 16 servers and 20 4-port switches.
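The path counts in Fig. 1(c) are easy to reproduce: for every pair of servers, the path length is the switch-level hop distance between their home switches plus the two server links. A minimal sketch (our own naming, reusing bfs_distances from the Section 3.1 sketch) is:

from collections import Counter

def server_path_histogram(adj, host_switch):
    # adj: switch-level topology; host_switch: maps each server to its ToR switch.
    # Returns a map from path length to the number of unordered server pairs.
    servers = list(host_switch)
    hist = Counter()
    for i, s in enumerate(servers):
        dist = bfs_distances(adj, host_switch[s])
        for t in servers[i + 1:]:
            hist[dist[host_switch[t]] + 2] += 1   # +2 for the two server links
    return hist

def average_path_length(hist):
    # Weighted mean of the histogram, e.g. 4.375 for the LS topology of Fig. 1.
    return sum(l * c for l, c in hist.items()) / sum(hist.values())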
In contrast to Fat-tree and Jellyfish, LS reduces the number of longest paths significantly, which leads to a shorter average length of routing path among servers. In Fig. 1, the average length of routing path is 4.375 in LS; it is 4.6 in Jellyfish, which is 5% larger than that of LS, and 5.467 in Fat-tree, which is 25% larger than that of LS. Compared to random connection and rigid connection, Length Shuffle connection shows an advantage in both the average length and the number of longest routing paths.

As we can see from Length Shuffle connection, the distance between any two possible switches may change with each link addition. This results in heavy computational work when constructing large-scale networks. We therefore introduce a certain ratio of random connection to reduce the computational work in network construction. Notice that the total number of interconnection links between top-of-rack switches is T = (∑ di)/2, summing over i = 1, . . ., N. Let R denote the ratio of links which are randomly connected. In switch interconnection, we first randomly connect R · T links, and then execute Length Shuffle connection on the current network. We denote the resulting network of this hybrid connection by RLS. With R = 1, RLS is completely randomly connected and the resulting network is Jellyfish; with R = 0, the resulting network is LS, connected completely by Length Shuffle. As R increases, the computational load of RLS construction decreases linearly, since R · T links are connected randomly. The cost of computation in RLS is about 1 − R of that in LS.
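A sketch of this hybrid construction, under the same assumptions as the earlier Algorithm 1 sketch (build_rls is our own, hypothetical name), is:

import random

def build_rls(degrees, R):
    # Hybrid construction of RLS: place roughly R*T of the T inter-switch links at
    # random, then let the greedy furthest-pair phase (Algorithm 1) place the rest.
    # R = 1 corresponds to Jellyfish-style random wiring, R = 0 to pure LS.
    adj = {v: set() for v in degrees}
    free = dict(degrees)
    T = sum(degrees.values()) // 2
    for _ in range(int(R * T)):
        S_v = [v for v in free if free[v] > 0]
        pairs = [(a, b) for i, a in enumerate(S_v)
                 for b in S_v[i + 1:] if b not in adj[a]]
        if not pairs:
            break
        a, b = random.choice(pairs)
        adj[a].add(b); adj[b].add(a)
        free[a] -= 1; free[b] -= 1
    return complete_with_length_shuffle(adj, free)

With R = 0.9, for example, only one tenth of the links go through the expensive furthest-pair search, which matches the cost argument above.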
4.2. Regular and Length Shuffle connection

Some flexible architectures deploy a hybrid mechanism of regular connection and random connection in network construction. The regular connection benefits the resulting data center network with key-based or greedy routing, which improves the scalability of routing protocols. As mentioned above, random connection cannot optimize network performance with the given available ports. We therefore introduce Length Shuffle connection to replace random connection in the hybrid construction of flexible data center architectures.

Small-world data centers deploy regular connection and random connection to construct the network. The original small-world data centers are constructed directly on multi-port servers. We go further and apply the small-world connection on ToR switches, naming the resulting networks following the convention in [37]. In a small-world ring, all switches are connected into a ring using 2 ports, and the remaining available ports are randomly connected. In a small-world 2-D Torus, all switches are connected into a 2-D Torus structure using 4 ports, and the remaining available ports of each switch are connected randomly. Other small-world data centers are constructed in a similar way. The common principle of small-world data centers is the deployment of random connection on the available switch ports after the regular connection. We follow this regular connection principle and deploy Length Shuffle connection instead of random connection on the available ports. Following the small-world naming, the resulting hybrid networks of Length Shuffle connection are denoted as LS Ring, LS 2-D Torus, and so on.

Length Shuffle connection can be deployed on other flexible architectures following the above hybrid principle to replace random connection, such as S2 and Scafida. The replacement of random connection by Length Shuffle connection does not weaken the advantage of the original regular network, since it only concerns the inter-switch ports that remain available after the regular connection.

4.3. Routing protocols

Length Shuffle connection greedily reduces the number of longest routing paths during network construction and does not interfere with the routing protocols of the underlying networks. Since the resulting network is switch-centric, a Length Shuffle based architecture is compatible with current routing protocols, such as OSPF [35], ECMP [8,25], k-shortest path routing, and so on. For single-path routing, a possible choice is Open Shortest Path First (OSPF), a well-established protocol that computes the shortest path to all destinations. OSPF can meet the routing requirement of Length Shuffle based networks. A shortcoming of OSPF is that the size of the routing table in each switch grows linearly with the scale of the data center network. The regular connection benefits the resulting network with greedy routing, which improves the scalability of routing protocols and reduces the size of the forwarding table. In greedy routing, the current switch sends a packet to the neighbour that has the smallest distance to the destination. The regular connection enables the greedy routing algorithm to find the next-hop switch, and the Length Shuffle connection helps to reduce the distance covered in a single step. The replacement of random connection by Length Shuffle connection does not impact multi-path routing either, such as the k-shortest multi-path routing in Jellyfish and the coordinates-based multi-path routing in S2. In other words, replacing random connection by Length Shuffle connection in a flexible network architecture does not interfere with the routing protocols of the original network.
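As a concrete illustration of the greedy forwarding rule (not the paper's implementation), a switch could choose its next hop as follows, given any per-destination distance estimate such as ring or Torus coordinates:

def greedy_next_hop(adj, current, destination, estimate):
    # Forward to the neighbour with the smallest estimated distance to the
    # destination switch; estimate(v, destination) may come from coordinates
    # or from a 1-hop / 2-hop neighbourhood table.
    return min(adj[current], key=lambda nbr: estimate(nbr, destination))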
4.4. Incremental deployment

A remarkable feature of flexible architectures is incremental deployment, which allows the operator to expand the network incrementally according to growing application demand. Since a switch-centric architecture is compatible with the existing protocols, we focus on the physical connection of the incremental deployment.

In a topology of random and Length Shuffle connection, there is no regular connection. Notice that removing a link between switches releases two ports, which can be used to connect two servers. To add a single server, if there is a free port on some switch, the server connects to that port. If there is no free port in the existing network, the operator randomly removes a single link between two switches; this releases two available ports and the operator connects the server to one of them. When more than one server is to be added, the operator repeats this process until all servers are connected. In the hybrid architecture of regular and Length Shuffle connection, the operator can remove a Length Shuffle link to add servers, while regular links should not be removed.

We now focus on the addition of switches together with their home servers in RLS topologies. Assuming each new server has already been connected to its home switch, we follow the same principle of Length Shuffle connection to add those switches. Let W = {va1, . . ., va(k−1), vak} denote the set of switches to be added, where k denotes the number of incremental switches. Let PM denote the set of longest paths between the original switches in the current network, where each element Pi = {vb0, vb1, . . ., vb(d−1), vbd} denotes a path of the longest distance. A switch can be incorporated by connecting it to the end switches of a furthest path Pi. Specifically, taking va1 as an example, it can be incorporated by removing the links {vb0, vb1} and {vb(d−1), vbd} in path Pi, and connecting the links {vb0, va1} and {va1, vbd}. This process is repeated until there is no available port pair on the added switch va1. The remaining switches are inserted in the same way until all incremental switches are connected. Finally, Length Shuffle connection is run on the resulting network. The procedure of incremental deployment is shown in Algorithm 2.

Algorithm 2 Incremental deployment.
1: compute the distances among the original switches;
2: obtain the set PM of furthest paths between switches;
3: get a path Pi = {vb0, vb1, . . ., vb(d−1), vbd} from PM;
4: remove the links {vb0, vb1} and {vb(d−1), vbd};
5: connect vb0 to vai; connect vai to vbd;
6: repeat steps 1–5 until there is no available port pair on vai;
7: repeat steps 1–6 until all switches in W are connected;
8: do Length Shuffle connection on the resulting network;

Fig. 2 illustrates the addition process of switch va1 with 5 inter-switch ports. Let P1 and P2 denote two longest paths between the original switches. The operator first connects switch va1 to the end switches vb0 and vbd of P1 and of P2. Then, Length Shuffle connection is run on the switches with available ports, i.e., switch va1 and the switches vb1 and vb(d−1) of P1 and P2. In Fig. 2, the dotted lines represent the Length Shuffle links. In hybrid topologies with regular connection, the operator should first deploy the incremental switches following the regular connection, and then run Algorithm 2 on the irregular part.

Fig. 2. The addition of a switch.
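A Python sketch of Algorithm 2 for a single added switch, reusing bfs_distances, INF, and complete_with_length_shuffle from the Section 3.1 sketch, is given below; the names and simplifications are ours (in particular, only one furthest pair is tracked instead of the full set PM).

from collections import deque

def shortest_path(adj, src, dst):
    # One shortest path from src to dst, recovered from BFS parents; [] if unreachable.
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    if dst not in parent:
        return []
    path, v = [], dst
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]

def add_switch(adj, free, new_switch, ports):
    # Splice the new switch into the ends of a current longest shortest path,
    # repeating until its inter-switch ports are exhausted (steps 1-6 of Algorithm 2).
    adj[new_switch] = set()
    free[new_switch] = ports
    while free[new_switch] >= 2:
        best = None                                   # a furthest pair among old switches
        for a in adj:
            if a == new_switch:
                continue
            dist = bfs_distances(adj, a)
            for b, d in dist.items():
                if b in (a, new_switch) or d == INF:
                    continue
                if best is None or d > best[0]:
                    best = (d, a, b)
        if best is None or best[0] < 2:               # no path long enough to splice into
            break
        _, a, b = best
        p = shortest_path(adj, a, b)
        # remove the first and last link of the path ...
        adj[p[0]].discard(p[1]); adj[p[1]].discard(p[0])
        adj[p[-1]].discard(p[-2]); adj[p[-2]].discard(p[-1])
        free[p[1]] += 1; free[p[-2]] += 1             # interior neighbours regain a port
        # ... and attach the new switch to the two path ends
        adj[p[0]].add(new_switch); adj[new_switch].add(p[0])
        adj[p[-1]].add(new_switch); adj[new_switch].add(p[-1])
        free[new_switch] -= 2

# After splicing in every switch of W, step 8 runs the greedy phase once:
#     for w, p in new_switches.items():
#         add_switch(adj, free, w, p)
#     complete_with_length_shuffle(adj, free)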
5. Analysis of Length Shuffle connection

In this section, we present an analysis of Length Shuffle connection by numerical measurement, which shows the advantages and adaptability of the proposed connection in various dimensions.

5.1. Design flexibility

As we can see from the Length Shuffle algorithm, the proposed connection can connect an arbitrary number of nodes with arbitrary degrees into a graph, so LS naturally supports arbitrary-size networks and heterogeneity of inter-switch port counts. LS supports the minimum size of incremental deployment, even a single server or switch. Therefore, LS can be expanded from an existing network, allowing the operator to add a variable number of switches and servers.

The replacement of random connection by Length Shuffle connection does not change the design flexibility of networks. In the hybrid topology of random and Length Shuffle connection, the design flexibility is unconstrained: it supports arbitrary-sized networks, arbitrary-sized increments, and switch heterogeneity. RLS has the same design flexibility as Jellyfish, which is randomly connected. In contrast, rigid architectures construct networks with a very coarse parameter, where the size of the network is completely determined either by the number of ports in a switch [3], or by both the number of levels and the switch port count in server-centric architectures [19,26].

In the hybrid topology of regular and Length Shuffle connection, the design flexibility of the network is determined by the regular connection. The network size, incremental scale, and the number of ports in a switch should satisfy the requirements of the regular connection. For example, the incremental unit in LS 2-D Torus is the number of ToR switches in one dimension of the 2-D Torus, and each switch needs at least 4 inter-switch ports, the same as in SW 2-D Torus.

5.2. Impact of random connection ratio

The proposed Length Shuffle connection can be deployed directly in network construction, or deployed in a hybrid manner with random connection as in RLS. Let R denote the ratio of links randomly connected in RLS. We study the impact of different ratios R on the average path length and bisection width in the fabric of switches. In this measurement, each switch uses 10 ports to connect to other switches. The ratio R varies from 0 to 1 with a step of 0.1, and each result is an average over 50 different topologies.

Fig. 3 shows the average path length and diameter versus the ratio of random connection R. As we can see from Fig. 3(a), the average path length increases as R grows from 0 to 1. Taking N = 400 as an example, the average length increases from 2.7698 to 2.8776 when R increases from 0 to 1. We observe that the largest growth of path length occurs when the ratio R increases from 0.9 to 1. With N = 400, the growth rate of path length is about 3.88% when R grows from 0 to 1, and about 1.98% when R increases from 0.9 to 1. From Fig. 3(b), we can see that the network diameter remains constant when the ratio R increases from 0 to 0.9 for each observation. The possible growth of network diameter occurs only at R = 1, which is completely randomly connected. In summary, a ratio of only 1/10 Length Shuffle connection can effectively reduce the average path length and network diameter.

Table 1 shows the distribution of path length for different R with N = 600. As we can see, the ratio of long paths (i.e., length 4) increases as the ratio of random connection R increases from 0 to 1. For example, the ratio of 4-hop paths is 15.06% for R = 0, 15.76% for R = 0.5, 19.48% for R = 0.9, and 24.31% for R = 1, respectively. The ratio of 4-hop paths is 24.23% in S2, which is similar to that in Jellyfish.
Fig. 3. Path length versus the ratio of random connection: (a) path length; (b) diameter.

Table 1. Path distribution and average length with N = 600 (columns 1–5 give the fraction of pairs at each path length).

R              1        2        3        4        5        avg
0 (LS)         0.0167   0.1502   0.6825   0.1506   0        2.967
0.1 (RLS)      0.0167   0.1502   0.6815   0.1516   0        2.968
0.2 (RLS)      0.0167   0.1501   0.6804   0.1528   0        2.9693
0.5 (RLS)      0.0167   0.1490   0.6767   0.1576   0        2.9752
0.8 (RLS)      0.0167   0.1447   0.6571   0.1815   0        3.0034
0.9 (RLS)      0.0167   0.1424   0.6461   0.1948   0        3.019
1 (Jellyfish)  0.0167   0.1328   0.6073   0.2431   0.0001   3.0771
S2             0.0167   0.1343   0.6067   0.2423   0        3.0746
Fig. 4. Normalized bisection width: (a) with N and R; (b) with different ports count.
With N = 600 and 10 inter-switch ports, the average path length is 2.967, 3.0771, and 3.0746 in LS, Jellyfish, and S2, respectively; the length in RLS lies between those of LS and Jellyfish. As we can see for LS, RLS, Jellyfish, and S2, a small proportion of long paths results in a short average path length, which in turn yields a small transmission delay.

Fig. 4 shows the normalized bisection width. The minimum bisection value is an average over 100 bisections of a graph, and each result is an average over 50 different graphs. For the lower boundary of an r-regular graph, we take r/2 − √(r ln 2) [6,38] as the theoretical value. As we can see from Fig. 4(a), the difference in bisection width among the various ratios of random connection R is small. The bisection width in RLS is about 2 times the lower boundary of an r-regular graph. Fig. 4(b) shows the bisection width for different inter-switch port counts with N = 400. As we can see, the measured bisection width increases with the inter-switch port count in each topology, and is larger than the theoretical lower boundary. The difference in bisection width among RLS with R = 0.9, Jellyfish, and S2 is small, and it is slightly less than that of LS.
5.3. Hybrid of regular and Length Shuffle connection

As shown above, Length Shuffle connection effectively reduces the average path length of the resulting network. Here, we investigate the impact of the proposed connection in a hybrid architecture of regular and Length Shuffle connection. For the flexible architecture, SW Ring and SW 2-D Torus are employed, and we deploy Length Shuffle connection instead of random connection. We measure the average shortest routing length and the greedy routing length in these networks. Each result is an average over 100 different topologies.

In SW Ring and LS Ring, each switch uses at least 2 ports to connect the switches into a ring. We study port counts from 4 to 10 with the number of switches N = 200. Fig. 5(a) shows the result of the ring-based connection, where RS, RG1, and RG2 denote the shortest routing path and the 1-hop and 2-hop neighborhood based greedy routing paths. As we can see, the 1-hop neighborhood based greedy routing path in LS Ring is longer than that of SW Ring. However, the shortest routing path and the 2-hop neighborhood based greedy routing path in SW Ring are longer than those of LS Ring.
Fig. 5. Routing path in hybrid connection: (a) ring-based; (b) 2-D Torus-based.

Fig. 6. Path length and bisection width with inconsistent ports count: (a) path length; (b) bisection width.
For example, with 10 inter-switch ports in each switch, RS and RG2 in SW Ring are larger by 3.75% and 7.31% than those of LS Ring, respectively.

In SW 2-D Torus and LS 2-D Torus, each switch uses at least 4 ports to connect the switches into a 2-D Torus. We study port counts from 5 to 10 on a 16 × 16 2-D Torus, where the number of switches is N = 256. Fig. 5(b) shows the result of the 2-D Torus based connection, where TS, TG1, and TG2 denote the shortest routing path and the 1-hop and 2-hop neighborhood based greedy routing paths. As we can see, each routing path in LS 2-D Torus is shorter than that of SW 2-D Torus. For example, with 10 inter-switch ports in each switch, TS, TG1, and TG2 in SW 2-D Torus are larger by 3.82%, 1.36%, and 6.2% than those of LS 2-D Torus, respectively.

5.4. Inconsistent inter-switch ports count

In flexible architectures, the operator should take inconsistency of the inter-switch port count into consideration, which derives from the asymmetry of server distribution or from switch heterogeneity. To measure network performance on switches with an inconsistent number of inter-switch ports, we investigate the average path length and bisection width in the fabric of switches. We assume the number of inter-switch ports of a switch is di ∈ [5, 15] (i ∈ [1, N]), and study the measurement with different ratios of random connection R. Each result is an average over 100 different topologies.

Fig. 6(a) shows the average path length versus the ratio of random connection R with various numbers of switches. As we can see, the growth of the path length follows a similar pattern to Fig. 3(a), i.e., the path length increases with R from 0 to 1. For example, as R increases from 0 to 1, the average path length increases by about 3.18% for N = 100 and by about 3.85% for N = 600. For the same step length of R, the largest growth of path length occurs from 0.9 to 1.
Fig. 6(b) shows the normalized bisection width versus the ratio of random connection R. The bisection width is roughly equal to the values in Fig. 4(a), which uses a consistent number of 10 inter-switch ports. The result implies that the ratio of random connection R has only a slight impact on the bisection width of switches with an inconsistent number of ports.
5.5. k-Shortest multiple paths

We measure multiple paths in the fabric of switches under different ratios of random connection. The target network consists of N = 400 switches, each of which connects to 10 other switches. The multiple-path value is computed from 50 random source-destination pairs over 10 runs, and each result is an average over 50 different topologies. The multiple paths observed between each source-destination pair are disjoint in this measurement.

Fig. 7 shows the average length of the k-shortest multiple paths versus different ratios of random connection R. As we can see, the length of path i increases with i. With R = 0, the length of path 1 is 2.97 and that of path 8 is 4.65. With R = 1, the lengths of path 1 and path 8 are 3.08 and 4.88, respectively. As R increases from 0 to 1, the length of path 1 increases from 2.97 to 3.08, about 3.70% larger than that at R = 0; the length of path 8 increases from 4.65 at R = 0 to 4.88 at R = 1, a growth of 4.95%. The results show that Length Shuffle connection yields shorter k-shortest multiple paths between source and destination than random connection. Therefore, we can conclude that k-shortest paths with MPTCP in LS and RLS can yield a higher performance than in Jellyfish, which is randomly connected.
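The disjoint multiple paths used in this measurement can be obtained, for example, by repeatedly removing the links of each shortest path found. A sketch with our own naming, reusing the shortest_path helper from the sketch in Section 4.4, is:

def k_disjoint_shortest_paths(adj, src, dst, k):
    # Up to k edge-disjoint paths, each a shortest path in what remains of the graph.
    remaining = {v: set(neigh) for v, neigh in adj.items()}
    paths = []
    for _ in range(k):
        path = shortest_path(remaining, src, dst)
        if not path:
            break                                  # no further disjoint path exists
        paths.append(path)
        for u, v in zip(path, path[1:]):
            remaining[u].discard(v)
            remaining[v].discard(u)
    return paths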
Fig. 7. k-shortest path versus the ratio of random connection.
Fig. 8. Path length in incremental deployment and link failures.
5.6. With incremental deployment

We investigate the average length of routing path under incremental deployment with an original network of 400 16-port switches, where 10 ports connect to other switches and 6 ports connect to servers. We increase the number of incremental switches from 4 to 40, where each switch has the same port assignment as in the original network. We investigate networks with R = 0, 0.5, 0.9, and 1, and each result is an average over 10 runs on 50 different topologies. The same incremental algorithm is also run for R = 1, where the topology is randomly connected.

Fig. 8(a) shows the average length of routing path versus the number of incremental switches for each ratio R. As we can see, the growth is almost equal in all observations: about 0.04 in each network when the number of added switches increases from 0 to 40. This suggests that each R can adapt to incremental network deployment, and that the proposed incremental deployment algorithm can efficiently merge the added switches into the original networks.
5.7. Fault tolerance

Due to commodity switches and servers, failures are normal and unpredictable in scale-out data center networks. In this experiment, we study the performance of Length Shuffle connection under random link failures. The network consists of 400 16-port switches, where each switch has the same port assignment as above. We investigate the average length of routing path with different ratios of random connection, and each result is an average over 10 runs on 50 different topologies.
Fig. 8(b) shows the average length of routing path between servers versus the fraction of random link failures. As we can see, the average length grows as link failures increase for each R. When the fraction of link failures increases from 0 to 25%, the growth is about 0.4 for R = 0, 0.1, 0.5, and 0.9, and about 0.5 for R = 1. The path length increases slowly with the growth of link failures, which suggests that Length Shuffle connection is reliable under link failures.
5.8. Cost discussion

The cost of the proposed Length Shuffle connection lies in two aspects: length calculation and the deployment mechanism. As we have seen, the complexity of Length Shuffle is O(N⁴) in LS. However, a certain fraction of this computation is unnecessary in LS and RLS. In LS, length calculation is not needed before the network is connected, since the distance between any two isolated parts is infinite. In RLS, the random links do not require length calculation, and the cost is about 1 − R (R ∈ [0, 1]) times that of the LS construction. In the implementation of Length Shuffle connection, the maximum distance MaxDis can be reused repeatedly before its update, because all the possible candidates have already been selected into the set Sc.

In incremental deployment, the number of links removed by the proposed algorithm is about twice that of Jellyfish and S2. We argue that there is almost no difference in the impact on the network between the different incremental deployments. Both incremental random connection and incremental Length Shuffle connection temporarily influence the running protocols, for both shortest-path routing and greedy routing. In each increment, both routing protocols have to update the neighborhood state and the forwarding strategy.
The situation is similar in the hybrid mechanism of regular connection and Length Shuffle connection. Moreover, all of this computation can be done offline, and the resulting networks provide a long-term, regular service in which a shorter average routing path length benefits every packet. Therefore, with given equipment, it is cost-effective to spend more effort in network construction to improve performance.

6. Evaluation
6.1. Methodology

The network performance of random connection has been verified in [38,39]. In this section, we investigate the network performance of the hybrid connection of random and Length Shuffle connection. As in RLS, we study the impact of different ratios of random connection R in construction, rather than specific routing protocols. The evaluation adopts the same simulator as used in [27,28], where switches work in an ideal store-and-forward model and flows are single-packet. The simulator is time-slot based with congestion on switches and servers, and a port can send out at most one flow in each time slot. On receiving a flow, a switch stores it in the corresponding forwarding queue and waits to send it in the next time slot. If more than one flow competes for a single target port, only the first flow of the queue is sent to its next hop, and the remaining flows in this forwarding queue are delayed. The routing path is randomly selected from all the shortest paths between the source and destination servers.

Three typical traffic patterns are run on the networks: random traffic, incast traffic, and shuffle traffic [28,31]. We use the number of flows to represent different degrees of traffic load; all flows are installed into the network at time slot 0. We investigate the average length of routing path, routing delay, and aggregate throughput of networks with different ratios of random connection. We conduct simulations on networks with 500 16-port switches and 3000 servers. Each switch connects to 10 other switches and 6 servers. We vary the number of flows in each traffic pattern, and each result is an average over 10 runs on 50 different topologies. We evaluate architectures with the ratio of random connection R = 0, 0.5, 0.9, and 1. Notice that with R = 0 the network is connected by Length Shuffle connection and the resulting network is LS, while with R = 1 the network is connected by random connection and the resulting network is Jellyfish.
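The following sketch captures the time-slot model just described in a simplified form (our own naming; it is not the simulator of [27,28]): every flow is a single packet with a fixed, precomputed switch-level route, all flows are injected at slot 0, and in every slot each outgoing port forwards at most the head flow of its queue, with newly received flows forwarded no earlier than the next slot.

from collections import defaultdict, deque

def simulate(routes):
    # routes: list of per-flow switch sequences (source ToR, ..., destination ToR).
    # Returns the slot in which each flow reaches its destination ToR.
    queues = defaultdict(deque)                       # (switch, next_hop) -> flow ids
    at_hop, finish = {}, {}
    for f, r in enumerate(routes):
        if len(r) < 2:
            finish[f] = 0                             # source and destination share a ToR
        else:
            at_hop[f] = 0
            queues[(r[0], r[1])].append(f)
    slot = 0
    while len(finish) < len(routes):
        slot += 1
        arrivals = []
        for port, q in queues.items():
            if not q:
                continue
            f = q.popleft()                           # one flow per port per slot
            at_hop[f] += 1
            r = routes[f]
            if at_hop[f] == len(r) - 1:
                finish[f] = slot                      # delivered to the destination ToR
            else:
                arrivals.append(((r[at_hop[f]], r[at_hop[f] + 1]), f))
        for port, f in arrivals:                      # store-and-forward: sent next slot
            queues[port].append(f)
    return finish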
6.2. Random traffic

In this traffic pattern, the source and destination of each flow are chosen randomly. The number of flows varies from 500 to 100,000, with a step of 500 from 500 to 10,000 and a step of 5000 from 10,000 to 100,000. The average length of routing path is 4.9876 for R = 1, 4.9223 for R = 0.9, 4.8766 for R = 0.5, and 4.8708 for R = 0, respectively. Fig. 9(a) and (b) show the routing delay and aggregate throughput of networks with different R under various degrees of random traffic.

Fig. 9(a) shows that the difference in delay between different R is not significant with a small number of flows. As the number of flows increases, the difference grows with R. For example, with 100,000 random flows, the average delay is 52.83 for R = 1, 49.18 for R = 0.9, 47.11 for R = 0.5, and 46.76 for R = 0. In other words, the transmission delay in LS is 13% less than in Jellyfish. As we can see, a small R can reduce the average routing delay of flows, and the reduction of delay from R = 1 to R = 0.9 is greater than that from R = 0.5 to R = 0.

Fig. 9(b) shows that the aggregate throughput grows with the number of flows. With a small number of flows, the difference in throughput between different R is not significant. As the number of flows increases, the difference grows with R. When there are more than 30,000 flows, the growth rate of throughput decreases significantly in each observation. With 100,000 flows, the throughput is 534 for R = 1, 630 for R = 0.9, 665 for R = 0.5, and 691 for R = 0, respectively. The aggregate throughput in LS is 29.4% higher than in Jellyfish with 100,000 random flows. We can see that a small R can improve the throughput of the network significantly under a large number of random flows.
6.3. Incast traffic

In the incast traffic pattern, a destination receives flows from multiple source servers. In this simulation, we assume a destination receives flows from 10 source servers, with sources and destinations chosen randomly. The set of flows is similar to that in the random traffic pattern, and the average length of routing path in the incast traffic pattern is the same as in the random traffic pattern. Fig. 10(a) and (b) show the routing delay and aggregate throughput of the different networks under various degrees of incast traffic, respectively.

Fig. 10(a) shows that the routing delay of the different networks behaves similarly to the random traffic pattern. With a large number of flows, a smaller R displays a greater advantage in average routing delay. With 100,000 incast flows, the delay is 54.11 for R = 1, 50.73 for R = 0.9, 48.88 for R = 0.5, and 48.63 for R = 0, respectively. The transmission delay in LS is 11.3% less than in Jellyfish. The gap is larger than for random traffic, which derives from the property of incast traffic that flows congest on the downstream part of the routing path towards the destination.

Fig. 10(b) shows that the aggregate throughput is similar to that in the random traffic pattern. With a large number of flows, the difference in throughput between the networks is significant. With 100,000 flows, the throughput is 539 for R = 1, 630 for R = 0.9, 649 for R = 0.5, and 680 for R = 0, respectively. The aggregate throughput in LS is 26.2% higher than in Jellyfish with 100,000 incast flows. This means that a small R can improve the throughput of networks in the incast traffic pattern.

6.4. Shuffle traffic

In the shuffle traffic pattern, source servers are concentrated in certain switches, and destination servers are scattered randomly among the other racks. We assume that 50 switches forward flows to the other 450 switches. We vary the number of flows from 500 to 50,000, with a step of 500 from 500 to 10,000 and a step of 5000 from 10,000 to 50,000. The average length of routing path is 4.9903 for R = 1, 4.9293 for R = 0.9, 4.8816 for R = 0.5, and 4.8765 for R = 0, respectively. Fig. 11(a) and (b) show the routing delay and aggregate throughput of the different networks under various degrees of shuffle traffic. The routing path is a little longer than that of random and incast traffic, which derives from the rack separation of source and destination servers.

Fig. 11 shows that the performance of the different networks is very close in the shuffle traffic pattern. There is almost no difference in routing delay among the different Rs, and the aggregate throughput increases only slightly as R decreases. From these results we can see that a small R can significantly reduce the routing delay, as well as improve the aggregate throughput, in the random and incast traffic patterns for a large number of flows; in the shuffle traffic pattern, the impact of different R is slight. The experiments imply that Length Shuffle connection offers higher network performance than random connection under heavy random and incast traffic.
Fig. 9. Random traffic pattern.

Fig. 10. Incast traffic pattern: (a) delay; (b) throughput.

Fig. 11. Shuffle traffic pattern.
The advantage derives from the smaller average length of routing path in Length Shuffle based networks, which benefits every flow by reducing transmission delay and network congestion.
7. Conclusion

In this paper, we proposed Length Shuffle connection, which follows a greedy principle that connects the two furthest possible switches in each interconnection step. Length Shuffle connection greedily reduces the number of longest paths in network construction, which results in a smaller average length of routing path. In flexible architectures, the replacement of random connection by Length Shuffle connection does not interfere with the original routing protocols, and the resulting network provides higher performance. The advantages and adaptability of Length Shuffle connection have been shown in various dimensions. The experimental results demonstrate that Length Shuffle connection provides a smaller routing delay and a higher network throughput in the random and incast traffic patterns.

Acknowledgment
This work is supported by the National Science Foundation for Distinguished Young Scholars of China (grant no. 61225010), the National Science Foundation of China (grant no. 61173160), and also by the Fundamental Research Funds for the Central Universities (grant no. DUT15TD29).

References

[1] H. Abu-Libdeh, P. Costa, A. Rowstron, G. O'Shea, A. Donnelly, Symbiotic routing in future data centers, in: Proceedings of ACM SIGCOMM 10, 2010.
[2] N. Ahmad, Gigaom structure conference 2013. http://gigaom.com/2013/06/19/did-facebook-finally-let-its-server-count-slip (2013).
[3] M. Al-Fares, A. Loukissas, A. Vahdat, A scalable, commodity data center network architecture, in: Proceedings of ACM SIGCOMM 08, 2008. [4] V. Asghari, R.F. Moghaddam, M. Cheriet, Performance analysis of modified BCube topologies for virtualized data center networks, Comput. Commun. 96 (2016) 52–61. [5] S. Azizi, N. Hashemi, A. Khonsari, HHS: an efficient network topology for large-scale data centers, J. Supercomput. 72 (3) (2016) 874–899. [6] B. Bollobás, The isoperimetric number of random regular graphs, Eur. J.Combinatorics 9 (3) (1988) 241–244. [7] D. Borthakur, The hadoop distributed file system: architecture and design http: //hadoop.apache.org/docs/r0.18.0/hdfs_design.pdf. [8] M. Chiesa, G. Kindler, M. Schapira, Traffic engineering with equal-cost-multi– path: an algorithmic perspective, in: Proceedings of IEEE INFOCOM 14, 2014. [9] P. Costa, A. Donnelly, G. Oshea, A. Rowstron, CamCube: AKey-based Data Center, Technical Report, Technical Report MSR TR-2010-74, Microsoft Research, 2010. ˝ [10] M. Csernai, A. Gulyás, A. Korösi , B. Sonkoly, G. Biczók, Incrementally upgradable data center architecture using hyperbolic tessellations, Comput. Netw. 57 (6) (2013) 1373–1393. [11] A.R. Curtis, T. Carpenter, M. Elsheikh, A. López-Ortiz, S. Keshav, REWIRE: an optimization-based framework for unstructured data center network design, in: INFOCOM, 2012 Proceedings IEEE, IEEE, 2012, pp. 1116–1124. [12] A.R. Curtis, S. Keshav, A. Lopez-Ortiz, LEGUP: using heterogeneity to reduce the cost of data center network upgrades, in: Proceedings of ACM CoNEXT 10, 2010. [13] M. Dong, H. Lit, K. Ota, H. Zhu, HVSTO: efficient privacy preserving hybrid storage in cloud data center, in: Proceedings of IEEE INFOCOM 14, 2014. [14] S. Ghemawat, H. Gobioff, S.-T. Leung, The GOOGLE file system, in: Proceedings of ACM SIGOPS 03, 2003. [15] A. Greenberg, J.R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D.A. Maltz, P. Patel, S. Sengupta, VL2: a scalable and flexible data center network, in: Proceedings of ACM SIGCOMM 09, 2009. [16] A. Greenberg, P. Lahiri, D.A. Maltz, P. Patel, S. Sengupta, Towards a next generation data center architecture: scalability and commoditization, in: Proceedings of ACM Workshop on Programmable Routers for Extensible Services of Tomorrow, 2008, pp. 57–62. [17] Guo, Lu, Guohan, Wang, J. Helen, Yang, Shuang, Kong, Chao, SecondNet: a data center network virtualization architecture with bandwidth guarantees, ACM CoNext 24 (2010) 620–622. [18] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, S. Lu, BCube: a high performance, server-centric network architecture for modular data centers, in: Proceedings of ACM SIGCOMM 09, 2009. [19] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, S. Lu, DCell: a scalable and fault-tolerant network structure for data centers, in: Proceedings of ACM SIGCOMM 08, 2008. [20] D. Guo, T. Chen, D. Li, M. Li, Y. Liu, G. Chen, Expandable and cost-effective network structures for data centers using dual-port servers, IEEE Trans. Comput. (TC) 62 (7) (2013) 1303–1317. [21] L. Gyarmati, A. Gulyás, B. Sonkoly, T.A. Trinh, G. Biczók, Free-scaling your data center, Comput. Netw. 57 (8) (2013) 1758–1773. [22] L. Gyarmati, T.A. Trinh, Scafida: a scale-free network inspired data center architecture, ACM SIGCOMM Comput. Commun. Rev. 40 (5) (2010) 4–12. [23] B. Heller, S. Seetharaman, P. Mahadevan, Y. Yiakoumis, P. Sharma, S. Banerjee, N. McKeown, ElasticTree: saving energy in data center networks., in: NSDI, 10, 2010, pp. 249–264.
[24] P. Holub, M. Miller, H. Pérez-Rosés, J. Ryan, Degree diameter problem on honeycomb networks, Discrete Appl. Math. 179 (2014) 139–151. [25] C.E. Hopps, Analysis of an equal-cost multi-path algorithm http://tools.ietf.org/ html/rfc2992. [26] D. Li, C. Guo, H. Wu, K. Tan, Y. Zhang, S. Lu, FiConn: using backup port for server interconnection in data centers, in: Proceedings of IEEE INFOCOM 09, 2009. [27] D. Li, J. Wu, On the design and analysis of data center network architectures for interconnecting dual-port servers, in: Proceedings of IEEE INFOCOM 14, 2014. [28] D. Li, J. Wu, Z. Liu, F. Zhang, Dual-centric data center network architectures, in: Proceedings of IEEE ICPP 15, 2015. [29] D. Li, J. Wu, Z. Liu, F. Zhang, Towards the tradeoffs in designing data center network architectures, IEEE Trans. Parallel Distrib. Syst. 28 (1) (2017) 260–273. [30] Y. Liao, J. Yin, D. Yin, L. Gao, DPillar: dual-port server interconnection network for large scale data centers, Comput. Netw. 56 (8) (2012) 2132–2147. [31] Y.J. Liu, P.X. Gao, B. Wong, S. Keshav, Quartz: a new design element for low-latency DCNS, in: Proceedings of ACM SIGCOMM 14, 2014. [32] G. Lu, C. Guo, Y. Li, Z. Zhou, T. Yuan, H. Wu, Y. Xiong, R. Gao, Y. Zhang, ServerSwitch: a programmable and high performance platform for data center networks, in: Proceedings of USENIX NSDI 11, 2011, pp. 15–28. [33] C. Lynch, Big data: how do your data grow? Nature 455 (7209) (2008) 28–29. [34] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, A.H. Byers, M.G. Institute, Big data: the next frontier for innovation, competition, and productivity (2011). [35] J. Moy, OSPF version 2 (1997). [36] R. N. Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, A. Vahdat, Portland: a scalable fault-tolerant layer 2 data center network fabric, in: Proceedings of ACM SIGCOMM 09, 2009. [37] J.-Y. Shin, B. Wong, E.G. Sirer, Small-world datacenters, in: Proceedings of ACM SOCC 11, 2011. [38] A. Singla, C.-Y. Hong, L. Popa, P.B. Godfrey, Jellyfish: networking data centers randomly, in: Proceedings of USENIX NSDI 12, 2012. [39] B. Stephens, A. Cox, W. Felter, C. Dixon, J. Carter, PAST: scalable ethernet for data centers, in: Proceedings of the 8th International Conference on Emerging Networking Experiments and Technologies, ACM, 2012, pp. 49–60. [40] X. Sun, N. Ansari, R. Wang, Optimizing resource utilization of a data center, IEEE Commun. Surv. Tutorials 18 (4) (2016) 2822–2846. [41] D.R. Trust, What is driving the north america/europe data center market? http: //www.digitalrealty.com/us/knowledge- center- us/?cat=Research. (2001). [42] H. Wu, G. Lu, D. Li, C. Guo, Y. Zhang, MDCube: a high performance network structure for modular data center interconnection, in: ACM Conference on Emerging NETWORKING Experiments and Technology, CONEXT 20 09, 20 09, pp. 25–36. [43] X. Yi, F. Liu, J. Liu, H. Jin, Building a network highway for big data: architecture and challenges, IEEE Netw. Mag. 28 (4) (2014) 5–13. [44] Y. Yu, C. Qian, Space Shuffle: a scalable, flexible, and high-bandwidth data center network, IEEE ICNP 14, 2014. [45] Y. Yu, C. Qian, Space shuffle: a scalable, flexible, and high-performance data center network, IEEE Trans. Parallel Distrib. Syst. 27 (11) (2016) 3351–3365. [46] L. Zhang, T. Han, N. Ansari, Revenue driven virtual machine management in green datacenter networks towards big data, in: Global Communications Conference (GLOBECOM), IEEE, 2016, pp. 1–6. [47] Y. Zhang, N. 
Ansari, On architecture design, congestion notification, TCP incast and power consumption in data centers, IEEE Commun. Surv. Tutorials 15 (1) (2013) 39–64.