ELSEVIER
Computer Networks and ISDN Systems 29 (1997) 1953-1968
Design and performance evaluation of the ATM multicast shuffleout switch ¹
Paolo Giacomazzi *, Achille Pattavina
Department of Electronics and Information, Politecnico di Milano / CEFRIEL, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
Abstract
This paper proposes a multicast ATM switch whose interconnection network is the same as in the Shuffleout switch architecture, that is, a multistage arrangement of switching elements in which self-routing is accomplished by shortest-path and deflection routing. Packet replication is accomplished by means of a functionality added to each switching element, which is therefore able not only to replicate the multicast packets as they go through the network but also to route the single copies to each addressed outlet. By suitably engineering the number of stages, the network is able to generate the total number of requested copies per slot within the target packet loss probability. An enhanced version is also discussed that is capable of limiting the packet replications per slot, so as to obtain the desired traffic performance with fewer network stages. © 1997 Elsevier Science B.V.
1. Introduction
Narrowband networks have been deployed in the last decades for the provision of communication services whose objective has been mainly the transfer of voice and data information. All of these services have typically been provided between two end-users, that is, as point-to-point services. The advent of broadband networking capabilities in transmission and switching equipment, and the consequent key role to be played by video services, has made clear the importance of providing point-to-multipoint networking services as well. Examples of possible applications are video entertainment, teleconferencing, video lectures,
* Corresponding author. E-mail: [email protected].
¹ Work carried out at the Politecnico di Milano/CEFRIEL, Milan (Italy), and jointly supported by the Italian National Research Council and the Italian Ministry of University and Scientific Research.
0169-7552/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved.
PII S0169-7552(97)00105-0
etc. The term multicast services will be used here as including both one-to-all services, e.g. a broadcast entertainment service, and one-to-many services, e.g. a teleconference with dynamic join-and-leave capability. The key point in making the provision of multicast services possible is the design of a multicast switch capable of accomplishing the packet replication within the time constraints typical of an ATM switch and with a reasonable hardware complexity.
Several proposals have been published in the technical literature. Packet multicasting is usually thought of as a capability added to a basic ATM switch, whose core function is the packet routing (or switching) from the inlets to the outlets of the switch. A classification of the presented proposals, although not easy, could be based on the feasibility of integrating the packet replication function within the basic routing function of the ATM switch. Therefore an ATM switch is said to have a replication-routing
disjoint (RR-disjoint) architecture if a copy network is provided that accomplishes packet replication and such function cannot be merged with the packet routing function into the same devices. On the other hand, an ATM switch is said to have a replication-routing combined (RR-combined) architecture if packet replication and routing are performed by the same devices (typically hardware). In general, RR-combined architectures are likely to be more cost-effective solutions owing to the equipment commonality of the two functions of copying (replication) and switching (routing).
Upon receiving a packet on a multicast virtual connection, the basic functionality of a multicast switch is the generation of as many copies of the packet as the number of downstream edges of the multipoint connection. Each copy, after receiving from a multicast channel translator its own new multicast channel number, which differs from copy to copy, is then routed to its proper switch outlet.
The broadcast packet switching network proposed by Turner [1] performs the packet replication in a copy network, whose topology is a multistage banyan, that replicates the packets as they go through the stages. Conflicts can arise in this structure owing to the mutually independent operations of the port controllers, which can require conflicting packet replications. The copy network, although physically separated from the routing network, has the same topology as the routing network. Therefore such a network is RR-disjoint, even if it could be classified as RR-combined since the switching elements of a single banyan network could accommodate both functions.
The conflicts between replication requests have been avoided in the RR-disjoint architecture proposed by Lee [2]. Here the banyan network performing the packet replication acts as a non-blocking network, since it is preceded by a running adder network and dummy address encoders to prevent replication conflicts. Unfairness problems of this architecture are overcome in the upgraded structure described in [3]. Lee's proposal has the drawback of not being able to prevent the port controllers from requesting in a slot a total number of copies that exceeds the network replication capacity. Adding more hardware devices can avoid this problem, so that the unsatisfied requests are held in the input queues [4]. Two other RR-disjoint architectures are proposed in
[5,6]. In the former case packets are replicated recursively by multiple passes through the same network, whereas in the latter case a set of interleaved memories and three banyan networks perform packet replication while also accomplishing shared packet queuing.
In all of the above proposals packet copies can emerge at any output port of the copy network, so that the multicast channel translators, usually one per output port, must contain the translation tables of all the connections supported by the switch. A more efficient solution requires such tables to be split into smaller multiple units so that the outgoing virtual channel of a multicast connection resides in only one table. Such a feature is embodied in the RR-disjoint proposal of [7] which, similarly to Lee's proposal, adopts a running adder coupled with a banyan network. Therefore the copy network is internally non-blocking and, unlike Lee's proposal, the number of requested copies never exceeds the copy network size.
A proposal of an RR-combined multicast switch is reported in [8]; it basically adapts the rerouting banyan network, originally proposed for ATM switching of point-to-point connections [9], to act as a copy network. Unlike the previous proposals, all based on the routing function performed by a banyan network with suitable queuing capability to avoid packet loss, deflection routing is now used to cope with conflicts in packet routing or replication: in fact, several paths are now available between any inlet and outlet. Therefore with this proposal the conflicts between packet replication requests are dealt with by suitably designing the number of stages. However, the total number of requested copies in a slot can exceed the network replication capacity.
We describe here (Section 2) how the ATM Shuffleout switch [10] can be adapted to act as a copy network, thus configuring an RR-combined architecture.
Also in this case packet deflection is applied in case of packet conflicts, so that the number of stages can be engineered to limit the cell loss probability due to replication conflicts. Here it is also possible to keep the number of replications requested per slot within the network replication capacity by means of a running adder network that precedes the interconnection network (Section 3). The paper also provides an evaluation of the traffic performance provided by the Replication Network in terms of packet loss probability (Section 4).
2. The Multicast Shuffleout switch architecture
The Multicast Shuffleout switch (Fig. 1) includes two basic subsystems: the Replication Network and the Routing Network. The Replication Network generates a number of copies of each packet equal to the number of addressed network outlets. Each copy is then given a proper network outlet address and its own new multicast virtual channel number by a multicast channel translator (MCT) and, finally, the Routing Network transmits each packet to the required outlet.
Fig. 2 shows the structure of both the Replication Network and the Routing Network. The interconnection network is a multistage structure built out of 2 × 4 switching elements (SEs) with interstage shuffle connection patterns, interconnecting input port controllers (IPCs) and output port controllers (OPCs). The main network parameters are the number N of switch inlets and outlets, and the numbers $K_1$ and $K_2$ of switching stages of the Replication Network and Routing Network, respectively. Each OPC is equipped with an output queue fed by the C outlets of a concentrator which can receive up to $K_1$ ($K_2$) packets per slot from the interconnection network in the Replication (Routing) Network. Only in the Replication Network, each OPC is provided with a
MCT. A SE is connected to the previous stage by its two inlets and to the next stage by its two interstage outlets. Two additional local outlets are available to access the associated output queues. Evidently, the SEs have a different functionality in the Replication Network and in the Routing Network. In the following we will describe the Replication Network as only performing packet replication and not routing. Nevertheless, a single interconnection network with K stages performing both replication and routing can well be engineered by providing the SEs with the functionality of both replicating and routing. In this case the multicast cells cross the interconnection network twice: the first time for packet replication, followed by the mapping operation in the addressed MCT, and the second time for their routing to the addressed switch outlet. Unicast cells, that is packets belonging to one-to-one connections, would cross the network only once. Therefore, with such an arrangement, the output queue would act as storage capacity for the routed packets before their transmission to the switch outlet and also as a temporary buffer for the replicated packets before their re-entry into the interconnection network for routing.
Fig. 1. Architecture of the Multicast Shuffleout switch.
The distributed routing algorithm adopted in the interconnection network is jointly based on the shortest path and deflection routing principles. As a consequence, the number of interconnection network stages crossed by a cell (I/O path length) is variable. This is because a SE attempts to route the received cells along its outlets belonging to the minimum I/O path length to the required destination. The output distance of a cell from the switching element it is crossing to the required output queue is defined as the minimum number of stages to be crossed by the cell in order to enter a SE interfacing the addressed output queue. The SE can compute the cell output distance very easily after reading the cell output address. In fact, consider an N × N network with inlets and outlets numbered 0 through N - 1 and SEs numbered 0 through N/2 - 1 (see Fig. 3 for N = 16). The inlets and outlets of a generic switching element of stage i with index $(x_{n-1}, x_{n-2}, \ldots, x_1)$ ($n = \log_2 N$) have addresses $(x_{n-1}, x_{n-2}, \ldots, x_1, 0)$ and $(x_{n-1}, x_{n-2}, \ldots, x_1, 1)$. Owing to the interstage shuffle connection pattern, this element is connected to the switching elements $(x_{n-2}, x_{n-3}, \ldots, x_1, 0)$ and $(x_{n-2}, x_{n-3}, \ldots, x_1, 1)$ in stage i + 1. Thus a
cell received on inlet $(x_{n-1}, x_{n-2}, \ldots, x_1, y)$ (y = 0, 1) is at output distance d = 1 from the network outlets $(x_{n-2}, x_{n-3}, \ldots, x_1, y, z)$ (y, z = 0, 1). It follows that the SE determines the cell output distance to be d = k if k cyclic left-rotations of its own address are necessary to obtain an equality between the n - k most significant bits of the rotated address and the n - k most significant bits of the cell address. If the cell output distance is d = 0, the cell requires a local outlet of the SE.
The SE routing function is very simple in the Routing Network: when two cells require the same SE interstage outlet, only one can be correctly routed, while the other must be transmitted to the other interstage outlet, due to the memoryless structure of the SE. Conflicts are resolved by the SE applying the deflection routing principle: if the conflicting cells have different output distances, the closest one is routed to its required outlet, while the other is deflected to the other link. If the cells have the same output distance, a random choice is carried out. If the conflict occurs for a local outlet, the loser packet is deflected onto an interstage outlet.
Fig. 2. Architecture of the Shuffleout interconnection network.
The SE routing function is more complex in the Replication Network, even if it includes as a particular case the simple routing algorithm of the Routing Network. When a multicast cell enters the Replication Network, it is modified by the accessed IPC with the addition of a tag composed of four numeric fields (see Fig. 4) called Start Address (SA), End Address (EA) (both ranging from 0 to N - 1), Multicast Channel Number (MCN) and Activity bit (A). The MCN is used in the MCTs to update the multicast virtual channel identifier (MVCI) in the cell header. The Activity bit is set to 1 for the cells
carrying information and to 0 for dummy cells.
Fig. 3. 16 × 16 network with shuffle interconnections.
When a multicast cell addressing f switch outputs (that is, a cell with fanout equal to f) passes through the Replication Network, it is replicated f times, and each copy of the cell is addressed to a different MCT. Each MCT assigns the switch output address and the new MVCI to only one copy of the multicast cell, so that the cell is transmitted to the specified switch output by the Routing Network. The selection of the MCTs and MVCIs for each multicast connection is done at connection set-up time.
Fig. 4. Cell tag structure. The additional tag attached to the ATM cell carries the fields MCN (Multicast Channel Number), SA (Start Address), EA (End Address) and A (Activity bit).
The Replication Network generates the copies of the multicast cells and routes them to the addressed MCTs on the basis of the address range (SA, EA): a multicast cell can address any number f (with 1 ≤ f ≤ N) of switch outputs and, therefore, any number f of MCTs; moreover, the cell routing information is based only on the pair of numbers (SA, EA). Therefore, each multicast cell with fanout f must address a set of f contiguous MCTs. For example, a cell with (SA, EA) = (5,7) addresses the three MCTs with index 5, 6 and 7. Since SA can be greater than EA, two types of address ranges are provided:
$$
(SA, EA) =
\begin{cases}
\{x : x \text{ integer},\ SA \le x \le EA\} & \text{if } SA \le EA, \\
\{x : x \text{ integer},\ SA \le x \le N-1 \text{ or } 0 \le x \le EA\} & \text{if } SA > EA.
\end{cases}
\qquad (1)
$$
Both types of address ranges are defined as modulo-N compact ranges, since they are defined by only two integers ranging from 0 to N - 1. The width W of an address range, that is, the number of MCTs it addresses, is defined as:
$$
W = (EA - SA) \bmod N + 1. \qquad (2)
$$
When a multicast connection is set up, its address range (SA, EA) is defined; since there are N possible ways of assigning a modulo-N compact address range, a random choice is carried out.
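The modulo-N compact range and its width can be illustrated with a minimal Python sketch. The function names are hypothetical and the switch size N = 8 is an assumption chosen only for the example; the arithmetic follows Eqs. (1) and (2).

```python
def range_width(sa: int, ea: int, n: int) -> int:
    """Width W of the modulo-N compact range (SA, EA), as in Eq. (2)."""
    return (ea - sa) % n + 1


def range_members(sa: int, ea: int, n: int) -> list[int]:
    """MCT indices addressed by (SA, EA); wraps past N - 1 when SA > EA."""
    return [(sa + k) % n for k in range(range_width(sa, ea, n))]


# N = 8 is assumed here purely for illustration.
print(range_members(5, 7, 8))  # [5, 6, 7]
print(range_members(4, 2, 8))  # [4, 5, 6, 7, 0, 1, 2]
```

Python's `%` operator returns a non-negative result for a positive modulus, so the same expression covers both the SA ≤ EA and the SA > EA case.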
Fig. 5. (a) Example of multicast connection. (b) Cell routing in the Replication Network.
Note that each MCT can receive replicated cells belonging to multicast connections received on different inlets of the switch. Therefore, the allocation of a proper internal MCN at the switch input to each multicast cell is mandatory to avoid conflicts in the mapping accomplished by each MCT. Again, the selection of the MCN is done at connection set-up time. In the example of Fig. 5a a multicast connection enters a network node with MVCI equal to 26 and branches on three outgoing links with MVCIs equal to 18, 1 and 16, respectively. As shown in Fig. 5b, the IPC assigns to each cell of the specified connection the values 14, 0 and 2 for the MCN, SA and EA fields, respectively. The Replication Network creates 3 copies of the multicast cell and transmits them to the MCTs 0, 1 and 2. Each MCT assigns to the received cell the switch outlet index and the new MVCI to be used on the switch outgoing link.
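The table look-ups in the example of Fig. 5 can be sketched as follows. This is only an illustrative model of the tagging and translation steps, not of the network itself: the switch size N = 8, the outlet indices 3, 5 and 6, and the pairing of each outgoing MVCI with a specific MCT are assumptions, since the text gives only the incoming MVCI, the tag values and the three outgoing MVCIs.

```python
from dataclasses import dataclass

N = 8  # assumed switch size; not stated for Fig. 5


@dataclass(frozen=True)
class Tag:
    mcn: int    # Multicast Channel Number
    sa: int     # Start Address
    ea: int     # End Address
    a: int = 1  # Activity bit (1 = cell carries information)


# IPC table, filled at connection set-up: incoming MVCI -> tag.
IPC_TABLE = {26: Tag(mcn=14, sa=0, ea=2)}

# One table per MCT: MCN -> (switch outlet, outgoing MVCI).
# Outlet indices and the MVCI-to-MCT pairing are hypothetical.
MCT_TABLES = {
    0: {14: (3, 18)},
    1: {14: (5, 1)},
    2: {14: (6, 16)},
}


def addressed_mcts(tag: Tag, n: int = N) -> list[int]:
    """MCTs covered by the modulo-N compact range (SA, EA)."""
    width = (tag.ea - tag.sa) % n + 1
    return [(tag.sa + k) % n for k in range(width)]


def translate(incoming_mvci: int) -> list[tuple[int, int]]:
    """Return one (outlet, outgoing MVCI) pair per generated copy."""
    tag = IPC_TABLE[incoming_mvci]
    return [MCT_TABLES[m][tag.mcn] for m in addressed_mcts(tag)]


print(translate(26))  # [(3, 18), (5, 1), (6, 16)]
```

Because each MCT holds only the entries of the connections whose address range includes it, the MCN must be unique among the connections sharing an MCT, which is why it is allocated at connection set-up time.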
2.1. The SE structure
Each SE in row r in the interconnection network is connected through two local outlets to the output queues 2r and 2r + 1. A SE receiving a multicast cell can perform two operations:
- Copy Operation: if one (both) of the SE local outlets is (are) included in the address range of the cell, the SE can create one copy (two copies) of the cell and transmit it (them) to the MCT(s) through the local outlet(s).
- Split Operation: if the width of the cell address range is greater than 1, the cell address range can be split into two modulo-N compact address subranges, each of them associated with a new copy of the cell; these two copies are transmitted to the next stage through the interstage links. If the Split Operation is preceded by the Copy Operation, it is accomplished on the residual part of the cell address range.
If the SE receives only one cell, it can perform the Copy Operation only if at least one local outlet is included in the address range (SA, EA) of the cell; otherwise it passes directly to the Split Operation. This operation consists in dividing the cell address range (SA, EA) into two subranges $(SA_1, EA_1)$
and $(SA_2, EA_2)$ with equal widths (if W is even) or with widths differing by one (if W is odd); the splitting rule is defined as:
$$
(SA_1, EA_1) = \bigl(SA,\ (SA + \lceil W/2 \rceil - 1) \bmod N\bigr), \qquad
(SA_2, EA_2) = \bigl((EA_1 + 1) \bmod N,\ EA\bigr). \qquad (3)
$$
Once the two address ranges are defined, two copies of the cell (each of them carrying an address subrange) are routed to the next stage. For both of the outgoing cells the distances $d_{SA}$ and $d_{EA}$ from the address range extremes SA and EA are computed; then, the range distance $D = \min\{d_{SA}, d_{EA}\}$ is also computed. The cell with the lower range distance is given higher priority in the choice of the outgoing interstage link. If a conflict for an interstage link occurs and the cells have the same range distance, the cell with the lower address range width is chosen; if the address range widths also coincide, a random choice is carried out.
If the SE can perform the Copy Operation on one (and only one) local outlet, the local outlet index must be equal to SA or EA (extreme local outlet). In this case, after the Copy Operation the residual address range is still modulo-N compact and the Split Operation can be applied according to Eq. (3).
If the SE can perform the Copy Operation on both local outlets two cases may occur:
1. The local outlets included in the address range are at its leftmost or rightmost edge (extreme local outlets). For example, local outlets 6 and 7 are at the rightmost edge of the address range (4,7). In this case, after the Copy Operation the residual address range is still modulo-N compact and the Split Operation can be applied according to Eq. (3).
2. The local outlets included in the address range are not at its leftmost or rightmost edge (central local outlets). For example, local outlets 6 and 7 (if N = 8) are central for the address range (4,2). In this case, after the Copy Operation the residual address range consists of two modulo-N compact ranges, and the Split Operation must be performed according to the following rule:
$$
(SA_1, EA_1) = \bigl(SA,\ (2r - 1) \bmod N\bigr), \qquad
(SA_2, EA_2) = \bigl((2r + 2) \bmod N,\ EA\bigr). \qquad (4)
$$
If two cells are to be routed by the SE, the Copy Operation can be applied only if at least one local
outlet is extreme for one or both cells; in this case, the Copy Operation is performed on the cell with the smallest address range width. If the selected cell is not to be transmitted to the next stage after the Copy Operation, the SE can perform the Split Operation on the other cell, according to Eq. (3). If both cells must be transmitted to the next stage, the cell with the lowest range distance is given priority in the choice of the interstage link if a conflict occurs. If the two cells have the same range distance, the cell with the smaller address range width is preferred. If the address range widths also coincide, a random choice is carried out.
An example of packet routing in the Replication Network is shown in Fig. 6 in the case of N = 8 and $K_1$ = 5. A multicast cell with address range (0,4) and MCN equal to 21 enters the network from inlet 0, a cell with address range (6,7) and MCN 17 from inlet 2 and a cell with address range (3,6) and MCN 82 from inlet 6. Note that the total number of addressed switch outlets is 11 (greater than N = 8).
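The two splitting rules of Section 2.1 can be sketched in Python as below. The function names are hypothetical. The sketch assumes that, for an odd width W, the first subrange of Eq. (3) takes the larger share of ⌈W/2⌉ addresses; this assumption reproduces the subranges reported for the example of Fig. 6, such as (2,3) and (4,4) from a residual range (2,4), and (6,6) and (7,7) from the range (6,7).

```python
Range = tuple[int, int]


def width(sa: int, ea: int, n: int) -> int:
    """Width of the modulo-N compact range (SA, EA)."""
    return (ea - sa) % n + 1


def split(sa: int, ea: int, n: int) -> tuple[Range, Range]:
    """Split a compact range in two compact subranges (Eq. (3)).

    Assumption: the first subrange gets ceil(W/2) addresses.
    """
    w = width(sa, ea, n)
    if w < 2:
        raise ValueError("a range of width 1 cannot be split")
    ea1 = (sa + (w + 1) // 2 - 1) % n  # (w + 1) // 2 == ceil(w / 2)
    return (sa, ea1), ((ea1 + 1) % n, ea)


def split_around_central(sa: int, ea: int, r: int, n: int) -> tuple[Range, Range]:
    """Residual subranges after copying on central local outlets 2r, 2r + 1 (Eq. (4))."""
    return (sa, (2 * r - 1) % n), ((2 * r + 2) % n, ea)


print(split(2, 4, 8))                    # ((2, 3), (4, 4))
print(split(6, 7, 8))                    # ((6, 6), (7, 7))
print(split_around_central(4, 2, 3, 8))  # ((4, 5), (0, 2))
```

The last call corresponds to the case of central local outlets 6 and 7 (row r = 3, N = 8) for the address range (4,2): the two residual ranges lie on either side of the copied outlets.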
In the first stage, the SE in row 0 performs the Copy Operation (transmitting a copy of the cell to the MCTs 0 and 1) and then splits the residual address range according to Eq. (3), transmitting to the second stage a cell with address range (4,4) and a cell with address range (2,3). In this case a conflict occurs for the lower interstage link: the cell with address range (2,3) wins the conflict, since it has a lower range distance. The SE in row 1 creates two copies of the received cell, one with range (6,6) and one with range (7,7); a conflict occurs for the lower interstage link and a random choice is carried out, since both cells have the same range distance and address range width. The SE in row 3 transmits a copy of the received cell to the MCT 6 and splits the cell into two new cells with address ranges (3,4) and (5,5); a conflict for the upper interstage link occurs, and the cell with address range (5,5) is the winner; in fact, the two cells have the same range distance, but the cell with range (5,5) has a lower address range width.
In the second stage, the SE in row 3 receives a cell with address range (6,6) and a cell with range (3,4). The cell with range (6,6) can exit the network, while the other cell is split into two cells with ranges (3,3) and (4,4), respectively. In this case a conflict for the upper interstage link occurs, and it is won by the cell (4,4), since it has a lower range distance. The SE in row 2 receives a cell with address range (7,7) and a cell with address range (5,5) (note that this case could also be considered as an example of point-to-point routing); the former cell is routed to the third stage through the shortest path, while the latter exits the network. The SE in row 1 receives a cell with address range (2,3) and, therefore, it can perform the Copy Operation on both local outlets. The SE in row 0 receives a cell with address range (4,4), and routes it to the third stage through the shortest path.
In the third stage, the SE in row 1 receives two cells with address ranges (4,4) and (7,7), respectively. Since the shortest paths to the required outlets follow different interstage links, no conflict occurs. The SE in row 3 can also route the received cell through the shortest path; the SE in row 2 in the same stage receives a cell with address range (4,4) and, therefore, transmits it to the required MCT. In the example of Fig. 6, no more conflicts occur in the following stages of the Replication Network.
Fig. 6. An example of packet routing in the Replication Network. Cell tag: [MCN, SA, EA, A].
3. Limitation of the replication conflicts
In the previous example it has been shown that the total address range width of the cells offered to the Replication Network in the same time slot can be greater than N. Even if the network can efficiently handle such a phenomenon, we will show that it can degrade the performance of the "basic" multicast Shuffleout switch, especially in the case of a concurrent arrival of many cells with a large address range width. Thus, it is worth improving the switch architecture by providing a means to control the number of replication requests in each time slot. This is obtained in the "enhanced" architecture shown in Fig. 7 by providing each switch inlet with an input buffer, in which the unsatisfied replication requests are stored for further trial in subsequent time slots. The number of replication requests allowed to enter the Replication Network in one time slot is computed by a Running Adder (RA), and R denotes the maximum total number of replication requests allowed to enter the Replication Network in one time slot. The structure of the Running Adder described in [2] can be used here as well.
Fig. 7. Limitation of the number of replication requests.
Fig. 8. (a) Time slot t; (b) time slot t + 1.
Fig. 9. Loss probability for the case R = NF; FIX distribution; N = 16, 256.
Fig. 10. Loss probability for the case R = NF; FIX distribution; F = 4, 16.
Fig. 12. Loss probability for the case R = NF; UNI distribution; F = 4, 16.
In Fig. 8 an example of the RA operations is shown, in the case of N = 8 and R = N. In time slot t (Fig. 8a) the RA starts the count of the replication requests from inlet 0 (identified by the black token), offering a multicast cell with fanout 4. The total fanout of the cells offered to the adder inputs 0, 1 and 2 is 6; moreover, the fanout of the cell offered on inlet 3 is 4; therefore, accepting also the cell on inlet 3 would imply a total number of replication requests greater than R. In this case the network enables only two copies to be made of the multicast cell offered on input 3 and the remaining copies will be satisfied in the next time slot. In Fig. 8b the state of the input buffers at the beginning of the time slot t + 1 is shown; the RA starts the requests count from input 3 (identified by the black token). Since the total fanout of the cells offered by inputs 3 through 7 is 6, the RA can accept also the cell offered on input 0. In this case the total request count is equal to R and, therefore, the count process in time slot t + 2 will start from input 1.
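The admission rule of the example of Fig. 8 can be modelled with the sequential sketch below. It is a behavioural model only: the running adder of [2] is a hardware network, and the function name, the per-inlet fanout values other than those quoted in the text, and the choice of leaving the token unchanged when the budget R is not exhausted are all assumptions.

```python
def running_adder(pending: list[int], token: int, r: int) -> tuple[list[int], list[int], int]:
    """Grant at most r copies in one slot, scanning inlets cyclically from `token`.

    pending[i] is the number of copies still requested by the head cell of inlet i.
    Returns (granted, residual, next_token).
    """
    n = len(pending)
    granted = [0] * n
    budget = r
    next_token = token  # assumption: token does not move if the budget is not exhausted
    for step in range(n):
        i = (token + step) % n
        if pending[i] == 0:
            continue
        take = min(pending[i], budget)
        granted[i] = take
        budget -= take
        if budget == 0:
            # A partially served inlet keeps the token; otherwise it moves to the next inlet.
            next_token = i if take < pending[i] else (i + 1) % n
            break
    residual = [p - g for p, g in zip(pending, granted)]
    return granted, residual, next_token


# Slot t: inlets 0..2 request 6 copies in total and inlet 3 requests 4 (from the text);
# the split 4 + 1 + 1 and the requests on inlets 4..7 are hypothetical.
granted, residual, token = running_adder([4, 1, 1, 4, 3, 0, 1, 0], token=0, r=8)
print(granted, residual, token)  # [4, 1, 1, 2, 0, 0, 0, 0] [0, 0, 0, 2, 3, 0, 1, 0] 3

# Slot t + 1: a new cell with fanout 2 is assumed to have arrived at inlet 0.
granted, residual, token = running_adder([2, 0, 0, 2, 3, 0, 1, 0], token=3, r=8)
print(granted, residual, token)  # [2, 0, 0, 2, 3, 0, 1, 0] [0, 0, 0, 0, 0, 0, 0, 0] 1
```

With these inputs the model reproduces the behaviour described for Fig. 8: only two copies of the cell on input 3 are admitted in slot t, the count restarts from input 3 in slot t + 1, and it restarts from input 1 in slot t + 2.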
4. Performance analysis and results
Fig. 11. Loss probability for the case R = NF; UNI distribution; N = 16, 256.
The performance of the multicast Shuffleout switch is now examined by only referring to the operation of the Replication Network. The performance measure of our interest is, as usual in ATM networks, the packet loss probability, π. Cells can be lost in the interconnection network due to the finite number of stages K and, in the output port controller, owing to the concentration operation with ratio K : C and to the finite output queue capacity.
We disregard here the study of the OPC loss, since it can be easily evaluated by means of straightforward combinatorial analysis for the concentrator and by a Geom(C)/D/1/$B_o$ queue model for the output queue, where $B_o$ is the output queue capacity (see, e.g., [10]). The loss in the interconnection network is here evaluated through computer simulation. The use of such a tool sets a constraint on the system that can be studied: in fact, simulating the set-up and clear-down of the multicast virtual connections would require too long a simulation time. Therefore, the multicast switch will be evaluated by assuming that each multicast connection carries only one cell, so that each multicast cell requests a new set of f outlets.
Fig. 13. Comparison of the loss performance for the two fanout distributions.
Fig. 14. As Fig. 9 for R = N.
Two cases have been considered for the fanout distribution of the multicast cells: fixed fanout (FIX) with replication factor f = F, and fanout with a random replication factor f in the interval [1, F] with uniform distribution (UNI). The f switch outlets of each multicast cell are selected randomly out of the N switch outlets. The load p offered to each switch inlet (0 < p ≤ 1) is chosen in such a way that the carried load ρ of a lossless switch never exceeds the value ρ = 1 per output. Therefore the arrival events of multicast cells at each switch inlet, which are assumed to be mutually independent, are described by a Bernoulli source which generates a multicast cell in a slot with probability $p \le p_{\max}$, where
$$
p_{\max} =
\begin{cases}
\dfrac{1}{F} & \text{FIX}, \\[2mm]
\dfrac{2}{F+1} & \text{UNI}.
\end{cases}
$$
A steady-state condition is assumed to have been reached by the simulated system and the cell loss probability is given by the ratio between lost cells (those that have not reached the addressed OPC within stage K) and offered cells in the steady-state observation time of T slots. Therefore
$$
\pi = \frac{\displaystyle\sum_{i=0}^{N-1}\sum_{j=1}^{T} f_{ij} - \sum_{i=0}^{N-1}\sum_{j=1}^{T} y_{ij}}{\displaystyle\sum_{i=0}^{N-1}\sum_{j=1}^{T} f_{ij}},
$$
where $f_{ij}$ is the fanout requested by the multicast cell received at inlet i in slot j ($f_{ij} = 0$ if no cell is received), and $y_{ij} = 1$ if the switch outlet i transmits a cell in slot j, $y_{ij} = 0$ otherwise.
Fig. 15. As Fig. 10 for R = N.
Fig. 16. As Fig. 11 for R = N.
The performance results that we are going to describe explicitly assume the worst case loading of the multicast switch, that is $p = p_{\max}$. Figs. 9-12 show the loss probability for the case R = NF, that is the basic architecture without running adder. With the FIX distribution (Figs. 9 and 10), the number of stages K of a unicast switch (F = 1) must be increased roughly by only 25% if $F = \sqrt{N}$ so as to provide the same loss performance, e.g. $\pi = 10^{-6}$. Many more stages are required when the fanout value becomes closer to the maximum fanout F = N: the required number of stages is still acceptable with small N (K increases by 50% for N = 16), whereas it grows substantially for large networks (K increases 2-3 times for N = 256). On the other hand, given a fanout F (e.g., F = 4, 16), K grows almost linearly as the network size is doubled. Qualitatively similar results are given by the UNI distribution (Figs. 11 and 12). Fig. 13 compares the loss performance for the two fanout distributions assuming that the fixed fanout value F equals the maximum fanout of the distribution UNI. It is interesting to observe that the distribution FIX, although requesting a larger average number of copies, gives almost the same loss values as the distribution UNI for the same K unless the fanout is larger than $\sqrt{N}$.
Figs. 14-17 show the same loss figures as Figs. 9-12, now referred to the case R = N, that is for the
enhanced multicast switch using input queues and a running adder so that the cumulative number of copies requested in a slot never exceeds the switch size N. It is observed that, given a network size N, increasing the fanout from F = 1 to F = N requires the number of stages to grow by at most 25% so that a given loss probability is maintained. Again, K grows almost linearly to provide the same loss as the network size doubles. These considerations apply both to distribution FIX (Figs. 14 and 15) and to distribution UNI (Figs. 16 and 17): in fact, if we compare Fig. 14 (Fig. 15) with Fig. 16 (Fig. 17), we see that both distributions provide rather comparable loss figures.
Fig. 17. As Fig. 12 for R = N.
Fig. 18. Average delay in the input queues.
The running adder implies the necessity of input queues, as shown in Fig. 8; some packets will therefore experience a delay in such queues. We define this delay as the average time (measured in time slots) elapsed between the packet arrival at the input queue and its complete transmission to the interconnection network (a packet may be transmitted in more than one time slot). In Fig. 18 we plot the average delay in the input queues, with reference to the same configuration used for the packet loss plots in Fig. 16. As expected, the delay is rather small, and it increases as the fanout of the packets grows. In the case of F = 1 the delay is always zero; this occurs because, if the fanout is 1, all the input queues can always be served at each time slot. Also when F > 1 the delay is small, because the actual load of the input queues is the output load p divided by the average packet fanout, that is, the input queue load is small. Fig. 19 compares the loss probability for the basic (R = NF) and enhanced (R = N) architecture with the UNI distribution. Consistently with the previous considerations, the two multicast switches give almost the same loss performance as long as our rule of thumb applies, that is when the fanout value is such that F ≤ √N. If larger fanouts are required, the
enhanced architecture is to be preferred. Therefore, the multicast Shuffleout switch can be well engineered without input queues and running adder network if the fanout per multicast cell can be limited a priori. Finally, Fig. 20 shows how the average physical load p_K on each interstage link at stage K varies in a switch with N = 256 and different fanouts under the FIX fanout distribution. In general the load increases starting from stage 1 (recall that stage 1 receives a load p_1 = p = 1/F_max), since the replication process starts operating. Quite rapidly the load reaches a value very close to 1, since in the first stages the number of cells generated by the replication process is much larger than the number of cells entering the output queues. After reaching its maximum, the physical load decreases, since more replicated cells enter the OPC and thus unload the interstage links. The adoption of the limiting device in the replication factor in the enhanced switch affects the results only for large fanout values.
Fig. 19. Comparison of the loss probability for the basic (R = NF) and enhanced (R = N) architecture with the UNI distribution.
Fig. 20. Variation of average physical load p_K on each interstage link at stage K in a switch with N = 256 and different fanouts under the FIX fanout distribution.
5. Comparison with other copy networks
The network proposed in this paper can be classified as a copy network for multicast ATM switches. In the literature several different architectures have been proposed to accomplish this function; in this section the main differences and similarities between the Shuffleout copy network and other copy networks are outlined. Firstly, it is worth pointing out that the proposed architecture offers its maximum advantage if employed as a copy network for the Shuffleout switching fabric; the architecture of the proposed replication network coincides with the network architecture
of point-to-point Shuffleout (the only difference is some enhanced functionality of the SEs). This implies that, with a careful design, a full multicast ATM switch can be designed and manufactured with the same hardware solutions needed for a point-to-point Shuffleout switch. This is a significant advantage, and it is likely that the best copy network for a Shuffleout switch is the one proposed here. The proposed network can also be used as a copy network for a general point-to-point ATM switch. In this case the cited architectural advantage no longer holds and the choice must be made on the basis of a performance/cost tradeoff. As far as performance is concerned, the proposed architecture scores very well: even under maximum load the cell loss probability can be made arbitrarily low by suitably selecting the number of stages. Copy networks lacking this feature are proposed in [7] (a Banyan-based copy network) and [11]; in these cases the cell loss probability grows up to 1 as the offered load approaches its maximum value, while the delay experienced by the packets grows without bound. In the case proposed here the maximum cell loss rate can be made arbitrarily low, and the cell delay in the case of no input buffers is fixed and equal to the time required by a cell to cut through the network. As shown in Fig. 18, even when a running adder is employed the delay in the input queues is always negligible. It is more difficult to compare the cost/complexity of the cited copy networks. A simple SE count is misleading because, for example, in the case of [7] the SEs are buffered while in the case proposed here the SEs are unbuffered. A comparison is difficult since it is hard to evaluate how many unbuffered SEs are worth one buffered SE. Moreover, the cost of the interconnections between the SEs is to be considered. 
The architecture proposed in this paper is likely to have a high interconnection cost but, again, it is hard to evaluate this cost and to carry out a comparison between SE costs and interconnection costs.
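The running adder referred to above keeps the cumulative number of copies admitted in a slot within the switch size N, and a packet may need more than one slot to be fully transmitted. The following Python fragment is only a minimal sketch of that admission step, not the authors' implementation: the function names, the fixed inlet scan order, and the copy-by-copy splitting of a cell across slots are all assumptions made for illustration.

```python
def running_adder_grants(requested, capacity):
    """Copies granted to each inlet in one slot.

    requested[i] is the number of copies still requested by the
    head-of-line cell at inlet i (0 if the queue is empty).
    Assumption: inlets are scanned in index order, and a cell may be
    granted only part of its copies when the running sum hits capacity.
    """
    grants = []
    total = 0  # running sum of copies granted so far in this slot
    for copies in requested:
        grant = min(copies, capacity - total)
        grants.append(grant)
        total += grant
    return grants


def slots_to_drain(fanouts, capacity):
    """Slots needed to transmit one head-of-line cell per inlet."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    remaining = list(fanouts)
    slots = 0
    while any(remaining):
        grants = running_adder_grants(remaining, capacity)
        remaining = [r - g for r, g in zip(remaining, grants)]
        slots += 1
    return slots
```

With unit fanout every inlet is served in the same slot (for example, eight cells with fanout 1 and capacity 8 are all granted at once), which is consistent with the zero delay reported for F = 1. A real design would also have to address fairness among inlets, which this sketch ignores.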
6. Conclusions
A new architecture of a multicast ATM switch has been proposed that uses the same hardware as the ATM Shuffleout switch. Its Replication Network can well be merged with the Routing Network by suitably upgrading the functionality of the switching elements. Tables in multicast channel translators can be kept within a reasonable size, since each generated copy will be served by a specific output port controller. The performance evaluation of the switch has shown that, compared to a unicast switch, only a moderate increase in the number of network stages is required if the fanout F is not larger than √N.
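The fanout bound in the rule of thumb (read here as F ≤ √N) and the relation between input-queue load and output load used in the delay discussion can be checked numerically. The Python fragment below is a hedged sketch: the function names are hypothetical, and the UNI distribution is assumed to be uniform over the integers 1..F, which the text does not state explicitly.

```python
def within_rule_of_thumb(fanout, n):
    """True if fanout <= sqrt(n), tested with integer arithmetic."""
    return fanout * fanout <= n


def mean_fanout_uni(max_fanout):
    """Mean fanout, ASSUMING UNI is uniform on the integers 1..max_fanout."""
    return (max_fanout + 1) / 2


def input_queue_load(output_load, mean_fanout):
    """Input-queue load: output load divided by the average packet fanout."""
    return output_load / mean_fanout
```

For instance, with N = 256 the bound is met by F = 16 but not by F = 17. Under the uniform assumption a maximum fanout of 16 gives a mean fanout of 8.5, so even a fully loaded output side corresponds to an input-queue load of roughly 0.12.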
References
[1] J. Turner, Design of a broadcast packet network, Proc. INFOCOM 86, Miami, FL, pp. 667-675.
[2] T.T. Lee, Non-blocking copy networks for multicast packet switching, Proc. Int. Zurich Sem. on Digital Commun., Zurich, Switzerland, March 1988, pp. 221-229.
[3] T.-H. Lee, S.-J. Liu, A fair high-speed copy network for multicast packet switch, Proc. INFOCOM 92, Florence, Italy, May 1992, pp. 886-894.
[4] T.T. Lee, R. Boorstyn, E. Arthurs, The architecture of a multicast broadband packet switch, Proc. INFOCOM 88, New Orleans, LA, March 1988, pp. 1-8.
[5] W. De Zhong, S. Shimamoto, Y. Onozato, J. Kaniyil, A recursive copy network for a large multicast switch, Proc. Int. Switching Symp., Yokohama, Japan, Vol. 2, Oct. 1992, pp. 161-165.
[6] R.P. Bianchini, H.S. Kim, Design of a non-blocking shared-memory copy network for ATM, Int. J. Digital Analog Commun. Syst. 6 (1) (1993) 39-48.
[7] W. De Zhong, Y. Onozato, J. Kaniyil, A copy network with shared buffers for large-scale multicast ATM switching, IEEE/ACM Trans. Networking 1 (2) (1993) 157-165.
[8] S. Urushidani, S. Hino, K. Yamasaki, K. Yukimatsu, A high performance multicast switch for broadband ISDN, Proc. Int. Switching Symp., Yokohama, Japan, 1992, Vol. 2, pp. 171-175.
[9] S. Urushidani, A high performance self-routing switch for broadband ISDN, IEEE J. Selected Areas Commun. 9 (8) (1991) 1194-1204.
[10] S. Bassi, M. Decina, P. Giacomazzi, A. Pattavina, Multistage shuffle networks with shortest-path and deflection routing for high performance ATM switching: the Open-Loop Shuffleout, IEEE Trans. Commun. 42 (10) (1994) 2889-2897.
[11] Taeck-Geun Kwon, Choong-Kyo Jeong, A simple, extensible ATM switch with load-balanced rounding copy network, Proc. ICC, 1995, pp. 1122-1126.