Computer Networks and ISDN Systems 26 (1994) 1559-1580 (Elsevier)
Performance analysis of packet switches with input and output buffers
Youn Chan Jung, Chong Kwan Un *
Communication Research Laboratory, Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, 373-1 Kusong-dong Yusong-gu, Taejon 305-701, South Korea
Received 30 August 1993; accepted 20 October 1993
Abstract
We develop an analytical model of a nonblocking packet switch with input and output queues that is able to transfer up to L packets per slot to a given switch output. It is modeled as a finite input and output queueing system with 1 < L ~
1. Introduction
ATM (Asynchronous Transfer Mode) is considered to be the most promising transfer technology for implementing B-ISDN [1-3]. The design of switches that can meet the requirements of both large throughput and a variety of services is a problem that must be solved before B-ISDN can be fully developed. In this paper we study a nonblocking packet switch (NPS) fabric with N input links and N output links, the service capacity of which is defined by the maximum number of packets, say L, which can be simultaneously routed from multiple input links to each output link, as shown in Fig. 1. We investigate the effect of L and the input/output buffer sizes, which are key parameters in ATM switch implementation, on the packet loss performance. We focus mainly on the packet loss of finite input/output buffer queueing systems with 1 ~
* Corresponding author. Fax: +42-869-3410. 0169-7552/94/$07.00 © 1994 - Elsevier Science B.V. All rights reserved SSDI 0169-7552(93)E0097-X
Y.C. Jung, C.K. Un / Computer Networks and ISDN Systems 26 (1994) 1559-1580
Therefore, we use a simple packet arrival process, neglecting correlations in packet arrivals, although considering such correlations in ATM traffic is important for obtaining the performance measures that play key roles in traffic characterization and admission control. In most past work, researchers considered switches of infinite size. In our work, however, we investigate packet loss for a switch of finite size, so that the dependency on the switch size can be taken into account in the analytical model. The assumption of finite switch size may appear to complicate the analysis. However, the algorithmic approach by matrix manipulation has a significant advantage: a model with a large number of states remains explicitly computable, requiring only serial computation with an element matrix of size (N - L + 1) × (N - L + 1). We also analyze the input and output queueing system with 1 < L < 2, which has rarely been studied. The behavior of the buffer queues may be characterized in terms of the speed-up factor L of the contention resolution mechanism [4]. In an input and output queueing system with speed-up factor L, the required packet loss/delay cannot be obtained if the external load is larger than the maximum throughput for that L, plotted as a solid line in Fig. 2. The maximum throughput depends primarily on the speed-up factor L, which is also proportional to the complexity of switch realization. In an input buffered switch (i.e., L = 1), the bandwidth of the switch fabric need only be slightly greater than that of the port. The input buffer is simple to implement, but suffers from head-of-line (HOL) blocking, which significantly reduces the throughput of the switch [5]. Recently, multichannel switching with L = 1 has been considered in Refs. [6,7]; it improves the switch throughput remarkably without output buffers.
The saturation throughput can be improved up to about 0.88 with a uniform channel group size of sixteen (G = 16) for multichannel switching. However, an NG × NG switching fabric is required for the N × N switch. Also, the multichannel switch cannot preserve the first-in first-out packet sequence. When L = N, the switch requires buffers only at the output ports. From the viewpoint of performance, an output buffered switch may be considered the best solution. Implementation of an output buffer has been considered for the crosspoint switch in [8], and for the Banyan switch in [9]. These proposals demonstrate that an output buffering scheme that achieves a suitably low packet loss probability requires a very large amount of hardware. This complexity problem can be solved to some degree by the use of wavelength division multiplexing (WDM) in the switch, as shown in Refs. [10-12]. However, this architecture requires N^2 tunable receivers, large output buffers and a complex electronic controller to supply the tuning information to the output port receivers for wavelength addressing.
Fig. 1. Parallel-plane switch with input and output queueing with L = 2. (Labels: input buffers, nonblocking switch fabric with switching planes, controller, output buffers.)
The above problems can be overcome by having buffers at both input and output of the switch (i.e., 2 ~
Fig. 2. Maximum throughputs for different speed-up factors. (Labels: input queue system, input and output queue system, output queue system; simplicity increase; performance increase.)
• From the implementation viewpoint, it is advantageous that the size of buffers at the output ports is made the same as buffers at the input ports, while having the input/output performance within the desirable limits. It is interesting to note that, due to the minimal signal connection from the output buffers to the input controller, the output buffers can be a part of a line interface like the input buffers. In this paper, we evaluate the performance of a switch with input and output queues that is able to transfer up to L (2 ~
2. Performance analysis
2.1. Analytical model of L-server input queueing system
In this section, we analyze an L-server input queueing system of the NPS having an N × N configuration. Let us consider the packets destined for output port j. Fig. 3 shows the input queueing model based on the two-dimensional approach by the tagged input queue and the jth HOL queue. Here the jth HOL queue (a logical queue) represents the number of packets at the heads of all input queues destined to the jth output. Also, without loss of generality, the ith input queue is considered as the tagged input queue.
Fig. 3. L-server input queueing model. (Labels: input buffers 1, 2, ..., i, ..., N; output buffers 1, 2, ..., j, ..., N; Q: number of cells at the ith input queue; H: number of cells at the jth HOL queue; A: arrival to the jth HOL queue.)
Recently, this type of queueing system has been analyzed under the assumption that N goes to infinity [14]. It is shown in [14] that as N → ∞, the packet arrival at the jth HOL queue becomes a Poisson process, which makes the input queueing model easy to analyze. Unlike the earlier works, our input queueing model presented in this subsection is based on a finite switch size N. In our analysis, we assume the following:
• The switch operates synchronously with fixed-length packets.
• Independent, statistically identical traffic arrives at each input. In any given time slot, the probability that a packet will arrive at a particular input is P (i.e., the arrival process is Bernoulli).
• Each packet is directed uniformly over all of the switch outputs, and successive packets are independent.
• Each input has a finite buffer capacity. The number of buffers per input port is denoted as b_i. The queueing discipline is FIFO.
• In the case of contention for an arbitrary destination port j, new packets destined to output j arrive at the jth HOL queue where they contend for transmission with the packets blocked from the last contention. L(1 ~
(1)
and are ordered in the alphabetic order. The set of states {(l, 0), (l, 1), (l, 2), ..., (l, N - L)}, 1 ~
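$$
a_{k|w} =
\begin{cases}
\dbinom{N-1}{k-1}\left(\dfrac{1-P_0}{N}\right)^{k-1}\left(1-\dfrac{1-P_0}{N}\right)^{N-k}, & \text{for } w = 0,\ k = 1, 2, 3, \ldots, N, \\[2ex]
\dbinom{N-w}{k}\left(\dfrac{1-P_0}{N}\right)^{k}\left(1-\dfrac{1-P_0}{N}\right)^{N-w-k}, & \text{for } 1 \le w \le N-L,\ k = 0, 1, 2, \ldots, N-w,
\end{cases}
\tag{2}
$$

where $P_0$ is the probability that a tagged input queue has no packet for transmission to any output port in a time slot.

$$
P_{suc,k|w} =
\begin{cases}
a_{k|w} \cdot \min\left(1, \dfrac{L}{k}\right), & \text{for } w = 0,\ k = 1, 2, 3, \ldots, N, \\[2ex]
a_{k|w} \cdot \min\left(1, \dfrac{L}{k+w}\right), & \text{for } 1 \le w \le N-L,\ k = 0, 1, 2, \ldots, N-w,
\end{cases}
\tag{3}
$$

$$
P_{suc|w} =
\begin{cases}
\displaystyle\sum_{k=1}^{N} a_{k|w} \cdot \min\left(1, \dfrac{L}{k}\right), & \text{for } w = 0, \\[2ex]
\displaystyle\sum_{k=0}^{N-w} a_{k|w} \cdot \min\left(1, \dfrac{L}{k+w}\right), & \text{for } 1 \le w \le N-L,
\end{cases}
\tag{4}
$$

$$
P_{blo,k|w} =
\begin{cases}
a_{k|w} \cdot \dfrac{\max((k-L), 0)}{k}, & \text{for } w = 0,\ k = 1, 2, 3, \ldots, N, \\[2ex]
a_{k|w} \cdot \dfrac{\max((k+w-L), 0)}{k+w}, & \text{for } 1 \le w \le N-L,\ k = 0, 1, 2, \ldots, N-w.
\end{cases}
\tag{5}
$$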
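The contention probabilities can be evaluated directly for a small switch. The following is a minimal sketch, assuming that a_{k|w} has the binomial form over the inputs not already counted (N - 1 other inputs when w = 0, where k includes the tagged packet; N - w inputs when w ≥ 1). The function names `arrival_pmf` and `contention_probs` are hypothetical and are not part of the paper.

```python
from math import comb

def arrival_pmf(N, P0, w):
    """a_{k|w} as a dict keyed by k (assumed binomial form)."""
    p = (1.0 - P0) / N  # chance that one input's HOL packet targets output j
    if w == 0:
        # k counts the tagged packet plus k - 1 packets from the other N - 1 inputs
        return {k: comb(N - 1, k - 1) * p ** (k - 1) * (1 - p) ** (N - k)
                for k in range(1, N + 1)}
    # w packets are already blocked, so only N - w inputs can bring new packets
    return {k: comb(N - w, k) * p ** k * (1 - p) ** (N - w - k)
            for k in range(0, N - w + 1)}

def contention_probs(N, L, P0, w):
    """Return (P_suc,k|w, P_blo,k|w) as two dicts keyed by k."""
    suc, blo = {}, {}
    for k, a in arrival_pmf(N, P0, w).items():
        n = k + w                      # packets contending for output j in this slot
        suc[k] = a * min(1.0, L / n)   # tagged packet is among the L winners
        blo[k] = a * max(n - L, 0) / n # tagged packet is among the n - L losers
    return suc, blo

# For every w, success and blocking probabilities together sum to one.
for w in range(0, 8 - 2 + 1):
    suc, blo = contention_probs(N=8, L=2, P0=0.2, w=w)
    print(w, round(sum(suc.values()) + sum(blo.values()), 12))
```

The printed totals give a quick normalization check: for each w, the success and blocking probabilities should add up to one, which is what makes the rows of the transition matrix sum to one.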
The transitions from a current state to the next states may be classified into the following three types:
Type I. No packet arrival and a successful contention.
Type II. No packet arrival and a failing contention, or a packet arrival and a successful contention.
Type III. A packet arrival and a failing contention.
Fig. 4. State transition diagram of the Markov chain T for the case of N = 8, L = 2.
For example, Type III is the transition corresponding to the case that a packet arrives at the tagged input queue (not HOQ) and the packet at the tagged HOQ is blocked for transmission in the same slot. For Type III with the current state $(s_c, w_c) = (s, 4)$ and L = 2 as in Fig. 4, the next state becomes $(\min(s + 1, b_i), 4 + A - L) = (s + 1, 4)$ if two packets newly arrive at the jth HOL queue (i.e., $A_m = 2$) and the packet at the tagged HOQ is blocked. The state transition probability for this case is $P \cdot P_{blo,2|4}$. All state transition probabilities in adjacent slots are then given in Table 1, where $\delta(n)$ is defined as 1 or 0, depending on whether n equals 0 or not. The transition matrix T of the Markov chain is defined by

$$
T = \left[\, t_{s_c,w_c;\,s_s,w_s} \right],
\tag{6}
$$

where $\sum_{\forall (s_s, w_s)} t_{s_c,w_c;\,s_s,w_s} = 1$.

Table 1. State transitions of the L-server input queueing system

| Current state $(s_c, w_c)$ | Next state(s) $(s_s, w_s)$ | Transition prob. $t_{s_c,w_c;s_s,w_s}$ | Type (No.) |
|---|---|---|---|
| $(0, 0)$ | $(0, 0)$ | $P \cdot P_{suc \mid w} + (1 - P)$ | I (II) |
| | $(1, k - L)$, $k = L+1, L+2, \ldots, N$ | $P \cdot P_{blo,k \mid w}$ | III |
| $(s, w)$; $s = 1, 2, \ldots, b_i$; $w = 0, 1, \ldots, N - L$ | $(s - 1, 0)$ | $(1 - P) \cdot P_{suc \mid w}$ | I |
| | $(s - \delta(s - b_i), 0)$ | $P \cdot P_{suc \mid w}$ | II (I) |
| | $(s, w + k - L)$, $k = \max(0, L - w + 1), \ldots, N - w$ | $(1 - P) \cdot P_{blo,k \mid w}$ | II |
| | $(\min(s + 1, b_i), w + k - L)$, $k = \max(0, L - w + 1), \ldots, N - w$ | $P \cdot P_{blo,k \mid w}$ | III (II) |

Notice that this model has the structure of a Quasi-Birth-Death type where the matrix T is of block-partitioned form, given by

$$
T =
\begin{bmatrix}
A^{*}_{II} & A^{*}_{III} & 0 & 0 & \cdots & 0 & 0 \\
A^{*}_{I} & A_{II} & A_{III} & 0 & \cdots & 0 & 0 \\
0 & A_{I} & A_{II} & A_{III} & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & A_{I} & A_{II} & A_{III} \\
0 & 0 & 0 & \cdots & 0 & A'_{I} & A'_{II}
\end{bmatrix},
\tag{7}
$$

in which

$$ A_{I} = (1 - P)\,C, \tag{8} $$
$$ A_{II} = P\,C + (1 - P)\,B, \tag{9} $$
$$ A_{III} = P\,B, \tag{10} $$
$$ A'_{I} = (1 - P)\,C + P\,C = C, \tag{11} $$
$$ A'_{II} = (1 - P)\,B + P\,B = B, \tag{12} $$

with C and B being of the forms

$$
C =
\begin{bmatrix}
P_{suc|0} & 0 & \cdots & 0 \\
P_{suc|1} & 0 & \cdots & 0 \\
P_{suc|2} & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
P_{suc|N-L} & 0 & \cdots & 0
\end{bmatrix},
\tag{13}
$$

$$
B =
\begin{bmatrix}
0 & P_{blo,L+1|0} & \cdots & P_{blo,N|0} \\
0 & P_{blo,L|1} & \cdots & P_{blo,N-1|1} \\
\vdots & \vdots & & \vdots \\
0 & \cdots & \cdots & P_{blo,L|N-L}
\end{bmatrix}.
\tag{14}
$$

Here, each matrix element of T is a square matrix of order (N - L + 1) except for $A^{*}_{I}$, $A^{*}_{II}$, and $A^{*}_{III}$, which are given by

$$ A^{*}_{I} = (1 - P)\,C^{c}, \tag{15} $$
$$ A^{*}_{II} = (1 - P) + P\,C_0, \tag{16} $$
$$ A^{*}_{III} = P\,B^{r}, \tag{17} $$

where $C_0 = P_{suc|0}$, $C^{c} = Ce$ (e is a column vector with all its components equal to one), and $B^{r}$ is defined by

$$ B^{r} = (0,\ P_{blo,L+1|0},\ P_{blo,L+2|0},\ \ldots,\ P_{blo,N|0}). \tag{18} $$
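The three transition types and the next-state rules of Table 1 can be written out as a small helper, which also reproduces the Type III example with (s, 4) and L = 2. This is only an illustrative sketch for states with s ≥ 1; the function names `classify` and `next_state` are hypothetical.

```python
def classify(arrival, success):
    """Transition type (I, II or III) from the arrival and contention outcome."""
    if not arrival and success:
        return "I"
    if arrival and not success:
        return "III"
    return "II"

def next_state(s, w, k, arrival, success, L, b_i):
    """Next (queue length, blocked packets) for a state with s >= 1, following Table 1.

    k is the number of packets newly arriving at the jth HOL queue.
    """
    if success:
        # The tagged HOL packet leaves; Table 1 resets the blocked count to 0.
        # An arrival refills the queue unless it is already full (s = b_i).
        return (s - 1 + (1 if arrival and s < b_i else 0), 0)
    blocked = w + k - L  # contenders left over after L of them are served
    return (min(s + 1, b_i) if arrival else s, blocked)

# Worked example from the text: state (s, 4), L = 2, two new HOL arrivals,
# a packet arrives at the tagged queue and the tagged HOL packet is blocked.
print(classify(arrival=True, success=False))
print(next_state(s=3, w=4, k=2, arrival=True, success=False, L=2, b_i=10))
```

With s = 3 the example moves to (4, 4), that is (s + 1, 4), matching the Type III transition described for Fig. 4.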
Let $\Pi = (\Pi_0, \Pi_1, \ldots, \Pi_l, \ldots, \Pi_{b_i})$ be the steady-state probability vector, where each element $\Pi_l$ is a row vector $(\Pi_{l0}, \Pi_{l1}, \ldots, \Pi_{l,N-L})$, except for $\Pi_0$, which is a scalar. By definition, $\Pi_{l0}$ is the steady-state probability that the tagged input queue length and the number of blocked packets are equal to l and 0, respectively. That is, $\Pi_{l0} = \lim_{m \to \infty} \mathrm{Prob}(Q_m = l, H_m = 0)$. The steady-state equations are given by

$$ \Pi_0\big((1 - P) + P\,C_0\big) + \Pi_1 (1 - P)\,C^{c} = \Pi_0, \tag{19} $$
$$ \Pi_0 P\,B^{r} + \Pi_1\big(P\,C + (1 - P)\,B\big) + \Pi_2 (1 - P)\,C = \Pi_1, \tag{20} $$
$$ \Pi_{i-1} P\,B + \Pi_i\big(P\,C + (1 - P)\,B\big) + \Pi_{i+1} (1 - P)\,C = \Pi_i, \quad \text{for } b_i - 1 > i \ge 2, \tag{21} $$
$$ \Pi_{b_i-2} P\,B + \Pi_{b_i-1}\big(P\,C + (1 - P)\,B\big) + \Pi_{b_i} C = \Pi_{b_i-1}, \tag{22} $$
$$ \Pi_{b_i-1} P\,B + \Pi_{b_i} B = \Pi_{b_i}, \tag{23} $$

where

$$ C_0 + B^{r} e = 1, \tag{24} $$
$$ Ce\,(= C^{c}) + Be\,(= B^{c}) = e. \tag{25} $$
Following the method for obtaining the matrix-geometric solution [18], exact matrix manipulations, shown in the Appendix, lead to an explicit solution given by

$$
\Pi_l =
\begin{cases}
\Pi_0 B^{r} R\,(BR)^{l-1}, & \text{if } 1 \le l \le b_i - 1, \\[1ex]
P\,\Pi_0 B^{r} R\,(BR)^{b_i-2} B\,(I - B)^{-1}, & \text{if } l = b_i,
\end{cases}
\tag{26}
$$

where

$$ R = P\,\big[I - P\bar{I} - (1 - P)B\big]^{-1}, \tag{27} $$

$$ \Pi_0 = \left[\, 1 + B^{r} R \sum_{l=1}^{b_i-1} (BR)^{l-1} e + P\,B^{r} R\,(BR)^{b_i-2} B\,(I - B)^{-1} e \right]^{-1}. \tag{28} $$

Here, $I$ is an identity matrix and $\bar{I}$ is $ee'$, where $e'$ is the row vector $[1, 0, 0, \ldots, 0]$. By using this algorithmic approach, numerical results of the matrix equations can be obtained after some computation. Otherwise, obtaining results numerically is often difficult and risky for a model such as ours, which has a fairly large number of states because of the many parameters considered. Recall that $P_0$ must already be known to solve the above equations. Notice that the following equation is satisfied:

$$ P_0 = (1 - P)\,\Pi_0. \tag{29} $$

This naturally suggests an iterative solution. Starting with $\Pi_0 = 0$, which corresponds to the case of saturated queues, one can compute the next value of $\Pi_0$ from the given $P_0$. This iteration continues until both $P_0$ and $\Pi_0$ converge, leading to the exact values of $\Pi$. We can then derive the marginal distributions of Q and H. Denoting by $q_l$ the probability that the tagged input queue length is equal to l, and by $h_w$ the probability that the number of blocked packets at the jth HOL queue is equal to w, we get

$$ q_l = \Pi_l e, \tag{30} $$

$$
h_w =
\begin{cases}
\displaystyle\sum_{s=0}^{b_i} \Pi_{sw}, & \text{for } w = 0, \\[2ex]
\displaystyle\sum_{s=1}^{b_i} \Pi_{sw}, & \text{for } 1 \le w \le N - L.
\end{cases}
\tag{31}
$$

Performance parameters, such as the mean queue length $\bar{Q}$ and the packet loss probability $P_{loss}$, can be obtained from the set of steady-state probabilities computed above. That is,

$$ \bar{Q} = \sum_{l=0}^{b_i} l\,q_l, \tag{32} $$

$$ P_{loss} = q_{b_i}. \tag{33} $$
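The iteration between P_0 and Π_0 can be sketched for a small switch as follows. This is not the matrix-geometric procedure of the Appendix: as a simpler stand-in, the sketch builds the chain from the transitions of Table 1 and finds the steady state by plain power iteration, which is practical only for small N and b_i. It also assumes the binomial form of a_{k|w}; the names `transitions` and `solve` and the parameters `sweeps` and `tol` are hypothetical.

```python
from math import comb

def transitions(s, w, N, L, P, P0, b_i):
    """Outgoing transition probabilities of state (s, w), following Table 1."""
    p = (1.0 - P0) / N
    if w == 0:  # k counts the tagged packet itself
        a = {k: comb(N - 1, k - 1) * p ** (k - 1) * (1 - p) ** (N - k)
             for k in range(1, N + 1)}
    else:
        a = {k: comb(N - w, k) * p ** k * (1 - p) ** (N - w - k)
             for k in range(N - w + 1)}
    out = {}

    def add(state, prob):
        if prob > 0.0:
            out[state] = out.get(state, 0.0) + prob

    if s == 0:
        add((0, 0), 1.0 - P)  # empty queue and no arrival
    for k, ak in a.items():
        n = k + w
        suc, blo = ak * min(1.0, L / n), ak * max(n - L, 0) / n
        if s == 0:  # a packet arrives (prob. P) and contends in the same slot
            add((0, 0), P * suc)
            add((1, n - L), P * blo)
        else:
            add((s - 1, 0), (1 - P) * suc)
            add((s - 1 if s == b_i else s, 0), P * suc)
            add((s, n - L), (1 - P) * blo)
            add((min(s + 1, b_i), n - L), P * blo)
    return out

def solve(N, L, P, b_i, sweeps=400, tol=1e-10):
    """Fixed-point iteration on P0 = (1 - P) * Pi_0; returns (pi, q)."""
    states = [(0, 0)] + [(s, w) for s in range(1, b_i + 1) for w in range(N - L + 1)]
    pi = {st: 1.0 / len(states) for st in states}
    pi0 = 0.0  # start from the saturated case
    for _outer in range(200):
        P0 = (1.0 - P) * pi0  # eq. (29)
        T = {st: transitions(*st, N, L, P, P0, b_i) for st in states}
        for _sweep in range(sweeps):  # power iteration: pi <- pi T
            nxt = dict.fromkeys(states, 0.0)
            for st, row in T.items():
                for to, pr in row.items():
                    nxt[to] += pi[st] * pr
            pi = nxt
        if abs(pi[(0, 0)] - pi0) < tol:
            break
        pi0 = pi[(0, 0)]
    q = [sum(v for (s, _w), v in pi.items() if s == l) for l in range(b_i + 1)]
    return pi, q

pi, q = solve(N=4, L=2, P=0.5, b_i=4)
print("P_loss = q_{b_i} =", q[-1])
print("mean queue length =", sum(l * ql for l, ql in enumerate(q)))
```

The last entry of `q` is the loss probability of eq. (33), and the weighted sum is the mean queue length of eq. (32). For the switch sizes used in the paper (for example N = 32 or 64) the explicit solution (26) to (28) would be the appropriate tool; the sketch is meant only to make the fixed-point structure concrete.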
The saturation throughput has been obtained in several different works [13,15,16]. Although our model is a packet loss model, it can play the role of the saturation model as P approaches 1. The maximum throughputs obtained numerically are shown in Table 2. These values were obtained with N = 64, b_i = 5, P = 1.0 for L = 2, 3, ..., 6. One can note that these values are in agreement with the results obtained in [13,15,16].

Table 2. Maximum throughput ($P_s$) for different L

| N | b_i | P | L = 2 | L = 3 | L = 4 | L = 5 | L = 6 | Reference |
|---|---|---|---|---|---|---|---|---|
| 64 | 5 | 1 | 0.8879 | 0.9774 | 0.9960 | 0.9994 | 0.9999 | This study |
| ∞ | ∞ | P ≥ P_s | 0.8845 | 0.9755 | 0.9956 | 0.9993 | 0.9999 | [13,15,16] |
Fig. 5. Packet loss probability as a function of b_i for different N with parameter values L = 2, P = 0.8. (Horizontal axis: input buffer size.)
The simulation results in [5] show that the saturation throughput largely depends on N. For example, when N = 2 and L = 1, it is 0.75, whereas when N = 8 and L = 1, it is 0.61. This result indicates that, as N increases, the saturation throughput decreases and converges to 0.586. Similarly, the change in packet loss performance with N can be observed from the results obtained from our model. Fig. 5 shows the packet loss probability as a function of b_i for several values of N with parameter values L = 2, P = 0.8.
Fig. 6. Packet loss performance as a function of the input buffer size b_i for different L with parameter values N = 32 and P = 0.8. (Horizontal axis: input buffer size.)
Fig. 7. Packet loss performance as a function of the input buffer size b_i for different L with parameter values N = 32 and P = 0.9. (Horizontal axis: input buffer size; curves for L = 2, 3, 4, 5 and 6.)
Figs. 6 and 7 show the packet loss performance as a function of the input buffer size b_i for L = 2, 3, 4, 5 and 6, with parameter values N = 32 and P = 0.8, and N = 32 and P = 0.9, respectively. As in the studies [13,15,16], one may observe that the packet loss performance improves significantly as L increases. Table 3 shows the input buffer size required to keep the packet loss below 10^-8 for P = 0.8 and 0.9 as a function of L. One can see that, to achieve a packet loss rate of 10^-8 at the input side with an input load of 80%, 17 buffers should be provided per input port for the input and output queueing switch with L = 2. This result indicates that, for L ≥ 2, the effect of HOL blocking is very small under normal input load conditions (e.g., P ≤ 0.8). Fig. 8 shows the packet loss performance against the traffic load P as a function of b_i for the two cases L = 2 and L = 3, both with N = 32. Assuming that the traffic load is 0.8, the minimum speed-up factor of L = 2 suffices to keep the packet loss probability within 10^-9 with input buffers of reasonable size. However, under a heavy load (e.g., P = 0.9), one must have at least L = 3 to satisfy the same requirement. This is because the maximum throughput with L = 2 is 0.88.
Table 3. Input buffer size required

| Input load | L = 1 | L = 2 | L = 3 | L = 4 | L = 5 | L = 6 |
|---|---|---|---|---|---|---|
| P = 0.8 | ∞ | 17 | 6 | 4 | 3 | 2 |
| P = 0.9 | ∞ | ∞ | 9 | 5 | 3 | 2 |

2.2. N : L concentrated output queueing system
The purpose of the following analysis of a discrete-time output queueing system with bulk arrivals is to quantify the intuition that, under the same load condition, a smaller value of L results in a lower packet loss rate at the finite output buffers. The maximum size of a batch arrival is L, which is controlled by the L-server input queueing system. This burst smoothing scheme, which throttles packet inputs into an arbitrary output buffer, can improve the performance of the output queue. Also, the N : L concentration from the N heads of the input buffers to each output buffer greatly facilitates the implementation of the output buffer [12]. We now consider the tagged output queue (the jth output queue) with batch arrivals (of size up to L). We derive the arrival process at the output buffers from the departure process of the L-server input queueing system. Let $O_m$ represent the number of packets transmitted to output port j during the mth time slot. To consider the arrival process of output queue j, which depends on the L-server input queueing, we need to compute the distribution of O. It can be derived from the throughput vector $\boldsymbol{\rho}$ of the tagged input queue, defined by

$$ \boldsymbol{\rho} = (\rho_1, \rho_2, \ldots, \rho_k, \ldots, \rho_L), \tag{34} $$

where $\rho_k$ is the partitioned throughput for the case that the size of an arriving packet group at output queue j is k, and the system throughput $\rho$ is $\sum_{k=1}^{L} \rho_k$. Here $\rho_k$ is given by

$$
\rho_k =
\begin{cases}
P \cdot \Pi_0 \cdot P_{suc,k|0} + \displaystyle\sum_{w=0}^{k} \hat{h}_w P_{suc,k-w|w}, & k = 1, 2, 3, \ldots, L - 1, \\[2ex]
\displaystyle\sum_{n=L}^{N} P \cdot \Pi_0 \cdot P_{suc,n|0} + \sum_{w=0}^{N-L} \ \sum_{n=\max(0, L-w)}^{N-w} \hat{h}_w P_{suc,n|w}, & k = L,
\end{cases}
\tag{35}
$$

where

$$
\hat{h}_w =
\begin{cases}
h_w - \Pi_0, & \text{for } w = 0, \\
h_w, & \text{for } 1 \le w \le N - L.
\end{cases}
\tag{36}
$$

Fig. 8. Packet loss performance against the traffic load P as a function of b_i for two cases of L = 2 and L = 3, both with parameter value N = 32. (Horizontal axis: traffic load P; curves for L = 2 with b_i = 10, 20, 30, 50 and L = 3 with b_i = 10.)
Fig. 9. Discrete-time Markov chain state transition diagram for the output queueing system. (S: output queue length; O: arrival at the jth output queue.)
The distribution of O is related to the vector $\boldsymbol{\rho}$, and is given by

$$ t_k \equiv \mathrm{Prob}(O = k) = \frac{\rho_k}{k}, \quad k = 1, 2, \ldots, L, \tag{37} $$

and the probability that no packet arrives at output j during a slot is

$$ t_0 = 1 - \sum_{k=1}^{L} t_k. \tag{38} $$
We assume that the output buffers can hold a maximum of $b_o$ packets per port, and denote by $S_m$ the number of packets in the tagged output queue at the end of the mth slot. The queue size $S_m$ is modeled by a discrete-time Markov chain E. It is assumed that when $S_{m-1} = 0$ and $O_m > 0$, one of the new packets is immediately transmitted during the mth time slot. Fig. 9 illustrates the state transition diagram for the N : L concentrated output queueing system. The one-step state transition probabilities for S are $p_{i;j} = \mathrm{Prob}(S_{m+1} = j \mid S_m = i)$. The transition matrix E of the Markov chain is defined by $E = [p_{i;j}]$, where $\sum_{\forall j} p_{i;j} = 1$, and all the elements of E can be obtained from Table 4. The steady-state probability vector $\phi = (\phi_0, \phi_1, \ldots, \phi_l, \ldots, \phi_{b_o})$, where by definition $\phi_l$ is the steady-state probability that the output queue length is equal to l, may be obtained by solving the following set of simultaneous linear equations:

$$
\phi_j = \sum_{i=0}^{b_o} \phi_i\, p_{i;j}, \quad j = 0, 1, 2, \ldots, b_o, \qquad \sum_{i=0}^{b_o} \phi_i = 1.
\tag{39}
$$
Table 4. State transitions of the output queueing system

| Current state $(s_c)$ | Next state(s) $(s_s)$ | Transition prob. $(s_c \to s_s)$ |
|---|---|---|
| $s$; $s = 0, 1, 2, \ldots, b_o$ | $\min(\max(0, s + k - 1), b_o)$; $k = 0, 1, 2, \ldots, L$ | $t_k$ $(= \Pr(O = k))$ |

In the same manner as in [17], we derive the packet loss rate. We consider the case where i packets arrive in a bulk arrival O and find j packets in the buffer. If $i + j > b_o + 1$, then $(i + j) - (b_o + 1)$ packets are lost due to buffer overflow. Thus, the probability that the test packet is lost is $[(i + j) - (b_o + 1)]/i$. Let $P_{loss|j}$ be the packet loss probability, given that there are j packets in the buffer upon the bulk arrival O. It is given by

$$
P_{loss|j} =
\begin{cases}
\displaystyle\sum_{i=b_o-j+2}^{L} \frac{\big[(i + j) - (b_o + 1)\big]\, t_i}{\rho}, & b_o - L + 2 \le j \le b_o, \\[2ex]
0, & \text{otherwise},
\end{cases}
\tag{40}
$$

where $\rho$ is the throughput at the tagged input queue. By removing the condition on the probability above, the packet loss probability $P_{loss}$ becomes

$$ P_{loss} = \sum_{j=b_o-L+1}^{b_o} \phi_j\, P_{loss|j}. \tag{41} $$
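The output queue model is small enough to evaluate directly. The sketch below builds the chain from the next-state rule min(max(0, s + k - 1), b_o), finds the steady state by power iteration, and then averages the conditional loss over the queue states. The batch distribution `t` is an illustrative input chosen by hand rather than one derived from the input queueing model, and the name `output_queue_loss` and the parameter `sweeps` are hypothetical.

```python
def output_queue_loss(t, b_o, sweeps=5000):
    """t[k] = Prob(O = k) for k = 0..L; returns (phi, P_loss)."""
    L = len(t) - 1
    rho = sum(k * tk for k, tk in enumerate(t))  # mean number of packets per slot
    # From state s, a batch of k packets leads to min(max(0, s + k - 1), b_o).
    E = [[0.0] * (b_o + 1) for _ in range(b_o + 1)]
    for s in range(b_o + 1):
        for k, tk in enumerate(t):
            E[s][min(max(0, s + k - 1), b_o)] += tk
    phi = [1.0 / (b_o + 1)] * (b_o + 1)
    for _ in range(sweeps):  # power iteration: phi <- phi E
        phi = [sum(phi[i] * E[i][j] for i in range(b_o + 1))
               for j in range(b_o + 1)]
    loss = 0.0
    for j in range(b_o + 1):
        # Only batches with i + j > b_o + 1 overflow the buffer.
        for i in range(b_o - j + 2, L + 1):
            loss += phi[j] * (i + j - (b_o + 1)) * t[i] / rho
    return phi, loss

t = [0.3, 0.6, 0.1]  # assumed batch distribution with L = 2 and load 0.8
for b_o in (5, 10, 20):
    phi, loss = output_queue_loss(t, b_o)
    print(b_o, loss)
```

A useful sanity check on such a computation is flow conservation: the carried load ρ(1 - P_loss) should equal the probability that a packet departs in a slot, which is 1 - φ_0 t_0 under the assumption that a packet arriving at an empty queue is transmitted immediately.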
Figs. 10 and 11 show the packet loss probability against the output buffer size b_o as a function of L at P = 0.8 and P = 0.9, respectively. The solid curves in both figures reveal that the packet loss performance becomes worse as L increases. That is, in order to obtain better performance at the output buffer, a small L is necessary. Fig. 12 shows the numbers of input and output buffers per port needed to keep the packet loss below 10^-8 at P = 0.80 as a function of L. As an example, b_o = 26, b_i = 17 and L = 2 are necessary for this requirement. One may also find that the size of the input buffers can be the same as that of the output buffers for the same requirement if the speed-up factor L is less than 2. Obtaining such a pair (L, b), with b_i = b_o = b, is possible by virtue of an adaptive operation of the speed-up factor L. This adaptive system with 1 < L < 2 can be obtained by alternating between operation with L = 1 and L = 2, controlled according to the input queue length. A known problem associated with output queueing switches (whether concentrated or not) is the requirement of speed-up operation at the switching fabric. The speed-up issue runs into technological constraints in the design of ATM switches with lines of several Gb/s. Fig. 13 shows the packet loss probability at the output buffers as a function of L for the case b_o = 26, b_i = 25. In this figure, one can see the burst smoothing effect: the smaller the value of L, the lower the packet loss probability at the output buffers. In practice, in addition to the performance, it is important to consider the hardware complexity, which increases with a larger value of L. Considering both the performance and implementation issues, the switch with L = 2 would be desirable, especially when one considers input and output buffering, assuming a reasonable input load condition.
Fig. 10. Packet loss probability against the output buffer size b_o as a function of L at P = 0.8. (Horizontal axis: output buffer size.)
Fig. 11. Packet loss probability against the output buffer size b_o as a function of L at P = 0.9. (Horizontal axis: output buffer size.)
Fig. 12. Input and output buffer sizes required per port to satisfy a packet loss of less than 10^-8 at P = 0.80 as a function of L with parameter N = 16. (Horizontal axis: speed-up factor L; curves: buffers required at output, buffers required at input.)
2.3. Adaptive server input queueing system under input control
As indicated in the previous section, the switch operating with 1 < L < 2 can be viewed as the input and output queueing system with input control for adaptive service. With the input control, the controller processes packet headers and determines which packets at the HOQs can be transmitted in a given time slot, without backpressure information from the output buffers. The adaptive control policy is such that if the input queue length is less than $b_t$, the switch operates as a pure input queueing system; otherwise, it operates as an input and output queueing system with L = 2. In this section, for simplicity of analysis, we assume that the tagged input has its own controller which is correlated with others. Here we modify the L-server input queueing model of Section 2.1. As in Section 2.1, the adaptive server input queueing process is modeled by a two-dimensional Markov chain $\tilde{T}$ constructed from the queue size and the number of blocked packets at the jth HOL queue. The states of the Markov chain $\tilde{T}$ are [(0, 0), {(s, w) | 1 ≤ s ≤ b_t, 0 ≤ w ≤ N - 1}, {(s, w) | b_t + 1 <.
(42)
and are also ordered in the alphabetic order. Table 5 shows the transitions for the adaptive server input queueing system. From Table 5, one can obtain a transition matrix $\tilde{T}$ in the same manner as in Section 2.1. Here, one must take care in treating the transitions related to the boundary levels (e.g., Q = 0, Q = b_t and Q = b_i). Let $\tilde{\Pi}$ be the steady-state probability vector, defined by

$$ \tilde{\Pi} = (\tilde{\Pi}_0, \tilde{\Pi}_1, \ldots, \tilde{\Pi}_l, \ldots, \tilde{\Pi}_{b_t}, \ldots, \tilde{\Pi}_{b_i}), \tag{43} $$

where $\tilde{\Pi}_0$ is a scalar quantity. For $1 \le l \le b_t$, $\tilde{\Pi}_l = (\tilde{\Pi}_{l0}, \tilde{\Pi}_{l1}, \ldots, \tilde{\Pi}_{l,N-1})$, and for $b_t + 1 \le l \le b_i$, $\tilde{\Pi}_l = (\tilde{\Pi}_{l0}, \tilde{\Pi}_{l1}, \ldots, \tilde{\Pi}_{l,N-2})$. We can apply the same iterative computational procedure as in Section 2.1 to obtain $\tilde{\Pi}$ from $\tilde{\Pi}\tilde{T} = \tilde{\Pi}$ and $\tilde{\Pi}e = 1$. Using the steady-state probabilities, the packet loss probability $P_{loss}$ is obtained as

$$ P_{loss} = \sum_{w=0}^{N-2} \tilde{\Pi}_{b_i w}. \tag{44} $$

Here, the adaptive speed-up factor L is given by

$$
L = \tilde{\Pi}_0 + \sum_{s=1}^{b_t-1} \sum_{w=0}^{N-1} \tilde{\Pi}_{sw}
+ 2\left[\, \sum_{w=0}^{N-1} \tilde{\Pi}_{b_t w} + \sum_{s=b_t+1}^{b_i} \sum_{w=0}^{N-2} \tilde{\Pi}_{sw} \right].
\tag{45}
$$

Table 5. State transitions of the adaptive input queueing system

| Current state $(s_c, w_c)$ | Next state(s) $(s_s, w_s)$ | Transition prob. $t_{s_c,w_c;s_s,w_s}$ |
|---|---|---|
| $(0, 0)$ | $(0, 0)$ | $P \cdot P_{suc \mid w}\big\vert_{L=1} + (1 - P)$ |
| | $(1, k - 1)$, $k = 2, 3, \ldots, N$ | $P \cdot P_{blo,k \mid w}\big\vert_{L=1}$ |
| $(s, w)$; $s = 1, 2, \ldots, b_t - 1$; $w = 0, 1, \ldots, N - 1$ | $(s - 1, 0)$ | $(1 - P) \cdot P_{suc \mid w}\big\vert_{L=1}$ |
| | $(s, 0)$ | $P \cdot P_{suc \mid w}\big\vert_{L=1}$ |
| | $(s, w + k - 1)$, $k = \max(0, 2 - w), \ldots, N - w$ | $(1 - P) \cdot P_{blo,k \mid w}\big\vert_{L=1}$ |
| | $(s + 1, w + k - 1)$, $k = \max(0, 2 - w), \ldots, N - w$ | $P \cdot P_{blo,k \mid w}\big\vert_{L=1}$ |
| $(s, w)$; $s = b_t$, $w = 0, 1, \ldots, N - 1$; $s = b_t + 1, b_t + 2, \ldots, b_i$, $w = 0, 1, \ldots, N - 2$ | $(s - 1, 0)$ | $(1 - P) \cdot P_{suc \mid w}\big\vert_{L=2}$ |
| | $(s - \delta(s - b_i), 0)$ | $P \cdot P_{suc \mid w}\big\vert_{L=2}$ |
| | $(s, w + k - 2)$, $k = \max(0, 3 - w), \ldots, N - w$ | $(1 - P) \cdot P_{blo,k \mid w}\big\vert_{L=2}$ |
| | $(\min(s + 1, b_i), w + k - 2)$, $k = \max(0, 3 - w), \ldots, N - w$ | $P \cdot P_{blo,k \mid w}\big\vert_{L=2}$ |

Fig. 13. Packet loss probability at output buffers as a function of L at P = 0.8 and P = 0.9 with parameter values N = 16, b_o = 26, b_i = 25. (Horizontal axis: speed-up factor L.)
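The adaptive speed-up factor is simply the steady-state average of the speed-up in use: 1 while the input queue is shorter than b_t and 2 otherwise. The sketch below computes it from a queue-length distribution; the distribution `q` is made-up illustrative data, not a result from the model, and the function name `adaptive_speedup` is hypothetical.

```python
def adaptive_speedup(q, b_t):
    """Mean speed-up factor for a queue-length distribution q[0..b_i]."""
    below = sum(q[:b_t])         # Q < b_t: pure input queueing, L = 1
    at_or_above = sum(q[b_t:])   # Q >= b_t: input and output queueing, L = 2
    return 1 * below + 2 * at_or_above

q = [0.2, 0.1, 0.3, 0.4]  # assumed marginal distribution of Q, for illustration only
print(adaptive_speedup(q, b_t=2))  # about 1.7 for this made-up distribution
```

Equivalently, the average is 1 + Prob(Q ≥ b_t), so a threshold b_t closer to zero pushes the effective speed-up towards 2.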
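Denoting by $\bar{h}_w$ the probability for H = w and Q < b_t, and by $\tilde{h}_w$ the probability for H = w and Q ≥ b_t, we get

$$ \bar{h}_w = \tilde{\Pi}_0\,\delta(w) + \sum_{s=1}^{b_t-1} \tilde{\Pi}_{sw}, \quad w = 0, 1, 2, \ldots, N - 1, \tag{46} $$

$$
\tilde{h}_w =
\begin{cases}
\displaystyle\sum_{s=b_t}^{b_i} \tilde{\Pi}_{sw}, & \text{for } 0 \le w \le N - 2, \\[2ex]
\tilde{\Pi}_{b_t, N-1}, & \text{for } w = N - 1.
\end{cases}
\tag{47}
$$

To investigate the arrival process of the tagged output queue, we derive the distribution of O, which is given by

$$
\rho_1 = \sum_{k=1}^{N} P\,\tilde{\Pi}_0\,P_{suc,k|0}\big|_{L=1}
+ \sum_{w=0}^{N-1} \ \sum_{k=\max(0, 1-w)}^{N-w} \hat{h}_w P_{suc,k|w}\big|_{L=1}
+ \sum_{w=0}^{1} \tilde{h}_w P_{suc,1-w|w}\big|_{L=2},
\tag{48}
$$

$$
\rho_2 = \sum_{w=0}^{N-1} \ \sum_{k=\max(0, 2-w)}^{N-w} \tilde{h}_w P_{suc,k|w}\big|_{L=2},
\tag{49}
$$

where $\hat{h}_w = \bar{h}_w - \tilde{\Pi}_0\,\delta(w)$. Using (37) and (38), we get $t_k$, k = 0, 1, 2, which is related to $\rho_1$ and $\rho_2$.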
Through extensive numerical examples, we have found that, to achieve a packet loss rate of 10^-8 with an input load of 80%, a total of 46 buffers (i.e., b_i = 23 and b_o = 23) are required on the two sides of the switch, and b_t = 2 is necessary for the adaptive control of L. In this case, the value of L is found to be roughly 1.70. Table 6 shows analytical results for four different switch types: (L = 2, no control), (1 < L < 2, input control), (L ≥ b_o, output control) and (L = N, no control). Note that an input and output queueing system with the input queue adaptively controlled can improve the packet loss performance, especially in a nonuniform (or imbalanced) traffic environment. In Section 2.1, it was assumed that the output address of the packets from each input link is randomly assigned with equal probability 1/N. Now, using a simple nonuniform traffic model, we can compare the packet loss performance of the input and output queueing system with L = 2 with that of the adaptively input-controlled input and output queueing system with 1 < L < 2. This traffic model is based on the assumption that packets at individual input links are directed uniformly to the first m output ports. In other words, packet arrivals at each input link are modeled by an independent Bernoulli process with probability P per slot. However, the output address of the packets from each input link is randomly assigned with equal probability 1/m. One may express the traffic routing matrix from input links to output ports by $T' = [t_{i|j}]$, where $t_{i|j}$ are the traffic intensities destined from input i to output j, with $\sum_{j=1}^{m} t_{i|j} = P$. The input queueing model with the traffic described above is the same as that with uniform traffic in the previous sections except for $a_{k|w}$ (see (2)) and $t_k$ (see (37) and (38)).
Also, we assume that the traffic intensity of the jth output link ($\lambda_j$) is maintained below 0.8 by the control of high layer protocols (e.g., service admission and network control protocols). For the analysis of the traffic described above, the input traffic load is given by

$$ P = \frac{m}{N}\,\lambda_j, \tag{50} $$

where $\lambda_j$ = 0.80 for 1 ~
I km [Nl(l_Po k
l_P0 m
ak'w=
k = 1, 2, 3 . . . . . N, IN )(-~)~( \ k w 1
for
(51) l - - P ° ) N-w-lc ' m
for
l<~w<~N-2,
= 0, 1 , 2 , . . . , N - w . For output j(1 ~
k=1,2,
(52)
$$t_0 = 1 - \sum_{k=1}^{2} t_k. \tag{53}$$
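The nonuniform traffic model can be illustrated with a short numerical sketch. It assumes that the per-input load is $P' = m\lambda_j/N$, which is what flow balance implies when all $N$ inputs spread their packets evenly over the first $m$ outputs and each of those outputs carries intensity $\lambda_j$. The function name and the list-of-lists representation of $T'$ are illustrative choices, not part of the analysis.

```python
def nonuniform_routing_matrix(n_ports, m, lam):
    """Return (P', T') for traffic directed uniformly to the first m outputs.

    n_ports : switch size N
    m       : non-uniform factor (number of outputs that receive traffic)
    lam     : traffic intensity lambda_j on each of the first m outputs
    """
    if not 1 <= m <= n_ports:
        raise ValueError("m must satisfy 1 <= m <= N")
    p_in = m * lam / n_ports                      # per-input load P' (flow balance)
    row = [p_in / m] * m + [0.0] * (n_ports - m)  # t'_ij = P'/m for j <= m, else 0
    return p_in, [row[:] for _ in range(n_ports)]


N, LAM = 16, 0.80
for m in (4, 8, 16):
    p_in, t = nonuniform_routing_matrix(N, m, LAM)
    load_out_1 = sum(t[i][0] for i in range(N))   # column sum for output 1
    print(f"m={m:2d}  P'={p_in:.3f}  load on output 1={load_out_1:.3f}")
```

Each row of the matrix sums to $P'$ and each of the first $m$ columns sums to $\lambda_j$, so a smaller $m$ concentrates the same per-output intensity on fewer ports while lowering the load offered at each input.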
Fig. 14 shows packet loss performance as a function of $m$ (which we define as the non-uniform factor) at $\lambda_j = 0.80$ for the cases of ($b_i = 17$, $b_o = 26$, $L = 2$) and ($b_i = 23$, $b_o = 23$, $L = 1.70$). It clearly shows that the adaptive control scheme prevents the performance degradation that the switch with $L = 2$ suffers under imbalanced traffic conditions. However, under a uniform traffic condition (i.e., $m = N = 16$) the two systems yield nearly the same performance.

Table 6
Amount of buffer required for different types of switch
Condition: P = 0.8, P_loss < 10^-8

  b_i   b_o   Characteristics of switch architecture
  ---   ---   --------------------------------------
  17    26    Input and output queueing with L = 2
  23    23    Input buffer length control, 1 < L < 2
  48     8    L ≥ b_o
  32    12    L ≥ b_o
   0    38    L = N

Ref.: this study; [15].
Fig. 14. Packet loss performance as a function of the traffic non-uniform factor $m$ at $\lambda_j = 0.80$ for the two cases ($b_i = 17$, $b_o = 26$, $L = 2$) and ($b_i = 23$, $b_o = 23$, $L = 1.7$).

3. Conclusions

We studied a synchronous nonblocking switch with input and output queueing, which operates $L$ times faster than the input/output trunk. It is modeled as a finite input and output queueing system with $1 < L \le N$, where $N$ is finite. The effect of the parameters $P$, $L$, $N$, $b_i$ and $b_o$ on the packet loss performance was derived analytically using two models: the $L$-server input queueing model and the $N:L$ concentrated output queueing model. By using the algorithmic approach, numerical results were calculated explicitly with a reasonable amount of computation, even though the model has a large number of states owing to the many parameters considered. From this study, a performance trade-off could be observed in the selection of a proper $L$ and the design of the buffer sizes at input and output. As an example, ($L = 2$, $b_i = 17$, $b_o = 26$) yields a packet loss rate of $10^{-8}$ at an input load of 80%. From both the performance and implementation points of view, we found that selecting ($L = 2$, $b_i = 17$, $b_o = 26$) would be proper for the design of a switch with input and output buffers without output control, assuming the traffic load is below 80%. Further, we proposed an input and output queueing system with adaptive input control for the nonblocking packet switch. We found that alternating between $L = 1$ and $L = 2$, depending upon the input queue length, minimizes the disadvantages of the input and output queueing system with $L = 2$. The adaptively input-controlled switching fabric with input and output buffers (e.g., $1 < L < 2$, $b_i = 23$, $b_o = 23$) improves the switch performance, especially in a nonuniform traffic environment. Also, the required amount of buffers at the output ports can be reduced to a level similar to that required at the input ports, without a backpressure scheme.
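The alternation between $L = 1$ and $L = 2$ can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration: the rule "switch to $L = 2$ once the input queue holds at least `threshold` packets", the default threshold of 2, the reading of a non-integer $L$ as a time average over slots, and the queue-length trace itself. The actual control rule is the one defined in the main text.

```python
def choose_speedup(queue_len, threshold=2):
    """Hypothetical control rule: L = 2 when the input queue is at or above
    the threshold, L = 1 otherwise."""
    return 2 if queue_len >= threshold else 1


def average_speedup(queue_trace, threshold=2):
    """Time-averaged L over a per-slot trace of input queue lengths."""
    if not queue_trace:
        raise ValueError("queue_trace must not be empty")
    total = sum(choose_speedup(q, threshold) for q in queue_trace)
    return total / len(queue_trace)


# Made-up trace of input queue lengths, one value per slot.
trace = [0, 1, 2, 3, 2, 1, 0, 4, 5, 2]
print(average_speedup(trace))
```

Under this reading, an effective value such as $L = 1.70$ would correspond to the fabric running at $L = 2$ in about 70% of the slots.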
Appendix

The following is a detailed presentation of the procedure for solving the matrix Eqs. (19)-(23). We can solve for $\Pi_0$ in terms of $\Pi_1$ from (19), that is,

$$P\,\Pi_0 (1 - C_0) = (1 - P)\,\Pi_1 C^c. \tag{A.1}$$
Multiplying both sides of (20) by $e$ (the column vector $[1, 1, 1, \ldots, 1]$) leads to

$$P\,\Pi_1 B^c = (1 - P)\,\Pi_2 C^c. \tag{A.2}$$

Multiplying both sides of (21) by $e$ gives

$$P\,\Pi_l B^c = (1 - P)\,\Pi_{l+1} C^c, \quad \text{for } 2 \le l \le b_i - 2. \tag{A.3}$$

The solution for $\Pi_{b_i-1}$ in terms of $\Pi_{b_i}$ is obtained by multiplying both sides of (22) by $e$,

$$P\,\Pi_{b_i-1} B^c = \Pi_{b_i} C^c. \tag{A.4}$$

Multiplying both sides of (A.3) by $e'$ (the row vector $[1, 0, 0, \ldots, 0]$) gives

$$P\,\Pi_l B^c e' = (1 - P)\,\Pi_{l+1} C^c e' = (1 - P)\,\Pi_{l+1} C, \quad \text{for } 1 \le l \le b_i - 2. \tag{A.5}$$
Using the above result in (21), we have

$$\Pi_l = \Pi_{l-1} P B + \Pi_l\left(P C + (1 - P) B + P B^c e'\right) = \Pi_{l-1} P B + \Pi_l\left(P e e' + (1 - P) B\right). \tag{A.6}$$
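The second equality in (A.6) can be checked in one line. As a short worked step, it uses only $C = C^c e'$ from (A.5) and the identity $C^c = e - B^c$, and it assumes that the bracketed matrix appearing in the result is invertible:

```latex
\begin{aligned}
P C + P B^c e' &= P\,(C^c + B^c)\,e' = P\,e\,e' = P I', \\
\Pi_l\left[I - P I' - (1 - P) B\right] &= \Pi_{l-1} P B .
\end{aligned}
```

Moving all terms in $\Pi_l$ to the left-hand side in this way isolates $\Pi_l$ as soon as the bracket is inverted.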
Hence,

$$\Pi_l = \Pi_{l-1} B P \left[I - P I' - (1 - P) B\right]^{-1}, \tag{A.7}$$

where $I' = e e'$. Letting $R = P\left[I - P I' - (1 - P) B\right]^{-1}$, we obtain

$$\Pi_l = \Pi_{l-1} B R, \quad \text{for } 2 \le l \le b_i - 1. \tag{A.8}$$
Using $1 - C_0 = B^r e$ and $C^c = e - B^c$, (A.1) yields

$$P\,\Pi_0 B^r e = \Pi_1 (1 - P)(e - B^c) = \Pi_1\left(e - P e - (1 - P) B^c\right) = \Pi_1\left(I - P I' - (1 - P) B\right) e. \tag{A.9}$$

Then, the above equation gives

$$\Pi_1 = \Pi_0 B^r R. \tag{A.10}$$
From (23), we can also solve for $\Pi_{b_i}$ in terms of $\Pi_{b_i-1}$, given by

$$\Pi_{b_i} = \Pi_{b_i-1} P B (I - B)^{-1}. \tag{A.11}$$

Then we obtain $\Pi_l$ for $l = 1, 2, \ldots, b_i - 1$, given by

$$\Pi_l = \Pi_0 B^r R (BR)^{l-1}. \tag{A.12}$$

The solution for $\Pi_{b_i}$ in terms of $\Pi_0$ is obtained as

$$\Pi_{b_i} = P\,\Pi_0 B^r R (BR)^{b_i-2} B (I - B)^{-1}. \tag{A.13}$$

Using $\Pi_0 + \sum_{l=1}^{b_i} \Pi_l e = 1$, we obtain

$$\Pi_0 = \left[1 + B^r R \sum_{l=1}^{b_i-1} (BR)^{l-1} e + P B^r R (BR)^{b_i-2} B (I - B)^{-1} e\right]^{-1}. \tag{A.14}$$
The significance of the algorithmic approach used here lies in the fact that (26) is explicitly computable for the practical ranges of $N$ and $b_i$ in our analytical model. The stationary probability vector $\Pi$, which has $1 + (N - L + 1)\,b_i$ elements, can be calculated with a reasonable amount of computation because the evaluation of $\Pi = \Pi T$ and $\Pi e = 1$ (see (7)) requires computations only on matrices of size $(N - L + 1) \times (N - L + 1)$.
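The recursion (A.8)-(A.14) can be sketched numerically as follows. This is only an illustration under stated assumptions: the blocks $B$ and $B^r$ are defined by Eqs. (19)-(23) in the main text and are not reproduced here, so the example uses made-up nonnegative values with row sums below one (which keeps both matrix inverses well defined). The function name and argument layout are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def stationary_vectors(P, B, B_r, b_i):
    """Evaluate (A.8)-(A.14): return Pi_0 (scalar) and [Pi_1, ..., Pi_{b_i}]."""
    if b_i < 2:
        raise ValueError("this sketch assumes b_i >= 2")
    n = B.shape[0]                      # n = N - L + 1 in the paper's notation
    e = np.ones(n)
    I = np.eye(n)
    I_prime = np.zeros((n, n))
    I_prime[:, 0] = 1.0                 # I' = e e' with e' = [1, 0, ..., 0]
    R = P * np.linalg.inv(I - P * I_prime - (1 - P) * B)
    BR = B @ R

    v = [B_r @ R]                       # Pi_1 / Pi_0, from (A.10)
    for _ in range(2, b_i):             # l = 2, ..., b_i - 1, from (A.8)
        v.append(v[-1] @ BR)
    v.append(v[-1] @ (P * B) @ np.linalg.inv(I - B))   # Pi_{b_i} / Pi_0, (A.11)

    pi0 = 1.0 / (1.0 + sum(x @ e for x in v))          # normalisation, (A.14)
    return pi0, [pi0 * x for x in v]


# Made-up 3 x 3 example (n = 3), purely to exercise the recursion.
B = np.array([[0.30, 0.20, 0.10],
              [0.10, 0.30, 0.20],
              [0.05, 0.15, 0.30]])
B_r = np.array([0.5, 0.2, 0.1])
pi0, pis = stationary_vectors(P=0.8, B=B, B_r=B_r, b_i=5)
print("number of states:", 1 + B.shape[0] * len(pis))
print("total probability:", pi0 + sum(p.sum() for p in pis))
```

Only $n \times n$ matrices are ever inverted or multiplied, although the full state vector has $1 + n\,b_i$ entries, which is the computational point made above.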
References

[1] H. Ahmadi and W.E. Denzel, A survey of modern high performance switching techniques, IEEE J. Select. Areas Comm. 7 (1989) 1091-1103.
[2] F.A. Tobagi, Fast packet switch architectures for broadband integrated services digital network, Proc. IEEE 78 (1990) 133-167.
[3] K.W. Sarkies, The bypass queue in fast packet switching, IEEE Trans. Comm. 39 (5) (1991) 766-774.
[4] M.G. Hluchyj and M.J. Karol, Queueing in high performance packet switching, IEEE J. Select. Areas Comm. 6 (9) (1988) 1587-1597.
[5] M.J. Karol, M.G. Hluchyj and S.P. Morgan, Input versus output queueing on a space-division packet switch, IEEE Trans. Comm. COM-35 (1987) 1347-1356.
[6] T.H. Cheng and D.G. Smith, Queueing analysis of a multichannel ATM switch with input buffering, Proc. ICC, 1991, pp. 1028-1032.
[7] S.Q. Li, Performance of trunk grouping in packet switch design, Proc. Infocom'91, pp. 0688-0693.
[8] Y. Yeh, M. Hluchyj and A. Acampora, The knockout switch: a simple, modular architecture for high-performance packet switching, IEEE J. Select. Areas Comm. SAC-5 (1987) 1274-1283.
[9] F.A. Tobagi and T.C. Kwok, The tandem banyan switching fabric: a simple high-performance fast packet switch, Proc. of Infocom'91, pp. 1245-1253.
[10] E. Nussbaum, Communication network needs and technologies: a place for photonic switching?, IEEE J. Select. Areas Comm. SAC-6 (1988) 1036-1043.
[11] C.A. Brackett, Dense wavelength division multiplexing networks: principles and applications, IEEE J. Select. Areas Comm. 8 (1990) 948-963.
[12] K.Y. Eng, A photonic knockout switch for high-speed packet network, IEEE J. Select. Areas Comm. 6 (1988) 1107-1116.
[13] Y. Oie, M. Murata, K. Kubota and H. Miyahara, Effect of speedup in nonblocking packet switch, Proc. of ICC'89, 13.4, 1989.
[14] S.Q. Li, Nonuniform traffic analysis on a nonblocking space-division packet switch, IEEE Trans. Comm. COM-38 (1990) 1085-1096.
[15] A.K. Gupta and N.D. Georganas, Analysis of a packet switch with input and output buffers and speed constraints, Proc. of Infocom'91, pp. 0694-0700.
[16] A. Pattavina, A broadband packet switch with input and output queueing, XIII International Switching Symposium Proceedings (1990), Vol. VI, pp. 11-16.
[17] M. Murata, Y. Oie, T. Suda and H. Miyahara, Analysis of a discrete-time single-server queue with bursty inputs for traffic control in ATM networks, IEEE J. Select. Areas Comm. 8 (1990) 447-458.
[18] M.F. Neuts, Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach (The Johns Hopkins University Press, Baltimore, MD, 1981).
Youn Chan Jung received a degree in engineering from Kyungpook National University, Korea, in 1980. Since 1989 he has been pursuing the Ph.D. degree in the Department of Electrical Engineering at the Korea Advanced Institute of Science and Technology (KAIST). Since 1980 he has been with the Agency for Defense Development (ADD), Korea, as a senior communication design engineer. He has more than 10 years of experience in mobile network design, network database control and protocol architecture specification techniques related to military tactical communication systems. His current interests include ATM switch architecture design and performance evaluation for B-ISDN, and protocol architecture design for digital mobile radio networks.
Chong Kwan Un was born in Seoul, Korea. He received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of Delaware, Newark, in 1964, 1966, and 1969, respectively. From 1969 to 1973 he was an Assistant Professor of Electrical Engineering at the University of Maine, Portland, where he taught communications and did research on synchronization problems. In May 1973 he joined the staff of the Telecommunication Sciences Center, SRI International, Menlo Park, CA, where he did research on voice digitization and bandwidth compression systems. Since June 1977 he has been with the Korea Advanced Institute of Science and Technology (KAIST), where he is a Professor of Electrical Engineering, teaching and doing research in the areas of digital communications and signal processing. So far, he has supervised 51 Ph.D. and more than 100 M.S. graduates. He has authored or coauthored over 300 papers on speech coding and processing, adaptive signal processing, data communications, B-ISDN, protocol design and analysis, and very high-speed packet communication systems. He also holds seven granted patents. From February 1982 to June 1983 he served as Dean of Engineering at KAIST. Dr. Un is a Fellow of IEEE. He has received a number of awards, including the 1976 Leonard G. Abraham Prize Paper Award from the IEEE Communications Society, the National Order of Merits from the Government of Korea, and Achievement Awards from the Korea Institute of Telematics and Electronics, the Korea Institute of Communication Sciences, and the Acoustical Society of Korea (ASK). He was President of the ASK from 1988 to 1989. He is a member of Tau Beta Pi and Eta Kappa Nu.