ELSEVIER
Performance Evaluation 26 (1996) 201-218
Customer delay in very large multi-queue single-server systems
Charles A. LaPadula a,1, Hanoch Levy b,*
a Teletraffic Theory and Performance Department, AT&T Bell Laboratories, Holmdel, NJ 07733, USA
b Department of Computer Science, Tel Aviv University, 69978 Tel Aviv, Israel
Received 28 April 1992; revised 3 February 1995
Abstract
The objective of this work is the modeling and analysis of multi-queue single-server systems consisting of many queues. The size of such systems imposes difficulties in either applying numerical procedures or in using simulations. We address the problem of whether a very large multi-queue single-server system (polling system), consisting of 100 or more queues, can be modeled by a significantly smaller system without considerably distorting its performance. In particular we study systems in which the service discipline is either exhaustive or gated and the service times in the different queues are identically distributed. We consider the delay incurred by an arbitrary customer in the system as the performance measure of interest. The main result of this paper is that in this framework a polling system consisting of several queues can approximate the behavior of a very large system fairly accurately. However, an approximation by a system consisting of a single queue (with vacation periods) will yield a fairly poor approximation. We propose an algorithm for transforming the original system (called System A) into the approximate system (called System B). We discuss the errors introduced by this transformation and provide bounds for the error in the estimates of the mean customer delay. Numerical results show that System B is good for predicting the tail probabilities of System A as well.
Keywords: Multi-queue; Single server; Polling system; Delay analysis
* Corresponding author. E-mail: [email protected].
¹ Charles A. LaPadula passed away while this paper was under review.
0166-5316/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0166-5316(95)00026-7

1. Introduction
This work is motivated by the need to model and analyze a computer communication system in which an important component consists of very many queues (more than 100) which are served by a single processor (server) in a cyclic manner. The modeling of such a component is done by a multi-queue single cyclic server model (polling system). Typically, a cyclic polling system consists of N queues, each receiving an independent stream of customers, and a single server which serves them in a cyclic order. Previous
studies have provided numerical procedures for the analysis of either the mean delay or the Laplace-Stieltjes Transform (LST) of the delay distribution incurred in the system queues. However, the amount of computation required by these numerical procedures increases significantly with the size of the system considered. Thus, these procedures face difficulties when used to analyze very large systems.
Many applications of polling systems, including the one motivating this work, consist of a very large number (100 or more) of queues. The use of available numerical techniques for evaluating their performance may become very expensive. Moreover, in many of these applications the polling system is a submodel of a larger and more complex model. In these cases a reasonable modeling approach is to create a large simulation model, part of which is the polling subsystem. This modeling approach requires simulating a very large number of queues within one program, which is, again, an infeasible task. The difficulties faced by these approaches raise the need for an alternative approach to model very large polling systems.
This paper focuses on analyzing the properties of very large polling systems and on studying how these properties can be used to efficiently model and analyze these systems in practice. We concentrate on systems in which the service time distribution is the same for all queues (it is quite common in large applications that the service times are about the same in all queues) but their arrival rates and switch-over periods may vary significantly. The performance measure which we are interested in is the delay (waiting time) incurred by an arbitrary customer in the system. This is perhaps the most important single measure of performance in such systems. We are interested in both the mean value of this measure and its tail probabilities.
The analysis is carried out for systems in which the service is either gated (serve all customers present at the queue when the server arrives at the queue) or exhaustive (serve the queue until it is empty). The main results of this paper suggest that one can model a very large polling system with exhaustive or gated service by a significantly smaller system without introducing large errors in these performance measures. The transformation of a very large system into a smaller one is done by properly merging several input streams into a single queue. The benefit gained from such transformation is that the smaller system is much simpler to handle by either numerical procedures or simulations. We derive the error in estimating the mean delay of an arbitrary customer using a system in which two input streams are merged into one queue. We use this measure to guide a greedy algorithm for sequentially transforming an N queue system into a K queue system; we derive an upper bound for the errors (in estimating the mean delay of an arbitrary customer) introduced by the K queue system. We provide an algorithm for deriving this transformation. Numerical results for the error in predicting the tail probabilities are also provided. The structure of this paper is as follows: In Section 2 we describe the model. In Section 3 we review previous results relevant to this work. In Section 4 we describe two transformations, used later in transforming a large system into a smaller one. In Section 5 we derive several properties of large polling systems. In Section 6 we propose an approach for approximating very large systems by smaller ones; the approach uses the transformations presented in Section 4 and is based on the properties derived in Section 5. In Section 7 we provide numerical results for testing the quality of these approximations. A discussion and suggestion for how to use the approximations are provided in Section 8.
2. Model description
We consider a system consisting of $N$ infinite-buffer queues, $Q_1, \ldots, Q_N$, and a single server which serves them in a cyclic order. Customers arrive to $Q_i$ according to a Poisson process of rate $\lambda_i$; the total arrival rate to the system is $\lambda = \sum_{i=1}^{N} \lambda_i$. The service time at $Q_i$ is an independent random variable $B_i$ with first and second moments $b_i$, $b_i^{(2)}$, and with LST $B_i^*(s)$. The offered load to this queue is $\rho_i = \lambda_i b_i$, and the total system load is $\rho = \sum_{i=1}^{N} \rho_i$. In the context of this work we are interested in systems in which the customer service time parameters are identical in all queues, in which case we have $B_i^*(s) = B^*(s)$, $b_i = b$ and $b_i^{(2)} = b^{(2)}$.
The server moves among the queues in a cyclic order; when leaving $Q_i$ and before moving to the next queue the server incurs a switch-over period whose duration is a random variable $R_i$ with first two moments $r_i$ and $r_i^{(2)}$. All arrival processes, service times and switch-over periods are independent. The total switch-over time in a cycle is a random variable $R = \sum_{i=1}^{N} R_i$ whose first two moments are $r = \sum_{i=1}^{N} r_i$ and $r^{(2)} = r^2 + \sum_{i=1}^{N} (r_i^{(2)} - r_i^2)$.
The service discipline in all queues is either the gated service (in which when the server visits $Q_i$ it serves all the customers which are present there at the beginning of the service period) or the exhaustive service (in which the server serves $Q_i$ until the queue becomes completely empty).
An important measure in evaluating very large polling systems is the degree of traffic imbalance. For a given real $x \ge 1$ we say that a polling system is $x$-balanced if for every two queues $Q_i$ and $Q_j$, $\rho_i/\rho_j \le x$. Note that a 1-balanced system is completely balanced, since all loads are identical.
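The notation of this section can be exercised with a small Python sketch. It computes the aggregate quantities $\lambda$, $\rho$, $r$ and $r^{(2)}$ and checks the $x$-balance condition. The `Queue` record, the function names and the numeric values are assumptions made for this illustration only; they are not part of the model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Queue:
    lam: float  # Poisson arrival rate lambda_i
    r1: float   # mean r_i of the switch-over period following this queue
    r2: float   # second moment r_i^(2) of that switch-over period

def system_parameters(queues, b):
    """Return (lam, rho, r, r2) when every queue has mean service time b."""
    lam = sum(q.lam for q in queues)
    rho = lam * b
    r = sum(q.r1 for q in queues)
    # Switch-over periods are independent, so their variances add.
    r2 = r * r + sum(q.r2 - q.r1 ** 2 for q in queues)
    return lam, rho, r, r2

def is_x_balanced(queues, x):
    """True if rho_i / rho_j <= x for every pair of queues.

    With identical mean service times the load ratio equals the rate ratio.
    """
    rates = [q.lam for q in queues]
    return max(rates) <= x * min(rates)

# Hypothetical system: 100 queues, deterministic switch-over of 0.02 units.
queues = [Queue(0.01, 0.02, 0.02 ** 2)] * 40 + [Queue(0.005, 0.02, 0.02 ** 2)] * 60
lam, rho, r, r2 = system_parameters(queues, b=1.0)
print(lam, rho, r, r2, is_x_balanced(queues, 2.0))
```

Because the loads here take only two values whose ratio is 2, this hypothetical system is 2-balanced but not 1.5-balanced.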
3. Previous results: Mean waiting times and a pseudo-conservation law
Let $W_i$ be the waiting time at $Q_i$ and $W$ be the waiting time of an arbitrary customer in the system. The LST of $W_i$ can be derived by numerical procedures from $\lambda_j$ ($j = 1, \ldots, N$) and the LSTs of $B_j$ and $R_j$ ($j = 1, \ldots, N$). A numerical procedure for this derivation is provided in [3]. The mean value of $W_i$, denoted $EW_i$, can be derived by various numerical procedures from the values of $\lambda_j$, $b_j$, $b_j^{(2)}$, $r_j$, $r_j^{(2)}$ ($j = 1, \ldots, N$). Algorithms for deriving mean delay are given in [12,9,6,11]; efficient algorithms for second (and higher) delay moments are given in [6,10]. The following pseudo-conservation law has been derived by Watson [13] (and Boxma and Groenendijk [1]):
$$\sum_{i=1}^{N} \rho_i EW_i = \rho \frac{\sum_{i=1}^{N} \lambda_i b_i^{(2)}}{2(1-\rho)} + \rho \frac{r^{(2)}}{2r} + \frac{r}{2(1-\rho)} \left[ \rho^2 + \sum_{i \in g} \rho_i^2 - \sum_{i \in e} \rho_i^2 \right], \tag{3.1}$$
where $i \in g$ relates to all queues in which the service is gated and $i \in e$ relates to all queues in which the service is exhaustive. For a system in which the service time distribution is identical for all queues we have $b_i = b$ and $\rho_i = \lambda_i b$, and, as observed by Watson [13], we can use² this law to derive the mean waiting time of an arbitrary customer in the system:
$$EW = \sum_{i=1}^{N} \frac{\lambda_i}{\lambda} EW_i = \frac{\lambda b^{(2)}}{2(1-\rho)} + \frac{r^{(2)}}{2r} + \frac{r}{2\rho(1-\rho)} \left[ \rho^2 + \sum_{i \in g} \rho_i^2 - \sum_{i \in e} \rho_i^2 \right]. \tag{3.2}$$

² Note that this would be the case even when the service time distributions differ from each other but their means are identical.
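Eq. (3.2) is simple enough to evaluate directly. The sketch below is one possible Python transcription; the function name `mean_delay` and its argument layout are assumptions made for this illustration. As a sanity check it uses a single queue with vacations: for exhaustive service the bracket vanishes and the expression reduces to the sum of the first two terms, while gated service adds $r\rho/(1-\rho)$.

```python
def mean_delay(rhos, gated, lam, b2, r, r2):
    """Mean waiting time EW of an arbitrary customer, following Eq. (3.2).

    rhos  : per-queue loads rho_i (identical service times assumed)
    gated : per-queue flags, True for gated service, False for exhaustive
    lam   : total arrival rate
    b2    : second moment of the service time
    r, r2 : first and second moments of the total switch-over time R
    """
    rho = sum(rhos)
    if not 0 < rho < 1:
        raise ValueError("total load must satisfy 0 < rho < 1")
    # Gated queues contribute +rho_i^2, exhaustive queues -rho_i^2.
    signed_squares = sum(p * p if g else -p * p for p, g in zip(rhos, gated))
    return (lam * b2 / (2 * (1 - rho))
            + r2 / (2 * r)
            + r / (2 * rho * (1 - rho)) * (rho ** 2 + signed_squares))

# Hypothetical single queue: lam = 0.5, exponential service with mean 1
# (so b2 = 2), deterministic switch-over of 0.1 (so r2 = 0.01).
ew_exhaustive = mean_delay([0.5], [False], 0.5, 2.0, 0.1, 0.01)
ew_gated = mean_delay([0.5], [True], 0.5, 2.0, 0.1, 0.01)
print(ew_exhaustive, ew_gated)
```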
4. Merge and split of customer streams in large polling systems
In Section 6 we will describe how a polling system consisting of a large number of queues can be approximated by a system consisting of a smaller number of queues. This approximation will be conducted while preserving two key parameters³ of the polling system: the overall utilization, $\rho$, and the overall overhead, $R$. The main idea behind this approximation is that one can transform a large system into a smaller one by merging several customer streams into a single stream, and combining the corresponding switch-over periods into a single period. Formally, this merge is done as follows: Let System A be the original system which is to be transformed into System B consisting of fewer queues. Let $Q_{i_1}, \ldots, Q_{i_L}$ be the $L$ queues in System A which are to be represented by a single queue, say $Q_B$, in System B, by merging their streams together. Then the parameters of $Q_B$ are set as follows:
$$B_B = B_{i_1}, \qquad \lambda_B = \sum_{k=1}^{L} \lambda_{i_k}, \qquad R_B = \sum_{k=1}^{L} R_{i_k},$$
where the second sum is a sum of independent random variables. The remaining queues of System A are represented in System B without change (unless they too are merged into another queue).
Remark 4.1. Note that this transformation does not affect the values of $R$, $\rho$ and $\lambda$.
Example. Consider System A consisting of 100 queues in which $B_i$ is deterministic of 1 unit, $R_i$ is deterministic of 0.02 units and $\lambda_1 = \cdots = \lambda_{40} = 0.01$, and $\lambda_{41} = \cdots = \lambda_{100} = 0.005$. We may transform this system into System B consisting of 10 queues, $Q_{1_B}, \ldots, Q_{10_B}$, by merging the streams of every ten consecutive queues of System A into a single queue of System B. Thus we have: $\lambda_{1_B} = \cdots = \lambda_{4_B} = 0.1$, $\lambda_{5_B} = \cdots = \lambda_{10_B} = 0.05$ and $R_{1_B}, \ldots, R_{10_B}$ are deterministic of 0.2 units. The service times in all ten queues are deterministic of 1 unit. Note that in both systems $\rho = 0.7$ and $R$ is deterministically of duration 2 units.
The split operation (which will be used later in this paper) can be described as the inverse of the merge operation. In this operation one may represent $Q_i$ of System A by several new queues in System B, by having its input stream split into those queues. Like the merge operation, the split operation preserves the values of $R$, $\lambda$ and $\rho$.
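The example above can be reproduced with a few lines of Python. This is a minimal sketch: the tuple layout `(arrival rate, mean switch-over, switch-over variance)` and the helper name `merge` are assumptions introduced here. Rates and switch-over means add, and because the switch-over periods are independent their variances add as well.

```python
def merge(queues, groups):
    """Merge each group of queue indices into a single queue.

    queues : list of (lam_i, r_i, var_i) tuples
    groups : list of index lists; each list becomes one queue of System B
    """
    merged = []
    for group in groups:
        lam = sum(queues[i][0] for i in group)
        r = sum(queues[i][1] for i in group)
        var = sum(queues[i][2] for i in group)  # independence: variances add
        merged.append((lam, r, var))
    return merged

# System A of the example: deterministic switch-over periods (variance 0).
system_a = [(0.01, 0.02, 0.0)] * 40 + [(0.005, 0.02, 0.0)] * 60
# Every ten consecutive queues form one queue of System B.
groups = [list(range(10 * k, 10 * k + 10)) for k in range(10)]
system_b = merge(system_a, groups)

b = 1.0  # deterministic service time of 1 unit
rho_a = b * sum(q[0] for q in system_a)
rho_b = b * sum(q[0] for q in system_b)
r_a = sum(q[1] for q in system_a)
r_b = sum(q[1] for q in system_b)
print(len(system_b), rho_a, rho_b, r_a, r_b)
```

Both systems have total load 0.7 and total switch-over time 2, up to floating-point rounding, in line with Remark 4.1.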
5. Properties of very large polling systems
In this section we present several properties of very large polling systems which are useful in the approximation procedures to be presented later.
5.1. Insensitivity to visit orders
Here we observe the insensitivity of the mean delay to the particular order of visit used by the server.

³ Note that an alternative approximate approach is one which preserves $R$ and the mean delay, $EW$, but which does not preserve the total utilization $\rho$; this approach is addressed in Section 8.

Proposition 5.1. The mean delay of an arbitrary customer in the system is independent of the visit order used by the server.
The proposition results directly from Eq. (3.2).
5.2. Very large polling systems: Limiting behavior
For the sake of comparison it is of interest to derive the mean delay in a very large polling system, when the number of queues approaches infinity. To do so we let the number of queues go to infinity while preserving the total load and system overhead (due to switch-over periods), namely holding $R$, $\lambda$ and $\rho$ constant. This can be done by repeatedly using the split operation. Our interest is in systems in which the load ratio between every two queues is bounded, and thus we let the number of queues grow to infinity while maintaining an $x$-balanced system (note that this can be achieved if the split operation is always applied to the queue with the largest arrival rate).
Proposition 5.2. For any value of $x \ge 1$, the limit of the mean delay for an $x$-balanced system, when $N \to \infty$, is:
$$\lim_{N \to \infty} EW = \frac{\lambda b^{(2)}}{2(1-\rho)} + \frac{r^{(2)}}{2r} + \frac{r\rho}{2(1-\rho)}. \tag{5.1}$$
This follows from Eq. (3.2), since in an $x$-balanced system $\sum_{i} \rho_i^2 \le x^2 \rho^2 / N \to 0$.
Remark 5.1. Note that this result holds for either exhaustive or gated service and is independent of $x$. The reason is that when the individual streams are very small, the server will always find at most a single customer in any queue, and thus its behavior under both service policies will be the same. Therefore, this result will also hold true for other service policies (e.g. limited service) which are not discussed in this paper. Eq. (5.1) was derived by Fuhrmann and Cooper [4] in the context of a continuous polling system.
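The limiting behavior can be checked numerically from Eq. (3.2). The sketch below assumes a completely balanced (1-balanced) system, so that $\sum_i \rho_i^2 = \rho^2/N$, and compares it with the value Eq. (3.2) takes once that sum vanishes. The function names and parameter values are hypothetical.

```python
def symmetric_mean_delay(n, gated, lam, b2, rho, r, r2):
    """Eq. (3.2) for n identical queues, all gated or all exhaustive."""
    sum_sq = n * (rho / n) ** 2          # sum of rho_i^2 with rho_i = rho / n
    sign = 1.0 if gated else -1.0
    return (lam * b2 / (2 * (1 - rho))
            + r2 / (2 * r)
            + r / (2 * rho * (1 - rho)) * (rho ** 2 + sign * sum_sq))

def limiting_mean_delay(lam, b2, rho, r, r2):
    """Value of Eq. (3.2) when the sum of squared loads tends to zero."""
    return lam * b2 / (2 * (1 - rho)) + r2 / (2 * r) + r * rho / (2 * (1 - rho))

# Hypothetical parameters: deterministic unit service, deterministic R = 2.
params = dict(lam=0.7, b2=1.0, rho=0.7, r=2.0, r2=4.0)
limit = limiting_mean_delay(**params)
gaps = {n: symmetric_mean_delay(n, True, **params) - limit
        for n in (10, 100, 1000, 10 ** 6)}
for n, gap in gaps.items():
    print(n, gap)
```

In this symmetric setting the gated mean delay approaches the limit from above and the exhaustive one from below, with a gap proportional to $1/N$.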
5.3. The effect of merging two streams
Eq. (3.2) implies that the merge operation affects the mean delay of an arbitrary customer. However, it is easy to see that this effect is minimized if the queues merged are the lowest utilization ones. Assuming that we use such a single merge to transform System A into System B, we may provide an upper bound for the relative difference in performance between the two systems.
Lemma 5.1. Let System A consist of $N$ queues and let System B be derived by merging the two lowest utilization queues, say $Q_1$ and $Q_2$, of System A into a single queue. Then the mean waiting times in the two systems, $EW_A$ and $EW_B$, obey:
(a) If the service is gated, $EW_B > EW_A$. If the service is exhaustive, $EW_A > EW_B$.
(b) $$\frac{|EW_B - EW_A|}{EW_A} \le \frac{2}{N + N^2}.$$
Proof. If the service of $Q_1$ and $Q_2$ is gated, then from (3.2) we have
$$EW_B - EW_A = \frac{r\left[(\rho_1 + \rho_2)^2 - (\rho_1^2 + \rho_2^2)\right]}{2\rho(1-\rho)} = \frac{r \rho_1 \rho_2}{\rho(1-\rho)} > 0. \tag{5.2}$$
Similarly, if the service of $Q_1$ and $Q_2$ is exhaustive then this difference equals the same expression but with a negative sign. Thus (a) is proved. Now, since $Q_1$ and $Q_2$ are the lowest utilization queues, $\rho_1 + \rho_2 \le 2\rho/N$. Thus, under this constraint, the term $\rho_1 \rho_2$ is maximized when $\rho_1 = \rho_2 = \rho/N$. Therefore, the difference $EW_B - EW_A$ is bounded from above as follows:
$$|EW_B - EW_A| \le \frac{r(\rho/N)^2}{\rho(1-\rho)}. \tag{5.3}$$
In addition, one can easily see that the term $\sum_{i=1}^{N} \rho_i^2$ is minimized when $\rho_i = \rho/N$, $i = 1, \ldots, N$. Thus, from (3.2):
$$EW_A \ge \frac{r}{2\rho(1-\rho)} \left[ N^2(\rho/N)^2 + N(\rho/N)^2 \right]. \tag{5.4}$$
Finally, from (5.3) and (5.4) the proof of (b) follows. □
Remark 5.2. Note that the merge of two gated-served queues into a single gated-served queue increases the mean delay of an arbitrary customer, while the merge of two exhaustive-served queues (into a single exhaustive-served queue) decreases the mean delay of an arbitrary customer. Note that the increase and decrease are of the same absolute value.
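Lemma 5.1 and Remark 5.2 can be checked numerically against Eq. (3.2). The sketch below draws hypothetical loads at random (seeded for repeatability), merges the two smallest, and compares the change in mean delay with the closed form $r\rho_1\rho_2/(\rho(1-\rho))$ and, for gated service, with the bound $2/(N+N^2)$. All names and parameter values are assumptions for illustration.

```python
import random

def mean_delay(rhos, gated, lam, b2, rho, r, r2):
    """Eq. (3.2) with all queues gated (gated=True) or all exhaustive."""
    sign = 1.0 if gated else -1.0
    sum_sq = sum(p * p for p in rhos)
    return (lam * b2 / (2 * (1 - rho))
            + r2 / (2 * r)
            + r / (2 * rho * (1 - rho)) * (rho ** 2 + sign * sum_sq))

random.seed(1)
N, rho, b = 20, 0.8, 1.0
weights = [random.uniform(0.5, 1.5) for _ in range(N)]
rhos = sorted(rho * w / sum(weights) for w in weights)  # ascending loads
params = dict(lam=rho / b, b2=b * b, rho=rho, r=2.0, r2=4.0)

# Merge the two lowest utilization queues into one.
merged = [rhos[0] + rhos[1]] + rhos[2:]
predicted = params["r"] * rhos[0] * rhos[1] / (rho * (1 - rho))

ew_a_gated = mean_delay(rhos, True, **params)
diff_gated = mean_delay(merged, True, **params) - ew_a_gated
diff_exhaustive = (mean_delay(merged, False, **params)
                   - mean_delay(rhos, False, **params))
relative = diff_gated / ew_a_gated
bound = 2 / (N + N ** 2)
print(diff_gated, diff_exhaustive, predicted, relative, bound)
```

The gated and exhaustive differences come out equal in magnitude and opposite in sign, as Remark 5.2 states.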
6. A greedy algorithm for transforming a very large polling system into a smaller one
Lemma 5.1 suggests that reduction in the system size by a queue merge best preserves the system properties (in terms of mean customer delay) if the queues to be merged are the lowest utilization ones. Proposition 5.1 implies that such a merge is legitimate even if the lowest utilization queues are not consecutive on the cycle. These results suggest a simplistic greedy algorithm for transforming an $N$ queue system into a $K$ queue system ($K < N$). This algorithm will perform $N - K$ merge transformations, each of them merging the two lowest utilization queues. By merging the two lowest utilization queues the algorithm attempts to preserve the system properties as much as possible. An upper bound for the relative difference between the mean delay of the original system, System A of size $N$, and that of the transformed system, System B of size $K$, is provided in Theorem 6.1.
Theorem 6.1. Let System B be derived from System A by performing $N - K$ merge operations of the two lowest utilization queues. Then the relative difference between $EW_B$ and $EW_A$ is bounded by
$$\frac{|EW_B - EW_A|}{EW_A} < \frac{2}{K}. \tag{6.1}$$
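Proof. From (5.3) we have
$$|EW_B - EW_A| \le \frac{r\rho^2}{\rho(1-\rho)} \left[ \frac{1}{N^2} + \frac{1}{(N-1)^2} + \cdots + \frac{1}{(K+1)^2} \right] < \frac{r\rho^2}{\rho(1-\rho)} \int_K^N \frac{dx}{x^2} < \frac{r\rho^2}{K\rho(1-\rho)}.$$
Now, from (5.4) the proof follows. □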
Theorem 6.1 suggests that if System B is large enough its mean delay is guaranteed to be very close to that of System A. Note that this still allows System B to be significantly smaller than System A.
Remark 6.1. It should be noted that the greedy algorithm does not necessarily provide the optimal System B of size $K$ (namely, the one whose mean delay is the closest to that of System A). To demonstrate this, consider a four queue system in which $\rho_1 = \rho_2 = 0.1$, $\rho_3 = \rho_4 = 0.15$. The greedy algorithm, when used to produce a two queue system (consisting of $Q_I$ and $Q_{II}$), will merge $Q_1$ and $Q_2$ into $Q_I$, and $Q_3$ and $Q_4$ into $Q_{II}$; this yields $\rho_I = 0.2$ and $\rho_{II} = 0.3$. Obviously, a better selection is to merge $Q_1$ with $Q_3$ and $Q_2$ with $Q_4$, yielding $\rho_I = \rho_{II} = 0.25$, and having a smaller value of $\sum_i \rho_i^2$.
Remark 6.2. The optimal System B of size $K$ (in the sense that its mean delay is the closest to that of System A) can be found by finding a System B for which the sum of squares of its queue utilizations, namely $\sum_{i=1}^{K} \rho_i^2$, is minimal (over all such systems of size $K$). However, this problem, called the "minimum sum of squares problem", is NP-complete (see, e.g. [5, p. 225]), and thus is not likely to be solved by a simple algorithm.
Remark 6.3. The greedy algorithm proposed above can be implemented by known techniques in $O((N - K) \log N)$ operations.
Remark 6.4. A queue elimination technique was used in an independent study by Chang and Sandhu [2] but in a different context. Their procedure aims at eliminating unstable queues (in a system with k-limited service) by replacing them by an appropriate switch-over period.
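One way to realize the greedy algorithm of this section is with a binary min-heap keyed on queue utilization, so that each of the $N - K$ merges costs two pops and one push. The sketch below is an illustration under that assumption, not necessarily the implementation the authors had in mind; the name `greedy_merge` and the returned layout are hypothetical. It is run on the four queue example of Remark 6.1.

```python
import heapq

def greedy_merge(rhos, k):
    """Repeatedly merge the two lowest utilization queues until k remain.

    Returns a sorted list of (merged_load, member_indices) pairs. The heap
    operations cost O((N - k) log N); concatenating the member lists is a
    bookkeeping convenience that adds to that cost.
    """
    if not 1 <= k <= len(rhos):
        raise ValueError("k must be between 1 and the number of queues")
    heap = [(rho, [i]) for i, rho in enumerate(rhos)]
    heapq.heapify(heap)
    while len(heap) > k:
        rho1, members1 = heapq.heappop(heap)   # lowest utilization
        rho2, members2 = heapq.heappop(heap)   # second lowest
        heapq.heappush(heap, (rho1 + rho2, members1 + members2))
    return sorted(heap)

# Four queue example of Remark 6.1 (queue indices are zero-based here).
result = greedy_merge([0.1, 0.1, 0.15, 0.15], 2)
greedy_sum_sq = sum(rho ** 2 for rho, _ in result)
better_sum_sq = 0.25 ** 2 + 0.25 ** 2   # pairing Q1 with Q3 and Q2 with Q4
print(result, greedy_sum_sq, better_sum_sq)
```

The greedy grouping gives loads of about 0.2 and 0.3, whose sum of squares exceeds that of the balanced pairing, which is the point made in Remark 6.1.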
7. Tail probabilities and mean delay: Numerical results
The bounds provided in Section 6 imply that relatively small systems can approximate very large systems yielding good accuracy of the mean customer delay. It is an open (and much more difficult) question whether this approximate approach is good for predicting the tail probabilities of customer delay. Intuitively, one may conjecture that when two low utilization queues are merged, the majority of the customer population (which is directed to the other queues) is affected only slightly by the merge and thus the waiting time of an arbitrary customer (mean as well as tail probability) does not change considerably. This suggests that the criteria used by the greedy algorithm for selecting queues to be merged could potentially be used as good guidelines when the objective is to have a small effect on the delay tail probabilities.

[Fig. 1. Horizontal axis: Time. Curves: *: 1 queue; o: 10 queues; x: 100 queues.]

Thus, the
approximation approach derived in Section 6 can potentially serve as a good approximation for predicting tail probabilities.
In this section we examine the quality of this approximation approach, focusing on its quality in predicting tail probabilities, namely in predicting the value of $\Pr[W > t]$ for a given $t > 0$. We conduct the comparison using numerical techniques for evaluating various cases representing a wide range of systems. To compute the delay tail probabilities we use an iterative approach [3] for deriving the LST of the customer delay. We then use a numerical technique proposed by Platzman et al. [8] to compute the tail probabilities of the delay from its LST. To compute the mean customer delay we use (3.2). The system considered for approximation (System A) consists of 100 queues, all served in the gated fashion. The time unit of all measures mentioned is seconds. Below we outline the different cases examined.
Case 1: Fully symmetric system. We start by considering a fully symmetric system, in which the arrival rate to each queue is $\lambda_i = 0.95$ customers per second and the service time has a bimodal distribution: $\Pr[B = 0.008] = 0.999$ and $\Pr[B = 0.25] = 0.001$ (thus $b_i = 0.0082$ and $\rho = 0.783$). This reflects a system in which the normal processing time of a customer is in the order of a few milliseconds, while occasionally a large amount of processing is required (thus increasing the variability of service time). The switch-over periods are identically distributed for all queues and are bimodal as well: $\Pr[R_i = 0.0005] = 0.99$ and $\Pr[R_i = 2.2000] = 0.01$; this reflects a normal switch from queue to queue which takes less than a millisecond, and occasional events in which a very large switch-over period is taken (which increases the variability of the switch-over period). In Fig. 1, we plot the tail probabilities ($\Pr[W > t]$) for several values of $t$ (10, 15, 20 and 25 s). Three curves are plotted in this figure: the first depicts the tail probabilities of the original system (with 100 queues); the other two curves depict the approximations of the original system by a 10 queue system and a 1 queue system. The 10 queue system is constructed by merging the streams of every 10 queues (from the 100 queue system) into a single queue while combining their switch-over periods by way of convolution. Similarly, the 1 queue system is constructed by merging all queues into a single one. The accuracy provided by the LST inversion program is 0.002 on the vertical axis and 0.5 on the horizontal axis. This means that if the program computes $p = \Pr[W > t]$ then the performance measures obey: $p - 0.002 \le \Pr[W > t \pm 0.5] \le p + 0.002$. Note that the 10 queue system approximates the original system very well. As a matter of fact, all discrepancies between their performance fall within the accuracy of the numerical procedure (0.002). The approximation by the one queue system is not as good. The main discrepancy of this approximation is in the left side of the distribution. This discrepancy corresponds to the mean value discrepancy which is predicted by (3.2).
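The mean value discrepancy mentioned at the end of Case 1 can be worked out from Eq. (3.2). The sketch below takes the Case 1 service time as 0.008 with probability 0.999 and 0.25 with probability 0.001 (the value consistent with the stated $b = 0.0082$ and $\rho = 0.783$), assumes gated service in all queues, and evaluates the mean delay of the 100 queue system and of its symmetric 10 queue and 1 queue approximations. The helper names are hypothetical.

```python
def two_point_moments(values, probs):
    """First and second moments of a discrete distribution."""
    m1 = sum(v * p for v, p in zip(values, probs))
    m2 = sum(v * v * p for v, p in zip(values, probs))
    return m1, m2

def mean_delay_gated(rhos, lam, b2, r, r2):
    """Eq. (3.2) when every queue receives gated service."""
    rho = sum(rhos)
    return (lam * b2 / (2 * (1 - rho))
            + r2 / (2 * r)
            + r / (2 * rho * (1 - rho)) * (rho ** 2 + sum(p * p for p in rhos)))

N = 100
b, b2 = two_point_moments([0.008, 0.25], [0.999, 0.001])
ri, ri2 = two_point_moments([0.0005, 2.2], [0.99, 0.01])
lam = N * 0.95
rho = lam * b
r = N * ri
r2 = r ** 2 + N * (ri2 - ri ** 2)   # independent periods: variances add

# Symmetric merges preserve lam, rho, r and r2; only the loads change.
ew = {k: mean_delay_gated([rho / k] * k, lam, b2, r, r2) for k in (100, 10, 1)}
rel_err = {k: (ew[k] - ew[100]) / ew[100] for k in (10, 1)}
print(round(b, 6), round(rho, 3), ew, rel_err)
```

Under these assumptions the 10 queue relative error stays well inside the $2/K$ bound of Theorem 6.1, while the single queue error is several times larger.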
Case 2: Asymmetric switch-over periods. In this case we examine how the quality of the approximation is affected by the way in which the switch-over periods are modeled. We examine two subcases, Case 2(a) and Case 2(b), in which the switch-over periods are very asymmetric. In both subcases the service times and the arrival rates are as in Case 1 and the distinction is only in the switch-over periods. The mean value of the total switch-over period ($R$) is similar to that of Case 1, but here most of this overhead is concentrated at $R_1$.
Case 2(a): Here we consider a large switch-over period at $Q_1$; this is a hundred-fold convolution of the following bimodal distribution: $\Pr[X = 2.2] = 0.01$ and $\Pr[X = 0] = 0.99$. The other switch-over periods are all deterministic of duration 0.0005. Note that the mean durations of the switch-over periods are: $r_1 = 2.2$, $r_i = 0.0005$ ($i = 2, 3, \ldots, 100$). This 100 queue system is approximated by a 10 queue system as follows: every 10 consecutive streams are merged into a single queue; the large switch-over period ($R_1$) remains the same; the other 9 switch-overs each consist of the convolution of 11 small switch-over periods. Fig. 2(a) depicts this case; the figure contains three curves: one represents the 100 queue system, the second represents the 100 queue system of Case 1, and the third represents the approximation by a 10 queue system. The figure demonstrates that the quality of the 10 queue approximation is preserved even when the switch-over periods are very asymmetric. Moreover, the figure demonstrates that the tail probabilities of the 100 queue system are relatively insensitive to the location of the individual switch-over periods.
Case 2(b): This case is similar to Case 2(a) but considers a smaller second moment for the long switch-over period (holding the first moments similar). Here $R_1$ is exponentially distributed with mean 2.2005, while $R_i$ ($i = 2, \ldots, 100$) are each exponentially distributed with mean 0.0005. The system is approximated by a 10 queue system, a 5 queue system and a 1 queue system. In all cases the streams are merged in a symmetric way; the merge of switch-over periods is done by merging several small switch-over periods together; this merge is not done by convolution but by creating an exponential distribution whose mean is equal to the sum of the means of the merged switch-over periods. Fig. 2(b) depicts this case. The figure demonstrates that under a completely different setting of switch-over periods the approximation approach works very well (for the 10 queue and 5 queue approximations). As before, the one queue approximation is not as good. Note that in this case the levels of these curves differ somewhat from those of Case 1; the reason is that the second moment of the switch-over periods has some effect on the tail probabilities.
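The two ways of merging switch-over periods used in Case 2 differ in their second moments. The sketch below compares a true convolution of $n$ independent exponential periods with a single exponential period having the same total mean. The group size of 11 is an assumption borrowed from the 10 queue approximation of Case 2(a); the function names are hypothetical.

```python
def convolved_exponentials(mean, n):
    """Moments of the sum of n independent exponential periods of this mean."""
    m1 = n * mean
    variance = n * mean ** 2          # exponential variance is mean^2
    return m1, variance + m1 ** 2

def single_exponential(mean):
    """Moments of one exponential period with the given mean."""
    return mean, 2 * mean ** 2

n, small_mean = 11, 0.0005
conv_mean, conv_m2 = convolved_exponentials(small_mean, n)
expo_mean, expo_m2 = single_exponential(n * small_mean)
print(conv_mean, expo_mean, conv_m2, expo_m2, expo_m2 / conv_m2)
```

The first moments coincide, while the single exponential has a second moment larger by the factor $2n/(n+1)$. The exponential merge therefore preserves $r$ but not exactly $r^{(2)}$, although for periods this short the absolute difference is tiny.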
[Fig. 2(a). Horizontal axis: Time. Curves: *: 10 queues; o: 100 queues (Case 1); x: 100 queues.]

[Fig. 2(b). Horizontal axis: Time. Curves: *: 1 queue; +: 5 queues; o: 10 queues; x: 100 queues.]

[Fig. 3(a). Horizontal axis: Time. Curves: *: 6 queues; o: 15 queues; x: 100 queues.]
Case 3: Asymmetric arrivals. Here we consider a family of cases in which the system is very asymmetric in its arrival streams. It consists of a few queues whose arrival rates are very high while all others have a very low arrival rate.
Case 3(a): The arrival rates are $\lambda_1 = \cdots = \lambda_5 = 10$ and $\lambda_6 = \cdots = \lambda_{100} = 0.5$. All switch-over periods are bimodally distributed with $\Pr[R_i = 0.0005] = 0.99$ and $\Pr[R_i = 2.2000] = 0.01$. The system is approximated by a 15 queue system and a 6 queue system. In both approximations $Q_1, \ldots, Q_5$ remain the same while the other 95 queues are merged together. In the first approximation they are merged into 10 queues (9 with arrival rate 5 and one with arrival rate 2.5) and the corresponding switch-over periods are merged together by convolution. In the second approximation all 95 small queues are merged into a single queue while the corresponding switch-over periods are merged by convolution. Fig. 3(a) depicts this approximation: the results are very good. The main discrepancy appears at $t = 10$ for the 6 queue case.
Case 3(b): This case is similar to Case 3(a), except that the switch-over periods are asymmetric and exponentially distributed. The mean of $R_1$ is 2.2005 and the mean of each of the other switch-over periods is 0.0005. The approximation examines transformation into a 15 queue system and a 6 queue system as in Case 3(a). The merge of switch-over periods is done by using exponentially distributed switch-over periods whose mean is equal to the sum of the means of the merged periods. This case is depicted in Fig. 3(b). Again the only discrepancy is at $t = 10$ for the 6 queue system.
Case 3(c): Here we examine the effect of the location of the heavy queues on the system performance and on the approximation. The case is similar to Case 3(a) except for the location of the heavily loaded queues. These are now $Q_1$, $Q_{11}$, $Q_{31}$, $Q_{61}$, $Q_{71}$. In Fig. 3(c) we depict the tail probabilities of this case, and compare them to those of Case 3(a).
Two curves represent this 100 queue system (denoted by "100 queues (spread)") and the 100 queue system of Case 3(a) (denoted "100 queues (consecutive)").

[Fig. 3(b). Horizontal axis: Time.]

[Fig. 3(c). Horizontal axis: Time. Curves: *: 6 queues; o: 15 queues; x: 100 queues (consecutive); +: 100 queues (spread).]

[Fig. 3(d). Horizontal axis: Time. Curves: *: 2 queues; o: 20 queues; x: 100 queues.]

These have practically no
difference. Thus, this asymmetric system can be approximated in the same way as its symmetric counterpart, and we use the approximations suggested in Case 3(a) which are depicted in the other two curves (denoted by "6 queues" and "15 queues"). The conclusions here are as for Case 3(a) with the additional important conclusion that, like the mean values of the delay figures, the tail probabilities are not very sensitive to the location of the heavily loaded queues.
Case 3(d): Here we examine how the selection of the queues to be merged affects the quality of the approximation. The question addressed is whether the merge of small utilization queues has a significant advantage over the merge of arbitrary queues. The system considered is identical to that of Case 3(a) and we consider a twenty queue approximation and a two queue approximation. In the twenty queue approximation every five consecutive queues of System A are merged together; this forms one heavily utilized queue (which is the merge of the five heavily utilized queues of System A) and nineteen lightly loaded queues (each of them is the merge of five lightly loaded queues). In the two queue approximation all the heavily loaded queues of System A are merged together and all the lightly loaded queues are merged together, yielding two queues. The merge of switch-over periods is done in both cases by convolution. The results of this case are depicted in Fig. 3(d). As can be seen from the figure, the quality of these approximations is significantly worse than that of the approximation proposed in Case 3(a); this approximation suffers from large errors even in the twenty queue approximation. Note that these results confirm the guidelines provided by (3.2), suggesting that the merge of low utilization queues should yield the best results.
Case 4: Low arrival rates. In this case we examine the approximation quality when the arrival rates are significantly lower while the service durations are longer.
[Fig. 4. Horizontal axis: Time. Curves: *: 1 queue; o: 10 queues; x: 100 queues.]

The case is similar to Case 1, but the arrival rates
are 10 times smaller, while the service times are 10 times larger. The results of this case are depicted in Fig. 4. The approximation quality observed in this case is similar to the previous ones: very good quality for the 10 queue approximation, and not as good for the one queue approximation. Case 5: Effect ofservice time distribution. In this case we examine the effect of the service time distribution on the accuracy of the approximation. The case is identical to Case 1, except for the service time distribution which is exponential with mean 0.008. The results of this case are depicted in Fig. 5, and the conclusions regarding the accuracy of the approximation are similar to those of Case 1. This behavior is not surprising since we expect the effect of the service time distribution on the tail probabilities in System B to be similar to that effect in System A (which is also the case in Eq. (3.2)). 7. I. Mean delay and tail probability comparison The numerical results presented above suggest that the accuracy of the approximation approach in predicting tail probabilities is similar to that of predicting the mean customer delay: approximations consisting of several queues (about 10) are very good, while approximation by only a few queues (about 1 to 3) are significantly worse. To demonstrate this property we depict in Fig. 6 the mean delay 4 as function of the approximation size (namely, as function of the number of queues) for three representative cases: Case 1,3(a) and 4. The figure shows that the mean delay in the approximations consisting of several queues is very close to that of the system consisting of 100 queues; on the other hand, the mean delay of a single queue system 4 For scaling purpose the mean delay is divided by 10.
is significantly different from that of the 100 queue system. To examine the relation between the systems considered and infinitely large systems we also provide the mean delay in the corresponding x-balanced infinite size systems (computed from Eq. (5.1)); note that the differences between the hundred queue systems and the infinite queue systems are minor. For comparison we also provide in this figure curves depicting the tail probability Pr[W > 10] as a function of the system size. Note that the behavior of the mean delay curves is very similar to that of the tail probability curves; this similarity supports the use of the mean delay expression for evaluating the errors in the tail probabilities.

Fig. 5. [Plot: delay tail probability curves versus Time; legend: * 1 queue, o 10 queues, x 100 queues.]
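The observation that about 10 queues behave almost like 100 (or infinitely many), while a single queue does not, can be illustrated with a symmetric special case. The sketch below is an assumption-laden illustration, not a restatement of Eq. (3.2) or Eq. (5.1): it takes exhaustive service, N identical queues, and the pseudo-conservation law of [1] specialized to identically distributed service times. The symbols b^{(2)}, r and r^{(2)} are notation introduced here for the second moment of the service time and the first two moments of the total switch-over period R.

```latex
% Assumed notation: \lambda = total arrival rate, \rho = total utilization,
% \rho_i = utilization of queue i, r = E[R], r^{(2)} = E[R^2].
E[W] \;=\; \frac{\lambda\, b^{(2)}}{2(1-\rho)} \;+\; \frac{r^{(2)}}{2r}
      \;+\; \frac{r}{2\rho(1-\rho)}\Bigl(\rho^{2} - \sum_{i=1}^{N}\rho_i^{2}\Bigr)

% Symmetric case: \rho_i = \rho / N for every queue.
\frac{r}{2\rho(1-\rho)}\Bigl(\rho^{2} - \frac{\rho^{2}}{N}\Bigr)
      \;=\; \frac{r\,\rho}{2(1-\rho)}\Bigl(1 - \frac{1}{N}\Bigr)
```

Under these assumptions only the last term depends on the number of queues, through the factor (1 - 1/N). This factor is 0 for N = 1, 0.9 for N = 10, 0.99 for N = 100 and tends to 1 as N grows, which is in line with the curves described for Fig. 6.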
8. Discussion and a proposal for an approximation algorithm
In Sections 5 and 6 we proposed an approach for approximating a large polling system (System A) by a smaller system (System B). The analytic results of Section 5 suggested that as long as the size of System B is not too small (namely, of the order of several queues) it can be used to provide very accurate predictions of the mean customer delay in System A. These analytic results led us to derive an approach for producing System B from System A; the approach is summarized at the end of this section. In Section 7 we numerically examined whether these approximations hold for tail probabilities as well. The numerical results reported in Section 7, which cover a wide selection of cases, suggest the following conclusions:
(1) The tail probabilities of the delay incurred by an arbitrary customer do depend on the total switch-over period (R). This dependence involves not only its mean value but also its distribution. However, as long as R is held fixed, these probabilities are quite insensitive to the particular switch-over periods associated with each queue (namely, to the distributions of the individual R_i's).
(2) The merge procedure suggested in Section 5, which recommends merging the most lightly loaded queues, yields very good results when used to construct a system consisting of several queues (on the order of 5 to 10). The error in predicting the tail probabilities is on the order of a few percent.
(3) The approximation of very large systems by either (a) a single queue polling system (an M/G/1 queue with vacation periods), or (b) merging heavily loaded queues, yields significantly less accurate results.
(4) Eq. (3.2) can be used as a good indicator of the accuracy of predicting tail probabilities by the queue merge procedure; such an indication is obtained by evaluating the equation for both System A and System B, and computing their relative difference.

Fig. 6. [Plot: curves versus Number of Queues; legend: * Case I, o Case III.a, x Case IV.]

The approximation studied in this paper can be described as one which preserves the total utilization (ρ) and the total switch-over time (R) of the original system, but changes the mean waiting time in the system (EW). Viewing Eq. (3.2), one may consider an alternative approach which preserves EW instead of either R or ρ. This approach is obviously better (exact!) for predicting the mean delay and thus may sound very attractive for predicting the tail probabilities of the delay. In a separate work [7] we examined approximations based on these alternative approaches. An extensive numerical examination suggests that in most cases these alternative approaches are better for predicting the lower end of the distribution (which is expected, since the lower end of the distribution contributes most to the mean) and worse for predicting the tail of the distribution. Since probabilities of large delays are usually of more interest than probabilities
of small delays, in most cases the improved accuracy at the low end of the distribution will not be worth the extra cost of the computationally more complex approximation. For this reason, we believe the approach presented in this paper is superior to the alternative approaches.
The combination of the analytic results (Sections 5 and 6) and the conclusions from the numerical tests suggests the following three-step approach for approximating a large polling system by a smaller one:
(1) Use Eq. (6.1) to approximately evaluate the trade-off between system size (complexity) and approximation accuracy.
(2) Select the desired accuracy from the results derived in Step 1.
(3) Repeatedly apply the merge procedure, either manually (if the parameters are simple) or by computer (using the greedy algorithm). After every merge, examine Eq. (3.2) (by evaluating it both for the original system and for the system resulting from the merge, and computing the relative difference) to evaluate the accuracy of the resulting approximation; once the desired accuracy is reached, stop the merge procedure.
Lastly, it is important to note that our analysis has been provided for systems with either exhaustive or gated service but does not seem to apply directly to systems with limited-type service (e.g., 1-limited, k-limited); it is an open question whether a similar approach can be applied to those systems.
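Step 3 of the approach above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. Eq. (3.2) is not reproduced in this section, so the sketch substitutes the pseudo-conservation law of [1] for exhaustive service with identically distributed service times. It also assumes that merging two queues simply adds their arrival rates while the moments of the total switch-over period R stay unchanged, which is consistent with preserving ρ and R. All names (`PollingSystem`, `mean_wait_exhaustive`, `merge_two_lightest`, `greedy_merge`) and the parameter values in the usage lines are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PollingSystem:
    rates: List[float]  # arrival rate of each queue
    b1: float           # mean service time (identical for all queues)
    b2: float           # second moment of the service time
    r1: float           # mean of the total switch-over period R
    r2: float           # second moment of R


def mean_wait_exhaustive(s: PollingSystem) -> float:
    """Mean wait of an arbitrary customer (assumed stand-in for Eq. (3.2))."""
    lam = sum(s.rates)
    rho = lam * s.b1
    sum_sq = sum((x * s.b1) ** 2 for x in s.rates)  # sum of rho_i squared
    return (lam * s.b2 / (2 * (1 - rho))
            + s.r2 / (2 * s.r1)
            + s.r1 * (rho ** 2 - sum_sq) / (2 * rho * (1 - rho)))


def merge_two_lightest(s: PollingSystem) -> PollingSystem:
    """Merge the two most lightly loaded queues; total rho and R are kept."""
    rates = sorted(s.rates)
    merged = [rates[0] + rates[1]] + rates[2:]
    return PollingSystem(merged, s.b1, s.b2, s.r1, s.r2)


def greedy_merge(system_a: PollingSystem,
                 tolerance: float) -> Tuple[PollingSystem, float]:
    """Merge greedily while the relative mean-wait error stays within tolerance."""
    target = mean_wait_exhaustive(system_a)
    current, error = system_a, 0.0
    while len(current.rates) > 1:
        candidate = merge_two_lightest(current)
        cand_err = abs(mean_wait_exhaustive(candidate) - target) / target
        if cand_err > tolerance:
            break  # the next merge would violate the desired accuracy
        current, error = candidate, cand_err
    return current, error


# Hypothetical System A: 100 identical queues, exponential service with
# mean 0.008, deterministic total switch-over period of length 1.
system_a = PollingSystem([1.0] * 100, 0.008, 2 * 0.008 ** 2, 1.0, 1.0)
system_b, rel_err = greedy_merge(system_a, tolerance=0.05)
print(len(system_b.rates), rel_err)
```

The loop checks the accuracy indicator after every merge, as Step 3 prescribes, and keeps the last system that still meets the tolerance. For gated service the sign of the sum of squared utilizations in the last term would change, so a separate stand-in function would be needed.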
9. Summary
We investigated the problem of evaluating the performance of very large polling systems. We proposed a simple algorithm for approximating a large system by a smaller one. Our results show that a very large polling system can be modeled fairly accurately by a system consisting of several queues, while a single queue (with vacation periods) approximation may result in very significant errors. We provided simple bounds for the accuracy of the approximation in predicting the mean customer delay. Numerical results showed that this approximation method is very good at approximating the tail probabilities of the delay observed by arbitrary customers. The exact results provided for the mean customer delay can be used as a good indicator of the quality of the approximation in predicting tail probabilities.
Acknowledgements
We would like to thank M. Eisenberg and K.K. Leung for their helpful suggestions.
References
[1] O.J. Boxma and W.P. Groenendijk, Pseudo-conservation laws in cyclic-service systems, J. Appl. Probab. 24 (1987) 949-964.
[2] K.C. Chang and D. Sandhu, Mean waiting time approximations in cyclic-service systems with exhaustive limited service policy, Performance Evaluation 15 (1992) 21-40.
[3] M. Eisenberg, Queues with periodic service and changeover time, Oper. Res. 20 (1972) 440-451.
[4] S.W. Fuhrmann and R.B. Cooper, Application of decomposition principle in M/G/1 vacation model to two continuum cyclic queueing models - especially token-ring LANs, AT&T Tech. J. 64 (1985) 1091-1099.
[5] M.R. Garey and D.S. Johnson, Computers and Intractability - A Guide to the Theory of NP-Completeness, Freeman, New York (1979).
[6] A.G. Konheim, H. Levy and M.M. Srinivasan, Descendant set: An efficient approach for the analysis of polling systems, IEEE Trans. Comm. 42 (1994) 1245-1253.
[7] C.A. LaPadula and H. Levy, Approximate analysis of very large polling systems: The use of parameter scaling, Technical Report, AT&T Bell Laboratories, October 1991.
[8] L.K. Platzman, J.C. Ammons and J.J. Bartholdi III, A simple and efficient algorithm to compute tail probabilities from transforms, Oper. Res. 36 (1) (1988) 137-144.
[9] D. Sarkar and W.I. Zangwill, Expected waiting times for nonsymmetric cyclic queueing systems - Exact results and applications, Management Sci. 35 (1989) 1463-1474.
[10] M.M. Srinivasan, Waiting time variances in continuous-time polling systems, Technical Report 91-37, Department of Industrial & Operations Engineering, University of Michigan, 1991.
[11] M.M. Srinivasan, H. Levy and A.G. Konheim, The individual station technique for the analysis of cyclic polling systems, Naval Res. Logist., to appear.
[12] G.B. Swartz, Polling in a loop system, J. ACM 27 (1980) 42-59.
[13] K.S. Watson, Performance evaluation of cyclic service strategies - a survey, in: E. Gelenbe (Ed.), Performance '84, Elsevier, North-Holland (1984) 521-533.
Charles LaPadula received his B.S. (1962), M.S. (1964) and Ph.D. (1967) degrees from New York University in the Thermal/Fluid Sciences area of Mechanical Engineering. From that time on he worked at AT&T Bell Laboratories, with responsibilities including nuclear weapons effects studies, market demand forecasting for new services, transmission system economic analysis, and teletraffic theory and performance analysis. Most recently, he was Technical Manager in the Teletraffic Theory and System Performance department at AT&T Bell Laboratories. The group focused on traffic engineering, interprocessor communication and the performance of real-time distributed database transaction systems. Charles LaPadula passed away on November 13, 1994. He was a highly valued employee and is greatly missed by his colleagues.
Hanoch Levy received the B.A. degree in Computer Science with distinctions from the Technion, Israel Institute of Technology, in 1980, and the M.Sc. and Ph.D. degrees in Computer Science from the University of California at Los Angeles in 1982 and 1984, respectively. From 1984 to 1987 he was a member of technical staff in the Department of Teletraffic Theory at AT&T Bell Laboratories. In 1987 he joined the Department of Computer Science, Tel Aviv University, Tel Aviv, Israel. He is now at the School of Business and RUTCOR, Rutgers University, New Brunswick, NJ, on leave from Tel Aviv University. His interests include Computer Communication Networks, Performance Evaluation of Computer Systems and Queueing Theory.