Approximation Methods for Networks of Queues with Priorities
J.S. Kaufman
1. Introduction and summary
AT&T Bell Laboratories, Holmdel, NJ 07733, U.S.A.
Received 2 March 1982 Revised 27 September 1983
Queuing network models are commonly used to analyze the performance of computer systems. Unfortunately, the class of queuing network models which can be exactly analyzed excludes CPU priority scheduling disciplines, conspicuously present in most computer systems. A popular approximation technique, which we denote the reduced occupancy approximation (roa), is often used to analyze such priority service disciplines because of its simplicity and intuitive appeal. However, despite its widespread use, questions about its accuracy and applicability have received very little attention. Further compounding this situation is the existence of proprietary software packages which purport to analyze such priority disciplines, but which in fact exhibit behavior remarkably similar to the roa. In this paper we show where, and more importantly why, the roa fails. This understanding leads to a significantly improved approximation technique which sacrifices neither simplicity nor applicability. Although our primary focus is on a two class preemptive priority closed network structure, the basic idea is quite general and extensions to multiclass and nonpreemptive priority structures are indicated.
Keywords: Priority Queuing Models, Queuing Network Models, Reduced Occupancy Approximation, Virtual Server.
Joseph S. Kaufman, B.E.E., 1965, Pratt Institute; M.S.E.E., 1966 and Ph.D., 1970 (Computer, Information and Control Engineering), University of Michigan; Assistant Professor of Electrical Engineering and Computer Science, Columbia University 1971-1973; Bell Laboratories, 1973-1979, 1980- ; Visiting Scientist with the Computer Science Department, Technion-Israel Institute of Technology during the 1979-1980 academic year. Since joining Bell Laboratories, he has been involved with the modelling and analysis of teletraffic and computer systems.
North-Holland Performance Evaluation 4 (1984) 183-198
Queuing network models of computer systems are widely used to evaluate a variety of relevant performance measures, such as mean response time, resource utilization and throughput. Because the present state of the art limits exact analysis primarily to the class of product form or BCMP network models [1], a large body of approximate analysis methods has accumulated to cope with currently intractable models. Central processing unit (CPU) priority scheduling disciplines are perhaps illustrative of this trend. Commonly encountered in practice, these priority disciplines cannot be modeled within the BCMP class of networks, and hence a number of approximate techniques have been suggested and used [2-7]. The most popular and intuitively appealing of these techniques, which we denote the reduced occupancy approximation (roa), attempts to analyze preemptive resume priority disciplines. Despite its popularity and apparently wide usage [2-5], the roa has not been critically examined to determine, for example, regions of applicability. Further compounding this situation is the fact that a variety of proprietary software packages exist which purport to 'analyze' such preemptive resume scheduling disciplines, but which in fact exhibit behavior remarkably similar to the roa. Motivated by the work of Morris [8] and Sevcik [3], we began this study to determine where the roa fails; that it can perform very poorly was appreciated by Morris, Sevcik [9] and undoubtedly other workers in the field. Our major finding is that the roa uses a structurally flawed mean service time for the fictitious 'equivalent' low priority server which the roa creates to accommodate the BCMP network structure. Surprisingly, the correct mean service time for this 'equivalent' low priority server has a simple characterization which holds for virtually any preemptive model, regardless of stochastic assumptions and network structure.
Moreover, we conjectured that if this correct mean was folded into a modified roa (m-roa), much of the
0166-5316/84/$3.00 © 1984, Elsevier Science Publishers B.V. (North-Holland)
J.S. Kaufman / Approximation methods for networks of queues with priorities
error associated with the roa would typically be eliminated. To 'test' this conjecture we had to narrow our focus and create a test-bed priority network model in which exact results (primarily mean response times) could be obtained efficiently. Although of necessity somewhat simplistic, the test-bed priority network model adequately illustrates the significant potential for improvement conjectured. In order to realize the potential demonstrated in the test-bed context, it remained to suitably approximate the correct effective low priority mean service time (the simple and general characterization mentioned above depends on a conditional probability which is a priori unknown). The resulting effective service approximation (esa), which incorporates the approximate effective low priority mean service time, has the same wide applicability as the roa, but yields a uniformly (and often dramatically) improved approximation to the mean response time, throughput and server utilization.
The seven sections of this paper logically define three parts. The first part introduces and summarizes our results (Section 1), defines and illustrates the use and potential for error of the roa (Section 2) and derives a general characterization for the effective low priority mean service time (Section 3). The second part defines and analyzes a test-bed priority network model (Section 4) and in this context quantifies the potential for improvement that obtains when the roa is modified by folding in the effective low priority mean service time (Section 5). The last part uses a simple and accurate approximation for the effective low priority mean service time to explore the implementation and performance of the effective service approximation (Section 6) and generalizes results in a variety of other directions (Section 7).
2. The reduced occupancy approximation

Fig. 1. (a) A central server network with preemptive priority CPU scheduling. (b) The reduced occupancy approximation.

Consider the central server type network shown in Fig. 1(a). The CPU has a preemptive resume priority service discipline; class 1 (high priority) and class 2 (low priority) customers have exponentially distributed service times at the CPU with mean service rates $\nu_1$ and $\nu_2$, respectively. $N_i$ and $\rho_i$, $i = 1, 2$, denote the population size and CPU occupancy, respectively, of class $i$ customers. The I/O subnet is as yet unspecified. The roa technique replaces Fig. 1(a) by Fig. 1(b), in which both high and low priority customers at the CPU now see dedicated (exponentially distributed) servers with mean service rates $\nu_1$ and $\nu_2(1 - \hat\rho_1)$ respectively, where $\hat\rho_1$ is an estimate of the a priori unknown high priority CPU occupancy. The roa network in Fig. 1(b) can be efficiently analyzed (for mean performance measures) assuming that the I/O subnet is itself a product form network. To estimate $\rho_1$, Sevcik [3] suggested using the fixed point of $\rho_1^a(\cdot)$, where $\rho_1^a(\alpha)$ is the high priority CPU occupancy of the roa network given that the low priority mean service rate is $\nu_2(1 - \alpha)$. That is, Sevcik suggests using $\hat\rho_1 = \alpha_1$ where $\alpha_1 = \rho_1^a(\alpha_1)$. It is easy to prove the existence of such a fixed point, and in all examples we considered, the sequence $X_n = \rho_1^a(X_{n-1})$, $n = 1, 2, \ldots$, rapidly converged to $\alpha_1$. Such a procedure is, of course, unnecessary in an open and/or mixed network where $\lambda_1$ is a given exogenous rate and $\rho_1$ is consequently known a priori. We will generally concern ourselves with the class $i$ mean response times (mean time per job in the network), denoted by $T_i$ and $T_i^a$ for the exact network and the roa respectively.
Clearly,

$$T_1^a = \frac{N_1 \nu_1^{-1}}{\alpha_1} \quad \text{and} \quad T_2^a = \frac{N_2 \nu_2^{-1}}{(1 - \alpha_1)\,\rho_2^a(\alpha_1)},$$

where $\alpha_1 = \rho_1^a(\alpha_1)$ and $\rho_2^a(\alpha_1)$ is the utilization of the 'equivalent' low priority server in the roa. Obviously, the mean delay and sojourn time per node by class as well as the mean throughput per node by class are also easily obtained from the roa. The roa has considerable intuitive appeal since $\nu_2(1 - \hat\rho_1)$ is an obvious candidate for the effective low priority service rate and its reciprocal is a reasonable guess for the low priority mean completion time [10]. The completion time, discussed in Section 3, arises naturally and plays a role analogous to service time in the analysis of priority queues. Unfortunately, its use in exact analyses has carried over somewhat uncritically to the roa. We conclude this section with two examples which illustrate the use of the roa. The first, although very simple, is particularly instructive because the roa induced error can be explicitly obtained.

Example 2.1. Consider a preemptive resume single server queuing system with two priority classes as shown in Fig. 2(a). The arrival processes are independent and Poisson and the service time distributions are exponential. The exact mean low priority sojourn time $T_2$ is, of course, well known [10] and may be written as

$$T_2 = \frac{\nu_2^{-1}(1 - \rho) + \rho_1 \nu_1^{-1} + \rho_2 \nu_2^{-1}}{(1 - \rho)(1 - \rho_1)}, \tag{1}$$

where $\rho_i$ = server utilization due to class $i$ = $\lambda_i/\nu_i$ and $\rho = \rho_1 + \rho_2$. Now consider the roa approximation shown in Fig. 2(b). In this case, the mean low priority sojourn time given by the roa is just the M/M/1 mean sojourn time:

$$T_2^a = \frac{[\nu_2(1 - \rho_1)]^{-1}}{1 - \lambda_2[\nu_2(1 - \rho_1)]^{-1}} = \frac{\nu_2^{-1}}{1 - \rho}. \tag{2}$$

As $\nu_2^{-1} \to 0$, the exact expression shows that $T_2$ is nonzero for $\rho_1 > 0$ (e.g., a low priority job, regardless of its service time requirements, may encounter a high priority busy period) whereas the roa predicts a zero mean sojourn time for all $\rho_1$. Rewriting (1) in the form

$$T_2 = \left(1 + \frac{1}{s}\,\frac{\rho_1}{1 - \rho_1}\right) \frac{\nu_2^{-1}}{1 - \rho}, \tag{3}$$

where $s = \nu_1/\nu_2$, shows that the roa performs well for large $s$, but falls apart for small $s$. The relative error is given by $\rho_1/(s(1 - \rho_1) + \rho_1)$. For instance, with $\rho_1 = 0.5$ and $s = 1$ the roa underestimates $T_2$ by $0.5/(0.5 + 0.5)$, i.e., 50 percent.

Fig. 2. (a) A preemptive priority node. (b) The reduced occupancy approximation.

Example 2.2. Consider the I/O subnetwork in Fig. 1(a) to consist of a single processor sharing node with distinct service rates $\nu_{01}$ and $\nu_{02}$ corresponding to high and low priority customers respectively. If we let $n_i$ = number of class $i$ customers at CPU (in queue and in service), then the corresponding product form roa network shown in Fig. 1(b) has a state distribution $p(n_1, n_2)$ which may be written as [1]

$$p(n_1, n_2) = A(n_1, n_2)\, r_1^{n_1}\, \hat r_2^{\,n_2}\, G^{-1}(N_1, N_2), \qquad 0 \le n_i \le N_i,$$
$$A(n_1, n_2) = \binom{(N_1 - n_1) + (N_2 - n_2)}{N_1 - n_1},$$
$$G(N_1, N_2) = \sum_{n_1 = 0}^{N_1} \sum_{n_2 = 0}^{N_2} A(n_1, n_2)\, r_1^{n_1}\, \hat r_2^{\,n_2},$$
$$r_i = \nu_{0i}/\nu_i, \quad i = 1, 2, \qquad \hat r_2 = r_2/(1 - \hat\rho_1). \tag{4}$$

Note that $r_i$ has the interpretation: ratio of CPU utilization to I/O utilization for class $i$ customers in the original network (Fig. 1(a)). Now

$$\rho_1^a = 1 - \sum_{n_2 = 0}^{N_2} p(0, n_2). \tag{5}$$

So it is clear that the fixed point $\hat\rho_1 = \rho_1^a(\hat\rho_1)$ in particular, and all performance measures in general which are obtainable from $p(n_1, n_2)$, depend only on the four parameters $N_1$, $N_2$, $r_1$, $r_2$. Just as in Example 2.1, the parameter $s = \nu_1/\nu_2$ does not appear in the roa. As we will see in Section 4, the priority network whose roa we are considering in this example depends on five parameters: $N_1$, $N_2$, $r_1$, $r_2$ and $s$.

The failure of the roa to capture the parameter $s$, in both of these examples, is symptomatic of a basic underlying problem. This basic problem is the failure of the roa to capture the effective mean low priority service time and is the subject of Section 3.

3. Completion and service position times

The completion time ($c_2$), sketched in Fig. 3(a) for a preemptive resume discipline, is the period which begins the instant a low priority customer begins service, and ends the instant the server becomes free to serve the next low priority customer (if any are present) [9]. Thus, as shown in Fig. 3(a), the completion time for a preemptive resume discipline consists of the intervals $x_{21}, \ldots, x_{2,n+1}$, during which a single low priority customer receives service, interlaced with high priority busy periods $b_{11}, \ldots, b_{1n}$, which interrupt the low priority customer's service. Thus

$$c_2 = x_2 + \sum_{i=1}^{n} b_{1i}, \tag{6}$$

where $c_2$ = completion time random variable and $x_2 = \sum_{i=1}^{n+1} x_{2i}$ = low priority service time random variable.

Fig. 3. (a) A completion time. (b) An arbitrary preemptive priority facility with a fictitious 'service position'.

As is well known [10], the mean of $c_2$ for the M/G/1 preemptive resume model is $[\nu_2(1 - \rho_1)]^{-1}$, which is precisely what the roa uses as the mean service time at the dedicated low priority server. In [3], Sevcik comments that the roa's assumed exponential low priority service time distribution fails to capture the often significant variability of $c_2$. Although true, we will see that this failure is overshadowed by the failure of $c_2$ to capture the correct mean service time of the roa's dedicated low priority server. To obtain the correct mean, it is useful to view arriving low priority customers to the preemptive priority server as if they had a dedicated server, which is, after all, the basic approximation idea. Thus, if low priority customers are ignorant of the fact that their service is being interrupted by higher priority customers, what do they perceive their service time to be? For ease of exposition, we think of a fictitious service position (sp) which low priority customers enter and remain in (Fig. 3(b)) during their perceived service time. As in Section 2, denote the number of class 1 (2) customers at the preemptive resume node (in queue + in service) by $n_1$ ($n_2$) and partition all low priority arrivals into three types as follows:
Type (i): customer finds $n_2 > 0$ upon arrival to preemptive node,
Type (ii): customer finds $n_2 = 0$ and $n_1 = 0$ upon arrival to preemptive node,
Type (iii): customer finds $n_2 = 0$ and $n_1 > 0$ upon arrival to preemptive node.
Type (i) low priority customers enter the service position (sp) and perceive their service to begin at the instant a low priority customer completes (actual) service and they are simultaneously selected for service. Thus Type (i) low priority customers perceive their service time to be an ordinary completion time. Type (ii) customers enter the sp upon arrival and their perceived service time also corresponds to an ordinary completion time. Note that both Type (i) and (ii) customers perceive their service time to begin when it actually does. In contrast, Type (iii) low priority customers arrive to find $n_2 = 0$ and immediately enter the sp, perceiving their service time to begin when in fact the server is busy servicing high priority customers ($n_1 > 0$). Thus Type (iii) customers enter and remain in the sp for the forward recurrence time of the high priority busy period they encounter, denoted by $b_f$, in addition to (but before beginning) their completion time. Thus, whereas $c_2$ is the correct effective low priority service time for Type (i) and (ii) customers, $c_2 + b_f$ is the appropriate effective low priority service time for Type (iii) customers. This discussion shows that the service position time (denoted by $s_2$) is the appropriate low priority service time to use in any approximation scheme which 'splits' a single preemptive priority server into two separate servers. Also note that the mean service position time $\bar s_2$ is strictly greater than the mean completion time $\bar c_2$. For our purposes, we are interested in the mean service position time ($\bar s_2$), which fortunately has a characterization which is both simple and completely general.

Theorem 3.1. Consider a general two class preemptive¹ priority single server queue (shown in Fig. 3(b)) in equilibrium with arbitrary stochastic assumptions (it may be part of some general network or it may be isolated). Then

$$\bar s_2 = \frac{\nu_2^{-1}}{1 - p(n_1 > 0 \mid n_2 > 0)}. \tag{7}$$
Proof.² Applying Little's Law to the fictitious service position we have

$$\bar s_2 = \bar L_2/\lambda_2, \tag{7a}$$

where $\bar L_2$ = mean number of (low priority) customers in sp (averaged over all time). Now

$$\lambda_2 = p(n_1 = 0,\, n_2 > 0)\,\nu_2 \tag{7b}$$

and

$$\bar L_2 = p(n_2 > 0). \tag{7c}$$

Eq. (7b) follows from the preemptive discipline: low priority customers are processed if and only if $n_1 = 0$ and $n_2 > 0$. Eq. (7c) follows from the observation that the sp is occupied if and only if $n_2 > 0$. Using (7b) and (7c) in (7a) yields

$$\bar s_2 = \frac{p(n_2 > 0)}{p(n_1 = 0,\, n_2 > 0)}\,\nu_2^{-1},$$

which, since $p(n_1 = 0,\, n_2 > 0)/p(n_2 > 0) = 1 - p(n_1 > 0 \mid n_2 > 0)$, yields (7).

1 This result is generalized in Appendix C to a nonpreemptive discipline and any finite number of classes.
2 An alternative proof follows from the observation that as $T \to \infty$, $f(T)/\delta(T) \to \bar s_2$, where $f(T)$ = amount of time in $(0, T)$ during which the equivalent server is busy and $\delta(T)$ = number of low priority departures in $(0, T)$. Thus noting that $f(T) \sim p(n_2 > 0)\,T$ and $\delta(T)/T \sim \lambda_2$ yields $\bar s_2 = p(n_2 > 0)/\lambda_2$, which when combined with (7b) yields (7).

In addition to the fact that $\bar s_2 > \bar c_2$ noted above, some additional insight into the behaviour of $\bar s_2$ can be obtained by considering the following range of parameter values.
(i) $\nu_2^{-1} \gg \nu_1^{-1}$. In this range, the presence of a low priority queue implies little about the presence of a high priority queue. That is, we expect that $p(n_1 > 0 \mid n_2 > 0) \approx p(n_1 > 0)$ and hence that $\bar s_2 \approx \nu_2^{-1}/(1 - \rho_1)$.
(ii) $\nu_2^{-1} \ll \nu_1^{-1}$. In this range, the presence of a low priority queue strongly implies the presence of a high priority queue. That is, we expect that $p(n_1 > 0 \mid n_2 > 0) \approx 1$ and hence $\lim_{\nu_2^{-1} \to 0} \bar s_2$ generally will be non-zero (in sharp contrast to $\nu_2^{-1}/(1 - \rho_1)$).
The results in this section suggest modifying the roa to incorporate $\bar s_2$ as the mean effective low priority service time. Unfortunately, this is not easily done in general, since the conditional probability $p(n_1 > 0 \mid n_2 > 0)$ is obviously unknown. In the following two sections we define and analyze a test-bed network model in which both the roa and a modified roa (m-roa) which incorporates $\bar s_2$ can be analyzed and contrasted with the exact results.
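The characterization (7) and the alternative form in footnote 2 can be compared numerically. The following is a minimal sketch, not part of the original analysis: the joint probabilities `toy_p` and the rate `toy_nu2` are hypothetical values chosen only for illustration, and the function names are likewise assumptions.

```python
def mean_sp_time(p, nu2):
    """Mean service position time from (7), given joint probabilities p[(n1, n2)]."""
    busy2 = sum(v for (n1, n2), v in p.items() if n2 > 0)              # p(n2 > 0)
    both = sum(v for (n1, n2), v in p.items() if n1 > 0 and n2 > 0)    # p(n1 > 0, n2 > 0)
    return (1.0 / nu2) / (1.0 - both / busy2)


def mean_sp_time_little(p, nu2):
    """Same quantity via Little's law: p(n2 > 0) / lambda_2, with lambda_2 from (7b)."""
    busy2 = sum(v for (n1, n2), v in p.items() if n2 > 0)
    lam2 = nu2 * sum(v for (n1, n2), v in p.items() if n1 == 0 and n2 > 0)
    return busy2 / lam2


# Hypothetical joint distribution over (n1, n2); not taken from the paper.
toy_p = {(0, 0): 0.4, (1, 0): 0.2, (0, 1): 0.1, (1, 1): 0.3}
toy_nu2 = 2.0

print(mean_sp_time(toy_p, toy_nu2), mean_sp_time_little(toy_p, toy_nu2))
```

For these toy numbers $p(n_1 > 0 \mid n_2 > 0) = 0.75$, so both routes give a mean service position time of four times $\nu_2^{-1}$, up to rounding.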
4. A network test-bed model

Choosing a test-bed network model from the class of models shown in Fig. 1(a), subject to the requirement of being able to exactly and efficiently analyze it, excluded all but single I/O node networks. Further, the only service disciplines (for the I/O node) which offered much hope of success were: (a) preemptive priority, (b) random selection, (c) processor sharing. Choice (a) has been analyzed by Morris [8] and, although interesting, has the disadvantage that the high priority performance measures are independent of perturbations to the low priority service mechanism. Choice (b), while a good deal more realistic as a model of I/O than (c), restricts the choice of service rates. That is, if $\nu_{0i}$ = mean service rate of class $i$ customers at the I/O node, then the corresponding roa network is product form if and only if $\nu_{01} = \nu_{02}$. The processor sharing choice (c) does not impose such a restriction, and more importantly, lends itself nicely to an efficient recursive solution.

Thus, our test-bed model is shown in Fig. 4(a), with all service time distributions exponential, and with state transition diagram as shown in Fig. 4(b). The roa for this model was briefly studied in Example 2.2 and as there, we define
$N_i$ = population of class $i$ customers,
$r_i = \nu_{0i}/\nu_i$ = ratio of class $i$ CPU utilization to I/O utilization.
In addition, and unlike the roa, we must also introduce the critical parameter $s = \nu_1/\nu_2$. The Markovian global balance equations defining the state distribution $p(n_1, n_2)$ for this test-bed model are easily written and involve the five parameters: $(N_i, r_i)$, $i = 1, 2$, and $s$. Solving the resulting linear system of equations which define $p(n_1, n_2)$ for large $N_1$ and/or $N_2$ was rejected for obvious (cardinality) reasons, and standard iteration methods were also ruled out. A method for reducing the dimensionality to $N_1$ (or $N_2$) due to Herzog, Woo and Chandy [11], and in spirit identical to the analytical method used by Morris [8], was briefly considered before we developed a purely recursive scheme. Our scheme is similar to the one developed in [12], and hence we have included only a very brief summary in Appendix A.

Fig. 4. (a) The test-bed model. (b) The state transition diagram for the test-bed model.

5. Network examples and general observations

Using a variety of test-bed network examples, we contrast the roa with the exact results obtained as described in Section 4. Each example also illustrates the potential for improvement by showing the modified roa (m-roa) in which the exact value of $\bar s_2/\nu_2^{-1}$ is used.³ We will be primarily interested in the mean time spent in system for a class $i$ customer, and for convenience we normalize by the mean work incurred by a class $i$ job ($\nu_i^{-1} + \nu_{0i}^{-1}$) while in system. Denoting these mean normalized system (or response) times by $\bar T_i$, it is easily seen that

$$\bar T_1 = \frac{N_1}{\rho_1(1 + r_1^{-1})}, \qquad \bar T_2 = \frac{N_2}{\rho_2(1 + r_2^{-1})}, \tag{8}$$

where $\rho_i$ is the class $i$ CPU utilization.

3 The mean service time used in the roa, $\bar c_2(\mathrm{roa})/\nu_2^{-1} = (1 - \rho_1)^{-1}$, is likewise computed using the exact value of $\rho_1$.
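The comparison described above can be sketched in a few lines for small populations. The code below is an illustrative sketch under stated assumptions, not the recursive scheme of Appendix A: it solves the test-bed Markov chain by plain power iteration on the uniformized chain (workable only for small $N_1$, $N_2$), fixes the time scale by taking $\nu_2 = 1$, evaluates the roa from the product form (4) with Sevcik's successive substitution, and forms the m-roa by inserting the exact $\bar s_2/\nu_2^{-1}$ from (7). All function names (`exact_testbed`, `product_form`, `roa_alpha`, `norm_T2`) and the `stretch` parameter are hypothetical.

```python
from math import comb


def exact_testbed(N1, N2, r1, r2, s, sweeps=20000):
    """Stationary p(n1, n2) of the test-bed chain via uniformized power iteration."""
    nu = (s, 1.0)                       # CPU rates; nu_2 = 1 is an assumed time scale
    nu0 = (r1 * nu[0], r2 * nu[1])      # I/O rates, from r_i = nu_0i / nu_i
    states = [(a, b) for a in range(N1 + 1) for b in range(N2 + 1)]
    big = 1.01 * (max(nu) + max(nu0))   # exceeds every total exit rate

    def moves(n1, n2):
        m1, m2 = N1 - n1, N2 - n2       # customers at the processor sharing I/O node
        out = []
        if n1 > 0:                      # preemptive priority: class 1 is served first
            out.append(((n1 - 1, n2), nu[0]))
        elif n2 > 0:
            out.append(((n1, n2 - 1), nu[1]))
        if m1 > 0:
            out.append(((n1 + 1, n2), nu0[0] * m1 / (m1 + m2)))
        if m2 > 0:
            out.append(((n1, n2 + 1), nu0[1] * m2 / (m1 + m2)))
        return out

    p = {st: 1.0 / len(states) for st in states}
    for _ in range(sweeps):
        q = dict.fromkeys(states, 0.0)
        for st, mass in p.items():
            stay = mass
            for dst, rate in moves(*st):
                flow = mass * rate / big
                q[dst] += flow
                stay -= flow
            q[st] += stay
        p = q
    return p


def product_form(N1, N2, r1, r2, stretch):
    """Eq. (4) with the low priority mean service time set to stretch / nu_2."""
    w = {(a, b): comb(N1 - a + N2 - b, N1 - a) * r1 ** a * (r2 * stretch) ** b
         for a in range(N1 + 1) for b in range(N2 + 1)}
    G = sum(w.values())
    return {st: v / G for st, v in w.items()}


def roa_alpha(N1, N2, r1, r2, iters=200):
    """Successive substitution for the fixed point alpha_1 = rho_1^a(alpha_1)."""
    alpha = 0.0
    for _ in range(iters):
        p = product_form(N1, N2, r1, r2, 1.0 / (1.0 - alpha))
        alpha = 1.0 - sum(p[0, b] for b in range(N2 + 1))
    return alpha


def norm_T2(p, N2, r2, stretch):
    """Eq. (8) for class 2, using rho_2 = p(n2 > 0) / stretch."""
    busy = 1.0 - sum(v for (a, b), v in p.items() if b == 0)
    return N2 / ((busy / stretch) * (1.0 + 1.0 / r2))


N1, N2, r1, r2, s = 3, 1, 1.0, 1.0, 1.0           # parameters of Example 5.1, N2 = 1
pe = exact_testbed(N1, N2, r1, r2, s)
busy2 = sum(v for (a, b), v in pe.items() if b > 0)
serve2 = sum(v for (a, b), v in pe.items() if a == 0 and b > 0)
sbar = busy2 / serve2                              # (7), in units of 1 / nu_2
alpha = roa_alpha(N1, N2, r1, r2)
for name, dist, stretch in [
        ("exact", pe, sbar),
        ("roa", product_form(N1, N2, r1, r2, 1 / (1 - alpha)), 1 / (1 - alpha)),
        ("m-roa", product_form(N1, N2, r1, r2, sbar), sbar)]:
    print(f"{name:6s} service stretch = {stretch:6.2f}   "
          f"normalized T2 = {norm_T2(dist, N2, r2, stretch):6.2f}")
```

The single `stretch` argument makes explicit that the roa and the m-roa differ only in the mean service time assigned to the 'equivalent' low priority server; note that the roa stretch here uses the fixed point $\alpha_1$, whereas footnote 3 uses the exact $\rho_1$.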
Each figure referred to in the examples includes a table contrasting $\bar c_2(\mathrm{roa})/\nu_2^{-1}$ with $\bar s_2/\nu_2^{-1}$, which is useful in explaining the trends observed. Characteristics peculiar to each example are briefly discussed, followed by observations of a more general nature.

Example 5.1 (see Fig. 5). High and low priority customers have equal mean CPU service times ($s = 1$) and are both perfectly balanced between CPU and I/O ($r_1 = r_2 = 1$). The exact and m-roa high priority response times are indistinguishable, and the roa high priority response time is also an excellent approximation. The low priority roa response time poorly approximates the exact result, with the m-roa offering significant improvement (20 to 40 percent relative error for the roa is reduced to less than 10 percent).

Example 5.2 (see Fig. 6). The CPU/I/O balance of both high and low priority customers in Example 5.1 has been disrupted: both low priority and high priority customers bottleneck at the I/O for $N_2 > 2$. High priority response time comparisons are similar to Example 5.1. The low priority roa varies from very poor ($N_2 = 1$) to very good ($N_2 = 6$). The improved low priority roa behavior for large $N_2$ is primarily due to the fact that most of the low priority delay occurs at the I/O. The low priority m-roa is uniformly better than the roa, and in particular reduces the relative error from 45 percent to 5 percent when the low priority multiprogramming degree is one.

Example 5.3 (see Fig. 7). All parameters except for $s$ are as in Example 5.2. The high priority mean service time at the CPU is now ten times larger⁴ than the low priority ($s = 0.1$). The low priority roa is extremely poor (relative error varies from 35 percent to 75 percent), especially at lower (low priority) multiprogramming degrees. The m-roa is uniformly and significantly better than the roa, reducing the low priority relative error to the range 10-20 percent. Note that the high priority roa also exhibits significant error which the m-roa also significantly reduces.

4 Applications typically (but not necessarily) have values of $s \ge 1$.

Example 5.4 (see Fig. 8). High and low priority customers are balanced between CPU and I/O (as in Example 5.1) and have equal mean service times at the CPU ($s = 1.0$). Low priority multiprogramming degree is fixed at 1, while the high priority multiprogramming degree varies from 2 to 12. The low priority roa is extremely poor. The parameters in this example are realistic, and point out the potential for reaching misleading conclusions based on the roa. In contrast, the m-roa low priority result has a relative error of 3-4 percent. High priority error is due to incorrect I/O interference (due to low priority customers at I/O), but since $N_2 = 1$ this error is negligible for both the roa and m-roa.

Example 5.5 (see Fig. 9). This example is identical to Example 5.4, except for low priority mean service times which are now ten times larger than high priority ($s = 10.0$). Conclusions drawn are the same as in Example 5.4. Note that the roa, being independent of the parameter $s$, yields results identical to Example 5.4.

Fig. 5. Test-bed model Example 5.1. Mean response time / mean workload versus $N_2$, for $s = 1.0$, $r_1 = 1.0$, $r_2 = 1.0$, $N_1 = 3$. Legend: Exact; . . . . : roa; ×: 'Best/1'; - - : m-roa; ⊙: esa.

$N_2$                                  1      2      3      4      5      6
$\bar c_2(\mathrm{roa})/\nu_2^{-1}$    3.25   2.83   2.54   2.34   2.19   2.07
$\bar s_2/\nu_2^{-1}$                  6.18   4.04   3.23   2.78   2.50   2.30

Fig. 6. Test-bed model Example 5.2. Mean response time / mean workload versus $N_2$, for $s = 1.0$, $r_1 = 1.5$, $r_2 = 0.2$, $N_1 = 3$. Legend as in Fig. 5.

$N_2$                                  1      2      3      4      5      6
$\bar c_2(\mathrm{roa})/\nu_2^{-1}$    5.41   3.89   2.97   2.40   2.06   1.84
$\bar s_2/\nu_2^{-1}$                  13.85  7.99   5.44   4.02   3.17   2.66

Fig. 7. Test-bed model Example 5.3. Mean response time / mean workload versus $N_2$, for $s = 0.1$, $r_1 = 1.5$, $r_2 = 0.2$, $N_1 = 3$. Legend as in Fig. 5.

$N_2$                                  1      2      3      4      5      6
$\bar c_2(\mathrm{roa})/\nu_2^{-1}$    5.83   4.48   3.61   3.01   2.58   2.27
$\bar s_2/\nu_2^{-1}$                  42.06  20.82  13.47  9.75   7.51   6.05

Fig. 8. Test-bed model Example 5.4. Mean response time / mean workload versus $N_1$, for $s = 1.0$, $r_1 = 1.0$, $r_2 = 1.0$, $N_2 = 1$. Legend as in Fig. 5.

$N_1$                                  2      4      6      8      10     12
$\bar c_2(\mathrm{roa})/\nu_2^{-1}$    2.47   4.06   5.70   7.38   9.10   10.83
$\bar s_2/\nu_2^{-1}$                  4.08   8.60   14.31  21.05  28.72  37.24

Fig. 9. Test-bed model Example 5.5. Mean response time / mean workload versus $N_1$, for $s = 10.0$, $r_1 = 1.0$, $r_2 = 1.0$, $N_2 = 1$. Legend as in Fig. 5.

$N_1$                                  2      4      6      8      10     12
$\bar c_2(\mathrm{roa})/\nu_2^{-1}$    2.49   4.00   5.53   7.06   8.61   10.17
$\bar s_2/\nu_2^{-1}$                  3.16   5.56   8.18   11.02  14.06  17.30

General observations
(a) The low priority roa is typically very poor, consistently and significantly underestimating mean response time. This failure of the roa is primarily due to the use of $\bar c_2(\mathrm{roa})$ as the mean low priority service time. Thus, as the tables associated with each example show, the correct effective low priority mean service time $\bar s_2$ is often hundreds of percent larger than $\bar c_2(\mathrm{roa})$. Moreover, the general behavior of $\bar s_2 - \bar c_2(\mathrm{roa})$ is entirely consistent with (and predictable from) the development in Section 3. Thus, any parameter change which increases either the fraction of Type (iii) low priority customers (e.g., by decreasing $N_2$ in Example 5.1) or the mean forward recurrence time of the high priority busy period (e.g., by increasing $N_1$ in Example 5.4) increases the difference between $\bar s_2$ and $\bar c_2$.
(b) The high priority roa is typically quite good. This is not surprising since incorrect 'interference' at I/O, due to the low priority customers there, is the only source of high priority roa error. Thus, when $N_2$ is small, this error must be small. When $N_2$ is large, the population of low priority customers at I/O (and therefore the potential for interference) is larger, but $\bar c_2(\mathrm{roa})$ is more nearly the correct low priority mean service time, and hence the resulting I/O interference is more nearly correct.
(c) The potential for improvement demonstrated by the m-roa is very significant. Thus, by using the correct effective low priority mean service time, the low priority mean response time approximation improves, often dramatically, and the high priority approximation improves simultaneously. In the next section we specifically show how to approximate $\bar s_2$ and hence how to tap this potential for improvement.
(d) The residual error in the low priority m-roa can largely be explained by two effects:
(i) Variability error: the random variable $s_2$ is generally nonexponential, and in fact often has a coefficient of variation much greater than 1. Failure to model this variability in the m-roa accounts for part of the residual error observed. Note that when $s$ is small, say smaller than 1, we expect this type of error to become more significant. It is important to note, however, that when $N_2 = 1$ (see Examples 5.4 and 5.5), this type of error is not present, since queuing at the 'equivalent' low priority server cannot occur.
(ii) Synchronization error: a low priority customer departing the CPU preemptive priority server (in any network depicted in Fig. 1(a)) will find all high priority customers in the I/O subnetwork. Both the roa and m-roa fail to capture this synchronization phenomenon and this helps to explain why the m-roa low priority mean response time underestimates the exact result in the test-bed network examples just presented (e.g., the roa underestimates the high priority I/O population). Note that all the m-roa errors in both Examples 5.4 and 5.5 are due to this synchronization effect (since $N_2 = 1$). Clearly, this effect is most noticeable when the I/O subnet consists of a single node (common to both high and low priority customers), as in our test-bed network.
Other (perhaps more subtle) sources of error may exist. For example, successive low priority effective service times may be dependent.

Additional comments
(a) The two node priority network analyzed by Morris [8] and referred to earlier can also be used to evaluate the m-roa. Thus, Table 1, drawn from [8], contrasts the roa with exact results for ten examples. We have appended the results obtained for the m-roa, and as is clear, the m-roa is uniformly and sometimes dramatically better than the roa, consistent with our test-bed examples.
Table 1ᵃ
Examples drawn from the Morris network [8, Table 1]

Case  N_1  N_2  ν_1   ν_2     ν_02   S_l    S_r    Technique  T_{2,l}     T_{2,r}   λ_2
1     1    10   1     0.1     0.1    10     10     exact      111         111       0.0450
                                                   m-roa      111         111       0.0453
                                                   roa        110         110       0.0455
2     3    5    1     0.03    0.03   33.3   33.3   exact      406         406       0.00616
                                                   m-roa      404         404       0.00619
                                                   roa        400         400       0.00625
3     3    5    1     0.03    1      33.3   1      exact      656         10.5      0.750(-2)
                                                   m-roa      658         10.0      0.750(-2)
                                                   roa        663         4.12      0.750(-2)
4     3    5    1     1       1      1      1      exact      18.0        18.0      0.138
                                                   m-roa      15.6        15.6      0.160
                                                   roa        12.0        12.0      0.208
5     3    5    1     100     100    0.01   0.01   exact      6.12        6.12      0.408
                                                   m-roa      3.72        3.72      0.672
                                                   roa        0.12        0.12      20.8
6     1    10   0.2   0.02    0.02   10     50     exact      2923        77.5      0.00333
                                                   m-roa      2924        76.2      0.00333
                                                   roa        2925        75.0      0.00333
7     3    5    0.2   0.0005  0.01   400    100    exact      0.156(7)    128       0.321(-5)
                                                   m-roa      0.156(7)    128       0.321(-5)
                                                   roa        0.156(7)    125       0.321(-5)
8     3    5    0.2   0.1     0.01   2      100    exact      7646        156       0.641(-3)
                                                   m-roa      7662        139       0.641(-3)
                                                   roa        7664        136       0.641(-3)
9     3    5    0.2   0.0005  10     400    0.1    exact      0.156(7)    3.56      0.321(-5)
                                                   m-roa      0.156(7)    3.56      0.321(-5)
                                                   roa        0.156(7)    0.125     0.321(-5)
10    3    5    0.2   1       10     0.2    0.1    exact      819         3.69      0.00608
                                                   m-roa      820         1.76      0.00608
                                                   roa        780         0.125     0.00641

a (x) denotes 10^x; l (r) denotes the left (right) preemptive node; ν_01 = 1 in all examples.
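The entries of Table 1 can be cross-checked against one another. The sketch below assumes, as the tabulated values themselves suggest, that a low priority customer's cycle in the two node network consists of one sojourn at each node, so that Little's law gives $\lambda_2 \approx N_2/(T_{2,l} + T_{2,r})$; it also reads entries such as 0.750(-2) with a negative exponent. The selection of rows and the helper name `little_throughput` are illustrative choices only.

```python
# (case, technique, N2, T_2l, T_2r, lambda_2) as read from Table 1
rows = [
    (1, "exact", 10, 111.0, 111.0, 0.0450),
    (3, "exact", 5, 656.0, 10.5, 0.750e-2),
    (5, "roa", 5, 0.12, 0.12, 20.8),
    (8, "m-roa", 5, 7662.0, 139.0, 0.641e-3),
]


def little_throughput(n2, t_left, t_right):
    """Low priority throughput implied by Little's law over one full cycle."""
    return n2 / (t_left + t_right)


rel_gaps = []
for case, tech, n2, t_left, t_right, lam in rows:
    implied = little_throughput(n2, t_left, t_right)
    rel_gaps.append(abs(implied - lam) / lam)
    print(f"case {case:2d} {tech:6s} tabulated {lam:.3e}  implied {implied:.3e}")
```

The implied and tabulated throughputs agree to within the three-digit rounding of the table for the rows shown.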
(b) There are several commercial software packages which approximately analyze the mean response time for networks such as that in Fig. 1. One of the more popular of these packages is 'Best/1', marketed by BGS Systems (Boston) as a tool for the Performance Prediction of Computer Systems. The 'Best/1' approximation is shown in Figs. 5-9 and often cannot be distinguished from the roa. It also shares the roa liability of being independent of the parameter $s = \nu_1/\nu_2$ (illustrated in Figs. 8 and 9). Thus, by appropriate choice of $s$, Best/1 can do very well (contrived in Fig. 9) or very poorly (in Fig. 8). Note that the m-roa tracks the changing exact mean low priority response time closely.
6. The effective service approximation

The modified roa, which employs the exact mean service position time, was used in Section 5 to demonstrate the significant potential for improvement. A natural way to exploit this potential is to appropriately approximate the mean service position time. In Appendix B we show that a simple but useful approximation for the mean spt can be developed by focusing on the classical preemptive resume model. The resulting approximation is given by

$$\bar{s}_2^a = \left\{ \frac{s(1-\rho_1)+\rho}{s(1-\rho_1)^2+\rho_2} \right\} \nu_2^{-1}, \tag{9}$$

where $\rho = \rho_1 + \rho_2$, and our improved approximate analysis technique which employs $\bar{s}_2^a$ will be referred to as the effective service approximation (esa). As an aside, note that (9) implies that $p(n_1 > 0 / n_2 > 0)$ is approximated by

$$p^a(n_1 > 0 / n_2 > 0) = \left\{ \frac{s(1-\rho_1)+1}{s(1-\rho_1)+\rho} \right\} \rho_1, \tag{10}$$
which indeed has the behavior expected from the discussion in Section 3 (e.g., limits $\rho_1$ and 1 as $\nu_1^{-1} \to 0$ and $\nu_2^{-1} \to 0$, respectively). Since $\bar{s}_2^a$ requires both the high and low priority CPU utilizations ($\rho_1$ and $\rho_2$), which are typically a priori unknown, we proceed as in the roa to estimate them using an idea due to Sevcik [3]. Thus if $\hat\rho_1(x_1, x_2)$ is the high priority CPU utilization of the esa and $\hat\rho_2(x_1, x_2)$ is the utilization of the 'equivalent' low priority server in the esa, when $x_1$ and $x_2$ are used to estimate $\rho_1$ and $\rho_2$, respectively, we estimate $\rho_i$ by $\alpha_i$, where $(\alpha_1, \alpha_2)$ is the fixed point

$$(\alpha_1, \alpha_2) = \bigl(\hat\rho_1(\alpha_1, \alpha_2),\ \phi_2(\alpha_1, \alpha_2)\bigr) \tag{11a}$$

and

$$\phi_2(\alpha_1, \alpha_2) = \frac{\hat\rho_2(\alpha_1, \alpha_2)}{\bar{s}_2^a(\alpha_1, \alpha_2)/\nu_2^{-1}}. \tag{11b}$$

In the network models we considered (the test-bed, Morris and Sevcik models) the sequence

$$(x_1^n, x_2^n) = \bigl(\hat\rho_1(x_1^{n-1}, x_2^{n-1}),\ \phi_2(x_1^{n-1}, x_2^{n-1})\bigr), \quad n = 1, 2, \ldots,$$

converged very rapidly to the fixed point, regardless of the initial estimate $(x_1^0, x_2^0)$.

The esa high and low mean (normalized) response times have been calculated for the test-bed examples of Section 5 and are shown in Figs. 5-9. The esa 'falls short' of the m-roa, although it is uniformly and often dramatically better than the roa. The primary reason for not achieving the full potential of the m-roa appears to be the error induced in $\bar{s}_2^a$ by using the fixed point equation (11a) to obtain the CPU utilization estimates $\hat\rho_1$ and $\hat\rho_2$. Evidence of this is given in Table 2, which contrasts $\bar{s}_2^a(\rho_1, \rho_2)$ with $\bar{s}_2^a(\hat\rho_1, \hat\rho_2)$.
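The formulas of this section can be sketched in a few lines of Python. The sketch below assumes $\rho = \rho_1 + \rho_2$ and $s = \nu_1/\nu_2$, and it evaluates equations (9) and (10) directly. To show the successive substitution of (11a)-(11b) it also needs some network to supply the utilizations, so it assumes a deliberately simple toy network that is not the paper's test-bed: each class cycles between a think stage and the CPU, and each class is solved separately by single-class mean value analysis. All function and parameter names (`spt_esa`, `esa_step`, `inv_nu1`, and so on) are hypothetical.

```python
def spt_esa(rho1, rho2, s, inv_nu2=1.0):
    """Equation (9): approximate mean service position time of a low priority job.

    s is the ratio nu1/nu2 and inv_nu2 is the low priority mean service time.
    """
    rho = rho1 + rho2
    return (s * (1.0 - rho1) + rho) / (s * (1.0 - rho1) ** 2 + rho2) * inv_nu2


def cond_prob_esa(rho1, rho2, s):
    """Equation (10): approximation to p(n1 > 0 / n2 > 0)."""
    rho = rho1 + rho2
    return (s * (1.0 - rho1) + 1.0) / (s * (1.0 - rho1) + rho) * rho1


def closed_throughput(n_jobs, think_time, service_time):
    """Single-class mean value analysis: n_jobs cycling between a pure delay
    (think_time) and one exponential queueing server (service_time)."""
    queue_len, throughput = 0.0, 0.0
    for n in range(1, n_jobs + 1):
        response = service_time * (1.0 + queue_len)
        throughput = n / (think_time + response)
        queue_len = throughput * response
    return throughput


def esa_step(x, n1, n2, tau1, tau2, inv_nu1, inv_nu2):
    """One application of the map in (11a)-(11b) for the assumed toy network."""
    x1, x2 = x
    s = inv_nu2 / inv_nu1                      # s = nu1 / nu2
    rho1_hat = closed_throughput(n1, tau1, inv_nu1) * inv_nu1
    s2a = spt_esa(x1, x2, s, inv_nu2)          # 'equivalent' low priority service time
    rho2_hat = closed_throughput(n2, tau2, s2a) * s2a
    phi2 = rho2_hat / (s2a / inv_nu2)          # rescale to real CPU utilization, (11b)
    return rho1_hat, phi2


def esa_fixed_point(n1, n2, tau1, tau2, inv_nu1, inv_nu2,
                    x0=(0.5, 0.5), tol=1e-10, max_iter=10000):
    """Successive substitution x^n = step(x^(n-1)) until the change is below tol."""
    x = x0
    for _ in range(max_iter):
        x_next = esa_step(x, n1, n2, tau1, tau2, inv_nu1, inv_nu2)
        if max(abs(a - b) for a, b in zip(x_next, x)) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed point iteration did not converge")


if __name__ == "__main__":
    rho1, rho2 = esa_fixed_point(n1=3, n2=3, tau1=1.5, tau2=0.2,
                                 inv_nu1=1.0, inv_nu2=0.1)
    print(rho1, rho2, spt_esa(rho1, rho2, s=0.1, inv_nu2=0.1))
```

Two consistency checks are cheap to run on `spt_esa`: it should equal $\nu_2^{-1}/(1 - p^a(n_1 > 0 / n_2 > 0))$ with $p^a$ from (10), and it should agree with the Appendix B route of substituting the M/G/1 forward recurrence time into the expression for $\bar{s}_2(h)$. In this toy network the high priority utilization does not depend on the estimates, so only the second component of the map actually iterates.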
7. Generalizations and concluding remarks

As might be expected, our results easily generalize to more than two classes. Thus consider the CPU in the model shown in Fig. 1(a) to consist of $K > 2$ priority classes with class $i$ having preemptive priority over classes $i+1, \ldots, K$. Let $\bar{s}_i$ and $\nu_i^{-1}$ denote the mean service position time and the mean service time, respectively, for a class $i$ customer. The following result generalizes Theorem 3.1.

Theorem 7.1. The mean service position time for a class $i$ customer, $i = 2, \ldots, K$, at a $K$ class preemptive priority single server facility in equilibrium, is given by

$$\bar{s}_i = \frac{\nu_i^{-1}}{1 - p(m_i > 0 / n_i > 0)}, \tag{12}$$

where

$$m_i = n_1 + \cdots + n_{i-1}, \quad i = 2, \ldots, K,$$

and as before $n_j$ = # of class $j$ customers at the facility (in queue + in service) at a random point in time.

Proof. The proof is a trivial generalization of that of Theorem 3.1. Thus note that, for $i = 2, \ldots, K$,

$$\lambda_i = p(n_1 = 0, \ldots, n_{i-1} = 0, n_i > 0)\,\nu_i$$

may be rewritten as

$$\lambda_i = p(m_i = 0, n_i > 0)\,\nu_i,$$

since $n_1 = n_2 = \cdots = n_{i-1} = 0$ if and only if $m_i = 0$, and hence our result follows from Little's Law $p(n_i > 0) = \lambda_i \bar{s}_i$.
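Table 2
Contrasting $\bar{s}_2$ and $\bar{s}_2^a$ for test-bed Example 5.3 ($N_1 = 3$, $\tau_1 = 1.5$, $\tau_2 = 0.2$, $s = 0.1$)

$N_2$   $\bar{s}_2/\nu_2^{-1}$   $\bar{s}_2^a/\nu_2^{-1}$   $\bar{s}_2^a(\hat\rho_1, \hat\rho_2)/\nu_2^{-1}$   $\bar{c}_2(\mathrm{roa})/\nu_2^{-1}$
1       42.06                    42.46                      38.51                                              5.83
2       20.82                    21.08                      14.73                                              4.48
3       13.47                    13.57                      8.80                                               3.61
4       9.75                     9.77                       6.23                                               3.01
5       7.51                     7.51                       4.88                                               2.58
6       6.05                     6.03                       4.08                                               2.27

In order to generalize the effective service approximation (esa) to this generalized model, we must appropriately approximate $\bar{s}_i$, $i = 2, \ldots, K$. Of course, the results in Section 6 and Appendix B easily generalize and we obtain

$$\bar{s}_i^a = \left\{ \frac{\rho(i)[1-\rho(i)] + \alpha(i)[\rho(i)+\rho_i]}{\rho(i)[1-\rho(i)]^2 + \alpha(i)\rho_i} \right\} \nu_i^{-1}, \quad i = 2, \ldots, K,$$

where

$$\alpha(i) = \sum_{j=1}^{i-1} (\rho_j/s_{ij}), \quad s_{ij} = \nu_j/\nu_i,$$

and $\rho(i) = \rho_1 + \rho_2 + \cdots + \rho_{i-1}$.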
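The effective service approximation uses $\bar{s}_i^a$, $i = 2, \ldots, K$, to approximate $\bar{s}_i$, $i = 2, \ldots, K$, and proceeds as outlined in Section 6 to estimate $(\rho_1, \ldots, \rho_K)$ via the fixed point $\alpha$ defined by

$$\alpha = \bigl(\hat\rho_1(\alpha),\ \phi_2(\alpha),\ \ldots,\ \phi_K(\alpha)\bigr),$$

where

$$\phi_i(\alpha) = \frac{\hat\rho_i(\alpha)}{\bar{s}_i^a/\nu_i^{-1}}, \quad i = 2, \ldots, K.$$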
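A short Python sketch of the $K$ class approximation follows. It assumes the form obtained by repeating the Appendix B argument with the aggregate of all higher priority classes, namely $\bar{s}_i^a = \nu_i^{-1}\{\rho(i)[1-\rho(i)] + \alpha(i)[\rho(i)+\rho_i]\} / \{\rho(i)[1-\rho(i)]^2 + \alpha(i)\rho_i\}$ with $s_{ij} = \nu_j/\nu_i$; the function name and the list-based argument layout are hypothetical.

```python
def spt_esa_multiclass(i, rho, nu):
    """Approximate mean service position time of class i (1-based, i >= 2).

    rho[j-1] and nu[j-1] hold the utilization and service rate of class j.
    """
    if i < 2 or i > len(rho):
        raise ValueError("class index i must satisfy 2 <= i <= K")
    higher = range(i - 1)                       # indices of classes 1 .. i-1
    rho_hi = sum(rho[j] for j in higher)        # rho(i) = rho_1 + ... + rho_{i-1}
    # alpha(i) = sum_j rho_j / s_ij with s_ij = nu_j / nu_i
    alpha = sum(rho[j] * nu[i - 1] / nu[j] for j in higher)
    rho_i = rho[i - 1]
    numerator = rho_hi * (1.0 - rho_hi) + alpha * (rho_hi + rho_i)
    denominator = rho_hi * (1.0 - rho_hi) ** 2 + alpha * rho_i
    return numerator / denominator / nu[i - 1]


def spt_esa_two_class(rho1, rho2, s, inv_nu2=1.0):
    """Two class expression, equation (9), used here only as a cross-check."""
    rho = rho1 + rho2
    return (s * (1.0 - rho1) + rho) / (s * (1.0 - rho1) ** 2 + rho2) * inv_nu2


if __name__ == "__main__":
    # K = 2 with s = nu1/nu2 = 0.1 should reproduce equation (9).
    print(spt_esa_multiclass(2, [0.4, 0.3], [0.1, 1.0]))
    print(spt_esa_two_class(0.4, 0.3, 0.1))
```

With $K = 2$ the expression collapses to equation (9) after multiplying numerator and denominator by $s/\rho_1$. Splitting one high priority class into two classes with the same service rate should also leave the class 3 value unchanged, which gives a second sanity check.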
The essential idea embedded in the roa is to split a single server priority facility into a many server nonprioritized facility. Our contribution has been to both capture and fold into this framework the effective mean service time that each customer type experiences. Obviously, these ideas can be extended to situations other than the preemptive priority case. An obvious and important extension is to the class of networks shown in Fig. 1(a) in which the CPU is nonpreemptive. The fact that a class $i$ customer may wait for a class $j > i$ customer to complete before beginning service obviously introduces new difficulties. However, as shown in Appendix C, our basic characterization result (Theorem 7.1) can be extended in a straightforward manner.

The only closely related work we are aware of appeared shortly after we presented an abbreviated conference version of this material [13]. In [14], Schmitt proposes a modification of the roa which utilizes a low priority state dependent service rate of the form $\mu(k) = 1 - p(n_1 > 0 / n_2 = k)$, $k = 1, \ldots, N_2$. Schmitt, in effect, suggests an interesting way of capturing some of the variability in $s_2$ without sacrificing the essential product form solution. Questions about performance, theoretical and practical, are touched upon in [14] but remain to be fully explored.

Appendix A. The recursive algorithm for the test-bed model
The algorithm is best explained by referring to the state transition diagram in Fig. 4(b). Note that if state $(n_1, n_2) = (N_1, 0)$ is assumed known, we can recursively generate states $(N_1 - 1, 0), (N_1 - 2, 0), \ldots, (0, 0)$. This is possible because the preemptive priority CPU rules out transitions of the type $(n_1, n_2) \to (n_1, n_2 - 1)$ for $n_1 > 0$. Also note that state $(0, 1)$ is known (since states $(0, 0)$ and $(1, 0)$ are known).

We have now apparently reached an impasse: we would like to recursively generate states $(n_1, 1)$, $n_1 = 1, \ldots, N_1$, but this is not possible because states $(0, 2)$ and $(1, 1)$ are unknown (either one would do). However, there is a solution: assume state $(1, 1)$ is known (say $p(1, 1) = \alpha$) and recursively generate $p_\alpha(n_1, 1)$, $n_1 = 2, \ldots, N_1$, where the subscript $\alpha$ denotes dependency on the assumed value. But note that in so generating these values $\{p_\alpha(n_1, 1)\}$ we have not used global balance at state $(N_1, 1)$, which provides a constraint equation relating $p_\alpha(N_1 - 1, 1)$ and $p_\alpha(N_1, 1)$. Finally, note that $p_\alpha(N_1 - 1, 1)$ and $p_\alpha(N_1, 1)$ are linear in $\alpha$ with coefficients $a(\cdot, 1)$, $b(\cdot, 1)$ which are independent of $\alpha$:

$$p_\alpha(n_1, 1) = a(n_1, 1)\,\alpha + b(n_1, 1), \quad n_1 = N_1 - 1,\ N_1.$$

Therefore, assuming $\alpha = 0$ and recursively generating $\{p_\alpha(n_1, 1)\}$ determines $b(\cdot, 1)$, and repeating this a second time with $\alpha = 1$ (for example) determines $a(\cdot, 1)$. Finally, with the four coefficients $(a(n_1, 1), b(n_1, 1))$, $n_1 = N_1 - 1, N_1$, determined, the constraint equation relating $p_\alpha(N_1 - 1, 1)$ and $p_\alpha(N_1, 1)$ determines $\alpha$ (i.e., $p(1, 1)$) uniquely. Thus, with $p(1, 1)$ known, we recursively generate $p(n_1, 1)$, $n_1 = 2, \ldots, N_1$. But now $p(0, 2)$ is known and the same process is repeated to generate $p(n_1, 2)$, $n_1 = 1, \ldots, N_1$, etc. The process completes with all states determined in terms of $p(N_1, 0)$, which in turn is determined by the normalization constraint

$$\sum_{n_1 = 0}^{N_1} \sum_{n_2 = 0}^{N_2} p(n_1, n_2) = 1.$$
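The two-probe step can be illustrated on a much simpler chain than the test-bed model. The Python sketch below is a toy under stated assumptions: a finite birth-death chain with birth rate `lam` and death rate `mu` (hypothetical names), where $p(0)$ is fixed at 1 before normalization and $p(1) = \alpha$ is treated as unknown. The interior balance equations generate the remaining states, the unused balance equation at the last state plays the role of the constraint, and its residual is affine in $\alpha$, so probes at $\alpha = 0$ and $\alpha = 1$ determine it.

```python
def solve_by_two_probes(lam, mu, n_max):
    """Stationary distribution of a birth-death chain on 0..n_max (n_max >= 1),
    found by probing the unknown start value alpha = p(1)/p(0) twice."""

    def generate(alpha):
        p = [1.0, alpha]                        # unnormalized: p(0) = 1, p(1) assumed
        for n in range(1, n_max):
            # global balance at interior state n, solved for p(n + 1)
            p.append(((lam + mu) * p[n] - lam * p[n - 1]) / mu)
        return p

    def residual(p):
        # global balance at state n_max, which generate() never used
        return mu * p[n_max] - lam * p[n_max - 1]

    b = residual(generate(0.0))                 # residual(alpha) = a * alpha + b
    a = residual(generate(1.0)) - b
    alpha = -b / a                              # value that satisfies the constraint
    p = generate(alpha)
    total = sum(p)                              # normalization constraint
    return [value / total for value in p]


if __name__ == "__main__":
    print(solve_by_two_probes(lam=1.0, mu=2.0, n_max=5))
```

For this chain the answer is known in closed form (a truncated geometric distribution with ratio `lam / mu`), which makes the sketch easy to check. The test-bed recursion applies the same idea row by row, with one assumed value per level $n_2$.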
We note as an aside that this recursive scheme applies directly to two other previously studied priority network models which have state transition diagrams identical to that of the test-bed model (although some transition rates are different). The first is a priority queuing network model discussed by Chow and Yu [6], whose analysis boils down to analyzing a preemptive priority node with state dependent arrival rates and exponential service times. The second is a model considered by Sevcik [9], and shown in Fig. 10.

Appendix B. Approximating the mean service position time
A simple, useful and reasonably accurate approximation for the mean service position time $\bar{s}_2$ (equation (7)) can be developed by focusing on the classical preemptive resume model. The resulting approximation, developed below, is given by

$$\bar{s}_2^a = \left\{ \frac{s(1-\rho_1)+\rho}{s(1-\rho_1)^2+\rho_2} \right\} \nu_2^{-1}, \tag{B.1}$$

and our improved approximate analysis technique which employs $\bar{s}_2^a$ will be referred to as the effective service approximation (esa). If we denote the mean completion time for the classical preemptive resume model by $\bar{c}_2(h)$, we expect (in light of the developments in Section 3) that

(a) $\bar{s}_2^a > \bar{c}_2(h)$,
(b) $\lim_{s \to \infty} \bar{s}_2^a = \bar{c}_2(h)$.

Thus (a) immediately follows from rewriting (9) as follows:

$$\bar{s}_2^a = \left\{ \frac{s(1-\rho_1)+\rho}{s(1-\rho_1)+\rho_2/(1-\rho_1)} \right\} \frac{\nu_2^{-1}}{1-\rho_1},$$

and noting that $\rho_2/(1-\rho_1) < \rho_1 + \rho_2$. In (b), $s$ may diverge to $\infty$ in any fashion which maintains $\rho_1 + \rho_2 < 1$.

Fig. 10. The Sevcik model (node labels: $N_1, \tau_1$ and $N_2, \tau_2$).

In the following development $\bar{s}_2(h)$ refers to the mean spt for the classical preemptive priority model with exponentially distributed service times and independent Poisson arrival processes. $\bar{s}_2^a$ (equation (B.1)) is actually an approximation to $\bar{s}_2(h)$, as we make clear below. Consistent with earlier sections, high and low priority customer quantities are subscripted 1 and 2, respectively. From Section 3 (equations (7a) and (7c)) we have

$$\lambda_2 \bar{s}_2(h) = p(n_2 > 0), \tag{B.2}$$

where $\lambda_2$ is the given (exogenous) low priority customer arrival rate. The classical preemptive priority model has been analyzed, and in particular $p(n_2 > 0)$ can be easily obtained from Jaiswal's results [10]. Unfortunately, the exact expression for $\bar{s}_2(h)$ so obtained is complex and hence not suitable for our purposes. With this as background, we proceed to 'derive' our approximate expression $\bar{s}_2^a$. Eq. (B.2) can be rewritten as

$$\lambda_2 \bar{s}_2(h) = \rho_2 + p(n_1 > 0, n_2 > 0), \tag{B.3}$$

since, for the preemptive discipline, $\rho_2 = p(n_1 = 0, n_2 > 0)$. Using the fact that the low priority arrival process is Poisson, we also have

$$\bar{s}_2(h) = \bar{c}_2(h) + p(n_1 > 0, n_2 = 0)\,\bar{b}_f, \tag{B.4}$$

where $\bar{b}_f$ = mean forward recurrence time of the high priority busy period as seen by Type (iii) low priority customers. But $\rho_1 = p(n_1 > 0, n_2 = 0) + p(n_1 > 0, n_2 > 0)$, so we can rewrite (B.4) as

$$\bar{s}_2(h) = \bar{c}_2(h) + \bigl(\rho_1 - p(n_1 > 0, n_2 > 0)\bigr)\,\bar{b}_f \tag{B.5}$$

and therefore solving (B.3) and (B.5) yields

$$\bar{s}_2(h) = \frac{\bar{c}_2(h) + \rho\,\bar{b}_f}{1 + \lambda_2 \bar{b}_f}. \tag{B.6}$$

Note that up to this point, no approximations have been made. At this point, however, we approximate $\bar{b}_f$ by the mean forward recurrence time of an ordinary M/G/1 busy period:

$$\bar{b}_f \approx \frac{\nu_1^{-1}}{(1-\rho_1)^2}. \tag{B.7}$$

Using (B.7) together with $\bar{c}_2(h) = \nu_2^{-1}/(1-\rho_1)$ in (B.6) yields (B.1). Our experience indicates that $\bar{s}_2^a$ is accurate over wide parameter ranges for all models (the test-bed model, the Morris [8] model, the Sevcik model [9] and the classical preemptive resume model) for which exact results are available. An example of the accuracy of $\bar{s}_2^a$ is shown in Table 2, where the test-bed network discussed in Example 5.3 is used.

We conclude Appendix B with a few remarks about the classical preemptive resume model, the context in which $\bar{s}_2^a$ was derived.

(i) Using $\bar{s}_2^a$ in lieu of $\bar{c}_2(\mathrm{roa})$ in the classical
preemptive resume model yields

$$T_2^a = \frac{\bar{s}_2^a}{1 - \lambda_2 \bar{s}_2^a}, \tag{B.8}$$

which can be written in the form

$$T_2^a = \frac{\{(1-\rho) + \rho/s + \rho_2\}\,\nu_2^{-1}}{(1-\rho)(1-\rho_1)\left(1 + \dfrac{\rho_2}{s(1-\rho_1)}\right)}. \tag{B.9}$$

Comparison with the exact result (equation (1)) shows that this approximation is excellent for $\rho_2 \ll s(1-\rho_1)$.

(ii) The fact that (B.8) does not yield the exact result, even if $\bar{s}_2(h)$ is used, is not surprising. Variability in $s_2$ always affects the mean response time here, unlike the closed network context. If, however, we ask for that 'equivalent' mean service time, say $\bar{s}_2^e$, which when used in (B.8) yields the exact mean response time, we obtain

$$\bar{s}_2^e = \left\{ \frac{s(1-\rho_1)+\rho_1}{s(1-\rho_1)^2+\rho_1\rho_2} \right\} \nu_2^{-1},$$

which is similar to (but not surprisingly) strictly greater than $\bar{s}_2^a$.

Appendix C. A mean service position time characterization for non-preemptive priority scheduling

As in the preemptive priority structure considered earlier, a class $i$ customer who 'believes' there is a dedicated server for his class 'perceives' this (possibly fictitious) server to have a mean service time $\bar{s}_i$ given by

$$\bar{s}_i = p(n_i > 0)/\lambda_i. \tag{C.1}$$

Because a class $j$ customer can be in service when $n_i > 0$, $i < j$, a characterization for $\bar{s}_i$ must involve a state variable $x$, where $x$ = class of customer currently in service. Indeed, if we let $n_i^* = \sum_{j \neq i} n_j$, then

$$\lambda_i = \bigl[\,p(n_i > 0, n_i^* = 0) + p(n_i > 0, n_i^* > 0, x = i)\,\bigr]\,\nu_i, \tag{C.2}$$

which may be rewritten as

$$\lambda_i = \bigl[\,p(n_i > 0) - p(n_i > 0, n_i^* > 0, x \neq i)\,\bigr]\,\nu_i. \tag{C.3}$$

Combining (C.1) and (C.3) yields the following theorem.

Theorem C.1. The mean service position time for a class $i$ customer, $i = 1, \ldots, K$, at a $K$ class nonpreemptive priority single server facility in equilibrium, is given by

$$\bar{s}_i = \frac{\nu_i^{-1}}{1 - p(n_i^* > 0, x \neq i / n_i > 0)}. \tag{C.4}$$

Remark. It is obvious that (C.1) and (C.2) hold for any priority structure: preemptive, nonpreemptive or mixed (some classes preemptive, some nonpreemptive). Therefore, Theorem C.1 holds for an arbitrary priority structure, and in particular reduces to Theorem 7.1 in the case of a preemptive structure. It is easily seen that, for the preemptive structure,

$$p(n_i^* > 0, x \neq i / n_i > 0) = p(m_i > 0 / n_i > 0), \quad i = 2, \ldots, K,$$

where $m_i = n_1 + \cdots + n_{i-1}$, and

$$p(n_1^* > 0, x \neq 1 / n_1 > 0) = 0.$$

Acknowledgment

The author thanks B.T. Doshi, H. Heffes and R.J.T. Morris for their helpful comments during the course of this study.
References

[1] F. Baskett, K.M. Chandy, R.R. Muntz and F.G. Palacios, Open, closed, and mixed networks of queues with different classes of customers, J. ACM 22(2) (1975) 248-260.
[2] M. Reiser, Interactive modeling of computer systems, IBM Systems J. 4 (1976) 309-327.
[3] K.C. Sevcik, Priority scheduling disciplines in queuing network models of computer systems, in: B. Gilchrist, ed., Information Processing 77 (North-Holland, Amsterdam, 1977).
[4] M. Reiser, A queuing network analysis of computer communication networks with window flow control, IEEE Trans. Commun. COM-27(8) (1979) 1199-1209.
[5] H. Seaman, Modeling considerations for predicting performance of CICS/VS systems, IBM Systems J. 1 (1980) 68-80.
[6] W. Chow and P.S. Yu, An approximation technique for central server queuing models with a priority dispatching rule, IBM Watson Research Center Res. Rept. RC 8163, 1980; Performance Evaluation 3(1) (1983) 55-62.
[7] C.B. Dowling and A. Uspenski, Approximation method for closed loop model with priority discipline at CPU, 13th Asilomar Conf. on Circuits, Systems and Computers, 1980.
[8] R.J.T. Morris, Priority queuing networks, Bell System Tech. J. 60(8) (1981).
[9] K.C. Sevcik, Personal communication to R.J.T. Morris.
[10] N.K. Jaiswal, Priority Queues (Academic Press, New York, 1968).
[11] O. Herzog, L. Woo and K.M. Chandy, Solution of queuing problems by a recursive technique, IBM J. Res. Develop. (1975).
[12] W.-M. Chow, M/M/1 priority queues with state dependent arrival and service rates, IBM Thomas J. Watson Res. Center Rept. RC 8570, 1980.
[13] J.S. Kaufman, Approximate analysis of priority scheduling disciplines in queueing network models of computer systems, 6th Internat. Conf. on Computer Communication, London, 1982.
[14] W. Schmitt, Approximate analysis of Markovian queuing networks with priorities, 10th Internat. Teletraffic Conf., Montreal, 1983.