ELSEVIER
Operations Research Letters 18 (1996) 237-245
Observing general service queues before joining ¹
D.A. Stanford a, M. Hlynka b,*
a Department of Statistics and Actuarial Science, University of Western Ontario, London, Ontario, Canada N6A 3K7
b Department of Mathematics and Statistics, University of Windsor, Windsor, Ontario, Canada N9B 3P4
Received 1 November 1993; revised 1 October 1995
Abstract
A "smart" customer S uses a strategy of waiting and observing two parallel queues before joining. We analyze the system time of S for two distinct strategies. One result generalizes a previously known result.
Keywords: Queueing; Join the shortest queue model; Parallel queues; Joining strategies; Hyperexponential distribution
1. Introduction
Many authors have studied queues in parallel. Haight [3] was the first to consider the problem. He was followed by Kingman [6] and others. Recent work on the subject includes Adan et al. [1], Kao and Lin [5] and Grassmann and Zhao [2]. In this paper we consider joining strategies for two queues in parallel, and establish some of their properties.

Hlynka et al. [4] considered the situation of "observing queues before joining" a $G/((M_1 + M_2)/1)^2$ system. That is, customers arrive according to a general arrival process to a system of two parallel queues, each with an exponential server. In that paper, various strategies for a "smart" customer were considered. By assuming exponential servers, the analysis of the observer's system delay was greatly simplified.

In this paper, we reverse our previous assumptions and initially consider an $M/((M + G)/1)^2$ system. It can often be legitimately argued that an arrival process is well approximated by a Poisson process, since the arrivals come independently from different sources. It is much more difficult to argue that the service times should be exponentially distributed. By allowing one of the two servers to have a general service time distribution, we make our model much more realistic than the $G/(M/1)^2$ model. In the latter part of the paper, we generalize the results to an $M/((H_2 + G)/1)^2$ system.

We assume that there is exactly one smart customer, denoted by S. All other customers follow the standard "join the shortest queue" (JSQ) strategy, in which customers select the shorter of the two queues, and select randomly if both queues have the same length. We assume S uses an "observe before joining" (OBJ) strategy, to be described shortly, which is a slight modification of the JSQ strategy. Nonetheless, it can result in significantly lower system times for S under certain circumstances.

The original work was motivated by an actual example of traffic congestion. One of the authors encountered a traffic jam caused by a construction delay. There were two very long lanes (several miles in length) of traffic moving very slowly. The author immediately picked one of the lanes and was soon surrounded by cars alongside and behind, preventing any change of lane. It soon became apparent that although both lines were moving very slowly, one of the lines was in fact moving at almost three times the rate of the other. By pulling off the road and waiting some distance from the end of the parallel queues before joining, he could have observed which lane moved forward first, and joined that same lane. By so doing, he would have selected the faster line with probability 0.75. The cost would have been that other cars would have joined the system and got ahead.

Not only can it pay to wait when the two queues are very long, it can even pay to wait if the system is initially idle, provided the service rates are sufficiently different. This is the principal case that we address in this paper.

The OBJ strategy for S is as follows. If S arrives and finds more than one customer in the system, it uses the JSQ strategy. If the system is empty, then S waits until another customer or customers arrive and exactly one completes service. If S finds one customer in the system, then it likewise waits for a service completion. In both of these cases, S then selects the queue in which the service completion occurred.

In what follows, we will determine S's average delay assuming that the system is initially idle. We later compare the OBJ and JSQ strategies numerically for this initially idle case. We observe that whenever it pays to wait under these circumstances, it would also pay to wait if there were a single customer in the system, because S would not need to wait for the next arrival to occur, and part of that customer's service time would have already elapsed.

We initially assume that the interarrival times are exponentially distributed, that Server 1 has exponentially distributed service times with rate $\mu_1$, and that Server 2 has generally distributed service times.

* Corresponding author.
¹ This research is supported by grants to both authors from the Natural Sciences and Engineering Research Council of Canada (NSERC).
0167-6377/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. SSDI 0167-6377(95)00056-9
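The probability of 0.75 quoted in the traffic example can be illustrated with a short sketch. It assumes, as a simplification that is not stated in the text, that the times to the next forward movement in the two lanes are independent and exponentially distributed with rates in the ratio 3:1. The function name is hypothetical.

```python
def prob_fast_lane_moves_first(fast_rate: float, slow_rate: float) -> float:
    """Probability that the first observed movement is in the fast lane.

    Assumption (not from the paper): the times to the next movement in the
    two lanes are independent exponentials, so this is the usual result for
    a race between two exponential clocks.
    """
    return fast_rate / (fast_rate + slow_rate)


# One lane moves at three times the rate of the other.
print(prob_fast_lane_moves_first(3.0, 1.0))
```

Under this assumption the observer picks the faster lane three times out of four, which is the figure given above.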
2. Analysis

Let $C_1$ ($C_2$) denote the first (second) customer to arrive after S. We consider two cases. In Case 1, $C_1$ selects the exponential server. In Case 2, $C_1$ selects the general server.

Notation. Let TCS be the time to completion of the smart customer. Let $T$ be the interarrival time random variable. Assume that $T$ is exponentially distributed with rate $\lambda$. Let $E(Q \mid i)$ be the expected delay if $C_1$ selects server $i$, measured from the start of $C_1$'s service until S completes service. Let $E(R \mid i)$ be the expected time from the start of $C_1$'s service at server $i$ until the next occurrence (arrival or service completion). Let $A_1$ be the event that the next occurrence is a service completion. Let $Y$ be the time measured from the instant that both servers become busy until S completes service. Let $E(Y \mid 1)$ be the expected value of $Y$ given that $C_1$ selects the exponential server. Let $X_1$ denote an exponential service time. For Case 1, let $X_1'$ denote $C_1$'s residual service time (if any) from the moment that $C_2$ arrives and selects the general server. Let $X_2$ denote a general service time. Let $\Phi_{X_2}(s)$ be the Laplace-Stieltjes transform of the service time distribution for the general server. Let $P(A_1 \mid i)$ be the probability that the next occurrence is a service completion given that the first customer chooses server $i$, and measured from that point in time.

Clearly, since $C_1$ is equally likely to choose either server, we have the following:

Theorem.

$$E(\mathrm{TCS}) = E(T) + \tfrac{1}{2}\bigl(E(Q \mid 1) + E(Q \mid 2)\bigr). \tag{1}$$

We now proceed to determine all of the quantities in $E(\mathrm{TCS})$ in the following lemmas. We consider the two cases in turn.

Case 1: $C_1$ selects the exponential server

Lemma 1.

(a)
$$E(Y \mid 1) = \bigl(1 - \Phi_{X_2}(\mu_1)\bigr) A + E(X_2)\,\Phi_{X_2}(\mu_1) - B\,\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\mu_1}, \tag{2}$$

where

$$A = \frac{2}{\mu_1} + \frac{\lambda}{2\mu_1^2} \quad\text{and}\quad B = \frac{\lambda}{2}\left(E(X_2) - \frac{1}{\mu_1}\right).$$

(b)
$$E(Q \mid 1) = \frac{2}{\lambda + \mu_1} + \frac{\lambda}{\lambda + \mu_1}\,E(Y \mid 1). \tag{3}$$

Proof. (a) $E(Y \mid 1) = E(Y \cdot I(X_1' < X_2) \mid 1) + E(Y \cdot I(X_1' \ge X_2) \mid 1)$, where $I(W)$ is an indicator variable with $I(W) = 1$ if event $W$ occurs, and $I(W) = 0$ otherwise. From the moment that both servers become busy, S's remaining delay consists of the observation period, the delay due to customers that get ahead of S, and S's own service time. The expected number of customers who get ahead of S is $\lambda x/2$, where $x$ is the duration of the observation period. Thus

$$E(Y \cdot I(X_1' < X_2) \mid 1) = \int_{t=0}^{\infty}\int_{\tau=0}^{t}\left(\tau + \left(\frac{\lambda\tau}{2} + 1\right)\frac{1}{\mu_1}\right)\mu_1 e^{-\mu_1\tau}\,d\tau\,dP(X_2 \le t)$$

and

$$E(Y \cdot I(X_1' \ge X_2) \mid 1) = \int_{t=0}^{\infty} e^{-\mu_1 t}\left(t + \left(\frac{\lambda t}{2} + 1\right)E(X_2)\right)dP(X_2 \le t).$$

Next we simplify the two conditional expectations above.

$$E(Y \cdot I(X_1' < X_2) \mid 1) = \left(1 + \frac{\lambda}{2\mu_1}\right)\frac{1}{\mu_1}\int_{t=0}^{\infty}\bigl(1 - (1 + \mu_1 t)e^{-\mu_1 t}\bigr)\,dP(X_2 \le t) + \frac{1}{\mu_1}\int_{t=0}^{\infty}\bigl(1 - e^{-\mu_1 t}\bigr)\,dP(X_2 \le t)$$

$$= \frac{1}{\mu_1}\left(1 + \frac{\lambda}{2\mu_1}\right)\left(1 - \Phi_{X_2}(\mu_1) + \mu_1\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\mu_1}\right) + \frac{1}{\mu_1}\bigl(1 - \Phi_{X_2}(\mu_1)\bigr),$$

$$E(Y \cdot I(X_1' \ge X_2) \mid 1) = \left(1 + \frac{\lambda E(X_2)}{2}\right)\int_{t=0}^{\infty} t e^{-\mu_1 t}\,dP(X_2 \le t) + E(X_2)\int_{t=0}^{\infty} e^{-\mu_1 t}\,dP(X_2 \le t)$$

$$= -\left(1 + \frac{\lambda E(X_2)}{2}\right)\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\mu_1} + E(X_2)\,\Phi_{X_2}(\mu_1).$$

Combining the two terms gives the result.

(b)
$$E(Q \mid 1) = E(R \mid 1) + P(A_1 \mid 1)\,\frac{1}{\mu_1} + \bigl(1 - P(A_1 \mid 1)\bigr)E(Y \mid 1),$$

where $E(R \mid 1) = 1/(\lambda + \mu_1)$ and $P(A_1 \mid 1) = \mu_1/(\lambda + \mu_1)$. Thus (3) follows. □

Corollary 1. For the $M/(M/1)^2$ case,

$$E(Y \mid 1) = \frac{3}{\mu_2 + \mu_1} + \frac{\lambda}{(\mu_2 + \mu_1)^2}.$$

Proof.

$$\Phi_{X_2}(s) = \frac{\mu_2}{\mu_2 + s}, \qquad \frac{d}{ds}\Phi_{X_2}(s) = -\frac{\mu_2}{(\mu_2 + s)^2}.$$

Thus, by the lemma,

$$E(Y \mid 1) = \frac{\mu_1}{\mu_2 + \mu_1}\left(\frac{2}{\mu_1} + \frac{\lambda}{2\mu_1^2}\right) + \frac{1}{\mu_2 + \mu_1} + \frac{\lambda}{2}\left(\frac{1}{\mu_2} - \frac{1}{\mu_1}\right)\frac{\mu_2}{(\mu_2 + \mu_1)^2}$$

$$= \frac{3}{\mu_2 + \mu_1} + \frac{\lambda}{2\mu_1(\mu_2 + \mu_1)} + \frac{\lambda}{2(\mu_2 + \mu_1)^2} - \frac{\lambda\mu_2}{2\mu_1(\mu_2 + \mu_1)^2}$$

$$= \frac{3}{\mu_2 + \mu_1} + \frac{\lambda}{(\mu_2 + \mu_1)^2}. \qquad □$$
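As a sanity check on Lemma 1(a), the following minimal sketch evaluates Eq. (2) for a user-supplied Laplace-Stieltjes transform and compares it with the closed form of Corollary 1 when Server 2 is exponential. The function names and the sample rates are illustrative choices, not part of the paper.

```python
def lemma1_mean_delay(lam, mu1, mean_x2, lst, dlst):
    """E(Y|1) from Eq. (2).

    lst(s) is the Laplace-Stieltjes transform of the Server 2 service time
    and dlst(s) its derivative with respect to s.
    """
    a = 2.0 / mu1 + lam / (2.0 * mu1 ** 2)      # constant A
    b = 0.5 * lam * (mean_x2 - 1.0 / mu1)       # constant B
    return (1.0 - lst(mu1)) * a + mean_x2 * lst(mu1) - b * dlst(mu1)


def corollary1_mean_delay(lam, mu1, mu2):
    """E(Y|1) for two exponential servers (Corollary 1)."""
    return 3.0 / (mu1 + mu2) + lam / (mu1 + mu2) ** 2


# Illustrative rates: Server 2 is exponential with mean 10.
lam, mu1, mu2 = 0.11, 1.0, 0.1
general = lemma1_mean_delay(
    lam, mu1, 1.0 / mu2,
    lambda s: mu2 / (mu2 + s),
    lambda s: -mu2 / (mu2 + s) ** 2,
)
print(general, corollary1_mean_delay(lam, mu1, mu2))
```

The two printed values should agree up to floating-point rounding, which is the content of Corollary 1.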
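Case 2: $C_1$ selects the general server

Lemma 2.

$$E(Q \mid 2) = \left(A + \frac{1}{\lambda}\right)\bigl(1 - \Phi_{X_2}(\lambda)\bigr) + E(X_2)\,\Phi_{X_2}(\lambda) + \frac{\lambda}{\lambda - \mu_1}\left(E(X_2) - A - \frac{B}{\lambda - \mu_1}\right)\bigl(\Phi_{X_2}(\mu_1) - \Phi_{X_2}(\lambda)\bigr) - \frac{B\lambda}{\lambda - \mu_1}\,\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\mu_1} \quad (\text{for } \lambda \ne \mu_1), \tag{4}$$

where $A$ and $B$ are as defined in Lemma 1.

Proof. In Case 2, we assume that the first customer selects the general server. To find $E(Q \mid 2)$, we break up the expected value into three terms. The first term corresponds to the subcase where $C_2$ arrives after $C_1$'s service has ended. As a result, according to our strategy, S selects the general server immediately upon $C_1$'s completion of service. The second term corresponds to the subcase where $C_2$ selects the exponential server and completes service before $C_1$ completes service. As a result, S selects the exponential server. However, there may be other customers that have joined ahead of it. The third term corresponds to the remaining subcase where $C_2$ arrives, selects the exponential server, but completes service after $C_1$. As a result, S selects the general server. Again, there may be other customers that have joined ahead of it. Thus

$$E(Q \mid 2) = E(Q \cdot I(T > X_2) \mid 2) + E(Q \cdot I(T + X_1 < X_2) \mid 2) + E(Q \cdot I(T < X_2 < T + X_1) \mid 2).$$

Here

$$E(Q \cdot I(T > X_2) \mid 2) = \int_{t=0}^{\infty}\bigl(t + E(X_2)\bigr)e^{-\lambda t}\,dP(X_2 \le t),$$

$$E(Q \cdot I(T + X_1 < X_2) \mid 2) = \int_{t=0}^{\infty}\int_{\tau=0}^{t}\int_{x=0}^{t-\tau}\left[\tau + x + \left(\frac{\lambda x}{2} + 1\right)\frac{1}{\mu_1}\right]\mu_1 e^{-\mu_1 x}\,dx\,\lambda e^{-\lambda\tau}\,d\tau\,dP(X_2 \le t),$$

and

$$E(Q \cdot I(T < X_2 < T + X_1) \mid 2) = \int_{t=0}^{\infty}\int_{\tau=0}^{t} e^{-\mu_1(t-\tau)}\left[t + \left(\frac{\lambda(t-\tau)}{2} + 1\right)E(X_2)\right]\lambda e^{-\lambda\tau}\,d\tau\,dP(X_2 \le t).$$

We will combine the second and third expectations to aid in the simplification. To this end, we first evaluate the innermost integral $I_1$ in $E(Q \cdot I(T + X_1 < X_2) \mid 2)$. So

$$I_1 = \int_{x=0}^{t-\tau}\left[\tau + x + \left(\frac{\lambda x}{2} + 1\right)\frac{1}{\mu_1}\right]\mu_1 e^{-\mu_1 x}\,dx = \left(\tau + \frac{1}{\mu_1}\right)\bigl(1 - e^{-\mu_1(t-\tau)}\bigr) + \left(1 + \frac{\lambda}{2\mu_1}\right)\left(\frac{1}{\mu_1}\bigl(1 - e^{-\mu_1(t-\tau)}\bigr) - (t-\tau)e^{-\mu_1(t-\tau)}\right).$$

Thus the sum of the second and third expectations, which is equal to $E(Q \cdot I(T < X_2) \mid 2)$, is reduced to a double integral. The integrand of the inner integral is

$$\left(\tau + \frac{1}{\mu_1}\right)\bigl(1 - e^{-\mu_1(t-\tau)}\bigr)\lambda e^{-\lambda\tau} + \left(1 + \frac{\lambda}{2\mu_1}\right)\left(\frac{1}{\mu_1}\bigl(1 - e^{-\mu_1(t-\tau)}\bigr) - (t-\tau)e^{-\mu_1(t-\tau)}\right)\lambda e^{-\lambda\tau} + e^{-\mu_1(t-\tau)}\left[t + \left(\frac{\lambda(t-\tau)}{2} + 1\right)E(X_2)\right]\lambda e^{-\lambda\tau}$$

$$= \Bigl(\tau + A + \bigl(E(X_2) - A\bigr)e^{-\mu_1(t-\tau)} + B(t-\tau)e^{-\mu_1(t-\tau)}\Bigr)\lambda e^{-\lambda\tau}.$$

When the inner integral is evaluated, one obtains

$$E(Q \cdot I(T < X_2) \mid 2) = \int_{t=0}^{\infty}\left[\left(\frac{1}{\lambda} + A\right)\bigl(1 - e^{-\lambda t}\bigr) - t e^{-\lambda t} + \bigl(E(X_2) - A\bigr)\frac{\lambda\bigl(e^{-\mu_1 t} - e^{-\lambda t}\bigr)}{\lambda - \mu_1} - \frac{B\lambda}{\lambda - \mu_1}\left(\frac{1}{\lambda - \mu_1}\bigl(e^{-\mu_1 t} - e^{-\lambda t}\bigr) - t e^{-\mu_1 t}\right)\right]dP(X_2 \le t).$$

When this is added to $E(Q \cdot I(T > X_2) \mid 2)$, and the outer integration is performed, the expression for $E(Q \mid 2)$ given in (4) is obtained. □

In the special case $\lambda = \mu_1$, we get a different expression for $E(Q \mid 2)$, namely

$$E(Q \mid 2) = \left(A + \frac{1}{\lambda}\right)\bigl(1 - \Phi_{X_2}(\lambda)\bigr) + E(X_2)\,\Phi_{X_2}(\lambda) - \bigl[E(X_2) - A\bigr]\lambda\,\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\lambda} + \frac{B\lambda}{2}\,\frac{d^2}{ds^2}\Phi_{X_2}(s)\Big|_{s=\lambda}.$$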
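Equations (1) to (4) can be combined into a single routine. The sketch below is one way to do so for an exponential Server 1 and a general Server 2 with $\lambda \ne \mu_1$; the function names, the helper constructors for the transforms, and the choice $\lambda = \rho(\mu_1 + 1/E(X_2))$ for converting an occupancy level into an arrival rate are illustrative assumptions rather than the authors' code.

```python
import math


def expected_tcs(lam, mu1, mean_x2, lst, dlst):
    """E(TCS) from Eq. (1) for an exponential Server 1 and general Server 2.

    lst(s) and dlst(s) are the Laplace-Stieltjes transform of the Server 2
    service time and its derivative. Requires lam != mu1.
    """
    a = 2.0 / mu1 + lam / (2.0 * mu1 ** 2)       # constant A of Lemma 1
    b = 0.5 * lam * (mean_x2 - 1.0 / mu1)        # constant B of Lemma 1
    # Case 1: Lemma 1(a), then Eq. (3).
    ey1 = (1.0 - lst(mu1)) * a + mean_x2 * lst(mu1) - b * dlst(mu1)
    eq1 = 2.0 / (lam + mu1) + lam / (lam + mu1) * ey1
    # Case 2: Lemma 2, Eq. (4).
    d = lam - mu1
    eq2 = ((a + 1.0 / lam) * (1.0 - lst(lam)) + mean_x2 * lst(lam)
           + lam / d * (mean_x2 - a - b / d) * (lst(mu1) - lst(lam))
           - b * lam / d * dlst(mu1))
    # Eq. (1): E(T) = 1/lam, and C1 picks either server with probability 1/2.
    return 1.0 / lam + 0.5 * (eq1 + eq2)


def exponential_lst(rate):
    """Transform and derivative for an exponential service time."""
    return (lambda s: rate / (rate + s)), (lambda s: -rate / (rate + s) ** 2)


def deterministic_lst(length):
    """Transform and derivative for a constant service time."""
    return (lambda s: math.exp(-s * length)), (lambda s: -length * math.exp(-s * length))


rho = 0.1
lam = rho * (1.0 + 1.0)  # both mean service times equal to 1
print(round(expected_tcs(lam, 1.0, 1.0, *exponential_lst(1.0)), 3))
print(round(expected_tcs(lam, 1.0, 1.0, *deterministic_lst(1.0)), 3))
```

With these inputs the routine returns approximately 6.925 for an exponential second server and 6.965 for a deterministic one, which agree with the corresponding entries reported in Table 1 for $\rho = 0.1$ and $E\{X_2\} = 1$.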
3. Generalizations

In an attempt to determine how the OBJ strategy performs when both servers are nonexponential, we discuss generalizations of the preceding results. One popular choice for modeling fairly general distributions while retaining some of the properties of the exponential distribution is the family of phase-type distributions (cf. [7]). The distribution in question is modeled as the time to absorption in a transient continuous-time Markov process with a single absorbing state. The submatrix $\Theta$ of transition rates among the transient states plays a key role in the model. If one replaces the exponential distribution for the service times of Server 1 by a phase-type distribution, it becomes necessary to evaluate expressions of the form

$$X = \int_{t=0}^{\infty}\exp(\Theta t)\,dP(X_2 \le t).$$

When $\Theta$ is diagonalizable, say $\Theta = \Lambda\,\mathrm{diag}[d_1, d_2, \ldots, d_n]\,\Lambda^{-1}$, this becomes

$$X = \Lambda\,\Phi\,\Lambda^{-1},$$

where $\Phi = \mathrm{diag}[\Phi_{X_2}(-d_1), \Phi_{X_2}(-d_2), \ldots, \Phi_{X_2}(-d_n)]$.

The other necessary computations for diagonalizable $\Theta$ are tractable, but lengthy, and for brevity, we omit the discussion here.

A popular subset of phase-type distributions which provides some extension of the results here is the family of mixtures of exponentials (i.e. hyperexponentials). Hyperexponential distributions are sometimes used as simple models for distributions whose variance exceeds the square of the mean. We describe below the case of an $M/((H_2 + G)/1)^2$ system, where $H_2$ represents a mixture of two exponentials at rates $\mu_i$ with probability $p_i$, $i = 1, 2$, respectively. The extension to more than two exponentials is straightforward. The mean of this hyperexponential mixture is

$$E(X_1) = p_1/\mu_1 + p_2/\mu_2.$$

We maintain the same notation as previously, with the exception that now $E(Y_i \mid 1)$ denotes the mean delay from the instant that both servers become busy until S completes service, given that the first customer selects the hyperexponential server at rate $\mu_i$, $i = 1, 2$. Thus by analogy to the exponential case, we have for Case 1,

$$E(Q \mid 1) = \sum_{i=1}^{2} p_i\left[\frac{1}{\lambda + \mu_i} + \frac{\mu_i}{\lambda + \mu_i}\,E(X_1) + \frac{\lambda}{\lambda + \mu_i}\,E(Y_i \mid 1)\right].$$

We then obtain

$$E(Y_i \cdot I(X_1' < X_2) \mid 1) = \int_{t=0}^{\infty}\int_{\tau=0}^{t}\left(\tau + \left(\frac{\lambda\tau}{2} + 1\right)E(X_1)\right)\mu_i e^{-\mu_i\tau}\,d\tau\,dP(X_2 \le t),$$

$$E(Y_i \cdot I(X_1' \ge X_2) \mid 1) = \int_{t=0}^{\infty} e^{-\mu_i t}\left(t + \left(\frac{\lambda t}{2} + 1\right)E(X_2)\right)dP(X_2 \le t),$$

and after the integrations are performed, one obtains

$$E(Y_i \mid 1) = \bigl(1 - \Phi_{X_2}(\mu_i)\bigr)\left(\frac{1}{\mu_i} + E(X_1)\left(1 + \frac{\lambda}{2\mu_i}\right)\right) + E(X_2)\,\Phi_{X_2}(\mu_i) - B\,\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\mu_i},$$

where $B = (\lambda/2)[E(X_2) - E(X_1)]$.

Turning to Case 2, where the initial customer chooses the general server, note that $E(Q \cdot I(T > X_2) \mid 2)$ is unchanged, as the hyperexponential server is not even involved. The other two terms are

$$E(Q \cdot I(T + X_1 < X_2) \mid 2) = \int_{t=0}^{\infty}\int_{\tau=0}^{t}\int_{x=0}^{t-\tau}\left[\tau + x + \left(\frac{\lambda x}{2} + 1\right)E(X_1)\right]\sum_{i=1}^{2} p_i\mu_i e^{-\mu_i x}\,dx\,\lambda e^{-\lambda\tau}\,d\tau\,dP(X_2 \le t),$$

$$E(Q \cdot I(T < X_2 < T + X_1) \mid 2) = \int_{t=0}^{\infty}\int_{\tau=0}^{t}\sum_{i=1}^{2} p_i e^{-\mu_i(t-\tau)}\left[t + \left(\frac{\lambda(t-\tau)}{2} + 1\right)E(X_2)\right]\lambda e^{-\lambda\tau}\,d\tau\,dP(X_2 \le t).$$

The resulting expression for $E(Q \mid 2)$ after the various integrations and rearrangements have been performed is

$$E(Q \mid 2) = E(X_2)\,\Phi_{X_2}(\lambda) + \left(A + \frac{1}{\lambda}\right)\bigl(1 - \Phi_{X_2}(\lambda)\bigr) + \sum_{i=1}^{2} p_i\left\{\frac{\lambda}{\lambda - \mu_i}\left[E(X_2) - \frac{1}{\mu_i} - E(X_1)\left(1 + \frac{\lambda}{2\mu_i}\right) - \frac{B}{\lambda - \mu_i}\right]\bigl(\Phi_{X_2}(\mu_i) - \Phi_{X_2}(\lambda)\bigr) - \frac{B\lambda}{\lambda - \mu_i}\,\frac{d}{ds}\Phi_{X_2}(s)\Big|_{s=\mu_i}\right\},$$

where $A = 2E(X_1) + (1/2)\lambda E(X_1)^2$. These revised expressions for $E(Q \mid 1)$ and $E(Q \mid 2)$ are now substituted into Eq. (1) to obtain $E(\mathrm{TCS})$.
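The hyperexponential expressions above can be evaluated numerically in the same way as the exponential case. The following is a minimal sketch under the assumption that $\lambda$ differs from every $\mu_i$; the function name and argument layout are hypothetical, and the formulas are those of this section with the bracketed constant written out per phase.

```python
import math


def expected_tcs_h2(lam, p, mu, mean_x2, lst, dlst):
    """E(TCS) for a hyperexponential Server 1 and a general Server 2.

    p and mu are sequences of mixing probabilities and phase rates;
    lst(s), dlst(s) are the Server 2 transform and its derivative.
    Requires lam != mu[i] for every phase i.
    """
    mean_x1 = sum(pi / mi for pi, mi in zip(p, mu))
    a = 2.0 * mean_x1 + 0.5 * lam * mean_x1 ** 2
    b = 0.5 * lam * (mean_x2 - mean_x1)
    eq1 = 0.0
    eq2 = mean_x2 * lst(lam) + (a + 1.0 / lam) * (1.0 - lst(lam))
    for pi, mi in zip(p, mu):
        # Per-phase constant: 1/mu_i + E(X1) (1 + lam / (2 mu_i)).
        a_i = 1.0 / mi + mean_x1 * (1.0 + lam / (2.0 * mi))
        ey_i = (1.0 - lst(mi)) * a_i + mean_x2 * lst(mi) - b * dlst(mi)
        eq1 += pi * (1.0 / (lam + mi) + mi / (lam + mi) * mean_x1
                     + lam / (lam + mi) * ey_i)
        d = lam - mi
        eq2 += pi * (lam / d * (mean_x2 - a_i - b / d) * (lst(mi) - lst(lam))
                     - b * lam / d * dlst(mi))
    return 1.0 / lam + 0.5 * (eq1 + eq2)


# Server 1: two-phase mixture with mean 1 and squared coefficient of
# variation 9, using equal contributions p_i / mu_i from each phase.
r = math.sqrt(8.0 / 10.0)
p = (0.5 * (1.0 + r), 0.5 * (1.0 - r))
mu = (2.0 * p[0], 2.0 * p[1])
# Server 2: exponential with mean 1; occupancy 0.1 gives lam = 0.2.
print(round(expected_tcs_h2(0.2, p, mu, 1.0,
                            lambda s: 1.0 / (1.0 + s),
                            lambda s: -1.0 / (1.0 + s) ** 2), 3))
```

With a single phase (p = (1.0,), mu = (1.0,)) the routine reduces to the exponential case of Section 2. For the mixture shown, the result is approximately 6.798.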
4. Numerical examples

In Hlynka et al. [4], the OBJ and JSQ strategies were tested for various pairings of two exponential servers and an initially idle system. We established that for small $\lambda$, the difference in average service times would have to be very large in order for it to pay to wait, as one would expect. As $\lambda$ increased, the OBJ strategy became viable for more moderate pairings, but in no case did it pay to wait when the service times were equal (again, this was predictable). A rough rule of thumb for moderate and high occupancy levels was that the average service times had to differ by a factor of 10 or more before the OBJ strategy became preferable to the standard "toss a coin" approach. (We define the occupancy $\rho$ as the arrival rate divided by the cumulative service rate of the two servers.)

The question naturally arises as to how sensitive these conclusions are to the exponential service assumption. Furthermore, in the nonsymmetric case where the service distributions of the two servers differ in shape as well as in mean, which arrangement of mean service times leads to a lesser average delay under the OBJ strategy?

These questions are addressed in Tables 1 and 2, where various assumptions about the shapes and means of the service times have been made. In each table, the mean delays under the OBJ strategy for four configurations have been compared for occupancy levels ranging from 0.1 to 0.9. We consider mean second-server times of 10, 1 and 0.1 respectively. Without loss of generality, the mean first-server time has been set to 1 in all cases. In each case, the mean delay under the JSQ strategy has been listed at the bottom for comparison.

In Table 1, we have paired an exponential server with (alternatively) a deterministic, Erlang-2, exponential, and hyperexponential second server. (In all examples involving hyperexponential servers, we have assumed balanced means: $p_1/\mu_1 = p_2/\mu_2 = E\{X_i\}/2$.)
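The balanced-means condition, together with a target mean and squared coefficient of variation $C^2$, pins down the two-phase hyperexponential parameters. The sketch below uses the standard two-moment fit for this case; the function name is hypothetical, and the closed form for $p_1$ is a textbook result rather than something stated in the paper.

```python
import math


def balanced_means_h2(mean, scv):
    """Two-phase hyperexponential with balanced means.

    Returns ((p1, p2), (mu1, mu2)) such that p1/mu1 = p2/mu2 = mean/2 and
    the squared coefficient of variation equals scv (requires scv >= 1).
    """
    if scv < 1.0:
        raise ValueError("a hyperexponential needs scv >= 1")
    r = math.sqrt((scv - 1.0) / (scv + 1.0))
    p1 = 0.5 * (1.0 + r)
    p2 = 1.0 - p1
    return (p1, p2), (2.0 * p1 / mean, 2.0 * p2 / mean)


(p1, p2), (rate1, rate2) = balanced_means_h2(1.0, 9.0)
fitted_mean = p1 / rate1 + p2 / rate2
second_moment = 2.0 * p1 / rate1 ** 2 + 2.0 * p2 / rate2 ** 2
print(fitted_mean, second_moment / fitted_mean ** 2 - 1.0)
```

The printed pair recovers the requested mean and $C^2$ up to rounding, which is a quick way to confirm a parameter set before using it in the delay formulas.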
The table leads to the same conclusions as in Hlynka et al. [4]. It never pays to wait when the average service times are equal. For moderate and high $\rho$, average service times must differ by a factor of the order of 10 or more before it pays to wait, and this factor decreases slightly with an increase in $\rho$. By comparing the first and third columns for each configuration, we can see which arrangement of fast
and slow server leads to the smaller average delay. The two columns should differ by a factor of exactly 10 if there were no real difference, because this is the amount by which the arrival, slow service, and fast service rates have all increased. In all cases, we see that the less variable server should have the longer service time, as intuition would suggest.

By comparing the same columns of each of the four configurations, one sees the effect on the average delay of an increase in the variance of Server 2's service times. For equal mean service times, an increase in variance leads to a decrease in delay. This stands to reason: since the mean service times are equal, the difference centers around the duration of the observation period. In cases where $C_1$ departs before $C_2$ arrives, there should be no difference. Otherwise, both servers will be simultaneously busy at some point, and there is a greater chance of an earlier service completion if both servers are variable. An early service completion means a shorter observation period, during which typically fewer customers would bypass S.

Table 1
$M/((M + G)/1)^2$ comparisons

$M/((M + D)/1)^2$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.651       6.965       1.563
0.2    9.211        4.454       0.974
0.3    6.828        3.626       0.760
0.4    5.641        3.226       0.650
0.5    4.958        3.001       0.584
0.6    4.527        2.867       0.541
0.7    4.239        2.785       0.512
0.8    4.038        2.737       0.492
0.9    3.893        2.712       0.478
JSQ    5.5          1.0         0.55

$M/((M + E_2)/1)^2$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.643       6.944       1.563
0.2    9.537        4.418       0.975
0.3    7.293        3.577       0.762
0.4    6.143        3.165       0.653
0.5    5.462        2.931       0.588
0.6    5.025        2.786       0.547
0.7    4.730        2.695       0.519
0.8    4.527        2.637       0.499
0.9    4.384        2.603       0.486
JSQ    5.5          1.0         0.55

$M/((M + M)/1)^2$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.631       6.925       1.563
0.2    9.752        4.386       0.975
0.3    7.631        3.535       0.763
0.4    6.551        3.117       0.655
0.5    5.912        2.875       0.591
0.6    5.505        2.724       0.551
0.7    5.233        2.627       0.523
0.8    5.048        2.563       0.505
0.9    4.922        2.523       0.492
JSQ    5.5          1.0         0.55

$M/((M + H_2)/1)^2$, $C_2^2 = 9$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.617       6.798       1.554
0.2    10.095       4.237       0.964
0.3    8.091        3.375       0.752
0.4    7.060        2.947       0.645
0.5    6.450        2.695       0.583
0.6    6.061        2.534       0.544
0.7    5.804        2.426       0.518
0.8    5.633        2.351       0.502
0.9    5.521        2.300       0.491
JSQ    5.5          1.0         0.55
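The remark that the $E\{X_2\} = 10$ and $E\{X_2\} = 0.1$ columns would differ by a factor of exactly 10 in the absence of any real difference can be illustrated with the fully exponential configuration, where the two servers are interchangeable. The sketch below uses Eq. (1), Eq. (3) and Corollary 1, and obtains $E(Q \mid 2)$ by relabelling the servers, which is valid only because both are exponential here. The function name is hypothetical.

```python
def tcs_two_exponential(lam, mu1, mu2):
    """E(TCS) for the M/(M/1)^2 case with an initially idle system."""
    ey = 3.0 / (mu1 + mu2) + lam / (mu1 + mu2) ** 2      # Corollary 1
    eq1 = 2.0 / (lam + mu1) + lam / (lam + mu1) * ey     # Eq. (3)
    eq2 = 2.0 / (lam + mu2) + lam / (lam + mu2) * ey     # Eq. (3), servers swapped
    return 1.0 / lam + 0.5 * (eq1 + eq2)                 # Eq. (1)


rho = 0.5
slow = tcs_two_exponential(rho * (1.0 + 0.1), 1.0, 0.1)    # E{X2} = 10
fast = tcs_two_exponential(rho * (1.0 + 10.0), 1.0, 10.0)  # E{X2} = 0.1
print(round(slow, 3), round(fast, 3), slow / fast)
```

Going from the first configuration to the second multiplies all three rates by 10 once the server labels are swapped, so the ratio printed last should equal 10 up to rounding. This matches the $M/((M + M)/1)^2$ block of Table 1, where the first and third columns agree after division by 10.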
When $E\{X_2\} = 10$, two trends are observed with increased variability. At moderate to high load (here, for $\rho > 0.1$), the average delay increases together with Server 2's service time variance. The difference here from the previous case is that as the variance increases, there is a greater chance of a "trick" Server 2 service time being less than Server 1's, despite the fact that its mean is 10 times greater. Thus S might select the wrong server on the basis of an exceptionally short service time. It is worth noting that this is the case where the average TCS has the greatest sensitivity to the shape of the Server 2 service time, and the impact grows with the occupancy.

In contrast, at low load, increased variability leads to a slightly smaller average delay. A more variable server has more frequent long service times as well as short ones. Thus there is a greater chance that if $C_1$ happens to select the slow server, $C_2$ will arrive and may even depart before $C_1$. In this case, S will manage to correctly select the fast server.
The results for $E\{X_2\} = 0.1$ are somewhat similar to those for $E\{X_2\} = 10$, for similar reasons: more variable Server 2 times can cause the wrong decision at moderate to heavy load, and a shorter observation period at low load. There is an anomaly for $\rho > 0.1$, though, in that the average delay seems to peak at a variability $C_2^2$ between 2 and 4 (not shown), and then decrease above that point. It is hard to be sure of the probabilistic phenomena causing this effect, but it would appear that there is a range over which the chance of a mistaken choice grows, and then drops beyond some point. This strange behavior may be limited to the class of bursty distributions we have considered, namely hyperexponential distributions with balanced means.

Table 2
$M/((H_2 + G)/1)^2$ comparisons; $C_1^2 = 9$

$M/((H_2 + D)/1)^2$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.599       6.837       1.562
0.2    9.247        4.301       1.009
0.3    6.974        3.459       0.806
0.4    5.893        3.046       0.701
0.5    5.309        2.808       0.638
0.6    4.972        2.659       0.597
0.7    4.772        2.562       0.570
0.8    4.657        2.499       0.550
0.9    4.597        2.459       0.537
JSQ    5.5          1.0         0.55

$M/((H_2 + M)/1)^2$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.537       6.798       1.562
0.2    9.642        4.237       1.010
0.3    7.524        3.375       0.809
0.4    6.454        2.947       0.706
0.5    5.830        2.695       0.645
0.6    5.438        2.534       0.606
0.7    5.184        2.426       0.580
0.8    5.017        2.351       0.563
0.9    4.909        2.300       0.552
JSQ    5.5          1.0         0.55

$M/((H_2 + H_2)/1)^2$; $C_2^2 = 4$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.500       6.732       1.557
0.2    9.832        4.157       1.003
0.3    7.771        3.290       0.801
0.4    6.711        2.859       0.698
0.5    6.082        2.605       0.636
0.6    5.680        2.441       0.598
0.7    5.414        2.330       0.572
0.8    5.236        2.252       0.555
0.9    5.117        2.197       0.545
JSQ    5.5          1.0         0.55

$M/((H_2 + H_2)/1)^2$; $C_2^2 = 9$
ρ      E{X2} = 10   E{X2} = 1   E{X2} = 0.1
0.1    15.487       6.680       1.549
0.2    9.903        4.108       0.990
0.3    7.861        3.245       0.786
0.4    6.804        2.816       0.680
0.5    6.174        2.563       0.617
0.6    5.770        2.398       0.577
0.7    5.502        2.286       0.550
0.8    5.321        2.207       0.532
0.9    5.200        2.151       0.520
JSQ    5.5          1.0         0.55

Table 2 (for a hyperexponential Server 1) leads to the same qualitative conclusions as Table 1 (where Server 1 was exponential), although the degree is somewhat changed in most cases. Comparing the two tables, the average delay is smaller for equal service times, and generally greater when they are not. An exception to this is the case where $E\{X_2\} = 10$. There is less change in the average delays as the Server 2 times become more variable for a hyperexponential Server 1 than for an exponential one.

Numerical results in Adan et al. [1] indicate that in highly unbalanced systems, the steady-state probabilities of finding zero or one customers in the system are very small. Thus, although the OBJ strategy has been shown to be effective if the service rates of the two servers are very different, there may be only a small probability of encountering a system where OBJ applies.
Acknowledgements

The authors would like to thank Dr. M. Magazine of the University of Waterloo for suggesting that the OBJ strategy be expanded to include the case where one customer is present upon arrival. Thanks also to the anonymous referee for suggested improvements.
References

[1] I.J.B.F. Adan, J. Wessels and W.H.M. Zijm, "Analysis of the asymmetric shortest queue problem", Queueing Systems 8, 1-58 (1991).
[2] W.K. Grassmann and Y. Zhao, "The shortest queue model with jockeying", Naval Res. Logist. 37, 773-787 (1990).
[3] F.A. Haight, "Two queues in parallel", Biometrika 45, 401-410 (1958).
[4] M. Hlynka, D.A. Stanford, W.H. Poon and T. Wang, "Observing queues before joining", Oper. Res. 42, 365-371 (1994).
[5] E.P.C. Kao and C. Lin, "A matrix-geometric solution to the jockeying problem", European J. Oper. Res. 44, 67-74 (1990).
[6] J.F.C. Kingman, "Two similar queues in parallel", Ann. Math. Statist. 32, 1314-1323 (1961).
[7] M. Neuts, Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach, Johns Hopkins University Press, Baltimore, 1981.
[8] D.A. Stanford and W. Fischer, "Characterizing interdeparture times for bursty input streams in the queue with pooled renewal arrivals", Comm. Statist. Stochastic Models 7, 311-320 (1991).