Comparison of gradient estimation techniques for queues with non-identical servers

Comparison of gradient estimation techniques for queues with non-identical servers

Pergamon 030'3-0$4W94)e006.~l Computers Ops Res. Vol 22, No. 7, pp. 715-729, 1995 Copyright © 1995 Elsevier Science Ltd Printed in Great Britain, Al...

842KB Sizes 2 Downloads 60 Views

Pergamon

030'3-0$4W94)e006.~l

Computers Ops Res. Vol 22, No. 7, pp. 715-729, 1995 Copyright © 1995 Elsevier Science Ltd Printed in Great Britain, All rights reserved 0305-0548/95 $9.50+0.00

C O M P A R I S O N O F G R A D I E N T ESTIMATION T E C H N I Q U E S FOR Q U E U E S WITH N O N - I D E N T I C A L SERVERS Michael C. Fu, lt Jian-Qiang HU 2+ and Rakesh Nagia§ 1College of Business and Management, University of Maryland, College Park, MD 20742, U.S.A., 2Department of Manufacturing Engineering, Boston University, Boston, MA 02215, U.S.A. and aDepartment of Industrial Engineering, State University of New York at Buffalo, Buffalo, NY 14260, U.S.A.

(Received December 1993: in revised form August 1994) Scope and Parpee, e The modeling and analysis of many complex modern systems such as manufacturing systems, inventory control systems, communication and computer networks, often require the use of discrete-event simulation, which can be costly, both in computational and developmental terms. A critical component that must accompany any simulation study is sensitivity analysis. For example, in the simulation of a machine tool manufacturing system designed to estimate throughput, work-in process, and lead times, one should assess the sensitivity of these various performance measure estimates to the cutting speeds of the various machines in the system. The traditional means of performing sensitivity analysis for simulation is to use finite differences, which requires multiple simulation runs for each parameter of interest, e.g. for p parameters, the usual symmetric difference estimate would require 2p additional simulations, which would be very costly for realistic systems where p is large. In the past decade or so, new methodologies have been proposed to perform sensitivity analysis more efficiently. In this study, we compare and contrast the implementation and performance of two gradient estimation methodologies for sensitivity analysis by applying them to a small queueing system. The results of this study should provide insight to practitioners on the applicability, i.e. strengths and weaknesses, of these estimation techniques. Abstract--Gradient estimation techniques for stochastic discrete-event simulation have been a major topic of research over the past decade. In this paper, we apply two of the techniques--perturbation analysis and the likelihood ratio method--to a single.queue system with non-identical multiple servers. We derive estimates for derivatives of mean steady-state system time with respect to parameters of the underlying timing distributions. In terms of perturbation analysis, we consider both an infinitesimal perturbation analysis estimator, which is biased for this problem, and two smoothed perturbation analysis estimators, one unbiased but not very practical and one approximate but easily implementable. For two servers, we provide an analytical proof of unbiascdness in steady state for the Markovian case. For the likelihood ratio method, we apply the regenerative likelihood ratio estimator. We provide simulation results for both Markovian and non-Markovian examples, and compare the performance of the various estimators. We conclude that no one method performs universally well, and provide recommendations as to when one is likely to be preferred to the others.

1. INTRODUCTION

Estimation of gradients of performance measures with respect to parameters of the system from a sincjle simulation has recently become an important research topic, two major applications being sensitivity analysis and stochastic optimization Ill. The traditional means of performing sensitivity tMichael C. Fu is an Associate Professor of Management Science & Statistics in the College of Business and Management, with a joint appointment in the Institute for Systems Research, at the University of Maryland. His Ph.D. is in Applied Mathematics from Harvard University. His research interests include simulation and queueing, stochastic gradient estimation and stochastic optimization, with applications towards manufacturing systems, inventory control, and financial derivatives. ++Jian-Qiang Hu is an Assistant Professor of Manufacturing Engineering at Boston University. His Ph.D. is in Applied Mathematics from Harvard University. His research interests include queueing network theory, optimization and control of discrete-event systems and sensitivity analysis for stochastic systems, with applications towards manufacturing systems, inventory control, communications networks, and financial derivatives. §Rakesh Nagi is an Assistant Professor of Industrial Engineering at the State University of New York at Buffalo. His Ph.D. is in Mechanical Engineering from the University of Maryland at College Park. His current research interests lie mainly in the design of manufacturing facilities and in hierarchical production planning and scheduling management for manufacturing systems. 715 I?,AO9 22:7-0

716

Michael C. Fu et al.

analysis for simulation is the finite difference method, which requires multiple simulation runs for each parameter of interest, e.g. for p parameters, the usual symmetric difference estimate would require 2p additional simulations, which would be very costly for realistic systems when p is large. In the past decade or so, new methodologies have been proposed to perform sensitivity analysis more efficiently. Overviews of the state-of-the-art of gradient estimation can be found in Ho and Cao [2], L'Ecuyer [31 and Strickland [4]. The easiest technique to apply and the one generally posssessing the lowest variance is infinitesimal perturbation analysis (IPA) I-2,5]. Unfortunately, IPA does not work universally, one example being the particular queueing system that we consider in this paper: a single-queue (unlimited capacity), two non-identical server system. Taking the performance measure of interest to be the mean steady-state system time and considering a parameter in the underlying service time distributions, it can be easily shown that IPA, in general, gives a biased estimate. In fact, Fu and Hu 1-6] prove that for the Markovian case, the bias is proportional to the square of the difference in the mean service times. Experimental results for non-exponential cases found in Fu et al. [7] seem to indicate that this proportionality holds for non-exponential cases, as well. Thus, there is a need to consider alternative methods for this problem. In this paper, we apply the techniques of smoother perturbation analysis (SPA), introduced by Gong and Ho [8], and the likelihood ratio (LR) or score function (SF) method (both are described in Refs [2] and I-3]), to develop estimators for single-queue, multi-server queueing systems with non-identical servers. For SPA, we apply the results of Fu and Hu [9-1, where a general framework for applying SPA is established. The resulting estimator, though theoretically exact, may incur prohibitively large amounts of additional simulation, so we propose an approximation which requires no additional simulation. For the Markovian case, the approximation provides exact results, and we give an analytical proof of unbiasedness in steady state. For LR, we derive regenerative likelihood ratio gradient estimators. We then investigate the performance of the estimators for various interarrival time and service time distributions via numerous simulation experiments. The rest of the paper is organized as follows. In Section 2, we present the perturbation analysis estimators. The likelihood ratio gradient estimation method is presented in Section 3. In Section 4, we investigate the performance of the estimators by conducting simulation experiments. We present some conclusions in Section 5. The practitioner less interested in the theory may skip to Section 4, where a brief subsection summarizing relevant implementation issues is included.

2. THE SPA ESTIMATOR We consider a single-queue, first-come, first-served (FCFS) queueing system with m non-identical servers, a general renewal arrival process and general independent service time distributions. Steady-state system time, denoted by T, is our performance measure of interest, and we wish to estimate dET/dO, where 0 is a parameter of the service time distribution and E T is the expected value of T. Consider an arbitrary busy period of the system, and let Xi(O) be the service time of the ith (to arrive) customer in the busy period and Ai be the interarrival time between the ( i - l)th and ith customers in the busy period. We will take time 0 to be the arrival of the first customer. Let y(.) and G(' ) be the respective p.d.f, and c.d.f, of the interarrival times, and fj(. ) and Fj~. ) be the respective p.d.f, and c.d.f, of the service times at server j, j = l . . . . . m. We first present the IPA estimator, which will be one part of the full SPA estimator. To describe the IPA estimator, we introduce the important concept of a server's local busy period: the interval between two adjacent idle times of the (same) server. Using this idea of local busy periods, and defining the set of customers preceding i in the same local busy period L(i)--{j < i: j in the same local busy period as i~, we have (see Fu and Hu [10)], for details dT~ dO

-

~

dX~

2.

--÷

j~L(i) dO

dX~ dO'

(!)

where T~ is the system time of the ith customer (in the busy period), and dX/dO is the derivative

Gradient estimation techniques for queues

717

of the service time random variable given in Suri and Zazanis [11]

dX dO

- OF~dO OF/dx

(2)

where F(x; 0) is differentiable with respect to both x and 0. Note that the subscript on F indicating the server has been omitted here for notational convenience. Equation (1) can also be written in recursive form: dTi ~dXi/dO ifi initiates local busy period,

dO = ~( d T¢/dO + d X i/ dO otherwise.

(3)

where 7= maxj~u0 j, i.e. customer i'is the index of the customer preceding i in the local busy period. Intuitively, the change in system time of customer i is the sum of the change in system time of the customer just preceding him/her in the same local busy period (if any such customer exists) plus the change in customer i's own service time. The intuitive reason why IPA does not work is that small perturbations in the parameter could cause a switching of the assignment of servers from which customers receive service. Of course, in the identical server case, such a switching has no effect on the performance measure of interest, since the service times are i.i.d. Even in the case of non-identical servers, there are many situations in which there is a symmetry that gives a cancelling out of the expected effect of the switching (e.g., exactly two customers switching between their respective two servers). However, there are two cases where this cancelling out does not hold. In the terminology of Fu and Hu [12], there are two types of critical adjacent event pairs, which we describe as follows: (1) m + 1 in system, departure at server j, followed by departure at server f # j ; (2) < m in system, departure at server j, followed by arrival. We now give intuitive explanations as to why (1) and (2) are critical. In (1), there is a single customer in queue (since there are m + 1 in the system). In the sequence of events described, the queued customer would go to server j, whereas were the events to change order, the queued customer would go to server f, causing a finite change in that customer's system time. On the other hand, in (2), there is at least one server idle. In the sequence of events described, the arriving customer may be served by server j (depending on the serving policy for the assignment of idle servers), whereas were the events to change order, the arriving customer would have no chance of going to server j, since it would be busy. Thus, a finite change in the arriving customer's system time occurs with the probability that the arriving customer be served by server j in the original sequence. More formally, we define D~= the departure time of customer i,

S(i) = the server of customer i, N(t) = the number of customers in the system at time t, tr(t)=next event after time t~ {~, flj, j = 1. . . . . m}, where

ct = arrival event,

flj = departure event at server j. We then have the following sets:

L~j)= {i: S(i)= j, N(D +)= m, a(Di)= flf, f # j}, Ltj)= {i: S(i)=j, N(D~-)
Michael C. Fu et al.

718

goes to server 2. We let 0, be a parameter in FI(" ) and let 02 be a parameter in F2(" ). Then our estimators are given by

L -dot - F2ixi + Y3- F2(x3 k~=l a-b +p i,ELS"dO, l-b/z,) aT,," + i,L~',

seA=

E



where

i

isLg:'

d02 1 -

X tiT` G(zi)

'

isLe:'

f,(x,i

d02 Fdxi +Yi)- F,(xi)

,

t4)

15) "

xi-- age of the service time at other server at Di, y, = minimum of the residual service time and the residual interarrival time, z,--age of the interarrival time at D~,

L~J)= {i: S(i)=j, N(D+)=2, a(D~)=flf, f # j}, j= 1, 2,

L~'= {i: S(i)=j, N(D+)=O}, j= 1, 2, and g(zi) and G(zi) are the p.d.f, and c.d.f., respectively, of the interarrival time at D i. Note that L~ ~ is notationally simplified for the two-server case, because satisfying the original set of conditions means the departure leaves the system empty; hence, the next event must be an arrival. Intuitively, the summed terms are products of two terms: the rST`,j terms representing the effect due to a critical adjacent event order change and the other two terms representing the rate at which the change occurs. To precisely define these quantities, we define the augmented state by taking the physical states 0, (1, 0), (0, 1), 2, 3. . . . --where "0" indicates that the two servers are idle, "(1, 0)" indicates that server 1 is busy and server 2 is idle, "(0, 1)" indicates that server 2 is idle and server 2 is busy, "2" denotes two busy servers, "3" denotes both servers busy and one customer in queue . . . . - - a n d adding to it the appropriate ages of the random variables. We then condition on the state at D~ as follows:

bT,,,=(k~i Tk [(0, 1): ~o=Zi, ~2=O]--k~ ' T, [(1, 0): ~o=Zi, ~t = 0 ] ) , 6T`.2=

(k~ Tk[(1, O): ~ O = g i " ~1 =0"]-- ~ T/[(O, l): ~O=Zi, ~2=0] ) , i

k=i

where ~o = age of interarrival time, ~, =age of service time at server 1, ~2 =age of service time at server 2. Thus, the "initial" states here depend on the age of the interarrival time. For more than two servers. there would be a dependence on the ages of service times at other servers, as well. Under some mild conditions, we can use Theorem 3 in Fu and Hu [12] to show the following:

Theorem 1. The SPA estimators (dT/dOi)sPA.,, i= 1, 2, are unbiased for all n. Since in this paper we consider steady-state performance, we can simplify the expressions for r~T`.j for large n to the following: ¢~T,.j=Si.j-MI.jE[T]

,

j= 1, 2,

where we define

E E Mi. j = E[ll(i, J)l], where l(i, I) denotes a sample path starting from state [(0, !): ¢o=Z~, Cz=0] and ending the first time it reaches state [(!, 07; ¢o = z~, ~ = 01, and similarly l(i, 2)denotes a sample path starting from state [(0, 1): ¢o=Z~, ~2 = 0 ] and ending the first time it reaches state [(1, 0): ¢o=Z~, ~ =0]. The [l(i, j)l notation indicates the number of service completions in l(i, j) (in some sense its "length").

Gradient estimation techniques for queues

719

However, it is clear that this "simplification" actually leads to intractability, since the probability of reaching such a single state from an uncountable state space is 0 in the general case. On the other hand, for the Markovian ease, the state is completely defined by the physical state (without the age) since the distributions are all memoryless. As long as the queue is stable (hence ergodic), the probability of returning to a given state is 1. Thus, we actually have independence from i. Defining S, =expected sum of system times from state (0, 1) to (1, 0), $2 =expected sum of system times from state (1, 0) to (0, l), M, = expected number of service completions from state (0, 1) to (1, 0), M2=expected number of service completions from state (1, 0) to (0, 1),

As=Sj-MjE[T ], j= 1, 2, we have E[f Ti,s] = As, j = 1, 2. In addition, we also have St = - S 2 , Mr = - M 2 , and hence A, = -A2. We can thus get estimates from the sample path itself whenever the appropriate physical states are encountered, regardless of the values of the ages of the other random variables. Thus, instead of estimating each time a critical adjacent even is encountered, we simply estimate two terms simultaneously, i.e. our estimator is as follows: \ ~ - ~ ) S P A = n \ , = , -d0, + dT']

2. - - 1-G(z,) ~-,~Lv, 2. -dO, - F2(xi+ yi)-F2(xi) A, L P ,~L~,,dO,

l f ~ dT, [- _ dTi

#(zi)

I,'\,:, a- +lq i~L7 L -d02- 1 -

G(2i)

X

dT,

ieLll-i~

ft(xi)

(6)

-]^ \

F,(z i

where As, J = I, 2 are estimates of the quantities of As, j = 1, 2, which can be estimated on the given sample path without the need for additional simulation. This is the general idea in Fu and Hu [10] for any Markov chain. The approximation uses equations (6) and (7) for more general distributions, instead of estimating separate 6 Ti.s for each value of age zi encountered, as in equations (4) and (5). Thus, over the entire sample path, we are in some sense "averaging" over the possible initial state vectors. The approximation eliminates the need for additional simulation. For the Markovian case, the approximation is in fact exact, i.e. we have:

Theorem 2. In steady state, the SPA estimators (dT/dOi)spA, i= 1, 2, are unbiased when the arrival process is Poisson and service times are exponential. Proof. Provided in the Appendix. In Section 4, we will check empirically via simulation experiments the bias of the approximation for more general distributions. 3. GRADIENT ESTIMATION VIA LIKELIHOOD RATIOS The likelihood ratio (LR) method, also called the Score Function (SF) method, is another technique for gradient estimation in stochastic simulation models (see e.g. Refs [13-15]). Here, we briefly present an overview of the LR technique, and derive regenerative LR estimators for our problem. In general, the LR method has wider applicability than IPA, but when IPA works, it usually has much lower variance. Variance comparisons between SPA and LR, on the other hand, seem to be problem dependent (see e.g. Ref. [16]). Let E[L(X)] = .I L(x) dF(0, x)

(8)

be the performance measure of interest, where X is a random vector with joint cumulative distribution function F(O, .) and density f(O,.)/d, depending on a parameter (or vector of parameters)

Michael C. Fu et al.

720 0. Differentiating equation (8), we have

OE[L']-f--of f O0

L(x)f(O,x)dz=

=

f

L(x) Of(O'X)dx 30

f"'alnf(O'x)f(°,x)dx

L(x) Of(O, x_____f(O, ~) x) dx = Llxl 00 f(O, x)

00

= E l L ( X ) 01n f(O, X)] 00 "

(9)

Thus, in a single simulation, one can estimate the derivative of the performance measure along with the performance measure itself. Higher derivatives can be handled in a similar manner. A set of mild assumptions relating to the differentiability of the performance measure (e.g. Ref. [15,]) allows the interchange of differentiation and integration in the first line. However, as we shall see, the "naive" estimator for equation (9) leads to unbounded variance for steady-state performance measures. For the system under consideration, the interarrival times and the service times comprise the random vector. Since these times are all independently generated, the density function f will simply be the product of the density functions of the interarrival and service time distributions. Thus, the joint density over N service completions for the two-server queue would be given by N

f(O, A, . . . . . Au, X , . . . . . XN)= 1--I 9(A,) r-I f,(x,) l-I f2(x,) , i=l s0)=Li
(1o)

where A~, X , i= 1. . . . . N are the interarrival times and service times, respectively. For example, in the Markovian case with arrival rate 2 and service rates #l and/~2, equation (10) becomes N

f(O, A, . . . . . AN, X t . . . . . XN) = I-I 2 e- ~A,

l-I

lq e-

H

lllXi

S0} = l.i<~N

i= 1

U2 e- u,_x,,

(11)

Sli)=2.i<.N

and we have (Pk = 1/0k, k = 1, 2) N

ln f(O,,O2, Ai ..... Au, X t ..... XN)= ~ (In 2 - 2 A , ) + i=1

+

~

(ln#l-lqX,)

S(i) = 1,i<~N

~

(11z -l~2Xi),

(12)

S(i) = 2,i <~N

and 01nf_ 00 k

E

\0~

Ok]'

k=l,2.

(13)

Sli}=k.i<~N

The natural estimators would then be given by

(dr]

l

d~k/L.=Ni=lE T, S,)=k.~
0,)'

k= 1, 2.

(14)

The problem with these estimators is that if they are used to estimate steady state quantities by increasing the horizon length N, then it is obvious that the variance of the estimator will increase linearly with N, resulting quickly in a useless estimator. To resolve this problem, we derive a regenerative estimator instead. Using regenerative theory, we can express the mean steady-state system time as a ratio of expectations:

E[ T-] = E[Q.....~]

(15)

where r/is the number of customers served in a busy period and Q is the sum of the system times

Gradient estimation techniques for queues

721

of customers served in a busy period. Differentiation of equation (15) yields dE[T]_dE[Q]/d0k dOk E[r/]

dE[q]/dOkE[T],

k= 1, 2.

(16)

E[r/]

Let qj be the number of customers served in the jth busy period, and Lid and Xi.j be the system time and service time for the ith customer in the jth busy period, respectively. Then, employing equation (9) in conjunction with equation (13), we have the following regenerative estimators over M busy periods:

1 {i__~l ¢31nf~ 1 ~ [ Oln f ) T . ~kk/LR=~ j=~l Li'J~'k'-k)--N j=l ~ l j ~ - ; ,

(dT~

k= 1, 2,

(17)

where N -- ~ j =M, qj is the total number of customers served, and T = ~ = , Tj/N is the estimate of mean system time. The advantage of these estimators is that the summations are bounded by the length of the busy periods, so as long as the busy periods are not too long, the variance of the estimators should be reasonable. Further reductions in variance can be achieved via variance reduction techniques such as conditional expectation, but will not be pursued here. 4. SIMULATIONEXPERIMENTS In this section, we report the results of numerous simulation experiments designed to compare the performance of the IPA estimators, the approximate SPA estimators, and the LR estimators. First, we provide a brief summary of some implementation issues and practical application implications, intended to give practitioners a feel for the techniques without going into the theory of the previous sections.

4.1. Implementation issues and applications To summarize the previous two sections, the IPA estimator and the LR estimator can be easily implemented without much trouble. For the IPA estimator, it is just a matter of a few lines of code to implement the simple recursion given by equation (3). For the LR estimator, the estimation is done via equation (17). Both require simple calculations of derivatives, which can be easily done given the service time distribution. For example, for exponential service times--where X i is the ith service time generated and Ok is the mean service time of server k--IPA gives dXi/dOk=Xi/Ok, whereas LR would use equation (13) in equation (17). The SPA estimator, on the other hand, requires some additional bookkeeping having to do with quantities defined on two short trajectories of interest. The quantities that must be estimated are the expected sum of system times and the number of service completions, and the short trajectories are those which start with only server 2 busy and ends with only server 1 busy, and one which is the reverse, i.e. starts with only server 1 busy and ends with only server 2 busy. These terms are used to compute and /~2 in equations (6) and (7). To make the estimators more accurate, sometimes additional short simulations, which we will later refer to as "offiine" simulations, are used just to estimate these terms. The remainder of the terms in equations (6) and (7) are not difficult to implement, employing recursions similar to those used in the IPA estimator. The queueing system of interest was chosen for study because it provided a rich but simple testbed for comparing and contrasting the various estimators. In terms of practical application, the system could model a manufacturing cell with disparate machines. As we said, a critical component that must accompany any simulation study is sensitivity analysis. For example, in the simulation of a machine tool manufacturing system designed to estimate throughput, work-in process, and lead times, one should also assess the sensitivity of these various performance measure estimates to the cutting speeds of the various machines in the system. The traditional finite differences requires multiple simulation runs for each parameter of interest, e.g. for p parameters, the usual symmetric difference estimate would require 2p additional simulations vs. a single simulation for the gradient estimation techniques considered here (with the proviso "offline" simulations for the SPA estimator). Finally, we note that the simulations to be described next were meant to cover a wide set of parameter values for the queueing system of interest, from relatively low traffic intensity--e.g, job

A1

722

Michael C. Fu et al.

shop environments--to relatively high intensity--e.g, automated manufacturing environments such as flexible manufacturing systems.

4.2. Numerical results The following systems were studied: 1. 2. 3. 4. 5.

M/M/2 M/U/2 M/We/2 U/M/2 U/We/2. For the distributions of service times, the following expressions were used:

Exponential

Uniform

Weibull

fo(x)

(I/P) e -x/°, x > 0

1/28 (0-~
flOx °-1 e -ax°, x > 0

Fo(x )

1 - e -x/°

x/20 (0 ...20)

1 - e -~x°

X/O

X/O

- X(ln

dX dO

---

-t3F/t30 c~F/Ox

-

In fo(x)/dO

(x - 0)/82

X)/O

1/0 + ( 1 - fix °) In x

The following parameter values were chosen: 2 = 1.0 01 = 0 = 0.2, 0.4, 0.8 (for exponential and uniform); 8 = 0.5, 2.0 (for Weibull) 02=c0; c= 1.0, 1.1, 1.15, 1.2, 1.25, 1.3, 1.4, 1.5 p=0.0, 0.5, 1.0. Note that when the servers are nearly equal (i.e. c is close to unity), 0 = 0.2, 0.4 and 0.8 correspond to approximate traffic intensities of p = 0.1, 0.2 and 0.4, respectively, for the cases of exponential and uniform service times. For the Weibull case, (8, fl)=(0.5, x/~) as well as (2.0, rt/0.64) correspond to/9 = 0.2; these values were chosen because they result in different coefficients of variation. For systems 1, 2 and 4, 72 experiments were performed, while 48 experiments were performed for systems 3 and 5. Ten replications for each experiment were simulated for 100,000 busy periods, and 90% confidence intervals were constructed. For the Markovian case, exact analytical gradients are available and were used for comparison. For the other systems (2-5), estimates of the true derivatives were obtained by symmetric finite differences (FD) using common random numbers (although, for cases 2 and 3, analytical results could also be derived), which necessitated additional experiments at perturbed parameter values of 01 + A0 and 82 + A0 to obtain the finite difference derivatives for the two servers, respectively; A0 was chosen to be 0.005. For cases having uniform service distributions, 0 is the scale parameter; hence, the service times were uniform on [0, 20d in the nominal path and [0, 2(0i + 0.005)] in the perturbed paths. In these cases, gradient estimation via Likelihood Ratios (LR) is not possible due to the discontinuity of the p.d.f, at 28 (see, e.g. Ref. [17]). We now briefly summarize the results obtained for each of the cases. Some of the results--means based on 10 replications--are displayed graphically in Figs 1-5. The detailed numbers, including 90% confidence intervals, can be found in tables provided in the technical report [18]. Generally, the SPA estimates had tighter confidence intervals than the LR to FD estimates, usually about one-half to one-third the size.

Gradient estimation techniques for queues

723

1.6 1.5 1.4 O •0.8 p • 1.0

1.3 1.2

1.0 0.0 0.0

0.1

0.2 6 0.3

0.4

0.5

Fig. i. M/M/2 Server 1.

1.4 1.3" 1.2" 8 •0.8

1.1"

p•l.0

1.0" 0.9 0.8

0.7 0.0

0.1

0.2 8 0.3

0.4

0.5

Fig. 2. M/U/2 Server 1.

1. M/M/2 (refer to Fig. 1): The lowest curve corresponds to the IPA estimates, whereas the other three curves are more closely bunched. As expected, the results indicate that IPA is biased for unequal servers [6, 7], and that SPA is unbiased. The confidence intervals widen with increasing p and 6, but are always tighter than the LR estimates. 2. M/U/2 (refer to Fig. 2): The lowest curve corresponds to the approximate SPA estimates, denoted SPA1, and the second lowest curve corresponds to the IPA estimates. As pointed out earlier, LR cannot be performed in this case. Again, as expected, and shown in Fu et al. [7], IPA is biased; however, the SPA1 estimator is even more biased, because the estimates ~ , j = 1, 2, turn out to be poor approximations for the corresponding 6T~.j. In the M/G/2 case, due to the arrival process being Poisson and the presence of only two servers, independence from i can be achieved with the following change in definitions: Sl =expected sum of system times from state [(0, 1): ~2 =0] to an arrival to state (1, 0), $2 =expected sum of system times from state [(1, 0): ~ = 0 ] to an arrival to state (0, 1), M ~= expected number ofservice completions from state [(0, 1): ~z = O] to an arrival to state (1,0), M: = expected number of service completions from state [(1,0): ~ = O] to an arrival to state (0, 1).

724

Michael C. Fu et al. 2.0 " "*

1.5"

SPA1 PA SPA2

0 =2.0

FD

~: Q

1.0

p : 0.5

LU

i

05 o ,T,,

l

i

i

i

0.0

0.1

0.2

0.3

0.4

6

0.5

Fig. 3. M/We/2 Server 1.

1.3 1.2"

i :e

1 .1

0 = 0.8

1.0 •

P = 1.0

ILl

i

0.9"

o.8: 0,7

FD II

0.1

|

0.1

!

0.2 8 0.3

|

0.4

0.5

Fig. 4. U/M/2 Server 1.

These quantities can be estimated from the given sample path without the need for additional simulation, except in the extreme cases: (i) p = 0 as Sz and M x cannot be estimated, and (ii) p = 1 as $2 and M2 cannot be estimated. To test the accuracy of this implementation without rerunning the entire set of simulations, we simply ran much shorter "offline" simulations-meaning simulations separate from the main simulation of the system--to estimate Sj and M r, j = I, 2, obtaining the revised SPA2 estimator. Ten replications for each experiment were simulated for 10,000 instances of the appropriate starting and ending states. Each replication was associated with one replication of the SPA run, and 90% confidence intervals were constructed. The revised SPA2 estimator performs well, and provides tighter confidence intervals than FD. 3. M/We/2 (refer to Fig. 3): The bias of the SPA1 estimator is extremely clear, as it lies way above the other four curves, which appear to be a single curve in the graph. Again, employing the revised definitions of quantities described in the M/U/2 case and conducting the "offline" simulations gives the unbiased SPA2 estimator. Compared to LR and FD, the revised SPA2 estimate reveals significantly tighter confidence intervals. 4. U/M/2 (refer to Fig. 4): The approximate SPA estimator, represented by the lowest curve, is dearly biased. The IPA estimator, represented by the second lowest curve, is also biased. Thus, conclusions

Gradient estimation techniques for queues

725-

1.2 1"0"

~

i

0.8

0 = 2.0

=e

0.6

p =0.5

o o..il

4.

0.0

|

I

0.1

0.2

8

=



I

I

l /

0.3

0.4

0.5

Fig. 5. U/We/2 Server 1.

similar to the previous two cases hold here, except that no improved versions of the SPA estimator can be implemented, because the arrival process is not Poisson. 5. U / W e / 2 (refer to Fig. 5): The observed behavior is basically the same as that found in the M / W e / 2 queue example, with the SPA estimator clearly baised, and the curves other than the SPA estimator basically coinciding. 5. SUMMARY AND CONCLUSIONS We have c o m p a r e d the implementation and performance of various perturbation analysis and likelihood ratio gradient estimators for the problem of derivative estimation for a single-queue system with non-identical servers. The exact SPA estimator is generally not practical except for low traffic conditions, so we proposed an approximation similar in spirit to the one in Fu and H u [12], where an a p p r o x i m a t i o n performed well and is exact in the M a r k o v i a n case. We then performed a set of simulation experiments for more general distributions to c o m p a r e the various estimates. The bias of the approximate SPA estimator appears to be unpredictable, and, in fact, sometimes does worse than the I P A estimator. F o r the cases where it does work, it appears to exhibit lower variance than the LR estimate. Overall, we would recommend the use of the much simpler and easier-to-implement I P A estimator over the approximate SPA estimator when the service rates are relatively close; otherwise, the regenerative LR estimator is to be preferred over the approximate SPA estimator in all but the Markovian cases. REFERENCES 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. I I.

M. C. Fu, Optimization via simulation: a review. Ann. Ops Res. 53, 199-247 (1994). Y. C. Ho and X. R. Cao, Discrete Event Dynamic Systems and Perturbation Analysis. Kluwer Academic (1991). P. L'Ecuyer, An overview of derivative estimation. Proc. Winter Simulation Conf. pp. 207-217 (1991). S. Strickland, Gradient/sensitivity estimation in discrete-event simulation. Proc. Winter Simulation Conf., pp. 97-105 (1993). P. Glasserman, Gradient Estimation Via Perturbation Analysis. Kluwer Academic (1991). M. C. Fu and J. Q. Hu, Bias properties of infinitesimal perturbation analysis for multi-server queues. Proc. Winter Sinndation Col!/~, pp. 377-381 0990). M.C. Fu, J. Q. Hu and R. Nagi, Bias properties of infinitesimal perturbation analysis for systems with parallel servers. Computers Ops Res. 19, 409-423 (1992). W, B. Gong and Y. C. Ho, Smoothed perturbation analysis of discrete-event dynamic systems. IEEE Trans. Automatic Control AC-32, 858-867 (1987). M, C. Fu and J. Q. Hu, Extensions and generalizations of smooth~l perturbation analysis in a generalized semi-Markov pro,~ss framework. IEEF, Trans. Automatic Control 37, 1483-1500 (1992). M. C. Fu and J. Q. Hu, Consistency of infinitesimal perturbation analysis for the GI/G/m queue. Eur. J. Opl Res. 54, 121-139 0991). R. Suri and M. A. Zazanis, Perturbation analysis gives strongly consistent sensitivity estimates for the M/G/I queue. M#mt Sci. 34, 39-64 (1988).

Michael C. Fu et al.

726

M.C. Fu and J. Q. Hu, Second derivative sample path estimators for the GI/G/m queue. M.qmt Sci. 39, 359-383 { 1993). P. W. Glynn, Likelihood ratio gradient estimation for stochastic systems. Communs of the ACM 33, 75-84 0990). M. I. Re)mann and A. Weiss, Sensitivity analysis for simulations via likelihood ratios. Ops Res. 3% 830-844 {1989). R. Y. Rubinstein, Sensitivity analysis and performance extrapolation for computer simulation models. Ops Res. 37. 72-81 (1989). 16. F. J. V~.quez-Abad and P. L'Ecuyer, Comparing alternative methods for derivative estimation when IPA does not apply directly. Proc. Winter Simulation Conf., pp. 1004-1011 (1991). 17. P. L'Ecuyer, A unified view of the IPA, SF, and LR gradient estimation techniques. Mgmt Sci. 1364-1383 (1990). 18. M. C. Fu, J. Q. Hu and R. Nag), Gradient estimation for queues with non-identical servers. Institute for Systems Research Technical Report, TR 93-1 !, University of Maryland at College Park (1993). !2. 13. 14. 15.

A P P E N D I X : P R O O F O F T H E O R E M 2 - - - U N B I A S E D N E S S F O R T H E M A R K O V I A N CASE We calculate analytically the expected value of the SPA estimator in steady state, and show that it equals the derivative of mean steady-state system time. We do the proof for 0t, with a symmetrical proof following for 02. Let ). be the rate of the Poisson arrival process, and/at = l/0t and #2 = 1/02 be the service rates at the exponential servers 1 and 2, respectively. Recall that if both servers are available, then the first server is chosen with probability p and the second server with probability q = 1-p. We introduce the following notation: ~=/at +#2

p=)./# /a.= /al/*2 (/at+/a2+2).~ /at +/a2 \P/a2 +q/at + ) - / ' Since the system can be represented as a continuous-time Markov chain, the stationary state probabilities p,, s¢ {0,(1, 0),(0, I), 2.... }, can be easily found to be l--p

po =

{18}

I --p +).//a*

Pt.o

).(P/a* + P/a2) Po /a*(/at +).)

(19)

Poa

).(q/a* + P/a, ) Po #*(/a2 +).)

(20)

). p,=--~p~-tpo;

(21)

n=2, 3. . . . .

(22) Hence, the expected number in the system is given by

E[N] =

).//a*

(23)

(1 -p)(l - p +).//a*)'

and applying Little's Law, the expected system time is

E[T] =

l//a* (! --p){! -- p +)./**)"

(24)

Differentiating (24), we get dE[T]= (p+;.O2XOt +02) (1 +2).02) dOt ( l - p ) O - p + ).//a*~Ot +02 + 2).Ot02) (I-p){I - P + ).//a*)[Ol+O2 + 2).OtO2)/a*

-~

pOt + ).0102 + Ozq ).0~ + (1 --pXl --p+).//a*l(Ot +02+220,02) (0 t +02)2(I --p)2(I --p +)./It*l/a* 201

).{p+20~XOt+02)

).11+22021

21pOt+).O~Oa+O2q)~ ']

(Ot+O'l)--"~t+ Ot+O2+2/'Ot02 (Ot+O2+2)'OtO2)l't* + ~ (! --p){1 - p + )./1,*)21~*

125)

G r a d i e n t estimation techniques for queues

727

We proceed to check that in steady state, the expectation of the SPA estimator given by equation (6)gives the mine expression. We consider a customer with index i* that arrives in steady state, having system time T. We denote the customer by Ci,. We have

\dOIlsr ^

\dOll

(

J l_l-C,(zi)J

Ld0,1

+qr.fiez.: / ~ / - - I / ¢ 1 . , Ld011

/~/ - /t/x~. J L F=l.~i+yi)- F~lxllJ J

1261

W e first compute the expectation of the I PA part of equation (26), d T/dO,, given by equation (1). For exponential services with means 01 and 02 at the two servers, respectively, we have

dXs=fxs/o d0~

ifS(j)= 1. ifS(j)= 2.

(0

(271

Thus, for C~,, equation (1) becomes

dT

1 ~L,.)X~+Xi.

ifS(i*)=l,

= ~ (0

if S(i*) = 2.

(28)

But ~./~L,7 Xj + X : is simply the time from the beginning of the local busy period to the departure of Cf, so we can rewrite equation (28) as dT

ifS(i*)=l,

I f~'~+T =~ < (0

ifS(i*) = 2,

1291

where ~b(t) is the length of the local busy period at server I at the arrival time of C:. Hence. r dT -I

I

Ld01J

01

El--/=--(~")+EETI

S(i*) =

I])P{S(i*)= 1}.

1301

We calculate the various quantities in equation (30) by conditioning on the state found upon the arrival of C:. in Fu and Hu [10], reversibility was used to do similar calculations, but for the ease of unequal servers, we can no longer use this technique, due to the unaggregated states (0, 1) and ( 1, 0). We instead use the embedded Markov chain and conditioning. In the embedded Markov chain, "time" is indexed by the occurrence of arrival and departure events. Let n* be the discrete-time index of the event just prior to the arrival of Cf, so that C:, arrival is event n* + I, and let Z, be the state of the embedded Markov chain at time n. We will be conditioning on Z... We calculate ~'~ by first calculating the quantity Zc*~,defined as the length of the local busy period at server at the event just prior to the arrival time of C:. Let Z~=(~t~lZ,.=s), i.e. X~ is XTM under the condition that the state found upon the arrival of C: is s. Similarly, let ~b~= (~b~lZ .. = s), i.e. ~b=is ~"~ under the condition that the state found upon the arrival of Ci" is s. First, by definition, we have

Zo =Zo.i = ~ o = ~o4 =0.

(31)

We also have the following relationships between 7,, and ~, for the other states: 1

~LO=/.LO+/~ 1+,;., 1

~" = / " + / ~ + 2 "

(321

n~>2.

1331

We now derive expressions for the remaining Z, by conditioning on the state at n* - I and obtaining a set of linear recursive equations similar to Poisson's equation. The desired conditional probabilities are computed simply by Bayes" Rule:

PlZ.-,=tlZ.=s}

P{Z,=slZ,-I =tip, P.,

Thus. we have ;.

/

i ~

;.

/p,o~/

l

--|Z3 + ':---~ )+.-----7 [ . . . . l l Z t o + . - - - - - | /'2--~;.+/l\ /.+/~/ /.+/.l \ P2 , / \ " / - + / 1 1 /

ZLo=:.+/q

Z2+ 1

Z.=/.._~ +[~_;,.,

n>~3.

,

(34)

(35)

1361

M i c h a e l C. F u et al.

728 Solving equations (341, (35) and (36), we obtain

(f~+.,+P,.o~

2

Z2 =m(;.+~-"-~ \ ~-;-

Zl,O=

.

#2f, ^ ^

.

#,(,.+ #x#-,.) n--2

7., = Z2 +/~_,;.,

(37)

P2 /

( -p2 "X~ -

;4,2

#,(:.

(38)

+ #,)'

n~>3.

1391

We now consider the second term in equation 130), E[TiS(i*)= 1], again by conditioning. Defining r , = E [ T I S(i*)= I,

Z,,. = s], we have ro = 01,

(40)

1:1.o=0,

(41)

to.2 = 01,

(42)

n-I

~.=

+01 ,

n~>2.

(43)

Lastly, we must also express the probability term in (301, P{S(i*)= l }, conditioned on Z¢ =s. Defining .~ = P{S(i*)= 1 [ Z,, =s}, we have :/o = P, (44) .:~t.o = O,

(45)

.~o.2 = 1,

(46)

#1

.~, -

,

n/> 2.

147)

#t+#2

Finally, unconditioning the terms in equation (30) by summing over all the states, we get

fdT\

E| |--

1

=--

\dO,: o2

)-'. #t

~

1

/

n-2

n- I

~PoP+Po,] + #1+l.t2,,=2p.'~ltt~2+f,_)+ =pop+pot+

/

~ +01 /

,4,,

#i

lq+#2k#*Jkl--p/\

\#--/.

#~t-p~

We now proceed to the other terms in equation (26). By definition of the sets of L~,~), k = 1, 2, and ~b~,we have

Er dT' ieLI2n]=dpL° J

Ld02

149)

02 '

VdTil 11,7 ¢3 E -- i~L 2 =-Ldotl

J

02,

(50)

Of course, the probability terms in equation (26) also follow from the definition of the sets of L(k1~, k = l, 2. For L(~n, the condition requires a visit to state 3, followed by a departure at server 1 and then a departure at server 2. For L(22~, the conditions requires a visit to state (1, 0), followed by a departure at server 1 and then an arrival. Thus, we have

p,,:.12,,_(. #2+;.~( #, ~,1, lq(l-p)(Plt*+P#2) 2,~2 s-/vl.o ~tt----~/I,p /- / \ # 2 + / - /

\

P{iEL~I'~=

(#2 +).)(#*(I -p)+).)"

p3 ~/. /...f~+~/\i,+~ /

(51)

(52)

Since interarrival times and service times are exponential, we have .q(2i)

--=2,

(53)

1 -- G(7.i)

j

:2,.,

Lre(x,+y~)-F~{x,)J

"2

1=,.+,,2+,,2.

I._1-expi-#2y~)J

{541

the latter following because by definition of the set L(~1), ),~~ ( m i n ( X 2, A)lminiX j, A ) > X2), where A ~ G, X t ~ F 2, X2 ~/:2.

Gradient estimation techniques for queues

729

Finally, we compute A1 as follows. Let A ~ be the expectation of the integral of the number in system process from state i to and M~.j be the corresponding expected number of service completions. These quantities can be computed from the Markov chain by conditioning on states and obtaining a set of linear recursive equations, as was done to obtain Z~- We omit the details here. After solving the equations, we obtain

j,

A2_,t.o,=((pl~2+).)#' + 2+~)/[1t)'(3~-2)'/[^ )~

(p,u2 2)]';'/~' + ~

A~o.11~..o~= ( 1 + 2A 2~.,oj)/(P/12 + 2) M2~II.ol =

/~q"

/,t 1/,t2( p/.t 2 + ,.;.)

F

1 -- ,;,//~

/~--

(p/~2 + 2,/

MIo. :~ ~11.0~ = (#2 + ';-M2 -~1.0~)/( Pg2 + 2).

Then, l~ 1 = A,o.1)-H.o~ - M , o .

11~(I.o~E[T].

(55)

Using equations (48), (49), (50), (51), (52). (53), (54). and (55). the expression for the expectation of the SPA estimator, given by equation (26), was computed and compared with the analytical derivative given by equation (25) by using Mathematica. which verified that the expressions were equal. []