Comput. & Ops Res. Vol. 8, pp. 17-23. Pergamon Press Ltd. Printed in Great Britain

OPTIMAL ASSIGNMENTS IN A MARKOVIAN QUEUEING SYSTEM

JAMES P. JARVIS*
Department of Mathematical Sciences, College of Sciences, Clemson University, Clemson, SC, U.S.A.

Scope and purpose - A typical problem in the design of a service system is the assignment of servers to customers. In the context of emergency services such as fire, police, and ambulance systems, the servers may be classified by location or specialized training and the customers (calls for service) by location or type of incident. Such systems can be analyzed using the theory of queues. This paper addresses the problem of determining the assignment of servers to incidents which minimizes the average cost of assignment when individual servers may be unavailable due to previous assignments. Applications in emergency services include the minimization of average response time to an incident or the fraction of incidents incurring a queueing delay.

Abstract - A generalization of the Hypercube queueing model for exponential queueing systems is presented which allows for distinguishable servers and multiple types of customers. Given costs associated with each server-customer pair, the determination of the assignment policy which minimizes time-averaged costs is formulated as a Markov decision problem. A characterization of optimal policies is obtained and used in an efficient algorithm for determining the optimum. The algorithm combines the method of successive approximations and "Howard's method" in a manner which is particularly applicable to Markov decision problems having large, sparse transition matrices.

INTRODUCTION

A variety of models for resource allocation in public systems have been developed in the past. Many such models are noted in the recent survey article by Chaiken and Larson[3]. Notable among these is the "hypercube" queueing model developed by Larson[8] and used by Chelst[4], McKnew and Jarvis[10], and others in the analysis of urban emergency services such as fire, police, and ambulance. Jarvis[6,7] developed a generalization of the hypercube model to include a class of queueing systems having distinguishable servers and multiple types of customers. A Markovian formulation is presented here which uses Markov decision theory to determine the assignment of servers to customers which minimizes the average cost of assignment. For spatially distributed systems, where costs are measured in terms of travel distance from server to customer, the formulation is equivalent to that given by Carter et al.[2] for the two-server case and expanded by Wrightson[15]. The solution algorithm developed here has some characteristics similar to those of a procedure by Varaiya et al.[14] using a nonlinear programming approach. An efficient solution procedure and computational experience are presented for the general system.

MODEL FORMULATION

We consider a variation of an M/M/N queueing system. Customers are classified into one of R mutually exclusive classes. Requests for service from each customer class arrive according to an independent stationary Poisson process, the nth class having call rate λ_n. The N servers are indexed 0 to N-1 inclusive, the ith server having an exponential service time with mean 1/μ_i independent of the type of customer being served or the past history of the system. Exactly one server is assigned to each customer if a server is available. No preemption is allowed.

Initially we restrict ourselves to the loss system (zero line capacity). Customers arriving when the system is saturated (all servers busy) are assumed to be irrevocably lost or served by some means extraneous to the system being examined, although the system may incur a cost associated with such customers. This system can be modeled as a continuous-time Markov process as follows. The state

*James P. Jarvis received his Ph.D. from the Massachusetts Institute of Technology (1975) in the field of Operations Research. His research interests include resource allocation in public systems such as emergency services and recreation systems and the development and implementation of algorithms for solving such problems. He is an Associate Professor of Mathematical Sciences at Clemson University, teaching courses in Operations Research and related topics.


space has 2^N elements, each state being a binary vector of length N where the ith element (zero origin, indexing from the right) is zero if the ith server is idle and one if the ith server is busy. Transitions between states involve a change in status of exactly one server. An upward transition refers to an idle server being assigned to a customer (an element of the binary vector changes from 0 to 1); a downward transition (1 to 0) refers to a service completion and, consequently, a busy server becoming idle. Downward transition rates depend solely on the service rate of each particular busy server, while upward transition rates are determined by the assignment rule which specifies the server to be assigned to each class of customer as a function of server availability. For convenience, index the states from 0 to 2^N - 1, with state s having as binary representation the N-vector B(s). The assignment of servers to customers in state s can be specified by an assignment vector K(s) of length R, where the nth component of K(s), denoted by K(s, n), is the index of the server to be assigned to customers of type n in state s. The state transition diagram for a representative system (N = 3, R = 4) is given in Fig. 1. For example, in state 4, binary (100), servers 0 and 1 are idle; server 2, busy. With K(4) = (0011), customers of type 1 and 2 are assigned to server 0; types 3 and 4 to server 1.

With these definitions, we can write the equations of balance for the continuous-time Markov process. Let P(s) denote the steady-state probability of state s; the total customer arrival rate, λ, is the sum of the individual rates λ_n. Then,

    P(s) · (δ(s) · λ + W(s)) = Σ_{j: B(s,j)=0} μ_j · P(s + 2^j) + Σ_{j: B(s,j)=1} P(s - 2^j) · Σ_{m: K(s-2^j, m)=j} λ_m    (1a)

    for s = 0, 1, ..., 2^N - 1.    (1b)

Fig. 1. State transition diagram for the 3 server, 4 customer system with K(0) = (0011), K(1) = (2111), K(2) = (0020), and K(4) = (0011). Customers can be assigned only to an idle server. Hence, in state 0, all servers are available; in state 1, only servers 1 and 2; in state 2, only servers 0 and 2; in state 3, only server 2; etc. A 3 server, 4 customer system must have K(3) = (2222), K(5) = (1111), and K(6) = (0000), as only one server is idle in each of these states.
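As a concrete sketch (not the paper's code) of this encoding and of the balance equations (1), the following Python fragment builds the generator matrix for the 3 server, 4 customer example of Fig. 1 and computes the steady-state probabilities by uniformization. Only the assignment vectors come from the Fig. 1 caption; the call and service rates are invented for illustration.

```python
# Sketch of the bitmask state encoding and the balance equations (1) for the
# 3 server, 4 customer example of Fig. 1.  The rates mu and lam are invented
# for illustration; the assignment vectors K are those of the Fig. 1 caption.

N, R = 3, 4
mu = [1.0, 1.5, 2.0]          # hypothetical service rate of server i
lam = [0.5, 0.5, 1.0, 1.0]    # hypothetical call rate of customer class n

# K[s][n] = server assigned to a class-n customer in state s.  States 3, 5, 6
# have a single idle server, so their assignments are forced; state 7 has none.
K = {0: (0, 0, 1, 1), 1: (2, 1, 1, 1), 2: (0, 0, 2, 0), 3: (2, 2, 2, 2),
     4: (0, 0, 1, 1), 5: (1, 1, 1, 1), 6: (0, 0, 0, 0)}

def busy(s, i):               # ith element of the binary state vector B(s)
    return (s >> i) & 1

S = 2 ** N
Q = [[0.0] * S for _ in range(S)]          # generator of the Markov process
for s in range(S):
    for i in range(N):
        if busy(s, i):
            Q[s][s - 2 ** i] += mu[i]      # downward transition (1 -> 0)
    if s != S - 1:                         # arrivals are lost when saturated
        for n in range(R):
            Q[s][s + 2 ** K[s][n]] += lam[n]   # upward transition under K
    Q[s][s] = -sum(Q[s][t] for t in range(S) if t != s)

# Steady-state probabilities P(s) by uniformization: P <- P (I + Q / Lam).
Lam = 1.0 + max(-Q[s][s] for s in range(S))
P = [1.0 / S] * S
for _ in range(5000):
    P = [P[t] + sum(P[s] * Q[s][t] for s in range(S)) / Lam for t in range(S)]

print(round(sum(P), 6))       # the probabilities sum to 1  -> 1.0
```

Solving the linear balance equations directly, as discussed in [8], gives the same P(s); uniformization is used here only to keep the sketch dependency-free.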


where δ(s) = 0 for s = 2^N - 1 and δ(s) = 1 otherwise. W(s) is the sum of the service rates for those servers busy in state s:

    W(s) = Σ_{i: B(s,i)=1} μ_i.    (2)

The coefficient of P(s) on the left side of (1a) is the total rate at which the system leaves state s and is policy independent. The first summation on the right corresponds to downward transitions into state s (service completions); the second summation details upward transitions into state s (customer arrivals). Note that each state is adjacent to exactly N other states. A detailed discussion of the solution of these equations and their use in calculating system performance measures is given elsewhere[8]. There is a unique solution to (1) since the Markov process consists of a single recurrent class.

To focus on the cost of assigning servers to customers, we introduce two cost terms. Let C(i, n) be the cost of assigning server i to a customer of type n. In addition, let S(n) denote the cost associated with a customer of type n arriving when the system is saturated (all servers busy, state 2^N - 1). The expected cost of a transition out of state s under assignment K(s), denoted q(s, K), is given by

    q(s, K) = Σ_{n=1}^{R} C(K(s, n), n) · λ_n/(λ + W(s))    for s = 0, 1, ..., 2^N - 2.    (3)

The term λ_n/(λ + W(s)) is the probability that a transition is due to the arrival of a customer of type n. A slight modification is made for the saturation state 2^N - 1:

    q(2^N - 1, K) = Σ_{n=1}^{R} S(n) · λ_n/(λ + W(2^N - 1)).    (4)

Note that the costs associated with the saturation state arise from a virtual transition into state 2^N - 1 and are policy independent. The average cost per unit time, g, can be written

    g = Σ_{s=0}^{2^N - 1} P(s) · q(s, K) · (δ(s) · λ + W(s))    (5)

where P(s) is the equilibrium probability given in (1).

SOLUTION PROCEDURE

In this setting, Markov decision theory can be applied to determine the optimal assignment policy to minimize the long-run cost of assignment. However, even for moderate-sized systems, a straightforward application of Howard's algorithm[5] is computationally infeasible. For example, in state 0 alone, each of the N servers could be assigned to any of the R customer types. Hence, there are N^R alternatives to consider. For emergency services applications, a small system might have 4-8 servers and 30-50 customer types. This yields a minimum of 4^30 or roughly 10^18 alternatives. Another approach is indicated.

It can be shown (e.g. Ross[12]) that a necessary and sufficient condition for the determination of an optimal policy with cost g* is the existence of state values v(s), s = 0, ..., 2^N - 1, such that

    (δ(s) · λ + W(s)) · v(s) = min_K [ q(s, K) · (λ + W(s)) + Σ_{j: B(s,j)=1} μ_j · v(s - 2^j) + Σ_{j: B(s,j)=0} v(s + 2^j) · Σ_{m: K(s,m)=j} λ_m - g* ]    for s = 0, 1, ..., 2^N - 1,    (6)

where the optimal assignment achieves the minimum over all assignment vectors K in state s. This functional equation can be used to obtain a characterization of the optimal policy.
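To make the minimization in (6) concrete, the following hedged sketch (all numbers invented, not from the paper) evaluates the bracketed expression for every one of the N^R assignment vectors in the all-idle state of a tiny system. Since W(0) = 0 and the bracket reduces to Σ_n λ_n · (C(K(n), n) + v(2^K(n))) - g*, the argmin decomposes by customer class, which is the observation the characterization developed below exploits.

```python
# Hedged sketch: brute-force minimization in (6) for state s = 0 (all idle)
# of an invented system, versus the per-class greedy rule.  The bracket equals
# sum_n lam_n * (C(K(n), n) + v(2**K(n))) - g*, so the argmin decomposes.

from itertools import product

N, R = 3, 2
lam = [1.0, 2.0]
C = [[1.0, 4.0],              # C[i][n]: invented cost of server i, class n
     [2.0, 2.0],
     [5.0, 1.0]]
v = [0.0, 1.0, 0.5, 3.0, 2.0, 4.0, 3.5, 6.0]   # invented state values
g = 1.0                                         # invented gain (cancels in argmin)

def bracket(K):               # expression inside min[...] of (6) for state 0
    return sum(lam[n] * (C[K[n]][n] + v[2 ** K[n]]) for n in range(R)) - g

brute = min(product(range(N), repeat=R), key=bracket)      # N**R candidates
greedy = tuple(min(range(N), key=lambda i: C[i][n] + v[2 ** i])
               for n in range(R))                          # N*R comparisons
print(brute, brute == greedy)   # -> (0, 1) True
```

The same pairwise comparison C(i, n) + v(s + 2^i) versus C(j, n) + v(s + 2^j) reappears in the policy-improvement step derived in the next section.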


Consider a state s in which there is a policy alternative; that is, at least two servers idle. Index these servers i and j. Let K*(s) denote the optimal assignment in state s and suppose that K*(s, n) = i; customers of type n are assigned to server i. We consider the alternative decision K(s) where K(s) = K*(s) except K(s, n) = j; type n is assigned to server j. K* achieving the minimum in (6) implies

    (λ + W(s)) · q(s, K*) + Σ_{a: B(s,a)=0} v(s + 2^a) · Σ_{m: K*(s,m)=a} λ_m ≤ (λ + W(s)) · q(s, K) + Σ_{a: B(s,a)=0} v(s + 2^a) · Σ_{m: K(s,m)=a} λ_m.    (7)

Canceling terms which appear on both sides of (7), we have

    C(i, n) + v(s + 2^i) ≤ C(j, n) + v(s + 2^j)

or

    C(i, n) - C(j, n) ≤ v(s + 2^j) - v(s + 2^i).    (8)

Using (8), it is a simple matter to prove: if C(i, n) - C(j, n) > C(i, m) - C(j, m), then the optimal assignment rule will never assign customers of type n to server i and customers of type m to server j in any state where both servers i and j are idle. Suppose the contrary. Then in some state s, K*(s, n) = i and K*(s, m) = j. Using (8), v(s + 2^j) - v(s + 2^i) ≥ C(i, n) - C(j, n) > C(i, m) - C(j, m) ≥ v(s + 2^j) - v(s + 2^i), which is a contradiction. Heuristically, if in any state it is unprofitable to pay the difference C(i, m) - C(j, m) to assign server i to customers of type m rather than server j, then it would be inconsistent to pay the larger difference C(i, n) - C(j, n) to assign server i to customers of type n instead of server j. This type of characterization is possible since service times are independent of customer type. The result is similar to those developed by Albright[1], but the costs and service times employed here need not possess the structure required by Albright.

In the case of geographically derived systems where the costs represent travel times or distances, the characterization of the optimal policy has a particularly intuitive interpretation. The cost differences determine the boundaries of assignment regions for spatially distributed servers. In state s, a customer at "location" n would be assigned to server i rather than j if the travel distance from i to n minus the travel distance from j to n is less than or equal to v(s + 2^j) - v(s + 2^i). This interpretation parallels that given for the two-server case by Carter et al.[2].

Using the characterization of the optimal policy, Howard's iteration in policy space can be applied in the usual manner, except that all policies need not be considered in the policy improvement phase. Letting A denote the matrix of transition rates for the continuous-time Markov process under a current policy and D the diagonal matrix with elements (δ(s) · λ + W(s)), we solve the system of equations

    g = D · q + A · v    (9)

for g and v (setting some v(s) equal to a constant). Now, for each state s with server i idle, we assign customers of type n to i provided that C(i, n) + v(s + 2^i) ≤ C(j, n) + v(s + 2^j) for every other available server j. Using the new policy determined in this manner, return to (9),


continuing until no changes in policy are indicated. The optimal assignments have then been determined. Unfortunately, this solution procedure is of limited practical use because of the large linear system to be solved at each iteration for even moderate N. Instead we use a scheme of successive approximations for Markov renewal programs developed by Schweitzer[13], following a procedure given by Odoni[11] for discrete time systems. In the usual notation, consider a Markov renewal problem with states s = 0, 1, ..., 2^N - 1. Under decision K in state s, we have expected costs q(s, K), transition probabilities p(s, t, K), and expected holding times T(s, K). Then w(s, x) and y(s, x) are defined at the xth iteration by

    y(s, x) = min_K [ q(s, K) + Σ_{t=0}^{2^N - 1} p(s, t, K) · w(t, x) - w(s, x) ] / T(s, K)    for s = 0, 1, ..., 2^N - 1,    (10a)

and

    w(s, x + 1) = w(s, x) + T · [y(s, x) - y(0, x)]    for s = 1, 2, ..., 2^N - 1,    (10b)

where w(0, x) = 0, 0 < T < min[T(s, K)/(1 - p(s, s, K))], and the w(s, 0) are arbitrary. Then w(s, x) converges to v(s), U(x) = max_s[y(s, x)] decreases monotonically to g*, and L(x) = min_s[y(s, x)] increases monotonically to g*, where the v(s) and g* satisfy (6) if the Markov chain is irreducible under every policy.

This algorithm is modified to solve the assignment problem in the following manner. As noted in determining a characterization of the optimal policy, it is not necessary to consider all possible decisions in order to find the minimum in (10a) at the xth iteration. Instead, if server i is available in state s, we assign to it those customer types n for which C(i, n) - C(j, n) ≤ w(s + 2^j, x) - w(s + 2^i, x) for any server j which is idle in state s. After some simplification, the condition on T becomes 0 < T < min[1/λ, 1/μ] where μ = W(2^N - 1). Note that the queueing system defines an irreducible Markov process under every policy since all customers are provided service if any server is idle. (The state with all servers busy communicates with every state.) Hence convergence is guaranteed.

In practice, if the minimization in (10a) does not result in a policy change at a particular iteration, the iterative scheme can be applied to the current policy as a value determination procedure for several steps without checking for policy improvement. The w(s, x) and bounding terms will converge toward the relative values and gain associated with the current policy. These relative values can be viewed as new initial conditions for the full iteration (10). The algorithm is terminated on the basis of the bounds for the gain of the system. Hence suboptimal policies may be used provided their associated gain is within an arbitrary prespecified tolerance of the optimal value. It should be noted that this iteration scheme is somewhat intermediate between the usual successive approximations procedure and Howard's iteration in policy space.
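The scheme (10) can be sketched in its generic Markov renewal form. The following toy example (two states, two decisions, with invented costs, probabilities, and holding times, not the queueing system above) iterates (10a)-(10b) with a step size T chosen inside the stated bound, and sandwiches the optimal gain between the bounds max y and min y:

```python
# Hedged sketch of iteration (10) for a tiny invented Markov renewal problem
# (two states, two decisions); q, p and T below are illustrative numbers only.

S, A = 2, 2
q = [[1.0, 2.0], [4.0, 3.0]]        # q[s][k]: expected cost of a transition
T = [[1.0, 2.0], [1.0, 1.0]]        # T[s][k]: expected holding time
p = [[[0.2, 0.8], [0.6, 0.4]],      # p[s][k][t]: transition probabilities
     [[0.7, 0.3], [0.5, 0.5]]]

# Step size: 0 < tau < min T(s, K) / (1 - p(s, s, K)).
tau = 0.5 * min(T[s][k] / (1.0 - p[s][k][s]) for s in range(S) for k in range(A))

w = [0.0] * S                       # w(s, 0) arbitrary; w(0, x) stays 0 here
for x in range(2000):
    # (10a): y(s, x) = min over K of [q + sum_t p * w(t) - w(s)] / T
    y = [min((q[s][k] + sum(p[s][k][t] * w[t] for t in range(S)) - w[s]) / T[s][k]
             for k in range(A)) for s in range(S)]
    U, L = max(y), min(y)           # bounds: L <= g* <= U, tightening each pass
    if U - L < 1e-10:
        break
    # (10b): relative-value update, normalized against state 0
    w = [w[s] + tau * (y[s] - y[0]) for s in range(S)]

print(round(0.5 * (U + L), 4))      # estimate of the optimal gain g*  -> 1.5714
```

In the assignment problem, the inner minimization over decisions is replaced by the cheap pairwise comparisons of the characterization, which is what keeps each pass over the 2^N states inexpensive.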
In effect, since large sparse systems of linear equations are ordinarily solved by iterative techniques, we use the intermediate information obtained from such iterations to determine improved policies. When no policy improvement occurs, the iteration for the solution of the linear system is continued alone. The bounds generated on the gain, whether for the Markov decision problem as a whole or for a particular policy, provide the mechanism for determining convergence. This procedure should prove useful in any Markov decision problem having a large, sparse transition matrix.

Although the formulation given above is for the zero line capacity case, the non-zero case can be handled as well by considering the system as a semi-Markov process. All transitions are as defined above except that self-transitions in state 2^N - 1 (saturation) correspond to excursions of the Markov process into the states with customers waiting for service. Equation (4) for the expected cost of such a transition is multiplied by the expected number of upward transitions before returning to state 2^N - 1 (all calls arriving during saturation incur the same cost). The term T(2^N - 1, K) is modified similarly. For example, with a total of M waiting positions and utilization ρ = λ/μ, the number of transitions is 1/(1 - ρ) for M = ∞, ρ < 1, and (1 - (1 + ρ) · ρ^M)/(1 - ρ) for finite M.

COMPUTATIONAL EXPERIENCE

Typical numerical results for a sample of eleven randomly generated problems are presented in Table 1. Under low congestion, the optimal assignment is to that available server with the lowest cost of assignment. As congestion increases, two phenomena are observed. First, it may be more efficient in the long run to assign other than the "minimum cost" server to a particular call in order to leave the system in a state which better anticipates future events. As is usually the case in Markov decision problems, the long-run optimum can be different from the best myopic or short-run decisions. The second characteristic concerns the saturation state (all servers busy). Depending on the relative cost of calls arriving during saturation as compared to the ordinary assignment costs, it may be advisable to avoid (favor) the saturation state. Hence the optimal policy may tend to assign fast (slow) servers in order to bias the state distribution away from (toward) the saturation state.

The results in Table 1 are for systems having intermediate congestion; server utilization, the total call rate divided by total service rate, is equal to one-half. Generally, the decrease in the cost of assignment provided by the optimization is only a few percent as compared to the myopic optimum. Such behavior is particularly characteristic of geographically derived systems of certain types, such as urban police systems and fire departments, where the call types correspond to the locations of requests for service and the cost of assignment to travel time or distance from server to customer. It has been noted by Larson and Stevenson[9], Jarvis[6], and Varaiya et al.[14] that average travel distance for multiserver, geographically distributed systems is relatively insensitive to changes in the assignment scheme.

Besides decreasing the cost per call, the optimal policy has the additional benefit of tending to equalize the workloads (fraction of time busy) of the servers. Heuristically, the optimal policy incurs slightly larger immediate costs by assigning other than the myopic optimal choice and, in doing so, avoids overworking servers and facing higher costs when those servers are unavailable to serve otherwise inexpensive customers.

In practical usage, the optimization appears to be of little benefit for minimizing costs in many spatially derived systems. For emergency services, the uncertainty in estimates of system parameters would often exceed the expected improvement in cost afforded by the optimization. The utility of the approach would be largely restricted to workload balancing or reducing saturation probabilities at no cost increase, or to situations where a cost improvement of a few percent is judged significant.

A measure of computational effort for each of the sample problems is given by the CPU seconds required to solve each system in Table 1. All results were obtained from a PL/I program run on an IBM 370/165 computer. The CPU time required to solve a 12 server system

N

R

4 4 4 6 6 6 8

5 IO I5 10 20 30 10

I .7% 0.5 1.9 1.9 2.1 2.5 I.1

: IO IO

30 20 2s 50

2.4 2.3 3.9 3.8

I. Typical

Saturation

5.4% 4.3 8.0 - I.? 3.2 7.0 2.6 4.5 7.7 14.4 14.8

numerical Workload

20.97C 31.1 Il.3 26.0 32.2

26.5 40.8 39.0 30.2 41.8 51.8

results POL-IMP 9 iter 8 8 I9 28 32 63 57 63 IO6 IIS

CPU 0.30 set 0.33 0.37 I.11 I.81 2.73 10.30 IS.43 23.15 155.92 312.99

Percentage decrease in cost per call, saturation probability, and maximum workload imbalance in changing from the myopic to the global optimal policy for eleven systems. POL-IMP is the number of policy improvement steps made before convergence to the optimal assignments. CPU is the computer execution time in seconds (IBM 370/165). All system parameters were generated ar random from a uniform distribution on zero-ten.


is estimated at ten minutes. A rough practical limit of problem size is currently 12 servers and 50 customer types using the algorithm described above (a system of roughly 4000 states with 50 assignments to be made in each).

SUMMARY

We have presented a generalization of the Hypercube queueing model for multiserver exponential queueing systems. The determination of the optimal assignment of customers to servers is formulated as a Markov decision problem. A characterization of the optimal policy in terms of the differences in assignment costs is obtained and incorporated in an algorithm to find the optimum by a variation on the method of successive approximations. The optimization also tends to equalize workload imbalances and saturation probabilities in the system.

Acknowledgements - This problem was suggested by my thesis advisor, R. C. Larson, and was largely completed while an NSF Fellow at the Massachusetts Institute of Technology. The computations presented herein were completed at the Clemson University Computer Center.

REFERENCES
1. S. C. Albright, Structure of optimal policies in complex queueing systems. Ops Res. 26, 10X-1027 (1977).
2. G. M. Carter, J. M. Chaiken and E. Ignall, Response areas for two emergency units. Ops Res. 20, 571-594 (1972).
3. J. M. Chaiken and R. C. Larson, Methods for allocating urban emergency units: a survey. Management Sci. 19, P110-P130 (1972).
4. K. R. Chelst, Transferring the Hypercube Queueing Model to the New Haven Police Department: A Case Study in Technology Transfer. The New York City Rand Institute, WN-9034-HUD (1975).
5. R. A. Howard, Dynamic Programming and Markov Processes. MIT Press, Cambridge, MA (1960).
6. J. P. Jarvis, Optimal Dispatch Policies for Urban Server Systems. Operations Research Center, IRP-TR-02-73, Massachusetts Institute of Technology (1973).
7. J. P. Jarvis, Optimization of Stochastic Service Systems with Distinguishable Servers. Operations Research Center, IRP-TR-19-75, Massachusetts Institute of Technology (1975).
8. R. C. Larson, A hypercube queueing model for facility location and redistricting in urban emergency services. Comput. Ops Res. 1, 67-95 (1974).
9. R. C. Larson and K. A. Stevenson, On insensitivities in urban redistricting and facility location. Ops Res. 20, 595-612 (1972).
10. M. A. McKnew and J. P. Jarvis, Applying the hypercube in Arlington, Massachusetts. In Police Deployment: New Tools for Planners (Ed. R. C. Larson). Lexington Press, Lexington, MA (1978).
11. A. R. Odoni, On finding the maximal gain for Markov decision processes. Ops Res. 17, 857-860 (1969).
12. S. M. Ross, Applied Probability Models with Optimization Applications. Holden-Day, San Francisco (1970).
13. P. J. Schweitzer, Iterative solution of the functional equations of undiscounted Markov renewal programming. J. Math. Analysis Applic. 34, 495-501 (1971).
14. P. Varaiya, U. Schweizer and J. Hartwick, A Class of Markovian Problems Related to the Districting Problem for Urban Emergency Services. Electronic Systems Laboratory, ESL-P-594, Massachusetts Institute of Technology (1975).
15. C. W. Wrightson, The design of response areas for emergency service units. Paper presented at Joint Natl Meeting ORSA/TIMS, San Francisco (1977).