Computers & Industrial Engineering 61 (2011) 842–847
Computers & Industrial Engineering journal homepage: www.elsevier.com/locate/caie
Analysis of an unobservable queue using arrival and departure times
Jinsoo Park a, Yun B. Kim a,⇑, Thomas R. Willemain b
a Department of Systems Management Engineering, Sungkyunkwan University, Suwon, South Korea
b Department of Decision Science and Engineering Systems, Rensselaer Polytechnic Institute, Troy, NY, USA
Article info
Article history: Received 27 October 2010; Received in revised form 23 May 2011; Accepted 26 May 2011; Available online 31 May 2011
Keywords: Queueing; Queue inference

Abstract
Suppose one wants to assess the efficiency of a queueing system but is unable to observe its internal operations directly. This situation might arise if one works with a restricted set of historical data or because secrecy restricts access to the queueing facility. One might still be able to observe, from outside the system, the exact arrival and departure times of each customer. Using observations of arrival and departure times and knowledge of whether the service discipline is first-come-first-served or last-come-first-served, one can exactly reconstruct the unobserved queue delays and service times of any sequence of arrivals during busy periods when the number of customers is greater than the number of servers. If the number of servers is also unknown, it too can be estimated. In this paper, we propose optimization models which determine the unknown number of servers. © 2011 Elsevier Ltd. All rights reserved.
1. Introduction

This paper is in the tradition of queue inference studies. Much of that work derives from Larson's seminal paper on the Queue Inference Engine (QIE) (Larson, 1990, 1991), which assumed that the only available data are the times at which service starts and stops. Our assumptions are complementary: Larson assumes that only the internal operation of the service facility is visible, while we assume that system arrivals and departures are visible but both the queue and the internal operations of the service facility are invisible.

Since Larson's original work, a number of related papers have appeared which improve computing times and/or generalize assumptions. Bertsimas and Servi (1992) reduced the calculation time of Larson's method and extended it to include time-varying Poisson arrivals. Daley and Servi (1992) further extended the arrival process to Erlang-k interarrival times. Dimitrijevic (1996) presented an algorithm for inferring the most likely length of an M/G/1 queue from service completion times. Bingham and Pitts (1999) inferred the arrival rate and mean service time from queue length data non-parametrically. Jang, Suh, and Liu (2001) estimated the waiting time in GI/G/2 systems from server observations. Jones (1999) inferred balking behavior with Larson's QIE. Mandelbaum and Zeltyn (1998) estimated the queues of open Markovian queueing networks from service data and a Poisson arrival assumption. Toyoizumi (1997) presented a new proof of Sengupta's invariant relationship between virtual waiting time and
This manuscript was processed by Area Editor Paul Savory.
⇑ Corresponding author. Tel.: +82 10 33624419; fax: +82 31 2907610. E-mail address: [email protected] (Y.B. Kim).
0360-8352/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.cie.2011.05.017
attained sojourn time. Ross, Taimre, and Pollett (2007) proposed estimates of arrival and service rates from queue length data. Pickands and Stine (1997) estimated the arrival rate and holding time distribution of a discrete time queue using information about the size of the queue. To our knowledge, the work presented here is the first to draw exact inferences about queueing and service times from arrival and departure times when the number of servers is unknown.

This paper proposes new approaches for analyzing unobservable queues using external observations. Suppose one wants to assess the efficiency of a queueing system whose internal operations one is unable to observe directly. One must then exploit the arrival and departure times. Given the exact arrival and departure times of each customer starting from an empty and idle state, the number of servers, and either a first-come-first-served (FCFS) or last-come-first-served (LCFS) service discipline, one can exactly compute both the time in queue and the time in service of every customer in a busy period. If these same assumptions hold except that one does not know the number of servers, one can accurately estimate the number of servers in addition to the queueing and service times.

We can imagine several applications for this work. One is when the analyst is working from incomplete historical data. For instance, one may have the times at which jobs go into and out of a manufacturing facility, but no information about the details of the production process, including the number of parallel servers available for processing jobs (for example, a single-line, multiple-server configuration). The analysis developed below can be used to infer the production capacity from the minimal information available on each job. Another application involves competitive intelligence. One may be forced to observe the operation of a competitor's facility
from outside because the competitor would never invite one inside. The analysis developed below would allow one to estimate the efficiency of the competitor's processing system, as measured by queueing delays, as well as the competitor's production capacity.

Our analysis of an unobservable queueing system can also be used to analyze internet traffic. Modeling internet traffic is difficult because web services use packet switching and measuring service quality is complicated. Identifying the internal operation of the network makes modeling more difficult. Additionally, multi-core processors are used for packet operation since most internet service providers (ISPs) want to know packet source, destination, service type, and so on. All ISPs have their own processing algorithms to avoid patent conflicts, which makes defining the queueing process almost impossible. From external observations of arrival and departure times, we deduce the internal operations of the underlying queueing system. Therefore, our research is a first step toward overcoming these internet modeling problems.

The organization of the rest of this paper is as follows. In Section 2, we develop an optimization-based inference procedure for the first-come-first-served (FCFS) service discipline. In Section 3, we extend the procedure to the last-come-first-served (LCFS) discipline. In Section 4, we report computational experiments. Sections 5 and 6 state conclusions and directions for future research.
2. Analysis for the FCFS service discipline

2.1. The number of servers is known

We assume here that the service discipline is FCFS, the number of servers is known, and we can only observe the times of arrival to and departure from the system of individual customers. We also assume that customers do not remain inside the facility after being served, so that service completion times equal system departure times. Using this information, we infer the unobservable variables of delay in queue and time in service during each busy period, defined as a period during which at least one server is busy. First, we fix notation:

$N$          Total number of customers
$D_{(k,m)}$  kth order statistic among the first m departure times
$B_i$        Time service begins for customer i
$Q_i$        Queueing time (before service) of customer i
$c$          Number of servers
$S_i$        Service time of customer i (assumed to be independent)
$T_i$        System sojourn time of customer i ($T_i = D_i - A_i = Q_i + S_i$)
$A_i$        Arrival time of customer i
$D_i$        Departure time of customer i

Here, $N$ and the $A_i$'s and $D_i$'s are known values (i.e., historical data), while the $D_{(k,m)}$'s and $T_i$'s can be directly calculated from these. The capacity $c$ may or may not be known. Our purpose is to infer the value of $c$ (if it is unknown), the service times $S_i$, and the queueing delays $Q_i$.

Analysis is straightforward if the number of servers is known. With $c$ known, if the system starts empty and idle, then the first $c$ customers will begin service immediately upon arrival, so

$$B_i = A_i, \quad i = 1, \ldots, c \tag{1}$$

When all $c$ servers are busy at the time of a customer's arrival, that customer joins the queue, begins service immediately upon the next service completion, and then departs immediately after being served. Thus the general expression for the time that the ith customer's service begins is

$$B_i = \max\{A_i, D_{(i-c,\,i-1)}\}, \quad i = c+1, \ldots, N \tag{2}$$

Using (2), the service and waiting times can be calculated as

$$S_i = D_i - B_i \tag{3}$$

$$Q_i = B_i - A_i \tag{4}$$

Using (3) and (4), one can deduce the individual values of service and queueing times within each busy period. Results for this simple case have previously been presented by Hall (1991) using cumulative arrival and departure diagrams.

2.2. The number of servers is unknown

Now consider the case in which the number of servers is unknown. In this case, the primary objective of the analysis may be to infer the number of servers, $c$, though one may continue to be interested in the distributions of queueing and/or service times. Once the number of servers is found, the service and waiting times can be calculated via Eqs. (3) and (4). In order to obtain the number of servers, we observe the behavior of service times as the number of servers is changed. Theorem 1 is the key to estimating the number of servers, $c$.

Theorem 1. The variance of the inferred service times has a unique minimum at $c$, where $c$ is the actual number of servers in the FCFS queueing system, if the service times are independent.

Proof. The reasoning behind the proof is as follows. There are two situations to consider: either $c$ is underestimated or overestimated. In each case, we use a reductio ad absurdum argument. We show that if we under- or over-estimate the number of servers, the variance of the estimated service times will increase. To compensate for an incorrect estimate of the number of servers, one must shift the beginning times of service either forward or backward, which increases the variance of the service times.

The case of underestimation. Suppose the actual number of servers is $c$ but we estimate it as $\hat{c} = c - k$ for some positive integer $k$. If we denote by $\hat{B}_i$ and $\hat{S}_i$ the estimated service beginning and service times of customer i, we can find them as we did using Eqs. (2) and (3).

$$\hat{B}_i = \max\{A_i, D_{(i-c+k,\,i-1)}\}, \quad i = c-k+1, \ldots, N \tag{5}$$

$$\hat{S}_i = D_i - \max\{A_i, D_{(i-c+k,\,i-1)}\} \tag{6}$$

Rewriting Eq. (6) to show the difference between the true and estimated values of $S$ gives

$$\hat{S}_i = \left[D_i - \max\{A_i, D_{(i-c,\,i-1)}\}\right] - \left[\max\{A_i, D_{(i-c+k,\,i-1)}\} - \max\{A_i, D_{(i-c,\,i-1)}\}\right] \tag{7}$$

The first term on the right-hand side is the service time under the actual number of servers, $c$; it is denoted $S_i$ in Section 2.1. The error in the estimated service time is determined by the last term on the right-hand side, which depends on the arrival time of the ith customer and the departure times of two other customers. There are three cases: the arrival occurs before, between, or after the two customers' departures. These three cases can be described as follows:

Case (1)

$$A_i < D_{(i-c,\,i-1)} < D_{(i-c+k,\,i-1)} \quad \text{so} \quad \hat{S}_i = S_i - \left[D_{(i-c+k,\,i-1)} - D_{(i-c,\,i-1)}\right] \tag{8}$$
Case (2)

$$D_{(i-c,\,i-1)} < A_i < D_{(i-c+k,\,i-1)} \quad \text{so} \quad \hat{S}_i = S_i - \left[D_{(i-c+k,\,i-1)} - A_i\right] \tag{9}$$

Case (3)

$$D_{(i-c,\,i-1)} < D_{(i-c+k,\,i-1)} < A_i \quad \text{so} \quad \hat{S}_i = S_i \tag{10}$$

Generalizing these three cases, we represent the estimated service time as

$$\hat{S}_i = S_i - r_i, \quad r_i \ge 0 \tag{11}$$

The error term, $r_i$, represents the sum of underestimated service times of $k$ customers except the ith in cases (1) and (2) (see Figs. 1 and 2), or equals 0 in case (3). The independence of service times guarantees the independence of $S_i$ and $r_i$. Generalizing these three cases, we conclude that

$$\mathrm{Var}(\hat{S}_i) = \mathrm{Var}(S_i) + \mathrm{Var}(r_i) \quad (\because\ r_i \ge 0 \text{ and } S_i \text{ and } r_i \text{ are independent}) \tag{12}$$

Thus, if we underestimate the number of servers, the variance of service times will be increased.

Fig. 1. Illustrating underestimation in a FCFS system (case 1).

The case of overestimation. In the case of overestimation, assume that the actual number of servers is $c$ but the estimated number of servers is $\hat{c} = c + k$. By similar arguments, one can show that if we overestimate the number of servers, the variance of service times increases. Thus, the objective function will achieve a unique minimum when the estimated service capacity $\hat{c}$ equals the actual value $c$. □

Now that we have established that an incorrect estimate of capacity produces an increase in the variance of the population of service times during a busy period, we can exploit this fact in formulating an optimization problem that will produce the correct estimate of capacity. The decision variable is the unknown number of servers. The objective function is the variance of the estimated service times, or, equivalently, the sum of squared deviations from the mean service time. The solution is the number of servers that minimizes the spread of the estimated service times. Stated formally, the problem is:

$$\underset{\hat{c}}{\text{Minimize}} \quad \sum_{i=1}^{N} \left(\hat{S}_i - \bar{\hat{S}}\right)^2 \tag{13}$$

$$\text{s.t.} \quad \hat{B}_i = A_i, \quad i = 1, \ldots, \hat{c} \tag{14}$$

$$\hat{B}_i = \max\{A_i, D_{(i-\hat{c},\,i-1)}\}, \quad i = \hat{c}+1, \ldots, N \tag{15}$$

$$\hat{S}_i = D_i - \hat{B}_i, \quad \text{for all } i \tag{16}$$

$$\hat{S}_i > 0, \quad \text{for all } i \tag{17}$$

$$1 \le \hat{c} \le N, \quad \hat{c} \text{ integer} \tag{18}$$
where $\hat{S}_i$ is the estimated service time of customer i, $\bar{\hat{S}}$ is the mean of the $\hat{S}_i$'s, $\hat{B}_i$ is the estimated service begin time for customer i, and the rest of the notation is as shown in Section 2.1. We showed above that there is always a unique optimum, provided that the number of customers in the busy period exceeds the number of servers. This optimization problem can be solved easily by a one-dimensional grid search over the range given in (18). Three observations are appropriate here.

Observation 1. Given an assumed number of servers $\hat{c}$, (14) and (15) provide estimates of service start times, which lead to estimates of service times via (16) and of queueing times via (4). Therefore estimating the correct number of servers also leads directly to correct estimates of the queueing and service times of each customer. In some applications, the primary goal of the analysis may be to obtain these estimates.

Observation 2. Theorem 1 explains the unimodality of the variance of service times. With sufficient observations, the estimated variance is close to the actual variance; that is, as observations accumulate, minimizing objective function (13) finds the exact value of $c$.

Observation 3. There are two situations in which the optimization fails. First, if the interarrival and service times are deterministic, then there are either no delays if the service time is sufficiently brief or a buildup toward infinite delay if the service times are too long. Second, if the workload is so low that the number of customers in the system is always less than or equal to the number of servers, then any capacity equal to or greater than the number of arrivals will produce the same (zero) queueing delays. In such cases, service always begins immediately, so we choose to estimate a capacity equal to the maximum number of arrivals within the busy period. Naturally, this number will vary across busy periods.
Fortunately, when the system has very low server utilization, there may be less interest in analyzing the queue, so this breakdown may not be critical.
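The reconstruction in Eqs. (1)-(4) and the grid search over (18) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `reconstruct_fcfs`, `objective_fcfs`, and `estimate_servers_fcfs` are hypothetical, customers are assumed to be listed in arrival order within one busy period that starts empty and idle, and ties in the objective are broken in favor of the smallest trial value, which is an assumption of this sketch.

```python
def reconstruct_fcfs(arrivals, departures, c):
    """Apply Eqs. (1)-(4) for c servers; customers are listed in arrival order."""
    begin = []
    for i, a in enumerate(arrivals):           # i is 0-based, i.e. customer i + 1
        if i < c:
            begin.append(a)                    # Eq. (1): served on arrival
        else:
            # Eq. (2): the (i + 1 - c)th smallest departure time among the
            # first i customers, i.e. D_(i-c, i-1) in the paper's 1-based terms
            kth = sorted(departures[:i])[i - c]
            begin.append(max(a, kth))
    service = [d - b for d, b in zip(departures, begin)]   # Eq. (3)
    queue = [b - a for b, a in zip(begin, arrivals)]       # Eq. (4)
    return begin, service, queue


def objective_fcfs(arrivals, departures, c_hat):
    """Objective (13) for a trial c_hat, or None if constraint (17) fails."""
    _, service, _ = reconstruct_fcfs(arrivals, departures, c_hat)
    if min(service) <= 0:
        return None                            # some estimated service time <= 0
    s_bar = sum(service) / len(service)
    return sum((s - s_bar) ** 2 for s in service)


def estimate_servers_fcfs(arrivals, departures):
    """Grid search over 1 <= c_hat <= N; ties keep the smallest c_hat (assumption)."""
    best_c, best_obj = None, None
    for c_hat in range(1, len(arrivals) + 1):
        obj = objective_fcfs(arrivals, departures, c_hat)
        if obj is not None and (best_obj is None or obj < best_obj):
            best_c, best_obj = c_hat, obj
    return best_c
```

For a hand-checkable record with two servers, arrivals `[0, 1, 2, 3, 4]` and departures `[5, 4, 6, 9, 7]`, `reconstruct_fcfs(..., 2)` returns begin times `[0, 1, 4, 5, 6]`, service times `[5, 3, 2, 4, 1]`, and queueing times `[0, 0, 2, 2, 2]`, while a trial value of one server is infeasible because it yields a negative service time. With only five customers the grid search itself should not be expected to recover the true capacity, which is consistent with Observation 2.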
Fig. 2. Illustrating underestimation in a FCFS system (case 2).

3. Analysis for the LCFS service discipline

3.1. The number of servers is known

The data known to us and the assumed independence of service times are the same as in the FCFS model. Once again, let:

$N$     Total number of customers
$A_i$   Arrival time of ith customer
$D_i$   Departure time of ith customer
$c$     Number of servers

We can derive the following terms.

$N_i^A$       Number of customers that customer i sees upon arrival
$N_i^D$       Number of customers that customer i leaves behind at departure
$D_1(n, t)$   The first departure time, after time t, that leads to $N_i^D = n$

If a customer arrives when there are fewer customers in the system than the number of servers, service begins immediately; otherwise, the customer joins the queue. In a LCFS system, the customer in queue is served only after all succeeding customers have been served. Fig. 3 shows the mechanism of LCFS. Therefore a queued customer starts service when the number of customers is the same as when he arrived. As a result, the service beginning time can be obtained using Eq. (19):

$$B_i = \begin{cases} A_i & \text{if } N_i^A < c \\ D_1(N_i^A, A_i) & \text{otherwise} \end{cases} \tag{19}$$

Fig. 3. Service starting times in a LCFS system.

We proceed to compute $S_i$ and $Q_i$ as in Eqs. (3) and (4) for the FCFS case.

3.2. The number of servers is unknown

We can derive the optimization model for the LCFS discipline by analogy to the FCFS case.

$$\underset{\hat{c}}{\text{Minimize}} \quad \sum_{i=1}^{N} \left(\hat{S}_i - \bar{\hat{S}}\right)^2 \tag{20}$$

$$\text{s.t.} \quad \hat{B}_i = \begin{cases} A_i & \text{if } N_i^A < \hat{c} \\ D_1(N_i^A, A_i) & \text{otherwise} \end{cases}, \quad \text{for all } i \tag{21}$$

$$\hat{S}_i = D_i - \hat{B}_i, \quad \text{for all } i \tag{22}$$

$$\hat{S}_i > 0, \quad \text{for all } i \tag{23}$$

$$1 \le \hat{c} \le N \tag{24}$$

The equations $\hat{B}_i = A_i$ and $\hat{B}_i = \max\{A_i, D_{(i-\hat{c},\,i-1)}\}$ in the FCFS case are changed to Eq. (21) in the LCFS case. Theorem 2 assures an optimal solution to the optimization problem.

Theorem 2. The objective function (20) is unimodal with unique minimum at the actual number of servers $c$ if the service times are independent.

Proof. The proof parallels that for the FCFS discipline. Let the actual number of servers be $c$.

The case of underestimation. First, consider the case of underestimation, i.e., $\hat{c} = c - k$ for some positive integer $k$. In a LCFS system, we compute the service beginning times from the number of customers in the system. If the number of customers in the system upon arrival is greater than or equal to $c$, or less than $c - k$, no bias will be incurred in estimating the service starting times. If an arriving customer sees $c + n$ customers upon arrival for a nonnegative integer $n$ (the first case), service starts whenever there are again $c + n$ customers remaining in the system. Also, if an arriving customer sees fewer than $c - k$ customers upon arrival, service starts at the arrival time. Therefore, the estimated service times for either case will not be different from the actual service times. However, if an arriving customer sees more than $c - k - 1$ but fewer than $c$ customers upon arrival, the estimated service beginning time will be different from its true value. Therefore, we have to observe the customers who see $c - k \le N_i^A \le c - 1$ customers when they arrive.

Whether the number of servers is taken to be $c$ or $\hat{c} = c - k$, if a customer finds fewer than $c - k$ customers in the system upon arrival, the service starting time is determined by the first case of Eq. (19). In the same way, if a customer sees $n$ ($\ge c$) customers at that time, the service starting time is determined by the second case of Eq. (19). Consider the case $c - k \le N_i^A \le c - 1$. For example, suppose a customer sees $c - k + 1 < c$ customers at his arrival. If the estimated number of servers is $c$ (the actual number), his service start time is determined by the first case of Eq. (19). If, however, we underestimate the number of servers ($\hat{c} = c - k$), the service start time is determined by the second case of Eq. (19). Therefore, we deal with only the case $c - k \le N_i^A \le c - 1$. As shown in Fig. 4, the service time in an underestimated system is $\hat{S}_i$ instead of $S_i$. Therefore, we can derive Eq. (25) as in the FCFS case.

$$\hat{S}_i = S_i - r_i \tag{25}$$

Fig. 4. Illustrating underestimation in a LCFS system.

Note that $r_i$ is the time interval between the moment the ith customer arrives seeing $(c - k)$ customers and the moment a customer departure occurs leaving $(c - k)$ customers. It is the sum of the remaining service times of the customers who are being served and the service times of the successive customers, whose number we do not know. Since service times are independent, $S_i$ and $r_i$ are independent, as with the FCFS system.

The case of overestimation. As with the case of underestimation, assume that the actual number of servers is $c$ but the estimated number of servers is $c + k$. With similar arguments, one can show that if we overestimate the number of servers, the variance of service times increases. Thus, the objective function will achieve its minimum when the estimated service capacity equals the actual value $c$. □
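As an illustration of Eq. (19), the following Python sketch reconstructs LCFS service start times from arrival and departure times alone. It is a sketch under stated assumptions, not the authors' code: the name `reconstruct_lcfs` is hypothetical, all event times are assumed distinct, the discipline is assumed non-preemptive, and the simple quadratic-time counting is chosen for readability rather than speed.

```python
def reconstruct_lcfs(arrivals, departures, c):
    """Apply Eq. (19), then Eqs. (3)-(4); assumes all event times are distinct."""
    n = len(arrivals)
    # N_i^A: customers already in the system at the moment customer i arrives
    seen = [sum(1 for j in range(n)
                if arrivals[j] < arrivals[i] < departures[j])
            for i in range(n)]
    # Departure epochs in time order, each paired with N^D, the number of
    # customers left behind: arrivals so far minus departures so far
    epochs = sorted(departures)
    left = [sum(1 for a in arrivals if a < d) - (k + 1)
            for k, d in enumerate(epochs)]
    begin = []
    for i in range(n):
        if seen[i] < c:
            begin.append(arrivals[i])          # first case of Eq. (19)
        else:
            # D_1(N_i^A, A_i): first departure after A_i leaving N_i^A behind
            b = next((d for d, m in zip(epochs, left)
                      if d > arrivals[i] and m == seen[i]), None)
            if b is None:
                raise ValueError("record is inconsistent with LCFS and c servers")
            begin.append(b)
    service = [d - b for d, b in zip(departures, begin)]   # Eq. (3)
    queue = [b - a for b, a in zip(begin, arrivals)]       # Eq. (4)
    return begin, service, queue
```

For example, with one server, arrivals `[0, 1, 2]`, and departures `[3, 6, 4]`, the third customer is served before the second, and the sketch returns begin times `[0, 4, 3]` and service times `[3, 2, 1]`. Replacing `c` by a trial value and minimizing the sum of squared deviations of the returned service times would give the grid search for (20)-(24).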
Observations 1, 2 and 3 in the FCFS case also apply to this model. To find the exact solution of the LCFS case, we can use those observations.

4. Computational experiments

4.1. Experimental design

We simulated heavily loaded GI/G/c queues with two queue disciplines (FCFS and LCFS), three server utilization levels (0.8, 0.9 and 1.0), three numbers of servers (3, 7 and 15), three distributions of interarrival times, and four distributions of service times (deterministic, exponential, absolute standard normal, and absolute normal, i.e., absolute values of normal variates). This design resulted in 99 (= 3 × 3 × 3 × 4 − 9, since we exclude deterministic arrivals with deterministic services) scenarios for each service discipline (FCFS and LCFS), all with the mean arrival rate standardized at 1.0 and 1000 simulated customers. We include the absolute normal service time distribution to test performance for a service time distribution with large variance. In each case, we assumed the true number of servers was unknown and estimated it, then estimated the queue delays and service times.

4.2. Results and analysis

In every simulation run, the estimated number of servers was correct, from which it followed that all estimates of queueing and service times were also correct. Figs. 5 and 6 plot the behavior of the objective function for some queueing systems with both FCFS and LCFS. We used the variance of service times instead of the sum of squared deviations from the mean service time as the objective function in these inferences. The values of c shown in the graphs are all the feasible solutions to the optimization problem, with c = 7 and c = 15 being the correct solutions. The points without a value of the objective function are infeasible cases; for example, some service times or queueing times are calculated as negative values. These graphs show the unimodality of the objective function and validate our analysis. All the other systems have similar results, with perfect estimates. Fig. 7 shows two results for additional systems which have service times with large variances. From these graphs, we can reconfirm that our methods perform as we intended.
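To make the experimental setup concrete, the sketch below generates arrival and departure records for one FCFS scenario. It is an assumption-laden illustration, not the simulator used in the paper: the names `simulate_fcfs` and `make_scenario` are hypothetical, and only the exponential interarrival and exponential service case is shown, with the arrival rate fixed at 1.0 so that the mean service time equals the number of servers times the utilization.

```python
import heapq
import random


def simulate_fcfs(arrivals, services, c):
    """Departure times of an FCFS queue with c servers, starting empty and idle."""
    free_at = [float("-inf")] * c      # time at which each server becomes free
    heapq.heapify(free_at)
    departures = []
    for a, s in zip(arrivals, services):
        earliest = heapq.heappop(free_at)      # next server to become available
        start = max(a, earliest)               # wait only if every server is busy
        departures.append(start + s)
        heapq.heappush(free_at, start + s)
    return departures


def make_scenario(n, c, utilization, seed=0):
    """One hypothetical scenario: exponential interarrival and service times."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0)              # mean arrival rate 1.0
        arrivals.append(t)
    mean_service = c * utilization             # utilization = rate * E[S] / c
    services = [rng.expovariate(1.0 / mean_service) for _ in range(n)]
    return arrivals, services, simulate_fcfs(arrivals, services, c)
```

A call such as `make_scenario(1000, 7, 0.9)` yields arrival and departure lists of the kind the estimation procedure takes as input; the true service times are returned as well so that reconstructed values can be compared against them.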
Fig. 5. Behavior of objective functions for FCFS systems.
Fig. 6. Behavior of objective functions for LCFS systems.
Fig. 7. Behavior of objective functions for service times with large variance.
5. Conclusions

We developed a methodology to deduce the queueing and service times of individual customers whose processing cannot be observed. This methodology assumes only (1) independent service times, (2) knowledge of the arrival and departure times of each customer, (3) either FCFS or LCFS service discipline, and (4) sufficient workload for queues to form. It is not required that the distributions of service or interarrival times be known or even stationary. All the results follow from the logic of the service discipline operating within a busy period. More importantly, we also developed an optimization framework for estimating an unknown number of servers under the same assumptions. We showed that the result of the optimization is an exact estimate because of the objective function's unimodality. (Although not shown here, extensive simulations confirmed the correctness of the analysis.)

6. Future research

Further studies are ongoing to find solutions with even more relaxed assumptions. The first task is to analyze additional service disciplines, such as SIRO (service in random order) or disciplines based on priority. Another task is to drop the assumption that every customer is observed without error, by exploiting simple order statistics of the arrival and departure times instead of matching them to customers. This work might be extended even further. In this paper, we assumed that there is one unknown but fixed value of system capacity. However, many real-world systems, such as call centers, have variable capacity, adding servers during times of anticipated higher workload and possibly making state-dependent accommodations,
such as having servers work faster when queues are longer. It could be useful in applications to reduce this complexity to a single "equivalent service capacity", i.e., the fixed capacity that would most closely mimic the behavior of the more complex service facility. It is of interest to think of this equivalent capacity as having a distribution over its possible values, with some values being more consistent with actual queue behavior than others. In this case, we could conceive of a "maximum likelihood" capacity.

References

Bertsimas, D., & Servi, L. (1992). Deducing queueing from transactional data: The queue inference engine, revisited. Operations Research, 40, S217–S228.
Bingham, N., & Pitts, S. (1999). Non-parametric estimation for the M/G/infinity queue. Annals of the Institute of Statistical Mathematics, 51, 71–97.
Daley, D., & Servi, L. (1992). Exploiting Markov-chains to infer queue length from transactional data. Journal of Applied Probability, 29, 713–732.
Dimitrijevic, D. (1996). Inferring most likely queue length from transactional data. Operations Research Letters, 19, 191–199.
Hall, R. (1991). Queueing methods for services and manufacturing. Englewood Cliffs, NJ: Prentice-Hall.
Jang, J., Suh, J., & Liu, C. (2001). A new procedure to estimate waiting time in GI/G/2 systems by server observation. Computers and Operations Research, 28, 597–611.
Jones, L. (1999). Inferring balking behavior from transactional data. Operations Research, 47, 778–784.
Larson, R. (1990). The queue inference engine: Deducing queue statistics from transactional data. Management Science, 36, 586–601.
Larson, R. (1991). The queue inference engine: Addendum. Management Science, 37, 1062.
Mandelbaum, A., & Zeltyn, S. (1998). Estimating characteristics of queueing networks using transactional data. Queueing Systems, 29, 75–127.
Pickands, J., III, & Stine, R. (1997). Estimation for the M/G/infinite queue with incomplete information. Biometrika, 84, 295–308.
Ross, J. V., Taimre, T., & Pollett, P. K. (2007). Estimation for queues from queue length data. Queueing Systems, 55, 131–138.
Toyoizumi, H. (1997). Sengupta's invariant relationship and its application to waiting time inference. Journal of Applied Probability, 34, 795–799.