Ranking flexibility structures in queueing systems


European Journal of Operational Research 281 (2020) 77–86

European Journal of Operational Research journal homepage: www.elsevier.com/locate/ejor

Stochastics and Statistics

Sigrún Andradóttir a, Hayriye Ayhan a, Douglas G. Down b,∗

a H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205, USA
b Department of Computing and Software, McMaster University, Hamilton, Ontario L8S 4L7, Canada

Article info

Article history: Received 10 February 2018; Accepted 3 September 2019; Available online 18 September 2019

Keywords: Queueing; Flexible servers; Scheduling policy; Heterogeneous variability

Abstract

We consider the problem of comparing flexibility structures in queueing systems. We find that an important issue in evaluating flexibility is that one cannot separately consider the design of a flexibility structure and the choice of a server scheduling policy. We propose a policy that leverages flexibility in a more predictable (and typically more effective) manner than several common policies and find that the proposed policy is a useful basis for comparisons. In terms of evaluating flexibility, we take as a starting point the CF index of Iravani, Kolfal, and van Oyen (2011) and find that by breaking it down to its components, we are able to identify scenarios in which the CF index may incorrectly compare flexibility structures in systems with heterogeneous variability in the underlying interarrival time and service requirement distributions. For such scenarios, we propose a distribution-dependent metric for performing the ranking. A consequence of our observations is the ability to construct partial rankings that appear to be relatively insensitive to the underlying distributions.

© 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail address: [email protected] (D.G. Down).

https://doi.org/10.1016/j.ejor.2019.09.002 0377-2217/© 2019 Elsevier B.V. All rights reserved.

1. Introduction

In today’s competitive business environment, it is more important than ever to be well-prepared to react to congestion that can arise due to randomness. We examine a queueing network model of production or service systems, where our goal is to determine effective strategies for hiring, training, and assigning workers to tasks to achieve a high level of overall performance. We refer to the resulting set of resources and their respective abilities as a flexibility structure. There is a body of work on characterizing effective flexibility structures (see below), in the sense that they satisfy some basic properties, such as being throughput optimal. In such situations, it may be desirable to further choose or rank flexibility structures based on criteria such as work in process (WIP). One may also wish to explore the sensitivity of structure rankings to the underlying interarrival time and service requirement distributions. Given the complexity of the structures that one may wish to consider, it is unlikely that the performance of different flexibility structures can be determined in closed form. Using computer simulation to evaluate and compare many different flexibility structures under several choices of underlying distributions may be too time-consuming to be a realistic option. This suggests that it is important to develop efficient and accurate procedures to rank and compare a number of flexibility structures, so that computer simulation is only needed to compare a small set of promising finalists.

We consider a network where there are $K$ queues in parallel. Arrivals occur to queue $k$ according to a renewal process with rate $\lambda_k$, where the interarrival times have squared coefficient of variation $C_{a,k}^2$. We will call arrivals to queue $k$ class $k$ customers. We assume that the service requirements of class $k$ customers are independent and identically distributed, with mean 1 and squared coefficient of variation $C_{s,k}^2$. We are concerned with evaluating a given set of flexibility structures. In this set, the $i$th flexibility structure will be denoted by $F_i$ and consists of $M_i$ servers, where server $j$ works at rate $\mu^i_{j,k}$ at class $k$. If $\mu^i_{j,k} = 0$, then server $j$ is incapable of working at class $k$. Servers can collaborate on a customer, in which case their service rates are additive. The remaining service requirement of the customer at the head of each queue is decreased at the total service rate assigned to that queue. The performance measure of interest is WIP. This choice of model is common in the literature; however, it is expected that our insights extend to more general systems, such as those where there is routing between the classes.

One can think of ranking flexibility structures as a particular instance of the more general problem of identifying structural features that impact system performance. In our case, the structural features that are crucial to consider are the flexibility structure (specializations of servers), the server scheduling policy employed, and the underlying variability of the interarrival and service requirement distributions. We would expect that the


importance of these features would carry over to more complex systems. A detailed study of effective flexibility structures is presented in Andradóttir, Ayhan, and Down (2013), where simple conditions are given to determine whether an arbitrary flexibility structure is both throughput optimal (i.e., has the same throughput as full flexibility) and robust to fluctuations in the arrival and/or service rates. Beyond identifying effective flexibility structures, there has not been much work in attempting to rank flexibility structures. Note that such rankings may or may not be effective in terms of the work cited above, as other considerations may be taken into account, including cost. There is a line of work by Iravani et al., begun in Iravani, van Oyen, and Sims (2005) and refined in Iravani, Kolfal, and van Oyen (2011), that uses only mean information to rank a number of candidate flexibility structures. In particular in Iravani et al. (2011), they present a number of examples under which their proposed metric has almost no room for improvement in terms of ranking structures (the performance measure of interest being WIP). It appears that for systems that exhibit a significant degree of homogeneity (in both the service rate structure and the variability), the work in Iravani et al. (2011) provides a very effective tool. It is not clear that homogeneity in the underlying variability is always a reasonable assumption. For example, Tekin, Hopp, and van Oyen (2009) suggest that such heterogeneous systems may arise in a call center setting. In this work, we are interested in ranking flexibility structures under increased heterogeneity, mainly through differing degrees of variability for interarrival times or service requirements. When we embarked on this work, we did not foresee that there would be a significant complication that would arise. We found that in general settings, the choice of server scheduling policy is very important. 
In particular, it may be the case that the ranked performance of flexibility structures depends on the choice of server scheduling policy. In other words, given two flexibility structures where one structure outperforms the other under a particular scheduling policy, changing the scheduling policy may lead to a reversal of which structure performs better. This is extremely problematic in developing ranking procedures. Ideally, one should compare flexibility structures that are operating under an optimal policy. However, determining the optimal policy for any particular system is challenging: there are no general results available, and all indications are that such policies would be quite complicated and would need to be computed on a system-by-system basis.

In Iravani et al. (2011), flexibility structures are evaluated under the Serve the Longest Queue (SLQ) scheduling policy. The authors also examine the Maximum Weight (MW) policy, suggested by Stolyar (2004) (which is essentially a version of Serve the Longest Queue, where the queue lengths are weighted by the appropriate processing rates), and did not see significant differences. The MW policy has been shown to have desirable asymptotic properties as heavy traffic is reached. In particular, it attempts to “pool” the service capacity. As one backs off from heavy traffic, it is not clear how well such a policy performs. There have been a number of studies suggesting that “pooling” of service capacity may in fact degrade performance when there is heterogeneity in terms of underlying variability and/or service rates. Argon and Andradóttir (2006), Mandelbaum and Reiman (1998), van Dijk and van der Sluis (2008), and Tekin et al. (2009) give examples where such pooling results in poor mean waiting time performance in heterogeneous systems. These works all explicitly pool (to varying degrees) capacity in the network.
Another approach for pooling is to implicitly pool capacity through the choice of server scheduling policy. In terms of determining effective scheduling policies for systems with flexible servers, there have been a few recent works that have motivated our studies. First, Tsitsiklis and Xu (2013) have suggested that for system topologies with limited flexibility, the choice of server scheduling policy is crucial, and the design of such policies is non-trivial. Most important to our work, Markakis, Modiano, and Tsitsiklis (2009), Jagannathan, Jiang, Naik, and Modiano (2013), and Jagannathan, Markakis, Modiano, and Tsitsiklis (2012) suggest that under MW scheduling, performance may be severely degraded when there is significant variability in the system. One conclusion that can be drawn from this prior work is that optimizing for the heavy traffic regime is not sufficient when designing server scheduling policies. In this work, we propose a scheduling policy that is motivated by previous work on effective flexibility structures and that appears to have performance that is not overly sensitive to the flexibility structure itself (however, as will be seen, we have not completely eliminated this sensitivity), and we aim to provide a metric/ranking that is effective for this policy.

The work on flexibility metrics in Iravani et al. (2005, 2011) does not consider the effects of variability. If there are $K$ arrival streams (classes) in the system, the Capability Flexibility (CF) index produces $K$ components (one for each arrival stream) by solving $K$ LPs. From these components, a single value is produced, the CF index, to evaluate a flexibility structure. The CF index depends only on the underlying means. While this is a very effective approach for the examples in Iravani et al. (2011), if underlying variability is more heterogeneous, it seems plausible that one may prefer flexibility structures that are able to assign more capacity to those classes which exhibit greater variability. To determine when such preferences may arise, we suggest that it is appropriate to examine the components of the CF index, producing what we call the componentwise CF index. This allows us to do the following:

• Identify scenarios for underlying distributions for which one must use caution when employing the CF index. In particular, if for a pair of flexibility structures, a component of the CF index is ordered opposite to how the flexibility structures are ranked using the CF index, then it appears that for sufficiently high variability in the corresponding class, the ranking given by the CF index may be incorrect.

• Produce a partial ranking of a set of flexibility structures using the componentwise CF index. For a pair of structures that are ranked in such a manner, our observations suggest that the dominating flexibility structure is preferred for a wide range of underlying distributions of interarrival times and service requirements. This is an important insight as it allows designers to concentrate their efforts where needed, guiding both initial designs and how to augment existing designs.

Further, given that the componentwise CF index produces a partial ranking, we also:

• Provide a means to resolve rankings that are not possible using the componentwise CF index. Such a ranking depends on the underlying interarrival and service requirement distributions.

Our evaluations of the ranking procedures are performed empirically, so in addition to the insights that our experiments provide, a number of theoretical questions arise that are worth exploring. These theoretical questions are described in Section 6. The structure of the paper is as follows. In Section 2, we describe the starting point for our work, the CF index. In Section 3, we provide several small examples to illustrate issues in developing metrics, as well as problems in scheduling that arise. A scheduling policy to be used as a basis for comparing flexibility structures is proposed. Section 4 presents our proposals for ranking flexibility structures. Section 5 provides some illustrative examples, while Section 6 includes concluding remarks.

2. The CF index

The CF index from Iravani et al. (2011) computes a single number to evaluate the effectiveness of a given flexibility structure. Consider the following set of LPs, one for each class $\ell$, where the decision variables for the $\ell$th LP are $m_\ell$ and $\delta_{j,k}$, for $j = 1, \ldots, M$, $k = 1, \ldots, K$, and the corresponding optimal value for the $\ell$th LP is $m_\ell^*$. If the LP has no solution, we set $m_\ell^* = 0$. (For convenience, we have suppressed the dependence on the flexibility structure $i$ in $\mu^i_{j,k}$.)

\begin{align*}
\max\ & m_\ell \\
\text{s.t.}\ & \sum_{j=1}^{M} \mu_{j,k}\,\delta_{j,k} \ge \lambda_k, && k = 1, \ldots, K,\ k \ne \ell, \\
& \sum_{j=1}^{M} \mu_{j,\ell}\,\delta_{j,\ell} \ge m_\ell, \\
& \sum_{k=1}^{K} \delta_{j,k} \le 1, && j = 1, \ldots, M, \\
& \delta_{j,k} \ge 0, && j = 1, \ldots, M,\ k = 1, \ldots, K.
\end{align*}

The quantity $m_\ell^*$ is the maximum arrival rate for class $\ell$ that can be met while providing sufficient capacity to stabilize the remaining classes. The CF index is calculated as

\[
\frac{1}{K} \sum_{\ell=1}^{K} \frac{\lambda_\ell}{m_\ell^*}. \tag{1}
\]

The smaller the value of (1) the better, so if (1) is smaller for $F_i$ than for $F_j$, we write $F_i <_{CF} F_j$. A series of examples is studied in Iravani et al. (2011), suggesting that the CF index is indeed an effective means to rank flexibility structures. We wish to explore the limitations of ranking flexibility structures using a single value based on mean information only. In addition to computing the CF index, another important issue is that a scheduling policy must be chosen as a basis on which to compare systems. The CF index itself does not take this into account, but in Iravani et al. (2011) the examples are evaluated under the SLQ scheduling policy. The MaxWeight policy of Stolyar (2004) is also considered, with the conclusion that it yields the same rankings as SLQ.

3. Impact of variability and scheduling policy

As discussed in the previous section, the CF index suggests a simple method to rank flexibility structures (for a flexibility structure, computing the CF index requires solving $K$ LPs, each dependent on mean information only), with simple scheduling policies used as a basis for comparison. In this section, through several examples, we explore limitations of this approach. In the next section, we then provide a means to address these limitations. Throughout this section, all simulation results have 95 percent confidence intervals reported, based on running 30 replications, each of 500,000 time units in length.

With respect to using only the mean information to rank flexibility structures, the performance of a queueing system obviously depends on the underlying variability. In the examples below we show how system variability may result in the CF index making incorrect rankings. We will also show that system performance (and as a consequence the ranking of structures) can be very sensitive to the choice of scheduling policy.

3.1. Example 1 – equal rates, unequal variances

Consider a system with $K = 2$ classes. The arrivals follow independent Poisson processes with $\lambda_1 = \lambda_2 = 0.9$. The service requirements are exponentially distributed. There are two flexibility structures. The first, $F_1$, has $\mu^1_{1,1} = \mu^1_{1,2} = \mu^1_{2,2} = 1$ and $\mu^1_{2,1} = 0$. The second, $F_2$, has $\mu^2_{1,1} = \mu^2_{2,1} = \mu^2_{2,2} = 1$ and $\mu^2_{1,2} = 0$. In terms of the means, these two structures are identical (after a relabeling of classes), so the CF index is the same for both structures. Now, consider the scenario where the interarrival times and service requirements for class 1 are constant (zero variance), while the corresponding quantities for class 2 are exponentially distributed. One would expect that $F_1$ is a better flexibility structure than $F_2$, as all of the rates being equal, it is intuitively better to add flexibility to cope with higher variances. Simulating these two structures under SLQ confirms that this is indeed the case, with $F_1$ having an average number in system of 8.64 ± 0.03, while $F_2$ has an average number in system of 10.11 ± 0.09.

Looking deeper at these numbers reveals that for $F_1$, the average number in system for class 1 is 5.61 ± 0.02, while for class 2 it is 3.03 ± 0.01. The class with lower underlying variability actually has a larger average number of customers in the system. This is due to the queue length at class 1 growing while server 1 is at class 2, i.e., SLQ is not controlling the effort at class 1 of server 1, the only server that is capable of working there. This server interaction and the resulting impact on queue length behavior appear very difficult to predict, unless one sees that SLQ in this case leads to pooling of the servers. For $F_2$, the corresponding numbers are 0.84 ± 0.01 and 9.27 ± 0.09 for classes 1 and 2, respectively.

3.2. Example 2 – unequal arrival rates, variance higher for lower rates

With the previous example in mind, we now consider an example where the classes have unbalanced arrival rates. Consider the same system as in Example 1, but increase $\lambda_1$ to 0.95. We now have a situation where the class with higher arrival rate has lower variance. In evaluating $F_1$ versus $F_2$, the question is now: is it more valuable to train a server for a higher arrival rate, but with underlying low variability, or for a lower arrival rate with underlying high variability? The CF index for $F_1$ is 0.904 and for $F_2$ is 0.882, suggesting that the former is true. Under SLQ, the average number in the system for $F_1$ is 13.37 ± 0.01 (9.01 ± 0.03 for class 1 and 4.36 ± 0.02 for class 2), while for $F_2$ it is 10.21 ± 0.12 (0.88 ± 0.01 for class 1 and 9.33 ± 0.11 for class 2). So, it appears that the CF index is correct here. However, as for Example 1, we again see that for $F_1$, the average number in the system for the class with lower underlying variance is much higher than for the class with higher underlying variance.

Looking at Examples 1 and 2, the question is: is SLQ allowing the flexible server (the one that is capable of working at both classes) to spend “too much” time at the class that both servers can serve? For $F_1$, the shared class (class 2) has high variance in the interarrival times and service requirements, so the resulting burstiness can result in server 1 spending long periods of time at class 2, which in turn has an effect on the other class. This provides an opportunity for improved performance by isolating the impact of the high variability for the shared class. In other words, if more attention is paid to controlling the proportion of time that a server spends at a class, the high variability is isolated in the classes where it arises, and as a result the overall performance is improved. Let us consider one more example that shows that if there is heterogeneity in the rates, then indeed SLQ may result in servers spending “too much” time at the wrong classes.

3.3. Example 3 – unequal service rates

Consider a system where the arrivals follow independent Poisson processes with $\lambda_1 = \lambda_2 = 0.9$. The service requirements are exponentially distributed. The flexibility structure $F_1$ now has full flexibility with $\mu^1_{1,1} = \mu^1_{2,2} = 1$ and $\mu^1_{1,2} = \mu^1_{2,1} = 0.5$. Under SLQ,


Fig. 1. Example 3 – instability under SLQ.

simulation suggests that the system is unstable (see Fig. 1). However, if we restrict the servers to only work at the classes for which they have the highest service rates, i.e., let $F_2$ be $\mu^2_{1,1} = \mu^2_{2,2} = 1$ and $\mu^2_{1,2} = \mu^2_{2,1} = 0$, the mean number in system is 18 (no simulation is required, as the system is just two M/M/1 queues in parallel). $F_1$ is clearly a better choice under the CF index. Clearly, SLQ is allowing the servers to work “too long” at classes where they are inefficient. Under SLQ, less flexible servers are preferred in this case. However, it simply cannot be the case that this is true in general, so the problem must be with the scheduling policy. (Note that Baharian and Tezcan (2011) suggest that this system could be analytically shown to be unstable, but for our purposes, the suggestion from simulation should give sufficient motivation.) This same issue arises with FCFS. We have not presented the details here, but it also appears that FCFS is unstable for $F_1$, while it is stable for $F_2$; again, less flexible servers are preferred.

One possibility to combat this inefficiency would be to consider a “Max Weight” type policy, variations of which are known to perform well in heavy traffic. Here, let server $j$ choose to serve the class $k$ that maximizes $Q_k \mu_{j,k}$, where $Q_k$ is the queue length of class $k$. For Examples 1 and 2, this policy is identical to SLQ; however, in Example 3, a server only serves the class for which it has the slower rate if that class's queue length is at least double that of the class for which it has the faster rate. For this policy (we will call it MaxWeight for simplicity), we find that the average number of customers in the system for $F_1$ is 16.08 ± 0.14, certainly better than for $F_2$.

The issue with policies such as SLQ and FCFS is that there is no control over the proportion of time that a server spends at a class and thus no guarantee that the inherent flexibility is effectively leveraged. We will see below (in Section 3.5) that this issue also arises for the MaxWeight policy. It appears difficult to bound such proportions for these policies, since this would require a deeper study of their dynamics.
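As a hedged illustration of the two quantities used in Example 3, the sketch below computes the mean number in system for the dedicated structure from the standard M/M/1 formula $\rho/(1-\rho)$ and evaluates the MaxWeight choice rule for a single server. The function names `mm1_mean_number` and `maxweight_choice` are hypothetical, classes are indexed from 0, and breaking ties in favor of the lowest class index is an assumption that the text leaves open.

```python
def mm1_mean_number(arrival_rate, service_rate):
    """Mean number in system of a stable M/M/1 queue: rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("queue is unstable")
    return rho / (1 - rho)


def maxweight_choice(queue_lengths, rates):
    """Return the (0-based) class k maximizing Q_k * mu_{j,k} for one server.

    Ties go to the lowest index (an assumption made for this sketch).
    """
    weights = [q * mu for q, mu in zip(queue_lengths, rates)]
    return max(range(len(weights)), key=lambda k: weights[k])


# Dedicated structure of Example 3: lambda = 0.9 and mu = 1 for each class.
total_dedicated = 2 * mm1_mean_number(0.9, 1.0)

# Server 1 of the fully flexible structure in Example 3 has rates (1, 0.5).
server1_rates = [1.0, 0.5]
print(round(total_dedicated, 6))
print(maxweight_choice([3, 5], server1_rates))  # weights 3.0 vs 2.5
print(maxweight_choice([3, 7], server1_rates))  # weights 3.0 vs 3.5
```

With rates (1, 0.5), the server moves to its slower class only once that queue is long enough for the weighted length to exceed the weighted length of its faster class, which is the doubling effect described in the text.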
Existing flexibility metrics (the CF index and our proposed approach) depend on these proportions, so deviations could result in inconsistencies in rankings. We next present an alternative scheduling policy to address this issue, where the proportions of time that a server spends at a class are controlled.

3.4. Stabilizing Control (SC) policy

We will first give our proposed policy for a two-server, two-class system, then generalize it to a system with an arbitrary number of servers and classes in Section 3.6. We suppress the superscript for $\mu_{j,k}$ that denotes the flexibility structure under consideration. Suppose that we solve the following LP, where the decision variables are $\gamma$, $\delta_{1,1}$, $\delta_{1,2}$, $\delta_{2,1}$, and $\delta_{2,2}$. The corresponding optimal values have a $*$ superscript.

\begin{align*}
\max\ & \gamma \\
\text{s.t.}\ & \sum_{j=1}^{2} \mu_{j,k}\,\delta_{j,k} \ge \gamma \lambda_k, && k = 1, 2, \\
& \sum_{k=1}^{2} \delta_{j,k} = 1, && j = 1, 2, \\
& \delta_{j,k} \ge 0, && j = 1, 2,\ k = 1, 2.
\end{align*}

The equality is without loss of generality. Assuming $\gamma^* > 1$ (the system is stabilizable, see Theorem 1 below), our policy is then as follows. For each server $j$, if $Q_1 > 0$ and $Q_2 > 0$, then server $j$ should do head of the line processor sharing, devoting $\delta^*_{j,k}$ of its effort to class $k$ (and hence working at rate $\delta^*_{j,k} \mu_{j,k}$ there). If either $Q_1 = 0$ or $Q_2 = 0$, then server $j$ should work at rate $\mu_{j,k}$ at the non-empty queue. (The work in Andradóttir, Ayhan, & Down, 2003 addresses how one could approximate this policy without the need for processor sharing; we would expect the WIP to be larger for the approximate policy.) The intuition for this policy is that when the queue lengths are “large” (in this case non-zero), the policy controls the system so that it pushes the overall system towards being empty (see Andradóttir et al., 2003). In addition, as the system moves to heavy traffic, the analysis


in Stolyar (2004) suggests that the resulting δ ∗j,k are the correct proportions of time that each server must spend at each class. So the suggestion is that if we do this at all points in time, we do not leave the scheduling policy enough slack to make bad decisions. This hopefully results in overall better performance and, also, less unpredictable behavior. Given that our proposed policy stabilizes and controls the proportion of time that a server spends at a class, we call our policy the Stabilizing Control (SC) policy. On the negative side, it is not clear how far this policy would be from optimal, in particular as one backs off from heavy traffic. One would expect that the optimal policy would take the form of a multiple threshold policy, where the thresholds depend on the underlying distributions. The SC policy depends only on means, so does not take the variances into account, for example. 3.5. Examples revisited under the SC policy For Example 1, the SC policy gives the following simulation results. For F1 , the average number in system is 5.40 ± 0.02, while for F2 it is 9.87 ± 0.12. For F1 , the average number in system for class 1 is 0.90 ± 0.01 (as opposed to 5.61 ± 0.02 for SLQ), so the SC policy does a better job of controlling the number in system at the class that is not shared between the servers. We see here that F1 is preferred by a much wider margin than for SLQ and that the performance of the SC policy with F1 is much better than that for SLQ with F1 . For Example 2, under the SC policy, the average number in system for F1 is 6.95 ± 0.05, while for F2 it is 12.75 ± 0.17. In contrast to SLQ, F1 is now preferred, and the average number in system is the lowest of the policies considered. The average number in system for class 1 is 0.95 ± 0.01, so we see that the SC policy does a good job of controlling the number at the low variance class, while keeping the overall number in system low. Here, the CF index produces an incorrect ranking. 
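To make the proportions used in these examples concrete, the following minimal sketch approximates the two-server, two-class LP of Section 3.4 by a coarse grid search and then applies the two-class SC rule for one server. It is a brute-force stand-in for an LP solver, not the authors' implementation; the function names, the grid resolution `steps`, the 0-based indexing, and the choice to idle when both queues are empty are all assumptions made for illustration.

```python
def solve_lp1_two_by_two(mu, lam, steps=100):
    """Approximate the two-server, two-class LP of Section 3.4 by grid search.

    mu[j][k] is the rate of server j at class k, lam[k] the arrival rate.
    Because each server's proportions sum to one, only delta[j][0] is free.
    Returns (gamma, delta) for the best grid point found.
    """
    best_gamma, best_delta = -1.0, None
    for a in range(steps + 1):
        for b in range(steps + 1):
            delta = [[a / steps, 1 - a / steps], [b / steps, 1 - b / steps]]
            # gamma is limited by the class with the least capacity per unit load
            gamma = min(
                (mu[0][k] * delta[0][k] + mu[1][k] * delta[1][k]) / lam[k]
                for k in range(2)
            )
            if gamma > best_gamma:
                best_gamma, best_delta = gamma, delta
    return best_gamma, best_delta


def sc_rates_two_class(delta_j, mu_j, queues):
    """Work rates of one server under the two-class SC policy.

    Both queues non-empty: split effort by delta_j.  One queue empty:
    work at full rate at the other.  Both empty: idle (an assumption).
    """
    busy = [q > 0 for q in queues]
    if all(busy):
        return [delta_j[k] * mu_j[k] for k in range(2)]
    return [mu_j[k] if busy[k] else 0.0 for k in range(2)]


# Structure F1 of Example 1: server 1 serves both classes, server 2 only class 2.
mu_f1 = [[1.0, 1.0], [0.0, 1.0]]
gamma, delta = solve_lp1_two_by_two(mu_f1, [0.9, 0.9])
print(gamma, delta[0])
print(sc_rates_two_class(delta[0], mu_f1[0], [2, 5]))
print(sc_rates_two_class(delta[0], mu_f1[0], [0, 5]))
```

For this structure the search assigns all of server 1's effort to class 1 whenever both queues are non-empty, so the flexible server helps at the shared class only when class 1 is empty. This is consistent with the low class 1 numbers in system reported for $F_1$ under the SC policy.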
Finally, for Example 3, under the SC policy, the average number in system for $F_1$ is 11.58 ± 0.08 (the mean number in system for dedicated servers is 18.00). This is an improvement over the MaxWeight policy, which suggests that MaxWeight is spending too much time employing the slower rates; the SC policy forces the servers to not spend too much time at the classes with slower rates. We now give the SC policy in general.

3.6. SC policy for general systems

Moving beyond two-server, two-class systems, we propose the following policy. We solve the LP in Section 3.4, where the ranges of servers ($j$) and classes ($k$) run from 1 to $M$ and 1 to $K$, respectively. We call the resulting LP, LP-1. Given a solution $\{\delta^*_{j,k}\}$ to LP-1, define the set $S_j$ to be the set of all classes $k$ for which $\delta^*_{j,k} > 0$. The SC policy is as follows. Let $Q_k$ be the number of class $k$ customers in the system. For each server $j$,

1. If $\exists k \in S_j$ such that $Q_k > 0$, then for all $k \in S_j$, work at rate

\[
\frac{\mu_{j,k}\,\delta^*_{j,k}\,1\{Q_k > 0\}}{1 - \sum_{\ell \in S_j} \delta^*_{j,\ell}\,1\{Q_\ell = 0\}}.
\]

2. If $\nexists k \in S_j$ such that $Q_k > 0$, serve the queue $k$ with the highest value of $Q_k \mu_{j,k}$ (ties can be broken arbitrarily).

Note that the second item is actually an arbitrary choice. If there are no customers available in $S_j$, server $j$ can work anywhere it likes.

Note that if $\gamma^* < 1$, then it is not possible to stabilize the system under any policy, see Theorem 1(ii) of Andradóttir et al. (2003). If $\gamma^* > 1$, under the mild additional assumption that each arrival process has an underlying interarrival time distribution that is unbounded and spread out (see Section 2.3 of Andradóttir et al., 2003), it is not difficult to show that the SC policy guarantees stability of the system. In the interests of space, we will not introduce all of the notation of the fluid model approach in Andradóttir et al. (2003), but if $(\bar{Q}_k(t), \bar{T}_{j,k}(t))$ is a fluid limit for the system (see Andradóttir et al., 2003), then for the SC policy, if $\bar{Q}_k(t) > 0$, then

\[
\frac{d}{dt} \bar{Q}_k(t) \le \lambda_k - \sum_{j=1}^{M} \mu_{j,k}\,\delta^*_{j,k}
\]

and the right-hand side is strictly negative, since the first constraint of LP-1 gives $\sum_{j=1}^{M} \mu_{j,k}\,\delta^*_{j,k} \ge \gamma^* \lambda_k > \lambda_k$ when $\gamma^* > 1$. Hence we have that each of the fluid-scaled queue lengths in the system converges to zero and we can state the following result, where $Q(t)$ is the $K$-dimensional queue length vector with $k$th entry $Q_k(t)$. The proof is similar to that of Theorem 1 of Andradóttir et al. (2003).

Theorem 1. Assume that all $K$ interarrival time distributions are unbounded and spread out.
(i) If $\gamma^* > 1$, then the SC policy results in a queue-length process $\{Q(t)\}$ which converges to a steady-state distribution $\varphi$ as $t \to \infty$.
(ii) If $\gamma^* < 1$, $P(|Q(t)| \to \infty) = 1$ for any policy.

We suggest that the SC policy is a reasonable basis for comparing flexibility structures. It is guaranteed to stabilize the system if it can be stabilized (unlike SLQ, see Section 3.3) and has tight control over the proportion of time that a server spends at a class. We have observed that this policy outperforms SLQ for the examples introduced to this point, but cannot make this conclusion in general; in particular, SLQ should perform well for homogeneous systems. As a result, the SC policy does appear to be a reasonable basis for comparison under the proposed ranking methodology.

Before we proceed with our proposed ranking methodology, it is instructive to relate the development of the SC policy to the “heavy traffic” analysis of queues. In particular, static planning problems similar to LP-1 arise in Harrison (2000), Ata and Kumar (2005), and Stolyar (2004), for example. The goal in these works is to determine the assignment of servers to classes to guarantee heavy-traffic optimality, as well as to design scheduling policies that enforce this assignment. Similar to the heavy-traffic approach, we suggest that a policy that always follows the optimal proportions $\{\delta^*_{j,k}\}$ is a reasonable basis for comparison; however, there are key differences. We are interested in comparing flexibility structures at arbitrary utilizations.
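As a concrete illustration of how a server follows the proportions $\{\delta^*_{j,k}\}$ at any utilization, the first rule of the SC policy in Section 3.6 can be sketched as below. This is a minimal sketch under stated assumptions: the function name `sc_work_rates` is hypothetical, classes are indexed from 0, the proportions are assumed to sum to one for each server, and the numerical values in the usage lines are made up for illustration rather than taken from any example in the paper.

```python
def sc_work_rates(delta_j, mu_j, queues):
    """Work rates of server j under rule 1 of the SC policy (Section 3.6).

    delta_j[k]: optimal proportion of server j's effort for class k (from LP-1)
    mu_j[k]:    service rate of server j at class k
    queues[k]:  current number of class k customers
    Returns None when no class in S_j has customers, i.e. when rule 2 applies.
    """
    s_j = [k for k, d in enumerate(delta_j) if d > 0]
    if not any(queues[k] > 0 for k in s_j):
        return None
    # Effort assigned to empty classes in S_j is redistributed proportionally.
    idle_share = sum(delta_j[k] for k in s_j if queues[k] == 0)
    rates = [0.0] * len(delta_j)
    for k in s_j:
        if queues[k] > 0:
            rates[k] = mu_j[k] * delta_j[k] / (1 - idle_share)
    return rates


# Hypothetical proportions and rates for one server and three classes.
delta_j = [0.5, 0.25, 0.25]
mu_j = [1.0, 2.0, 4.0]
print(sc_work_rates(delta_j, mu_j, [3, 2, 1]))  # all classes busy
print(sc_work_rates(delta_j, mu_j, [3, 0, 1]))  # class 1 (0-based) empty
print(sc_work_rates(delta_j, mu_j, [0, 0, 0]))  # rule 2 would apply
```

Dividing each returned rate by the corresponding `mu_j[k]` gives the fraction of effort spent at class `k`; in this sketch those fractions sum to one whenever rule 1 applies, so the server is never left partially idle while a class in $S_j$ has work.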
It is not difficult to construct a set of flexibility structures that have identical heavy traffic behavior (under a heavy-traffic optimal policy). This can be seen intuitively: for heavy-traffic optimality, typically only a subset of the flexibility structure is employed, while at lower utilizations there is potential to employ the entire flexibility structure. We would like to evaluate the impact of these additional capabilities. In addition, the heavy traffic literature excludes flexibility structures with a chain-like structure (which typically perform well; see Chou, Chua, Teo, & Zheng, 2010; Graves & Tomlin, 2003; Hopp, Tekin, & van Oyen, 2004; Iravani et al., 2005 for discussions of the desirable properties of chains), as one cannot guarantee so-called “state space collapse” in such cases. Chain structures are considered in the example in Section 5.2.

4. Ranking techniques

In this section, we propose breaking down the CF index into its components to create a partial ranking of flexibility structures (Section 4.1). A consequence of this partial ranking is that we are able to identify scenarios for which the CF index must be used


with caution. We then propose a metric in Section 4.2 that can address such scenarios, but is dependent on the underlying distributions. 4.1. Generalizing the CF index Using the CF index as a starting point, we first examine when the CF index may not produce correct rankings. To motivate our proposed ranking procedure, we return to Example 2 and examine the components of the CF index individually – the individual terms that are averaged in (1). We define ci,k to be the term λk /m∗k for Fi . The corresponding vector ci has kth element ci,k . For F1 , we have c1,1 = 0.95 and c1,2 = 0.9/1.05. For F2 , the corresponding quantities are c2,1 = 0.95/1.1 and c2,2 = 0.9. The average of the two components is smaller for F2 than for F1 , so the CF index prefers F2 . However, c2,1 < c1,1 and c1,2 < c2,2 . While the load on class 1 can be decreased by a greater degree than the load on class 2 (hence the lower CF index for F2 ), higher variability at class 2 may mean that the ability to decrease the load at class 2 may be more desirable, and hence F1 may be preferred. Example 2 is precisely a scenario where this occurs. We propose that it is desirable to look at a generalization of the CF index, by comparing the terms that give rise to the index. This will allow the identification of scenarios where underlying variability should be taken into consideration. In general, with ci,k defined as above, we suggest that an appropriate means of ranking is to say that Fi is preferred over Fj (Fi < CCF Fj ) under a componentwise CF index if ci,k ≤ cj,k for all classes k with the inequality being strict for at least one k. (We will also write this as ci < cj .) Note that Fi < CCF Fj implies that Fi < CF Fj , but the converse does not necessarily hold. Also, if Fi and Fj cannot be ranked according to the componentwise CF index, the ranking can potentially go in either direction. A case in point is Example 2. 
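The comparison just described can be made concrete with a short sketch. It assumes, as the text states, that the CF index is the plain average of the components c_{i,k}; the function names `cf_index`, `ccf_prefers`, and `ccf_compare` are illustrative and do not come from the paper. The component values are those quoted above for Example 2.

```python
from typing import Optional, Sequence


def cf_index(c: Sequence[float]) -> float:
    """CF index, taken here as the plain average of the components c_k."""
    return sum(c) / len(c)


def ccf_prefers(ci: Sequence[float], cj: Sequence[float], tol: float = 1e-12) -> bool:
    """True if F_i <_CCF F_j: c_{i,k} <= c_{j,k} for all k, strictly for some k."""
    no_worse = all(a <= b + tol for a, b in zip(ci, cj))
    strictly_better = any(a < b - tol for a, b in zip(ci, cj))
    return no_worse and strictly_better


def ccf_compare(ci: Sequence[float], cj: Sequence[float]) -> Optional[int]:
    """-1 if F_i is preferred, +1 if F_j is preferred, None if the pair is unranked."""
    if ccf_prefers(ci, cj):
        return -1
    if ccf_prefers(cj, ci):
        return 1
    return None


# Components for Example 2, as quoted in the text.
c1 = [0.95, 0.9 / 1.05]
c2 = [0.95 / 1.1, 0.9]

print(cf_index(c1), cf_index(c2))  # the average is smaller for F2
print(ccf_compare(c1, c2))         # None: the componentwise index cannot rank them
```

Returning `None` for an unranked pair keeps the partial nature of the ordering explicit, so that a caller must decide separately how to break such ties.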
Here, the componentwise CF index cannot rank the two structures, and in Section 3.5 it was shown that for the underlying distributions considered there, F1 is preferred to F2 under the SC policy. However, suppose that the underlying distributions are now all exponential. Then the average number in system for our policy for F1 is 30.19 ± 0.46, while for F2 it is 20.60 ± 0.17, so here F2 is preferred. In general, the componentwise CF index identifies classes for which high variability causes issues for ranking. In fact, if Fi < CF Fj but cj,k < ci,k , Fj may be preferred if the underlying variability in class k is sufficiently high. The value of the componentwise CF index is that if two flexibility structures can be ranked according to this approach, then the ranking does not appear to depend on the underlying distributions. However, it is not difficult to see that < CCF induces a partial ordering on the set of flexibility structures under consideration. While the componentwise CF index does identify which classes can lead to rankings that can go in either direction depending on underlying variability, it does not resolve these rankings for particular instances of underlying distributions. One could resort to simulation at this point, but we introduce a metric that can be efficiently computed and appears to work reasonably well, at least in cases where the componentwise CF index cannot do so and there is significant heterogeneity in the underlying variances. 4.2. The Variance metric As WIP is the performance measure of interest, our Variance metric attempts to address directly how a flexibility structure impacts WIP. For the CF index and componentwise CF index, WIP is addressed indirectly, as the underlying idea for these approaches is that increasing capacity (reducing load) decreases WIP, but that ignores variance effects. We would like to measure how well the system can simultaneously handle all of the different classes, by

modelling each queue as a G/G/1 queue and using a diffusion approximation for the mean queue length at each queue. Note that this metric is not an approximation for the mean total number of customers in the system as we have not included the effects of pooling – we will discuss this issue after presenting the metric. We compute the metric Vi corresponding to structure Fi , as the optimal objective function value for the following optimization problem (the superscript i has been omitted throughout):

\[
\begin{aligned}
\min \quad & \sum_{k=1}^{K} \frac{\lambda_k \left(C_{a,k}^2 + C_{s,k}^2\right)}{2\left(\sum_{j=1}^{M} \mu_{j,k}\,\delta_{j,k} - \lambda_k\right)} \\
\text{s.t.} \quad & \sum_{j=1}^{M} \mu_{j,k}\,\delta_{j,k} \ge \lambda_k, \quad k = 1, \ldots, K, \\
& \sum_{k=1}^{K} \delta_{j,k} = 1, \quad j = 1, \ldots, M, \\
& \delta_{j,k} \ge 0, \quad j = 1, \ldots, M, \; k = 1, \ldots, K.
\end{aligned}
\]

This is a convex problem, so is amenable to solution by commercial solvers. According to this metric, we say that flexibility structure Fi is preferred over Fj (Fi <V Fj) if Vi < Vj. While this metric has some attractive features, it is not clear how well it would perform in general. In particular, the following example provides insight into the applicability of the Variance metric and the relationship between the Variance metric and the componentwise CF index.

Consider the flexibility structures F1 with μ1,1 = μ2,2 = 1.1 and μ1,2 = μ2,1 = 0, and F2 with μ1,1 = μ1,2 = μ2,1 = μ2,2 = 1.0. The arrival rates are λ1 = λ2 = λ. All underlying distributions are exponential. It is not difficult to see that F1 is stable for λ < 1.1 and F2 can be stabilized for λ < 1.0. It is straightforward to show that F1 <V F2 for all λ < 1.1. On the other hand, the ranking according to the componentwise CF index is dependent on λ. To see this, for λ < 1.0, the components of the CF index are both λ/1.1 for F1 and λ/(2.0 − λ) for F2. Combining this with the fact that F2 is unstable for λ > 1.0, one then has for 0.9 < λ < 1.1, F1 <CCF F2 and for λ < 0.9, F2 <CCF F1. This leads us to the following observations:
(i) It is not the case that ranking using one approach implies a corresponding ranking in the other (one might have expected Fi <CCF Fj implies that Fi <V Fj, but this example shows that this is not the case in general).
(ii) The componentwise CF index ranking is load dependent. One might have expected a flexibility structure to be preferred if it could stabilize a strictly larger set of arrival rates than another flexibility structure, but as can be seen by this example, this is not necessarily the case.
(iii) The Variance metric cannot capture the benefits of the ability to pool service capacity, as it partitions the system into single server queues.
Thus it appears to be most effective when variability is heterogeneous, and the benefits of pooling are decreased.
(iv) For exponentially distributed service requirements, the comparison between F1 and F2 amounts to comparing a system of two M/M/1 queues (F1) with an M/M/2 system (F2), which can be done analytically. To two digits, for λ < 0.89, F2 performs better than F1, otherwise F1 performs better than F2. Here, the underlying variability is homogeneous, so the CCF approach is applicable (as the CCF ranking can capture pooling effects). Furthermore, the values of the arrival rate for which one structure is preferred over the other are in close agreement with the CCF ranking.
(v) For those pairs of flexibility structures that cannot be ranked by the componentwise CF index, we believe that the Variance metric is a useful proxy that admits a ranking of the structures. It has an objective of minimizing WIP and captures the ability to assign more capacity to classes with higher underlying variance, in addition to the diminishing returns for having large amounts of capacity available for a particular class.

In Section 5, we will see that the two metrics indeed work in concert as indicated in point (v) above. Before moving on to these examples, we note that an interesting avenue for future research may be to examine if the policy induced by the Variance metric LP (using the server allocations from the Variance metric LP rather than those from the solution of LP-1 in the policy in Section 3.6) performs well. Our initial attempts at doing so have produced mixed results. While this policy appears to perform well in systems with significant heterogeneity in underlying variance, it does not appear to work as well in more homogeneous settings.

With the above observations in mind, we propose a ranking procedure that first produces a (partial) ranking using the CCF ranking. For any unranked pairs of flexibility structures, we use the Variance metric. The resulting complete ranking we will call the CV (CCF plus Variance) ranking. We now proceed to validate the CV ranking for several examples.

5. Examples
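As a concrete illustration of the Variance metric from Section 4.2, the following minimal sketch evaluates its objective for a given allocation and minimizes it by a crude grid search for a two-server, two-class system. The instance data for F1 (unit service rates, server 2 able to serve only class 2, λ = 0.9, class 1 with no variability, class 2 with C²a + C²s = 2) are read off from the reduced problem for Example 1 in Section 5.1. The matrix used for F2 is an assumption (the mirrored structure, consistent with the quoted components c2,1 = 0.9/1.1 and c2,2 = 0.9). The function names and the grid search are illustrative only and are not the authors' solver; a convex solver would be the natural choice for larger instances.

```python
from itertools import product


def variance_objective(lam, c2_sum, mu, delta, tol=1e-9):
    """Variance metric objective for allocation delta (inf if infeasible).

    lam[k]      arrival rate of class k
    c2_sum[k]   C_{a,k}^2 + C_{s,k}^2 for class k
    mu[j][k]    service rate of server j at class k (0 if infeasible)
    delta[j][k] fraction of server j's time assigned to class k
    """
    total = 0.0
    for k in range(len(lam)):
        capacity = sum(mu[j][k] * delta[j][k] for j in range(len(mu)))
        slack = capacity - lam[k]
        if slack < -tol:
            return float("inf")  # capacity constraint violated
        if c2_sum[k] == 0:
            continue             # no variability: the term is zero
        if slack <= tol:
            return float("inf")  # variable class with no spare capacity
        total += lam[k] * c2_sum[k] / (2.0 * slack)
    return total


def variance_metric_2x2(lam, c2_sum, mu, steps=100):
    """Grid search over the time each of two servers gives to class 1."""
    best, best_delta = float("inf"), None
    for i, j in product(range(steps + 1), repeat=2):
        d1, d2 = i / steps, j / steps
        delta = [[d1, 1.0 - d1], [d2, 1.0 - d2]]
        value = variance_objective(lam, c2_sum, mu, delta)
        if value < best:
            best, best_delta = value, delta
    return best, best_delta


lam = [0.9, 0.9]
c2_sum = [0.0, 2.0]              # class 1 deterministic, class 2 exponential
MU_F1 = [[1.0, 1.0], [0.0, 1.0]]  # server 1 flexible, server 2 only class 2
MU_F2 = [[1.0, 0.0], [1.0, 1.0]]  # assumed mirror image of F1

v1, delta1 = variance_metric_2x2(lam, c2_sum, MU_F1)
v2, delta2 = variance_metric_2x2(lam, c2_sum, MU_F2)
print(v1, v2)  # approximately 4.5 and 9, the values V1 = 9/2 and V2 = 9 in Section 5.1
```

The grid step of 0.01 happens to contain the optimal allocation for this instance; in general the grid only yields an upper bound on the metric.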

We present six examples to validate the CV ranking procedure. The first three (in Section 5.1) are small enough that we can provide complete details of the procedure. The fourth example (in Section 5.2) also explores the issue of the efficacy of SLQ. The final two examples (in Sections 5.3 and 5.4, respectively) present a more comprehensive comparison for larger systems under a range of interarrival time and service requirement distributions. Throughout this section, we use the SC policy as the basis for comparing the performance of systems.

5.1. Examples 1–3 revisited

For Example 1, in terms of the means, the two structures are identical. As a result, the CF index is the same for both systems. For the componentwise CF index, we have c1,1 = 0.9, c1,2 = 0.9/1.1, and c2,1 = 0.9/1.1, c2,2 = 0.9, so no ranking is possible. So, we use the Variance metric to rank the flexibility structures. As C_{a,1}^2 = C_{s,1}^2 = 0, and server 2 is only capable of serving class 2, we have that for F1, the metric V1 is calculated from

\[
\begin{aligned}
\min \quad & \frac{0.9}{\delta_{1,2} + 1 - 0.9} \\
\text{s.t.} \quad & \delta_{1,1} \ge 0.9, \\
& \delta_{1,2} + 1 \ge 0.9, \\
& \delta_{1,1} + \delta_{1,2} = 1, \\
& \delta_{1,1}, \delta_{1,2} \ge 0.
\end{aligned}
\]

This problem has the trivial solution δ∗1,1 = 0.9, δ∗1,2 = 0.1 and thus V1 = 9/2. In a similar manner, V2 = 9. So, F1 is preferred, consistent with the observed performance for these two systems (see Section 3.5). Suppose that we also consider full flexibility (μ^3_{1,1} = μ^3_{1,2} = μ^3_{2,1} = μ^3_{2,2} = 1), call this structure F3. The components of the CF index are (to four decimal places) c1 = [0.9000 0.8182], c2 = [0.8182 0.9000], and c3 = [0.8182 0.8182], so F3 <CCF F1, F2. Combining this with F1 <V F2, we get F3 <CV F1 <CV F2 (Fi <CV Fj denotes that flexibility structure Fi is preferred over Fj using the CV ranking). The average number in system for F3 is 5.13 ± 0.03, so our ranking procedure makes the correct choice.

For Example 2, c1,1 = 0.95, c1,2 = 0.9/1.05, c2,1 = 0.95/1.1, and c2,2 = 0.9. The componentwise CF index cannot rank the two structures. We then have V1 = 6 and V2 = 9, so F1 is preferred, agreeing with the observations in Section 3.5.

For Example 3, c1,1 = c1,2 = 0.9/1.05 and c2,1 = c2,2 = 0.9. F1 is preferred over F2 using the componentwise CF index, again agreeing with the results in Section 3.5.

5.2. A 3 by 3 Example

We consider a system with three classes and three servers. Our goals in presenting this system are twofold. The first is to give another detailed example of the use of the CV ranking. The second is to provide further evidence on the problematic nature of employing SLQ as a scheduling policy. We first provide the complete details for the CV ranking for a particular choice of interarrival time and service requirement distributions. The three classes have arrival rates all equal to 0.9. The interarrival times for class 1 are exponentially distributed, and the service requirements are also exponentially distributed. For classes 2 and 3, the interarrival times and service requirements are constant. There are six flexibility structures under consideration (all μj,k’s not specified are zero).

• F1: μ^1_{1,1} = μ^1_{2,2} = μ^1_{3,3} = 1
• F2: μ^2_{1,2} = μ^2_{2,3} = μ^2_{3,1} = 0.4, μ^2_{1,3} = μ^2_{2,1} = μ^2_{3,2} = 0.5
• F3: μ^3_{1,1} = μ^3_{2,2} = μ^3_{3,3} = 1, μ^3_{1,3} = μ^3_{2,1} = μ^3_{3,2} = 0.5
• F4: μ^4_{1,1} = μ^4_{2,2} = μ^4_{3,3} = 1, μ^4_{1,2} = μ^4_{2,3} = μ^4_{3,1} = 0.4
• F5: F3 with μ^5_{1,2} changed to 0.4
• F6: F4 with μ^6_{2,1} changed to 0.5
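The six structures listed above can be encoded as rate matrices, which makes the structural claims in the surrounding text easy to check. The sketch below uses the convention mu[j][k] for server j at class k; the helper names `build` and `is_chain` and the constants `DIAG`, `RATE_05`, and `RATE_04` are hypothetical. The chain test only checks the degree condition stated in the text (each server serves exactly two classes and each class is served by exactly two servers), which suffices for a 3 by 3 system.

```python
def build(rates, n=3):
    """Return an n-by-n matrix mu[j][k] from a dict keyed by 1-indexed (server, class)."""
    mu = [[0.0] * n for _ in range(n)]
    for (j, k), rate in rates.items():
        mu[j - 1][k - 1] = rate
    return mu


DIAG = {(1, 1): 1.0, (2, 2): 1.0, (3, 3): 1.0}      # fastest rate of each server
RATE_05 = {(1, 3): 0.5, (2, 1): 0.5, (3, 2): 0.5}   # second-fastest rates
RATE_04 = {(1, 2): 0.4, (2, 3): 0.4, (3, 1): 0.4}   # slowest rates

structures = {
    "F1": build(DIAG),
    "F2": build({**RATE_04, **RATE_05}),
    "F3": build({**DIAG, **RATE_05}),
    "F4": build({**DIAG, **RATE_04}),
    "F5": build({**DIAG, **RATE_05, (1, 2): 0.4}),  # F3 plus one skill
    "F6": build({**DIAG, **RATE_04, (2, 1): 0.5}),  # F4 plus one skill
}


def is_chain(mu):
    """Degree test: two skills per server and two servers per class."""
    n = len(mu)
    servers_ok = all(sum(rate > 0 for rate in row) == 2 for row in mu)
    classes_ok = all(sum(mu[j][k] > 0 for j in range(n)) == 2 for k in range(n))
    return servers_ok and classes_ok


chains = [name for name, mu in structures.items() if is_chain(mu)]
print(chains)  # ['F2', 'F3', 'F4']
```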

These flexibility structures use a common set of underlying service rates μj,k, but different rates in each flexibility structure are infeasible (set to zero). F1 corresponds to dedicated servers, with the servers each having their fastest possible rates. F2, F3, and F4 are the three possible chains (each server can serve exactly two classes and each class is served by exactly two servers), while F5 and F6 add one skill to the chains given by F3 and F4, respectively. Note that F3 employs the fastest two rates for each server, while F4 uses the fastest and slowest rates in its chain. F6 adds a skill to F4 for the higher variance class, while F5 adds a skill to F3 for a lower variance class. First, we compute the CF index for each of the flexibility structures:

• F1: 0.9000
• F2: ∞
• F3: 0.8372
• F4: 0.8523
• F5: 0.8334
• F6: 0.8434

So, using the CF index, the ranking is:

F5 <CF F3 <CF F6 <CF F4 <CF F1 <CF F2.

The corresponding component vectors are:

c1 = [0.9000 0.9000 0.9000]
c2 = [∞ ∞ ∞]
c3 = [0.8372 0.8372 0.8372]
c4 = [0.8523 0.8523 0.8523]
c5 = [0.8372 0.8257 0.8372]
c6 = [0.8257 0.8523 0.8523]

So, we have the following using the componentwise CF index:

F5 <CCF F3 <CCF F4 <CCF F1 <CCF F2 and F6 <CCF F4 <CCF F1 <CCF F2.

We would have a complete ranking if this step could rank the pairs F5 and F6 (and F3 and F6 if F5 is preferred over F6 ). So, we compute the Variance metrics



V5 = 2.5714, V6 = 2.3684,

and so F6 <V F5. So, the CV ranking produces

F6 <CV F5 <CV F3 <CV F4 <CV F1 <CV F2.

1. Constant (deterministic).
2. Uniform distribution with squared coefficient of variation 0.1.
3. Erlang distribution with squared coefficient of variation 0.5.
4. Exponential distribution.
5. Hyperexponential distribution with squared coefficient of variation 2.
6. Hyperexponential distribution with squared coefficient of variation 5.
7. Hyperexponential distribution with squared coefficient of variation 10.

As a reminder, the arrival rate is fixed at 0.9 and the mean service requirement is fixed at 1. Each experiment required six independent choices from the above set of distributions, one for the interarrival time distribution and the service requirement distribution for each class. The 10 experiments were run for 100,000 time units. For all 10 experiments, F2, F3, F4, F5, and F6 were unstable under SLQ. So, here we have the trivial result that F1 is the best flexibility structure, with the remaining structures not even being viable in the sense that they are unstable. This is easily seen to be inconsistent with both the ranking produced by the CF index and the CV ranking. This is a clear indication of the problematic nature of SLQ. (It may be of interest to note that in terms of how unstable the systems are, for all of the experiments, the number in system increased at the fastest rate for F2, followed by (in order) F6, F4, F5, and F3, which is also not consistent with the rankings.)

5.3. A six-server, six-class Example (I)

Here, we consider an example taken from Iravani et al. (2011), corresponding to structures 2–1 through 2–6 (see Fig. 3 in that paper). There are six classes. The corresponding arrival rates are λ1 = λ2 = λ3 = 1.5000, λ4 = λ5 = λ6 = 0.5000. The service rates are (all unspecified values are 0):

• F1: μ^1_{1,1} = 1.5000, μ^1_{1,3} = 1.0000, μ^1_{2,2} = 2.2500, μ^1_{2,5} = 0.7500, μ^1_{3,1} = 1.5000, μ^1_{3,3} = 1.0000, μ^1_{4,3} = 1.2500, μ^1_{4,4} = 0.9375, μ^1_{5,1} = 1.1250, μ^1_{5,2} = 1.1250, μ^1_{6,2} = 2.2500, μ^1_{6,6} = 0.7500
• F2: μ^2_{1,1} = 1.5000, μ^2_{1,3} = 1.0000, μ^2_{2,2} = 1.5000, μ^2_{2,3} = 1.0000, μ^2_{3,1} = 1.8750, μ^2_{3,4} = 0.9375, μ^2_{4,3} = 1.5000, μ^2_{4,5} = 0.7500, μ^2_{5,1} = 1.1250, μ^2_{5,2} = 1.1250, μ^2_{6,2} = 2.2500, μ^2_{6,6} = 0.7500
• F3: μ^3_{1,1} = 1.1250, μ^3_{2,1} = 1.5000, μ^3_{2,2} = 1.5000, μ^3_{3,2} = 1.8750, μ^3_{3,3} = 1.2500, μ^3_{4,2} = 1.5000, μ^3_{4,4} = 0.7500, μ^3_{5,3} = 1.5000, μ^3_{5,5} = 0.7500, μ^3_{6,3} = 1.5000, μ^3_{6,6} = 0.7500
• F4: μ^4_{1,1} = 2.2500, μ^4_{1,2} = 2.2500, μ^4_{1,3} = 1.5000, μ^4_{2,1} = 2.6250, μ^4_{2,2} = 2.6250, μ^4_{2,3} = 1.7500, μ^4_{3,3} = 1.2500, μ^4_{3,4} = 0.9375, μ^4_{4,4} = 0.7500, μ^4_{4,5} = 0.5000, μ^4_{5,5} = 0.7500, μ^4_{5,6} = 0.7500
• F5: μ^5_{1,1} = 1.5000, μ^5_{1,2} = 1.5000, μ^5_{2,2} = 2.6250, μ^5_{2,3} = 1.7500, μ^5_{3,1} = 1.8750, μ^5_{3,4} = 0.9375, μ^5_{4,2} = 2.2500, μ^5_{4,5} = 0.7500, μ^5_{5,3} = 1.5000, μ^5_{5,6} = 0.7500
• F6: μ^6_{1,1} = 2.6250, μ^6_{1,2} = 2.6250, μ^6_{1,3} = 1.7500, μ^6_{1,4} = 1.3125, μ^6_{1,5} = 0.8750, μ^6_{2,1} = 2.6250, μ^6_{2,2} = 2.6250, μ^6_{2,3} = 1.7500, μ^6_{2,5} = 0.8750, μ^6_{2,6} = 0.7500, μ^6_{3,1} = 1.5000, μ^6_{3,4} = 0.7500, μ^6_{4,2} = 1.5000, μ^6_{4,5} = 0.5000, μ^6_{5,3} = 1.5000, μ^6_{5,6} = 0.7500

The CF index technique ranks these in the order:

F6 <CF F4 <CF F2 <CF F1 <CF F5 <CF F3.

The corresponding component vectors are:

c1 = [0.5455 0.5714 0.6429 0.5333 0.6667 0.6667]
c2 = [0.5455 0.5455 0.6429 0.5333 0.6667 0.6667]
c3 = [0.5714 0.5455 0.6667 0.6667 0.6667 0.6667]
c4 = [0.5455 0.5455 0.6429 0.4444 0.6667 0.6667]
c5 = [0.6316 0.5455 0.6667 0.5333 0.6667 0.6667]
c6 = [0.5455 0.5455 0.6429 0.4444 0.5455 0.5455]

We see that the rankings of F1, F3, and F5 are potentially problematic if one follows the CF index ranking, as these three structures cannot be ranked against each other using the componentwise CF index. We do have

F6 <CCF F4 <CCF F2 <CCF F1, F3, F5.

for which the CF index was correct). What is more instructive is that the ci vectors identify conditions under which one should use caution in using the CF index. In particular, the ranking of F1 and F5 may become problematic when the underlying variance in class 2 becomes sufficiently large, as c5,2 < c1,2 , which is opposite to the CF index ranking. For example, the CV ranking was correct for the 28th experiment, which had a hyperexponential distribution with squared coefficient of variation of 10 for both class 2’s interarrival times and service requirements, while for all other classes, the underlying distributions had squared coefficient variation of at most 1. Another example is the ranking of F3 and F5 . As c3,1 < c5,1 , if the variance in class 1 is sufficiently high, the ranking of the CF index may again be problematic. Experiment 74 had high service requirement variance for class 1 (squared coefficient of variation of 5), while class 4 (where F5 is better) had much lower variability. Our approach ranked this pair of flexibility structures correctly. We also examined the suggestion that if ci < cj , then Fi is preferred over Fj for a wide range of underlying distributions. In all 100 experiments, F6 was the best flexibility structure, by a significant margin. On average, F6 had a smaller average number in system than the runner up by a factor of 0.66. This is consistent with the significant gap between the metrics. F4 was the second-ranked structure in 87 of the experiments, third-ranked in five of the experiments, fourth-ranked in seven of the experiments, and sixth-ranked in one of the experiments. On examining the 13 experiments where F4 was not ranked second, it appears that the issue is with the scheduling policy. All of these cases had very heterogeneous variability, and it appears that the SC policy could be improved in such settings. 
This reinforces our earlier observation both that it is difficult to determine an optimal policy and that the ranking can be sensitive to the choice of server assignment policy. F2 was preferred to F1, F5, and F3 in 74 experiments, was preferred to only two of these in 12 experiments, was preferred to only one of these in 12 experiments, and was ranked last in two experiments. Again, more detailed examination suggests that this was due to the choice of scheduling policy.

Note that changing the scheduling policy to SLQ causes issues with ranking. For example, we ran 10 experiments using SLQ. F5 was first in two experiments, second in six experiments and third in one experiment. On the other hand, F4 was ranked second in one experiment, third in one experiment, fourth in seven experiments, and fifth in one experiment. The poor relative performance of F4 under SLQ can perhaps be explained by servers 1 and 2 spending too long at particular classes. The fact that SLQ results in F5 being one of the top three flexibility structures in nine out of 10 experiments despite being ranked no higher than fourth by both approaches, and F4 being one of the bottom three in eight out of 10 experiments despite being ranked second by both approaches, is a clear indicator that it is a problematic choice of scheduling policy to perform rankings.

5.4. A six-server, six-class Example (II)

Here, we consider a second example taken from Iravani et al. (2011), corresponding to structures 1–1 through 1–5 (see Fig. 6 in that paper). There are six classes. The corresponding arrival rates are λ1 = λ6 = 1.3500, λ2 = λ5 = 0.9000, and λ3 = λ4 = 0.4500. The service rates are (all unspecified values are 0):

• F1: μ^1_{1,1} = 1.2500, μ^1_{1,2} = 1.1000, μ^1_{2,2} = 1.1000, μ^1_{2,6} = 1.2500, μ^1_{3,1} = 1.2500, μ^1_{3,5} = 1.1000, μ^1_{4,3} = 1.0000, μ^1_{4,5} = 1.1000, μ^1_{5,1} = 1.2500, μ^1_{5,6} = 1.2500, μ^1_{6,4} = 1.1000, μ^1_{6,6} = 1.2500
• F2: μ^2_{1,1} = 1.2500, μ^2_{1,6} = 1.2500, μ^2_{2,1} = 1.2500, μ^2_{2,6} = 1.2500, μ^2_{3,1} = 1.2500, μ^2_{3,2} = 1.1000, μ^2_{4,2} = 1.1000, μ^2_{4,3} = 1.0000, μ^2_{5,5} = 1.1000, μ^2_{5,6} = 1.2500, μ^2_{6,4} = 1.1000, μ^2_{6,5} = 1.1000
• F3: μ^3_{1,1} = 1.2500, μ^3_{1,2} = 1.1000, μ^3_{2,1} = 1.2500, μ^3_{2,3} = 1.0000, μ^3_{3,1} = 1.2500, μ^3_{3,5} = 1.1000, μ^3_{4,2} = 1.1000, μ^3_{4,6} = 1.2500, μ^3_{5,5} = 1.1000, μ^3_{5,6} = 1.2500, μ^3_{6,4} = 1.0000, μ^3_{6,6} = 1.2500
• F4: μ^4_{1,1} = 1.2500, μ^4_{1,2} = 1.1000, μ^4_{1,3} = 1.0000, μ^4_{1,6} = 1.2500, μ^4_{2,1} = 1.2500, μ^4_{2,2} = 1.1000, μ^4_{2,3} = 1.0000, μ^4_{2,6} = 1.2500, μ^4_{3,1} = 1.2500, μ^4_{3,2} = 1.1000, μ^4_{3,5} = 1.1000, μ^4_{3,6} = 1.2500, μ^4_{4,1} = 1.2500, μ^4_{4,2} = 1.1000, μ^4_{4,5} = 1.1000, μ^4_{4,6} = 1.2500, μ^4_{5,1} = 1.2500, μ^4_{5,4} = 1.0000, μ^4_{5,5} = 1.1000, μ^4_{5,6} = 1.2500, μ^4_{6,1} = 1.2500, μ^4_{6,4} = 1.1000, μ^4_{6,5} = 1.1000, μ^4_{6,6} = 1.2500
• F5: μ^5_{1,2} = 1.1000, μ^5_{1,5} = 1.1000, μ^5_{2,1} = 1.2500, μ^5_{2,2} = 1.1000, μ^5_{3,1} = 1.2500, μ^5_{3,5} = 1.1000, μ^5_{4,1} = 1.2500, μ^5_{4,6} = 1.2500, μ^5_{5,3} = 1.0000, μ^5_{5,6} = 1.2500, μ^5_{6,4} = 1.0000, μ^5_{6,6} = 1.2500

The CF index technique ranks these in the order

F4 <CF F3 <CF F1 <CF F5 <CF F2.

The corresponding component vectors are:

c1 = [0.4454 0.4091 0.4500 0.4091 0.5279 0.4454]
c2 = [0.4454 0.5279 0.4500 0.4091 0.5143 0.4454]
c3 = [0.4531 0.4091 0.4500 0.4500 0.4091 0.4531]
c4 = [0.4454 0.3783 0.2508 0.2428 0.3783 0.4454]
c5 = [0.4569 0.4091 0.4500 0.4500 0.4091 0.5143]

So, we have

F4 <CCF F1, F2, F3, F5 and F3 <CCF F5.

Note that μ^1_{6,4} > μ^3_{6,4}, and server 6 is the only server that is capable of working at class 4 in both F1 and F3. Consider a scenario with all classes having constant interarrival times and service requirements, except for class 4, which has high underlying variability. For sufficiently low arrival rates, it is not difficult to see that a scheduling policy which dedicates server 6 to class 4 is optimal with respect to WIP for both structures, and thus F1 is preferred to F3 (the given arrival rates satisfy this). The SC


policy does not account for opportunities to adjust relative server assignments when arrival rates are below the maximal capacity. For example, the SC policy cannot have server 6 dedicated to class 4, which may be desirable at lower loads, but is not allowable close to the maximal capacity. This reinforces our observation that the choice of scheduling policy is crucial – it makes it difficult to comprehensively evaluate proposals for ranking schemes. In terms of identifying good flexibility structures, the following is the key observation. In all 100 experiments, F4 was the best flexibility structure, by a significant margin. On average, F4 had a smaller average number in the system than the runner up by a factor of 0.46. This is consistent with the significant gap between the metrics. Also, F3 was preferred to F5 in all experiments, supporting our CCF ranking. 6. Conclusion We have suggested a procedure for ranking flexibility structures (the CV ranking) that is able to account for heterogeneous levels of variability in the customer classes. The first step of the procedure determines if one flexibility structure is preferred over another (for a wide range of underlying interarrival time and service requirement variances). When a preference cannot be determined, it also identifies scenarios where heterogeneous variability can result in one flexibility structure being preferred over another. The second step takes into account the given variability to address which structures are preferred if the first step does identify possibilities where variability can affect the flexibility structure preference. An important insight from this work is that the preference between flexibility structures is sensitive to the scheduling policy employed. This insight raises a number of theoretical questions. Much work has been done on identifying policies that perform well in heavy traffic, but it is not clear that such policies perform well over all loads. 
We have identified a policy that may be suitable for benchmarking purposes, but our work suggests that the policy can be improved upon. In fact, choosing a suitable policy under which to perform rankings appears to be a key difficulty in evaluating flexibility structures. The overarching concern of determining optimal scheduling policies is an important open problem. One final issue that is worth exploring is to see whether it is possible to determine a ranking procedure given a particular scheduling policy (FCFS, MaxWeight, etc.). As evidenced by the examples and accompanying discussion in Section 3, this appears to be a challenging problem.

Acknowledgment

The research of the first two authors has been supported by NSF under grant CMMI-1536990. The third author has been supported by NSERC under grant RGPIN-04518-2016. The authors thank the anonymous referees for their suggestions.

References

Andradóttir, S., Ayhan, H., & Down, D. G. (2003). Dynamic server allocation for queueing networks with flexible servers. Operations Research, 51(6), 952–968.
Andradóttir, S., Ayhan, H., & Down, D. G. (2013). Design principles for flexible systems. Production and Operations Management, 22(5), 1144–1156.
Argon, N. T., & Andradóttir, S. (2006). Partial pooling in tandem lines with cooperation and blocking. Queueing Systems, 52, 5–30.
Ata, B., & Kumar, S. (2005). Heavy traffic analysis of open processing networks with complete resource pooling: Asymptotic optimality of discrete review policies. Annals of Applied Probability, 15(1A), 331–391.
Baharian, G., & Tezcan, T. (2011). Stability analysis of parallel server systems under longest queue first. Mathematical Methods of Operations Research, 74(2), 257–279.
Chou, M. C., Chua, G. A., Teo, C.-P., & Zheng, H. (2010). Design for process flexibility: Efficiency of the long chain and sparse structure. Operations Research, 58(1), 43–58.
van Dijk, N. M., & van der Sluis, E. (2008). To pool or not to pool in call centers. Production and Operations Management, 17(3), 296–305.
Graves, S. C., & Tomlin, B. T. (2003). Process flexibility in supply chains. Management Science, 49(7), 907–919.
Harrison, J. M. (2000). Brownian models of open processing networks: Canonical representation of workload. Annals of Applied Probability, 10(1), 75–103.
Hopp, W. J., Tekin, E., & van Oyen, M. P. (2004). Benefits of skill chaining in serial production lines with cross-trained workers. Management Science, 50(1), 83–98.
Iravani, S. M., van Oyen, M. P., & Sims, K. T. (2005). Structural flexibility: A new perspective on the design of manufacturing and service operations. Management Science, 51(2), 151–166.
Iravani, S. M. R., Kolfal, B., & van Oyen, M. P. (2011). Capability flexibility: A decision support methodology for parallel service and manufacturing systems with flexible servers. IIE Transactions, 43(5), 363–382.
Jagannathan, K., Jiang, L., Naik, P. L., & Modiano, E. (2013). On scheduling algorithms robust to heavy-tailed traffic. In Proceedings of the 2013 WiOpt.
Jagannathan, K., Markakis, M., Modiano, E., & Tsitsiklis, J. N. (2012). Queue length asymptotics for generalized max-weight scheduling in the presence of heavy-tailed traffic. IEEE/ACM Transactions on Networking, 20(4), 1096–1111.
Mandelbaum, A., & Reiman, M. I. (1998). On pooling in queueing networks. Management Science, 44(7), 971–981.
Markakis, M. G., Modiano, E. M., & Tsitsiklis, J. N. (2009). Scheduling policies for single-hop networks with heavy-tailed traffic. In Proceedings of the forty-seventh annual Allerton conference on communication, control, and computing.
Stolyar, A. L. (2004). Maxweight scheduling in a generalized switch: State space collapse and workload minimization in heavy traffic. Annals of Applied Probability, 14(1), 1–53.
Tekin, E., Hopp, W. J., & van Oyen, M. P. (2009). Pooling strategies for service center agent cross-training. IIE Transactions, 41(6), 546–561.
Tsitsiklis, J. N., & Xu, K. (2013). Queueing system topologies with limited flexibility. In Proceedings of the 2013 ACM SIGMETRICS.