Optimal capacity management and planning in services delivery centers

Aliza R. Heching, Mark S. Squillante*
Mathematical Sciences Department, IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598, USA

* Corresponding author. Tel.: +1 914 945 3360. E-mail address: [email protected] (M.S. Squillante).

http://dx.doi.org/10.1016/j.peva.2014.01.003

Article history: Available online xxxx
Keywords: Human server systems; Services delivery centers; Simulation optimization; Stochastic modeling and analysis; Stochastic optimization

Abstract

This paper considers human server systems of queues that arise within the information technology services industry. We develop a two-phase stochastic optimization solution approach to effectively and efficiently address the capacity management and planning processes of information technology services delivery centers. A large collection of numerical experiments of real-world human server system environments investigates various issues of both theoretical and practical interest, quantifying the significant benefits of our approach as well as evaluating the financial-performance trade-offs often encountered in practice.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

System efficiency and process improvement have been studied extensively in computer systems, communication networks and other settings where non-human resources are used for service delivery. Significant research effort has been dedicated to the study and optimal design of these systems. Of growing business concern and research interest are innovative methods to evaluate system behavior and improve system performance in human capacity services systems; see, e.g., [1]. Such human capacity systems are characterized by the presence of human servers who manage the queues of customer requests and serve these requests. Customer requests are grouped into classes based on various attributes, which can include service performance guarantees and the skills required to respond to the request. The human servers, or agents, often have different skills and different levels of experience and expertise within these skills that restrict the request classes they may serve or impact the rate at which they serve these request classes. Agents are typically grouped into teams based on various attributes such as customers supported, similarity of agent skills, and geographic location.

Agent teams are associated with physical or virtual services delivery locations (SDLs) from which services are provided to customers. An SDL may support a subset of services, specializing in a subset of skills, or support a broad range of services. Services delivery providers (SDPs) offer service support from one or more potentially globally distributed SDLs. A services delivery center (SDC) represents a collection of constituent SDLs such that the SDLs comprising a single SDC may share resources and workloads, have common processes, operate as a single profit-and-loss center, or be related in other ways resulting in coordinated decision-making across the SDLs. Examples of such SDCs abound and cover a broad array of services areas including healthcare delivery, information technology delivery, and food services.

Similar to systems with non-human servers, of key operational and strategic interest for SDC environments is how to improve service delivery performance and reduce total delivery cost (driven largely by capacity staffing costs). In addition to skills matching, SDCs have some unique constraints introduced by the behavioral dynamics that are not present in
non-human server systems. For example, restrictions such as shift schedules, legal limitations on agent utilization (often geography specific), training and learning effects, and fatigue effects must be considered. Moreover, contractual constraints may place restrictions on the minimum number of agents that must be available during various hours of the day as well as how agents may be shared across different customers (e.g., due to privacy concerns). SDCs where the end-customer directly observes agent behavior have additional constraints driven by expectations relating to customer experience.

Given the foregoing complexities (described in more detail below), simulation-based optimization is the primary solution approach for capacity management and planning of a wide range of SDC environments. The advantages of simulation-based optimization concern accuracy and solution quality, including the ability to use a detailed stochastic performance model that captures all of the characteristics and complexities of real-world SDCs and the ability to determine a high-quality optimal solution within the context of this high-fidelity stochastic model. The disadvantages of simulation-based optimization, however, concern the prohibitive costs in both time and resources required to obtain optimal solutions in practice.

There are two general categories of simulation-based optimization techniques: (1) a broad spectrum of metaheuristics, such as tabu or scatter search, to control a sequence of simulation runs to find an optimal solution (e.g., see [2] and [3, Chapter 20]); (2) methods that directly solve the problem with a more rigorous mathematical foundation, such as stochastic approximation algorithms (e.g., see [4, Chapter 8] and [3, Chapter 19]). Although the latter category has a strong theoretical foundation in the case of continuous decision variables, the underlying theory in the presence of discrete decision variables is far less well understood [4]. Since the agent team capacities of SDCs are integer valued, the metaheuristics-based approach, employed in nearly all major simulation software products that support optimization (e.g., offerings from AnyLogic and Arena), is the dominant solution approach for capacity management and planning in many SDC environments.

The costs in both time and resources of simulation-based optimization are particularly prohibitive in certain SDC environments (as demonstrated and quantified in Section 4). Our objective in this study is to develop a stochastic optimization solution approach for capacity management and planning that provides the advantages of simulation-based optimization while eliminating its disadvantages, within the context of a specific class of motivating SDC environments. We then apply our solution approach to support case studies of capacity management and planning decision-making in various real-world SDC environments.

1.1. Services delivery centers

While most SDCs have received far too little attention, the class of call center environments (CCEs) has been the focus of many different research studies. Distinguishing attributes of CCEs include fully flexible agents, high-volume workloads, relatively short task processing times, relatively simple tasks, and tasks that are highly repetitive in nature; see, e.g., [5]. Several studies have considered diverse stochastic models of the capacity staffing problem in CCEs with skills-based routing.
For example, Gurvich and Whitt [6] analyze a policy for assigning service requests to agent teams based on state-dependent thresholds for team idleness and service class queue length. Gurvich et al. [7] propose a formulation of the skills-based routing problem under stationary arrival rates by converting mean performance constraints to chance-based constraints such that a service delivery manager can select the risk of failing to meet the former constraints.

Another body of work has attempted to more closely incorporate the complexities of various operational decisions in real-world CCEs using a wide range of approaches that combine simulation and optimization techniques. For example, Atlason et al. [8] solve a sample-mean approximation of the capacity staffing problem using a simulation-based analytic center cutting-plane method. Feldman and Mandelbaum [9] employ a stochastic approximation approach to determine optimal capacity staffing, where service levels (SLs) are modeled in the constraints or the objective and simulation is used to evaluate SL attainment.

Although CCEs have received a great deal of attention in the research literature, the class of human server systems arising in information technology (IT) SDC environments has received far less research attention. Moreover, there are many fundamental differences between CCEs and the comparatively understudied IT SDCs motivating our present study. This includes the degree of agent work-shift flexibility, where CCEs tend to employ greater flexibility in agent work-shift schedules than in IT SDCs to accommodate significantly higher degrees of nonstationary workload arrival patterns and significantly higher volumes of workload intensity. CCEs also tend to have a relatively lower fragmentation in skills required to respond to the different classes of requests, whereas IT SDCs tend to require greater expertise and deeper skills relative to the volume of requests for each of the different skills. This is one reason why many classes of requests require significantly more time to resolve by agents in an IT SDC than in a CCE. We shall primarily focus on IT SDCs in this paper, though our general methodology can be applied to a wide variety of SDC environments.

Arriving customer requests are tagged with attributes that include the request class, type of customer, breadth and depth of skills required to serve the request, geographic location from which the request was generated, and urgency of the request. The urgency of requests guides the order in which agent teams resolve the requests, where urgency and other factors (e.g., business needs and the criticality of supported systems) associated with each request class dictate how the request classes are prioritized for service. These attributes in combination further dictate the performance and quality guarantees associated with customer requests, which can involve system availability, sojourn time (total time in system), waiting time (total time until first touched by an agent), and residence time (difference between sojourn and waiting times). Such guarantees are provided in the form of service level agreements (SLAs), representing contractual agreements between the SDP and the customer that define the service performance the SDP must deliver to the customer. An SDP typically has multiple SLAs in place with a single customer, each of which specifies the scope, time frame, target and
percentage of attainment for the agreement. Under the terms of the SLA, the scope refers to the classes of requests whose SL or performance is measured, the time frame refers to the interval over which performance is measured, and the target and percentage attainment refer to the specifics of the SL performance that must be achieved.

IT SDC environments are highly dynamic in nature, with the volume of workload varying throughout the day and exhibiting seasonal patterns. Systems that support core business activity typically experience higher volume during customer business hours, whereas scheduled workload related to system maintenance tends to generate higher workload for the agent teams during customer non-business hours (e.g., nights and weekends). Further contributing to the high degree of system dynamics, workload is often added or removed from agent teams due to the highly competitive nature of the business. Changes in agent attributes also contribute to the dynamic nature of the SDC environment, characterized by high agent attrition and its impact on skills mix and service times for the combination of remaining and newly hired agents available to serve customer requests.

Within this highly complex and dynamic IT SDC environment, the SDP seeks to improve system financial and performance measures as part of the SDC capacity management and planning processes. The complexities and dynamics of the business environment dictate the need to frequently reassess system financial and performance measures to identify the impact of these changes on the SDC. Current state-of-the-art simulation-based optimization methods, however, are inadequate to support such SDC capacity management and planning processes.

1.2. Our contributions

To address these needs, we present a general methodology for efficiently and effectively solving stochastic optimization problems that arise in the capacity management and planning of real-world IT SDC environments. Our methodology is based on modeling SDC environments as general stochastic human server systems of queues and on a two-phase approach for optimizing financial and performance measures in these complex stochastic systems. The first phase of our solution approach consists of deriving a stochastic analysis of key performance measures for a relaxation of the stochastic system model and then deriving an efficient solution of the corresponding stochastic optimization problem. This first phase quickly identifies a nearly optimal solution to the original stochastic optimization problem based on a financial-performance objective. The second phase of our solution approach then exploits this first-phase solution as a starting point together with a simulation-based optimization methodology to move from a nearly optimal solution to the optimal solution of the original stochastic optimization problem. This second phase determines the capacity decisions and related actions that yield optimal system financial-performance measures. Our solution methodology is similar in spirit to the recent work in [10], although the underlying mathematical methods and motivating application domain are quite different.

There are several important reasons for adopting such a two-phase solution approach. From an SDP business perspective, it is critical to be able to determine optimal agent capacity levels in a highly accurate and very efficient manner.
Upon leveraging our first-phase results as a starting point, the second-phase simulation-based optimization provides optimal capacity decisions within only a few simulation runs and thus reduces by several orders of magnitude the time and resources required to evaluate and optimize system financial and performance measures in comparison with state-of-the-art approaches. Such tremendous improvements from our first-phase methodology enable a much broader and deeper exploration of the entire financial-performance space. Our second-phase methodology can then be exploited more surgically to obtain optimal capacity decisions and actions for regions of the financial-performance space of greatest importance or sensitivity, rendering significant improvements in both the efficiency and quality of the capacity management and planning processes of real-world SDCs.

The remainder of this paper is organized as follows. Section 2 describes our mathematical model and optimization formulation. We present in Section 3 our general solution approach, describing the first-phase and second-phase methodologies in turn. Section 4 presents a representative sample of results from a large collection of numerical studies as part of the capacity management and planning process of real-world SDCs, including a discussion of our experience with such SDC environments and the benefits of our solution approach in these settings. Concluding remarks follow in Section 5.

2. Mathematical model and formulation

In this section we present our stochastic model of a generic SDC environment and our formulation of the corresponding capacity management and planning stochastic optimization problem. Some SDC preliminaries are first discussed, followed by descriptions of the mathematical model and optimization problem.

2.1. Preliminaries

We consider a general class of SDCs consisting of multiple global SDLs that serve diverse classes of requests arriving from multiple customers. Each SDL is comprised of agents that are colocated at various centralized locations from which support is provided to the multiple customers. The different classes of requests arrive from diverse sources with arrival patterns containing various forms of uncertainty/variability. Agent teams are formed such that individuals on the same team have identical skills with respect to serving customer requests, in terms of both range (breadth) and expertise level
(depth) of skills. For each request class and each agent team serving such requests, the service time patterns include various forms of uncertainty/variability. The serving of each class of requests must adhere to performance-oriented SLAs, further refining the different request classes to include both required skills and performance guarantees. Since requests arriving to agent teams may vary significantly with respect to urgency, service times and performance guarantees, the SDLs exploit a priority discipline within each team where the serving of requests often follows a fixed set of procedures and a preempted request is resumed at the point in time when it was paused. The same priority scheme is followed by every agent team. A profit-based objective is employed by each SDL where revenues are gained for requests served, penalties are incurred for SLA violations, and costs are incurred for agent team capacities. Operating hours at the SDLs may vary depending upon the customer business needs and the contractually agreed upon support that is provided. The time horizon of interest is on the order of a week or month or more, comprised of different work-shifts and time-varying request arrival and service time patterns. A detailed analysis of data from the real-world SDCs motivating our study demonstrates that the workloads vary from one work-shift to the next and that the workload within a work-shift can be partitioned into a small number of stationary intervals, each consisting of underlying stochastic behaviors that do not change in distribution when shifted in time across a sufficiently long period.

2.2. Stochastic performance model

Our stochastic model of a generic SDL environment consists of first partitioning the time horizon of interest over work-shifts and time-varying workloads into stationary intervals and then defining a performance model for each stationary interval such that these stationary stochastic performance models are combined to represent the entire time horizon and constituent work-shifts.

Let $I$ denote the (totally ordered) set of stationary intervals over the combination of work-shifts and time horizon, indexed by $i \in I$. For each stationary interval $i$, our model involves a set $J$ of priority queueing systems, one for each agent team indexed by $j \in J$. A customer request for service belongs to one of a set $K$ of request classes, indexed by $k \in K$. Class $k$ requests arrive to the SDL in stationary interval $i$ according to a stochastic process $\{A_{i,k}(t); t \geq 0\}$ with $A_{i,k}(t) := \sup\{n : A_{i,k}(n) \leq t\}$ and finite rate $\lambda_{i,k}$, where $A_{i,k}(n) := \sum_{m=1}^{n} a^{(m)}_{i,k}$, $A_{i,k}(0) := 0$, the random variable (r.v.) $a^{(n)}_{i,k}$ is the interarrival time between the $(n-1)$st and $n$th class $k$ requests in interval $i$, and $a^{(0)}_{i,k} := 0$, $n \geq 1$, all defined on the same probability space. Upon arrival, a class $k$ request is routed to the queueing system of agent team $j$ in interval $i$ according to a routing policy that renders a corresponding class-team stochastic arrival process $\{A_{i,j,k}(t); t \geq 0\}$ with $A_{i,j,k}(t) := \sup\{n : A_{i,j,k}(n) \leq t\}$ and finite rate $\lambda_{i,j,k}$, where $A_{i,j,k}(n) := \sum_{m=1}^{n} a^{(m)}_{i,j,k}$, $A_{i,j,k}(0) := 0$, the r.v. $a^{(n)}_{i,j,k}$ is the interarrival time between the $(n-1)$st and $n$th class $k$ requests served by agent team $j$ in interval $i$, and $a^{(0)}_{i,j,k} := 0$, $n \geq 1$, all defined on the same probability space. The class-team sequences of interarrival time r.v.s $a^{(n)}_{i,j,k}$ are such that $\lambda_{i,k} = \sum_{j \in J} \lambda_{i,j,k}$, with $\lambda_{i,j,k}$ fixed to be 0 whenever agent team $j$ does not have the appropriate skills to serve class $k$ requests. The times required for agent team $j$ to serve class $k$ requests in stationary interval $i$ are governed by a stochastic process $\{S_{i,j,k}(t); t \geq 0\}$ with $S_{i,j,k}(t) := \sup\{n : S_{i,j,k}(n) \leq t\}$ and finite rate $\mu_{i,j,k}$, where $S_{i,j,k}(n) := \sum_{m=1}^{n} s^{(m)}_{i,j,k}$, $S_{i,j,k}(0) := 0$, and the r.v. $s^{(n)}_{i,j,k}$ is the time required to serve the $n$th class $k$ request by team $j$ in interval $i$, $n \geq 1$, all defined on the same probability space. The class-team sequences of service time r.v.s $s^{(n)}_{i,j,k}$ reflect the breadth and depth of skills of team $j$ agents for serving class $k$ requests in interval $i$.

The queueing system for each team in each stationary interval employs a fixed-priority scheduling policy such that requests of class $k$ are given priority over requests of class $k'$ for all $k < k'$ with $k, k' \in K$. A preemptive–resume scheduling discipline is deployed across request classes in which the serving of preempted requests is resumed from the point where they left off without any overhead. Requests within each class are served in a first-come, first-served (FCFS) manner. Let $C_{i,j} \in \mathbb{Z}_{+}$ denote the number of agents (capacity) that comprises team $j$ in interval $i$. System dynamics are such that class $k$ requests arrive to the team $j$ queueing system in interval $i$ according to the routing policy rendering the arrival process $\{A_{i,j,k}(t); t \geq 0\}$, wait in the corresponding class $k$ queue to be served by one of the $C_{i,j}$ agents (including possible preemptions), and leave the system upon completion according to the service process $\{S_{i,j,k}(t); t \geq 0\}$. Hence, our stochastic model for each team $j$ in interval $i$ is based on a multiclass $G/G/C_{i,j}$ fixed-priority preemptive–resume queueing system.

Let $T_{i,j,k}$, $W_{i,j,k}$ and $R_{i,j,k}$ respectively denote the generic r.v. for the stationary sojourn time, waiting time and residence time of class $k$ requests served by team $j$ in interval $i$, where $T_{i,j,k} = W_{i,j,k} + R_{i,j,k}$. Define $T^{(1)}_{i,j,k}, T^{(2)}_{i,j,k}, \ldots$ and $W^{(1)}_{i,j,k}, W^{(2)}_{i,j,k}, \ldots$ to be the sequences of sojourn time and waiting time r.v.s for class $k$ requests served by team $j$ in interval $i$, respectively. The contractual SLAs can be defined for each request class $k$ as a function $\mathrm{SLA}_k(\cdot)$ of $T^{(n)}_{i,j,k}$ or $W^{(n)}_{i,j,k}$ or both. A corresponding contractual SL target $\mathrm{SLT}_k$ indicates the performance guarantee that is required as part of satisfying the SLA, whereas the corresponding contractual percentage attainment $\mathrm{PA}_k$ specifies the percentage of customer requests that are required to achieve the SL performance target; e.g., an SLA might require 95% of class $k$ requests to have sojourn times within 10 min.

The financial objectives of the SDC are based on a profit–performance model in which revenues are gained for serving requests and costs are incurred for both SLA violations and agent capacity levels. The SDC receives revenue as a function of $R_{j,k}$ for each class $k$ request that is served by agent team $j$, incurs a penalty as a function of $P_{j,k}$ for the $n$th class $k$ request served by agent team $j$ whenever the SLA function $\mathrm{SLA}_k(T^{(n)}_{i,j,k}, W^{(n)}_{i,j,k})$ exceeds the SL target $\mathrm{SLT}_k$ relative to the percentage of attainment $\mathrm{PA}_k$, and incurs a cost as a function of $C_{j,k}$ for the capacity $C_{i,j}$ of team $j$ in stationary interval $i$.
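
To make the profit–performance accounting above concrete, the following minimal Python sketch computes the per-interval profit of one agent team from simulated per-request sojourn times. The helper names (`sla_violation_penalty`, `team_profit`) and the specific way the percentage-attainment shortfall is converted into a per-request penalty are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sla_violation_penalty(sojourn_times, slt, pa, penalty_per_request):
    """Penalty charged when the fraction of requests meeting the target SLT
    falls short of the percentage attainment PA (linear per-request penalty assumed)."""
    n = len(sojourn_times)
    if n == 0:
        return 0.0
    attained = np.mean(np.asarray(sojourn_times) <= slt)
    if attained >= pa:
        return 0.0
    # charge the per-request penalty for each request beyond the allowed shortfall
    violating = int(np.ceil((pa - attained) * n))
    return penalty_per_request * violating

def team_profit(sojourn_by_class, revenue, penalty, slt, pa, cap_cost, capacity):
    """Profit for one agent team in one stationary interval:
    revenue per served request, minus SLA penalties, minus capacity cost."""
    profit = -cap_cost * capacity
    for k, times in sojourn_by_class.items():
        profit += revenue[k] * len(times)
        profit -= sla_violation_penalty(times, slt[k], pa[k], penalty[k])
    return profit
```

In a full evaluation this computation would simply be repeated over all teams $j$ and stationary intervals $i$ and summed over the time horizon.
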
2.3. Stochastic optimization formulation

The objective of our optimization problem formulation is to determine the capacity of agent teams $j$ and the routing of class $k$ requests to agent teams $j$ that maximize profit in expectation under the foregoing stochastic performance model across all stationary intervals $i$ comprising the time horizon together with the overlap of work-shifts, subject to model inputs and constraints. Namely, we seek to determine both the optimal capacity $C^{*}_{i,j}$ for teams $j$ in intervals $i$ and the optimal routing decision process $\Lambda^{*}_{i,j,k}(\cdot)$ for class $k$ requests and teams $j$ in intervals $i$ over the time horizon $T$ with respect to every SLA, given class-team routing constraints and arrival and service time processes $A_{i,k}(\cdot)$ and $S_{i,j,k}(\cdot)$. The optimal class-team routing decision vector process $\{\mathbf{\Lambda}^{*}(t); t \geq 0\}$ with $\mathbf{\Lambda}^{*}(t) := (\Lambda^{*}_{i,j,k}(t))$ includes determining the set of stochastic arrival processes $A^{*}_{i,j,k}(t)$ with finite rates $\Lambda^{*}_{i,j,k} = \lambda^{*}_{i,j,k}$ for all $i \in I$, $j \in J$, $k \in K$. To simplify the formulation, suppose the optimal routing of class $k$ requests satisfies $\lambda_{i,j,k}/(C_{i,j}\,\mu_{i,j,k}) < 1$. For expository convenience, we shall consider the SLAs of class $k$ to be such that penalties are applied according to the indicator function $1\{\mathrm{SLA}_k(T^{(n)}_{i,j,k}, W^{(n)}_{i,j,k}) > \mathrm{SLT}_k, \mathrm{PA}_k\}$, which takes on the value 1 when there is an SLA violation and the value 0 otherwise. In accordance with the SDCs motivating our study, the revenue, penalty, and cost functions are linear in the total number of requests, number of SLA violations, and number of agents.

Define the capacity vector $\mathbf{C} := (C_{i,j})$ and the class-team routing rate vector $\mathbf{\Lambda} := (\Lambda_{i,j,k})$. We then have the following general formulation of our stochastic optimization problem (OPT-SDC) over the time horizon $T$:
$$\max_{\mathbf{C},\,\mathbf{\Lambda}(\cdot)} \ \mathbb{E}\left[\sum_{i \in I}\sum_{j \in J}\sum_{k \in K}\left(\sum_{n=1}^{N_{i,j,k}(T)}\Big(R_{j,k} - P_{j,k}\cdot 1\{\mathrm{SLA}_k(T^{(n)}_{i,j,k}, W^{(n)}_{i,j,k}) > \mathrm{SLT}_k, \mathrm{PA}_k\}\Big) - C_{j,k}\,C_{i,j}\right)\right]$$
$$\text{s.t.}\quad \sum_{j \in J}\Lambda_{i,j,k} = \lambda_{i,k}, \quad \forall i \in I,\ \forall k \in K,$$
$$\Lambda_{i,j,k} = 0, \quad \text{if } I(j,k) = 0,\ \forall i \in I,\ \forall j \in J,\ \forall k \in K,$$
$$\Lambda_{i,j,k} \geq 0, \quad \text{if } I(j,k) = 1,\ \forall i \in I,\ \forall j \in J,\ \forall k \in K,$$

where $N_{i,j,k}(t)$ is the cumulative number of class $k$ requests routed to team $j$ in interval $i$ through time $t$, and $I(j,k) := 1\{\text{class } k \text{ requests can be served by agent team } j\}$. The capacity vector $\mathbf{C}$ and routing vector process $\mathbf{\Lambda}(\cdot)$ are the decision variables we seek to obtain, with all other variables as input parameters.

3. Two-phase stochastic optimization approach

We now present our general two-phase solution approach for efficiently and effectively solving stochastic optimization problems that arise in the capacity management and planning of SDC environments. The first phase of our approach provides an accurate approximate solution, which requires further refinement that is obtained by exploiting the first-phase results as a starting point for the second phase of our solution approach consisting of a simulation-based optimization methodology. Our first-phase methodology also enables one to efficiently explore the entire design space, and then our second-phase methodology can be leveraged in a more surgical manner to obtain optimal solutions for the regions of the financial-performance space that have the greatest importance or sensitivity. The two phases of our general solution approach are presented in turn.

3.1. First-phase mathematical analysis and optimization

Our first-phase solution consists of deriving a stochastic analysis of a relaxation of the performance model of Section 2.2 and then deriving an efficient solution of the corresponding stochastic optimization problem. We present each of these aspects of the first phase of our solution approach, after providing some technical preliminaries. Our stochastic analysis methodology includes derivations of large deviations, strong approximations and other results for key performance measures. Our optimization analysis methodology involves derivations of solutions based on a combination of mathematical programming algorithms and the results from our stochastic analysis.

3.1.1. Preliminaries

Let $D^{|K|}$ be the space of $|K|$-dimensional real-valued functions on $[0,\infty)$ that are right-continuous with left limits. For any two stochastic processes $\mathbf{X}, \bar{\mathbf{X}} \in D^{|K|}$, we write $\mathbf{X}(t) \overset{\ell i \ell}{\approx} \bar{\mathbf{X}}(t)$, or equivalently, $\mathbf{X} \overset{\ell i \ell}{\approx} \bar{\mathbf{X}}$, if as $T \to \infty$
$$\|\mathbf{X} - \bar{\mathbf{X}}\|_{T} \overset{a.s.}{=} O\big(\sqrt{T \log\log T}\big).$$
A stochastic process $\mathbf{X}$ is said to have a functional law of the iterated logarithm approximation $\bar{\mathbf{X}}(t) = ct$ for some constant $c \in \mathbb{R}^{|K|}_{+}$, if $\mathbf{X}(t) \overset{\ell i \ell}{\approx} ct$. A function $X \in D^{|K|}$ is defined to be $r$-strong continuous for some $r \in (2,4)$ if
$$\sup_{0 \leq s,t \leq T,\ |s-t| \leq \sqrt{T \log\log T}} \|X(s) - X(t)\| = o(T^{1/r}), \quad \text{as } T \to \infty.$$
Similarly, a stochastic process $\mathbf{X} = \{\mathbf{X}(t); t \geq 0\}$ in $D^{|K|}$ is defined to be an $r$-strong continuous process for some $r \in (2,4)$ if, with probability one, the sample path of this version is $r$-strong continuous on a probability space. A stochastic process $\mathbf{X}$ has a strong approximation if for some $r \in (2,4)$ there exists a probability space on which a version of $\mathbf{X}$ and an $r$-strong continuous stochastic process $\hat{\mathbf{X}}$ are defined such that
$$\sup_{0 \leq t \leq T} |\mathbf{X}(t) - ct - \hat{\mathbf{X}}(t)| \overset{a.s.}{=} o(T^{1/r}),$$
where $c$ is a constant. In this case, the stochastic process $\mathbf{X}$ is said to have a strong approximation $\tilde{\mathbf{X}} = \{\tilde{\mathbf{X}}(t); t \geq 0\}$ with $\tilde{\mathbf{X}}(t) = ct + \hat{\mathbf{X}}(t)$. For any two stochastic processes $\mathbf{X}, \tilde{\mathbf{X}} \in D^{|K|}$, if for some $r \in (2,4)$
$$\sup_{0 \leq t \leq T} |\mathbf{X}(t) - \tilde{\mathbf{X}}(t)| \overset{a.s.}{=} o(T^{1/r}),$$
then we write $\mathbf{X}(t) \overset{r}{\approx} \tilde{\mathbf{X}}(t)$, or equivalently, $\mathbf{X} \overset{r}{\approx} \tilde{\mathbf{X}}$.

A detailed analysis of data from the real-world SDCs motivating our study demonstrates that the times required for agent team $j$ to serve class $k$ requests in interval $i$ are independent and identically distributed (i.i.d.) with more than the first two moments finite. This detailed data analysis similarly demonstrates that the interarrival times for class $k$ requests in interval $i$ are i.i.d. with more than the first two moments finite, and that the interarrival time sequence and service time sequence are mutually independent. Define the generic interarrival time r.v. $a_{i,k}$ (respectively, $a_{i,j,k}$) and the generic service time r.v. $s_{i,j,k}$ such that $a_{i,k} \overset{d}{=} a^{(n)}_{i,k}$ (respectively, $a_{i,j,k} \overset{d}{=} a^{(n)}_{i,j,k}$) and $s_{i,j,k} \overset{d}{=} s^{(n)}_{i,j,k}$, where $\lambda_{i,k} = 1/\mathbb{E}[a_{i,k}]$ (respectively, $\lambda_{i,j,k} = 1/\mathbb{E}[a_{i,j,k}]$) and $\mu_{i,j,k} = 1/\mathbb{E}[s_{i,j,k}]$. The exogenous arrival process $A_{i,k}(t)$, the service counting process $S_{i,j,k}(t)$ and the cumulative service requirement (partial-sum) process $\mathcal{S}_{i,j,k}(t)$ can then be defined on an appropriate probability space such that for some $r \in (2,4)$
$$A_{i,k}(t) \overset{r}{\approx} \tilde{A}_{i,k}(t) := \lambda_{i,k}\, t + \hat{A}_{i,k}(t), \qquad (1)$$
$$S_{i,j,k}(t) \overset{r}{\approx} \tilde{S}_{i,j,k}(t) := \mu_{i,j,k}\, t + \hat{S}_{i,j,k}(t), \qquad (2)$$
$$\mathcal{S}_{i,j,k}(t) \overset{r}{\approx} \tilde{\mathcal{S}}_{i,j,k}(t) := m_{i,j,k}\, t + \hat{\mathcal{S}}_{i,j,k}(t), \qquad (3)$$
where $\lambda_{i,k} \geq 0$, $\mu_{i,j,k} > 0$, $m_{i,j,k} = 1/\mu_{i,j,k}$, and where $\hat{A}_{i,k}(t)$, $\hat{S}_{i,j,k}(t)$ and $\hat{\mathcal{S}}_{i,j,k}(t)$ are $r$-strong continuous and given by
$$\hat{A}_{i,k}(t) = \lambda^{1/2}_{i,k}\, V^{a}_{i,k}\, B^{a}_{i,k}(t), \qquad (4)$$
$$\hat{S}_{i,j,k}(t) = \mu^{1/2}_{i,j,k}\, V^{s}_{i,j,k}\, B^{s}_{i,j,k}(t), \qquad (5)$$
$$\hat{\mathcal{S}}_{i,j,k}(t) = m_{i,j,k}\, V^{s}_{i,j,k}\, B^{s}_{i,j,k}(t), \qquad (6)$$
with $V^{a}_{i,k}$ denoting the coefficient of variation of $a_{i,k}$, $V^{s}_{i,j,k}$ denoting the coefficient of variation of $s_{i,j,k}$, and $B^{a}_{i,k}(t)$ and $B^{s}_{i,j,k}(t)$ denoting mutually independent standard Brownian motions. These strong approximations with Eqs. (1)–(6) follow directly from well-known functional strong approximation results [11] together with the i.i.d. interarrival time sequence, the i.i.d. service time sequence, and the mutual independence of these sequences.

The definitions of performance measures of interest include the sojourn time process $T_{i,j,k} = \{T_{i,j,k}(t); t \geq 0\}$, the aggregated workload process $Z_{i,j,k} = \{Z_{i,j,k}(t); t \geq 0\}$ and the cumulative idle time process $Y_{i,j,k} = \{Y_{i,j,k}(t); t \geq 0\}$, where $T_{i,j,k}(t)$ denotes the total sojourn time of class $k$ requests at the team $j$ queueing system in interval $i$ at time $t$, $Z_{i,j,k}(t)$ denotes the total amount of existing work at the team $j$ queueing system in interval $i$ comprised of requests in classes 1 through $k$ that are either in queue or in service at time $t$, and $Y_{i,j,k}(t)$ denotes the cumulative amount of time that the team $j$ queueing system in interval $i$ does not serve requests in classes 1 through $k$ during $[0,t]$. Define the net-put process $N_{i,j,k}(t)$ to be the total workload input from request classes 1 through $k$ at the team $j$ queueing system in interval $i$ during $[0,t]$ minus the work that would have been completely served if the queueing system was never idle, $U_{i,j,k}(t)$ to be the total amount of time spent serving class $k$ requests at the team $j$ queueing system in interval $i$ during $[0,t]$, $V_{i,j,k}(t)$ to be the time that a class $k$ request would spend at the team $j$ queueing system in interval $i$ if it arrived at time $t$, and $G_{i,j,k}(t)$ to be the time at which the first class $k$ request arrives to the team $j$ queueing system in interval $i$ during $[t,\infty)$. Let $M_{X}(\vartheta) = \mathbb{E}[e^{\vartheta X}]$ denote the moment generating function of a r.v. $X$, and let $f(n) \sim g(n)$ denote that $\lim_{n\to\infty} f(n)/g(n) = 1$ for any functions $f$ and $g$. Further define $\rho^{+}_{i,j,k} := \sum_{k' \leq k,\, k' \in K} \rho_{i,j,k'}$, $\rho_{i,j,k} := \lambda_{i,j,k}/(C_{i,j}\,\mu_{i,j,k})$, and $(x)^{+} := \max\{x, 0\}$.
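
The strong approximations (1) and (4) can be sampled directly by driving them with a simulated Brownian path. The sketch below is a minimal illustration; the grid size, horizon and unit coefficient of variation are arbitrary choices for the example, and the rate 1/5.96 per minute corresponds to the alert class in the first daily interval of Table 1.

```python
import numpy as np

def brownian_path(T, n_steps, rng):
    """Standard Brownian motion sampled on an equally spaced grid over [0, T]."""
    dt = T / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return np.concatenate(([0.0], np.cumsum(increments)))

def arrival_strong_approximation(lam, cv_a, T, n_steps, rng):
    """A-tilde(t) = lam * t + lam^(1/2) * V^a * B(t), the approximation in (1) and (4)."""
    t = np.linspace(0.0, T, n_steps + 1)
    B = brownian_path(T, n_steps, rng)
    return t, lam * t + np.sqrt(lam) * cv_a * B

rng = np.random.default_rng(0)
t, A_tilde = arrival_strong_approximation(lam=1 / 5.96, cv_a=1.0, T=480.0, n_steps=4800, rng=rng)
```
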

3.1.2. Stochastic performance analysis

The stochastic analysis of our first-phase solution involves relaxations of certain aspects of the original stochastic performance model and derivations of results for key performance measures. One of our model relaxations concerns the capacity $C_{i,j}$ of the queueing system for each team $j$ and interval $i$. Since agents typically split their time between serving customer requests and other business tasks they must perform, the SDLs have many different staffing options for realizing any given capacity $C_{i,j}$. We therefore consider a particular relaxation of agent team capacity where $C_{i,j} \in \mathbb{R}_{+}$ is interpreted to be a capacity scaling variable for the processing rate of a corresponding multiclass $GI/GI/1$ fixed-priority queueing system for each team $j$ and interval $i$. The sequence of r.v.s for the service times of class $k$ requests by team $j$ in interval $i$ is then given by $s^{(1)}_{i,j,k}/C_{i,j},\, s^{(2)}_{i,j,k}/C_{i,j},\, \ldots$. It is well known that such a single-server approximation of a multiserver queueing system (single class) is exact in the limit of heavy traffic [12].

The next relaxation of our original stochastic model concerns the class-team routing decision process $\Lambda_{i,j,k}(\cdot)$ for each request class $k$, team $j$ and interval $i$. To simplify our first-phase stochastic analysis, we assume these routing decisions to be probabilistic such that a class $k$ request is routed to team $j$ with probability $P_{i,j,k} \in [0,1]$, independent of all else. We therefore have $\Lambda_{i,j,k} = P_{i,j,k}\,\lambda_{i,k}$, and thus determining the optimal routing decision variables reduces to determining the optimal routing probabilities $P^{*}_{i,j,k}$. This relaxation allows us to derive the optimal fraction $P^{*}_{i,j,k}$ of class $k$ requests that should be served by team $j$ in interval $i$, which then can be refined by our second-phase simulation-based optimization that takes into account all higher order properties of the optimal class-team routing decision process $\Lambda^{*}_{i,j,k}(\cdot)$, such as $A^{*}_{i,j,k}(t)$. From (1), we further have
$$A_{i,j,k}(t) \overset{r}{\approx} \tilde{A}_{i,j,k}(t) := \lambda_{i,j,k}\, t + \hat{A}_{i,j,k}(t). \qquad (7)$$

Now we turn to our results for key performance measures, initially considering large-deviations decay rates. To elucidate the exposition, we shall focus on quantities and SLAs related to the sojourn times $T_{i,j,k} \overset{d}{=} T^{(n)}_{i,j,k}$, noting that the waiting times $W_{i,j,k} \overset{d}{=} W^{(n)}_{i,j,k}$ can be addressed in an analogous manner. The representation of $1\{\mathrm{SLA}_k(T_{i,j,k}) > \mathrm{SLT}_k, \mathrm{PA}_k\}$ within our stochastic analysis of the first phase depends upon the details of the SLA. Whenever the SLA relates to the sojourn time tail distribution for class $k$ requests served by team $j$, we seek to derive a stochastic analysis of $\mathbb{P}[\mathrm{SLA}_k(T_{i,j,k}) > \mathrm{SLT}_k]$. For $\mathrm{SLT}_k$ sufficiently large, we can take a large-deviations approach. To this end, first observe that the large-deviations decay rate of the sojourn time distribution for class 1 requests served at the fixed-priority queueing system of any team in any stationary interval is the same as the large-deviations decay rate of the sojourn time distribution in the corresponding single-class FCFS queue. This allows us to exploit well-known large-deviations decay rate results to obtain the desired tail asymptotic result for class 1 requests.

Theorem 3.1. For class 1 requests served at the $GI/GI/1$ fixed-priority queueing system of agent team $j \in J$ with capacity $C_{i,j}$ in stationary interval $i \in I$, we have
$$\log \mathbb{P}[T_{i,j,1} > z] \sim -\hat{\Upsilon}^{*} z, \quad \text{as } z \to \infty,$$
where
$$\hat{\Upsilon}^{*} = \sup\{\theta : M_{A_{i,j,1}}(-\theta)\, M_{S_{i,j,1}}(\theta) \leq 1\}. \qquad (8)$$

The tail asymptotic result for class 2 requests then can be obtained as in [13].

Theorem 3.2. For class 2 requests served at the $GI/GI/1$ fixed-priority queueing system of agent team $j \in J$ with capacity $C_{i,j}$ in stationary interval $i \in I$, we have
$$\log \mathbb{P}[T_{i,j,2} > z] \sim -\Upsilon^{*} z, \quad \text{as } z \to \infty, \qquad (9)$$
where
$$\Upsilon^{*} = \sup_{\theta \in [0,\hat{\Upsilon}^{*}]}\{\theta - \Upsilon(\theta)\}, \qquad \Upsilon(\theta) = -M^{-1}_{A_{i,j,1}}\big(1/M_{S_{i,j,1}}(\theta)\big), \qquad (10)$$
and $\hat{\Upsilon}^{*}$ is as given in (8).

Following [14], we now extend the above results to obtain the corresponding asymptotic tail behavior of the sojourn time distributions for all request classes at the queueing system of each team in any stationary interval.

Theorem 3.3. For class $k \in K$ requests served at the $GI/GI/1$ fixed-priority queueing system of agent team $j \in J$ with capacity $C_{i,j}$ in stationary interval $i \in I$, we have as $z \to \infty$
$$\log \mathbb{P}[T_{i,j,k} > z] \sim -\Upsilon^{*}_{i,j,k}\, z, \qquad (11)$$
where
$$\Upsilon^{*}_{i,j,1} = \sup\{\theta : M_{A_{i,j,1}}(-\theta)\, M_{S_{i,j,1}}(\theta) \leq 1\}, \qquad (12)$$
$$\Upsilon^{*}_{i,j,k} = \sup_{\theta \in [0,\hat{\Upsilon}^{*}_{i,j,k}]}\{\theta - \Upsilon_{i,j,k}(\theta)\}, \quad k = 2,\ldots,|K|, \qquad (13)$$
$$\Upsilon_{i,j,k}(\theta) = -M^{-1}_{\hat{A}_{i,j,k-1}}\big(1/M_{\hat{S}_{i,j,k-1}}(\theta)\big), \quad k = 2,\ldots,|K|, \qquad (14)$$
$$\hat{\Upsilon}^{*}_{i,j,k} = \sup\{\theta : M_{\hat{A}_{i,j,k-1}}(-\theta)\, M_{\hat{S}_{i,j,k-1}}(\theta) \leq 1\}, \quad k = 2,\ldots,|K|, \qquad (15)$$
and $\hat{A}_{i,j,k}$ and $\hat{S}_{i,j,k}$ are generic r.v.s for the aggregated interarrival and service times, respectively, of request classes $1, 2, \ldots, k \in K$.

Proof. The results for classes 1 and 2 follow immediately from Theorems 3.1 and 3.2, respectively. To obtain the corresponding results for classes $k = 3, \ldots, |K|$, we consider the aggregation of classes $1, \ldots, k-1$ into a new (single) higher priority class with class $k$ as a new lower priority class. Due to the fixed-priority policy, the asymptotic decay rate of the sojourn time tail distribution for class $k$ does not depend upon any lower priority class $k' = k+1, \ldots, |K|$. Hence, upon applying Theorem 3.2 to the system with the aggregated higher priority class substituted for class 1 and the lower priority class $k$ substituted for class 2, we obtain (11), (13), (14), (15) from (8)–(10). □
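
As a quick numerical illustration of Theorem 3.1, the decay rate in (8) can be computed by bisection on $\theta$ whenever the moment generating functions are available in closed form. The sketch below assumes exponential interarrival and service times, for which the decay rate reduces to the classical value $\mu - \lambda$; the function names and bisection tolerance are illustrative choices. In principle the same routine applies to (12) and, with the aggregated r.v.s of Theorem 3.3, to (15).

```python
def decay_rate(mgf_interarrival, mgf_service, theta_max, tol=1e-9):
    """Largest theta with M_a(-theta) * M_s(theta) <= 1, computed by bisection,
    as in (8) and (12); theta_max must be a point where the product exceeds 1
    (or where the service-time MGF blows up)."""
    def ok(theta):
        try:
            return mgf_interarrival(-theta) * mgf_service(theta) <= 1.0
        except (OverflowError, ZeroDivisionError):
            return False
    lo, hi = 0.0, theta_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ok(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Exponential interarrival (rate lam) and service (rate mu): the decay rate is mu - lam.
lam, mu = 0.8, 1.0
mgf_a = lambda x: lam / (lam - x)   # E[exp(x * a)], valid for x < lam
mgf_s = lambda x: mu / (mu - x)     # E[exp(x * s)], valid for x < mu
print(decay_rate(mgf_a, mgf_s, theta_max=mu - 1e-9))   # ~0.2 = mu - lam
```
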

We next turn to consider key performance measures when the SLA relates to the sojourn time tail distribution for class $k$ requests and $\mathrm{SLT}_k$ is not sufficiently large for a large-deviations approach, or relates to the first or second moment of class $k$ sojourn times. Our stochastic analysis in these model instances is based on deriving strong approximations of the class $k$ sojourn time processes for the stochastic model relaxation. From the definitions in Section 3.1.1, we have for key performance measures of interest related to the dynamics of class $k \in K$ requests served at the team $j \in J$ queueing system in stationary interval $i \in I$:
$$Z_{i,j,k}(t) = N_{i,j,k}(t) + Y_{i,j,k}(t), \qquad N_{i,j,k}(t) = \sum_{k'=1}^{k} \mathcal{S}_{i,j,k'}\big(A_{i,j,k'}(t)\big) - t, \qquad T_{i,j,k}(t) = V_{i,j,k}\big(G_{i,j,k}(t)\big), \qquad (16)$$
$$Y_{i,j,k}(t) = t - \sum_{k'=1}^{k} U_{i,j,k'}(t) = \sup_{0 \leq s \leq t}\{-N_{i,j,k}(s)\}, \qquad (17)$$
$$V_{i,j,k}(t) = Z_{i,j,k}(t) + \sum_{k'=1}^{k-1}\big[\mathcal{S}_{i,j,k'}\big(A_{i,j,k'}(V_{i,j,k}(t) + t)\big) - \mathcal{S}_{i,j,k'}\big(A_{i,j,k'}(t)\big)\big] + \mathcal{S}_{i,j,k}\big(A_{i,j,k}(t)\big) - \mathcal{S}_{i,j,k}\big(A_{i,j,k}(t) - 1\big), \qquad (18)$$
$$0 \leq G_{i,j,k}(t) - t \leq a^{(A_{i,j,k}(t)+1)}_{i,j,k},$$
where the last equality in (17) follows from the one-dimensional reflection mapping theorem [15]. We now can establish the desired strong approximations for the key performance measures of the sojourn time and aggregated workload processes.

Theorem 3.4. For class $k \in K$ requests served at the $GI/GI/1$ fixed-priority queueing system of agent team $j \in J$ with capacity $C_{i,j}$ in stationary interval $i \in I$ such that the strong approximation assumptions (1)–(7) hold for some $r \in (2,4)$, we have
$$(Z_{i,j,k}, T_{i,j,k}) \overset{r}{\approx} (\tilde{Z}_{i,j,k}, \tilde{T}_{i,j,k}),$$
where
$$\tilde{Z}_{i,j,k}(t) = \tilde{N}_{i,j,k}(t) + \tilde{Y}_{i,j,k}(t), \qquad (19)$$
$$\tilde{T}_{i,j,k}(t) = \frac{\tilde{Z}_{i,j,k}(t) + R_{i,j,k}}{1 - \rho^{+}_{i,j,k-1}} \;\geq\; \frac{\tilde{Z}_{i,j,k}(t) + s_{i,j,k}}{1 - \rho^{+}_{i,j,k-1}}, \qquad (20)$$
$$\tilde{N}_{i,j,k}(t) = (\rho^{+}_{i,j,k} - 1)\, t + \sum_{k'=1}^{k}\big[\hat{A}_{i,j,k'}(t)/\mu_{i,j,k'} + \hat{\mathcal{S}}_{i,j,k'}(\lambda_{i,j,k'}\, t)\big], \qquad (21)$$
$$\tilde{Y}_{i,j,k}(t) = \sup_{0 \leq s \leq t}\{-\tilde{N}_{i,j,k}(s)\}^{+},$$
and $\tilde{Z}_{i,j,k}(t)$, $\tilde{T}_{i,j,k}(t)$ are $r$-strong continuous.

Proof. By Lemma 9.8 and Theorem 9.1 in [16], the strong approximation assumptions (1)–(7) imply the functional law of the iterated logarithm approximations for $A_{i,j,k}(t)$, $\mathcal{S}_{i,j,k}(t)$, $Z_{i,j,k}(t)$, $T_{i,j,k}(t)$, $Y_{i,j,k}(t)$ and $V_{i,j,k}(t)$. Following [17], $N_{i,j,k}(t)$ is shown to have a strong approximation by rewriting the net-put process and deriving
$$N_{i,j,k}(t) \overset{r}{\approx} \sum_{k'=1}^{k}\big[(\lambda_{i,j,k'}/\mu_{i,j,k'})\, t + \hat{A}_{i,j,k'}(t)/\mu_{i,j,k'} + \hat{\mathcal{S}}_{i,j,k'}(\lambda_{i,j,k'}\, t)\big] - t = \tilde{N}_{i,j,k}(t).$$
The strong approximation (19) for the aggregated workload process $Z_{i,j,k}$ then follows from Lemma 9.8 in [16], where Lemma 9.10 in [16] renders $\tilde{Z}_{i,j,k}$ to be $r$-strong continuous. Similarly, from Theorem 9.1 in [16], we can rewrite (18) and derive
$$V_{i,j,k}(t) \overset{r}{\approx} \tilde{Z}_{i,j,k}(t) + \sum_{k'=1}^{k-1}\Big\{\big[(\lambda_{i,j,k'}/\mu_{i,j,k'})(V_{i,j,k}(t) + t) + \hat{A}_{i,j,k'}(t)/\mu_{i,j,k'} + \hat{\mathcal{S}}_{i,j,k'}(\lambda_{i,j,k'}\, t)\big] - \big[(\lambda_{i,j,k'}/\mu_{i,j,k'})\, t + \hat{A}_{i,j,k'}(t)/\mu_{i,j,k'} + \hat{\mathcal{S}}_{i,j,k'}(\lambda_{i,j,k'}\, t)\big]\Big\} + R_{i,j,k} = \tilde{Z}_{i,j,k}(t) + \rho^{+}_{i,j,k-1}\, V_{i,j,k}(t) + R_{i,j,k},$$
and therefore
$$V_{i,j,k}(t) \overset{r}{\approx} \frac{\tilde{Z}_{i,j,k}(t) + R_{i,j,k}}{1 - \rho^{+}_{i,j,k-1}}. \qquad (22)$$
The strong approximation (20) for the sojourn time process $T_{i,j,k}$ then follows from (16), (22), Lemma 9.8 in [16] and $G_{i,j,k}(t) \overset{r}{\approx} t$, where the latter is implied by (7) and standard strong approximation arguments; the right-hand side of the inequality in (20) follows as an alternative strong approximation where the residence time r.v. $R_{i,j,k}$ is approximated by the service time r.v. $s_{i,j,k}$, which is less than or equal to $R_{i,j,k}$. Finally, $\tilde{T}_{i,j,k}$ being $r$-strong continuous follows from $\tilde{Z}_{i,j,k}$ being $r$-strong continuous. □

We then leverage the results of Theorem 3.4, particularly the strong approximation of the sojourn time process in (20), to directly obtain the first moment, second moment, or tail probability associated with the sojourn time distribution for class $k$ requests served by team $j$ in interval $i$. As an alternative to directly using (20) for $\mathbb{P}[\mathrm{SLA}_k(T_{i,j,k}) > \mathrm{SLT}_k]$, we also consider an exponential form for the sojourn time tail distribution of class $k$ requests served by team $j$ in interval $i$. More precisely, we assume there exist variables $\alpha_{i,j,k}$ and $\beta_{i,j,k}$ such that
$$\mathbb{P}[T_{i,j,k} > \mathrm{SLT}_k] \simeq \alpha_{i,j,k}\, e^{-\beta_{i,j,k}\,\mathrm{SLT}_k}, \qquad (23)$$
which can be justified by various results in the queueing literature, including the exponential sojourn time distribution in single-class $M/M/1$ queues [18] and the heavy-traffic approximation of an exponential waiting time distribution in a $GI/G/1$ queue [19]. While different schemes can be envisioned for choosing the variables $\alpha_{i,j,k}$ and $\beta_{i,j,k}$, we consider fitting these two variables with the first two moments of the sojourn time distribution. Specifically, it follows from (23) that
$$\mathbb{E}[T_{i,j,k}] = \frac{\alpha_{i,j,k}}{\beta_{i,j,k}}, \qquad \mathbb{E}[T^{2}_{i,j,k}] = \frac{2\alpha_{i,j,k}}{\beta^{2}_{i,j,k}}, \qquad \beta_{i,j,k} = \frac{2\,\mathbb{E}[T_{i,j,k}]}{\mathbb{E}[T^{2}_{i,j,k}]}, \qquad \alpha_{i,j,k} = \frac{2\,\mathbb{E}[T_{i,j,k}]^{2}}{\mathbb{E}[T^{2}_{i,j,k}]}, \qquad (24)$$

where $\mathbb{E}[T_{i,j,k}]$ and $\mathbb{E}[T^{2}_{i,j,k}]$ are directly obtained from (20). We find this approach to be particularly beneficial when the direct use of (20) for the sojourn time tail probability requires, for example, numerical computation which may not effectively support optimization within our first-phase methodology, whose primary goal is to efficiently obtain sufficiently accurate approximations and nearly optimal solutions as input to the second phase of our solution approach.

3.1.3. Stochastic optimization analysis

In this section we consider the optimization analysis of our first-phase approach. This consists of deriving an efficient solution of the stochastic optimization problem (OPT-SDC) in Section 2.3, based on a combination of mathematical programming methods and the results of our stochastic analysis in the previous subsection. To begin, we first determine the minimum capacities required to ensure that each of the agent team queueing systems is stable. This must hold for any feasible solution of the stochastic optimization problem (OPT-SDC) and therefore requires that $\rho^{+}_{i,j,|K|} < 1$ for all $i \in I$ and $j \in J$.

Our previous model relaxations allow us to focus here on the capacity vector $\mathbf{C} \in \mathbb{R}^{|I| \times |J|}_{+}$ and the class-team routing rate vector $\mathbf{\Lambda} \in \mathbb{R}^{|I| \times |J| \times |K|}_{+}$ as the decision variables of interest, instead of the capacity vector $\mathbf{C} \in \mathbb{Z}^{|I| \times |J|}_{+}$ and the class-team routing decision vector process $\{\mathbf{\Lambda}(t); t \geq 0\}$ on $\mathbb{R}^{|I| \times |J| \times |K|}_{+}$ of the original stochastic optimization problem (OPT-SDC). Since SLAs based on sojourn time tail distributions represent the most common cases encountered in the SDCs motivating our study, we consider SLAs for class $k$ requests served by team $j$ in interval $i$ such that the SDL does not incur any penalties as long as $\mathbb{P}[T_{i,j,k} \leq \mathrm{SLT}_k] \geq \mathrm{PA}_k$, or equivalently, monetary penalties will be applied whenever $\mathbb{P}[T_{i,j,k} > \mathrm{SLT}_k] > 1 - \mathrm{PA}_k := \gamma_k$. It is important to note, however, that SLAs based on the first or second moments of the sojourn time distribution can be readily addressed within our general solution approach upon substituting the corresponding results from our stochastic analysis in an analogous manner. Recall that the revenue, penalty and cost functions are linear with rates $R_{j,k}$, $P_{j,k}$ and $C_{j,k}$, respectively. We therefore have the following general formulation of the stochastic optimization problem under our stochastic performance model relaxation (OPT-SDCR) over the time horizon $T$:
$$\max_{\mathbf{C},\,\mathbf{\Lambda}} \ \sum_{i \in I}\sum_{j \in J}\sum_{k \in K}\Big(\mathbb{E}[N_{i,j,k}(T)]\,R_{j,k} - \mathbb{E}[N_{i,j,k}(T)]\,P_{j,k}\,\big(\mathbb{P}[T_{i,j,k} > \mathrm{SLT}_k] - \gamma_k\big)^{+} - C_{j,k}\,C_{i,j}\Big) \qquad (25)$$
$$\text{s.t.}\quad \sum_{j \in J}\Lambda_{i,j,k} = \lambda_{i,k}, \quad \forall i \in I,\ \forall k \in K, \qquad (26)$$
$$\Lambda_{i,j,k} = 0, \quad \text{if } I(j,k) = 0,\ \forall i \in I,\ \forall j \in J,\ \forall k \in K, \qquad (27)$$
$$\Lambda_{i,j,k} \geq 0, \quad \text{if } I(j,k) = 1,\ \forall i \in I,\ \forall j \in J,\ \forall k \in K. \qquad (28)$$
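
One way to evaluate a penalty term $\mathbb{E}[N_{i,j,k}(T)]\,P_{j,k}\,(\mathbb{P}[T_{i,j,k} > \mathrm{SLT}_k] - \gamma_k)^{+}$ of (25) is through the moment-matched exponential tail of (23)–(24). The sketch below is a minimal illustration under that route; the numerical inputs are illustrative, and capping the fitted tail at 1 is a safeguard we add, not part of the paper's formulation.

```python
import numpy as np

def exp_tail_fit(mean_T, second_moment_T):
    """Fit alpha, beta of P[T > x] ~ alpha * exp(-beta * x) by moment matching, as in (24)."""
    beta = 2.0 * mean_T / second_moment_T
    alpha = 2.0 * mean_T**2 / second_moment_T
    return alpha, beta

def expected_penalty_term(n_requests, penalty_rate, mean_T, second_moment_T, slt, gamma):
    """One penalty term of objective (25): E[N] * P * (P[T > SLT] - gamma)^+,
    with the tail probability evaluated from the fitted form (23)."""
    alpha, beta = exp_tail_fit(mean_T, second_moment_T)
    tail = min(1.0, alpha * np.exp(-beta * slt))
    return n_requests * penalty_rate * max(tail - gamma, 0.0)

# Example: E[T] = 30 min, E[T^2] = 2200, SLT = 90 min, gamma = 1 - 0.95
print(expected_penalty_term(500, 2.0, 30.0, 2200.0, 90.0, 0.05))   # roughly 20
```
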

The next step of our general solution approach is to deal with the probabilities $\mathbb{P}[T_{i,j,k} > \mathrm{SLT}_k]$ in (25) based on the stochastic analysis of the previous subsection. In particular, when the value of $\mathrm{SLT}_k$ is relatively small, we either substitute (20) directly
or substitute (23), (24) and the first and second moments of (20) into the optimization formulation. When the value of $\mathrm{SLT}_k$ is sufficiently large, we instead substitute (11)–(15) in an analogous manner.

The best mathematical programming methods to solve the stochastic optimization problem (OPT-SDCR) will depend upon the fundamental properties of the specific problem instance. When the objective and constraint functions are concave in $(\mathbf{C}, \mathbf{\Lambda})$, we exploit these and related properties of (25)–(28) together with convex programming methods [20] to efficiently obtain the unique optimal solution of (OPT-SDCR). When the objective or constraint functions are not concave, we exploit advanced nonlinear programming methods to efficiently obtain a (locally) optimal solution of (OPT-SDCR) by leveraging some of the best interior-point algorithms [21] and implementations [22]. We further note that this mathematical programming solution approach applies more generally to problem instances with nonlinear revenue, penalty and cost functions.

In some formulations encountered within the motivating SDCs, the capacity levels $\mathbf{C}$ are given and the stochastic optimization problem reduces to determining the optimal routing rate vector $\mathbf{\Lambda}$ that maximizes expected profit. Such formulations fall within the general class of resource allocation problems (RAPs) and various algorithms exist to solve these problems [23,24], where we can exploit the strict priority ordering and preemptive–resume scheduling to isolate the per-class queues for every team and obtain a separable RAP. More precisely, we consider class 1 in isolation across all team $j$ queueing systems in each interval $i$ because, under a fixed-priority preemptive–resume scheduling policy, lower priority classes do not interfere with the service of class 1 requests. For every interval $i \in I$, with $k$ set to 1 and $C_{i,j}$ given, the objective function (25) is now replaced by
$$\max_{\mathbf{\Lambda}_{i,k}} \ \sum_{j \in J}\Big(\mathbb{E}[N_{i,j,k}(T)]\,R_{j,k} - \mathbb{E}[N_{i,j,k}(T)]\,P_{j,k}\,\big(\mathbb{P}[T_{i,j,k} > \mathrm{SLT}_k] - \gamma_k\big)^{+} - C_{j,k}\,C_{i,j}\Big), \qquad (29)$$
where $\mathbf{\Lambda}_{i,k} := (\Lambda_{i,j,k})$. We solve this separable RAP for class 1 and proceed in a recursive manner for subsequent request classes $k = 2, \ldots, |K|$. Upon solving the RAP for higher priority classes to determine $\mathbf{\Lambda}^{*}_{i,1}, \ldots, \mathbf{\Lambda}^{*}_{i,k-1}$ and fixing these variables accordingly, algorithmic approaches for separable RAPs are leveraged to efficiently obtain the optimal solution $\mathbf{\Lambda}^{*}_{i,k}$ of (OPT-SDCR) with objective function (29) for class $k$ requests in interval $i$. More generally, when $\mathbf{C}$ is also a decision variable, this general approach for RAPs within our first-phase methodology can be leveraged together with a gradient descent algorithm on $\mathbf{C}$.

When the objective function (29) is concave in $\mathbf{\Lambda}_{i,k}$, we exploit algorithms for separable and convex RAPs to efficiently obtain the unique optimal solution of the corresponding version of (OPT-SDCR) recursively for each class $k$ in stationary interval $i$. Specifically, starting with class $k = 1$, we seek to solve the separable and convex resource allocation problem (RAP-SDCR):
$$\max_{\mathbf{x}} \ \sum_{j \in \hat{J}} f_j(x_j) \qquad (30)$$
$$\text{s.t.}\quad \sum_{j \in \hat{J}} x_j = M, \qquad (31)$$
$$x_j \geq 0, \quad j \in \hat{J}, \qquad (32)$$
where the objective (30) corresponds to (29), the $f_j(\cdot)$ are increasing, concave and continuously differentiable over an interval including $[0, \lambda_{i,k}]$, the constraint (31) corresponds to (26) with $x_j = \Lambda_{i,j,k} \in \mathbb{R}_{+}$ and $M = \lambda_{i,k} > 0$ over the appropriate subset $\hat{J} := \{j : j \in J,\ I(j,k) = 1\}$ (i.e., a subset of $J$ such that all $j$ for which $I(j,k) = 0$ are removed), and the constraints (32) correspond to (28). Note that the constraints (27) are dropped from the separable, convex RAP because they simply fix the corresponding $x_j$. The value of $M = \lambda_{i,k}$ is an input that represents the amount of resource to be allocated.

The optimal solution of (RAP-SDCR) with $k = 1$ occurs at the place where the derivatives $f'_j(x_j)$ are equal and the resource allocation constraint in (31) holds, modulo the lower bound constraints in (32). More precisely, our solution algorithm consists of an outer bisection loop that determines the value of the derivative $\delta$ and a set of $|\hat{J}|$ inner bisection loops that find the value of $x_j \geq 0$ satisfying $f'_j(x_j) = \delta$ if $f'_j(0) \geq \delta$; otherwise, we set $x_j = 0$. The initial values for the outer bisection loop can be taken as the minimum of all values $f'_j(0)$ and the maximum of all values $f'_j(M)$; the initial values for the inner bisection loop of each $j \in \hat{J}$ can be taken to be 0 and $M$. Upon using this algorithm to determine the optimal solution $\mathbf{\Lambda}^{*}_{i,1}$ for each stationary interval $i$, we then recursively apply our algorithm to obtain the optimal solution $\mathbf{\Lambda}^{*}_{i,k}$ of the corresponding version of (RAP-SDCR) for each class $k = 2, \ldots, |K|$ and all stationary intervals $i \in I$.
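
A minimal sketch of the nested-bisection scheme just described for (RAP-SDCR) is given below. The bracket passed to the outer bisection and the toy objective $f_j(x) = w_j \log(1+x)$ are illustrative assumptions on our part; the routine only requires the derivatives $f'_j$, which in our setting would come from differentiating the per-team terms of (29) under the fitted tail of (23)–(24).

```python
import numpy as np

def solve_separable_rap(f_prime, M, lo_delta, hi_delta, tol=1e-8):
    """Nested-bisection sketch for (RAP-SDCR): outer bisection on the common derivative
    value delta, inner bisection finding x_j >= 0 with f_j'(x_j) = delta (or x_j = 0
    when f_j'(0) <= delta), until the allocations sum to M."""
    def x_of(delta):
        xs = []
        for fp in f_prime:
            if fp(0.0) <= delta:
                xs.append(0.0)
                continue
            a, b = 0.0, M                 # any optimal x_j lies in [0, M]
            while b - a > tol:            # f_j' is decreasing (f_j concave), so bisect
                m = 0.5 * (a + b)
                if fp(m) > delta:
                    a = m
                else:
                    b = m
            xs.append(0.5 * (a + b))
        return np.array(xs)
    lo, hi = lo_delta, hi_delta           # bracket chosen so that total(lo) >= M >= total(hi)
    while hi - lo > tol:
        delta = 0.5 * (lo + hi)
        if x_of(delta).sum() > M:
            lo = delta                     # allocations too large -> raise delta
        else:
            hi = delta
    return x_of(0.5 * (lo + hi))

# Toy example with f_j(x) = w_j * log(1 + x), so f_j'(x) = w_j / (1 + x).
w = np.array([1.0, 2.0, 4.0])
f_prime = [lambda x, wj=wj: wj / (1.0 + x) for wj in w]
x = solve_separable_rap(f_prime, M=3.0,
                        lo_delta=min(wj / (1.0 + 3.0) for wj in w), hi_delta=max(w))
print(x, x.sum())   # allocations concentrate on the teams with larger marginal value
```
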
3.2. Second-phase simulation-based optimization

The second phase of our solution approach consists of a simulation-based optimization methodology that exploits the results from our first-phase solution as a starting point. We use simulation-based optimization to capture all of the characteristics and complexities of real-world SDCs, some of which were only approximated in our first-phase solution, and to determine a high-quality optimal solution within the context of this detailed stochastic performance model. By leveraging our first-phase results as a starting point for the simulation-based optimization in the second phase of our general approach, we obtain optimal solutions with only a few simulation runs and significantly reduce the prohibitive costs of the state-of-the-art approaches currently deployed in SDCs.

As noted in the introduction, the dominant solution in real-world SDCs is based on the metaheuristic approach provided by nearly all major simulation software products that support optimization. In fact, the original capacity management capability for the real-world SDCs motivating our study was implemented using a metaheuristics-based approach. For these reasons, we focus here on a metaheuristic approach for our second-phase simulation-based optimization methodology, noting that any simulation-based optimization methodology can be exploited for the second phase of our general solution approach.

Using one such simulation product supporting the metaheuristic-based approach, we develop a stochastic discrete-event simulation that models the detailed complexities of the SDCs motivating our study. This simulation performance model provides an accurate representation of all aspects of real-world SDCs, including the time-varying workload (arrival and service) patterns for requests within each class, the detailed characteristics of how these requests are served by agents with different skills, and the detailed characteristics of various business decisions agreed upon with actual customers, such as work-shift schedules and contractual SLAs. Another particularly important feature in some of the motivating SDCs is that multiple agent teams may serve the same priority queue of waiting customer requests within a given class, which is directly reflected in our discrete-event simulation model and is in contrast to the stochastic model of Section 2. For any given set of agent capacity assignments across all work-shifts, the stochastic simulation model is used to determine financial, performance and other measures of interest over the entire time horizon $T$.

A stochastic simulation–optimization capability is developed on top of this discrete-event simulation model to identify an optimal solution among alternative options, which includes for each work-shift determining the agent team capacity and composition and determining the policy for routing requests to agent teams. This involves developing a control module that is used to identify the sequence of simulations to be run and the halting criteria. The combinatorial explosion of options renders full enumeration of all possible solutions completely infeasible in practice, and thus the simulation–optimization control module guides an intelligent search among simulation models and solutions to maximize the profit-based objective. The simulation–optimization approach we take within the simulation product is based on a combination of supported scatter search and tabu search metaheuristics; refer to [25,2] for additional details. Specifically, at each step of the iteration, an evaluation is made of the simulation results for the current solution together with a comparison against the sequence of previous solutions evaluated via simulation; the optimization control module then suggests another simulation scenario for evaluation. Scatter search is used to guide this step-by-step iteration, initialized with a diverse set of starting solutions that attempts to strike a balance between the diversity and quality of its solution elements.

In the next step of the iteration, the initial set of solutions is transformed in an effort to improve the quality or feasibility of the solution set. The input to this improvement step is a single solution and the output is a solution that attempts to enhance the quality or feasibility of the solution under consideration. Local search is one such approach that explores the neighborhood of the input solution for further improvements. If no improvement to the solution is observed, then the search terminates. Tabu search is used for the local (neighborhood) search of the control module.

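The control loop just described can be sketched at a high level as follows. This is a deliberately simplified greedy neighborhood search around the first-phase solution, not the scatter-search/tabu-search combination of the commercial product, and `simulate_profit` is a hypothetical stand-in for the detailed discrete-event simulation oracle.

```python
import random

def simulate_profit(capacities, seed=0):
    """Placeholder for the discrete-event simulation oracle of Section 3.2: given
    per-shift agent capacities, return a (noisy) estimate of profit over the horizon."""
    random.seed(hash((tuple(capacities), seed)))
    return sum(-abs(c - 7) for c in capacities) + random.gauss(0.0, 0.1)

def two_phase_search(first_phase_solution, n_iters=50):
    """Second-phase local search seeded with the (rounded) first-phase solution:
    evaluate neighboring capacity vectors by simulation, keep the best,
    and stop when no neighbor improves."""
    best = [int(round(c)) for c in first_phase_solution]
    best_val = simulate_profit(best)
    for _ in range(n_iters):
        improved = False
        for i in range(len(best)):
            for delta in (-1, +1):
                cand = best.copy()
                cand[i] = max(0, cand[i] + delta)
                val = simulate_profit(cand)
                if val > best_val:
                    best, best_val, improved = cand, val, True
        if not improved:
            break
    return best, best_val

print(two_phase_search([6.4, 8.2, 7.1]))
```

Seeding the search with the (rounded) first-phase capacities is what keeps the number of simulation evaluations small, which is the point of the two-phase approach.
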




Table 1
Base-case scenario: interarrival times (in minutes) for each of the three daily intervals.

Daily interval          Alert    Change    Problem    Checklist
Interval 1 (0–8 h)       5.96     35.72     36.26      22.01
Interval 2 (8–16 h)      9.45     80.65     44.11      30.00
Interval 3 (16–24 h)     9.67    179.52     95.51      13.80

Table 2
Base-case scenario: service times for each request class and SLT parameters.

Parameters                 Alert    Change    Problem    Checklist
Avg. service time (min)     1.81     34.39     11.10      16.20
Std. dev. service time      1.37     21.64      5.41       8.94
Request priority               1         2         4          3
Target time (min)             17       112      1450        110
Percent attainment           95%       95%       95%        95%

These requests are served according to a predetermined priority ordering among request classes, under an FCFS discipline within each class. The agent pool provides 7 × 24 support, where agents from both teams rotate through three eight-hour daily work-shifts to provide the necessary support. There are different breadths of skill between the two teams of agents: basic agents can serve alert, change and problem requests, while expert agents can serve all classes of requests. The salary of expert agents is on average 30% higher than the salary of basic agents.

Focusing on a representative SDL agent pool, henceforth called the base-case scenario, we use historical workload measurement data collected from the agent pool to model the time-varying arrival patterns for each request class. This historical measurement data set shows that the arrivals of requests for every class follow independent Poisson processes whose rates vary throughout the day with consistent patterns from one day to the next. A detailed analysis of the daily interarrival patterns for this base-case agent pool reveals that each day can be partitioned into three intervals such that the Poisson arrival rate is constant within each of these stationary daily intervals. Table 1 provides a summary of the mean interarrival times (in minutes) for every request class in each of the three daily intervals. Similarly, measurement data was collected for the agent pool to model the distribution of the service times for each of the request classes. A detailed analysis of this measurement data set indicates that the service times for every request class follow independent lognormal distributions with different parameters, where service by both teams follows the same distribution for each of the alert, change and problem request classes. Table 2 provides a summary of the mean and standard deviation of the service time distribution for each of the request classes.

The priority policy for request classes at this base-case SDL agent pool is also provided in Table 2. Specifically, alert requests are treated with highest priority (priority index 1), followed by change requests, followed by checklist requests, and finally followed by problem requests. The SLAs for the base-case agent pool scenario are also summarized in Table 2, specifying the SL sojourn time target and percentage attainment for each request class. For example, 95% of all alert requests that arrive each month must complete service within 17 min of arrival, whereas 95% of all change requests that arrive each month must complete service within 112 min of arrival. Note that the time frame over which the SL performance target and percentage of attainment are measured for an SLA often exceeds the length of a work-shift or a stationary workload interval.

The time horizon for the capacity management and planning process of our base-case SDL agent pool consists of one calendar month. Similarly, the time frame for the SLAs at this base-case agent pool is at a monthly granularity such that penalties are incurred whenever the SL performance target and percentage of attainment are not satisfied within any given calendar month. The decision variables of the stochastic optimization problem of interest are the capacity levels for both basic and expert agents in every agent pool within each of the three daily work-shifts at each SDL over the monthly time horizon.
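As a quick sanity check on these inputs, the offered load implied by Tables 1 and 2 can be computed directly. The short sketch below is our own illustration: it aggregates both teams, ignores the SLA targets and the basic/expert skill split, and therefore yields only a stability floor on per-interval staffing rather than the optimized capacity levels.

import math

# Mean interarrival times in minutes (Table 1) and mean service times in minutes (Table 2).
interarrival = {
    "interval 1 (0-8 h)":   {"alert": 5.96, "change": 35.72,  "problem": 36.26, "checklist": 22.01},
    "interval 2 (8-16 h)":  {"alert": 9.45, "change": 80.65,  "problem": 44.11, "checklist": 30.00},
    "interval 3 (16-24 h)": {"alert": 9.67, "change": 179.52, "problem": 95.51, "checklist": 13.80},
}
mean_service = {"alert": 1.81, "change": 34.39, "problem": 11.10, "checklist": 16.20}

for interval, inter in interarrival.items():
    # Offered load (in agent-equivalents) = sum over classes of arrival rate x mean service time.
    load = sum(mean_service[k] / inter[k] for k in inter)
    min_agents = math.floor(load) + 1          # any stable staffing must exceed the offered load
    print("%s: offered load %.2f, at least %d agents, utilization %.0f%%"
          % (interval, load, min_agents, 100.0 * load / min_agents))

For the base-case data this floor does not exceed the optimal staffing reported later in Table 3, with the SLA targets and the discrete basic/expert mix accounting for any gap.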
4.2. Performance comparison with state-of-the-art simulation–optimization

We now present a representative set of numerical experiments for the base-case scenario to quantitatively evaluate the benefits of our general two-phase solution approach over the state-of-the-art simulation-based optimization approach. These results are consistent with our findings for the entire collection of real-world SDCs that we have encountered over the past several years. The key point is that state-of-the-art simulation-based optimization, as provided by current simulation products, requires a prohibitive amount of time and resources for the capacity management and planning process of SDCs in practice. A new approach is needed that provides the accuracy of simulation-based methods and the efficiency of analytical methods, which is the objective of our solution methodology. The result is the same capacity management and planning solutions obtained in several orders of magnitude less time, additionally supporting financial-performance evaluation studies of various SDC strategies and operational policies that are otherwise intractable.

The state-of-the-art simulation-based optimization approach and the second phase of our solution approach each faithfully capture all aspects of the SDL, whereas the first phase of our solution approach is based on a combination of analytical stochastic approximations.


Table 3
Base-case scenario: optimal capacity levels for each of the three daily intervals.

                            Daily interval 1      Daily interval 2      Daily interval 3
Optimal solution source     Expert    Basic       Expert    Basic       Expert    Basic
State-of-the-art results       2         3           2         0           3         0
First-phase results            2         2           1         1           3         1
Second-phase results           2         3           2         0           3         0

This includes the objective function of the stochastic optimization problem (OPT-SDC), where the simulation-based optimization formulation considers SLA penalties over a monthly horizon while our first-phase optimization formulation considers SLA penalties over each stationary workload interval. As another specific example, the single priority queue for the base-case SDL agent pool is approximated in our first-phase analysis and optimization by a priority queue for each agent team comprising the pool together with the optimal routing of each class of requests to every agent team queue; conversely, the single priority queue for the base-case SDL agent pool is directly and faithfully implemented as part of the simulation-based optimization of both the state-of-the-art approach and our second-phase solution approach.

Since the arrival processes $A_{i,k}(t)$ for our base-case scenario are independent Poisson processes, and given our first-phase relaxation of probabilistic routing decisions, the stochastic model of Section 3.1.2 for agent team $j \in J$ in stationary interval $i \in I$ reduces to a multiclass fixed-priority M/G/1 preemptive–resume queueing system with capacity scaling $C_{i,j}$. Moreover, since the SLAs are based on the tail probability of the sojourn time distribution with at least some request classes having relatively tight targets, we exploit our strong approximation approach, which can be readily verified to recover the exact solution for the per-class sojourn time in expectation. Hence, our first-phase solution for class $k \in K$ requests served by team $j \in J$ in interval $i \in I$ of the base-case scenario is obtained using (23) and (24) together with the corresponding known M/G/1 results [26]:
$$
\mathrm{E}[T_{i,j,k}] \;=\; \frac{\sum_{k'=1}^{k} \lambda_{i,j,k'}\, \mathrm{E}[S_{i,j,k'}^{2}]}{2\,(1-\rho_{i,j,k-1}^{+})(1-\rho_{i,j,k}^{+})} \;+\; \frac{\mathrm{E}[S_{i,j,k}]}{1-\rho_{i,j,k-1}^{+}},
\qquad (33)
$$
$$
\mathrm{E}[T_{i,j,k}^{2}] \;=\; \frac{\sum_{k'=1}^{k} \lambda_{i,j,k'}\, \mathrm{E}[S_{i,j,k'}^{3}]}{3\,(1-\rho_{i,j,k-1}^{+})^{2}(1-\rho_{i,j,k}^{+})} \;+\; \frac{\mathrm{E}[S_{i,j,k}^{2}]}{(1-\rho_{i,j,k-1}^{+})^{2}} \;+\; \left[ \frac{\sum_{k'=1}^{k-1} \lambda_{i,j,k'}\, \mathrm{E}[S_{i,j,k'}^{2}]}{(1-\rho_{i,j,k-1}^{+})^{2}} \;+\; \frac{\sum_{k'=1}^{k} \lambda_{i,j,k'}\, \mathrm{E}[S_{i,j,k'}^{2}]}{(1-\rho_{i,j,k-1}^{+})(1-\rho_{i,j,k}^{+})} \right] \mathrm{E}[T_{i,j,k}],
\qquad (34)
$$
where $\rho_{i,j,k}^{+}$ denotes the aggregate utilization of agent team $j$ in interval $i$ due to the $k$ highest-priority request classes, with $\rho_{i,j,0}^{+} = 0$.
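Reading the two formulas directly into code, the following is a small, self-contained sketch (our own illustration, not the paper's implementation) that evaluates (33) and (34) for a single agent team and stationary interval. The per-class arrival rates would come from Table 1 after the first-phase routing split across teams, and the service-time moments from the lognormal fits summarized in Table 2; the example values at the bottom are arbitrary.

def mg1_priority_sojourn_moments(lam, s1, s2, s3):
    # First two sojourn-time moments for each class of a multiclass M/G/1
    # preemptive-resume priority queue, per Eqs. (33)-(34).
    #   lam: arrival rates, ordered from highest (index 0) to lowest priority
    #   s1, s2, s3: first, second and third moments of the class service times
    K = len(lam)
    rho_plus = [sum(lam[i] * s1[i] for i in range(k + 1)) for k in range(K)]
    ET, ET2 = [], []
    for k in range(K):
        r_prev = rho_plus[k - 1] if k > 0 else 0.0   # utilization of strictly higher priorities
        r_k = rho_plus[k]
        if r_k >= 1.0:
            raise ValueError("class %d is unstable (cumulative utilization >= 1)" % k)
        m2 = sum(lam[i] * s2[i] for i in range(k + 1))
        m3 = sum(lam[i] * s3[i] for i in range(k + 1))
        m2_prev = sum(lam[i] * s2[i] for i in range(k))
        mean = m2 / (2.0 * (1.0 - r_prev) * (1.0 - r_k)) + s1[k] / (1.0 - r_prev)      # Eq. (33)
        second = (m3 / (3.0 * (1.0 - r_prev) ** 2 * (1.0 - r_k))
                  + s2[k] / (1.0 - r_prev) ** 2
                  + (m2_prev / (1.0 - r_prev) ** 2
                     + m2 / ((1.0 - r_prev) * (1.0 - r_k))) * mean)                    # Eq. (34)
        ET.append(mean)
        ET2.append(second)
    return ET, ET2

# Example: two classes with exponential service times (E[S^2] = 2 E[S]^2, E[S^3] = 6 E[S]^3).
lam = [0.3, 0.2]
s1 = [1.0, 2.0]
s2 = [2.0 * m ** 2 for m in s1]
s3 = [6.0 * m ** 3 for m in s1]
print(mg1_priority_sojourn_moments(lam, s1, s2, s3))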

Table 3 presents a sample of results for the optimal capacity levels of basic and expert agents comprising the base-case SDL agent pool in each daily work-shift over a representative month under the state-of-the-art simulation-based optimization approach and under the first and second phases of our general solution approach. More than twenty-four hours of processing time was required to obtain the optimal solution using the simulation-based optimization capability provided by the selected simulation product. In contrast, the first phase of our solution approach renders within a couple of seconds a nearly optimal solution for each type of agent that differs by at most one from the optimal solution for each daily interval, while requiring four to five orders of magnitude less processing time. Then, a generic instance of the second phase of our solution approach, directly using the simulation-based optimization capability of the simulation product with the first-phase solution as a starting point, renders the optimal solution in under four hours. Upon leveraging more sophisticated searching procedures tailored to each instance of the stochastic optimization problem (by exploiting the first-phase analysis and optimization of our solution approach), the second phase of our approach yields the optimal solution in less than one hour of processing time.

The above results are completely representative of our findings for instances of other agent pools at the base-case SDL as well as for agent pools across a wide variety of other real-world SDLs and SDCs, wherein our two-phase solution approach consistently provides the same optimal agent capacity levels while reducing the required processing time by several orders of magnitude in comparison with the state-of-the-art simulation-based optimization approach. In fact, it was not uncommon for the state-of-the-art simulation-based optimization approach to require multiple days to obtain an optimal solution, whereas our two-phase solution approach required at most a few hours to find the same optimal solution. Such tremendous reductions in the time and resources required for capacity management and planning enable a much broader and deeper exploration of the financial-performance space with respect to SDC strategies and operational policies, as will be illustrated in the next section. This in turn provides significant improvements in both efficiency and quality as part of the capacity management and planning process of real-world SDCs in practice.
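One simple example of such a tailored search, sketched under the assumption that the second phase only needs to explore a small neighborhood of the first-phase capacities, is a coordinate-wise local refinement. Here simulate_profit is the same hypothetical simulation wrapper used in the earlier sketch, and the procedure is our own illustration rather than the exact search deployed in practice.

def second_phase_refinement(simulate_profit, first_phase_solution, radius=1):
    # Refine the first-phase capacity vector by repeatedly trying +/- radius changes
    # to one work-shift/agent-team capacity at a time, keeping any improvement,
    # until no single-coordinate change improves the simulated profit objective.
    best = tuple(first_phase_solution)
    best_val = simulate_profit(best)
    improved = True
    while improved:
        improved = False
        for pos in range(len(best)):
            for d in range(-radius, radius + 1):
                if d == 0:
                    continue
                candidate = best[:pos] + (max(0, best[pos] + d),) + best[pos + 1:]
                value = simulate_profit(candidate)   # one discrete-event simulation run
                if value > best_val:
                    best, best_val, improved = candidate, value, True
    return best, best_val

Because the first-phase capacities in Table 3 are within one agent of the optimum in every daily interval, a restricted neighborhood of this kind needs only a handful of simulation runs, which is consistent with the reduction from roughly a day to under an hour reported above.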

(Fig. 1 appears here. Upper panel: "Required Capacity as Function of Arrival Rate", showing the number of basic and expert agents for the base-case, doubled and tripled arrival rates. Lower panel: "Agent Team Utilization as Function of Arrival Rate", showing the utilization of the basic and expert teams in each of the three work-shifts.)
Fig. 1. Optimal total capacity and average per-work-shift utilization for each agent team as a function of pool workload intensity.

4.3. Performance evaluation of various SDC scenarios

Motivated by capacity management and planning problems faced in the SDCs of interest, we now explore in this section how our two-phase solution methodology has been used to support decision-making in various real-world SDC environments. The motivating SDCs are highly dynamic: factors such as changes in agent team skill composition (due to agent attrition, agent cross-skilling, or agent upskilling), changes in SLAs and performance guarantees (due to contract renegotiations), and changes in workload volumes (due to workload transfer among agent teams, modification of supported customers, or modification of the workload supported for existing customers) all result in a critical need for the SDP to reevaluate financial, performance and capacity requirements. Whereas the prohibitive (time and resource) aspects of the state-of-the-art simulation-based optimization methodology for evaluating financial and performance results severely restrict the degree to which an SDP can explore and understand the impact of various proposed system changes, our two-phase methodology enables this degree of investigation. We now consider a representative sample of numerous real-world business scenarios that have been evaluated using our two-phase solution methodology to explore the financial and performance implications of the type of changes to SDC environments that arise in practice.

4.3.1. Changes in workload arrival rate

One challenge often encountered in SDC environments concerns relatively frequent changes to the volume of workload supported by an agent pool. Such changes may be caused by the customer adding services to or removing services from the set delivered by the SDP. A change in workload volume supported by an agent pool is also experienced whenever an SDP adds a customer to or removes a customer from the set of customers supported by the agent pool. Moreover, SDPs often transfer workload among pools or among SDLs. This so-called "lift and shift" of workload may be motivated, for example, by differentials in labor cost or differences in the availability of skills within diverse geographic locations.

Fig. 1 explores the performance impact of such business scenarios, where the two plots illustrate the results for an SDP that must manage changes in the intensity of workload arriving to a particular agent pool. We consider three different scenarios, where the first corresponds to the base-case scenario of Section 4.1, reflecting the existing workload arrival intensity observed by the corresponding agent pool. For the second and third scenarios, the intensity of workload arrivals doubles and triples, respectively, in comparison with this base-case scenario. These alternative scenarios represent situations where, for example, management decides to combine two agent teams (and their associated workloads), workload is transferred from another SDL, the SDP contracts with a new customer and decides to support the new customer with the existing agent pool, or a customer consolidates delivery support and routes more of its workload to a single SDP. The upper plot in Fig. 1 shows that doubling the workload intensity requires a 70% increase in the total optimal agent pool capacity, consisting of a 100% increase in basic agents and a 57% increase in expert agents. On the other hand, tripling the arrival rate requires an increase of 140% in the total optimal agent pool capacity, resulting in a 233% increase in basic agents and a 100% increase in expert agents.
The lower plot in Fig. 1 displays the average utilization for each agent team in all three work-shifts under the optimal capacity staffing for each of the three business scenarios.


Table 4
Moderately variable arrival scenario: interarrival times (in minutes) for each of the four daily subintervals.

Daily subinterval    Alert    Change    Problem    Checklist
Interval 1a          11.92     71.44     72.51      44.02
Interval 1b           2.98     17.86     18.12      11.00
Interval 2            9.45     80.65     44.11      30.00
Interval 3            9.67    179.52     95.51      13.80

Table 5
Highly variable arrival scenario: interarrival times (in minutes) for each of the four daily subintervals.

Daily subinterval    Alert    Change    Problem    Checklist
Interval 1a          17.89    107.16    108.79      66.02
Interval 1b           1.98     11.91     12.09       7.34
Interval 2            9.45     80.65     44.11      30.00
Interval 3            9.67    179.52     95.51      13.80

Average utilization for each agent team during each work-shift tends to increase as the workload arrival rate increases, which is consistent with the observation that a one percent increase in workload intensity tends to require less than a one percent increase in agent pool capacity. These utilization curves highlight some of the complexities involved in the capacity management and planning of real-world SDCs. Fairly linear changes in the total optimal capacity of different agent teams to address increasing workload intensities can involve significant variations in the per-work-shift capacity and utilization of each agent team. The discrete nature of the agent capacity decision variables introduces additional complexity into these SDC capacity management and planning processes.

The base-case scenario was developed from a representative agent pool in a real-world SDL environment for which an analysis of workload measurement data identified three stationary workload intervals that directly correspond with the three daily (eight-hour) work-shifts, as specified in Table 1. In contrast, similar analyses of workload measurement data from other agent pools within the same SDL, as well as those from other SDLs, identify different SDC environments in which the workload arrival patterns vary during the hours of particular work-shifts. Within some SDC environments, the requests generated by business customers will typically increase in volume during business hours, especially at the start of the work day, whereas lower volumes may be experienced during periods before or after business hours, possibly even during lunch hour. Even workloads that are not directly generated by customers can be impacted by their working and SDC usage patterns. For example, monitoring systems detect high CPU usage or slow system response times and automatically generate alerts to the agent teams when such negative events occur. A lower volume of these alerts is generated under conditions of lower system usage and less system traffic.

We analyzed historical patterns of usage measurement data across many agent pools from various SDLs to identify how the workload intensity varies within each of the three daily work-shifts. One pattern identified from this analysis of historical measurement data is that, although the first work-shift extends from 0 h until 8 h, there is a lower request volume during the earlier part of the first work-shift in comparison with a higher request volume during the later part of the first work-shift. To enable comparative results, we applied common forms of this type of low–high workload intensity pattern to the first work-shift of the base-case agent pool. As a representative sample of such results, we consider here two alternative workload arrival patterns for the first work-shift, both of which maintain the same average arrival rate observed for the base-case agent pool over the entire first work-shift but differ in magnitude with respect to the workload intensity experienced during the late night hours (low-intensity period) and that experienced during the hours closer to the start of the work day (high-intensity period).
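One way to construct such a low–high pattern while preserving the shift-average arrival rate is sketched below; this is our own illustration, and the subinterval lengths in the example are assumptions that are not necessarily those used to generate Tables 4 and 5 (the specific low/high rate ratios are given in the next paragraph).

def split_shift_rates(avg_rate_per_hour, ratio, low_hours, shift_hours=8.0):
    # Split a work-shift's average Poisson arrival rate into a low-intensity rate and a
    # high-intensity rate such that (i) the low rate is `ratio` times the high rate and
    # (ii) the expected number of arrivals over the whole shift is preserved:
    #   avg * shift_hours = low_rate * low_hours + high_rate * (shift_hours - low_hours),
    # with low_rate = ratio * high_rate.
    high_hours = shift_hours - low_hours
    high_rate = avg_rate_per_hour * shift_hours / (ratio * low_hours + high_hours)
    return ratio * high_rate, high_rate

# Example with the base-case alert class (mean interarrival 5.96 min in interval 1, i.e.,
# roughly 10.1 arrivals per hour), a 25% low/high ratio, and an assumed 4-hour low period.
low, high = split_shift_rates(60.0 / 5.96, ratio=0.25, low_hours=4.0)
print(low, high)   # arrival rates per hour; the mean interarrival times are 60/rate minutes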
In the first alternative workload intensity pattern, we modify the hourly arrival rates observed during the first work-shift such that the arrival rates during the low-intensity period of the work-shift are 25% of the arrival rates encountered during the high-intensity period of the work-shift. Similarly, for the second alternative workload pattern, we modify the hourly arrival rates observed during the first work-shift such that the arrival rates during the low-intensity period of the work-shift are 11% of the arrival rates encountered during the high-intensity period of the work-shift. Tables 4 and 5 summarize the mean interarrival times (in minutes) for every request class in each of the four daily subintervals comprising the first and second alternative workload patterns, respectively.

The optimal capacity levels for the SDL agent pool under the first alternative workload arrival pattern, denoted "Base Case-25PCT", are provided in the upper plot of Fig. 2, with the corresponding utilization of each agent team in each work-shift shown in the lower plot of Fig. 2. Similarly, the optimal capacity levels for the SDL agent pool under the second alternative workload arrival pattern, denoted "Base Case-11PCT", are provided in the upper plot of Fig. 3, with the corresponding utilization of each agent team in each work-shift shown in the lower plot of Fig. 3. The upper plot in Fig. 2 illustrates that, under the first alternative workload arrival pattern representing a moderate increase in the variability of the first-shift arrivals relative to the base case, doubling the workload intensity requires a 75% increase in the total agent pool capacity comprised of a 100% increase in the basic agent team capacity and a 63% increase in

(Fig. 2 appears here. Upper panel: "Required Capacity as Function of Arrival Rate (Base Case - 25PCT)", showing the number of basic and expert agents for the Base Case-25PCT, doubled and tripled arrival rates. Lower panel: "Agent Team Utilization as Function of Arrival Rate (Base Case - 25PCT)", showing the per-work-shift utilization of the basic and expert teams.)
Fig. 2. Optimal total capacity and average per-work-shift utilization for each agent team as a function of pool workload, when workload in the early hours of the first work-shift is 25% of workload in the later hours of the first work-shift.

(Fig. 3 appears here. Upper panel: "Required Capacity as Function of Arrival Rate (Base Case - 11PCT)", showing the number of basic and expert agents for the Base Case-11PCT, doubled and tripled arrival rates. Lower panel: "Agent Team Utilization as Function of Arrival Rate (Base Case - 11PCT)", showing the per-work-shift utilization of the basic and expert teams.)
Fig. 3. Optimal total capacity and average per-work-shift utilization for each agent team as a function of pool workload, when workload in the early hours of the first work-shift is 11% of workload in the later hours of the first work-shift.

the expert agent team capacity. Tripling the workload intensity results in a 133% increase in the total capacity of the agent pool, consisting of a 100% increase in the basic agent team capacity and a 150% increase in the expert agent team capacity. Similarly, the upper plot in Fig. 3 illustrates that, under the second alternative workload arrival pattern representing a large increase in the variability of the first-shift arrivals relative to the base case, doubling the workload intensity requires a 79%


increase in total pool capacity comprised of a 100% increase in the basic agent team capacity and a 70% increase in the expert agent team capacity. Further, tripling the workload intensity results in a 150% increase in the total capacity of the agent pool, consisting of a 200% increase in the basic agent team capacity and a 130% increase in the expert agent team capacity. The lower plots in Figs. 2 and 3 display the per-work-shift utilization for each agent team, showing that the utilization of each work-shift/agent-team combination increases with the increasing workload intensity. Interactions among the optimal capacity staffing levels for the different work-shift/agent-team combinations can be observed from these results, driven by the discrete nature of the agent capacity decision variables and the overall complexity of the planning problem.

4.3.2. Changes in contractual service guarantees

Another challenge commonly faced by the SDP is the ability to quickly evaluate how changes in contractual SLAs will impact financial and performance results. Traditionally, SDPs enjoyed long-term contracts with customers where the terms and conditions of contractual SLAs were reevaluated every 5–10 years. More recently, customers tend to renegotiate contracts on a more frequent basis, often reviewing service quality performance annually, such that renegotiated contracts may demand service quality guarantees that are tied to performance in prior years. The highly competitive business environment requires that SDPs quickly evaluate any terms and conditions proposed by customers in order to determine their implications on operations and overall system financial and performance measures. Stricter contractual SLA terms (both target sojourn time and percentage attainment) are often demanded for critical customer systems, such as customer networks or databases where failures immediately impact business performance. Similarly, online retailers demand high availability and responsiveness for retail websites. On the other hand, less stringent SLA terms may be demanded for systems such as test environments where the inability of customers to access their test environments typically does not critically impact business operations.

We utilize our methodology to explore the impact of such real-world changes in contractual SLAs on financial and performance results, first exploring the financial-performance impact of changes to the percentage of attainment. The contractual SLA currently in place demands that 95% of the requests in each class complete service within their contractually specified target time. In the first set of scenarios, we consider how changing this required percentage attainment for change requests to values of 90%, 93%, 97% and 99% may impact system financial and performance measures. Our results show that the required capacity is only impacted when the required percentage attainment increases to 99%, in which case the required pool capacity increases 50% above the capacity required when the percentage of attainment is 95%. Moreover, comparing the base-case capacity levels with those required at 99% attainment shows no increase in the expert agent team capacity; all of the increased capacity is required in the basic agent team. Similarly, we consider modifications to the change request target time.
Current contractual terms require that change requests complete service within 112 min of when the customer opens the request with the SDP. We consider alternative target resolution times where the customer either increases the target resolution time by 15% or 30% above the base-case scenario (129 min or 146 min, respectively, representing a loosening of the contractual obligation) or decreases this time by 15% or 30% below the base-case scenario (95 min or 78 min, respectively, representing a tightening of the contractual obligation). Our results illustrate that the optimal capacity levels are relatively insensitive to changes in these contractual target times: the required capacity level increases only in the case when the target sojourn time decreases by 30% (i.e., the target resolution time drops from 112 to 78 min), and this increase in the required agent pool capacity represents 20% more than what is required when the target resolution time is 112 min. The relative insensitivity of the optimal pool capacity to changes in contractual SLAs on the change request workload reveals that the SDP can deliver higher levels of service to the customer at no additional cost. These findings can be used to negotiate mutually beneficial agreements between the SDP and the customer for sharing the resulting cost savings. This relative insensitivity also indicates that the change request workload is not what drives the optimal capacity levels maintained by the agent pool.

Hence, we next consider how changing the required percentage of attainment for alert requests may impact system financial and performance measures. The existing customer contract demands that 95% of all alerts are resolved within 17 min. We consider agent pool financial-performance results when the percentage of attainment is decreased to values of 90% and 93%. Here, the customer has evaluated the business environment and determined that it is sufficient for a lower percentage of alert requests to be resolved within the target resolution time. This careful evaluation by the customer can have considerable cost benefits, where the total required capacity decreases by 10% when the SLA percentage attainment drops to 93% or 90%, with a corresponding 6% reduction in cost. On the other hand, if the customer determines that, due to the criticality of the systems being supported, a more stringent alert request percentage attainment is necessary, then increasing the required percentage of attainment to 97% results in no change to the optimal capacity levels. A required percentage attainment of 99% results in a 2% increase in cost with a change in the optimal capacity of the basic and expert agent teams, where the basic agent team decreases by 33% and the expert agent team increases by 14%, both in comparison with the optimal capacity levels under the base-case scenario.

4.3.3. Changes in mean service time

Human service systems can experience significant employee turnover, resulting in the need for the SDP to introduce less experienced new hires into the SDC environment. These agents typically work at a slower rate than their more experienced counterparts, during the time when they are becoming familiar with the tasks and processes required to respond to




customer requests as well as becoming familiar with the nuances of customer environments. The SDP may also at times need to take actions that result in a reduction in the experience or expertise level of some agent pools. For example, when the SDP begins to support a new type of request that has not previously been supported, or when it has been determined that the satisfaction of a struggling customer would be improved by dedicating highly experienced or expert agents to support that customer, agent teams are moved between agent pools, resulting in a reduction in the experience or expertise level of some agent pools. The SDP will weigh the benefits achieved by dedicating the highly experienced/expert agent teams against the financial measures resulting from the reduced experience/expertise level remaining in the agent pools. We examine these real-world problem instances and consider the impact of an increase in the mean service time on the total optimal capacity cost. For example, when the mean service time increases by 25%, the total capacity cost increases by 19%, while a 50% increase in the mean service time corresponds to a 40% increase in the total capacity cost. When the increase in mean service time is not due to agent attrition but is rather due to changes introduced by the SDP, then the SDP will carefully weigh these costs against the benefit achieved by the environmental changes to ensure the efficiency of such actions.

Conversely, the current highly competitive business environment challenges SDPs to deliver increasing quality of service at similar or reduced cost to customers. SDPs closely analyze the processes followed in SDC environments to identify opportunities for Lean Six Sigma [27] initiatives to reduce waste, improve process flow, and reduce request sojourn time. Opportunities for automation are also explored, with SDPs often identifying low-complexity repetitive tasks as candidates for automation. This reduces the service time for requests that involve such low-complexity tasks and allows human agents to focus their efforts on the more complex tasks that are more difficult to automate. Prior to introducing these changes into the SDC environment, the SDP carefully measures the expected benefit that will result from the process change relative to the financial implications of introducing the change into the environment.

As a representative example, we first examine an instance where process changes result in a 25% reduction in the mean service time. This causes the total agent pool capacity requirement to drop by 30% in comparison with the base-case total pool capacity requirement. Upon inspecting the required capacity for the basic and expert agent teams, we find that the capacity of the expert team drops by 14% while the capacity of the basic team drops by 67%, both in comparison with the base-case capacity requirements, rendering a corresponding 27% reduction in total cost. We next consider process changes that result in a 50% reduction in the mean service time. Here, the required agent pool capacity is reduced by 40% in comparison with the base-case agent pool capacity requirements. This reduction in total agent capacity requirements is comprised of a 67% decrease in the capacity of the basic agent team and a 29% decrease in the capacity of the expert agent team, which corresponds to a 38% reduction in total system cost.
Such reductions in agent pool capacity requirements enable an SDP to support more customers or more stringent performance guarantees or both. The SDP considers the benefits of the total savings in comparison with the (hard) cost of process changes combined with the (soft) costs of disruption in the business environment to determine if and when it is beneficial to introduce these changes into the SDC environment.

5. Conclusion

The capacity management and planning processes of human server systems arising within information technology services delivery center environments require accurate and efficient solutions to complex stochastic optimization problems. Such solutions are then used to drive and support the design and evaluation of services delivery center strategies and operational policies. To address these needs, we propose a general two-phase solution approach that determines the agent capacity decisions and actions to optimize financial and performance measures in complex human server stochastic systems of queues. A large collection of numerical experiments investigates how our two-phase solution approach supports decision-making across a wide variety of real-world SDC environments, demonstrating that our approach consistently provides optimal solutions with a reduction of several orders of magnitude in the time and resources required as compared with the traditional state-of-the-art simulation-based optimization approach. These business case studies further evaluate the type of financial-performance issues and trade-offs often encountered in the capacity management and planning of real-world SDCs. The effectiveness of our first-phase solution approach is based in great part on an important principle regarding the stochastic analysis used to solve complex stochastic optimization problems: it is far better to have a stochastic model and analysis that provides the right curvature at the right places than to have a stochastic model and analysis that is highly accurate but does not have the appropriate curvature at the most critical places.

Acknowledgments

We thank the editor and referees for helpful comments on an earlier draft.

References

[1] H. Cao, J. Hu, C. Jiang, T. Kumar, T.-H. Li, Y. Liu, Y. Lu, S. Mahatma, A. Mojsilović, M. Sharma, M.S. Squillante, Y. Yu, OnTheMark: integrated stochastic resource planning of human capital supply chains, Interfaces 41 (2011) 414–435.
[2] F. Glover, J. Kelly, M. Laguna, New advances for wedding optimization and simulation, in: P.A. Farrington, H.B. Nembhard, D.T. Sturrock, G.W. Evans (Eds.), Proceedings of the 1999 Winter Simulation Conference, The Society for Computer Simulation International, Phoenix, AZ, 1999, pp. 255–260.

[3] B.L. Nelson, S.G. Henderson (Eds.), Handbooks in OR and MS: Simulation, Elsevier Science, 2007.
[4] S. Asmussen, P.W. Glynn, Stochastic Simulation: Algorithms and Analysis, Springer, 2007.
[5] N. Gans, G. Koole, A. Mandelbaum, Telephone call centers: tutorial, review, and research prospects, Manag. Sci. 5 (2003) 79–141.
[6] I. Gurvich, W. Whitt, Queue-and-idleness-ratio controls in many-server service systems, Math. Oper. Res. 34 (2009) 363–396.
[7] I. Gurvich, J. Luedtke, T. Tezcan, Staffing call-centers with uncertain demand forecasts: a chance-constrained optimization approach, Manag. Sci. 56 (2010) 1093–1115.
[8] J. Atlason, M.A. Epelman, S.G. Henderson, Optimizing call center staffing using simulation and analytic center cutting-plane methods, Manag. Sci. 54 (2008) 295–309.
[9] Z. Feldman, A. Mandelbaum, Using simulation-based stochastic approximation to optimize staffing of systems with skills-based-routing, in: B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, E. Yucesan (Eds.), Proceedings of the 2010 Winter Simulation Conference, The Society for Computer Simulation International, Baltimore, MD, 2010, pp. 3307–3317.
[10] A.B. Dieker, S. Ghosh, M.S. Squillante, Optimal resource capacity management for stochastic networks, 2012. Preprint.
[11] M. Csörgő, L. Horváth, Weighted Approximations in Probability and Statistics, Wiley, 1993.
[12] W. Whitt, Stochastic-Process Limits, Springer-Verlag, New York, 2002.
[13] M. Nuyens, B. Zwart, A large-deviations analysis of the GI/GI/1 SRPT queue, Queueing Syst. Theory Appl. 54 (2006) 85–97.
[14] Y. Lu, M.S. Squillante, Dynamic scheduling to optimize sojourn time moments and tail asymptotics in queueing systems, Technical Report, IBM Research Division, 2005.
[15] J.M. Harrison, Brownian Motion and Stochastic Flow Systems, John Wiley and Sons, 1985.
[16] H. Chen, D.D. Yao, Fundamentals of Queueing Networks: Performance, Asymptotics, and Optimization, Springer-Verlag, 2001.
[17] H. Chen, X. Shen, Strong approximations for multiclass feedforward queueing networks, Ann. Appl. Probab. 10 (2000) 828–876.
[18] L. Kleinrock, Queueing Systems Volume I: Theory, John Wiley and Sons, 1975.
[19] L. Kleinrock, Queueing Systems Volume II: Computer Applications, John Wiley and Sons, 1976.
[20] S. Boyd, L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
[21] A. Wachter, L.T. Biegler, On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming, Math. Program. 106 (2006) 25–57.
[22] Ipopt, Project description, 2012. http://www.coin-or.org/projects/Ipopt.xml.
[23] T. Ibaraki, N. Katoh, Resource Allocation Problems: Algorithmic Approaches, The MIT Press, Cambridge, Massachusetts, 1988.
[24] N. Katoh, T. Ibaraki, Resource allocation problems, in: D.-Z. Du, P. Pardalos (Eds.), Handbook of Combinatorial Optimization II, Kluwer Academic, Boston, 1998, pp. 159–260.
[25] F. Glover, M. Laguna, Heuristics for integer programming using surrogate constraints, Decis. Sci. 8 (1977) 156–166.
[26] H. Takagi, Queueing Analysis: A Foundation of Performance Evaluation, Volume 1: Vacation and Priority Systems, Part 1, North Holland, Amsterdam, 1991.
[27] T. McCarty, L. Daniels, M. Bremer, P. Gupta, The Six Sigma Black Belt Handbook, McGraw-Hill Engineering, 2005.

Aliza R. Heching is a Research Staff Member in the Mathematical Sciences Department of the IBM Thomas J. Watson Research Center, Yorktown Heights, NY. She received a Ph.D. in Operations Research from the Columbia University Graduate School of Business. Her research interests concern the modeling, analysis, and optimization of service systems.

Mark S. Squillante is a Research Staff Member and Manager in the Mathematical Sciences Department at the IBM Thomas J. Watson Research Center, where he leads the Stochastic Processes and Optimization group. He received a Ph.D. degree from the University of Washington. His research interests concern mathematical foundations of the analysis, modeling and optimization of the design and control of complex stochastic systems. He is an elected Fellow of ACM and IEEE, the author of more than 250 research papers and more than 30 issued or filed patents, and the recipient of The Daniel H. Wagner Prize (INFORMS), eight best paper awards, nine keynote/plenary presentations, and eleven major IBM technical awards. He serves on the editorial boards of Operations Research, Performance Evaluation and Stochastic Models.