European Journal of Operational Research 212 (2011) 89–99
Contents lists available at ScienceDirect
European Journal of Operational Research journal homepage: www.elsevier.com/locate/ejor
Production, Manufacturing and Logistics
Analytical approximations to predict performance measures of markovian type manufacturing systems with job failures and parallel processing Maria Hulett a, Purushothaman Damodaran b,⇑ a b
Department of Industrial and Systems Engineering, Florida International University, Miami, FL 33174, USA Department of Industrial and Systems Engineering, Northern Illinois University, DeKalb, IL 60115, USA
a r t i c l e
i n f o
Article history: Received 8 January 2010 Accepted 18 January 2011 Available online 28 January 2011 Keywords: Queuing network Parametric decomposition Web server assembly Fork and join queues
a b s t r a c t Manufacturing or service systems with multiple product classes, job circulation due to random failures, resources shared between product classes, and some portions of the manufacturing or assembly carried in series and the rest in parallel are commonly observed in real-life. The web server assembly is one such manufacturing system which exhibits the above characteristics. Predicting the performance measures of these manufacturing systems is not an easy task. The primary objective of this research was to propose analytical approximations to predict the flow times of the manufacturing systems, with the above characteristics, and evaluate its accuracy. The manufacturing system is represented as a network of queues. The parametric decomposition approach is used to develop analytical approximations for a system with arrival and service rates from a Markovian distribution. The results from the analytical approximations are compared to simulation models. In order to bridge the gap in error, correction terms were developed through regression modeling. The experimental study conducted indicates that the analytical approximations along with the correction terms can serve as a good estimate for the flow times of the manufacturing systems with the above characteristics. Ó 2011 Elsevier B.V. All rights reserved.
1. Introduction Parallel processing is prevalent in many manufacturing and service systems. Most manufactured products are built and assembled from several components fabricated in parallel lines. In assembly operations, some components have to wait for other components before the assembly operation can begin. Hence synchronization constraints arise between some workstations in the manufacturing process. Additionally, it is common to observe manufacturing systems that deal with multiple product classes and job circulation due to random part failures. An example of such a manufacturing system was observed at a facility equipped to assemble and test web servers. Servers are highly customizable and hence typically are madeto-order. The facility assembles multiple server types (or product classes) and the resources at some of the workstations are typically shared by two or more products. Fig. 1 shows the flow chart of a typical server assembly process. Each customer order may consist of one or multiple servers and depending on the customer need, each server can have one or multiple racks. Based on the configuration of the server, some portions of the assembly process can be
⇑ Corresponding author. Tel.: +1 8157536748. E-mail address:
[email protected] (P. Damodaran). 0377-2217/$ - see front matter Ó 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.ejor.2011.01.034
handled in parallel and some of them in series. The hardware components/parts which needs to be assembled for each rack are grouped together to form a kit, which later are assembled to a prefabricated subassembly in the build area. The racks are stored in a temporary storage area waiting for the remaining racks associated with the order to be ready. As soon as all the racks are built, they are merged together in the merge area. For customer orders with one rack, the merging process is bypassed. The merged assembly is later tested to detect early failures. When a failure is observed during the testing phase, the servers have to be disassembled, reassembled (failed components are replaced), and retested. This process is repeated until the server passes all the tests. Consequently, job/server circulation is typical under this environment. For simplicity the circulation of the server due to component failures is not shown in Fig. 1. Depending on the nature of the failure, the disassembly of failed components and reassembly of good components is carried out at the build and/or merge areas. Sometimes the disassembly and reassembly can also be done at the testing stations. When servers fail repeatedly at some stage(s), the rework and retesting can increase the time required to fulfill an order (i.e. flow time) and stress some assembly and test resources. This could lead to missed delivery dates and their respective penalties, loss of customer good will, and so on. Consequently, procedures to estimate performance measures of the system (e.g. flow time) accurately can help the operations
90
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
Customer Order
Print Parts/rack for each order
Kitting
Build
Yes Order with 1 rack?
Testing
Shipment
No
Wait for other racks to be built of the same order
Merge all the racks Extract all racks for an order
Fig. 1. Flow chart of the server assembly process.
managers to make more informed decisions, especially with respect to shipment dates, allocation of resources, etc. 2. Problem description In the server assembly manufacturing process some portions of the assembly can be handled in parallel and some of them in series. The kitting, merging, and testing process steps are done sequentially, whereas the rack assembly is done in parallel. Once all the racks are assembled, they are merged together before the entire server is subjected to a series of tests. This type of parallel processing with synchronization constraints can be modeled by using fork and join stations (Varki, 2001). In the fork and join subsystems each job/server is divided, at the fork point, into l identical subtasks (each subtask is associated with building a rack) that are submitted to a unique resource/station within the fork and join subsystem. The parallel queues are independent and identically distributed (i.i.d), and the number of subtasks can be less than or equal to the number of resources in the subsystem. On completing service, each subtask waits at the join point for its sibling subtask to complete service without preventing the resource from working on another subtask. A server is ready for merge operation if all of its racks have been assembled. Fig. 2 illustrates the fork-join queuing model which can be applied to model the rack building process of the server assembly line. At fork point, each order for a job/server is divided into identical subtasks (i.e. each subtask is to build a rack) that are to be processed independently. The number of racks depends on the type or configuration of the server ordered by the customer. After all the racks are independently built, they have to be merged in the subsequent merge station. However, the merge operation can begin only when all the racks are built. Consequently, a synchronization queue is needed at the join point as shown in the figure.
Sub-task service area
μ 1
Join point
Fork point . .
2 N Synchronization queue
Fig. 2. Fork-join queuing model.
Although our study is motivated by observing a web server assembly facility, it is easy to relate to many manufacturing facilities with similar characteristics (i.e. some assembly steps in series and some on parallel, job failures resulting in circulation, multiple products sharing same workstations or resources, etc.). Consequently, from hereon when discussing the problem and the solution approach we use terminology which is not specific to web server assembly. In this paper, we consider an open system where the products are allowed to leave the system and the external arrivals arise from an infinite population of products. It is assumed a collection of product types, which are routed through a finite collection of workstations, where some portions of the process can be handled in parallel and some of them in series. The resources at the station can be shared among different products. The jobs circulate probabilistically in the system depending on the nature of the failure, thus, the job may visit one or more workstations for multiple times. Job splitting or preemption is not allowed, and resource breakdowns are not considered. Arrival and service rates are assumed from a Markovian distribution. Jobs are processed according to a first-in-first-out (FIFO) discipline at each station, and all jobs/ products must leave the system. Fig. 3 presents a queuing network to illustrate the characteristics of the problem under study. In this example, two different products that share some workstations in the network are considered. Each product has its respective arrival, service, and failure rates at different stations, and is routed through a finite collection of workstations. Additionally, one portion of the process is handled in parallel, which is modeled as a fork-join node. The fork-join subsystem has three homogeneous stations where product 1 is split into two subtasks, and product 2 into three subtasks. Product 1 must visit workstation one, the parallel system (fork-join node) with two subtasks, and workstations five, six, and eight in that order (shown by dashed lines in the figure). On the other hand, product 2 must visit workstations one, the parallel system (fork-join node) with three subtasks, and workstations five, seven and eight. Although in web server assembly lines server failures are typically noticed after the testing stage, we consider failures at any stage in our study so that our discussions are applicable to other manufacturing systems where failures can occur at any stage in the assembly. Analyzing the performance measure of a complex manufacturing system, such as the web server assembly, with the characteristics described above is not easy. The primary objective of this research is to develop analytical models to estimate the flow time of a complex manufacturing system with job failures, multiple product classes, resource sharing, and with some process steps
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
91
2 Sub-Tasks Product 1 0.02 Failure rate 0.05 Failure rate
2
λ1 1
3
5
6
λ2 0.01 Failure rate
8
7
4
3 Sub-Tasks Product 2 Fig. 3. Problem representation.
executed in series and some in parallel. Accurate estimates can help an operations manager to make informed decisions. This paper is organized as follows. Section 3 presents the literature pertinent to the problem under study. Section 4 presents the analytical models proposed to estimate the flow time performance measure. Section 5 presents the experimental study conducted to verify the analytical estimates. The analytical estimates are first compared to simulation models. When the gap between the analytical estimates and simulation models are wide, appropriate error terms are developed to bridge the gap. Section 6 presents our conclusion and directions for future research. 3. Literature review Several queueing network model implementations have been used in the performance evaluation of manufacturing systems. Exact results exist for Markovian systems with the seminal papers of Jackson (1954) for job shops and Koenigsberg (1958) for cyclic systems. Several generalizations of these results are derived by Baskett et al. (1975), Kelly (1975, 1976) and Meyn and Down (1994). The desire of dealing with more complex systems, such as the problem under study, has led to the development of approximate analytical solutions. According to Bitran and Sarkar (1994) decomposition methods is one of them. This approach decomposes the queueing network into queueing subsystems each with a single station. It is based on two assumptions: (a) the nodes can be treated as being stochastically independent; and (b) the arrival and departure streams are renewal processes, which are characterized by the first two moments, mean and variance (Sunkyo, 2000). The pioneers in this field are Reiser and Kobayashi (1974), who implicitly used a decomposition approach to analyze open and closed queueing networks with general service time distributions and First-Come First-Serve (FCFS) discipline. Contributions were made by Chandy and Sauser (1978), Kuehn (1979), Shanthikumar and Buzacott (1981), Whitt (1983) and Bitran and Sarkar (1994). Typically manufacturing systems deal with different types of jobs whose service times and route are determined by the job type. This type of queueing networks have been approached in the past by different researchers. Baskett et al. (1975) extended the work of Jackson to multiclass networks for a variety of system configurations. Kelly (1975) worked with more general queueing networks. Whitt (1983) with the Queueing Network Analyzer (QNA) provided the option of defining different customer classes, and later, Segal and Whitt (1989) described a new version of the QNA software package, which was developed especially to analyze manufacturing lines. Job circulation can be found at different manufacturing environments. Flow of jobs between different stations where parts may
return more than once to the same machine, for repeated stages of processing, and where the number of visits would be fixed by the original sequence is referred to as re-entrant lines (Kumar, 1993). On the other hand, manufacturing systems with stochastic flow of jobs between different stations due to job failures is also observed. Under this scenario, the jobs entering the system have different repeated cycle requirements, contrasting with the deterministic re-entrant lines where jobs would have fixed cycle needs (Omar and Kumar, 2008). Examples of re-entrant lines include semiconductor wafer fabrication facilities Connors et al. (1996), Park et al. (2000), circuit card assembly, thin film lines, and systems with rework tasks (Narahari and Khan, 1996; Saboo et al., 1989). More recently, Pradhan et al. (2008) presented analytical approximations to estimate the performance measures of an optoelectronic manufacturing system with multiple product classes, job circulations due to failures, and some resources being shared among different product classes. Later Pradhan and Damodaran (2009) analyzed the same problem for G/G/c type of systems. There are some similarities between Pradhan et al. (2008) and the problem under study; however, they did not consider parallel processing in the manufacturing system. Fork-join queueing systems take place in the performance analysis of parallel processing in queueing models of manufacturing and computer systems. The analysis of such systems is analytically intractable due to the parallelism of the tasks (Nelson and Tantawi, 1988), and the synchronization, demanded by the forks and joins, that destroys all nice properties like product form or insensitivity (Baccelli et al., 1989). Thus, exact analysis of the fork and join queue is presented only for two server fork-join queues, requiring approximation and bounding techniques to compute performance measures of fork-join queues for more servers (Varki et al., 2008). Flatto and Hahn (1984) obtained asymptotic formulas for the steady state of an open queueing network with two heterogeneous servers, each having its own queue. Nelson et al. (1988) obtained an expression for the mean response time in centralized parallel processing systems, Nelson and Tantawi (1988) considered a distributed parallel processing system, Baccelli et al. (1989) studied the class acyclic fork-join system that takes place in applications like parallel processing and flexible manufacturing, Varki and Dowdy (1995) presented an exact analysis of mean response time of two identical parallel queueing systems, and Varki et al. (2008) presented simple pessimistic and optimistic mean time response bounds and a mean response time approximation RKN for a homogeneous fork-join queue, and exponential service times. The preceding discussion reveals that most of the research done on the fork-join, under the characteristic of the problem object of this study, has addressed parallel processing in parallel computer and storage systems environments, and rarely has been considered
92
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
with multiple products. From the literature reviewed, performance analysis of manufacturing systems considering simultaneously multiple product classes, job circulation due to random part failures, and resources been shared by multiple products have not been widely studied in the context of parallel processing. 4. Formulation of the analytical model for the M/M/c System As stated earlier, the web server assembly process introduces a variety and complexity of different issues such as, parallel processing with synchronization constraints, multiple product classes, resource sharing and job circulation due to random part failures, that make this type of systems difficult to analyze. Although many researchers have proposed analytical formulations for manufacturing systems with some similar characteristics, most of these previous efforts have not dealt with all the issues that in this study are considered. Especially, prior work on paralleling processing, with the same characteristics of the problem under study, has not taken into account multiple product classes and failure rates at the fork point. In this research, existing analytical formulations related to fork-join nodes are modified to incorporate the fact that a node can be visited by different product classes, and appropriate correction terms via regression analysis are added to the approximations in order to bridge the gap in the error between the analytical approximation and the simulation models. Exact analytical solutions have only been developed for parallel processing with two servers, and for more complex systems, such as the problem under study, approximation or bounding analysis has been used to predict performance measures. Parametric decomposition approximation method, based on the Whitt’s approach (Whitt, 1983), is used to analyze the problem under study. Under this method, the queueing network is decomposed into individual queues that are analyzed separately after approximately characterizing the aggregate arrival process. Later, the whole system performance is computed by aggregating the individual queue performances. By using this approach, we can analyze independently single (i.e. no fork-join stations) and fork-join stations within the same framework. The formulations for the M/M/c system presented here are based on the work done by Jackson (1954), Whitt (1983), the modifications done by Pradhan et al. (2008) to incorporate the probabilistic flow of jobs between stations in the network system, and the work done on fork-join queues with variable subtasks by Varki et al. (2008). The notation used is as follows: Notation k nk J sj Jk Fk qijk k0jk
ljk sjk NFJi lkFJi MkFJi
of jobs due to random failures. Since different product classes are considered, the average of the service time is calculated first to determine the traffic intensity at each node. Using the traffic rates and the average of the service time, the performance measures are calculated by analyzing each node independently. In the fork-join nodes, each server is modeled as M/M/1 queue with synchronized arrivals, and the response time is calculated by using the approximation that considers several sub-tasks as developed by Varki et al. (2008). This approximation is modified in order to take into account the fact that the nodes within the fork-join subsystem can be visited by different product classes, which can have different arrival and services rates at a given node. The remaining nodes are analyzed independently using the formulas for a single queue (Jackson, 1954) with the modifications to incorporate the probabilistic flow of jobs (Pradhan et al., 2008). The performance measures for the network as a whole are calculated by aggregating the individual performance measures. The approach to calculate the performance measures of the manufacturing process with multiple product classes, job circulation due to random part failures, resource sharing and parallel processing is as follows: 1. For all products k calculate: (a) Arrival rate to each node kjk: The net or effective arrival rate at each node can be computed by solving the system of linear Eq. (1). The linear equation takes into account the external arrival rate of product k to node j (i.e. k0jk) and the internal arrivals from all other nodes in the network.
kjk ¼ k0jk þ
Following the Whitt’s approach (Whitt, 1983) the parameters for the internal flows, and traffic rates are calculated by solving systems of linear equations. Later, the mean number of visits to each node is determined in order to deal with probabilistic flow
8j 2 Jk ; k 2 K:
qijk kik
ð1Þ
iJ k
(b) Proportion of arrivals to node j that came from node i is given by
pijk ¼
kijk kjk
8j 2 Jk ; k 2 K;
ð2Þ
where kijk = kikqijk. (c) Mean number of visits to each node tjk: The problem under study deals with probabilistic flow of jobs due to random failures; thus, a job may return more than once to the same station. The mean number of visits (tjk) to each node j for product class k can be calculated by solving the system of linear equations (3). This system of equations are very similar to the equations used for computing the net arrival rate.
tjk ¼ q0jk þ product class index k 2 K total number of nodes visited by product class k set of nodes number of servers at node j set of nodes visited by product class k set of fork-join nodes visited by product class k transition probability from node i to node j by product class k external arrival rate at node j by product class k service rate at node j for product class k mean service time at node j by product class k number of nodes at the fork-join i service rate for product class k at the fork-join i number of subtasks at the fork-join i by product class k
X
X
qijk tik
8j 2 Jk ; k 2 K:
ð3Þ
iJ k
2. Calculate aggregate service time sj for all nodes j: Since this research considers different product classes, which can have different service rates at a given node, an average of the service time need to be calculated. The aggregate service time at node j (sj) can be computed using Pradhan’s equation (Pradhan et al., 2008), which is a modified version of its deterministic equivalent given by Whitt (1983). The Eq. (4) calculates a weighted average of the service time taking into account the external arrival rate to node j, and the inputs from the other stations in the network.
P
sj ¼
s þ
k2K k0jk jk
P
k2K k0jk
þ
P
P
Pk2K Pi2Jk k2K
kijk sjk tjk
t
i2J k kijk jk
8j 2 J:
ð4Þ
3. Calculate net traffic intensity at node j
qj ¼
P
s
k2K kjk j
sj
8j 2 J:
ð5Þ
93
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
4. For all products k and all nodes j except fork- join nodes calculate: (a) Probability of zero jobs at node j by product class k
p0jk
20 n 1 sj 31 kjk kjk sj 1 X l ljk 6B C 7 jk ¼ 4@ Aþ 5 n! sj !ð1 qj Þ n¼0
8j 2 Jk ; k 2 K:
ð6Þ
kjk
6 Lqjk ¼ 4
ljk
3
qj 7 5p0jk 8j 2 J k ; k 2 K:
sj !ð1 qj Þ
ð7Þ
(c) Expected number of jobs at node j by product class k (number in queue + utilization of the server)
Ljk ¼
kjk
ljk
8j 2 Jk ; k 2 K:
þ Lqjk
8j 2 Jk ; k 2 K:
ð9Þ
(e) Expected time in service at node j by product class k
Wsjk ¼
1
ljk
8j 2 Jk ; k 2 K:
ð10Þ
5. For all fork-join subsystem i calculate: (a) Net traffic intensity at fork-join i: It is determined by the product of the net traffic intensity from Eq. (5), which considers different product classes, and an average of the server access probability, which is the probability that a sub-task is submitted to a server.
qFJi ¼
qi NFJi
P
k2K M k FJ i
jKj
ð11Þ
(b) Expected time Rki of a job at fork-join i i. If NFJi = 2, then the response time (Rki) for product class k in the fork-join subsystem i can be computed by using the approximation developed by Nelson and Tantawi (1988)
Rki ¼
12 qFJ i 1 8 lk FJi ð1 qFJi Þ
ð12Þ
ii. Else, the response time (Rki) is calculated using the approximation developed by Varki et al. (2008)
1 qFJi H þ ðSumMk FJi qFJi þ ð1 2qFJ i Þ lk FJi N 2ð1 qFJi Þ ð13Þ SumMk FJi ðMk FJi qFJi Þ Þ
Rki ¼
where HN ¼
NFJ i X 1 i i¼1
ð14Þ
1 1 1 1 SumMk FJi qFJi ¼ þ þ þ þ ð15Þ 1 qFJ i 2 qFJi 3 qFJi Mk FJ i qFJi 1 1 1 1 1 1 1 SumMk FJi ðMk FJi qFJi Þ ¼ þ þ þ þ : 1 qFJ i 2 2 qFJi 3 3 qFJ i Mk FJ i Mk FJi qFJ i
ð16Þ
6. For all nodes j (except fork-join nodes) calculate the average waiting time in queue
P Wqjk Wqj ¼ Pk2K k2K Z jk
8j 2 Jk ; k 2 K;
X
ðWqj þ Wsjk Þtjk þ
X
Rki :
ð18Þ
i2F k
The flow time is the summation of average time spent in queue and service at the single stations (i.e. non fork-join stations) and the response time at the fork-join station. The flow time estimated is compared to a simulation model to evaluate its accuracy. Detailed discussion on the experiments conducted to evaluate the accuracy of the analytical estimates is presented in next section.
ð8Þ
(d) Expected waiting time at node j by product class k (by Little’s law)
Lqjk Wqjk ¼ kjk
Total flow timek ¼
j2J k
(b) Number of jobs in queue at node j by product class k
2 s j
is computed in Eq. (9). The total waiting time at node j is divided by the total number of products visiting node j to get the average waiting time. 7. For all products k calculate the total flow time
ð17Þ
where Zjk is 1 if node j is visited by product class k and 0 otherwise. The waiting time at node j for each product class k
5. Computational experiments and results Ninety problem instances were designed to test the accuracy of the analytical estimates. Each problem instance is defined by the number of nodes in the network, number of servers at each node, number of product classes, nodes that are shared by different product classes; and for each product class, its arrival and service rates and the transition probability matrix. The number of nodes for each problem instance was varied from five to eight, product classes from one to six, and shared nodes from 43% to 100% of the total number of nodes considered in the problem instance. The number of fork and join subsystems considered was either one or two, and the number of servers and subtasks in it two, three or four. The location of the fork-join nodes in the network was varied from node one to last node 1. The transition probability matrix for each product class was randomly generated, varying failure rates from 0.01 to 0.2. Arrival and service rates were selected in such way that the service rates at all stations were greater than the arrival rates. All the equations stated in the analytical formulation were implemented in Matlab 7.0. The simulation models were developed and implemented in Arena 11 simulation package, taking into account the portions of the assembly process that are handled in series and in parallel with synchronization queues. Fig. 4 presents a very high level block diagram of the logic implemented in Arena for the example network shown in Fig. 3. The flow due to failures is not shown in the block diagram. The logic for one non fork-join station is shown in the block diagram to conserve space. Two ‘‘create’’ blocks are used to create the arrivals for the two product classes. Using an ‘‘assign’’ block the entity attributes are stored. For example, the product class and the number of sub tasks at the fork-joint node are stored as attributes.The ‘‘route’’ block and ‘‘sequence’’ elements in Arena were used to route the different product classes. The ‘‘decide’’ block in Arena was used to model the failures. In order to model the non fork-join stations, the ‘‘process’’ (seize-delay-release) block in Arena was used. The ‘‘duplicate’’ block (separate) in Arena was used to model the job division at the fork point. Each job is duplicated (or divided) into MkFJi, identical subtasks (which depends on the product class k) that are submitted to a unique server/resource/ station within the fork and join subsystem. The ‘‘group’’ block in Arena was used to model the activity where the subtask waits at the join point for its sibling subtask to complete service without preventing the server from working on another subtask.In the group block, the entity attributes were used to match and group the appropriate number of entities. Ten replications were conducted for each experiment, and the run length of the simulation model was set as 30,000 time units of warm-up plus 500,000 time units (roughly 30,000 jobs).
94
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
Create Product Class 1
Assign Attribute Values
Route
Single Station
Seize-DelayRelease
Route
Create Product Class 2
Assign Attribute Values
Route
Seize-DelayRelease
Fork-Join Station
Duplicate l entities to represent each subtask for product class k
Batch l entities based on the attribute of the product class
Seize-DelayRelease
Route
Seize-DelayRelease
Fig. 4. Simulation logic.
25
100%
20
80%
15
60%
10
40%
5
20%
0
0%
Flow Time Error (%) Fig. 5. Overall flow time distribution.
6
120%
5
100%
4
80%
3
60%
2
40%
1
20%
0
0% .7
120%
<7
Frequency Cumulative %
1. 3 6. 9 12 .5 18 .1 23 .7 29 .3 34 .9 40 .5 46 .1 51 .7 57 .3 62 .9 68 .5 74 .0 79 . <8 6 5. 3
Frequency
30
5. 1
In order to show the nature of the distribution of the error for the flow time, and identify what proportion of data points fall into
2. 6
Flow timeapprox Flow timesimulation 100: Flow timesimulation ð19Þ
0. 0
Relative approx error ¼
each interval, a histogram of the errors was plotted for all problem instances (see Fig. 5). The graph shows that the error seems to be distributed between zero and 85%. Detailed analysis was done by plotting the error for the different types of problem instances, and by conducting a paired-t test to determine whether there is significant difference between the flow time from the simulation models and the analytical formulations. Fig. 6 displays that the error for most of the one product problem instances was between zero and 5%, and the maximum error was 7.7%. The data strongly suggests that the means are the same at 0.05 level of significance. Fig. 7 shows that the error for 86% of two product instances was between zero and 10%. The data strongly suggests that at 0.05 level of significance the means are the same. In the case of three product classes, Fig. 8 displays that the error for 87% of the instances was between 1% and 26%, and the data strongly suggest that the means are different at 0.01 level of significance. Fig. 9 shows that the error was between 7% and 35% for 81% of the four product instances. The data strongly suggest that the means are different at 0.01 level of significance. In Fig. 10, the error is shown for five product classes. The error for 91% of the instances appears to be distributed between 7% and 73%. Fig. 11 displays that the error for 94% of the six product instances was between 21% and 77%. The data strongly suggest that the means are different at 0.01 level of significance.
Frequency
Simulation models for all the ninety problem instances were developed in Arena. The number of product classes considered (one to six), product yield at different nodes (i.e. transition probabilities), the number of fork-join nodes (one or two) and their location (first stage to penultimate stage), the number of sub tasks in the fork-join nodes (two, three, or four), the total number of nodes in the network (five to eight), and all other parameters were identical in both the analytical and simulation models. In essence, for each network configuration solved using analytical models, equivalent simulation models were developed and simulated to obtain appropriate performance measures of interest. The performance measure of interest is the product flow time in the system. The tightness of the flow time approximations is verified by comparing against simulation results. The error in the flow time approximation is calculated by using
Flow Time Error (%) Fig. 6. Flow time distribution for one product.
Frequency Cumulative %
95
Frequency
18 16 14 12 10 8 6 4 2 0
120%
Cumulative %
100%
Frequency
120%
12
Cumulative %
100% 80%
8 60% 6
40%
4
20%
2
0%
0
40% 20% 0%
7.
1
6.
<1
.6
12
14
.2
7
3 7.
9.
9 4.
2.
0.
5
60%
14
10
Frequency
80%
1
9
.4 16
. 25
8
Flow Time Error (%)
Frequency
. 44
7
. 54
1
.5 63
. 72
9 <
. 82
4
Fig. 10. Flow time distribution for five products.
25
120%
Cumulative %
100%
120%
Frequency Cumulative %
20
100%
60% 40%
Frequency
80%
80% 15 60% 10 40%
20%
5
5
20%
9. <3
Fig. 8. Flow time distribution for three products.
3
.2
5. <8
77
.2 69
.2 61
.2 53
.1
.1 37
21
.1
0%
.1
0
Flow Time Error (%)
29
.1 33
.7 26
.3 20
.0 14
7.
1.
6
0%
27
Frequency
2
Flow Time Error (%)
Fig. 7. Flow time distribution for two products.
16 14 12 10 8 6 4 2 0
. 35
45
Frequency
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
Flow Time Error (%)
Frequency
120%
12
Cumulative %
100%
10
90
80%
8
80
60% 40%
4
9 0. <4
35
.9 29
24
18
13
7.
.4
0% .4
0 .9
20%
.4
2
Flow Time Error (%) Fig. 9. Flow time distribution for four products.
Fig. 12 presents the percentage of error in flow time for problems where the number of subtasks is less than the number of servers in the fork-join nodes. The flow time error for 67.5% of the data points was greater in the problem instances where the number of subtasks were less than the number of servers in the fork-join nodes. A paired-t test was conducted to determine whether there is significant difference between the flow time errors. The data strongly suggests that the flow time errors are not the same at 0.01 level of significance. Based on these results, the error appears to be increasing as the number of products increases, and when the number of subtasks is less than the number of servers in the fork-join nodes. The flow time was obtained from the analytical and simulation models, and the relative percentage error was calculated between
70
Flow Time Error (%)
6
9
Frequency
Fig. 11. Flow time distribution for six products.
14
Subtask less than number of servers Subtasks equals number of servers
60 50 40 30 20 10 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
Data Points Fig. 12. Flow time error for problems with less subtasks than number of servers at the fork-join nodes.
them. Based on the data set, the flow time error from three to six products was roughly between 1% and 85%, and the data strongly suggest that the flow time means for those problem classes are different at 0.01 level of significance. In order to improve the results for Markovian system instances with more than three products, efforts were made to develop regression equations to predict the error between the analytical and simulation results for single (i.e. no
96
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
Average Error (%)
120.0
Service time average error
100.0
118.6 104.5
Waiting time average error
80.0 67.2
60.0 44.8
40.0 20.0
0.2
0.2
0. 2
0. 2
0.0 3
4
5
6
Number of Products Fig. 13. Waiting and service time error.
250
y = 10.717e 3.4596x 2 R = 0.9969
Flow Time Error (%)
Three products Four products
200
Five products Six products
150
3.1418x
y = 12.834e 2 R = 0.9955 100
fork-join) and fork and join stations. Since the flow time in the network is the aggregation of the flow times at each station, single station instances were considered to develop the correction factors. Therefore, the error in the flow time is adjusted at each station. Thirty-two problem instances (144 data points) for no fork-join stations were generated in order to predict the flow time of the problem under study. The number of product classes were varied from three to six, and failure rates from zero to 35%. After analyzing the data, it was identified that the main error of the flow time came from the waiting time (see Fig. 13), the variables affecting the results the most were the number of products in the network and the net traffic intensity at a node, and there was a nonlinearity in the parameters (see Fig. 14). From Fig. 13 it is evident that the error due to the service time is nearly zero, but the error due to waiting time is increasing as the number of products increased. Consequently, the correction term for the waiting time in queue is developed. The correction term was developed by analyzing the behavior of the error in the waiting time. Microsoft Excel was used to determine the best regression equation. Fig. 14 shows the regression equation by products as a function of the net traffic intensity (denoted as x in the figure). Later, by using those equations it was possible to determine the regression model in terms of the net traffic intensity and the number of products (see Fig. 15). The final regression equation is given by
3.3069x
y = 10.33e R 2 = 0.9928
50
b E i ¼ ð0:8329 p2 þ 8:2267 p
3.2402x
y = 9.1155e 2 R = 0.9823
0 0.2
0.3
0.4
0.5
0.6
8:3643Þexpð0:0628p
0.7
0.8
0.9
Net Traffic Intensity Fig. 14. Flow time error in the waiting time.
14 y = -0.8329x 2 + 8.2267x - 8.3643 R2 = 0.7571
Parameter Value
12 10
Parameter 1 Parameter 2
8 6
Prod 3 4 5 6
y = 0.0628x 2 - 0.5157x + 4.258 R2 = 0.5223
4
P1 9.1155 10.33 12.834 10.717
P2 3.2402 3.3069 3.1418 3.4596
Eb F J ik ¼ 7:0274 2:85 p 4:4 NFJ i þ 2:62 M k FJ i 1:08 f þ 10:95 qFJ i
4
6
8
Number of Products Fig. 15. Regression equation parameters in terms of the number of products.
70.00
y = 409.18x - 64.292 R2 = 0.918
60.00
Flow Time Error
ð21Þ
where E b F Jik is the predicted flow time error at the fork-join i of product class k, p the number of products at the fork-join node i, NFJi the number of nodes at the fork-join i, MkFJi the number of subtasks at the fork-join i by product class k, f is the failure rate at the forkjoin node i, and qFJi is the net traffic intensity at the fork-join i.
0
2
y = 1738.2x - 67.944 50.00 R2 = 0.9892
y = 234.46x - 64.61 R2 = 0.8496
y = 945.19x - 66.849 R2 = 0.9877
40.00
y = 168.96x - 63.791 R2 = 0.7332 1 Product 2 Product 3 Product
30.00
4 Product
20.00
5 Product y = 152.48x - 68.613 R2 = 0.7347
10.00 0.00 0
0.1
0.2
0.3
ð20Þ
where b E i is the predicted error in the waiting time at node i, p the number of products in the network and qi is the net traffic intensity at node i, which is calculated by Eq. (5). In the case of fork and join stations, two hundred ten problem instances (1890 data points) were generated. The number of product classes were varied from one to six, failure rates from zero to 30%, and the number of servers from two to four. Fig. 16 displays a linear relationship between the flow time error and the net traffic intensity. The same relationship was observed for the variable failure rate. The Statistical Analysis Software (SAS) was used to determine the parameters of the model, and the stepwise procedure was used to identify the significant variables in the regression model. The regression equation with a R-squared of 0.5 is given by
2
0
2 0:5157pþ4:258Þq i
0.4
0.5
0.6
0.7
0.8
Net Traffic Intensity Fig. 16. Flow time error from fork-join nodes against net traffic intensity.
6 Product
97
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
120%
45 40
100%
Frequency
35
80%
30 25
Frequency
20
Cumulative %
15 10 5
60%
The histogram indicates that the error seems to be distributed now between zero and 20%, and the error in 90% of the problem instances (207 out of 229 data points) was between zero and 13%. On
Flow time error Flow time error (adjusted)
45 40
Flow Time Error (%)
Based on this regression equation, it is evident that the number of products, the number of subtasks, the failure rate at the fork point, and the net traffic intensity at the fork-join nodes significantly affect the flow time. Fig. 17 shows the distribution of the flow time error when the correction factors are used in the ninety problem instances, and Fig. 18 shows the performance of the approximations by comparing the flow time error with and without the correction factors.
35 30 25 20 15 10
40%
5
20%
0
1
4
7 10 13 16 19 22 25 28 31 34 37 40 43 46
0%
0. 1 1. 5 2. 8 4. 2 5. 6 7. 0 8. 3 9. 7 11 .1 12 .5 13 .8 15 .2 16 .6 18 .0 19 <2 .3 0. 71
0
Flow Time Error (%)
Data Points Fig. 20. Flow time error against flow time error using correction factors for four products.
Fig. 17. Flow time error using the correction factors.
90
Flow time error Flow time error (adjusted)
80
80
Flow time error
70
Flow time error (adjusted)
70
60 50 40 30
Flow Time Error (%)
Flow Time Error (%)
90
20
60 50 40 30 20 10
10
0 1
0 1
Data points
16 31 46 61 76 91 106 121 136 151 166 181 196 211 226
Data points
4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64
Fig. 21. Flow time error against flow time error using correction factors for five products.
Fig. 18. Flow time error against flow time error obtained using correction factors.
90
Flow time error Flow time error adjusted
45 80
Flow time error Flow time adjusted
35 30 25 20 15 10
70
Flow Time Error (%)
Flow Time Error (%)
40
60 50 40 30 20 10
5
0
0
1
4
7 10 13 16 19 22 25 28 31 34 37 40 43
Data Points Fig. 19. Flow time error against flow time error using correction factors for three products.
1
5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69
Data Points Fig. 22. Flow time error against flow time error using correction factors for six products.
98
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99
average the error was reduced approximately by 85.36%. Detailed analysis by product is presented in Figs. 19–22. Fig. 19 displays that the error for 93% of the three product problem instances (42 out of 45 data points) was between zero and 16%, and the maximum error was 19%. On the other hand, Fig. 20 shows that for four products the error for 92% of the problem instances (44 out of 48 data points) was between zero and 17%. Fig. 21 shows that for five product problem instances the error for 95% of the problem instances (61 out of 64 data points) was between zero and 18%. In the case of six products, Fig. 22 displays that the error was between zero and 17%, and the maximum error was 19%.
6. Conclusions Analytical models to estimate the performance measures for Markovian type of manufacturing systems with multiple product classes, job circulation due to failures, resource sharing, and fork and join systems to model parallel processing was proposed. Ninety problem instances were designed to test the accuracy of the analytical estimates, and simulation models for these problem instances were developed. The flow times from the analytical approximations and simulation were compared to evaluate the accuracy of the analytical approximations. It was observed that error, in 96% of the problem instances with one and two products, was between zero and 15%. For problem instances with three to six products the error was between 1% and 85%. Analysis of results indicate that the error is increasing as the number of products increases, and when the number of subtasks is less than the number of servers in the fork-join nodes. In order to improve the results for Markovian system instances with more than two products, appropriate correction terms via regression analysis were added to the approximations in order to bridge the gap in the error between the analytical approximation and the simulation models. Multiple problem instances for no fork-join stations and fork-join stations were generated to determine the correction factors. For no fork-join stations, the correction term was developed analyzing the behavior of the error in the waiting time and using Microsoft Excel to determine the best regression equation. In the case of fork and join stations, a straight line relationship was observed among the variables, and SAS was used to determine the parameters of the model. By using this correction factors, the error seems to be distributed between zero and 20%, and the error in 90% of the problem instances was between zero and 13%. The numerical comparisons show that on average the flow time error was reduced about 85.36%. The average error decreased from 39.19% to 5.69%. The queueing network model proposed improves previous formulations in two main ways. First, it can handle serial and different configurations of paralleling processing in the same framework, with multiple product classes, and job circulation due to random part failures. Second, the analytical formulations have appropriate correction terms that minimize the gap in the error between the analytical approximation and the simulation models. The analytical approximations can help operations managers of web server assembly lines, and manufacturing or other service systems with similar characteristics, to estimate system performances. By estimating the system performances accurately, the operations manager may be able to make judicious decisions – especially setting delivery due dates, capacity planning, bottleneck mitigation, among others. These analytical approximations can also be used in a capacity planning model to determine the optimum number of resources required at each process step depending upon the production load at any given point in time. The objective could be to minimize the flow time and some of the constraints to consider are the total number of servers that can be used, lead time
restrictions, budget, etc. The analytical approximations developed to estimate the flow time can be used directly in the objective. The objective function will be nonlinear. Excel solver add-ons or any other commercially available solvers can be used to solve these nonlinear models. This can help to manage the resources optimally and expedite the orders. The analytical approximations developed in this research can also help the warehousing and healthcare professionals. For example, in a warehouse when customer orders arrive the order is typically broken into smaller orders, so multiple order pickers can travel along the aisles in the warehouse to pick the requested items in the shortest possible time. After the items are picked individually, a pallet might have to be built. The person who builds the pallet is also responsible to verify if the items picked by the order pickers were the ones actually requested. If incorrect items were picked, then the order picker has to repick. Consequently, the order picking process is similar to the fork-join subsystem, pallet building and loading the pallet into the truck are similar to non fork-join stations. The analytical approximations proposed can help to identify the average time required to fill a customer order. Parallel and serial processing is also witnessed in healthcare facilities. For example, an in-patient undergoing different laboratory tests (e.g. X-ray, blood work, ECG) has to wait to see a doctor until all the reports are ready. Appropriately managing the hospital resources is essential to cut down healthcare cost. Patient turn around time can be one of the measures used to estimate cost. The approximations proposed can help to estimate the average time spent by patients in a hospital.
Acknowledgement This research work was partially funded by National Science Foundation through a Grant (DMS0739197).
References Baccelli, F., Massey, W., Towsley, D., 1989. Acyclic fork-join queueing networks. Journal of the ACM 36 (3), 615–642. Baskett, F., Chandy, K., Muntz, R., 1975. Open, closed, and mixed networks of queues with different classes of customers. Journal of the ACM 22 (2), 248–260. Bitran, G., Sarkar, D., 1994. Throughput analysis in manufacturing networks. European Journal of Operational Research 74, 448–465. Chandy, K., Sauser, C., 1978. Approximate methods for analyzing queueing network models of computing systems. ACM Computing Surveys 10 (3), 281–317. Connors, D., Feigin, G., Yao, D., 1996. A queueing network model for semiconductor manufacturing. IEEE Transactions on Semiconductor Manufacturing 9 (3). Flatto, L., Hahn, S., 1984. Two parallel queues created by arrivals with two demands i. SIAM Journal on Applied Mathematics 44 (5), 1041–1053. Available from:
. Jackson, R., 1954. Queueing systems with phase type service. Operations Research Quartely 5 (2), 109–120. Kelly, F., 1975. Networks of queues with customers of different types. Journal of Applied Probability 12 (3), 542–554. Kelly, F., 1976. Networks of queues. Advances in Applied Probability 8 (2), 416–432. Koenigsberg, E., 1958. Cyclic queues. Operations Research Quarterly 9 (1), 22–35. Kuehn, P.R., 1979. Approximate analysis of general queuing networks by decomposition. IEEE on Communications 27 (1). Kumar, P.R., 1993. Re-entrant lines. Queueing System 13, 87–110. Meyn, S., Down, D., 1994. Stability of generalized jackson networks. The Annals of Applied Probability 4 (1), 124–148. Narahari, Y., Khan, L.M., 1996. Modeling reentrant manufacturing systems with inspection stations. Journal of Manufacturing Systems 15 (6), 367–378. Nelson, R., Tantawi, N., 1988. Approximate analysis of fork/join synchronization in parallel queues. IEEE Transactions on Computers 37 (6), 739. Nelson, R., Towsley, D., Tantawi, A., 1988. Performance analysis of parallel processing systems. IEEE Transactions on Software Engineering 14 (4), 532– 540. Omar, M.K., Kumar, S., 2008. Performance analysis in a probabilistic re-entrant for an environmental stress testing operation. Computers & Industrial Engineering 54, 932–944. Park, Y., Kim, S., Jun, C.H., 2000. Performance analysis of re-entrant flow shop with single-job and batch machines using mean value analysis. Production Planning & Control 11 (6), 537–546.
M. Hulett, P. Damodaran / European Journal of Operational Research 212 (2011) 89–99 Pradhan, S., Damodaran, P., 2009. Performance characterization of complex manufacturing systems with general distributions and job failures. European Journal of Operational Research 197, 588–598. Pradhan, S., Damodaran, P., Srihari, K., 2008. Predicting performance measures for markovian type of manufacturing systems with product failures. European Journal of Operational Research 184, 725–744. Reiser, M., Kobayashi, H., 1974. The effects of service time distributions on system performance. Informs Process, 74. Saboo, S., Wang, L., Wihelm, W.E., 1989. Recursion models for describing and managing the transient flow of materials in generalized flowlines. Management Science 35 (6), 722–742. Segal, M., Whitt, W., 1989. A queueing network analyzer for manufacturing. In: Proceedings of the 12th International Teletraffic Conference 2, pp. 1146–1152.
99
Shanthikumar, J., Buzacott, J., 1981. Open queueing network models of dynamic job shops. International Journal of production Research 19 (3), 255–266. Sunkyo, K., 2000. Parametric decomposition approximation of queueing networks. Ph.D. Thesis, Purdue University. Varki, E., 2001. Response time analysis of parallel computer and storage systems. IEE Transactions on Parallel and Distributed Systems 12 (11), 1146–1161. Varki, E., Dowdy, L., 1995. Response time analysis of two server fork-join systems. Available from: . Varki, E., Merchant, A., Chen, H., 2008. The m/m/1 fork-join queue with variable sub-tasks. Available from: . Whitt, W., 1983. The queueing network analizer. The Bell System Technical Journal 62 (9), 2779–2815.