Journal of Manufacturing Systems
Vol. 23/No. 3 2004
Analyzing Throughput and Capacity of Multiproduct Batch Processes
Jose H. Noguera, Dept. of Management & Marketing, Southern University, Baton Rouge, Louisiana, USA
Edward F. Watson, Information Systems & Decision Sciences Dept., Louisiana State University, Baton Rouge, Louisiana, USA
Abstract
The measurement of system performance almost always involves multiple performance measures, and most often these measures conflict to some degree. As a result, it is often difficult to assess the relationship of individual input factors to the performance of the simulation model. This paper presents an application of simulation, multivariate statistics, and simulation metamodels to analyze throughput of multiproduct batch chemical plants.
A detailed discrete-event simulation model is extensively utilized by a major chemical process facility to evaluate new plant designs. This model is further exploited using known statistical techniques to provide valuable sensitivity analysis feedback to engineers for process analysis and improvement. A framework is proposed to guide the implementation and application of multivariate statistics and simulation metamodels to analyze simulation output of multiproduct batch processes.
Keywords: Multiproduct Batch Process, Regression Metamodel, Multivariate Statistics, Simulation
Introduction
The analysis of chemical batch processing systems is a major challenge due to the unique production environment (Watson 1997):
1. wide range of product types,
2. floating bottlenecks,
3. flexible and specialized process equipment,
4. multistage production chain,
5. sensitive process specifications and stringent quality standards,
6. complex set of interconnected equipment,
7. limited number of specially trained operators,
8. minimal flexibility to accommodate work-in-process inventory, and
9. products with a relatively short shelf life.

Thus, computer simulation models have been effectively used for the analysis of chemical batch processing systems (Watson 1997; White 1989; Felder, McLeod, and Moldin 1985; Felder 1983). Simulation models of multiproduct batch systems are usually highly complex and of relatively high dimensionality; that is, the performance of the simulation model is dependent on a large number of parameters or input factors that act and interact in a complex manner.

Conceptual Framework
In this study, a framework is proposed to assist in the analysis of multiproduct batch processes (Figure 1). The framework is then implemented to perform throughput analysis of a chemical batch process simulation. Assuming a valid simulation model, there are key strategy questions to resolve when developing a simulation metamodel (Donohue 1994; Kleijnen, Ham, and Rotmans 1992; Sargent 1991). This study focuses on the questions of which input and output variables should be simulated, and how to analyze the resulting output. Multivariate statistics are applied to perform screening analysis to identify the main input and output variables to be included in the simulation metamodel. Then regression analysis is applied to develop simulation metamodels for sensitivity analysis. The main concern here is the issue of how to analyze multiple response variables.

[Figure 1 diagram. Boxes: Real World; Simulation Model (inputs, outputs); Screening Analysis using Multivariate Statistics (multivariate analysis of variance, discriminant analysis); Simulation Metamodel; Sensitivity Analysis using Regression Analysis.]
Figure 1 Framework for Analysis of Multiproduct Batch Process Simulation Output
Screening Analysis
Multivariate Statistics
The objective of this study is to provide a real-world application of multivariate statistical techniques that can potentially be used for analyzing multiple response simulation data. In this study, we propose to apply multivariate statistics techniques as a screening approach to determine the most important parameters and response measures. Multivariate statistics is a broad category of statistical techniques that enables one to describe and measure interrelations among sets of variables (Johnson and Wichern 1992; Rencher 1995). The techniques include: (a) multiple regression analysis, (b) multivariate analysis of variance, (c) discriminant analysis, (d) multiple factor analysis, and (e) canonical correlation analysis.

Multivariate statistics techniques have not seen a strong application in simulation output analyses, although most simulation models are multivariate by nature. Friedman (1984, 1985, 1986) demonstrated the application of multivariate techniques to analyze simulation output of queuing systems. However, the majority of the papers on simulation output analysis have been restricted in scope to simulation experiments in which only a single response variable (output) is of interest (Kleijnen and Sargent 2000; Bauer 1985; Friedman 1985, 1987; Yang and Nelson 1988).

In this paper, multivariate analysis of variance (MANOVA) and discriminant analysis (DA) are used. MANOVA is used to assess the statistical significance of the effect of one or more independent variables on a set of two or more dependent variables (Rencher 1995; Johnson and Wichern 1992; Friedman 1985). It provides a simultaneous analysis of multiple independent and dependent variables. The MANOVA technique allows the researcher to test for the equality of multivariate means and to control the overall Type I error rate more effectively than using separate analysis of variance tests (ANOVA). In addition, any intercorrelation among the response variables is taken into account when conducting a MANOVA (Rencher 1995). The basic MANOVA question is whether or not there are any overall (interaction, main) effects present. In addition, this technique provides a solution to questions pertaining to variable selection, variable ordering, and identifying system structure (Huberty and Morris 1989). In this study, MANOVA is used as a variable selection technique to determine if fewer response variables than the total number initially chosen should form a basis for interpretation. The importance of this procedure is that it takes into consideration the correlation among the response variables (Rencher 1995).

Discriminant analysis (DA) is used in the analysis of associated data to account for variations in a variable in terms of the variations in other variables. When we say that two or more variables are associated, we mean that the values they take on tend to vary together. DA is used to determine how one or more independent variables can be used to discriminate among different categories of a nominal (or nonmetric) dependent variable (Spector 1980; Rencher 1992, 1995). In general, the term discriminant analysis refers to several different types of analysis (SAS/STAT User's Guide 1990). Among them, canonical discriminant analysis is a dimension-reduction technique related to principal components and canonical correlation. Stepwise discriminant analysis is a variable-selection technique. It is expected that canonical discriminant analysis as well as stepwise discriminant analysis will give a framework for ranking the dependent variables based on the contribution to group separation.
Simulation Metamodel
A simulation model is a representation of a real-world system, whereas a metamodel is a mathematical approximation of a simulation model (Figure 2). Metamodels are developed to obtain a better understanding of the nature of the true relationship between the input variables and the output variables of the real system under study. A model of a real system can be represented as Eq. (1) in Figure 2, where $\eta_u$ is the value of the uth system response; the q x's are the factors, either controllable or environmental, that determine the value of the system response; $g_u$ represents the (unknown) relationship between the factors and the uth system response; and p refers to the number of outcome measurements (responses). Thus, the objective of the simulation modeling is to approximate this unknown relationship ($g_u$) adequately to study the system in ways that would be impossible or impractical in the physical world (Friedman 1987).

The simulation model of the real system [an approximation of Eq. (1)] can be represented by Eq. (2) with a function $f_u$, where p is the number of responses; n represents the number of replications; k is the number of input variables (probably less than q); $x_{ik}$ is the value of the kth input variable in the ith replication; $y_{iu}$ is the value of the uth response variable in the ith replication; and $\varepsilon_{iu}$, the experimental error for the uth response in the ith replication, is implemented via the random number streams upon which the simulation depends (Friedman 1987). The simulation model represented by Eq. (2), although simpler than the real-world system [Eq. (1)], is often a complex way of relating input to output. Thus, the goal is to approximate this relationship by a simpler mathematical function, represented by Eq. (3), called a metamodel (Kleijnen 1987; Friedman 1987).

Real System: $\eta_u = g_u(x_1, x_2, \ldots, x_q), \quad u = 1, \ldots, p$ (1)
Simulation Model: $y_{iu} = f_u(x_{i1}, x_{i2}, \ldots, x_{ik}) + \varepsilon_{iu}, \quad i = 1, \ldots, n; \; u = 1, \ldots, p$ (2)
Simulation Metamodel: $Y_u = \beta_0 + \beta_1 X_{u1} + \cdots + \beta_k X_{uk} + \varepsilon_u, \quad u = 1, \ldots, n$ (3)
Figure 2 Conceptual Representation of Relationship Among a Real System Model (1), Simulation Model (2), and Simulation Metamodel (3) [Kleijnen and Sargent 2000]

Although metamodels can have various functional forms (f) (Barton 1992), the type of metamodels most commonly used in simulation studies are polynomial regression models (Durieux and Pierreval 2003; Dengiz and Akbay 2000; Madu and Kuei 1994; Kleijnen 1979, 1981, 1987). In this study, we use the first-order model represented by Eq. (3), where $u = 1, \ldots, n$ is the simulation run number, $Y_u$ is the value of the system performance of the uth observation (often the mean of observations collected during a simulation run), $X_{uk}$ is the setting of the kth input factor on the uth simulation run, $\beta_k$ is the effect of factor k (model coefficients to be estimated using regression analysis), $\varepsilon_u$ is the unexplained error in the regression model, $\beta_0$ is the grand mean, and n is the number of simulation runs (Madu and Kuei 1994; Donohue 1995). In the model represented by Eq. (3), one assumes there is no interaction effect, although interaction effects can be added to the model. Each factor has an additive influence independent of the values of the other factors. This reduces the size of the experiment considerably (Donohue 1995; Madu and Kuei 1994).
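As a minimal sketch of how the coefficients of the first-order metamodel in Eq. (3) might be estimated, the Python fragment below assumes a balanced two-level design coded -1/+1, in which case the least-squares estimates reduce to simple averages. The function name `fit_first_order` and the example data are hypothetical and serve only as illustration; they are not taken from the study, which reports using SAS.

```python
from itertools import product


def fit_first_order(design, responses):
    """Estimate [beta_0, beta_1, ..., beta_k] for the first-order model
    Y_u = beta_0 + sum_k beta_k * X_uk + eps_u.

    Assumption: `design` is a balanced, orthogonal two-level design coded
    -1/+1, so each least-squares estimate is a simple average.
    """
    n = len(design)
    k = len(design[0])
    beta0 = sum(responses) / n  # grand mean
    betas = [
        sum(row[j] * y for row, y in zip(design, responses)) / n
        for j in range(k)
    ]
    return [beta0] + betas


# Hypothetical example: two factors, four runs, noise-free response.
design = list(product([-1, 1], repeat=2))
responses = [10.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in design]
coefficients = fit_first_order(design, responses)
```

With noise-free data the sketch recovers the generating coefficients exactly; with replicated, noisy simulation output the same averages give the usual least-squares estimates for an orthogonal design.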
Design of Simulation Metamodel Experiment
Once the metamodel form represented by Eq. (3) has been proposed, experimental data must be collected to estimate the $\beta$ coefficients. Specifically, information on the response variables $Y_u$ at a variety of input conditions $X_{uk}$ is needed. For the first-order model in Eq. (3), the most commonly used experimental designs are the two-level (full or fractional) factorial plans. These designs minimize the variances of the estimated $\beta$ coefficients (Montgomery 1991). Box, Hunter, and Hunter (1978) stated that factorial designs at two levels provide a series of advantages to the researcher. For example, they can indicate major trends and so determine a promising direction for further exploration; they also form the basis for two-level fractional factorial designs; and the interpretation of the result is easy. In general, the use of a full factorial design is appropriate when constructing metamodels if the number of input factors is small (say, $k \le 5$). On the other hand, if there are too many factors (say, $k > 5$), it is recommended to rely on fractional factorial designs with higher resolutions and/or factor screening and follow-up designs (Madu and Kuei 1994).

[Figure 3 diagram. Areas shown: Fabrication (mix tanks, pass and transfer), Transfer (manifold, transfer and hold, holding tank), Inventory (finished product storage tanks 1 through T), and Production (downstream demand rate for each storage tank).]
Figure 3 Multiproduct Batch Chemical Facility
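The two-level plans discussed above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names (`full_factorial`, `half_fraction`, `is_orthogonal`) are hypothetical, and the half fraction uses one common generator (last factor equal to the product of the others), which is an assumption rather than a design taken from the study.

```python
from itertools import product


def full_factorial(k):
    """Return all 2**k runs of a two-level design, coded -1/+1."""
    return [list(run) for run in product([-1, 1], repeat=k)]


def half_fraction(k):
    """Return a 2**(k-1) fractional design in which the last factor is the
    product of the other k-1 factors (defining relation I = 12...k)."""
    runs = []
    for base in product([-1, 1], repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(list(base) + [last])
    return runs


def is_orthogonal(design):
    """True if every pair of factor columns has a zero dot product."""
    k = len(design[0])
    return all(
        sum(row[a] * row[b] for row in design) == 0
        for a in range(k)
        for b in range(a + 1, k)
    )


full = full_factorial(3)   # 8 runs
half = half_fraction(3)    # 4 runs
```

Orthogonal columns are what allow each main effect to be estimated independently of the others, which is why these plans suit the first-order model.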
Simulation Metamodel Validation
Metamodel validation is dependent on the purpose of the metamodel (Kleijnen and Sargent 2000). In this study, the validity of the simulation metamodel with respect to the simulation model is determined by examining the problem of model selection and diagnostics. A lack-of-fit test is used to indicate if the model is adequate to fit the data. A check of the distribution of the residuals allows one to determine the validity of the assumptions (Panis, Myers, and Houck 1994).

Discrete-Event Simulation of Multiproduct Batch Process
The multiproduct batch chemical process (Figure 3) consists of four main components: (a) the Fabrication or Mix area is responsible for making the specialty material used as raw material for the Production area; (b) the Transfer area moves the material from Fabrication to Inventory through a sophisticated automatically controlled piping system; (c) the Inventory stores the material, constantly feeding the Production area, and regularly generates new fabrication orders through a reorder-point policy (S-s); and (d) the Production facility makes the end product and will be considered a black box in this study.

Overall Product Flow and Model Scope
The scope of the simulation study warrants inclusion of only three areas: fabrication, transfer, and inventory (Figure 3). As described later, the production area is represented as a constant and continuous demand for various products from the inventory area. This demand is specified in the production schedule that is used to drive production. The production schedule indicates, for each finished product storage tank, what formula should be available and at what rate the formula should be used by the downstream production facility.

At the beginning of each review period (typically a day), orders are generated, based on the status of inventory in the finished product storage tanks, and placed on a prioritized order list. An operator is required to process each order through the fabrication area according to a strict formula recipe. The operator chooses the order with the highest priority and attempts to assign a main mix tank (MM) to it according to the MM selection logic. Orders remain on the order list until an operator and a main mix tank with the appropriate set of process attributes are simultaneously available. When an order formula is complete, it must be tested for quality conformance and then transferred to the finished product storage tank that originally placed the order. Rework tanks are available in the event that an order formula does not pass the quality test. Also, holding tanks are available if a transfer line is not available to transfer the order formula from the MM tank to the finished product storage tank. Holding tanks essentially free up the MM tank for the next order.

It is important to understand that each production order is for a specific product formula (e.g., end item). A production order identifies a demand for a certain quantity of a certain formula. Typically, this demand is generated from the needs of a single finished product storage tank, but a production order may also represent the demand of two or more storage tanks that each necessarily contain the same formula. Each product formula has a unique product recipe associated with it. Product recipes define the specific process requirements and process flows of each product formula.

A product recipe is essentially a cookbook procedure for a specific product formula and may consist of up to 30 individual process steps. The plant may process up to 100 different recipes in a given month. An overview of how these complex formulas are represented in the computer simulation is given in a previous paper by Watson (1997). Furthermore, a product order is also referred to as a batch, a batch of product, or a batch of formula.

Simulation Model Implementation
The simulation model is implemented with the ARENA simulation language (Kelton, Sadowski, and Sadowski 2002; ARENA 1995). Although ARENA supports process flow, event-scheduling, and continuous world views, we require only the process flow mode. A simulation entity in this model is equivalent to a production order, though simulation entities are also used a great deal for control logic. A discrete-event model of a batch process facility is very reasonable because the product is modeled as discrete batches as opposed to continuous flows. The production schedules defined for each finished product storage tank are defined for a finite period of time. Hence, the system simulated is referred to as a terminating system.

There are many stochastic components in the simulation model that will be briefly discussed here. The production facility, in general, is extensively automated, but because of the nature of the product and the involvement of human operators, there are many random events to take into account. Mixing times in PreMix and MainMix tanks were approximated with the normal distribution. Actual data from production reports were used to estimate the mean and standard deviation. There are also setup delays involved for certain products that require significant manual (raw) material handling. Because the mean is always much larger than the standard deviation, there is virtually no fear of obtaining negative values. Random probabilities are used to represent quality test results, and these values are based on past experience. Improving the quality of the product will affect these probabilities, and the subsequent effect on the production system is of interest in this study. During a batch transfer to a storage tank, if a batch must be split across multiple storage tanks, an operator is involved in the process of making the switch and purging the lines. This time also is nondeterministic and adequately modeled with the normal distribution.
The production schedule itself is another source of variability in the behavior of the plant. Finished product storage tanks will, on occasion, require flushing. This occurs when a production line (in the downstream production process) shuts down and the material, being a perishable good, sits in the storage tank for a lengthy period of time. Also, there may be an equipment failure in the storage tank circulation device. This type of failure phenomenon is represented with the exponential distribution. A single distribution is used to model a failure event. Then a discrete distribution is used to determine which storage tank will actually experience the failure.

As should be apparent, the transfer process is not modeled as a continuous process but rather as a batch process. Because a production order is a batch of product, it makes sense to model a transfer as a single delay process that depends on the size of the batch and the rate of transfer for the product being transferred. Transfer pipes over time experience residual build-up. It is assumed that in any given period, between purges, this build-up does not have a significant effect on transfer delay. The continuous draining of the finished product storage tank by downstream operations is modeled as periodic depletions. We experimented with different period resolutions to find a period that would not result in unacceptable model accuracy or model run time.
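The stochastic components described in this section can be illustrated with a short Python sketch. It is not the ARENA model: the function names, tank labels, and all numeric values below are hypothetical, chosen only to show (a) why a normal distribution with a mean much larger than its standard deviation almost never yields negative times, (b) a transfer modeled as a single delay of batch size divided by transfer rate, and (c) a failure event drawn from one exponential distribution followed by a discrete draw of the affected tank.

```python
import math
import random


def prob_negative_normal(mean, std):
    """P(X < 0) for X ~ Normal(mean, std), via the error function."""
    return 0.5 * (1.0 + math.erf((0.0 - mean) / (std * math.sqrt(2.0))))


def transfer_delay(batch_gallons, rate_gallons_per_min):
    """Single-delay transfer model: time = batch size / transfer rate."""
    return batch_gallons / rate_gallons_per_min


def next_failure(rng, mean_time_between_failures, tank_weights):
    """Sample the time to the next failure (exponential) and then the
    tank that fails (discrete distribution over tanks)."""
    time_to_failure = rng.expovariate(1.0 / mean_time_between_failures)
    tanks = list(tank_weights)
    weights = [tank_weights[t] for t in tanks]
    tank = rng.choices(tanks, weights=weights)[0]
    return time_to_failure, tank


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical mixing time: mean 60 min, standard deviation 10 min.
p_neg = prob_negative_normal(mean=60.0, std=10.0)

# Hypothetical 3000-gallon batch moved at 150 gallons per minute.
delay = transfer_delay(batch_gallons=3000.0, rate_gallons_per_min=150.0)

# Hypothetical failure model over three circulation tanks.
ttf, tank = next_failure(rng, 500.0, {"CT1": 0.5, "CT2": 0.3, "CT3": 0.2})
```

With the mean six standard deviations above zero, the probability of a negative sample is on the order of one in a billion, which is consistent with the remark that negative values are of virtually no concern.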
Simulation Model Parameters and Response Variables
A production facility of a multiproduct batch process is defined by a large number of system parameters (Figure 4). System parameters can be classified as design parameters, operating parameters, and environmental parameters.

[Figure 4 diagram: system parameters grouped as Design, Operational, and Environmental. Legible labels include Resource Capacity (MainMix (9), Holding (5), Storage (52)), Resource Functionality (PM and MM attributes, 8 types of each), transfer lines (PM-MM, MM, HT), Production Schedule (demand rate, run length), Inventory Policy (reorder point, standard and maximum order quantity), Process Rates, Product Mix, Product Routings (recipes/sequences), Quality Assurance (testing acceptability, pass/fail), Order Generation (frequency, policy), and Operator Rules (shift assignment).]
Figure 4 Systems Parameters for Simulation Model

The design parameters represent the basic design decisions for the mix, transfer, and storage areas. These decisions are primarily related to capacity and flow: what size do we need and how do we move the product through the system? Three design parameters are included in the simulation model: resource capacity, resource configuration, and resource functionality. Any change to the capacity of the current resource outlay was prohibitive, so the engineers focused on resource configuration (transfer line configuration, TLC) and resource functionality (RF). With respect to the design parameters, the plant managers were interested in addressing issues such as: (a) What is the effect of using "dedicated" transfer lines versus higher speed "shared" transfer lines? (b) How much do mix attributes affect productivity (i.e., purchasing mixers that have maximum capability/functionality versus mixers that have a limited set of attributes)? (c) Which design parameters have the biggest effect on throughput (e.g., capacity attributes, flow rates, and so on)?

The operating parameters involve decisions that must be made each day or each planning period. From an operational perspective, four parameters are selected for analysis (FL, TR, PR, and QP). The engineers were interested in looking at moving toward "just-in-time" production instead of traditional batch production. To do this would imply increasing the number of changeovers in each storage tank (e.g., reducing the scheduled run length, FL, for each recipe-CT combination). How this would affect production was not known. A second interest was to determine whether an increase in transfer rate, TR (mix to circulation), would improve throughput (i.e., relieve congestion). Instead of dealing with three variables, the project team decided to group these variables into a single transfer rate factor. A third interest was to determine whether an increase in process rate, PR (raw material to mix, premix to mix, or mix to mix), would improve throughput (i.e., relieve congestion). Such an increase would reduce flow time and, consequently, alleviate congestion at the expense of a lower product quality. The extent of such a change on overall system performance was not clear. With respect to quality assurance (QP), the plant managers were interested in determining how the quality assurance policies affect throughput.

The environmental variables are related to customer requirements and customer demand patterns. Engineers were interested in studying the effect of changing demand rate (DR). A demand rate is specified for each recipe that is scheduled in each storage tank. Increasing the demand rate implies that the storage tank is draining at a faster rate. Identifying possible bottleneck areas is desirable under these circumstances. From a "local" perspective, changing any of these factors would seem to have an obvious effect:
• reducing run length would result in more changeovers and less productive time;
• increasing transfer rate would result in shorter flow times;
• increasing process rate would have a questionable effect due to the reduction in product quality;
• increasing demand would stress the system;
• providing dedicated transfer lines would relieve transfer congestion; and
• increasing resource functionality would reduce resource (PM and MM) congestion.

But the overall impact on system performance is impossible to determine without a detailed simulation model and a thorough statistical analysis. Five key performance measures were selected for analysis. These measures are defined below. Notation is introduced here to simplify their definitions.

Notation
(symbol illegible in source) = average flow time for product i in storage tank j
$V_{ij}$ = total volume of product i produced in tank j
$V_{ijt}$ = volume of product i produced in tank j at time t
$va_{ijt}$ = volume of product i added to tank j at time t
$\delta$ = time since last product age calculation
$A_{ijt}$ = product age for product i in tank j at time t (written $A_{ij}(t)$ in Eq. (6))
$A_{ij}$ = average product age for total simulation run length
$B_{ij}$ = average batch size for product i in tank j
$n_{ij}$ = number of batches produced for product i in tank j
$k$ = number of products
$l$ = number of tanks

Total Product Volume Produced ($Y_1$)
$Y_1 = \sum_{i=1}^{k} \sum_{j=1}^{l} V_{ij}$ (4)

Weighted Product Age in Circulation/Storage Tanks ($Y_2$)
$Y_2 = \dfrac{\sum_{i=1}^{k} \sum_{j=1}^{l} A_{ij} B_{ij} n_{ij}}{\text{[denominator illegible in source]}}$ (5)
where
$A_{ij}(t) = \dfrac{V_{ijt}\,\bigl(\delta + A_{ij}(t-1)\bigr)}{V_{ijt} + va_{ijt}}$ (6)

Average Weighted Flow Time ($Y_3$)
Eq. (7) defines $Y_3$ as a double sum over products $i = 1, \ldots, k$ and tanks $j = 1, \ldots, l$ divided by a normalizing term; the summand and the denominator are illegible in the source. (7)

Number of Stockouts ($Y_4$)
$Y_4 = N$ = number of stockouts. A stockout occurs when a storage tank is empty but demand for the current recipe in that storage tank is greater than 0.

Quality Control Fail ($Y_5$)
$Y_5 = Q$ = number of gallons dumped due to quality control failures (Mix area).
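To make the two fully legible definitions concrete, the Python sketch below computes the total product volume of Eq. (4) and applies the product-age update of Eq. (6). The function and argument names are hypothetical, and the reading of Eq. (6) used here (existing volume ages by $\delta$, newly added volume enters with age zero) is an interpretation of the formula rather than a statement from the source.

```python
def total_volume(volumes):
    """Eq. (4): sum V_ij over all products i (rows) and tanks j (columns)."""
    return sum(sum(row) for row in volumes)


def update_age(prev_age, delta, volume_in_tank, volume_added):
    """Eq. (6): volume-weighted age after `delta` time units.

    Interpretation (assumption): the volume already in the tank ages by
    `delta`, and the newly added volume enters with age zero.
    """
    return volume_in_tank * (delta + prev_age) / (volume_in_tank + volume_added)


# Hypothetical volumes for 2 products and 2 tanks.
y1 = total_volume([[100.0, 50.0], [0.0, 25.0]])

# Hypothetical tank: 300 units aged 2.0, one time unit passes, 100 units added.
age = update_age(prev_age=2.0, delta=1.0, volume_in_tank=300.0, volume_added=100.0)
```

In this example the age rises from 2.0 to 3.0 for the existing 300 units and is then diluted by the 100 fresh units, giving 2.25.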
Total product made ($Y_1$) reflects the ability of the production system to meet demand. Excessive quality problems or system congestion could result in not achieving production goals.

The weighted product age ($Y_2$) measures the length of time the product sits in the storage tank. As the product ages, the quality of the product is reduced. At a certain threshold, the product quality deteriorates rapidly. A product of poor quality has the biggest impact on the production area and will result in serious quality problems and line shutdowns. This problem should be avoided.

Weighted flow time ($Y_3$) is a fundamental measure of manufacturing (e.g., mix area) congestion and work-in-process inventory (e.g., transfer). Systems with excess congestion and/or excess work-in-process tend to have higher flow times.

Storage tank stockouts ($Y_4$) are the result of a product not being made quickly enough or given high enough priority. A stockout in the work-in-process inventory area will result in a shutdown in the production area and is extremely costly and should be avoided.

The quality control fail ($Y_5$) measures the amount of product disposed of due to poor quality in the mix area. If a batch of product is dumped, a new order will be generated. The effect of dumping product and generating new product is difficult to predict exactly. Clearly, the resource utilization in the mix area will increase, but it is difficult to determine if this increase will result in undue stress (e.g., congestion) that would then lead to an increase in flow time and perhaps a storage tank stockout.

The effective and efficient operation of the facility requires the investigation of these important issues:
1. Sensitivity Analysis: which system parameters most influence the key measures of system performance?
2. Facility Design: how can we determine the best combination of resources that will provide the most efficient facility operations under dynamic conditions? Can we develop general rules of thumb for designing new plants (for example, dedicated transfer lines versus shared transfer lines, limited resource attributes versus unlimited, and so on)?
3. Throughput Analysis: given the current system, what changes to the operating policy (e.g., scheduling policy, inventory policy, operator scheduling policy, etc.) can be made to improve throughput while maintaining the same level (or higher) of quality?

Simulation Model Validation
Various approaches to and definitions of model validation are discussed in the literature (Sargent 1994). The approaches used in this study include project-team validation (the modeler must convince the users and systems operators that the model is valid) and third-party validation (a manager who is familiar with the system but not involved in the project effort must be convinced that the model is valid). Project-team validation required meetings with all members early in the project to solicit information concerning project objectives and scope, modeling detail, and input requirements, and to obtain team support.

The specific techniques of model validation used included animation (for verifying the model and presenting it to system operators and managers), degeneracy tests (providing unreal input parameters and ensuring that the resulting output is also unreal), event validity (making sure that shop disruptions occur as they would in the real system), extreme-condition tests (providing extreme operating parameters and making sure the model behaves as the real system would; for example, initializing the storage tanks to an empty state and observing start-up behavior), and deterministic values (for example, replacing all stochastic events with deterministic events).

Perhaps the most powerful technique used was the Turing test (Turing 1950) with historical data. We provided a one-month operating schedule with all as-were operating parameters as input to the model and then compared the output to the actual system output. We asked managers knowledgeable about the system to examine the simulated data and actual data. We presented all data sets in exactly the same format, making it difficult to distinguish between real and simulated data. When the managers detected a difference, we used their knowledge to investigate the source of the problem and to revise the model.
Experimental Design and Analysis
Multivariate Statistics
The analysis involves the comparison of experimental results (response measures) under alternative operating conditions (input-variable settings). A two-level full-factorial design plan ($2^7$) (Table 1), with two-way interaction, is implemented to allow the comparison of the system performance [Total Product Volume Produced ($Y_1$), Weighted Product Age in circulation/storage tanks ($Y_2$), Average Weighted Flow Time ($Y_3$), Number of Stockouts ($Y_4$), and Quality Control Fail ($Y_5$)] under 128 different alternative operating conditions. Each operating condition (factor-level combination) is simulated five times. Data collected are analyzed using SAS (SAS/STAT User's Guide 1990; Khattree and Naik 1995) software. Specifically, the options of PROC GLM, PROC STEPDISC, and PROC CANDISC are used.

Table 1 Factors Selected and Levels for a $2^7$ Full Factorial Design

Notation  Variable                     Level +1      Level -1
FL        Run length per recipe-CT     -50%          CURRENT
TR        Transfer rate                +50%          CURRENT
PR        Process rate                 +50%          CURRENT
QP        Test delay                   8             30
DR        Rate of depletion            +15%          CURRENT
TLC       Transfer line configuration  DEDICATED     SHARE
RF        Resource functionality       UNRESTRICTED  RESTRICTED

Multivariate Analysis of Variance Results and Discussion
Multivariate analysis of variance (MANOVA) is applied to test for significant main-factor effects as well as two-factor interactions. The significance of factor effects is generally based on the Wilks' lambda statistic and the F-test derived from it (Rencher 1995). The results of the MANOVA are shown in Table 2. All of the main effects as well as some two-way interactions are statistically significant (*P < 0.01). From Table 2, we may observe, based on the F ratio, that the environmental parameter demand rate (DR) is the factor that contributes the most significant amount of variation to the experimental results. However, it must be considered that demand rate significantly interacts (P < 0.01) with flexibility (FL), quality policy (QP), and transfer line configuration (TLC). Thus, no conclusions can be drawn from its individual effect on process performance.

Table 2 MANOVA Results of Factor Effect of Multiproduct Chemical Batch Process for Replicated $2^7$ Factorial Design

Source    Wilks' Λ   F          D.F.     Pr > F
FL        0.0201     5927.8     5, 607   0.0001*
TR        0.1930     507.5      5, 607   0.0001*
FL*TR     0.9912     1.1        5, 607   0.3707
PR        0.0279     4223.4     5, 607   0.0001*
FL*PR     0.9768     2.9        5, 607   0.0139
TR*PR     0.9928     0.9        5, 607   0.4918
QP        0.3551     220.5      5, 607   0.0001*
FL*QP     0.9807     2.4        5, 607   0.0372
TR*QP     0.9969     0.4        5, 607   0.8635
PR*QP     0.9869     1.6        5, 607   0.1571
DR        0.0007     177900.0   5, 607   0.0001*
FL*DR     0.6868     55.4       5, 607   0.0001*
TR*DR     0.9823     2.2        5, 607   0.0548
PR*DR     0.9844     1.9        5, 607   0.0895
QP*DR     0.9370     8.2        5, 607   0.0001*
TLC       0.0499     2306.8     5, 607   0.0001*
FL*TLC    0.7067     50.4       5, 607   0.0001*
TR*TLC    0.9861     1.7        5, 607   0.1310
PR*TLC    0.9955     0.5        5, 607   0.7392
QP*TLC    0.8868     15.5       5, 607   0.0001*
DR*TLC    0.6438     67.2       5, 607   0.0001*
RF        0.9077     12.3       5, 607   0.0001*
FL*RF     0.9860     1.7        5, 607   0.1270
TR*RF     0.9759     2.9        5, 607   0.0112
PR*RF     0.9897     1.2        5, 607   0.2774
QP*RF     0.9825     2.1        5, 607   0.0568
DR*RF     0.9952     0.6        5, 607   0.7111
TLC*RF    0.9336     8.6        5, 607   0.0001*
*P < 0.01
In contrast, the design parameter resource functionality (RF) is the factor that contributes the least amount of variation to the experimental results (Table 3). As discussed by Spector (1980), MANOVA is basically a two-step process. The first step is to test the overall hypothesis of no differences in mean centroids for the different treatment groups. If this test is significant, the second step is to conduct follow-up tests to explain the group differences (Bray and Maxwell 1982). Thus, each significant multivariate effect is analyzed to determine which of the multiple dependent variables contributes to multivariate significance. Although a number of methods have been developed for analyzing and interpreting data after finding a significant overall MANOVA (Bray and Maxwell 1982), we are concerned here only with one specific type of discriminant analysis, canonical discriminant analysis.
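As a rough cross-check of Table 2, the sketch below converts a Wilks' lambda value into an F statistic using the exact relation that holds when the hypothesis has one degree of freedom (each factor has two levels). The bookkeeping of 640 observations and 29 model terms is an assumption inferred from the replicated design, not something the paper states; it does reproduce the tabulated degrees of freedom (5, 607). The function name `wilks_to_f` is illustrative only.

```python
def wilks_to_f(wilks_lambda: float, p: int, error_df: int) -> tuple[float, int, int]:
    """Exact F for Wilks' lambda when the hypothesis has 1 degree of freedom.

    F = (1 - L) / L * (ve - p + 1) / p, with (p, ve - p + 1) degrees of freedom,
    where p is the number of responses and ve the error degrees of freedom.
    """
    df1 = p
    df2 = error_df - p + 1
    f_value = (1.0 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_value, df1, df2


# Assumed bookkeeping: 128 conditions x 5 replications = 640 observations;
# intercept + 7 main effects + 21 two-way interactions = 29 model terms.
n_obs = 128 * 5
n_terms = 1 + 7 + 21
error_df = n_obs - n_terms

# Transfer rate (TR) row of Table 2: Wilks' lambda = 0.1930, tabulated F = 507.5.
f_tr, df1, df2 = wilks_to_f(0.1930, p=5, error_df=error_df)
print(f"F = {f_tr:.1f} on ({df1}, {df2}) degrees of freedom")
```

Because the tabulated lambda values are rounded to four decimals, the recomputed F agrees with Table 2 only approximately, and the discrepancy grows for very small lambda values such as the DR row.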
Table 3
Variable Ranking Based on Standardized Discriminant Coefficients for First Canonical Discriminant Function (1: high and 5: low)

Response                         DR   FL   PR   TLC  TR   QP   RF
Total Product Volume ($Y_1$)     1    2    2    2    2    2    3
Weighted Product Age ($Y_2$)     2    1    1    1    1    1    1
Weighted Flow Time ($Y_3$)       3    3    3    3    3    3    2
Number of Stockouts ($Y_4$)      5    4    5    5    5    4    5
Quality Control Fail ($Y_5$)     4    5    4    4    4    5    4

Discriminant Analysis Results and Discussion
Discriminant analysis (DA) was applied to evaluate the contribution of each variable to the canonical discriminant function that best separates the mean vectors of two or more groups of multivariate observations relative to the within-group variance (Rencher 1992). To apply DA in simulation output analysis, the analyst first must specify which variables are to be independent and which one is to be dependent. Recall that the dependent variable is categorical and the independent variables are metric. Thus, for our simulation analysis, the dependent variable consists of two groups or classifications (for example, high versus low experimental conditions), and the independent variables are the response measures ($Y_1, \ldots, Y_n$). In this study, we treat the high and low levels of each factor (+1 and -1) as the dependent variable and the different response measures as the independent variables. We would like to determine which simulation output has the largest effect on distinguishing between factor levels; for this reason, a stepwise procedure is used. The canonical discriminant analysis results (Table 3) were calculated by SAS PROC CANDISC (SAS/STAT User's Guide 1990). In addition to this procedure, PROC DISCRIM provided a test of the hypothesis of homogeneous covariance matrices. Variables are ranked in order of their contribution to the function based on the absolute values of the standardized coefficients (Table 4). The results indicate that weighted product age ($Y_2$) and total number of gallons made ($Y_1$), in that order, are the variables that contribute most to separating the groups for factors FL, TR, PR, QP, and TLC (Table 4). On the other hand, the variables that are not useful (in the presence of the others) in distinguishing groups, for all the factors, are number of stockouts ($Y_4$) and quality control fail ($Y_5$) (Table 4).
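The ranking rule described above can be sketched in a few lines. This is not the SAS PROC CANDISC computation; it is a minimal NumPy illustration that relies on the fact that, with only two groups (the low and high level of one factor), the first canonical discriminant function coincides with Fisher's linear discriminant up to scaling. The data are synthetic, the mean shifts are hypothetical, and the group size of 320 simply assumes the 640 runs split evenly between the two levels of a factor.

```python
import numpy as np


def standardized_discriminant_coefficients(low: np.ndarray, high: np.ndarray) -> np.ndarray:
    """Two-group discriminant coefficients scaled by pooled within-group std devs."""
    n1, n2 = low.shape[0], high.shape[0]
    # Pooled within-group covariance matrix of the response measures.
    s_pooled = ((n1 - 1) * np.cov(low, rowvar=False)
                + (n2 - 1) * np.cov(high, rowvar=False)) / (n1 + n2 - 2)
    # Raw coefficients: a = S_pooled^{-1} (mean_high - mean_low).
    raw = np.linalg.solve(s_pooled, high.mean(axis=0) - low.mean(axis=0))
    # Scale by within-group standard deviations so responses are comparable.
    return raw * np.sqrt(np.diag(s_pooled))


def rank_by_contribution(coefs: np.ndarray) -> np.ndarray:
    """Rank 1 goes to the largest absolute standardized coefficient."""
    order = np.argsort(-np.abs(coefs))
    ranks = np.empty(len(coefs), dtype=int)
    ranks[order] = np.arange(1, len(coefs) + 1)
    return ranks


rng = np.random.default_rng(0)
# Synthetic stand-in for five responses observed at the low and high factor level.
shift = np.array([3.0, 1.5, 0.5, 0.0, 0.1])  # hypothetical mean shifts, not paper data
low = rng.normal(size=(320, 5))
high = rng.normal(size=(320, 5)) + shift

coefs = standardized_discriminant_coefficients(low, high)
ranks = rank_by_contribution(coefs)
print(ranks)
```

Ranking on absolute standardized coefficients, rather than raw ones, keeps a response measured in gallons from dominating one measured in counts simply because of its units.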
Application of Regression Metamodels
Metamodel Form
A regression model (3) was used to estimate the relationship between each of the five responses, total product made ($Y_1$), weighted product age ($Y_2$), average weighted flow time ($Y_3$), number of stockouts ($Y_4$), and quality control fail ($Y_5$), and four independent factors: demand rate ($X_1$), process rate ($X_2$), transfer rate ($X_3$), and quality policy ($X_4$).

$$Y_u = \beta_0 + \sum_{j=1}^{4} \beta_j X_{ju} + \sum_{j<k} \beta_{jk} X_{ju} X_{ku} + \varepsilon_u \qquad (3)$$
Experimental Design
The process of metamodel estimation in this study focused on adequacy and simplicity. A regression metamodel for each response variable is developed based on the results of the screening analyses, the goals of the study, and system knowledge. Three operational factors (process rate, transfer rate, and quality policy) and one environmental factor (demand rate) were selected (Table 5) based on their practical importance as determined from the detailed simulation study (Watson 1997). Once the metamodel form has been proposed, i.e., regression model (3), we can determine the appropriate experimental design plan. A first-order response surface design ($2^4$ full factorial), with two-way interactions, is used to estimate metamodel (3) for each of the response variables ($Y_i$) under 16 different operating conditions. Each operating condition (factor-level combination) is simulated 10 times.

Two model-fitting steps were followed. First, a full model, a first-order model (3) with two-way interaction terms, was fit to the data for each response variable. The adequacy of the full model was tested using a lack-of-fit F-test and regression diagnostics (Panis, Myers, and Houck 1994). In addition, to evaluate the appropriateness of the postulated model given the estimated effects from the experiment, the significance of each factor was determined in terms of its contribution to the total sum of squares of the data. Only those effects that account for a large amount of variation are included in the final prediction (fitted) metamodel; the remaining effects are ignored because their variability is no larger than would be expected from experimental error alone. Second, a reduced model, a first-order model (5) with only the significant main effects and two-way interactions, was fit to the same data. The validity of the reduced model was also tested with a lack-of-fit F-test. As a result, a simple and adequate model was obtained for each response variable considered. The least-squares regression metamodels were developed using the SAS regression procedure PROC REG (SAS/STAT User's Guide 1990) for the points in the model construction experimental designs.
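The design and fitting steps described above can be sketched as follows. The paper used SAS PROC REG; this NumPy version is only an illustration under stated assumptions. The response is synthetic, generated from hypothetical coefficients, and none of the numbers come from the study. The sketch builds the replicated two-level factorial in four coded factors, fits the full first-order model with two-way interactions by least squares, and splits the residual sum of squares into pure error and lack of fit.

```python
import itertools
import numpy as np

n_factors = 4   # X1..X4 in coded units
n_reps = 10     # replications per operating condition

# 2^4 full factorial in coded units (-1, +1): 16 design points.
design = np.array(list(itertools.product([-1.0, 1.0], repeat=n_factors)))
pairs = list(itertools.combinations(range(n_factors), 2))


def model_matrix(x: np.ndarray) -> np.ndarray:
    """Columns: intercept, 4 main effects, 6 two-way interactions."""
    inter = np.column_stack([x[:, j] * x[:, k] for j, k in pairs])
    return np.column_stack([np.ones(len(x)), x, inter])


x_runs = np.repeat(design, n_reps, axis=0)   # 160 runs, replicates kept adjacent
X = model_matrix(x_runs)

# Synthetic response (NOT the paper's data): hypothetical coefficients plus noise.
rng = np.random.default_rng(1)
true_beta = np.zeros(X.shape[1])
true_beta[[0, 1, 3]] = [100.0, 10.0, 4.0]    # intercept, X1, X3
true_beta[6] = 2.0                           # X1*X3 interaction column
y = X @ true_beta + rng.normal(scale=1.0, size=len(X))

# Least-squares fit of the full model and its coefficient of determination.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
ss_res = float(resid @ resid)
ss_tot = float(((y - y.mean()) ** 2).sum())
r_squared = 1.0 - ss_res / ss_tot

# Lack-of-fit F: residual SS = pure error (within replicates) + lack of fit.
y_cells = y.reshape(len(design), n_reps)
ss_pure = float(((y_cells - y_cells.mean(axis=1, keepdims=True)) ** 2).sum())
df_lof = len(design) - X.shape[1]    # design points minus fitted terms
df_pure = len(y) - len(design)       # observations minus design points
f_lof = ((ss_res - ss_pure) / df_lof) / (ss_pure / df_pure)

print(f"R^2 = {r_squared:.3f}, lack-of-fit F = {f_lof:.2f} on ({df_lof}, {df_pure}) df")
```

A lack-of-fit F that is small relative to the critical value for its degrees of freedom suggests the first-order-plus-interaction form is adequate; fitting the reduced model amounts to dropping columns of `X` and repeating the same steps.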
Regression Metamodel Results and Discussion
The models fitted for each response variable were significant by the F-test at the 1% significance level. The lack-of-fit test measures the failure of the model to represent data in the experimental domain at points that are not included in the regression. None of the models exhibited lack of fit. The fact that a model is significant (contains one or more important terms) and does not suffer from lack of fit does not necessarily mean that the model is a good one. If the experimental environment is quite noisy, or some important variables have been left out of the experiment, then the portion of the variability in the data not explained by the model (the residual) could be large. To quantify this, a measure of the model's overall performance is considered: the coefficient of determination, denoted by $R^2$. Thus, goodness of fit was evaluated using the $R^2$ value (the proportion of total variation explained by the regression). In all cases except $Y_5$, the $R^2$ value is close to 1.0 and the residual plots contained no discernible pattern (Table 5). The normal probability plots of residuals do not show evidence of strong departures from normality. Based on these results, we can accept the models fitted for each response.

The estimated standardized regression coefficients obtained by fitting regression model (3) to the simulated data are given in Table 5 for each response variable: $Y_1$, total gallons made (in thousands); $Y_2$, weighted product age; $Y_3$, weighted flow time; $Y_4$, number of stockouts; and $Y_5$, quality control fail (in thousands). The metamodels obtained through least-squares regression are both descriptive and predictive in nature. Therefore, they can be used to gain insight into the relationship between the factor levels and the responses. In general, the estimated standardized regression coefficients provide two types of information. The magnitude of a coefficient indicates how important a particular effect is to the fit, and its sign indicates whether the factor has a positive or negative effect on the response. The following are the estimated regression metamodels; the signs shown are those of the corresponding estimates in Table 5:

$$\hat{Y}_1 = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3 + \hat{\beta}_4 X_4 + \hat{\beta}_{13} X_1 X_3 \qquad (8)$$

$$\hat{Y}_2 = \hat{\beta}_0 - \hat{\beta}_1 X_1 + \hat{\beta}_3 X_3 + \hat{\beta}_4 X_4 + \hat{\beta}_{13} X_1 X_3 - \hat{\beta}_{14} X_1 X_4 \qquad (9)$$

$$\hat{Y}_3 = \hat{\beta}_0 + \hat{\beta}_1 X_1 - \hat{\beta}_2 X_2 - \hat{\beta}_3 X_3 - \hat{\beta}_{12} X_1 X_2 - \hat{\beta}_{13} X_1 X_3 \qquad (10)$$

$$\hat{Y}_4 = \hat{\beta}_0 + \hat{\beta}_1 X_1 - \hat{\beta}_3 X_3 - \hat{\beta}_4 X_4 - \hat{\beta}_{13} X_1 X_3 - \hat{\beta}_{14} X_1 X_4 \qquad (11)$$

$$\hat{Y}_5 = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_4 X_4 + \hat{\beta}_{14} X_1 X_4 \qquad (12)$$

Table 5
Estimated Standardized Regression Coefficients [Significant Effects (P < 0.05)]

Coefficient          Total Product    Weighted Product   Weighted Flow   Number of           Quality Control
                     Volume ($Y_1$)   Age ($Y_2$)        Time ($Y_3$)    Stockouts ($Y_4$)   Fail ($Y_5$)
$\hat{\beta}_0$      3068369.18       1614.06            85.58           132.86              13415.28
$\hat{\beta}_1$      338929.38        -253.30            0.18            96.55               1578.42
$\hat{\beta}_2$      8749.60          -                  -0.66           -                   4444.91
$\hat{\beta}_3$      5818.50          13.74              -0.19           -27.70              -
$\hat{\beta}_4$      1119.29          5.77               -               -6.71               514.21
$\hat{\beta}_{12}$   -                -                  -0.01           -                   -
$\hat{\beta}_{13}$   3420.80          1.81               -0.02           -17.51              -
$\hat{\beta}_{14}$   -                -2.09              -               -4.78               592.32
$R^2$                0.999            0.998              0.992           0.972               0.714

A hyphen marks a term that does not appear in the corresponding metamodel (8)-(12).

An interpretation of the estimated metamodels given in Table 5 can be stated as follows.

The model fitted (8) for $\hat{Y}_1$ indicated that demand rate ($X_1$), process rate ($X_2$), transfer rate ($X_3$), and quality policy ($X_4$) significantly affect the response variable (P < 0.05). In addition, there was a significant two-factor interaction term ($X_1X_3$). Total product made ($\hat{Y}_1$) will increase as a result of increasing demand rate by 30%, process rate by 10%, and transfer rate by 50%, and of reducing the quality control test delay from 24 to 12 (-1 and +1, respectively). However, demand rate and transfer rate significantly interact (P < 0.05). This means that the effect of demand rate on the response variable depends on the level of the transfer rate. Thus, the effects of these two factors cannot be analyzed individually. In contrast, the effects of process rate and quality policy can be interpreted individually because their interactions are not significant (P > 0.05). As expected, an increase in process rate will increase total product made (throughput). In addition, a reduction in test delay (from 24 to 12 minutes) will also increase throughput.

The model fitted (9) for $\hat{Y}_2$ indicated that demand rate ($X_1$), transfer rate ($X_3$), and quality policy ($X_4$) significantly affect the response variable (P < 0.05). In addition, there were significant two-factor interaction terms ($X_1X_3$ and $X_1X_4$). Weighted product age ($\hat{Y}_2$) will decrease as a result of increasing demand rate, while increasing transfer rate and reducing the quality test delay (from 24 to 12) will increase weighted product age. Notice that demand rate significantly (P < 0.05) interacts with transfer rate and quality policy. Thus, caution must be taken in interpreting individual factor effects. In addition, process rate does not significantly affect (P > 0.05) the response variable.

The model fitted (10) for $\hat{Y}_3$ indicated that demand rate ($X_1$), process rate ($X_2$), and transfer rate ($X_3$) significantly affect the response variable (P < 0.05). In addition, there were significant two-factor interaction terms ($X_1X_2$ and $X_1X_3$). Average weighted flow time ($\hat{Y}_3$) will increase as a result of increasing demand rate, while increasing process rate and transfer rate will decrease average weighted flow time. Notice that demand rate significantly (P < 0.05) interacts with process rate and transfer rate. Thus, caution must be taken in interpreting individual factor effects. In addition, quality policy does not significantly affect (P > 0.05) the response variable.

The model fitted (11) for $\hat{Y}_4$ indicated that demand rate ($X_1$), transfer rate ($X_3$), and quality policy ($X_4$) significantly affect the response variable (P < 0.05). In addition, there were significant two-factor interaction terms ($X_1X_3$ and $X_1X_4$). The number of stockouts ($\hat{Y}_4$) will increase as a result of increasing demand rate, while increasing transfer rate and reducing the quality control test delay (from 24 to 12) will decrease the number of stockouts. Notice that demand rate significantly (P < 0.05) interacts with transfer rate and quality policy. Thus, caution must be taken in interpreting individual factor effects. In addition, process rate does not significantly affect (P > 0.05) the response variable.

The model fitted (12) for $\hat{Y}_5$ indicated that demand rate ($X_1$), process rate ($X_2$), and quality policy ($X_4$) significantly affect the response variable (P < 0.05). In addition, there was a significant two-factor interaction term ($X_1X_4$). As expected by the research engineer, the amount of product disposed of due to poor quality in the mix area ($\hat{Y}_5$) will increase as a result of increasing process rate within tanks (Table 5). On the other hand, although there is a significant effect due to changing demand rate and quality policy, these two factors significantly interact (P < 0.05). This means that the effect of demand rate on the response variable depends on the level of the quality policy.
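The statement that an interacting factor cannot be interpreted on its own follows directly from the algebra of metamodel (8). The short derivation below assumes coded factor levels $X_j \in \{-1, +1\}$, as in the factorial design, and uses $\Delta_{X_j}$ (a symbol introduced here for illustration) for the change in the predicted response when $X_j$ moves from its low to its high level with the other factors held fixed.

```latex
\begin{aligned}
\hat{Y}_1 &= \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3
             + \hat{\beta}_4 X_4 + \hat{\beta}_{13} X_1 X_3, \\
\Delta_{X_1}\hat{Y}_1 &= \hat{Y}_1\big|_{X_1=+1} - \hat{Y}_1\big|_{X_1=-1}
             = 2\left(\hat{\beta}_1 + \hat{\beta}_{13} X_3\right), \\
\Delta_{X_2}\hat{Y}_1 &= \hat{Y}_1\big|_{X_2=+1} - \hat{Y}_1\big|_{X_2=-1}
             = 2\,\hat{\beta}_2 .
\end{aligned}
```

The demand-rate effect therefore takes two different values, $2(\hat{\beta}_1 - \hat{\beta}_{13})$ at the low transfer rate and $2(\hat{\beta}_1 + \hat{\beta}_{13})$ at the high transfer rate, whereas the process-rate effect is the same constant at every setting of the other factors. The same reasoning applies to the interaction terms in metamodels (9) through (12).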
In summary, as expected, increasing demand rate by 15% will stress the system. As a result, the total product made, the average weighted flow time, the number of stockouts, and the amount of product disposed of due to poor quality will increase, while product age will decrease. Changes in the process rate (within tanks) will have a significant effect (P < 0.05) on the response variables. An increase in the process rate will improve throughput (i.e., relieve congestion) and increase the amount of product disposed of due to poor quality, while the average flow time will decrease. An increase in transfer rate (between tanks) will increase throughput and weighted product age, while the average weighted flow time and the number of stockouts will decrease. A 12-minute reduction in the testing delay will increase throughput, weighted product age, and the amount of product disposed of due to poor quality. On the other hand, this reduction will decrease the number of stockouts.
Discussion and Summary
In this study, the simulation output of a multiproduct chemical batch process has been studied in detail by applying statistical techniques. Specifically, multivariate analysis of variance (MANOVA) was applied to determine the most important factors affecting the performance measures; discriminant analysis techniques were used to determine which dependent variables contribute the most to group separation; and regression analysis was used to develop metamodels useful for sensitivity analysis and prediction. The application of multivariate analysis of variance and discriminant analysis provided valuable information concerning the screening of significant effects and the contribution of response variables to group separation, respectively. The fact that all the main effects considered, as well as some two-way interactions, were significant implies that a multiproduct batch process is very sensitive to changes in operational, environmental, and design parameters. Based on the statistical results of the regression metamodels (the significant and interpretable coefficients as well as some regression diagnostics), we conclude that this modeling approach is effective in measuring the relationship between the input and output variables of a simulated multiproduct batch process.

The overall benefit from a management perspective is to be found in applying the framework, as well as the results obtained, to decisions concerning plant operation and possibly new facility design. Rather than relying on one-factor-at-a-time designs, or on intuition or experience alone, the statistical methods presented in this study provide a more precise and accurate way to obtain quantitative estimates of the relevant parameters needed to model response measures. Thus, this approach is likely to be useful in a number of substantive areas, such as sensitivity analysis, prediction, and/or optimization. Discrete-event simulation provides a modeling power unavailable from analytical methods due to process complexity. The advantages of analytical and simulation methods can be successfully combined by regression metamodels. Estimates of stockouts or quality failures due to demand changes, and of delays due to quality testing, can be obtained from the metamodel and are useful to management involved in production control.

The integrated research approach of using simulation to model system details and statistical methods to analyze response measures of system behavior is a powerful methodology for analyzing complex large-scale systems such as multiproduct batch processes. The approach provides meaningful results that cannot be achieved by simulation methods alone. More significantly, the integrated approach demonstrates the ultimate usefulness of computer simulation in real-world batch production. Results from this type of analysis may deepen the understanding of the system, and the process of interpreting and explaining the interactions forms an extensive verification of the model. Management recommendations should be made only after all options have been evaluated. If model recommendations are at odds with current management practices, the model becomes a hypothesis-generating tool. The recommended method of system management should then be tested in field trials against the current system.

The form of experimentation presented here has shown that large-scale statistical analyses are both feasible and meaningful. Thus, the major motivation of this study has been to present a framework for analyzing simulation output of batch processes. The approach to complex systems analysis described in this paper provides important operating information to scientists and engineers responsible for sensitivity analysis, prediction, and/or optimization of multiproduct process facilities.
References
"ARENA user's guide" (1995). Sewickley, PA: Systems Modeling Corp.
Barton, R.R. (1992). "Metamodels for simulation input-output relations." Proc. of 1992 Winter Simulation Conf., J.J. Swain et al., eds., pp289-299.
Bauer, K.W. (1985). "Simulation model decomposition by factor analysis." Proc. of 1985 Winter Simulation Conf., pp185-188.
Box, G.E.; Hunter, W.G.; and Hunter, J.S. (1978). Statistics for Experimenters. New York: John Wiley & Sons.
Bray, J.H. and Maxwell, S.E. (1982). "Analyzing and interpreting significant MANOVAs." Review of Educational Research (v52, n3), pp340-367.
Dengiz, B. and Akbay, K.S. (2000). "Computer simulation of a PCB production line: metamodeling approach." Int'l Journal of Production Economics (v63), pp195-205.
Donohue, J.M. (1994). "Experimental designs for simulation." Proc. of 1994 Winter Simulation Conf., J.D. Tew et al., eds., pp200-206.
Donohue, J.M. (1995). "The use of variance reduction techniques in the estimation of simulation metamodels." Proc. of 1995 Winter Simulation Conf., pp194-200.
Durieux, S. and Pierreval, H. (2003). "Regression metamodeling for the design of automated manufacturing system composed of parallel machines sharing a material handling resource." Int'l Journal of Production Economics, pp1-10.
Felder, R.M.; McLeod IV, G.B.; and Moldin, R.E. (1985). "Simulation for the capacity planning of specialty chemicals production." Chemical Engg. Progress, pp41-46.
Felder, R.M. (1983). "Simulation - a tool for optimizing batch-process production." Chemical Engg., pp79-84.
Friedman, L.W. (1984). "Establishing functional relationships in multiple response simulation, the multivariate general linear metamodel." Proc. of 1984 Winter Simulation Conf., S. Sheppard, U. Pooch, and D. Pedgen, eds., pp285-289.
Friedman, L.W. (1985). "On the use of MANOVA in the analysis of multiple-response simulation experiments." Proc. of 1985 Winter Simulation Conf., D. Gantz, G. Blais, and S. Solomon, eds., pp140-142.
Friedman, L.W. (1986). "Exploring relationships in multiple-response simulation experiments." OMEGA (v14, n6), pp498-501.
Friedman, L.W. (1987). "System simulation, design and analysis of multivariate response simulations: the state of the art." Behavioral Science (v32), pp138-148.
Huberty, C.J. and Morris, J.D. (1989). "Multivariate analysis versus multiple univariate analyses." Psychological Bulletin (v105, n2), pp302-308.
Johnson, R.A. and Wichern, D.W. (1992). Applied Multivariate Statistical Analysis. Englewood Cliffs, NJ: Prentice-Hall.
Kelton, W.D.; Sadowski, R.P.; and Sadowski, D.A. (2002). Simulation with Arena, 2nd ed. Boston: McGraw-Hill.
Khattree, R. and Naik, D.N. (1995). Applied Multivariate Statistics with SAS Software. Cary, NC: SAS Institute Inc.
Kleijnen, J.P.C. (1979). "Regression metamodels for generalizing simulation results." SMC-9, IEEE (v2), pp93-96.
Kleijnen, J.P.C. (1981). "Regression analysis for simulation practitioners." Journal of Operational Research Society (v32), pp35-43.
Kleijnen, J.P.C. (1987). Statistical Tools for Simulation Practitioners. New York: Marcel Dekker, Inc.
Kleijnen, J.P.C. and Sargent, R.G. (2000). "A methodology for fitting and validating metamodels in simulation." European Journal of Operational Research (v120), pp14-29.
Kleijnen, J.P.C.; Ham, G.V.; and Rotmans, J. (1992). "Techniques for sensitivity analysis of simulation models: a case study of the CO2 greenhouse effect." Simulation (v58, n6), pp410-417.
Madu, C.N. and Kuei, C.H. (1994). "Regression metamodeling in computer simulation - the state of the art." Simulation Practice and Theory (v2), pp27-41.
Montgomery, D.C. (1991). Design and Analysis of Experiments, 3rd ed. New York: John Wiley & Sons.
Panis, R.P.; Myers, R.H.; and Houck, E.C. (1994). "Combining regression diagnostics with simulation metamodels." European Journal of Operational Research (v73), pp85-94.
Rencher, A.C. (1992). "Interpretation of canonical discriminant functions, canonical variates, and principal components." The Statistician (v46, n3), pp217-224.
Rencher, A.C. (1995). Methods of Multivariate Analysis. New York: John Wiley & Sons, Inc.
Sargent, R.G. (1991). "Research issues in metamodeling." Proc. of 1991 Winter Simulation Conf., B.L. Nelson, W.D. Kelton, and G.M. Clark, eds., pp888-893.
Sargent, R.G. (1994). "Verification and validation of simulation models." Proc. of 1994 Winter Simulation Conf., J.D. Tew et al., eds., pp77-87.
"SAS/STAT user's guide: version 6," 4th ed. (1990). Cary, NC: SAS Institute, Inc.
Spector, P.E. (1980). "Redundancy and dimensionality as determinants of data analytic strategies in multivariate analysis of variance." Journal of Applied Psychology (v65, n2), pp237-239.
Turing, A.M. (1950). "Computing machinery and intelligence." Mind (v59, n236), pp433-460.
Watson, E.F. (1997). "An application of discrete-event simulation for batch-process chemical-plant design." Interfaces (v27, n6), pp35-50.
White, C.H. (1989). "Productivity analysis of a large multiproduct batch processing facility." Computers in Chemical Engg. (v13, n1/2), pp239-245.
Yang, W.N. and Nelson, B.L. (1988). "Multivariate estimation and variance reduction in terminating and steady-state simulation." Proc. of 1988 Winter Simulation Conf., pp466-472.
Authors' Biographies
Dr. Jose Noguera is an assistant professor in the Dept. of Management, Marketing, and E-Business at Southern University A&M College. He holds an MS in information systems and decision sciences and a PhD in business-MIS from Louisiana State University. His interests include enterprise information systems, systems analysis and design, business engineering, and supply chain management.

Dr. Ed Watson is the E.J. Ourso Professor of Business Analysis and Director of the Enterprise Systems Program at Louisiana State University. His interests include ERP and enterprise systems, organizational impact of IT, and management of technology. He has published in numerous journals including Decision Sciences, Decision Support Systems, IEEE Transactions on Computers, International Journal of Production Research, Interfaces, European Journal of Operational Research, Journal of Manufacturing Systems, and Communications of the Association for Information Systems. He is active in the information systems and decision sciences communities and is a regular contributor and speaker at related conferences and workshops.